264

How can I download subtitles for a list of videos using youtube-dl? I need an option for this, but I could not find one that downloads only the subtitles.

user198350
  • 4,269
fivetech
  • 2,803

5 Answers

324

There is an option, mentioned in the documentation:

Subtitle Options:

--write-sub                      Write subtitle file
--write-auto-sub                 Write automatic subtitle file (YouTube only)
--all-subs                       Download all the available subtitles of the video
--list-subs                      List all available subtitles for the video
--sub-format FORMAT              Subtitle format, accepts formats preference, for example: "srt" or "ass/srt/best"
--sub-lang LANGS                 Languages of the subtitles to download (optional) separated by commas, use IETF language tags like 'en,pt'

So for example, to list all subs for a video:

youtube-dl --list-subs https://www.youtube.com/watch?v=Ye8mB6VsUHw

To download all subs, but not the video:

youtube-dl --all-subs --skip-download https://www.youtube.com/watch?v=Ye8mB6VsUHw

If a video only has auto-generated subtitles, --all-subs still won't download them. Instead, use:

youtube-dl --write-auto-sub --skip-download https://www.youtube.com/watch?v=Ye8mB6VsUHw
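
Since the question asks about a list of videos, the same flags can be applied to each URL in turn. Below is a minimal Python sketch of that idea, not something that ships with youtube-dl: the helper names `build_command` and `download_subtitles` and the one-item URL list are made up for illustration, and it assumes youtube-dl is on your PATH. It only combines flags from the option list above.

```python
import shlex
import subprocess


def build_command(url, langs=None, auto=False):
    """Return a youtube-dl argument list that fetches subtitles only.

    Hypothetical helper: it only combines flags from the subtitle
    options quoted above.
    """
    cmd = ["youtube-dl", "--skip-download"]
    # --write-auto-sub is needed when only auto-generated subtitles exist.
    cmd.append("--write-auto-sub" if auto else "--write-sub")
    if langs:
        cmd += ["--sub-lang", ",".join(langs)]
    cmd.append(url)
    return cmd


def download_subtitles(urls, langs=None, auto=False):
    """Run youtube-dl once per URL; assumes youtube-dl is on PATH."""
    for url in urls:
        # check=False so that one failing video does not stop the rest.
        subprocess.run(build_command(url, langs, auto), check=False)


# Placeholder list; replace it with the videos you actually need.
urls = ["https://www.youtube.com/watch?v=Ye8mB6VsUHw"]
for url in urls:
    # Dry run: print each command instead of executing it.
    print(shlex.join(build_command(url, langs=["en"])))
```

Call `download_subtitles(urls, langs=["en"])` to actually run the commands. As far as I know, youtube-dl also accepts several URLs on one command line and has a `-a` / `--batch-file FILE` option that reads one URL per line from a file, which may be simpler than a script.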
rogerdpack
  • 2,394
l'L'l
  • 3,758
85

Or you can download the subtitles for just one language:

youtube-dl --write-sub --sub-lang en --skip-download URL 
m3asmi
  • 951
31

Just run the following command:

youtube-dl --write-auto-sub --convert-subs=srt --skip-download URL 

For example, if you are downloading https://www.youtube.com/watch?v=example with the title "example", --convert-subs=srt will write a file named example.en.srt, where en stands for English, es for Spanish, and so on.

The file will have something like this:

00:00:04.259 --> 00:00:05.259
>> I’m Elon Musk.

00:00:05.259 --> 00:00:06.669
>> What is your claim to fame?

00:00:06.669 --> 00:00:07.669
>> I’m the founder of

00:00:07.669 --> 00:00:08.669
Tesla.com.
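
If you want to process such a file in a program, the timing lines can be separated from the text. Here is a minimal Python sketch, assuming each cue is a `start --> end` line followed by one or more text lines. `parse_cues` and `to_seconds` are hypothetical names, the sample string is typed in by hand, and real subtitle files may contain headers or tags that this does not handle.

```python
import re

# Matches "HH:MM:SS.mmm --> HH:MM:SS.mmm" (SRT uses a comma before the ms).
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[.,](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[.,](\d{3})"
)


def to_seconds(h, m, s, ms):
    """Convert timestamp fields (strings) to seconds as a float."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_cues(text):
    """Return a list of (start, end, text) tuples from subtitle text."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.match(line.strip())
            if match:
                g = match.groups()
                # Everything after the timing line is the cue text.
                body = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((to_seconds(*g[:4]), to_seconds(*g[4:]), body))
                break
    return cues


sample = """00:00:04.259 --> 00:00:05.259
>> I'm Elon Musk.

00:00:05.259 --> 00:00:06.669
>> What is your claim to fame?
"""

for start, end, body in parse_cues(sample):
    print(f"{start:.3f} -> {end:.3f}: {body}")
```

Blocks without a timing line, such as a file header, are skipped, and a cue number placed before the timing line is ignored.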

OPTIONAL - If you need the text cleaned up, you can use Python to tidy it a little:

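import re

bad_words = ['-->', '</c>']
# The speaker marker may appear as ">> " or HTML-escaped as "&gt;&gt; ".
prefix = re.compile(r"^(?:&gt;&gt;|>>) ")

# Pass 1: drop timing lines and lines containing inline </c> tags.
# Adjust the input name to the subtitle file that was actually written.
with open('example.en.vtt') as oldfile, open('newfile.txt', 'w') as newfile:
    for line in oldfile:
        if not any(bad_word in line for bad_word in bad_words):
            newfile.write(line)

# Pass 2: strip the speaker marker, then remove duplicate lines.
# dict.fromkeys keeps the first occurrence of each line in file order,
# whereas a set would shuffle the lines.
with open('newfile.txt') as result, open('sub_out.txt', 'w') as rmdup:
    mylst = [re.sub(prefix, "", each) for each in result]
    uniqlines = list(dict.fromkeys(mylst))
    print(uniqlines)
    rmdup.writelines(uniqlines)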

Output sub_out.txt:

I’m Elon Musk.
What is your claim to fame?
I’m the founder of
Tesla.com.
alex
  • 113
3

Another simple way to download subtitles from YouTube is to download Google2SRT. Google2SRT is a free, open source program for Windows, Mac and Linux that is able to download, save and convert multiple subtitles from YouTube videos.

Usage


  1. Paste the URL in the Google subtitles text box and click Read.

  2. Choose the language by selecting the appropriate check box provided and press Go.

  3. Open the destination folder entered in the SRT subtitles text box to find the SRT files.

0

youtube-dl has been forked as yt-dlp, and the equivalent command is:

yt-dlp --write-auto-sub --convert-subs=srt --skip-download https://www.youtube.com/watch?v=example

Since I needed only the text, I adapted the answer of @Hernan Pesantez as follows to clean up the downloaded file.

import re, sys

bad_words = ['-->','</c>']

new_lines = []
prefix = re.compile(r"^&gt;&gt; ")
# Usage: pass the subtitle file as the first command-line argument.
with open(sys.argv[1]) as oldfile:
    for line in oldfile:
        line = line.strip()
        if not line:
            continue
        line = re.sub(prefix, "", line)
        if any(bad_word in line for bad_word in bad_words):
            # Timing line: also drop the cue number collected just before it.
            if new_lines and re.match(r'^\d+$', new_lines[-1]):
                new_lines.pop()
            continue
        # A line that repeats or extends the previous one replaces it.
        if new_lines and line.startswith(new_lines[-1]):
            new_lines.pop()
        new_lines.append(line)

for line in new_lines:
    print(line)
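
To see what the script does without downloading anything, the same logic can be wrapped in a function and run on a small in-memory sample. This is only a sketch: `clean_lines` is a hypothetical name, and the sample text is made up to imitate the rolling, repeated lines that auto-generated captions tend to have in SRT form.

```python
import re

BAD_WORDS = ("-->", "</c>")
PREFIX = re.compile(r"^&gt;&gt; ")


def clean_lines(lines):
    """Hypothetical function form of the script above, for in-memory use."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        line = PREFIX.sub("", line)
        if any(bad in line for bad in BAD_WORDS):
            # Timing line: also drop the SRT cue number just before it.
            if kept and re.match(r"^\d+$", kept[-1]):
                kept.pop()
            continue
        # If this line repeats or extends the previous one, replace it.
        if kept and line.startswith(kept[-1]):
            kept.pop()
        kept.append(line)
    return kept


# Made-up sample, not taken from a real video.
sample = """1
00:00:01,000 --> 00:00:02,000
hello

2
00:00:02,000 --> 00:00:03,000
hello
hello world

3
00:00:03,000 --> 00:00:04,000
&gt;&gt; goodbye
"""

print(clean_lines(sample.splitlines()))
# prints ['hello world', 'goodbye']
```

Note that only consecutive repeats are merged, so a line that reappears later in the video is kept.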

alex
  • 113