We need to keep the orginal subtitles information, so that the '--load-info' option can be used to list or select the subtitles again.
We'll also be able to have a separate field for storing the automatic captions info.
For each language the extractor builds a list with the available formats sorted (like for video formats), then YoutubeDL selects one of them using the '--sub-format' option which now allows giving the format preferences (for example 'ass/srt/best').
For each format the 'url' field can be set so that we only download the contents if needed, or if the contents needs to be processed (like in crunchyroll) the 'data' field can be used.
The reasons for this change are:
* We weren't checking that the format given with '--sub-format' was available, checking it in each extractor would be repetitive.
* It allows to easily support giving a format preference.
* The subtitles were automatically downloaded in the extractor, but I think that if you use for example the '--dump-json' option you want to finish as fast as possible.
Currently only the ted extractor has been updated, but the old system still works.
If you run 'while read aurl ; do youtube-dl --extract-audio "${aurl}"; done < path_to_batch_file' (batch_file contains one url per line) each call to youtube-dl consumed some characters and 'read' would assing to 'aurl' a non valid url, something like 'tube.com/watch?v=<id>'.
Do not throw errors if view count or upload date extraction fails.
Dispose of re.MULTILINE, which had absolutely no effect without any ^ or $ in sight.
Follow PEP8 naming conventions.
* E402: we exectute code between imports, like modifying 'sys.path' in the tests
* E731: we assign to lambdas in a lot of places, we may want to consider defining functions in a single line instead (what pep8 recommends)