release 2018.09.01

[ChangeLog] Actualize
[ci skip]
2025-10-21 07:38:37 +09:00 · 2018-09-01 18:40:23 +07:00 · 2018-09-01 18:36:18 +07:00 · 2018-09-01 16:42:30 +07:00 · 2018-09-01 16:04:45 +07:00 · 2018-09-01 10:04:10 +01:00
12 changed files with 447 additions and 202 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@
 ---
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.08.28*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.09.01*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.08.28**
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.09.01**
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -36,7 +36,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2018.08.28
+[debug] youtube-dl version 2018.09.01
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,19 @@
+version 2018.09.01
+Core
+* [utils] Skip remote IP addresses non matching to source address' IP version
+  when creating a connection (#13422, #17362)
+Extractors
++ [ard] Add support for one.ard.de (#17397)
+* [niconico] Fix extraction on python3 (#17393, #17407)
+* [ard] Extract f4m formats
+* [crunchyroll] Parse vilos media data (#17343)
++ [ard] Add support for Beta ARD Mediathek
++ [bandcamp] Extract more metadata (#13197)
+* [internazionale] Fix extraction of non-available-abroad videos (#17386)
 version 2018.08.28
 Extractors
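Note on the [utils] entry above: when a source address is configured, remote addresses of the other IP family are skipped before a connection is attempted. The following is a minimal sketch of that filtering step, not the code from `utils`; `filter_by_source_family` and the sample addresses are hypothetical, and treating a ':' in the literal as "IPv6" is an assumption made for illustration.

```python
import socket


def filter_by_source_family(addrinfos, source_ip):
    """Keep getaddrinfo-style tuples whose family matches source_ip's family."""
    # Assumption: a ':' in the literal means IPv6, otherwise IPv4.
    wanted = socket.AF_INET6 if ':' in source_ip else socket.AF_INET
    return [info for info in addrinfos if info[0] == wanted]


# Hand-written entries shaped like socket.getaddrinfo() results
# (documentation-range addresses, no network access needed).
candidates = [
    (socket.AF_INET, socket.SOCK_STREAM, 6, '', ('192.0.2.1', 443)),
    (socket.AF_INET6, socket.SOCK_STREAM, 6, '', ('2001:db8::1', 443, 0, 0)),
]

ipv4_only = filter_by_source_family(candidates, '198.51.100.7')
ipv6_only = filter_by_source_family(candidates, '2001:db8::2')
```

With an IPv4 source address only the IPv4 candidate survives, and vice versa; an empty result would mean no remote address is usable for that source address.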
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -56,6 +56,7 @@
 - **archive.org**: archive.org videos
 - **ARD**
 - **ARD:mediathek**
+ - **ARDBetaMediathek**
 - **Arkena**
 - **arte.tv**
 - **arte.tv:+7**
@@ -191,7 +192,7 @@
 - **Crackle**
 - **Criterion**
 - **CrooksAndLiars**
- - **Crunchyroll**
+ - **crunchyroll**
 - **crunchyroll:playlist**
 - **CSNNE**
 - **CSpan**: C-SPAN
--- a/youtube_dl/extractor/ard.py
+++ b/youtube_dl/extractor/ard.py
@@ -21,7 +21,7 @@ from ..compat import compat_etree_fromstring
 class ARDMediathekIE(InfoExtractor):
    IE_NAME = 'ARD:mediathek'
-    _VALID_URL = r'^https?://(?:(?:www\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
+    _VALID_URL = r'^https?://(?:(?:www\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de|one\.ard\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
    _TESTS = [{
        # available till 26.07.2022
@@ -37,6 +37,9 @@ class ARDMediathekIE(InfoExtractor):
            # m3u8 download
            'skip_download': True,
        }
+    }, {
+        'url': 'https://one.ard.de/tv/Mord-mit-Aussicht/Mord-mit-Aussicht-6-39-T%C3%B6dliche-Nach/ONE/Video?bcastId=46384294&documentId=55586872',
+        'only_matching': True,
    }, {
        # audio
        'url': 'http://www.ardmediathek.de/tv/WDR-H%C3%B6rspiel-Speicher/Tod-eines-Fu%C3%9Fballers/WDR-3/Audio-Podcast?documentId=28488308&bcastId=23074086',
@@ -282,3 +285,76 @@ class ARDIE(InfoExtractor):
            'upload_date': upload_date,
            'thumbnail': thumbnail,
        }
 class ARDBetaMediathekIE(InfoExtractor):
    _VALID_URL = r'https://beta\.ardmediathek\.de/[a-z]+/player/(?P<video_id>[a-zA-Z0-9]+)/(?P<display_id>[^/?#]+)'
    _TESTS = [{
        'url': 'https://beta.ardmediathek.de/ard/player/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE/die-robuste-roswita',
        'md5': '2d02d996156ea3c397cfc5036b5d7f8f',
        'info_dict': {
            'display_id': 'die-robuste-roswita',
            'id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
            'title': 'Tatort: Die robuste Roswita',
            'description': r're:^Der Mord.*trüber ist als die Ilm.',
            'duration': 5316,
            'thumbnail': 'https://img.ardmediathek.de/standard/00/55/43/59/34/-1774185891/16x9/960?mandant=ard',
            'upload_date': '20180826',
            'ext': 'mp4',
        },
    }]
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('video_id')
        display_id = mobj.group('display_id')
        webpage = self._download_webpage(url, display_id)
        data_json = self._search_regex(r'window\.__APOLLO_STATE__\s*=\s*(\{.*);\n', webpage, 'json')
        data = self._parse_json(data_json, display_id)
        res = {
            'id': video_id,
            'display_id': display_id,
        }
        formats = []
        for widget in data.values():
            if widget.get('_geoblocked'):
                raise ExtractorError('This video is not available due to geoblocking', expected=True)
            if '_duration' in widget:
                res['duration'] = widget['_duration']
            if 'clipTitle' in widget:
                res['title'] = widget['clipTitle']
            if '_previewImage' in widget:
                res['thumbnail'] = widget['_previewImage']
            if 'broadcastedOn' in widget:
                res['upload_date'] = unified_strdate(widget['broadcastedOn'])
            if 'synopsis' in widget:
                res['description'] = widget['synopsis']
            if '_subtitleUrl' in widget:
                res['subtitles'] = {'de': [{
                    'ext': 'ttml',
                    'url': widget['_subtitleUrl'],
                }]}
            if '_quality' in widget:
                format_url = widget['_stream']['json'][0]
                if format_url.endswith('.f4m'):
                    formats.extend(self._extract_f4m_formats(
                        format_url + '?hdcore=3.11.0',
                        video_id, f4m_id='hds', fatal=False))
                elif format_url.endswith('m3u8'):
                    formats.extend(self._extract_m3u8_formats(
                        format_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
                else:
                    formats.append({
                        'format_id': 'http-' + widget['_quality'],
                        'url': format_url,
                        'preference': 10,  # Plain HTTP, that's nice
                    })
        self._sort_formats(formats)
        res['formats'] = formats
        return res
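Note on `ARDBetaMediathekIE._real_extract` above: the extractor walks every widget in the page's `window.__APOLLO_STATE__` JSON, copies known keys into the result, and picks a format handler from the stream URL's suffix. The sketch below reproduces only that control flow with plain dictionaries. `classify_stream_url`, `collect_metadata` and `sample_state` are hypothetical names and data, and nothing here calls youtube-dl.

```python
def classify_stream_url(format_url):
    """Return 'hds', 'hls' or 'http' using the same suffix checks as the diff."""
    if format_url.endswith('.f4m'):
        return 'hds'
    if format_url.endswith('m3u8'):
        return 'hls'
    return 'http'


def collect_metadata(state):
    """Merge known keys from every widget dict into one result dict."""
    key_map = {
        '_duration': 'duration',
        'clipTitle': 'title',
        '_previewImage': 'thumbnail',
        'synopsis': 'description',
    }
    res = {}
    for widget in state.values():
        for src, dst in key_map.items():
            if src in widget:
                res[dst] = widget[src]
    return res


# Hypothetical, heavily reduced stand-in for window.__APOLLO_STATE__.
sample_state = {
    'a': {'clipTitle': 'Example clip', '_duration': 5316},
    'b': {'synopsis': 'Example synopsis'},
}
meta = collect_metadata(sample_state)
```

Because the metadata is spread over several widgets, the result is assembled incrementally; if two widgets carried the same key, the later one would win, which matches the loop in the diff.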
--- a/youtube_dl/extractor/bandcamp.py
+++ b/youtube_dl/extractor/bandcamp.py
@@ -1,6 +1,5 @@
 from __future__ import unicode_literals
-import json
 import random
 import re
 import time
@@ -16,15 +15,18 @@ from ..utils import (
    int_or_none,
    KNOWN_EXTENSIONS,
    parse_filesize,
+    str_or_none,
+    try_get,
    unescapeHTML,
    update_url_query,
    unified_strdate,
+    unified_timestamp,
    url_or_none,
 )
 class BandcampIE(InfoExtractor):
-    _VALID_URL = r'https?://.*?\.bandcamp\.com/track/(?P<title>[^/?#&]+)'
+    _VALID_URL = r'https?://[^/]+\.bandcamp\.com/track/(?P<title>[^/?#&]+)'
    _TESTS = [{
        'url': 'http://youtube-dl.bandcamp.com/track/youtube-dl-test-song',
        'md5': 'c557841d5e50261777a6585648adf439',
@@ -36,13 +38,44 @@ class BandcampIE(InfoExtractor):
        },
        '_skip': 'There is a limit of 200 free downloads / month for the test song'
    }, {
        # free download
        'url': 'http://benprunty.bandcamp.com/track/lanius-battle',
-        'md5': '0369ace6b939f0927e62c67a1a8d9fa7',
+        'md5': '853e35bf34aa1d6fe2615ae612564b36',
        'info_dict': {
            'id': '2650410135',
            'ext': 'aiff',
            'title': 'Ben Prunty - Lanius (Battle)',
            'thumbnail': r're:^https?://.*\.jpg$',
            'uploader': 'Ben Prunty',
            'timestamp': 1396508491,
            'upload_date': '20140403',
            'release_date': '20140403',
            'duration': 260.877,
            'track': 'Lanius (Battle)',
            'track_number': 1,
            'track_id': '2650410135',
            'artist': 'Ben Prunty',
            'album': 'FTL: Advanced Edition Soundtrack',
        },
    }, {
        # no free download, mp3 128
        'url': 'https://relapsealumni.bandcamp.com/track/hail-to-fire',
        'md5': 'fec12ff55e804bb7f7ebeb77a800c8b7',
        'info_dict': {
            'id': '2584466013',
            'ext': 'mp3',
            'title': 'Mastodon - Hail to Fire',
            'thumbnail': r're:^https?://.*\.jpg$',
            'uploader': 'Mastodon',
            'timestamp': 1322005399,
            'upload_date': '20111122',
            'release_date': '20040207',
            'duration': 120.79,
            'track': 'Hail to Fire',
            'track_number': 5,
            'track_id': '2584466013',
            'artist': 'Mastodon',
            'album': 'Call of the Mastodon',
        },
    }]
@@ -51,19 +84,23 @@ class BandcampIE(InfoExtractor):
        title = mobj.group('title')
        webpage = self._download_webpage(url, title)
        thumbnail = self._html_search_meta('og:image', webpage, default=None)
        m_download = re.search(r'freeDownloadPage: "(.*?)"', webpage)
        if not m_download:
            m_trackinfo = re.search(r'trackinfo: (.+),\s*?\n', webpage)
            if m_trackinfo:
                json_code = m_trackinfo.group(1)
                data = json.loads(json_code)[0]
                track_id = compat_str(data['id'])
-                if not data.get('file'):
+        track_id = None
-                    raise ExtractorError('Not streamable', video_id=track_id, expected=True)
+        track = None
        track_number = None
        duration = None
-                formats = []
+        formats = []
-                for format_id, format_url in data['file'].items():
+        track_info = self._parse_json(
            self._search_regex(
                r'trackinfo\s*:\s*\[\s*({.+?})\s*\]\s*,\s*?\n',
                webpage, 'track info', default='{}'), title)
        if track_info:
            file_ = track_info.get('file')
            if isinstance(file_, dict):
                for format_id, format_url in file_.items():
                    if not url_or_none(format_url):
                        continue
                    ext, abr_str = format_id.split('-', 1)
                    formats.append({
                        'format_id': format_id,
@@ -73,85 +110,110 @@ class BandcampIE(InfoExtractor):
                        'acodec': ext,
                        'abr': int_or_none(abr_str),
                    })
            track = track_info.get('title')
            track_id = str_or_none(track_info.get('track_id') or track_info.get('id'))
            track_number = int_or_none(track_info.get('track_num'))
            duration = float_or_none(track_info.get('duration'))
-                self._sort_formats(formats)
+        def extract(key):
            return self._search_regex(
                r'\b%s\s*["\']?\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1' % key,
                webpage, key, default=None, group='value')
-                return {
+        artist = extract('artist')
-                    'id': track_id,
+        album = extract('album_title')
-                    'title': data['title'],
+        timestamp = unified_timestamp(
-                    'thumbnail': thumbnail,
+            extract('publish_date') or extract('album_publish_date'))
-                    'formats': formats,
+        release_date = unified_strdate(extract('album_release_date'))
                    'duration': float_or_none(data.get('duration')),
                }
            else:
                raise ExtractorError('No free songs found')
-        download_link = m_download.group(1)
+        download_link = self._search_regex(
-        video_id = self._search_regex(
+            r'freeDownloadPage\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1', webpage,
-            r'(?ms)var TralbumData = .*?[{,]\s*id: (?P<id>\d+),?$',
+            'download link', default=None, group='url')
-            webpage, 'video id')
+        if download_link:
            track_id = self._search_regex(
                r'(?ms)var TralbumData = .*?[{,]\s*id: (?P<id>\d+),?$',
                webpage, 'track id')
-        download_webpage = self._download_webpage(
+            download_webpage = self._download_webpage(
-            download_link, video_id, 'Downloading free downloads page')
+                download_link, track_id, 'Downloading free downloads page')
-        blob = self._parse_json(
+            blob = self._parse_json(
-            self._search_regex(
+                self._search_regex(
-                r'data-blob=(["\'])(?P<blob>{.+?})\1', download_webpage,
+                    r'data-blob=(["\'])(?P<blob>{.+?})\1', download_webpage,
-                'blob', group='blob'),
+                    'blob', group='blob'),
-            video_id, transform_source=unescapeHTML)
+                track_id, transform_source=unescapeHTML)
-        info = blob['digital_items'][0]
+            info = try_get(
                blob, (lambda x: x['digital_items'][0],
                       lambda x: x['download_items'][0]), dict)
            if info:
                downloads = info.get('downloads')
                if isinstance(downloads, dict):
                    if not track:
                        track = info.get('title')
                    if not artist:
                        artist = info.get('artist')
                    if not thumbnail:
                        thumbnail = info.get('thumb_url')
-        downloads = info['downloads']
+                    download_formats = {}
-        track = info['title']
+                    download_formats_list = blob.get('download_formats')
                    if isinstance(download_formats_list, list):
                        for f in blob['download_formats']:
                            name, ext = f.get('name'), f.get('file_extension')
                            if all(isinstance(x, compat_str) for x in (name, ext)):
                                download_formats[name] = ext.strip('.')
-        artist = info.get('artist')
+                    for format_id, f in downloads.items():
-        title = '%s - %s' % (artist, track) if artist else track
+                        format_url = f.get('url')
                        if not format_url:
                            continue
                        # Stat URL generation algorithm is reverse engineered from
                        # download_*_bundle_*.js
                        stat_url = update_url_query(
                            format_url.replace('/download/', '/statdownload/'), {
                                '.rand': int(time.time() * 1000 * random.random()),
                            })
                        format_id = f.get('encoding_name') or format_id
                        stat = self._download_json(
                            stat_url, track_id, 'Downloading %s JSON' % format_id,
                            transform_source=lambda s: s[s.index('{'):s.rindex('}') + 1],
                            fatal=False)
                        if not stat:
                            continue
                        retry_url = url_or_none(stat.get('retry_url'))
                        if not retry_url:
                            continue
                        formats.append({
                            'url': self._proto_relative_url(retry_url, 'http:'),
                            'ext': download_formats.get(format_id),
                            'format_id': format_id,
                            'format_note': f.get('description'),
                            'filesize': parse_filesize(f.get('size_mb')),
                            'vcodec': 'none',
                        })
        download_formats = {}
        for f in blob['download_formats']:
            name, ext = f.get('name'), f.get('file_extension')
            if all(isinstance(x, compat_str) for x in (name, ext)):
                download_formats[name] = ext.strip('.')
        formats = []
        for format_id, f in downloads.items():
            format_url = f.get('url')
            if not format_url:
                continue
            # Stat URL generation algorithm is reverse engineered from
            # download_*_bundle_*.js
            stat_url = update_url_query(
                format_url.replace('/download/', '/statdownload/'), {
                    '.rand': int(time.time() * 1000 * random.random()),
                })
            format_id = f.get('encoding_name') or format_id
            stat = self._download_json(
                stat_url, video_id, 'Downloading %s JSON' % format_id,
                transform_source=lambda s: s[s.index('{'):s.rindex('}') + 1],
                fatal=False)
            if not stat:
                continue
            retry_url = url_or_none(stat.get('retry_url'))
            if not retry_url:
                continue
            formats.append({
                'url': self._proto_relative_url(retry_url, 'http:'),
                'ext': download_formats.get(format_id),
                'format_id': format_id,
                'format_note': f.get('description'),
                'filesize': parse_filesize(f.get('size_mb')),
                'vcodec': 'none',
            })
        self._sort_formats(formats)
        title = '%s - %s' % (artist, track) if artist else track
        if not duration:
            duration = float_or_none(self._html_search_meta(
                'duration', webpage, default=None))
        return {
-            'id': video_id,
+            'id': track_id,
            'title': title,
-            'thumbnail': info.get('thumb_url') or thumbnail,
+            'thumbnail': thumbnail,
-            'uploader': info.get('artist'),
+            'uploader': artist,
-            'artist': artist,
+            'timestamp': timestamp,
            'release_date': release_date,
            'duration': duration,
            'track': track,
            'track_number': track_number,
            'track_id': track_id,
            'artist': artist,
            'album': album,
            'formats': formats,
        }
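Note on the bandcamp changes above: two small techniques carry most of the new metadata extraction. The local `extract(key)` helper pulls a quoted value for a key out of inline JavaScript, and the `transform_source` lambda trims the stat response to its outermost `{...}` span before JSON parsing. The sketch below shows both in isolation with the standard library; `extract_js_value`, `unwrap_json` and the sample strings are hypothetical and only mirror the patterns visible in the diff.

```python
import json
import re


def extract_js_value(page, key):
    """Return the quoted value assigned to `key` in inline JavaScript, or None."""
    # Same pattern shape as the diff's extract(): optional quote after the key,
    # then a value delimited by matching single or double quotes.
    m = re.search(
        r'\b%s\s*["\']?\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1' % key, page)
    return m.group('value') if m else None


def unwrap_json(text):
    """Cut out the outermost {...} span and parse it as JSON."""
    return json.loads(text[text.index('{'):text.rindex('}') + 1])


# Hypothetical page fragment and stat response, for illustration only.
page = 'artist: "Ben Prunty",\nalbum_title : \'FTL: Advanced Edition Soundtrack\',\n'
stat = unwrap_json('callback({"retry_url": "//example.invalid/file"});')

artist = extract_js_value(page, 'artist')
album = extract_js_value(page, 'album_title')
missing = extract_js_value(page, 'publish_date')
```

Returning `None` for a missing key is what lets the diff chain fallbacks such as `extract('publish_date') or extract('album_publish_date')`.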
--- a/youtube_dl/extractor/crunchyroll.py
+++ b/youtube_dl/extractor/crunchyroll.py
@@ -8,6 +8,7 @@ import zlib
 from hashlib import sha1
 from math import pow, sqrt, floor
 from .common import InfoExtractor
+from .vrv import VRVIE
 from ..compat import (
    compat_b64decode,
    compat_etree_fromstring,
@@ -18,6 +19,8 @@ from ..compat import (
 from ..utils import (
    ExtractorError,
    bytes_to_intlist,
+    extract_attributes,
+    float_or_none,
    intlist_to_bytes,
    int_or_none,
    lowercase_escape,
@@ -26,7 +29,6 @@ from ..utils import (
    unified_strdate,
    urlencode_postdata,
    xpath_text,
-    extract_attributes,
 )
 from ..aes import (
    aes_cbc_decrypt,
@@ -139,7 +141,8 @@ class CrunchyrollBaseIE(InfoExtractor):
            parsed_url._replace(query=compat_urllib_parse_urlencode(qs, True)))
-class CrunchyrollIE(CrunchyrollBaseIE):
+class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
    IE_NAME = 'crunchyroll'
    _VALID_URL = r'https?://(?:(?P<prefix>www|m)\.)?(?P<url>crunchyroll\.(?:com|fr)/(?:media(?:-|/\?id=)|[^/]*/[^/?&]*?)(?P<video_id>[0-9]+))(?:[/?&]|$)'
    _TESTS = [{
        'url': 'http://www.crunchyroll.com/wanna-be-the-strongest-in-the-world/episode-1-an-idol-wrestler-is-born-645513',
@@ -148,7 +151,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
            'ext': 'mp4',
            'title': 'Wanna be the Strongest in the World Episode 1 – An Idol-Wrestler is Born!',
            'description': 'md5:2d17137920c64f2f49981a7797d275ef',
-            'thumbnail': 'http://img1.ak.crunchyroll.com/i/spire1-tmb/20c6b5e10f1a47b10516877d3c039cae1380951166_full.jpg',
+            'thumbnail': r're:^https?://.*\.jpg$',
            'uploader': 'Yomiuri Telecasting Corporation (YTV)',
            'upload_date': '20131013',
            'url': 're:(?!.*&amp)',
@@ -221,7 +224,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
        'info_dict': {
            'id': '535080',
            'ext': 'mp4',
-            'title': '11eyes Episode 1 – Piros éjszaka - Red Night',
+            'title': '11eyes Episode 1 – Red Night ~ Piros éjszaka',
            'description': 'Kakeru and Yuka are thrown into an alternate nightmarish world they call "Red Night".',
            'uploader': 'Marvelous AQL Inc.',
            'upload_date': '20091021',
@@ -437,13 +440,18 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
        if 'To view this, please log in to verify you are 18 or older.' in webpage:
            self.raise_login_required()
        media = self._parse_json(self._search_regex(
            r'vilos\.config\.media\s*=\s*({.+?});',
            webpage, 'vilos media', default='{}'), video_id)
        media_metadata = media.get('metadata') or {}
        video_title = self._html_search_regex(
            r'(?s)<h1[^>]*>((?:(?!<h1).)*?<span[^>]+itemprop=["\']title["\'][^>]*>(?:(?!<h1).)+?)</h1>',
            webpage, 'video_title')
        video_title = re.sub(r' {2,}', ' ', video_title)
-        video_description = self._parse_json(self._html_search_regex(
+        video_description = (self._parse_json(self._html_search_regex(
            r'<script[^>]*>\s*.+?\[media_id=%s\].+?({.+?"description"\s*:.+?})\);' % video_id,
-            webpage, 'description', default='{}'), video_id).get('description')
+            webpage, 'description', default='{}'), video_id) or media_metadata).get('description')
        if video_description:
            video_description = lowercase_escape(video_description.replace(r'\r\n', '\n'))
        video_upload_date = self._html_search_regex(
@@ -456,91 +464,99 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
            [r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>'],
            webpage, 'video_uploader', fatal=False)
        available_fmts = []
        for a, fmt in re.findall(r'(<a[^>]+token=["\']showmedia\.([0-9]{3,4})p["\'][^>]+>)', webpage):
            attrs = extract_attributes(a)
            href = attrs.get('href')
            if href and '/freetrial' in href:
                continue
            available_fmts.append(fmt)
        if not available_fmts:
            for p in (r'token=["\']showmedia\.([0-9]{3,4})p"', r'showmedia\.([0-9]{3,4})p'):
                available_fmts = re.findall(p, webpage)
                if available_fmts:
                    break
        video_encode_ids = []
        formats = []
-        for fmt in available_fmts:
+        for stream in media.get('streams', []):
-            stream_quality, stream_format = self._FORMAT_IDS[fmt]
+            formats.extend(self._extract_vrv_formats(
-            video_format = fmt + 'p'
+                stream.get('url'), video_id, stream.get('format'),
-            stream_infos = []
+                stream.get('audio_lang'), stream.get('hardsub_lang')))
-            streamdata = self._call_rpc_api(
+        if not formats:
-                'VideoPlayer_GetStandardConfig', video_id,
+            available_fmts = []
-                'Downloading media info for %s' % video_format, data={
+            for a, fmt in re.findall(r'(<a[^>]+token=["\']showmedia\.([0-9]{3,4})p["\'][^>]+>)', webpage):
-                    'media_id': video_id,
+                attrs = extract_attributes(a)
-                    'video_format': stream_format,
+                href = attrs.get('href')
-                    'video_quality': stream_quality,
+                if href and '/freetrial' in href:
-                    'current_page': url,
+                    continue
-                })
+                available_fmts.append(fmt)
-            if streamdata is not None:
+            if not available_fmts:
-                stream_info = streamdata.find('./{default}preload/stream_info')
+                for p in (r'token=["\']showmedia\.([0-9]{3,4})p"', r'showmedia\.([0-9]{3,4})p'):
                    available_fmts = re.findall(p, webpage)
                    if available_fmts:
                        break
            if not available_fmts:
                available_fmts = self._FORMAT_IDS.keys()
            video_encode_ids = []
            for fmt in available_fmts:
                stream_quality, stream_format = self._FORMAT_IDS[fmt]
                video_format = fmt + 'p'
                stream_infos = []
                streamdata = self._call_rpc_api(
                    'VideoPlayer_GetStandardConfig', video_id,
                    'Downloading media info for %s' % video_format, data={
                        'media_id': video_id,
                        'video_format': stream_format,
                        'video_quality': stream_quality,
                        'current_page': url,
                    })
                if streamdata is not None:
                    stream_info = streamdata.find('./{default}preload/stream_info')
                    if stream_info is not None:
                        stream_infos.append(stream_info)
                stream_info = self._call_rpc_api(
                    'VideoEncode_GetStreamInfo', video_id,
                    'Downloading stream info for %s' % video_format, data={
                        'media_id': video_id,
                        'video_format': stream_format,
                        'video_encode_quality': stream_quality,
                    })
                if stream_info is not None:
                    stream_infos.append(stream_info)
-            stream_info = self._call_rpc_api(
+                for stream_info in stream_infos:
-                'VideoEncode_GetStreamInfo', video_id,
+                    video_encode_id = xpath_text(stream_info, './video_encode_id')
-                'Downloading stream info for %s' % video_format, data={
+                    if video_encode_id in video_encode_ids:
-                    'media_id': video_id,
+                        continue
-                    'video_format': stream_format,
+                    video_encode_ids.append(video_encode_id)
                    'video_encode_quality': stream_quality,
                })
            if stream_info is not None:
                stream_infos.append(stream_info)
            for stream_info in stream_infos:
                video_encode_id = xpath_text(stream_info, './video_encode_id')
                if video_encode_id in video_encode_ids:
                    continue
                video_encode_ids.append(video_encode_id)
-                video_file = xpath_text(stream_info, './file')
+                    video_file = xpath_text(stream_info, './file')
-                if not video_file:
+                    if not video_file:
-                    continue
+                        continue
-                if video_file.startswith('http'):
+                    if video_file.startswith('http'):
-                    formats.extend(self._extract_m3u8_formats(
+                        formats.extend(self._extract_m3u8_formats(
-                        video_file, video_id, 'mp4', entry_protocol='m3u8_native',
+                            video_file, video_id, 'mp4', entry_protocol='m3u8_native',
-                        m3u8_id='hls', fatal=False))
+                            m3u8_id='hls', fatal=False))
                    continue
                video_url = xpath_text(stream_info, './host')
                if not video_url:
                    continue
                metadata = stream_info.find('./metadata')
                format_info = {
                    'format': video_format,
                    'height': int_or_none(xpath_text(metadata, './height')),
                    'width': int_or_none(xpath_text(metadata, './width')),
                }
                if '.fplive.net/' in video_url:
                    video_url = re.sub(r'^rtmpe?://', 'http://', video_url.strip())
                    parsed_video_url = compat_urlparse.urlparse(video_url)
                    direct_video_url = compat_urlparse.urlunparse(parsed_video_url._replace(
                        netloc='v.lvlt.crcdn.net',
                        path='%s/%s' % (remove_end(parsed_video_url.path, '/'), video_file.split(':')[-1])))
                    if self._is_valid_url(direct_video_url, video_id, video_format):
                        format_info.update({
                            'format_id': 'http-' + video_format,
                            'url': direct_video_url,
                        })
                        formats.append(format_info)
                        continue
-                format_info.update({
+                    video_url = xpath_text(stream_info, './host')
-                    'format_id': 'rtmp-' + video_format,
+                    if not video_url:
-                    'url': video_url,
+                        continue
-                    'play_path': video_file,
+                    metadata = stream_info.find('./metadata')
-                    'ext': 'flv',
+                    format_info = {
-                })
+                        'format': video_format,
-                formats.append(format_info)
+                        'height': int_or_none(xpath_text(metadata, './height')),
                        'width': int_or_none(xpath_text(metadata, './width')),
                    }
                    if '.fplive.net/' in video_url:
                        video_url = re.sub(r'^rtmpe?://', 'http://', video_url.strip())
                        parsed_video_url = compat_urlparse.urlparse(video_url)
                        direct_video_url = compat_urlparse.urlunparse(parsed_video_url._replace(
                            netloc='v.lvlt.crcdn.net',
                            path='%s/%s' % (remove_end(parsed_video_url.path, '/'), video_file.split(':')[-1])))
                        if self._is_valid_url(direct_video_url, video_id, video_format):
                            format_info.update({
                                'format_id': 'http-' + video_format,
                                'url': direct_video_url,
                            })
                            formats.append(format_info)
                            continue
                    format_info.update({
                        'format_id': 'rtmp-' + video_format,
                        'url': video_url,
                        'play_path': video_file,
                        'ext': 'flv',
                    })
                    formats.append(format_info)
        self._sort_formats(formats, ('height', 'width', 'tbr', 'fps'))
        metadata = self._call_rpc_api(
@@ -549,7 +565,17 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
                'media_id': video_id,
            })
-        subtitles = self.extract_subtitles(video_id, webpage)
+        subtitles = {}
+        for subtitle in media.get('subtitles', []):
+            subtitle_url = subtitle.get('url')
+            if not subtitle_url:
+                continue
+            subtitles.setdefault(subtitle.get('language', 'enUS'), []).append({
+                'url': subtitle_url,
+                'ext': subtitle.get('format', 'ass'),
+            })
+        if not subtitles:
+            subtitles = self.extract_subtitles(video_id, webpage)
        # webpage provide more accurate data than series_title from XML
        series = self._html_search_regex(
@@ -557,8 +583,8 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
            webpage, 'series', fatal=False)
        season = xpath_text(metadata, 'series_title')
-        episode = xpath_text(metadata, 'episode_title')
+        episode = xpath_text(metadata, 'episode_title') or media_metadata.get('title')
-        episode_number = int_or_none(xpath_text(metadata, 'episode_number'))
+        episode_number = int_or_none(xpath_text(metadata, 'episode_number') or media_metadata.get('episode_number'))
        season_number = int_or_none(self._search_regex(
            r'(?s)<h\d[^>]+id=["\']showmedia_about_episode_num[^>]+>.+?</h\d>\s*<h4>\s*Season (\d+)',
@@ -568,7 +594,8 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
            'id': video_id,
            'title': video_title,
            'description': video_description,
-            'thumbnail': xpath_text(metadata, 'episode_image_url'),
+            'duration': float_or_none(media_metadata.get('duration'), 1000),
+            'thumbnail': xpath_text(metadata, 'episode_image_url') or media_metadata.get('thumbnail', {}).get('url'),
            'uploader': video_uploader,
            'upload_date': video_upload_date,
            'series': series,
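
The subtitle handling in the hunk above prefers tracks found in the parsed media data and only falls back to the webpage-based extractor when none are usable. A minimal standalone sketch of that pattern follows. `collect_subtitles` and its `fallback` callable are hypothetical names standing in for the extractor code and `self.extract_subtitles`, and the sample data is invented for illustration.

```python
def collect_subtitles(media, fallback):
    """Group subtitle tracks by language; call fallback() if none are usable."""
    subtitles = {}
    for subtitle in media.get('subtitles', []):
        subtitle_url = subtitle.get('url')
        if not subtitle_url:
            continue  # entries without a URL cannot be downloaded
        # Defaults mirror the hunk: language 'enUS', format 'ass'.
        subtitles.setdefault(subtitle.get('language', 'enUS'), []).append({
            'url': subtitle_url,
            'ext': subtitle.get('format', 'ass'),
        })
    if not subtitles:
        subtitles = fallback()
    return subtitles


# Hypothetical sample data, not taken from a real API response.
media = {'subtitles': [
    {'url': 'https://example.invalid/en.ass', 'language': 'enUS', 'format': 'ass'},
    {'language': 'frFR'},  # no URL, so it is skipped
]}
print(collect_subtitles(media, fallback=dict))
```

Passing the fallback as a callable keeps the slower path lazy: it runs only when the primary source yields nothing.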
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -54,6 +54,7 @@ from .appletrailers import (
 from .archiveorg import ArchiveOrgIE
 from .arkena import ArkenaIE
 from .ard import (
+    ARDBetaMediathekIE,
    ARDIE,
    ARDMediathekIE,
 )
--- a/youtube_dl/extractor/internazionale.py
+++ b/youtube_dl/extractor/internazionale.py
@@ -7,7 +7,7 @@ from ..utils import unified_timestamp
 class InternazionaleIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?internazionale\.it/video/(?:[^/]+/)*(?P<id>[^/?#&]+)'
-    _TEST = {
+    _TESTS = [{
        'url': 'https://www.internazionale.it/video/2015/02/19/richard-linklater-racconta-una-scena-di-boyhood',
        'md5': '3e39d32b66882c1218e305acbf8348ca',
        'info_dict': {
@@ -23,7 +23,23 @@ class InternazionaleIE(InfoExtractor):
        'params': {
            'format': 'bestvideo',
        },
-    }
+    }, {
+        'url': 'https://www.internazionale.it/video/2018/08/29/telefono-stare-con-noi-stessi',
+        'md5': '9db8663704cab73eb972d1cee0082c79',
+        'info_dict': {
+            'id': '761344',
+            'display_id': 'telefono-stare-con-noi-stessi',
+            'ext': 'mp4',
+            'title': 'Usiamo il telefono per evitare di stare con noi stessi',
+            'description': 'md5:75ccfb0d6bcefc6e7428c68b4aa1fe44',
+            'timestamp': 1535528954,
+            'upload_date': '20180829',
+            'thumbnail': r're:^https?://.*\.jpg$',
+        },
+        'params': {
+            'format': 'bestvideo',
+        },
+    }]
    def _real_extract(self, url):
        display_id = self._match_id(url)
@@ -40,8 +56,13 @@ class InternazionaleIE(InfoExtractor):
            DATA_RE % 'job-id', webpage, 'video id', group='value')
        video_path = self._search_regex(
            DATA_RE % 'video-path', webpage, 'video path', group='value')
+        video_available_abroad = self._search_regex(
+            DATA_RE % 'video-available_abroad', webpage,
+            'video available aboard', default='1', group='value')
+        video_available_abroad = video_available_abroad == '1'
-        video_base = 'https://video.internazionale.it/%s/%s.' % (video_path, video_id)
+        video_base = 'https://video%s.internazionale.it/%s/%s.' % \
+            ('' if video_available_abroad else '-ita', video_path, video_id)
        formats = self._extract_m3u8_formats(
            video_base + 'm3u8', display_id, 'mp4',
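
The internazionale change above selects a different host depending on whether the page marks the video as available abroad. A small sketch of that URL construction is given below. `build_video_base` is a hypothetical helper name, and the path and ID values are placeholders rather than real page data.

```python
def build_video_base(video_path, video_id, available_abroad):
    """Return the base URL; videos not available abroad use the '-ita' host."""
    return 'https://video%s.internazionale.it/%s/%s.' % (
        '' if available_abroad else '-ita', video_path, video_id)


# Placeholder values for illustration only.
print(build_video_base('some/path', '12345', True) + 'm3u8')
print(build_video_base('some/path', '12345', False) + 'm3u8')
```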
--- a/youtube_dl/extractor/niconico.py
+++ b/youtube_dl/extractor/niconico.py
@@ -252,7 +252,7 @@ class NiconicoIE(InfoExtractor):
                    },
                    'timing_constraint': 'unlimited'
                }
-            }))
+            }).encode())
        resolution = video_quality.get('resolution', {})
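
The `.encode()` added in the niconico hunk above matters on Python 3, where `json.dumps` returns `str` but `urllib` expects a request body as `bytes`. A minimal sketch under that assumption follows. The payload and URL are hypothetical, and no request is actually sent.

```python
import json
import urllib.request

# Hypothetical payload; the real session request is considerably larger.
payload = {'session': {'timing_constraint': 'unlimited'}}

# json.dumps() yields text; encode() turns it into bytes (UTF-8 by default on Python 3).
body = json.dumps(payload).encode()

# Building the Request performs no network I/O.
request = urllib.request.Request('https://example.invalid/api/sessions', data=body)
print(type(body).__name__, request.get_method())
```

With a `str` body, Python 3's `urllib` raises a `TypeError` when the request is opened, which is consistent with the commit message describing a Python 3 extraction fix.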
--- a/youtube_dl/extractor/vrv.py
+++ b/youtube_dl/extractor/vrv.py
@@ -72,7 +72,7 @@ class VRVBaseIE(InfoExtractor):
 class VRVIE(VRVBaseIE):
    IE_NAME = 'vrv'
    _VALID_URL = r'https?://(?:www\.)?vrv\.co/watch/(?P<id>[A-Z0-9]+)'
-    _TEST = {
+    _TESTS = [{
        'url': 'https://vrv.co/watch/GR9PNZ396/Hidden-America-with-Jonah-Ray:BOSTON-WHERE-THE-PAST-IS-THE-PRESENT',
        'info_dict': {
            'id': 'GR9PNZ396',
@@ -85,7 +85,28 @@ class VRVIE(VRVBaseIE):
            # m3u8 download
            'skip_download': True,
        },
-    }
+    }]
+    def _extract_vrv_formats(self, url, video_id, stream_format, audio_lang, hardsub_lang):
+        if not url or stream_format not in ('hls', 'dash'):
+            return []
+        stream_id = hardsub_lang or audio_lang
+        format_id = '%s-%s' % (stream_format, stream_id)
+        if stream_format == 'hls':
+            adaptive_formats = self._extract_m3u8_formats(
+                url, video_id, 'mp4', m3u8_id=format_id,
+                note='Downloading %s m3u8 information' % stream_id,
+                fatal=False)
+        elif stream_format == 'dash':
+            adaptive_formats = self._extract_mpd_formats(
+                url, video_id, mpd_id=format_id,
+                note='Downloading %s MPD information' % stream_id,
+                fatal=False)
+        if audio_lang:
+            for f in adaptive_formats:
+                if f.get('acodec') != 'none':
+                    f['language'] = audio_lang
+        return adaptive_formats
    def _real_extract(self, url):
        video_id = self._match_id(url)
@@ -115,26 +136,9 @@ class VRVIE(VRVBaseIE):
        for stream_type, streams in streams_json.get('streams', {}).items():
            if stream_type in ('adaptive_hls', 'adaptive_dash'):
                for stream in streams.values():
-                    stream_url = stream.get('url')
-                    if not stream_url:
-                        continue
-                    stream_id = stream.get('hardsub_locale') or audio_locale
-                    format_id = '%s-%s' % (stream_type.split('_')[1], stream_id)
-                    if stream_type == 'adaptive_hls':
-                        adaptive_formats = self._extract_m3u8_formats(
-                            stream_url, video_id, 'mp4', m3u8_id=format_id,
-                            note='Downloading %s m3u8 information' % stream_id,
-                            fatal=False)
-                    else:
-                        adaptive_formats = self._extract_mpd_formats(
-                            stream_url, video_id, mpd_id=format_id,
-                            note='Downloading %s MPD information' % stream_id,
-                            fatal=False)
-                    if audio_locale:
-                        for f in adaptive_formats:
-                            if f.get('acodec') != 'none':
-                                f['language'] = audio_locale
-                    formats.extend(adaptive_formats)
+                    formats.extend(self._extract_vrv_formats(
+                        stream.get('url'), video_id, stream_type.split('_')[1],
+                        audio_locale, stream.get('hardsub_locale')))
        self._sort_formats(formats)
        subtitles = {}
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -49,7 +49,6 @@ from .compat import (
    compat_os_name,
    compat_parse_qs,
    compat_shlex_quote,
-    compat_socket_create_connection,
    compat_str,
    compat_struct_pack,
    compat_struct_unpack,
@@ -882,13 +881,51 @@ def _create_http_connection(ydl_handler, http_class, is_https, *args, **kwargs):
        kwargs['strict'] = True
    hc = http_class(*args, **compat_kwargs(kwargs))
    source_address = ydl_handler._params.get('source_address')
    if source_address is not None:
+        # This is to workaround _create_connection() from socket where it will try all
+        # address data from getaddrinfo() including IPv6. This filters the result from
+        # getaddrinfo() based on the source_address value.
+        # This is based on the cpython socket.create_connection() function.
+        # https://github.com/python/cpython/blob/master/Lib/socket.py#L691
+        def _create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT, source_address=None):
+            host, port = address
+            err = None
+            addrs = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
+            af = socket.AF_INET if '.' in source_address[0] else socket.AF_INET6
+            ip_addrs = [addr for addr in addrs if addr[0] == af]
+            if addrs and not ip_addrs:
+                ip_version = 'v4' if af == socket.AF_INET else 'v6'
+                raise socket.error(
+                    "No remote IP%s addresses available for connect, can't use '%s' as source address"
+                    % (ip_version, source_address[0]))
+            for res in ip_addrs:
+                af, socktype, proto, canonname, sa = res
+                sock = None
+                try:
+                    sock = socket.socket(af, socktype, proto)
+                    if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
+                        sock.settimeout(timeout)
+                    sock.bind(source_address)
+                    sock.connect(sa)
+                    err = None  # Explicitly break reference cycle
+                    return sock
+                except socket.error as _:
+                    err = _
+                    if sock is not None:
+                        sock.close()
+            if err is not None:
+                raise err
+            else:
+                raise socket.error('getaddrinfo returns an empty list')
+        if hasattr(hc, '_create_connection'):
+            hc._create_connection = _create_connection
        sa = (source_address, 0)
        if hasattr(hc, 'source_address'):  # Python 2.7+
            hc.source_address = sa
        else:  # Python 2.6
            def _hc_connect(self, *args, **kwargs):
-                sock = compat_socket_create_connection(
+                sock = _create_connection(
                    (self.host, self.port), self.timeout, sa)
                if is_https:
                    self.sock = ssl.wrap_socket(
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals
-__version__ = '2018.08.28'
+__version__ = '2018.09.01'
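
The `_create_connection` helper in the utils.py hunk filters `getaddrinfo()` results so that only remote addresses of the same IP version as the configured source address are tried. The filtering step can be sketched in isolation as below. `filter_addrinfo` is a hypothetical name, and the sample tuples use documentation-reserved addresses rather than real lookups.

```python
import socket


def filter_addrinfo(addrs, source_ip):
    """Keep only getaddrinfo()-style entries matching the source IP's family."""
    # Same heuristic as the hunk: a dot in the source address means IPv4.
    af = socket.AF_INET if '.' in source_ip else socket.AF_INET6
    matching = [addr for addr in addrs if addr[0] == af]
    if addrs and not matching:
        ip_version = 'v4' if af == socket.AF_INET else 'v6'
        raise socket.error(
            "No remote IP%s addresses available for connect, can't use '%s' as source address"
            % (ip_version, source_ip))
    return matching


# Hand-built entries shaped like getaddrinfo() output (no DNS lookup is made).
sample = [
    (socket.AF_INET6, socket.SOCK_STREAM, 6, '', ('2001:db8::1', 443, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, '', ('192.0.2.1', 443)),
]
print(filter_addrinfo(sample, '192.0.2.10'))
```

Raising early when no address of the right family exists gives a clearer error than letting `bind()` or `connect()` fail on a mismatched socket.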
Author	SHA1	Message	Date
Sergey M․	27d8e089a2	release 2018.09.01	2018-09-01 18:40:23 +07:00
Sergey M․	7bbc1b189a	[ChangeLog] Actualize [ci skip]	2018-09-01 18:36:18 +07:00
LangerJan	0b87e88453	[ard] Add support for one.ard.de	2018-09-01 16:42:30 +07:00
Gorfiend	4d59db5b90	[niconico] Fix extraction on python3 (closes #17393)	2018-09-01 16:04:45 +07:00
Remita Amine	4627995882	[crunchyroll] limit VRVIE inheritance to CrunchyrollIE	2018-09-01 10:04:10 +01:00
Remita Amine	7f2611cb5b	[ard] extract f4m formats	2018-09-01 08:40:38 +01:00
Remita Amine	54a5be4dba	[crunchyroll] parse vilos media data(closes #17343)	2018-09-01 08:16:41 +01:00
Philipp Hagemeister	ed6919e737	[ard] beta mediathek: make regexp for JSON more robust	2018-09-01 01:59:13 +02:00
Philipp Hagemeister	2b83da2463	[ard] Better format handling Skip f4m, doesn't work (yet); correctly extract m3u8, and prefer plain HTTP files.	2018-09-01 00:45:36 +02:00
Philipp Hagemeister	c1a37eb24a	[ard] Add support for Beta ARD Mediathek Thanks to https://blog.fefe.de/?ts=a577685d for pointing out support is missing.	2018-09-01 00:18:17 +02:00
Sergey M․	4991e16c2a	[bandcamp] Extract more metadata (closes #13197)	2018-08-31 03:35:55 +07:00
Parmjit Virk	14b7a24c19	[bandcamp] Extract track_number (closes #17266)	2018-08-31 02:32:35 +07:00
Leonardo Taccari	73f3bdbeb4	[internazionale] Fix extraction of non-available-abroad videos	2018-08-31 02:15:46 +07:00
Sergey M․	9e21e6d96b	[utils] Improve remote address skipping and add support for python 2.6 (closes #17362)	2018-08-29 01:18:03 +07:00
Andrew Udvare	8959018a5f	[utils] Skip remote IP addresses non matching to source address' IP version (closes #13422)	2018-08-29 01:17:53 +07:00