release 2020.12.26

[ChangeLog] Actualize
[ci skip]
2025-05-31 01:52:40 +09:00 · 2020-12-26 23:17:35 +07:00 · 2020-12-26 23:13:26 +07:00 · 2020-12-26 23:00:15 +07:00
18 changed files with 271 additions and 165 deletions
--- a/.github/ISSUE_TEMPLATE/1_broken_site.md
+++ b/.github/ISSUE_TEMPLATE/1_broken_site.md
@@ -18,7 +18,7 @@ title: ''

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.22. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
 - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@@ -26,7 +26,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->

 - [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2020.12.22**
+- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar issues including closed ones
@@ -41,7 +41,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
- [debug] youtube-dl version 2020.12.22
+ [debug] youtube-dl version 2020.12.26
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/.github/ISSUE_TEMPLATE/2_site_support_request.md
+++ b/.github/ISSUE_TEMPLATE/2_site_support_request.md
@@ -19,7 +19,7 @@ labels: 'site-support-request'

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.22. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
 - Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
@@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->

 - [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2020.12.22**
+- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that none of provided URLs violate any copyrights
 - [ ] I've searched the bugtracker for similar site support requests including closed ones
--- a/.github/ISSUE_TEMPLATE/3_site_feature_request.md
+++ b/.github/ISSUE_TEMPLATE/3_site_feature_request.md
@@ -18,13 +18,13 @@ title: ''

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.22. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
 - Finally, put x into all relevant boxes (like this [x])
 -->

 - [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2020.12.22**
+- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
 - [ ] I've searched the bugtracker for similar site feature requests including closed ones


--- a/.github/ISSUE_TEMPLATE/4_bug_report.md
+++ b/.github/ISSUE_TEMPLATE/4_bug_report.md
@@ -18,7 +18,7 @@ title: ''

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.22. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
 - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->

 - [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2020.12.22**
+- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar bug reports including closed ones
@@ -43,7 +43,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
- [debug] youtube-dl version 2020.12.22
+ [debug] youtube-dl version 2020.12.26
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/.github/ISSUE_TEMPLATE/5_feature_request.md
+++ b/.github/ISSUE_TEMPLATE/5_feature_request.md
@@ -19,13 +19,13 @@ labels: 'request'

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.22. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
 - Finally, put x into all relevant boxes (like this [x])
 -->

 - [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2020.12.22**
+- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
 - [ ] I've searched the bugtracker for similar feature requests including closed ones


--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,35 @@
+version 2020.12.26
+
+Extractors
+* [instagram] Fix comment count extraction
+ [instagram] Add support for reel URLs (#26234, #26250)
+* [bbc] Switch to media selector v6 (#23232, #23933, #26303, #26432, #26821,
+  #27538)
+* [instagram] Improve thumbnail extraction
+* [instagram] Fix extraction when authenticated (#22880, #26377, #26981,
+  #27422)
+* [spankbang:playlist] Fix extraction (#24087)
+ [spankbang] Add support for playlist videos
+* [pornhub] Improve like and dislike count extraction (#27356)
+* [pornhub] Fix lq formats extraction (#27386, #27393)
+ [bongacams] Add support for bongacams.com (#27440)
+* [youtube:tab] Extend URL regular expression (#27501)
+* [theweatherchannel] Fix extraction (#25930, #26051)
+ [sprout] Add support for Universal Kids (#22518)
+* [theplatform] Allow passing geo bypass countries from other extractors
+ [wistia] Add support for playlists (#27533)
+ [ctv] Add support for ctv.ca (#27525)
+* [9c9media] Improve info extraction
+* [youtube] Fix automatic captions extraction (#27162, #27388)
+* [sonyliv] Fix title for movies
+* [sonyliv] Fix extraction (#25667)
+* [streetvoice] Fix extraction (#27455, #27492)
+ [facebook] Add support for watchparty pages (#27507)
+* [cbslocal] Fix video extraction
+ [brightcove] Add another method to extract policyKey
+* [mewatch] Relax URL regular expression (#27506)
+
+
 version 2020.12.22

 Core
--- a/README.md
+++ b/README.md
@@ -880,7 +880,7 @@ Either prepend `https://www.youtube.com/watch?v=` or separate the ID from the op

 Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.

-In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [cookies.txt](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) (for Firefox).
+In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [Get cookies.txt](https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid/) (for Chrome) or [cookies.txt](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) (for Firefox).

 Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.

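Reviewer note: the README paragraph in this hunk states two checkable requirements for a cookies file (an accepted first line and OS-appropriate newlines). A minimal sketch of such a check follows; the function name `check_cookies_file` and its `expect_crlf` parameter are hypothetical and are not part of youtube-dl.

```python
import os

# First lines the README names as acceptable for a cookies file.
ACCEPTED_FIRST_LINES = (b'# HTTP Cookie File', b'# Netscape HTTP Cookie File')


def check_cookies_file(path, expect_crlf=None):
    """Return a list of problems found in the cookies file at `path`.

    Hypothetical helper: `expect_crlf` defaults to the convention of the
    current OS (CRLF on Windows, LF elsewhere).
    """
    if expect_crlf is None:
        expect_crlf = os.linesep == '\r\n'
    with open(path, 'rb') as f:
        data = f.read()
    problems = []
    # Compare the first line without its line terminator.
    first_line = data.split(b'\n', 1)[0].rstrip(b'\r')
    if first_line not in ACCEPTED_FIRST_LINES:
        problems.append('unexpected first line')
    # Count CRLF terminators and bare LF terminators separately.
    crlf_count = data.count(b'\r\n')
    lf_only_count = data.count(b'\n') - crlf_count
    if expect_crlf and lf_only_count:
        problems.append('found LF newlines, expected CRLF')
    elif not expect_crlf and crlf_count:
        problems.append('found CRLF newlines, expected LF')
    return problems
```

An empty list means neither of the two README requirements was violated; it does not validate the cookie entries themselves.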
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -112,6 +112,7 @@
 - **blinkx**
 - **Bloomberg**
 - **BokeCC**
+ - **BongaCams**
 - **BostonGlobe**
 - **Box**
 - **Bpb**: Bundeszentrale für politische Bildung
@@ -146,6 +147,7 @@
 - **CBS**
 - **CBSInteractive**
 - **CBSLocal**
+ - **CBSLocalArticle**
 - **cbsnews**: CBS News
 - **cbsnews:embed**
 - **cbsnews:livevideo**: CBS News Live Videos
@@ -198,6 +200,7 @@
 - **CSNNE**
 - **CSpan**: C-SPAN
 - **CtsNews**: 華視新聞
+ - **CTV**
 - **CTVNews**
 - **cu.ntv.co.jp**: Nippon Television Network
 - **Culturebox**
@@ -1108,6 +1111,7 @@
 - **WeiboMobile**
 - **WeiqiTV**: WQTV
 - **Wistia**
+ - **WistiaPlaylist**
 - **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
 - **WorldStarHipHop**
 - **WSJ**: Wall Street Journal
--- a/test/test_all_urls.py
+++ b/test/test_all_urls.py
@@ -36,7 +36,7 @@ class TestAllURLsMatching(unittest.TestCase):
        assertPlaylist('UUBABnxM4Ar9ten8Mdjj1j0Q')  # 585
        assertPlaylist('PL63F0C78739B09958')
        assertTab('https://www.youtube.com/playlist?list=UUBABnxM4Ar9ten8Mdjj1j0Q')
-        assertPlaylist('https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')
+        assertTab('https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')
        assertTab('https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC')
        assertTab('https://www.youtube.com/watch?v=AV6J6_AeFEQ&playnext=1&list=PL4023E734DA416012')  # 668
        self.assertFalse('youtube:playlist' in self.matching_ies('PLtS2H6bU1M'))
@@ -57,8 +57,8 @@ class TestAllURLsMatching(unittest.TestCase):
        assertChannel('https://www.youtube.com/channel/HCtnHdj3df7iM?feature=gb_ch_rec')
        assertChannel('https://www.youtube.com/channel/HCtnHdj3df7iM/videos')

-    # def test_youtube_user_matching(self):
-    #     self.assertMatch('http://www.youtube.com/NASAgovVideo/videos', ['youtube:tab'])
+    def test_youtube_user_matching(self):
+        self.assertMatch('http://www.youtube.com/NASAgovVideo/videos', ['youtube:tab'])

    def test_youtube_feeds(self):
        self.assertMatch('https://www.youtube.com/feed/library', ['youtube:tab'])
--- a/youtube_dl/extractor/bbc.py
+++ b/youtube_dl/extractor/bbc.py
@@ -49,22 +49,17 @@ class BBCCoUkIE(InfoExtractor):
    _LOGIN_URL = 'https://account.bbc.com/signin'
    _NETRC_MACHINE = 'bbc'

-    _MEDIASELECTOR_URLS = [
+    _MEDIA_SELECTOR_URL_TEMPL = 'https://open.live.bbc.co.uk/mediaselector/6/select/version/2.0/mediaset/%s/vpid/%s'
+    _MEDIA_SETS = [
        # Provides HQ HLS streams with even better quality that pc mediaset but fails
        # with geolocation in some cases when it's even not geo restricted at all (e.g.
        # http://www.bbc.co.uk/programmes/b06bp7lf). Also may fail with selectionunavailable.
-        'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
-        'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s',
+        'iptv-all',
+        'pc',
    ]

-    _MEDIASELECTION_NS = 'http://bbc.co.uk/2008/mp/mediaselection'
    _EMP_PLAYLIST_NS = 'http://bbc.co.uk/2008/emp/playlist'

-    _NAMESPACES = (
-        _MEDIASELECTION_NS,
-        _EMP_PLAYLIST_NS,
-    )
-
    _TESTS = [
        {
            'url': 'http://www.bbc.co.uk/programmes/b039g8p7',
@@ -261,8 +256,6 @@ class BBCCoUkIE(InfoExtractor):
            'only_matching': True,
        }]

-    _USP_RE = r'/([^/]+?)\.ism(?:\.hlsv2\.ism)?/[^/]+\.m3u8'
-
    def _login(self):
        username, password = self._get_login_info()
        if username is None:
@@ -307,22 +300,14 @@ class BBCCoUkIE(InfoExtractor):
    def _extract_items(self, playlist):
        return playlist.findall('./{%s}item' % self._EMP_PLAYLIST_NS)

-    def _findall_ns(self, element, xpath):
-        elements = []
-        for ns in self._NAMESPACES:
-            elements.extend(element.findall(xpath % ns))
-        return elements
-
    def _extract_medias(self, media_selection):
-        error = media_selection.find('./{%s}error' % self._MEDIASELECTION_NS)
-        if error is None:
-            media_selection.find('./{%s}error' % self._EMP_PLAYLIST_NS)
-        if error is not None:
-            raise BBCCoUkIE.MediaSelectionError(error.get('id'))
-        return self._findall_ns(media_selection, './{%s}media')
+        error = media_selection.get('result')
+        if error:
+            raise BBCCoUkIE.MediaSelectionError(error)
+        return media_selection.get('media') or []

    def _extract_connections(self, media):
-        return self._findall_ns(media, './{%s}connection')
+        return media.get('connection') or []

    def _get_subtitles(self, media, programme_id):
        subtitles = {}
@@ -334,13 +319,13 @@ class BBCCoUkIE(InfoExtractor):
                cc_url, programme_id, 'Downloading captions', fatal=False)
            if not isinstance(captions, compat_etree_Element):
                continue
-            lang = captions.get('{http://www.w3.org/XML/1998/namespace}lang', 'en')
-            subtitles[lang] = [
+            subtitles['en'] = [
                {
                    'url': connection.get('href'),
                    'ext': 'ttml',
                },
            ]
+            break
        return subtitles

    def _raise_extractor_error(self, media_selection_error):
@@ -350,10 +335,10 @@ class BBCCoUkIE(InfoExtractor):

    def _download_media_selector(self, programme_id):
        last_exception = None
-        for mediaselector_url in self._MEDIASELECTOR_URLS:
+        for media_set in self._MEDIA_SETS:
            try:
                return self._download_media_selector_url(
-                    mediaselector_url % programme_id, programme_id)
+                    self._MEDIA_SELECTOR_URL_TEMPL % (media_set, programme_id), programme_id)
            except BBCCoUkIE.MediaSelectionError as e:
                if e.id in ('notukerror', 'geolocation', 'selectionunavailable'):
                    last_exception = e
@@ -362,8 +347,8 @@ class BBCCoUkIE(InfoExtractor):
        self._raise_extractor_error(last_exception)

    def _download_media_selector_url(self, url, programme_id=None):
-        media_selection = self._download_xml(
-            url, programme_id, 'Downloading media selection XML',
+        media_selection = self._download_json(
+            url, programme_id, 'Downloading media selection JSON',
            expected_status=(403, 404))
        return self._process_media_selector(media_selection, programme_id)

@@ -377,7 +362,6 @@ class BBCCoUkIE(InfoExtractor):
            if kind in ('video', 'audio'):
                bitrate = int_or_none(media.get('bitrate'))
                encoding = media.get('encoding')
-                service = media.get('service')
                width = int_or_none(media.get('width'))
                height = int_or_none(media.get('height'))
                file_size = int_or_none(media.get('media_file_size'))
@@ -392,8 +376,6 @@ class BBCCoUkIE(InfoExtractor):
                    supplier = connection.get('supplier')
                    transfer_format = connection.get('transferFormat')
                    format_id = supplier or conn_kind or protocol
-                    if service:
-                        format_id = '%s_%s' % (service, format_id)
                    # ASX playlist
                    if supplier == 'asx':
                        for i, ref in enumerate(self._extract_asx_playlist(connection, programme_id)):
@@ -408,20 +390,11 @@ class BBCCoUkIE(InfoExtractor):
                        formats.extend(self._extract_m3u8_formats(
                            href, programme_id, ext='mp4', entry_protocol='m3u8_native',
                            m3u8_id=format_id, fatal=False))
-                        if re.search(self._USP_RE, href):
-                            usp_formats = self._extract_m3u8_formats(
-                                re.sub(self._USP_RE, r'/\1.ism/\1.m3u8', href),
-                                programme_id, ext='mp4', entry_protocol='m3u8_native',
-                                m3u8_id=format_id, fatal=False)
-                            for f in usp_formats:
-                                if f.get('height') and f['height'] > 720:
-                                    continue
-                                formats.append(f)
                    elif transfer_format == 'hds':
                        formats.extend(self._extract_f4m_formats(
                            href, programme_id, f4m_id=format_id, fatal=False))
                    else:
-                        if not service and not supplier and bitrate:
+                        if not supplier and bitrate:
                            format_id += '-%d' % bitrate
                        fmt = {
                            'format_id': format_id,
@@ -554,7 +527,7 @@ class BBCCoUkIE(InfoExtractor):
        webpage = self._download_webpage(url, group_id, 'Downloading video page')

        error = self._search_regex(
-            r'<div\b[^>]+\bclass=["\']smp__message delta["\'][^>]*>([^<]+)<',
+            r'<div\b[^>]+\bclass=["\'](?:smp|playout)__message delta["\'][^>]*>\s*([^<]+?)\s*<',
            webpage, 'error', default=None)
        if error:
            raise ExtractorError(error, expected=True)
@@ -607,16 +580,9 @@ class BBCIE(BBCCoUkIE):
    IE_DESC = 'BBC'
    _VALID_URL = r'https?://(?:www\.)?bbc\.(?:com|co\.uk)/(?:[^/]+/)+(?P<id>[^/#?]+)'

-    _MEDIASELECTOR_URLS = [
-        # Provides HQ HLS streams but fails with geolocation in some cases when it's
-        # even not geo restricted at all
-        'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
-        # Provides more formats, namely direct mp4 links, but fails on some videos with
-        # notukerror for non UK (?) users (e.g.
-        # http://www.bbc.com/travel/story/20150625-sri-lankas-spicy-secret)
-        'http://open.live.bbc.co.uk/mediaselector/4/mtis/stream/%s',
-        # Provides fewer formats, but works everywhere for everybody (hopefully)
-        'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/journalism-pc/vpid/%s',
+    _MEDIA_SETS = [
+        'mobile-tablet-main',
+        'pc',
    ]

    _TESTS = [{
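Reviewer note: the bbc.py hunks above replace per-URL XML lookups with one URL template plus a list of media sets, and read the response as JSON. A standalone sketch of that flow, outside the extractor class, is given here. The URL template, media set names and dictionary keys (`result`, `media`, `connection`) are taken from the diff; the module-level function names are illustrative only, and the exact shape of a real media selector response is an assumption inferred from those keys.

```python
# URL template and media sets as introduced for BBCCoUkIE in the diff.
MEDIA_SELECTOR_URL_TEMPL = (
    'https://open.live.bbc.co.uk/mediaselector/6/select/version/2.0'
    '/mediaset/%s/vpid/%s')
MEDIA_SETS = ['iptv-all', 'pc']


class MediaSelectionError(Exception):
    """Simplified stand-in for BBCCoUkIE.MediaSelectionError."""

    def __init__(self, error_id):
        super(MediaSelectionError, self).__init__(error_id)
        self.id = error_id


def media_selector_urls(programme_id, media_sets=MEDIA_SETS):
    # One candidate URL per media set, tried in order by the extractor.
    return [MEDIA_SELECTOR_URL_TEMPL % (media_set, programme_id)
            for media_set in media_sets]


def extract_medias(media_selection):
    # A non-empty 'result' field is treated as an error identifier.
    error = media_selection.get('result')
    if error:
        raise MediaSelectionError(error)
    return media_selection.get('media') or []


def extract_connections(media):
    return media.get('connection') or []
```

With this layout, a subclass such as `BBCIE` only needs to override the list of media sets, which is what the last hunk of the file does.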
--- a/youtube_dl/extractor/bongacams.py
+++ b/youtube_dl/extractor/bongacams.py
@@ -0,0 +1,60 @@
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+    int_or_none,
+    try_get,
+    urlencode_postdata,
+)
+
+
+class BongaCamsIE(InfoExtractor):
+    _VALID_URL = r'https?://(?P<host>(?:[^/]+\.)?bongacams\d*\.com)/(?P<id>[^/?&#]+)'
+    _TESTS = [{
+        'url': 'https://de.bongacams.com/azumi-8',
+        'only_matching': True,
+    }, {
+        'url': 'https://cn.bongacams.com/azumi-8',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        host = mobj.group('host')
+        channel_id = mobj.group('id')
+
+        amf = self._download_json(
+            'https://%s/tools/amf.php' % host, channel_id,
+            data=urlencode_postdata((
+                ('method', 'getRoomData'),
+                ('args[]', channel_id),
+                ('args[]', 'false'),
+            )), headers={'X-Requested-With': 'XMLHttpRequest'})
+
+        server_url = amf['localData']['videoServerUrl']
+
+        uploader_id = try_get(
+            amf, lambda x: x['performerData']['username'], compat_str) or channel_id
+        uploader = try_get(
+            amf, lambda x: x['performerData']['displayName'], compat_str)
+        like_count = int_or_none(try_get(
+            amf, lambda x: x['performerData']['loversCount']))
+
+        formats = self._extract_m3u8_formats(
+            '%s/hls/stream_%s/playlist.m3u8' % (server_url, uploader_id),
+            channel_id, 'mp4', m3u8_id='hls', live=True)
+        self._sort_formats(formats)
+
+        return {
+            'id': channel_id,
+            'title': self._live_title(uploader or uploader_id),
+            'uploader': uploader,
+            'uploader_id': uploader_id,
+            'like_count': like_count,
+            'age_limit': 18,
+            'is_live': True,
+            'formats': formats,
+        }
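Reviewer note: the new extractor derives both the API host and the channel id from the URL, then builds the HLS playlist URL from the server URL returned by the API. The sketch below isolates those two steps without any network access. The regular expression and the playlist path format are copied from the diff; the function names and the example server URL are hypothetical.

```python
import re

# Same pattern as BongaCamsIE._VALID_URL in the diff.
VALID_URL = r'https?://(?P<host>(?:[^/]+\.)?bongacams\d*\.com)/(?P<id>[^/?&#]+)'


def parse_channel_url(url):
    """Return (host, channel_id), or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    return mobj.group('host'), mobj.group('id')


def hls_playlist_url(server_url, uploader_id):
    # Path layout as used by the extractor for the live HLS playlist.
    return '%s/hls/stream_%s/playlist.m3u8' % (server_url, uploader_id)
```

Keeping the matched host means the API request goes to the same regional subdomain the user supplied, which is why the pattern captures it as a named group.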
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -119,6 +119,7 @@ from .bleacherreport import (
 from .blinkx import BlinkxIE
 from .bloomberg import BloombergIE
 from .bokecc import BokeCCIE
+from .bongacams import BongaCamsIE
 from .bostonglobe import BostonGlobeIE
 from .box import BoxIE
 from .bpb import BpbIE
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -2024,22 +2024,6 @@ class GenericIE(InfoExtractor):
            },
            'add_ie': [SpringboardPlatformIE.ie_key()],
        },
-        {
-            'url': 'https://www.youtube.com/shared?ci=1nEzmT-M4fU',
-            'info_dict': {
-                'id': 'uPDB5I9wfp8',
-                'ext': 'webm',
-                'title': 'Pocoyo: 90 minutos de episódios completos Português para crianças - PARTE 3',
-                'description': 'md5:d9e4d9346a2dfff4c7dc4c8cec0f546d',
-                'upload_date': '20160219',
-                'uploader': 'Pocoyo - Português (BR)',
-                'uploader_id': 'PocoyoBrazil',
-            },
-            'add_ie': [YoutubeIE.ie_key()],
-            'params': {
-                'skip_download': True,
-            },
-        },
        {
            'url': 'https://www.yapfiles.ru/show/1872528/690b05d3054d2dbe1e69523aa21bb3b1.mp4.html',
            'info_dict': {
--- a/youtube_dl/extractor/instagram.py
+++ b/youtube_dl/extractor/instagram.py
@@ -22,7 +22,7 @@ from ..utils import (


 class InstagramIE(InfoExtractor):
-    _VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com/(?:p|tv)/(?P<id>[^/?#&]+))'
+    _VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com/(?:p|tv|reel)/(?P<id>[^/?#&]+))'
    _TESTS = [{
        'url': 'https://instagram.com/p/aye83DjauH/?foo=bar#abc',
        'md5': '0d2da106a9d2631273e192b372806516',
@@ -35,7 +35,7 @@ class InstagramIE(InfoExtractor):
            'timestamp': 1371748545,
            'upload_date': '20130620',
            'uploader_id': 'naomipq',
-            'uploader': 'Naomi Leonor Phan-Quang',
+            'uploader': 'B E A U T Y  F O R  A S H E S',
            'like_count': int,
            'comment_count': int,
            'comments': list,
@@ -95,6 +95,9 @@ class InstagramIE(InfoExtractor):
    }, {
        'url': 'https://www.instagram.com/tv/aye83DjauH/',
        'only_matching': True,
+    }, {
+        'url': 'https://www.instagram.com/reel/CDUMkliABpa/',
+        'only_matching': True,
    }]

    @staticmethod
@@ -122,9 +125,9 @@ class InstagramIE(InfoExtractor):

        webpage = self._download_webpage(url, video_id)

-        (video_url, description, thumbnail, timestamp, uploader,
+        (media, video_url, description, thumbnail, timestamp, uploader,
         uploader_id, like_count, comment_count, comments, height,
-         width) = [None] * 11
+         width) = [None] * 12

        shared_data = self._parse_json(
            self._search_regex(
@@ -137,59 +140,77 @@ class InstagramIE(InfoExtractor):
                (lambda x: x['entry_data']['PostPage'][0]['graphql']['shortcode_media'],
                 lambda x: x['entry_data']['PostPage'][0]['media']),
                dict)
-            if media:
-                video_url = media.get('video_url')
-                height = int_or_none(media.get('dimensions', {}).get('height'))
-                width = int_or_none(media.get('dimensions', {}).get('width'))
-                description = try_get(
-                    media, lambda x: x['edge_media_to_caption']['edges'][0]['node']['text'],
-                    compat_str) or media.get('caption')
-                thumbnail = media.get('display_src')
-                timestamp = int_or_none(media.get('taken_at_timestamp') or media.get('date'))
-                uploader = media.get('owner', {}).get('full_name')
-                uploader_id = media.get('owner', {}).get('username')
+        # _sharedData.entry_data.PostPage is empty when authenticated (see
+        # https://github.com/ytdl-org/youtube-dl/pull/22880)
+        if not media:
+            additional_data = self._parse_json(
+                self._search_regex(
+                    r'window\.__additionalDataLoaded\s*\(\s*[^,]+,\s*({.+?})\s*\)\s*;',
+                    webpage, 'additional data', default='{}'),
+                video_id, fatal=False)
+            if additional_data:
+                media = try_get(
+                    additional_data, lambda x: x['graphql']['shortcode_media'],
+                    dict)
+        if media:
+            video_url = media.get('video_url')
+            height = int_or_none(media.get('dimensions', {}).get('height'))
+            width = int_or_none(media.get('dimensions', {}).get('width'))
+            description = try_get(
+                media, lambda x: x['edge_media_to_caption']['edges'][0]['node']['text'],
+                compat_str) or media.get('caption')
+            thumbnail = media.get('display_src') or media.get('display_url')
+            timestamp = int_or_none(media.get('taken_at_timestamp') or media.get('date'))
+            uploader = media.get('owner', {}).get('full_name')
+            uploader_id = media.get('owner', {}).get('username')

-                def get_count(key, kind):
-                    return int_or_none(try_get(
+            def get_count(keys, kind):
+                if not isinstance(keys, (list, tuple)):
+                    keys = [keys]
+                for key in keys:
+                    count = int_or_none(try_get(
                        media, (lambda x: x['edge_media_%s' % key]['count'],
                                lambda x: x['%ss' % kind]['count'])))
-                like_count = get_count('preview_like', 'like')
-                comment_count = get_count('to_comment', 'comment')
+                    if count is not None:
+                        return count
+            like_count = get_count('preview_like', 'like')
+            comment_count = get_count(
+                ('preview_comment', 'to_comment', 'to_parent_comment'), 'comment')

-                comments = [{
-                    'author': comment.get('user', {}).get('username'),
-                    'author_id': comment.get('user', {}).get('id'),
-                    'id': comment.get('id'),
-                    'text': comment.get('text'),
-                    'timestamp': int_or_none(comment.get('created_at')),
-                } for comment in media.get(
-                    'comments', {}).get('nodes', []) if comment.get('text')]
-                if not video_url:
-                    edges = try_get(
-                        media, lambda x: x['edge_sidecar_to_children']['edges'],
-                        list) or []
-                    if edges:
-                        entries = []
-                        for edge_num, edge in enumerate(edges, start=1):
-                            node = try_get(edge, lambda x: x['node'], dict)
-                            if not node:
-                                continue
-                            node_video_url = url_or_none(node.get('video_url'))
-                            if not node_video_url:
-                                continue
-                            entries.append({
-                                'id': node.get('shortcode') or node['id'],
-                                'title': 'Video %d' % edge_num,
-                                'url': node_video_url,
-                                'thumbnail': node.get('display_url'),
-                                'width': int_or_none(try_get(node, lambda x: x['dimensions']['width'])),
-                                'height': int_or_none(try_get(node, lambda x: x['dimensions']['height'])),
-                                'view_count': int_or_none(node.get('video_view_count')),
-                            })
-                        return self.playlist_result(
-                            entries, video_id,
-                            'Post by %s' % uploader_id if uploader_id else None,
-                            description)
+            comments = [{
+                'author': comment.get('user', {}).get('username'),
+                'author_id': comment.get('user', {}).get('id'),
+                'id': comment.get('id'),
+                'text': comment.get('text'),
+                'timestamp': int_or_none(comment.get('created_at')),
+            } for comment in media.get(
+                'comments', {}).get('nodes', []) if comment.get('text')]
+            if not video_url:
+                edges = try_get(
+                    media, lambda x: x['edge_sidecar_to_children']['edges'],
+                    list) or []
+                if edges:
+                    entries = []
+                    for edge_num, edge in enumerate(edges, start=1):
+                        node = try_get(edge, lambda x: x['node'], dict)
+                        if not node:
+                            continue
+                        node_video_url = url_or_none(node.get('video_url'))
+                        if not node_video_url:
+                            continue
+                        entries.append({
+                            'id': node.get('shortcode') or node['id'],
+                            'title': 'Video %d' % edge_num,
+                            'url': node_video_url,
+                            'thumbnail': node.get('display_url'),
+                            'width': int_or_none(try_get(node, lambda x: x['dimensions']['width'])),
+                            'height': int_or_none(try_get(node, lambda x: x['dimensions']['height'])),
+                            'view_count': int_or_none(node.get('video_view_count')),
+                        })
+                    return self.playlist_result(
+                        entries, video_id,
+                        'Post by %s' % uploader_id if uploader_id else None,
+                        description)

        if not video_url:
            video_url = self._og_search_video_url(webpage, secure=False)
--- a/youtube_dl/extractor/pornhub.py
+++ b/youtube_dl/extractor/pornhub.py
@@ -288,14 +288,24 @@ class PornHubIE(PornHubBaseIE):
            video_urls.append((v_url, None))
            video_urls_set.add(v_url)

+        def parse_quality_items(quality_items):
+            q_items = self._parse_json(quality_items, video_id, fatal=False)
+            if not isinstance(q_items, list):
+                return
+            for item in q_items:
+                if isinstance(item, dict):
+                    add_video_url(item.get('url'))
+
        if not video_urls:
-            FORMAT_PREFIXES = ('media', 'quality')
+            FORMAT_PREFIXES = ('media', 'quality', 'qualityItems')
            js_vars = extract_js_vars(
                webpage, r'(var\s+(?:%s)_.+)' % '|'.join(FORMAT_PREFIXES),
                default=None)
            if js_vars:
                for key, format_url in js_vars.items():
-                    if any(key.startswith(p) for p in FORMAT_PREFIXES):
+                    if key.startswith(FORMAT_PREFIXES[-1]):
+                        parse_quality_items(format_url)
+                    elif any(key.startswith(p) for p in FORMAT_PREFIXES[:2]):
                        add_video_url(format_url)
            if not video_urls and re.search(
                    r'<[^>]+\bid=["\']lockedPlayer', webpage):
@@ -351,12 +361,16 @@ class PornHubIE(PornHubBaseIE):
            r'(?s)From:&nbsp;.+?<(?:a\b[^>]+\bhref=["\']/(?:(?:user|channel)s|model|pornstar)/|span\b[^>]+\bclass=["\']username)[^>]+>(.+?)<',
            webpage, 'uploader', default=None)

+        def extract_vote_count(kind, name):
+            return self._extract_count(
+                (r'<span[^>]+\bclass="votes%s"[^>]*>([\d,\.]+)</span>' % kind,
+                 r'<span[^>]+\bclass=["\']votes%s["\'][^>]*\bdata-rating=["\'](\d+)' % kind),
+                webpage, name)
+
        view_count = self._extract_count(
            r'<span class="count">([\d,\.]+)</span> [Vv]iews', webpage, 'view')
-        like_count = self._extract_count(
-            r'<span[^>]+class="votesUp"[^>]*>([\d,\.]+)</span>', webpage, 'like')
-        dislike_count = self._extract_count(
-            r'<span[^>]+class="votesDown"[^>]*>([\d,\.]+)</span>', webpage, 'dislike')
+        like_count = extract_vote_count('Up', 'like')
+        dislike_count = extract_vote_count('Down', 'dislike')
        comment_count = self._extract_count(
            r'All Comments\s*<span>\(([\d,.]+)\)', webpage, 'comment')

--- a/youtube_dl/extractor/spankbang.py
+++ b/youtube_dl/extractor/spankbang.py
@@ -7,17 +7,24 @@ from ..utils import (
    determine_ext,
    ExtractorError,
    merge_dicts,
-    orderedSet,
    parse_duration,
    parse_resolution,
    str_to_int,
    url_or_none,
    urlencode_postdata,
+    urljoin,
 )


 class SpankBangIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:[^/]+\.)?spankbang\.com/(?P<id>[\da-z]+)/(?:video|play|embed)\b'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:[^/]+\.)?spankbang\.com/
+                        (?:
+                            (?P<id>[\da-z]+)/(?:video|play|embed)\b|
+                            [\da-z]+-(?P<id_2>[\da-z]+)/playlist/[^/?#&]+
+                        )
+                    '''
    _TESTS = [{
        'url': 'http://spankbang.com/3vvn/video/fantasy+solo',
        'md5': '1cc433e1d6aa14bc376535b8679302f7',
@@ -57,10 +64,14 @@ class SpankBangIE(InfoExtractor):
    }, {
        'url': 'https://spankbang.com/2y3td/embed/',
        'only_matching': True,
+    }, {
+        'url': 'https://spankbang.com/2v7ik-7ecbgu/playlist/latina+booty',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
-        video_id = self._match_id(url)
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id') or mobj.group('id_2')
        webpage = self._download_webpage(
            url.replace('/%s/embed' % video_id, '/%s/video' % video_id),
            video_id, headers={'Cookie': 'country=US'})
@@ -155,30 +166,33 @@ class SpankBangIE(InfoExtractor):


 class SpankBangPlaylistIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:[^/]+\.)?spankbang\.com/(?P<id>[\da-z]+)/playlist/[^/]+'
+    _VALID_URL = r'https?://(?:[^/]+\.)?spankbang\.com/(?P<id>[\da-z]+)/playlist/(?P<display_id>[^/]+)'
    _TEST = {
        'url': 'https://spankbang.com/ug0k/playlist/big+ass+titties',
        'info_dict': {
            'id': 'ug0k',
            'title': 'Big Ass Titties',
        },
-        'playlist_mincount': 50,
+        'playlist_mincount': 40,
    }

    def _real_extract(self, url):
-        playlist_id = self._match_id(url)
+        mobj = re.match(self._VALID_URL, url)
+        playlist_id = mobj.group('id')
+        display_id = mobj.group('display_id')

        webpage = self._download_webpage(
            url, playlist_id, headers={'Cookie': 'country=US; mobile=on'})

        entries = [self.url_result(
-            'https://spankbang.com/%s/video' % video_id,
-            ie=SpankBangIE.ie_key(), video_id=video_id)
-            for video_id in orderedSet(re.findall(
-                r'<a[^>]+\bhref=["\']/?([\da-z]+)/play/', webpage))]
+            urljoin(url, mobj.group('path')),
+            ie=SpankBangIE.ie_key(), video_id=mobj.group('id'))
+            for mobj in re.finditer(
+                r'<a[^>]+\bhref=(["\'])(?P<path>/?[\da-z]+-(?P<id>[\da-z]+)/playlist/%s(?:(?!\1).)*)\1'
+                % re.escape(display_id), webpage)]

        title = self._html_search_regex(
-            r'<h1>([^<]+)\s+playlist</h1>', webpage, 'playlist title',
+            r'<h1>([^<]+)\s+playlist\s*<', webpage, 'playlist title',
            fatal=False)

        return self.playlist_result(entries, playlist_id, title)
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -2442,7 +2442,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
                        )/
                        (?:
                            (?:channel|c|user|feed)/|
-                            (?:playlist|watch)\?.*?\blist=
+                            (?:playlist|watch)\?.*?\blist=|
+                            (?!(?:watch|embed|v|e)\b)
                        )
                        (?P<id>[^/?\#&]+)
                    '''
@@ -2711,13 +2712,22 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
        # inline playlist with not always working continuations
        'url': 'https://www.youtube.com/watch?v=UC6u0Tct-Fo&list=PL36D642111D65BE7C',
        'only_matching': True,
-    }
-        # TODO
-        # {
-        #     'url': 'https://www.youtube.com/TheYoungTurks/live',
-        #     'only_matching': True,
-        # }
-    ]
+    }, {
+        'url': 'https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/course',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/zsecurity',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.youtube.com/NASAgovVideo/videos',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/TheYoungTurks/live',
+        'only_matching': True,
+    }]

    def _extract_channel_id(self, webpage):
        channel_id = self._html_search_meta(
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2020.12.22'
+__version__ = '2020.12.26'
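
The `parse_quality_items` helper added in the pornhub.py hunk above parses a JS variable as JSON and only forwards URLs from dict entries of a list. Below is a minimal standalone sketch of that defensive pattern. It assumes `json.loads` as a stand-in for `self._parse_json(..., fatal=False)`, and `make_url_collector` is a hypothetical replacement for the extractor's own `add_video_url`, whose full body is not shown in this diff.

```python
import json


def make_url_collector():
    """Hypothetical stand-in for the extractor's add_video_url helper."""
    urls = []
    seen = set()

    def add_video_url(url):
        # Skip empty values and URLs that were already collected.
        if not url or url in seen:
            return
        seen.add(url)
        urls.append(url)

    return urls, add_video_url


def parse_quality_items(raw, add_video_url):
    # Assumption: json.loads replaces self._parse_json(..., fatal=False),
    # so malformed input is ignored instead of raising.
    try:
        items = json.loads(raw)
    except (TypeError, ValueError):
        return
    # Only a JSON list is accepted; anything else is silently dropped.
    if not isinstance(items, list):
        return
    for item in items:
        # Non-dict entries are skipped; a missing 'url' yields None.
        if isinstance(item, dict):
            add_video_url(item.get('url'))


urls, add = make_url_collector()
parse_quality_items(
    '[{"url": "https://example.com/240.mp4"}, "junk", {"quality": 480}]', add)
parse_quality_items('not json', add)
parse_quality_items('{"url": "https://example.com/ignored.mp4"}', add)
print(urls)
```

Only the first call contributes a URL: the second input is not valid JSON, and the third is a dict rather than a list.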
Author	SHA1	Message	Date
Sergey M․	365b3cc72d	release 2020.12.26	2020-12-26 23:17:35 +07:00
Sergey M․	a272fe21a8	[ChangeLog] Actualize [ci skip]	2020-12-26 23:13:26 +07:00
Sergey M․	cec1c2f211	[instagram] Fix test	2020-12-26 23:00:15 +07:00
Sergey M․	12053450dc	[instagram] Fix comment count extraction	2020-12-26 23:00:15 +07:00
Sergey M․	46cffb0c47	[instagram] Add support for reel URLs (closes #26234, closes #26250)	2020-12-26 23:00:15 +07:00
Remita Amine	c32a059f52	[bbc] switch to media selector v6 closes #23232 closes #23933 closes #26303 closes #26432 closes #26821 closes #27538	2020-12-26 16:57:02 +01:00
Sergey M․	6911312e53	[instagram] Improve thumbnail extraction	2020-12-26 22:42:58 +07:00
Sergey M․	f22b5a6b96	[instagram] Improve extraction (closes #22880)	2020-12-26 22:37:41 +07:00
Andrew Udvare	58e55198c1	[instagram] Fix extraction when authenticated (closes #27422)	2020-12-26 22:31:55 +07:00
Sergey M․	d61ed9f2f1	[spankbang] Remove unused import	2020-12-26 22:14:31 +07:00
Sergey M․	8bc4c6350e	[spangbang:playlist] Fix extraction (closes #24087)	2020-12-26 21:58:26 +07:00
Sergey M․	cfa4ffa23b	[spangbang] Add support for playlist videos	2020-12-26 21:55:12 +07:00
Sergey M․	4f1dc1463d	[pornhub] Improve like and dislike count extraction (closes #27356)	2020-12-26 21:24:43 +07:00
Sergey M․	17e0f41d34	[pornhub] Fix review issues (closes #27393)	2020-12-26 21:17:17 +07:00
JChris246	b57b27ff8f	[pornhub] Fix lq formats extraction (closes #27386)	2020-12-26 21:17:11 +07:00
Marco Fantauzzo	bbe8cc6662	[README.md] Update reference to cookie export extension for Chrome (closes #26885) (#27433) The cookies.txt extension doesn't exist anymore on the Chrome Web Store (see https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg), so I propose to change the link in the README.md to another similar extension called Get cookies.txt (https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid/) with the same functions and utility of the old one. This PR close #26885	2020-12-26 20:50:39 +07:00
Sergey M․	98106accb6	[bongacams] Add extractor (closes #27440)	2020-12-26 20:30:19 +07:00
Sergey M․	af1312bfc3	[youtube:tab] Extend _VALID_URL (closes #27501)	2020-12-26 19:59:57 +07:00
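
The extended `SpankBangIE._VALID_URL` from the spankbang.py hunk in this release can be exercised outside the extractor. The sketch below copies the pattern from the diff and mirrors the `mobj.group('id') or mobj.group('id_2')` fallback. The wrapper `extract_video_id` is a hypothetical name introduced only for illustration, and the test URLs are the ones listed in the diff's `_TESTS` and `_TEST` entries.

```python
import re

# Pattern copied from the SpankBangIE hunk in this diff.
VALID_URL = r'''(?x)
                https?://
                    (?:[^/]+\.)?spankbang\.com/
                    (?:
                        (?P<id>[\da-z]+)/(?:video|play|embed)\b|
                        [\da-z]+-(?P<id_2>[\da-z]+)/playlist/[^/?#&]+
                    )
                '''


def extract_video_id(url):
    """Hypothetical wrapper: return the video id, or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    # Exactly one of the two named groups is set, depending on the URL form.
    return mobj.group('id') or mobj.group('id_2')


print(extract_video_id('http://spankbang.com/3vvn/video/fantasy+solo'))
print(extract_video_id('https://spankbang.com/2v7ik-7ecbgu/playlist/latina+booty'))
print(extract_video_id('https://spankbang.com/ug0k/playlist/big+ass+titties'))
```

The third URL is a plain playlist page. It is rejected because `play` is followed by `list`, so the `\b` after `(?:video|play|embed)` fails, and the second alternative requires the `<playlist>-<video>` prefix form. That leaves such URLs to `SpankBangPlaylistIE`.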