Compare commits

...

18 Commits

Author SHA1 Message Date
Sergey M․
365b3cc72d
release 2020.12.26 2020-12-26 23:17:35 +07:00
Sergey M․
a272fe21a8
[ChangeLog] Actualize
[ci skip]
2020-12-26 23:13:26 +07:00
Sergey M․
cec1c2f211
[instagram] Fix test 2020-12-26 23:00:15 +07:00
Sergey M․
12053450dc
[instagram] Fix comment count extraction 2020-12-26 23:00:15 +07:00
Sergey M․
46cffb0c47
[instagram] Add support for reel URLs (closes #26234, closes #26250) 2020-12-26 23:00:15 +07:00
Remita Amine
c32a059f52 [bbc] switch to media selector v6
closes #23232
closes #23933
closes #26303
closes #26432
closes #26821
closes #27538
2020-12-26 16:57:02 +01:00
Sergey M․
6911312e53
[instagram] Improve thumbnail extraction 2020-12-26 22:42:58 +07:00
Sergey M․
f22b5a6b96
[instagram] Improve extraction (closes #22880) 2020-12-26 22:37:41 +07:00
Andrew Udvare
58e55198c1
[instagram] Fix extraction when authenticated (closes #27422) 2020-12-26 22:31:55 +07:00
Sergey M․
d61ed9f2f1
[spankbang] Remove unused import 2020-12-26 22:14:31 +07:00
Sergey M․
8bc4c6350e
[spangbang:playlist] Fix extraction (closes #24087) 2020-12-26 21:58:26 +07:00
Sergey M․
cfa4ffa23b
[spangbang] Add support for playlist videos 2020-12-26 21:55:12 +07:00
Sergey M․
4f1dc1463d
[pornhub] Improve like and dislike count extraction (closes #27356) 2020-12-26 21:24:43 +07:00
Sergey M․
17e0f41d34
[pornhub] Fix review issues (closes #27393) 2020-12-26 21:17:17 +07:00
JChris246
b57b27ff8f
[pornhub] Fix lq formats extraction (closes #27386) 2020-12-26 21:17:11 +07:00
Marco Fantauzzo
bbe8cc6662
[README.md] Update reference to cookie export extension for Chrome (closes #26885) (#27433)
The cookies.txt extension doesn't exist anymore on the Chrome Web Store (see https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg), so I propose to change the link in the README.md to another similar extension called Get cookies.txt (https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid/) with the same functions and utility of the old one.

This PR close #26885
2020-12-26 20:50:39 +07:00
Sergey M․
98106accb6
[bongacams] Add extractor (closes #27440) 2020-12-26 20:30:19 +07:00
Sergey M․
af1312bfc3
[youtube:tab] Extend _VALID_URL (closes #27501) 2020-12-26 19:59:57 +07:00
18 changed files with 271 additions and 165 deletions

View File

@ -18,7 +18,7 @@ title: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.22. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -26,7 +26,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2020.12.22**
- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
@ -41,7 +41,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.12.22
[debug] youtube-dl version 2020.12.26
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@ -19,7 +19,7 @@ labels: 'site-support-request'
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.22. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2020.12.22**
- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones

View File

@ -18,13 +18,13 @@ title: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.22. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2020.12.22**
- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones

View File

@ -18,7 +18,7 @@ title: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.22. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2020.12.22**
- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
@ -43,7 +43,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.12.22
[debug] youtube-dl version 2020.12.26
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@ -19,13 +19,13 @@ labels: 'request'
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.22. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2020.12.22**
- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
- [ ] I've searched the bugtracker for similar feature requests including closed ones

View File

@ -1,3 +1,35 @@
version 2020.12.26
Extractors
* [instagram] Fix comment count extraction
+ [instagram] Add support for reel URLs (#26234, #26250)
* [bbc] Switch to media selector v6 (#23232, #23933, #26303, #26432, #26821,
#27538)
* [instagram] Improve thumbnail extraction
* [instagram] Fix extraction when authenticated (#22880, #26377, #26981,
#27422)
* [spankbang:playlist] Fix extraction (#24087)
+ [spankbang] Add support for playlist videos
* [pornhub] Improve like and dislike count extraction (#27356)
* [pornhub] Fix lq formats extraction (#27386, #27393)
+ [bongacams] Add support for bongacams.com (#27440)
* [youtube:tab] Extend URL regular expression (#27501)
* [theweatherchannel] Fix extraction (#25930, #26051)
+ [sprout] Add support for Universal Kids (#22518)
* [theplatform] Allow passing geo bypass countries from other extractors
+ [wistia] Add support for playlists (#27533)
+ [ctv] Add support for ctv.ca (#27525)
* [9c9media] Improve info extraction
* [youtube] Fix automatic captions extraction (#27162, #27388)
* [sonyliv] Fix title for movies
* [sonyliv] Fix extraction (#25667)
* [streetvoice] Fix extraction (#27455, #27492)
+ [facebook] Add support for watchparty pages (#27507)
* [cbslocal] Fix video extraction
+ [brightcove] Add another method to extract policyKey
* [mewatch] Relax URL regular expression (#27506)
version 2020.12.22
Core

View File

@ -880,7 +880,7 @@ Either prepend `https://www.youtube.com/watch?v=` or separate the ID from the op
Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [cookies.txt](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) (for Firefox).
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [Get cookies.txt](https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid/) (for Chrome) or [cookies.txt](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) (for Firefox).
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.

View File

@ -112,6 +112,7 @@
- **blinkx**
- **Bloomberg**
- **BokeCC**
- **BongaCams**
- **BostonGlobe**
- **Box**
- **Bpb**: Bundeszentrale für politische Bildung
@ -146,6 +147,7 @@
- **CBS**
- **CBSInteractive**
- **CBSLocal**
- **CBSLocalArticle**
- **cbsnews**: CBS News
- **cbsnews:embed**
- **cbsnews:livevideo**: CBS News Live Videos
@ -198,6 +200,7 @@
- **CSNNE**
- **CSpan**: C-SPAN
- **CtsNews**: 華視新聞
- **CTV**
- **CTVNews**
- **cu.ntv.co.jp**: Nippon Television Network
- **Culturebox**
@ -1108,6 +1111,7 @@
- **WeiboMobile**
- **WeiqiTV**: WQTV
- **Wistia**
- **WistiaPlaylist**
- **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **WorldStarHipHop**
- **WSJ**: Wall Street Journal

View File

@ -36,7 +36,7 @@ class TestAllURLsMatching(unittest.TestCase):
assertPlaylist('UUBABnxM4Ar9ten8Mdjj1j0Q') # 585
assertPlaylist('PL63F0C78739B09958')
assertTab('https://www.youtube.com/playlist?list=UUBABnxM4Ar9ten8Mdjj1j0Q')
assertPlaylist('https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')
assertTab('https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')
assertTab('https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC')
assertTab('https://www.youtube.com/watch?v=AV6J6_AeFEQ&playnext=1&list=PL4023E734DA416012') # 668
self.assertFalse('youtube:playlist' in self.matching_ies('PLtS2H6bU1M'))
@ -57,8 +57,8 @@ class TestAllURLsMatching(unittest.TestCase):
assertChannel('https://www.youtube.com/channel/HCtnHdj3df7iM?feature=gb_ch_rec')
assertChannel('https://www.youtube.com/channel/HCtnHdj3df7iM/videos')
# def test_youtube_user_matching(self):
# self.assertMatch('http://www.youtube.com/NASAgovVideo/videos', ['youtube:tab'])
def test_youtube_user_matching(self):
self.assertMatch('http://www.youtube.com/NASAgovVideo/videos', ['youtube:tab'])
def test_youtube_feeds(self):
self.assertMatch('https://www.youtube.com/feed/library', ['youtube:tab'])

View File

@ -49,22 +49,17 @@ class BBCCoUkIE(InfoExtractor):
_LOGIN_URL = 'https://account.bbc.com/signin'
_NETRC_MACHINE = 'bbc'
_MEDIASELECTOR_URLS = [
_MEDIA_SELECTOR_URL_TEMPL = 'https://open.live.bbc.co.uk/mediaselector/6/select/version/2.0/mediaset/%s/vpid/%s'
_MEDIA_SETS = [
# Provides HQ HLS streams with even better quality that pc mediaset but fails
# with geolocation in some cases when it's even not geo restricted at all (e.g.
# http://www.bbc.co.uk/programmes/b06bp7lf). Also may fail with selectionunavailable.
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s',
'iptv-all',
'pc',
]
_MEDIASELECTION_NS = 'http://bbc.co.uk/2008/mp/mediaselection'
_EMP_PLAYLIST_NS = 'http://bbc.co.uk/2008/emp/playlist'
_NAMESPACES = (
_MEDIASELECTION_NS,
_EMP_PLAYLIST_NS,
)
_TESTS = [
{
'url': 'http://www.bbc.co.uk/programmes/b039g8p7',
@ -261,8 +256,6 @@ class BBCCoUkIE(InfoExtractor):
'only_matching': True,
}]
_USP_RE = r'/([^/]+?)\.ism(?:\.hlsv2\.ism)?/[^/]+\.m3u8'
def _login(self):
username, password = self._get_login_info()
if username is None:
@ -307,22 +300,14 @@ class BBCCoUkIE(InfoExtractor):
def _extract_items(self, playlist):
return playlist.findall('./{%s}item' % self._EMP_PLAYLIST_NS)
def _findall_ns(self, element, xpath):
elements = []
for ns in self._NAMESPACES:
elements.extend(element.findall(xpath % ns))
return elements
def _extract_medias(self, media_selection):
error = media_selection.find('./{%s}error' % self._MEDIASELECTION_NS)
if error is None:
media_selection.find('./{%s}error' % self._EMP_PLAYLIST_NS)
if error is not None:
raise BBCCoUkIE.MediaSelectionError(error.get('id'))
return self._findall_ns(media_selection, './{%s}media')
error = media_selection.get('result')
if error:
raise BBCCoUkIE.MediaSelectionError(error)
return media_selection.get('media') or []
def _extract_connections(self, media):
return self._findall_ns(media, './{%s}connection')
return media.get('connection') or []
def _get_subtitles(self, media, programme_id):
subtitles = {}
@ -334,13 +319,13 @@ class BBCCoUkIE(InfoExtractor):
cc_url, programme_id, 'Downloading captions', fatal=False)
if not isinstance(captions, compat_etree_Element):
continue
lang = captions.get('{http://www.w3.org/XML/1998/namespace}lang', 'en')
subtitles[lang] = [
subtitles['en'] = [
{
'url': connection.get('href'),
'ext': 'ttml',
},
]
break
return subtitles
def _raise_extractor_error(self, media_selection_error):
@ -350,10 +335,10 @@ class BBCCoUkIE(InfoExtractor):
def _download_media_selector(self, programme_id):
last_exception = None
for mediaselector_url in self._MEDIASELECTOR_URLS:
for media_set in self._MEDIA_SETS:
try:
return self._download_media_selector_url(
mediaselector_url % programme_id, programme_id)
self._MEDIA_SELECTOR_URL_TEMPL % (media_set, programme_id), programme_id)
except BBCCoUkIE.MediaSelectionError as e:
if e.id in ('notukerror', 'geolocation', 'selectionunavailable'):
last_exception = e
@ -362,8 +347,8 @@ class BBCCoUkIE(InfoExtractor):
self._raise_extractor_error(last_exception)
def _download_media_selector_url(self, url, programme_id=None):
media_selection = self._download_xml(
url, programme_id, 'Downloading media selection XML',
media_selection = self._download_json(
url, programme_id, 'Downloading media selection JSON',
expected_status=(403, 404))
return self._process_media_selector(media_selection, programme_id)
@ -377,7 +362,6 @@ class BBCCoUkIE(InfoExtractor):
if kind in ('video', 'audio'):
bitrate = int_or_none(media.get('bitrate'))
encoding = media.get('encoding')
service = media.get('service')
width = int_or_none(media.get('width'))
height = int_or_none(media.get('height'))
file_size = int_or_none(media.get('media_file_size'))
@ -392,8 +376,6 @@ class BBCCoUkIE(InfoExtractor):
supplier = connection.get('supplier')
transfer_format = connection.get('transferFormat')
format_id = supplier or conn_kind or protocol
if service:
format_id = '%s_%s' % (service, format_id)
# ASX playlist
if supplier == 'asx':
for i, ref in enumerate(self._extract_asx_playlist(connection, programme_id)):
@ -408,20 +390,11 @@ class BBCCoUkIE(InfoExtractor):
formats.extend(self._extract_m3u8_formats(
href, programme_id, ext='mp4', entry_protocol='m3u8_native',
m3u8_id=format_id, fatal=False))
if re.search(self._USP_RE, href):
usp_formats = self._extract_m3u8_formats(
re.sub(self._USP_RE, r'/\1.ism/\1.m3u8', href),
programme_id, ext='mp4', entry_protocol='m3u8_native',
m3u8_id=format_id, fatal=False)
for f in usp_formats:
if f.get('height') and f['height'] > 720:
continue
formats.append(f)
elif transfer_format == 'hds':
formats.extend(self._extract_f4m_formats(
href, programme_id, f4m_id=format_id, fatal=False))
else:
if not service and not supplier and bitrate:
if not supplier and bitrate:
format_id += '-%d' % bitrate
fmt = {
'format_id': format_id,
@ -554,7 +527,7 @@ class BBCCoUkIE(InfoExtractor):
webpage = self._download_webpage(url, group_id, 'Downloading video page')
error = self._search_regex(
r'<div\b[^>]+\bclass=["\']smp__message delta["\'][^>]*>([^<]+)<',
r'<div\b[^>]+\bclass=["\'](?:smp|playout)__message delta["\'][^>]*>\s*([^<]+?)\s*<',
webpage, 'error', default=None)
if error:
raise ExtractorError(error, expected=True)
@ -607,16 +580,9 @@ class BBCIE(BBCCoUkIE):
IE_DESC = 'BBC'
_VALID_URL = r'https?://(?:www\.)?bbc\.(?:com|co\.uk)/(?:[^/]+/)+(?P<id>[^/#?]+)'
_MEDIASELECTOR_URLS = [
# Provides HQ HLS streams but fails with geolocation in some cases when it's
# even not geo restricted at all
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
# Provides more formats, namely direct mp4 links, but fails on some videos with
# notukerror for non UK (?) users (e.g.
# http://www.bbc.com/travel/story/20150625-sri-lankas-spicy-secret)
'http://open.live.bbc.co.uk/mediaselector/4/mtis/stream/%s',
# Provides fewer formats, but works everywhere for everybody (hopefully)
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/journalism-pc/vpid/%s',
_MEDIA_SETS = [
'mobile-tablet-main',
'pc',
]
_TESTS = [{

View File

@ -0,0 +1,60 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
try_get,
urlencode_postdata,
)
class BongaCamsIE(InfoExtractor):
_VALID_URL = r'https?://(?P<host>(?:[^/]+\.)?bongacams\d*\.com)/(?P<id>[^/?&#]+)'
_TESTS = [{
'url': 'https://de.bongacams.com/azumi-8',
'only_matching': True,
}, {
'url': 'https://cn.bongacams.com/azumi-8',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
channel_id = mobj.group('id')
amf = self._download_json(
'https://%s/tools/amf.php' % host, channel_id,
data=urlencode_postdata((
('method', 'getRoomData'),
('args[]', channel_id),
('args[]', 'false'),
)), headers={'X-Requested-With': 'XMLHttpRequest'})
server_url = amf['localData']['videoServerUrl']
uploader_id = try_get(
amf, lambda x: x['performerData']['username'], compat_str) or channel_id
uploader = try_get(
amf, lambda x: x['performerData']['displayName'], compat_str)
like_count = int_or_none(try_get(
amf, lambda x: x['performerData']['loversCount']))
formats = self._extract_m3u8_formats(
'%s/hls/stream_%s/playlist.m3u8' % (server_url, uploader_id),
channel_id, 'mp4', m3u8_id='hls', live=True)
self._sort_formats(formats)
return {
'id': channel_id,
'title': self._live_title(uploader or uploader_id),
'uploader': uploader,
'uploader_id': uploader_id,
'like_count': like_count,
'age_limit': 18,
'is_live': True,
'formats': formats,
}

View File

@ -119,6 +119,7 @@ from .bleacherreport import (
from .blinkx import BlinkxIE
from .bloomberg import BloombergIE
from .bokecc import BokeCCIE
from .bongacams import BongaCamsIE
from .bostonglobe import BostonGlobeIE
from .box import BoxIE
from .bpb import BpbIE

View File

@ -2024,22 +2024,6 @@ class GenericIE(InfoExtractor):
},
'add_ie': [SpringboardPlatformIE.ie_key()],
},
{
'url': 'https://www.youtube.com/shared?ci=1nEzmT-M4fU',
'info_dict': {
'id': 'uPDB5I9wfp8',
'ext': 'webm',
'title': 'Pocoyo: 90 minutos de episódios completos Português para crianças - PARTE 3',
'description': 'md5:d9e4d9346a2dfff4c7dc4c8cec0f546d',
'upload_date': '20160219',
'uploader': 'Pocoyo - Português (BR)',
'uploader_id': 'PocoyoBrazil',
},
'add_ie': [YoutubeIE.ie_key()],
'params': {
'skip_download': True,
},
},
{
'url': 'https://www.yapfiles.ru/show/1872528/690b05d3054d2dbe1e69523aa21bb3b1.mp4.html',
'info_dict': {

View File

@ -22,7 +22,7 @@ from ..utils import (
class InstagramIE(InfoExtractor):
_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com/(?:p|tv)/(?P<id>[^/?#&]+))'
_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com/(?:p|tv|reel)/(?P<id>[^/?#&]+))'
_TESTS = [{
'url': 'https://instagram.com/p/aye83DjauH/?foo=bar#abc',
'md5': '0d2da106a9d2631273e192b372806516',
@ -35,7 +35,7 @@ class InstagramIE(InfoExtractor):
'timestamp': 1371748545,
'upload_date': '20130620',
'uploader_id': 'naomipq',
'uploader': 'Naomi Leonor Phan-Quang',
'uploader': 'B E A U T Y F O R A S H E S',
'like_count': int,
'comment_count': int,
'comments': list,
@ -95,6 +95,9 @@ class InstagramIE(InfoExtractor):
}, {
'url': 'https://www.instagram.com/tv/aye83DjauH/',
'only_matching': True,
}, {
'url': 'https://www.instagram.com/reel/CDUMkliABpa/',
'only_matching': True,
}]
@staticmethod
@ -122,9 +125,9 @@ class InstagramIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
(video_url, description, thumbnail, timestamp, uploader,
(media, video_url, description, thumbnail, timestamp, uploader,
uploader_id, like_count, comment_count, comments, height,
width) = [None] * 11
width) = [None] * 12
shared_data = self._parse_json(
self._search_regex(
@ -137,59 +140,77 @@ class InstagramIE(InfoExtractor):
(lambda x: x['entry_data']['PostPage'][0]['graphql']['shortcode_media'],
lambda x: x['entry_data']['PostPage'][0]['media']),
dict)
if media:
video_url = media.get('video_url')
height = int_or_none(media.get('dimensions', {}).get('height'))
width = int_or_none(media.get('dimensions', {}).get('width'))
description = try_get(
media, lambda x: x['edge_media_to_caption']['edges'][0]['node']['text'],
compat_str) or media.get('caption')
thumbnail = media.get('display_src')
timestamp = int_or_none(media.get('taken_at_timestamp') or media.get('date'))
uploader = media.get('owner', {}).get('full_name')
uploader_id = media.get('owner', {}).get('username')
# _sharedData.entry_data.PostPage is empty when authenticated (see
# https://github.com/ytdl-org/youtube-dl/pull/22880)
if not media:
additional_data = self._parse_json(
self._search_regex(
r'window\.__additionalDataLoaded\s*\(\s*[^,]+,\s*({.+?})\s*\)\s*;',
webpage, 'additional data', default='{}'),
video_id, fatal=False)
if additional_data:
media = try_get(
additional_data, lambda x: x['graphql']['shortcode_media'],
dict)
if media:
video_url = media.get('video_url')
height = int_or_none(media.get('dimensions', {}).get('height'))
width = int_or_none(media.get('dimensions', {}).get('width'))
description = try_get(
media, lambda x: x['edge_media_to_caption']['edges'][0]['node']['text'],
compat_str) or media.get('caption')
thumbnail = media.get('display_src') or media.get('display_url')
timestamp = int_or_none(media.get('taken_at_timestamp') or media.get('date'))
uploader = media.get('owner', {}).get('full_name')
uploader_id = media.get('owner', {}).get('username')
def get_count(key, kind):
return int_or_none(try_get(
def get_count(keys, kind):
if not isinstance(keys, (list, tuple)):
keys = [keys]
for key in keys:
count = int_or_none(try_get(
media, (lambda x: x['edge_media_%s' % key]['count'],
lambda x: x['%ss' % kind]['count'])))
like_count = get_count('preview_like', 'like')
comment_count = get_count('to_comment', 'comment')
if count is not None:
return count
like_count = get_count('preview_like', 'like')
comment_count = get_count(
('preview_comment', 'to_comment', 'to_parent_comment'), 'comment')
comments = [{
'author': comment.get('user', {}).get('username'),
'author_id': comment.get('user', {}).get('id'),
'id': comment.get('id'),
'text': comment.get('text'),
'timestamp': int_or_none(comment.get('created_at')),
} for comment in media.get(
'comments', {}).get('nodes', []) if comment.get('text')]
if not video_url:
edges = try_get(
media, lambda x: x['edge_sidecar_to_children']['edges'],
list) or []
if edges:
entries = []
for edge_num, edge in enumerate(edges, start=1):
node = try_get(edge, lambda x: x['node'], dict)
if not node:
continue
node_video_url = url_or_none(node.get('video_url'))
if not node_video_url:
continue
entries.append({
'id': node.get('shortcode') or node['id'],
'title': 'Video %d' % edge_num,
'url': node_video_url,
'thumbnail': node.get('display_url'),
'width': int_or_none(try_get(node, lambda x: x['dimensions']['width'])),
'height': int_or_none(try_get(node, lambda x: x['dimensions']['height'])),
'view_count': int_or_none(node.get('video_view_count')),
})
return self.playlist_result(
entries, video_id,
'Post by %s' % uploader_id if uploader_id else None,
description)
comments = [{
'author': comment.get('user', {}).get('username'),
'author_id': comment.get('user', {}).get('id'),
'id': comment.get('id'),
'text': comment.get('text'),
'timestamp': int_or_none(comment.get('created_at')),
} for comment in media.get(
'comments', {}).get('nodes', []) if comment.get('text')]
if not video_url:
edges = try_get(
media, lambda x: x['edge_sidecar_to_children']['edges'],
list) or []
if edges:
entries = []
for edge_num, edge in enumerate(edges, start=1):
node = try_get(edge, lambda x: x['node'], dict)
if not node:
continue
node_video_url = url_or_none(node.get('video_url'))
if not node_video_url:
continue
entries.append({
'id': node.get('shortcode') or node['id'],
'title': 'Video %d' % edge_num,
'url': node_video_url,
'thumbnail': node.get('display_url'),
'width': int_or_none(try_get(node, lambda x: x['dimensions']['width'])),
'height': int_or_none(try_get(node, lambda x: x['dimensions']['height'])),
'view_count': int_or_none(node.get('video_view_count')),
})
return self.playlist_result(
entries, video_id,
'Post by %s' % uploader_id if uploader_id else None,
description)
if not video_url:
video_url = self._og_search_video_url(webpage, secure=False)

View File

@ -288,14 +288,24 @@ class PornHubIE(PornHubBaseIE):
video_urls.append((v_url, None))
video_urls_set.add(v_url)
def parse_quality_items(quality_items):
q_items = self._parse_json(quality_items, video_id, fatal=False)
if not isinstance(q_items, list):
return
for item in q_items:
if isinstance(item, dict):
add_video_url(item.get('url'))
if not video_urls:
FORMAT_PREFIXES = ('media', 'quality')
FORMAT_PREFIXES = ('media', 'quality', 'qualityItems')
js_vars = extract_js_vars(
webpage, r'(var\s+(?:%s)_.+)' % '|'.join(FORMAT_PREFIXES),
default=None)
if js_vars:
for key, format_url in js_vars.items():
if any(key.startswith(p) for p in FORMAT_PREFIXES):
if key.startswith(FORMAT_PREFIXES[-1]):
parse_quality_items(format_url)
elif any(key.startswith(p) for p in FORMAT_PREFIXES[:2]):
add_video_url(format_url)
if not video_urls and re.search(
r'<[^>]+\bid=["\']lockedPlayer', webpage):
@ -351,12 +361,16 @@ class PornHubIE(PornHubBaseIE):
r'(?s)From:&nbsp;.+?<(?:a\b[^>]+\bhref=["\']/(?:(?:user|channel)s|model|pornstar)/|span\b[^>]+\bclass=["\']username)[^>]+>(.+?)<',
webpage, 'uploader', default=None)
def extract_vote_count(kind, name):
return self._extract_count(
(r'<span[^>]+\bclass="votes%s"[^>]*>([\d,\.]+)</span>' % kind,
r'<span[^>]+\bclass=["\']votes%s["\'][^>]*\bdata-rating=["\'](\d+)' % kind),
webpage, name)
view_count = self._extract_count(
r'<span class="count">([\d,\.]+)</span> [Vv]iews', webpage, 'view')
like_count = self._extract_count(
r'<span[^>]+class="votesUp"[^>]*>([\d,\.]+)</span>', webpage, 'like')
dislike_count = self._extract_count(
r'<span[^>]+class="votesDown"[^>]*>([\d,\.]+)</span>', webpage, 'dislike')
like_count = extract_vote_count('Up', 'like')
dislike_count = extract_vote_count('Down', 'dislike')
comment_count = self._extract_count(
r'All Comments\s*<span>\(([\d,.]+)\)', webpage, 'comment')

View File

@ -7,17 +7,24 @@ from ..utils import (
determine_ext,
ExtractorError,
merge_dicts,
orderedSet,
parse_duration,
parse_resolution,
str_to_int,
url_or_none,
urlencode_postdata,
urljoin,
)
class SpankBangIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?spankbang\.com/(?P<id>[\da-z]+)/(?:video|play|embed)\b'
_VALID_URL = r'''(?x)
https?://
(?:[^/]+\.)?spankbang\.com/
(?:
(?P<id>[\da-z]+)/(?:video|play|embed)\b|
[\da-z]+-(?P<id_2>[\da-z]+)/playlist/[^/?#&]+
)
'''
_TESTS = [{
'url': 'http://spankbang.com/3vvn/video/fantasy+solo',
'md5': '1cc433e1d6aa14bc376535b8679302f7',
@ -57,10 +64,14 @@ class SpankBangIE(InfoExtractor):
}, {
'url': 'https://spankbang.com/2y3td/embed/',
'only_matching': True,
}, {
'url': 'https://spankbang.com/2v7ik-7ecbgu/playlist/latina+booty',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id') or mobj.group('id_2')
webpage = self._download_webpage(
url.replace('/%s/embed' % video_id, '/%s/video' % video_id),
video_id, headers={'Cookie': 'country=US'})
@ -155,30 +166,33 @@ class SpankBangIE(InfoExtractor):
class SpankBangPlaylistIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?spankbang\.com/(?P<id>[\da-z]+)/playlist/[^/]+'
_VALID_URL = r'https?://(?:[^/]+\.)?spankbang\.com/(?P<id>[\da-z]+)/playlist/(?P<display_id>[^/]+)'
_TEST = {
'url': 'https://spankbang.com/ug0k/playlist/big+ass+titties',
'info_dict': {
'id': 'ug0k',
'title': 'Big Ass Titties',
},
'playlist_mincount': 50,
'playlist_mincount': 40,
}
def _real_extract(self, url):
playlist_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
playlist_id = mobj.group('id')
display_id = mobj.group('display_id')
webpage = self._download_webpage(
url, playlist_id, headers={'Cookie': 'country=US; mobile=on'})
entries = [self.url_result(
'https://spankbang.com/%s/video' % video_id,
ie=SpankBangIE.ie_key(), video_id=video_id)
for video_id in orderedSet(re.findall(
r'<a[^>]+\bhref=["\']/?([\da-z]+)/play/', webpage))]
urljoin(url, mobj.group('path')),
ie=SpankBangIE.ie_key(), video_id=mobj.group('id'))
for mobj in re.finditer(
r'<a[^>]+\bhref=(["\'])(?P<path>/?[\da-z]+-(?P<id>[\da-z]+)/playlist/%s(?:(?!\1).)*)\1'
% re.escape(display_id), webpage)]
title = self._html_search_regex(
r'<h1>([^<]+)\s+playlist</h1>', webpage, 'playlist title',
r'<h1>([^<]+)\s+playlist\s*<', webpage, 'playlist title',
fatal=False)
return self.playlist_result(entries, playlist_id, title)

View File

@ -2442,7 +2442,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
)/
(?:
(?:channel|c|user|feed)/|
(?:playlist|watch)\?.*?\blist=
(?:playlist|watch)\?.*?\blist=|
(?!(?:watch|embed|v|e)\b)
)
(?P<id>[^/?\#&]+)
'''
@ -2711,13 +2712,22 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
# inline playlist with not always working continuations
'url': 'https://www.youtube.com/watch?v=UC6u0Tct-Fo&list=PL36D642111D65BE7C',
'only_matching': True,
}
# TODO
# {
# 'url': 'https://www.youtube.com/TheYoungTurks/live',
# 'only_matching': True,
# }
]
}, {
'url': 'https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8',
'only_matching': True,
}, {
'url': 'https://www.youtube.com/course',
'only_matching': True,
}, {
'url': 'https://www.youtube.com/zsecurity',
'only_matching': True,
}, {
'url': 'http://www.youtube.com/NASAgovVideo/videos',
'only_matching': True,
}, {
'url': 'https://www.youtube.com/TheYoungTurks/live',
'only_matching': True,
}]
def _extract_channel_id(self, webpage):
channel_id = self._html_search_meta(

View File

@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2020.12.22'
__version__ = '2020.12.26'