Compare commits

...

37 Commits

Author SHA1 Message Date
Sergey M․
4066945919 release 2020.12.31 2020-12-31 05:17:55 +07:00
Sergey M․
2a84694b1e [ChangeLog] Actualize
[ci skip]
2020-12-31 05:14:33 +07:00
Sergey M․
4046ffe1e1 [redditr] Fix review issues and extract source thumbnail (closes #27503) 2020-12-31 05:07:57 +07:00
ozburo
d1d0612160 [redditr] Extract all thumbnails 2020-12-31 05:07:51 +07:00
Remita Amine
7b0f04ed1f [vvvvid] imporove info extraction 2020-12-30 18:16:47 +01:00
nixxo
2e21b06ea2 [vvvvid] add playlists support (#27574)
closes #18130
2020-12-30 18:12:17 +01:00
Remita Amine
a6f75e6e89 [yandexdisk] extract info from webpage
the public API does not return metadata when download limit is reached
2020-12-30 16:45:53 +01:00
Remita Amine
bd18824c2a [yandexdisk] fix extraction(closes #17861)(closes #27131) 2020-12-30 13:43:56 +01:00
Remita Amine
bdd044e67b [yandexvideo] use old api call as fallback 2020-12-30 13:30:11 +01:00
Remita Amine
f7e95fb2a0 [yandexvideo] fix extraction(closes #25000) 2020-12-30 09:30:30 +01:00
Remita Amine
9dd674e1d2 [utils] accept only supported protocols in url_or_none 2020-12-30 09:22:30 +01:00
Remita Amine
9c1e164e0c [YoutubeDL] Allow format filtering using audio language(#16209) 2020-12-29 19:29:08 +01:00
Remita Amine
c706fbe9fe [nbc] Remove CSNNE extractor 2020-12-29 17:21:05 +01:00
Remita Amine
ebdcf70b0d [nbc] fix NBCSport VPlayer URL extraction(closes #16640) 2020-12-29 17:15:13 +01:00
Remita Amine
5966095e65 [aenetworks] fix HistoryPlayerIE tests 2020-12-29 16:59:31 +01:00
Remita Amine
9ee984fc76 [aenetworks] add support for biography.com (closes #3863) 2020-12-29 16:13:36 +01:00
Remita Amine
53528e1d23 [uktvplay] match new video URLs(closes #17909) 2020-12-29 14:11:37 +01:00
Remita Amine
c931c4b8dd [sevenplay] detect API errors 2020-12-29 14:11:37 +01:00
Remita Amine
7acd042bbb [tenplay] fix format extraction(closes #26653) 2020-12-29 14:11:37 +01:00
Remita Amine
bcfe485e01 [brightcove] raise ExtractorError for DRM protected videos(closes #23467)(closes #27568) 2020-12-29 14:11:37 +01:00
Sergey M․
479cc6d5a1 release 2020.12.29 2020-12-29 02:52:31 +07:00
Sergey M․
38286ee729 [ChangeLog] Actualize
[ci skip]
2020-12-29 02:49:53 +07:00
Sergey M․
1a95953867 [youtube] Improve yt initial data extraction (closes #27524) 2020-12-29 02:29:34 +07:00
Sergey M․
71febd1c52 [youtube:tab] Improve URL matching (closes #27559) 2020-12-29 02:19:43 +07:00
Sergey M․
f1bc56c99b [youtube:tab] Restore retry on browse requests (closes #27313, closes #27564) 2020-12-29 02:11:48 +07:00
Remita Amine
64e419bd73 [aparat] Fix extraction
closes #22285
closes #22611
closes #23348
closes #24354
closes #24591
closes #24904
closes #25418
closes #26070
closes #26350
closes #26738
closes #27563
2020-12-28 18:19:30 +01:00
Remita Amine
782ea947b4 [brightcove] remove sonyliv specific code 2020-12-28 11:12:57 +01:00
Remita Amine
f27224d57b [piksel] import format extraction 2020-12-28 10:50:29 +01:00
Remita Amine
c007188598 [zype] Add support for uplynk videos 2020-12-27 23:47:28 +01:00
Remita Amine
af93ecfd88 [toggle] add support for live.mewatch.sg (closes #27555) 2020-12-27 22:26:20 +01:00
JamKage
794771a164 [go] Added support for FXNetworks (#26826)
Co-authored-by: James Kirrage <james.kirrage@mortgagegym.com>

closes #13972
closes #22467
closes #23754
2020-12-27 17:36:21 +00:00
Sergey M․
6f2eaaf73d [teachable] Improve embed detection (closes #26923) 2020-12-27 22:57:50 +07:00
Remita Amine
4c7a4dbc4d [mitele] fix free video extraction(#24624)(closes #25827)(closes #26757) 2020-12-27 16:22:43 +01:00
Remita Amine
f86b299d0e [telecinco] fix extraction 2020-12-27 16:22:43 +01:00
Sergey M
e474996541 [youtube] Update invidious.snopyta.org (#22667)
Co-authored-by: sofutru <54445344+sofutru@users.noreply.github.com>
2020-12-27 21:15:09 +07:00
Remita Amine
aed617e311 [amcnetworks] improve auth only video detection(closes #27548) 2020-12-27 09:00:08 +01:00
Remita Amine
0fa67c1d68 [generic] Add support for VHX Embeds(#27546) 2020-12-27 09:00:07 +01:00
37 changed files with 727 additions and 441 deletions

View File

@@ -18,7 +18,7 @@ title: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.31. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@@ -26,7 +26,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
- [ ] I've verified that I'm running youtube-dl version **2020.12.31**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
@@ -41,7 +41,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.12.26
[debug] youtube-dl version 2020.12.31
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -19,7 +19,7 @@ labels: 'site-support-request'
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.31. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
@@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
- [ ] I've verified that I'm running youtube-dl version **2020.12.31**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones

View File

@@ -18,13 +18,13 @@ title: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.31. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
- [ ] I've verified that I'm running youtube-dl version **2020.12.31**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones

View File

@@ -18,7 +18,7 @@ title: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.31. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
- [ ] I've verified that I'm running youtube-dl version **2020.12.31**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
@@ -43,7 +43,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.12.26
[debug] youtube-dl version 2020.12.31
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -19,13 +19,13 @@ labels: 'request'
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.26. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.31. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2020.12.26**
- [ ] I've verified that I'm running youtube-dl version **2020.12.31**
- [ ] I've searched the bugtracker for similar feature requests including closed ones

View File

@@ -1,3 +1,47 @@
version 2020.12.31
Core
* [utils] Accept only supported protocols in url_or_none
* [YoutubeDL] Allow format filtering using audio language (#16209)
Extractors
+ [redditr] Extract all thumbnails (#27503)
* [vvvvid] Improve info extraction
+ [vvvvid] Add support for playlists (#18130, #27574)
+ [yandexdisk] Extract info from webpage
* [yandexdisk] Fix extraction (#17861, #27131)
* [yandexvideo] Use old API call as fallback
* [yandexvideo] Fix extraction (#25000)
- [nbc] Remove CSNNE extractor
* [nbc] Fix NBCSport VPlayer URL extraction (#16640)
+ [aenetworks] Add support for biography.com (#3863)
* [uktvplay] Match new video URLs (#17909)
* [sevenplay] Detect API errors
* [tenplay] Fix format extraction (#26653)
* [brightcove] Raise error for DRM protected videos (#23467, #27568)
version 2020.12.29
Extractors
* [youtube] Improve yt initial data extraction (#27524)
* [youtube:tab] Improve URL matching #27559)
* [youtube:tab] Restore retry on browse requests (#27313, #27564)
* [aparat] Fix extraction (#22285, #22611, #23348, #24354, #24591, #24904,
#25418, #26070, #26350, #26738, #27563)
- [brightcove] Remove sonyliv specific code
* [piksel] Improve format extraction
+ [zype] Add support for uplynk videos
+ [toggle] Add support for live.mewatch.sg (#27555)
+ [go] Add support for fxnow.fxnetworks.com (#13972, #22467, #23754, #26826)
* [teachable] Improve embed detection (#26923)
* [mitele] Fix free video extraction (#24624, #25827, #26757)
* [telecinco] Fix extraction
* [youtube] Update invidious.snopyta.org (#22667)
* [amcnetworks] Improve auth only video detection (#27548)
+ [generic] Add support for VHX Embeds (#27546)
version 2020.12.26
Extractors

View File

@@ -678,6 +678,7 @@ Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends
- `container`: Name of the container format
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format
- `language`: Language code
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain).

View File

@@ -104,6 +104,7 @@
- **BilibiliAudioAlbum**
- **BiliBiliPlayer**
- **BioBioChileTV**
- **Biography**
- **BIQLE**
- **BitChute**
- **BitChuteChannel**
@@ -197,7 +198,6 @@
- **CrooksAndLiars**
- **crunchyroll**
- **crunchyroll:playlist**
- **CSNNE**
- **CSpan**: C-SPAN
- **CtsNews**: 華視新聞
- **CTV**
@@ -317,7 +317,6 @@
- **Funk**
- **Fusion**
- **Fux**
- **FXNetworks**
- **Gaia**
- **GameInformer**
- **GameSpot**
@@ -350,6 +349,7 @@
- **hgtv.com:show**
- **HiDive**
- **HistoricFilms**
- **history:player**
- **history:topic**: History.com Topic
- **hitbox**
- **hitbox:live**
@@ -1089,6 +1089,7 @@
- **vube**: Vube.com
- **VuClip**
- **VVVVID**
- **VVVVIDShow**
- **VyboryMos**
- **Vzaar**
- **Wakanim**

View File

@@ -554,6 +554,11 @@ class TestUtil(unittest.TestCase):
self.assertEqual(url_or_none('http$://foo.de'), None)
self.assertEqual(url_or_none('http://foo.de'), 'http://foo.de')
self.assertEqual(url_or_none('//foo.de'), '//foo.de')
self.assertEqual(url_or_none('s3://foo.de'), None)
self.assertEqual(url_or_none('rtmpte://foo.de'), 'rtmpte://foo.de')
self.assertEqual(url_or_none('mms://foo.de'), 'mms://foo.de')
self.assertEqual(url_or_none('rtspu://foo.de'), 'rtspu://foo.de')
self.assertEqual(url_or_none('ftps://foo.de'), 'ftps://foo.de')
def test_parse_age_limit(self):
self.assertEqual(parse_age_limit(None), None)

View File

@@ -1083,7 +1083,7 @@ class YoutubeDL(object):
'*=': lambda attr, value: value in attr,
}
str_operator_rex = re.compile(r'''(?x)
\s*(?P<key>ext|acodec|vcodec|container|protocol|format_id)
\s*(?P<key>ext|acodec|vcodec|container|protocol|format_id|language)
\s*(?P<negation>!\s*)?(?P<op>%s)(?P<none_inclusive>\s*\?)?
\s*(?P<value>[a-zA-Z0-9._-]+)
\s*$

View File

@@ -6,6 +6,7 @@ import re
from .theplatform import ThePlatformIE
from ..utils import (
ExtractorError,
GeoRestrictedError,
int_or_none,
update_url_query,
urlencode_postdata,
@@ -28,6 +29,7 @@ class AENetworksBaseIE(ThePlatformIE):
'lifetimemovieclub.com': ('LIFETIMEMOVIECLUB', 'lmc'),
'fyi.tv': ('FYI', 'fyi'),
'historyvault.com': (None, 'historyvault'),
'biography.com': (None, 'biography'),
}
def _extract_aen_smil(self, smil_url, video_id, auth=None):
@@ -54,6 +56,8 @@ class AENetworksBaseIE(ThePlatformIE):
tp_formats, tp_subtitles = self._extract_theplatform_smil(
m_url, video_id, 'Downloading %s SMIL data' % (q.get('switch') or q['assetTypes']))
except ExtractorError as e:
if isinstance(e, GeoRestrictedError):
raise
last_e = e
continue
formats.extend(tp_formats)
@@ -67,6 +71,34 @@ class AENetworksBaseIE(ThePlatformIE):
'subtitles': subtitles,
}
def _extract_aetn_info(self, domain, filter_key, filter_value, url):
requestor_id, brand = self._DOMAIN_MAP[domain]
result = self._download_json(
'https://feeds.video.aetnd.com/api/v2/%s/videos' % brand,
filter_value, query={'filter[%s]' % filter_key: filter_value})['results'][0]
title = result['title']
video_id = result['id']
media_url = result['publicUrl']
theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
r'https?://link\.theplatform\.com/s/([^?]+)', media_url, 'theplatform_path'), video_id)
info = self._parse_theplatform_metadata(theplatform_metadata)
auth = None
if theplatform_metadata.get('AETN$isBehindWall'):
resource = self._get_mvpd_resource(
requestor_id, theplatform_metadata['title'],
theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'),
theplatform_metadata['ratings'][0]['rating'])
auth = self._extract_mvpd_auth(
url, video_id, requestor_id, resource)
info.update(self._extract_aen_smil(media_url, video_id, auth))
info.update({
'title': title,
'series': result.get('seriesName'),
'season_number': int_or_none(result.get('tvSeasonNumber')),
'episode_number': int_or_none(result.get('tvSeasonEpisodeNumber')),
})
return info
class AENetworksIE(AENetworksBaseIE):
IE_NAME = 'aenetworks'
@@ -139,32 +171,7 @@ class AENetworksIE(AENetworksBaseIE):
def _real_extract(self, url):
domain, canonical = re.match(self._VALID_URL, url).groups()
requestor_id, brand = self._DOMAIN_MAP[domain]
result = self._download_json(
'https://feeds.video.aetnd.com/api/v2/%s/videos' % brand,
canonical, query={'filter[canonical]': '/' + canonical})['results'][0]
title = result['title']
video_id = result['id']
media_url = result['publicUrl']
theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
r'https?://link\.theplatform\.com/s/([^?]+)', media_url, 'theplatform_path'), video_id)
info = self._parse_theplatform_metadata(theplatform_metadata)
auth = None
if theplatform_metadata.get('AETN$isBehindWall'):
resource = self._get_mvpd_resource(
requestor_id, theplatform_metadata['title'],
theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'),
theplatform_metadata['ratings'][0]['rating'])
auth = self._extract_mvpd_auth(
url, video_id, requestor_id, resource)
info.update(self._extract_aen_smil(media_url, video_id, auth))
info.update({
'title': title,
'series': result.get('seriesName'),
'season_number': int_or_none(result.get('tvSeasonNumber')),
'episode_number': int_or_none(result.get('tvSeasonEpisodeNumber')),
})
return info
return self._extract_aetn_info(domain, 'canonical', '/' + canonical, url)
class AENetworksListBaseIE(AENetworksBaseIE):
@@ -294,3 +301,42 @@ class HistoryTopicIE(AENetworksBaseIE):
return self.url_result(
'http://www.history.com/videos/' + display_id,
AENetworksIE.ie_key())
class HistoryPlayerIE(AENetworksBaseIE):
IE_NAME = 'history:player'
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|biography)\.com)/player/(?P<id>\d+)'
_TESTS = []
def _real_extract(self, url):
domain, video_id = re.match(self._VALID_URL, url).groups()
return self._extract_aetn_info(domain, 'id', video_id, url)
class BiographyIE(AENetworksBaseIE):
_VALID_URL = r'https?://(?:www\.)?biography\.com/video/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://www.biography.com/video/vincent-van-gogh-full-episode-2075049808',
'info_dict': {
'id': '30322987',
'ext': 'mp4',
'title': 'Vincent Van Gogh - Full Episode',
'description': 'A full biography about the most influential 20th century painter, Vincent Van Gogh.',
'timestamp': 1311970571,
'upload_date': '20110729',
'uploader': 'AENE-NEW',
},
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['ThePlatform'],
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
player_url = self._search_regex(
r'<phoenix-iframe[^>]+src="(%s)' % HistoryPlayerIE._VALID_URL,
webpage, 'player URL')
return self.url_result(player_url, HistoryPlayerIE.ie_key())

View File

@@ -80,7 +80,8 @@ class AMCNetworksIE(ThePlatformIE):
title = theplatform_metadata['title']
rating = try_get(
theplatform_metadata, lambda x: x['ratings'][0]['rating'])
if properties.get('videoCategory') == 'TVE-Auth':
video_category = properties.get('videoCategory')
if video_category and video_category.endswith('-Auth'):
resource = self._get_mvpd_resource(
requestor_id, title, video_id, rating)
query['auth'] = self._extract_mvpd_auth(

View File

@@ -3,6 +3,7 @@ from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
get_element_by_id,
int_or_none,
merge_dicts,
mimetype2ext,
@@ -39,23 +40,15 @@ class AparatIE(InfoExtractor):
webpage = self._download_webpage(url, video_id, fatal=False)
if not webpage:
# Note: There is an easier-to-parse configuration at
# http://www.aparat.com/video/video/config/videohash/%video_id
# but the URL in there does not work
webpage = self._download_webpage(
'http://www.aparat.com/video/video/embed/vt/frame/showvideo/yes/videohash/' + video_id,
video_id)
options = self._parse_json(
self._search_regex(
r'options\s*=\s*JSON\.parse\(\s*(["\'])(?P<value>(?:(?!\1).)+)\1\s*\)',
webpage, 'options', group='value'),
video_id)
player = options['plugins']['sabaPlayerPlugin']
options = self._parse_json(self._search_regex(
r'options\s*=\s*({.+?})\s*;', webpage, 'options'), video_id)
formats = []
for sources in player['multiSRC']:
for sources in (options.get('multiSRC') or []):
for item in sources:
if not isinstance(item, dict):
continue
@@ -85,11 +78,12 @@ class AparatIE(InfoExtractor):
info = self._search_json_ld(webpage, video_id, default={})
if not info.get('title'):
info['title'] = player['title']
info['title'] = get_element_by_id('videoTitle', webpage) or \
self._html_search_meta(['og:title', 'twitter:title', 'DC.Title', 'title'], webpage, fatal=True)
return merge_dicts(info, {
'id': video_id,
'thumbnail': url_or_none(options.get('poster')),
'duration': int_or_none(player.get('duration')),
'duration': int_or_none(options.get('duration')),
'formats': formats,
})

View File

@@ -471,13 +471,18 @@ class BrightcoveNewIE(AdobePassIE):
def _parse_brightcove_metadata(self, json_data, video_id, headers={}):
title = json_data['name'].strip()
num_drm_sources = 0
formats = []
for source in json_data.get('sources', []):
sources = json_data.get('sources') or []
for source in sources:
container = source.get('container')
ext = mimetype2ext(source.get('type'))
src = source.get('src')
# https://support.brightcove.com/playback-api-video-fields-reference#key_systems_object
if ext == 'ism' or container == 'WVM' or source.get('key_systems'):
if container == 'WVM' or source.get('key_systems'):
num_drm_sources += 1
continue
elif ext == 'ism':
continue
elif ext == 'm3u8' or container == 'M2TS':
if not src:
@@ -534,20 +539,15 @@ class BrightcoveNewIE(AdobePassIE):
'format_id': build_format_id('rtmp'),
})
formats.append(f)
if not formats:
# for sonyliv.com DRM protected videos
s3_source_url = json_data.get('custom_fields', {}).get('s3sourceurl')
if s3_source_url:
formats.append({
'url': s3_source_url,
'format_id': 'source',
})
errors = json_data.get('errors')
if not formats and errors:
error = errors[0]
raise ExtractorError(
error.get('message') or error.get('error_subcode') or error['error_code'], expected=True)
if not formats:
errors = json_data.get('errors')
if errors:
error = errors[0]
raise ExtractorError(
error.get('message') or error.get('error_subcode') or error['error_code'], expected=True)
if sources and num_drm_sources == len(sources):
raise ExtractorError('This video is DRM protected.', expected=True)
self._sort_formats(formats)

View File

@@ -33,6 +33,8 @@ from .aenetworks import (
AENetworksCollectionIE,
AENetworksShowIE,
HistoryTopicIE,
HistoryPlayerIE,
BiographyIE,
)
from .afreecatv import AfreecaTVIE
from .airmozilla import AirMozillaIE
@@ -399,7 +401,6 @@ from .fujitv import FujiTVFODPlus7IE
from .funimation import FunimationIE
from .funk import FunkIE
from .fusion import FusionIE
from .fxnetworks import FXNetworksIE
from .gaia import GaiaIE
from .gameinformer import GameInformerIE
from .gamespot import GameSpotIE
@@ -691,7 +692,6 @@ from .nba import (
NBAChannelIE,
)
from .nbc import (
CSNNEIE,
NBCIE,
NBCNewsIE,
NBCOlympicsIE,
@@ -1425,7 +1425,10 @@ from .vshare import VShareIE
from .medialaan import MedialaanIE
from .vube import VubeIE
from .vuclip import VuClipIE
from .vvvvid import VVVVIDIE
from .vvvvid import (
VVVVIDIE,
VVVVIDShowIE,
)
from .vyborymos import VyboryMosIE
from .vzaar import VzaarIE
from .wakanim import WakanimIE

View File

@@ -1,77 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .adobepass import AdobePassIE
from ..utils import (
extract_attributes,
int_or_none,
parse_age_limit,
smuggle_url,
update_url_query,
)
class FXNetworksIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?(?:fxnetworks|simpsonsworld)\.com/video/(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.fxnetworks.com/video/1032565827847',
'md5': '8d99b97b4aa7a202f55b6ed47ea7e703',
'info_dict': {
'id': 'dRzwHC_MMqIv',
'ext': 'mp4',
'title': 'First Look: Better Things - Season 2',
'description': 'Because real life is like a fart. Watch this FIRST LOOK to see what inspired the new season of Better Things.',
'age_limit': 14,
'uploader': 'NEWA-FNG-FX',
'upload_date': '20170825',
'timestamp': 1503686274,
'episode_number': 0,
'season_number': 2,
'series': 'Better Things',
},
'add_ie': ['ThePlatform'],
}, {
'url': 'http://www.simpsonsworld.com/video/716094019682',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
if 'The content you are trying to access is not available in your region.' in webpage:
self.raise_geo_restricted()
video_data = extract_attributes(self._search_regex(
r'(<a.+?rel="https?://link\.theplatform\.com/s/.+?</a>)', webpage, 'video data'))
player_type = self._search_regex(r'playerType\s*=\s*[\'"]([^\'"]+)', webpage, 'player type', default=None)
release_url = video_data['rel']
title = video_data['data-title']
rating = video_data.get('data-rating')
query = {
'mbr': 'true',
}
if player_type == 'movies':
query.update({
'manifest': 'm3u',
})
else:
query.update({
'switch': 'http',
})
if video_data.get('data-req-auth') == '1':
resource = self._get_mvpd_resource(
video_data['data-channel'], title,
video_data.get('data-guid'), rating)
query['auth'] = self._extract_mvpd_auth(url, video_id, 'fx', resource)
return {
'_type': 'url_transparent',
'id': video_id,
'title': title,
'url': smuggle_url(update_url_query(release_url, query), {'force_smil_url': True}),
'series': video_data.get('data-show-title'),
'episode_number': int_or_none(video_data.get('data-episode')),
'season_number': int_or_none(video_data.get('data-season')),
'thumbnail': video_data.get('data-large-thumb'),
'age_limit': parse_age_limit(rating),
'ie_key': 'ThePlatform',
}

View File

@@ -67,7 +67,10 @@ from .tube8 import Tube8IE
from .mofosex import MofosexEmbedIE
from .spankwire import SpankwireIE
from .youporn import YouPornIE
from .vimeo import VimeoIE
from .vimeo import (
VimeoIE,
VHXEmbedIE,
)
from .dailymotion import DailymotionIE
from .dailymail import DailyMailIE
from .onionstudios import OnionStudiosIE
@@ -2193,7 +2196,18 @@ class GenericIE(InfoExtractor):
# 'params': {
# 'force_generic_extractor': True,
# },
# }
# },
{
# VHX Embed
'url': 'https://demo.vhx.tv/category-c/videos/file-example-mp4-480-1-5mg-copy',
'info_dict': {
'id': '858208',
'ext': 'mp4',
'title': 'Untitled',
'uploader_id': 'user80538407',
'uploader': 'OTT Videos',
},
},
]
def report_following_redirect(self, new_url):
@@ -2571,6 +2585,10 @@ class GenericIE(InfoExtractor):
if vimeo_urls:
return self.playlist_from_matches(vimeo_urls, video_id, video_title, ie=VimeoIE.ie_key())
vhx_url = VHXEmbedIE._extract_url(webpage)
if vhx_url:
return self.url_result(vhx_url, VHXEmbedIE.ie_key())
vid_me_embed_url = self._search_regex(
r'src=[\'"](https?://vid\.me/[^\'"]+)[\'"]',
webpage, 'vid.me embed', default=None)

View File

@@ -38,13 +38,17 @@ class GoIE(AdobePassIE):
'disneynow': {
'brand': '011',
'resource_id': 'Disney',
}
},
'fxnow.fxnetworks': {
'brand': '025',
'requestor_id': 'dtci',
},
}
_VALID_URL = r'''(?x)
https?://
(?:
(?:(?P<sub_domain>%s)\.)?go|
(?P<sub_domain_2>abc|freeform|disneynow)
(?P<sub_domain_2>abc|freeform|disneynow|fxnow\.fxnetworks)
)\.com/
(?:
(?:[^/]+/)*(?P<id>[Vv][Dd][Kk][Aa]\w+)|
@@ -99,6 +103,19 @@ class GoIE(AdobePassIE):
# m3u8 download
'skip_download': True,
},
}, {
'url': 'https://fxnow.fxnetworks.com/shows/better-things/video/vdka12782841',
'info_dict': {
'id': 'VDKA12782841',
'ext': 'mp4',
'title': 'First Look: Better Things - Season 2',
'description': 'md5:fa73584a95761c605d9d54904e35b407',
},
'params': {
'geo_bypass_ip_block': '3.244.239.0/24',
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://abc.go.com/shows/the-catch/episode-guide/season-01/10-the-wedding',
'only_matching': True,

View File

@@ -1,15 +1,14 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from .telecinco import TelecincoIE
from ..utils import (
int_or_none,
parse_iso8601,
smuggle_url,
)
class MiTeleIE(InfoExtractor):
class MiTeleIE(TelecincoIE):
IE_DESC = 'mitele.es'
_VALID_URL = r'https?://(?:www\.)?mitele\.es/(?:[^/]+/)+(?P<id>[^/]+)/player'
@@ -31,7 +30,6 @@ class MiTeleIE(InfoExtractor):
'timestamp': 1471209401,
'upload_date': '20160814',
},
'add_ie': ['Ooyala'],
}, {
# no explicit title
'url': 'http://www.mitele.es/programas-tv/cuarto-milenio/57b0de3dc915da14058b4876/player',
@@ -54,7 +52,6 @@ class MiTeleIE(InfoExtractor):
'params': {
'skip_download': True,
},
'add_ie': ['Ooyala'],
}, {
'url': 'http://www.mitele.es/series-online/la-que-se-avecina/57aac5c1c915da951a8b45ed/player',
'only_matching': True,
@@ -70,16 +67,11 @@ class MiTeleIE(InfoExtractor):
r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=\s*({.+})',
webpage, 'Pre Player'), display_id)['prePlayer']
title = pre_player['title']
video = pre_player['video']
video_id = video['dataMediaId']
video_info = self._parse_content(pre_player['video'], url)
content = pre_player.get('content') or {}
info = content.get('info') or {}
return {
'_type': 'url_transparent',
# for some reason only HLS is supported
'url': smuggle_url('ooyala:' + video_id, {'supportedformats': 'm3u8,dash'}),
'id': video_id,
video_info.update({
'title': title,
'description': info.get('synopsis'),
'series': content.get('title'),
@@ -87,7 +79,7 @@ class MiTeleIE(InfoExtractor):
'episode': content.get('subtitle'),
'episode_number': int_or_none(info.get('episode_number')),
'duration': int_or_none(info.get('duration')),
'thumbnail': video.get('dataPoster'),
'age_limit': int_or_none(info.get('rating')),
'timestamp': parse_iso8601(pre_player.get('publishedTime')),
}
})
return video_info

View File

@@ -158,7 +158,8 @@ class NBCIE(AdobePassIE):
class NBCSportsVPlayerIE(InfoExtractor):
_VALID_URL = r'https?://vplayer\.nbcsports\.com/(?:[^/]+/)+(?P<id>[0-9a-zA-Z_]+)'
_VALID_URL_BASE = r'https?://(?:vplayer\.nbcsports\.com|(?:www\.)?nbcsports\.com/vplayer)/'
_VALID_URL = _VALID_URL_BASE + r'(?:[^/]+/)+(?P<id>[0-9a-zA-Z_]+)'
_TESTS = [{
'url': 'https://vplayer.nbcsports.com/p/BxmELC/nbcsports_embed/select/9CsDKds0kvHI',
@@ -174,12 +175,15 @@ class NBCSportsVPlayerIE(InfoExtractor):
}, {
'url': 'https://vplayer.nbcsports.com/p/BxmELC/nbcsports_embed/select/media/_hqLjQ95yx8Z',
'only_matching': True,
}, {
'url': 'https://www.nbcsports.com/vplayer/p/BxmELC/nbcsports/select/PHJSaFWbrTY9?form=html&autoPlay=true',
'only_matching': True,
}]
@staticmethod
def _extract_url(webpage):
iframe_m = re.search(
r'<iframe[^>]+src="(?P<url>https?://vplayer\.nbcsports\.com/[^"]+)"', webpage)
r'<(?:iframe[^>]+|div[^>]+data-(?:mpx-)?)src="(?P<url>%s[^"]+)"' % NBCSportsVPlayerIE._VALID_URL_BASE, webpage)
if iframe_m:
return iframe_m.group('url')
@@ -192,21 +196,29 @@ class NBCSportsVPlayerIE(InfoExtractor):
class NBCSportsIE(InfoExtractor):
# Does not include https because its certificate is invalid
_VALID_URL = r'https?://(?:www\.)?nbcsports\.com//?(?:[^/]+/)+(?P<id>[0-9a-z-]+)'
_VALID_URL = r'https?://(?:www\.)?nbcsports\.com//?(?!vplayer/)(?:[^/]+/)+(?P<id>[0-9a-z-]+)'
_TEST = {
_TESTS = [{
# iframe src
'url': 'http://www.nbcsports.com//college-basketball/ncaab/tom-izzo-michigan-st-has-so-much-respect-duke',
'info_dict': {
'id': 'PHJSaFWbrTY9',
'ext': 'flv',
'ext': 'mp4',
'title': 'Tom Izzo, Michigan St. has \'so much respect\' for Duke',
'description': 'md5:ecb459c9d59e0766ac9c7d5d0eda8113',
'uploader': 'NBCU-SPORTS',
'upload_date': '20150330',
'timestamp': 1427726529,
}
}
}, {
# data-mpx-src
'url': 'https://www.nbcsports.com/philadelphia/philadelphia-phillies/bruce-bochy-hector-neris-hes-idiot',
'only_matching': True,
}, {
# data-src
'url': 'https://www.nbcsports.com/boston/video/report-card-pats-secondary-no-match-josh-allen',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -274,33 +286,6 @@ class NBCSportsStreamIE(AdobePassIE):
}
class CSNNEIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?csnne\.com/video/(?P<id>[0-9a-z-]+)'
_TEST = {
'url': 'http://www.csnne.com/video/snc-evening-update-wright-named-red-sox-no-5-starter',
'info_dict': {
'id': 'yvBLLUgQ8WU0',
'ext': 'mp4',
'title': 'SNC evening update: Wright named Red Sox\' No. 5 starter.',
'description': 'md5:1753cfee40d9352b19b4c9b3e589b9e3',
'timestamp': 1459369979,
'upload_date': '20160330',
'uploader': 'NBCU-SPORTS',
}
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
return {
'_type': 'url_transparent',
'ie_key': 'ThePlatform',
'url': self._html_search_meta('twitter:player:stream', webpage),
'display_id': display_id,
}
class NBCNewsIE(ThePlatformIE):
_VALID_URL = r'(?x)https?://(?:www\.)?(?:nbcnews|today|msnbc)\.com/([^/]+/)*(?:.*-)?(?P<id>[^/?]+)'

View File

@@ -90,7 +90,7 @@ class NhkVodIE(NhkBaseIE):
_TESTS = [{
# video clip
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/9999011/',
'md5': '256a1be14f48d960a7e61e2532d95ec3',
'md5': '7a90abcfe610ec22a6bfe15bd46b30ca',
'info_dict': {
'id': 'a95j5iza',
'ext': 'mp4',

View File

@@ -6,16 +6,33 @@ import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
dict_get,
ExtractorError,
int_or_none,
unescapeHTML,
parse_iso8601,
try_get,
unescapeHTML,
)
class PikselIE(InfoExtractor):
_VALID_URL = r'https?://player\.piksel\.com/v/(?:refid/[^/]+/prefid/)?(?P<id>[a-z0-9_]+)'
_VALID_URL = r'''(?x)https?://
(?:
(?:
player\.
(?:
olympusattelecom|
vibebyvista
)|
(?:api|player)\.multicastmedia|
(?:api-ovp|player)\.piksel
)\.com|
(?:
mz-edge\.stream\.co|
movie-s\.nhk\.or
)\.jp|
vidego\.baltimorecity\.gov
)/v/(?:refid/(?P<refid>[^/]+)/prefid/)?(?P<id>[\w-]+)'''
_TESTS = [
{
'url': 'http://player.piksel.com/v/ums2867l',
@@ -56,46 +73,41 @@ class PikselIE(InfoExtractor):
if mobj:
return mobj.group('url')
def _call_api(self, app_token, resource, display_id, query, fatal=True):
response = (self._download_json(
'http://player.piksel.com/ws/ws_%s/api/%s/mode/json/apiv/5' % (resource, app_token),
display_id, query=query, fatal=fatal) or {}).get('response')
failure = try_get(response, lambda x: x['failure']['reason'])
if failure:
if fatal:
raise ExtractorError(failure, expected=True)
self.report_warning(failure)
return response
def _real_extract(self, url):
display_id = self._match_id(url)
ref_id, display_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r'data-de-program-uuid=[\'"]([a-z0-9]+)',
webpage, 'program uuid', default=display_id)
app_token = self._search_regex([
r'clientAPI\s*:\s*"([^"]+)"',
r'data-de-api-key\s*=\s*"([^"]+)"'
], webpage, 'app token')
response = self._download_json(
'http://player.piksel.com/ws/ws_program/api/%s/mode/json/apiv/5' % app_token,
video_id, query={
'v': video_id
})['response']
failure = response.get('failure')
if failure:
raise ExtractorError(response['failure']['reason'], expected=True)
video_data = response['WsProgramResponse']['program']['asset']
query = {'refid': ref_id, 'prefid': display_id} if ref_id else {'v': display_id}
program = self._call_api(
app_token, 'program', display_id, query)['WsProgramResponse']['program']
video_id = program['uuid']
video_data = program['asset']
title = video_data['title']
asset_type = dict_get(video_data, ['assetType', 'asset_type'])
formats = []
m3u8_url = dict_get(video_data, [
'm3u8iPadURL',
'ipadM3u8Url',
'm3u8AndroidURL',
'm3u8iPhoneURL',
'iphoneM3u8Url'])
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
asset_type = dict_get(video_data, ['assetType', 'asset_type'])
for asset_file in video_data.get('assetFiles', []):
def process_asset_file(asset_file):
if not asset_file:
return
# TODO: extract rtmp formats
http_url = asset_file.get('http_url')
if not http_url:
continue
return
tbr = None
vbr = int_or_none(asset_file.get('videoBitrate'), 1024)
abr = int_or_none(asset_file.get('audioBitrate'), 1024)
@@ -118,6 +130,43 @@ class PikselIE(InfoExtractor):
'filesize': int_or_none(asset_file.get('filesize')),
'tbr': tbr,
})
def process_asset_files(asset_files):
for asset_file in (asset_files or []):
process_asset_file(asset_file)
process_asset_files(video_data.get('assetFiles'))
process_asset_file(video_data.get('referenceFile'))
if not formats:
asset_id = video_data.get('assetid') or program.get('assetid')
if asset_id:
process_asset_files(try_get(self._call_api(
app_token, 'asset_file', display_id, {
'assetid': asset_id,
}, False), lambda x: x['WsAssetFileResponse']['AssetFiles']))
m3u8_url = dict_get(video_data, [
'm3u8iPadURL',
'ipadM3u8Url',
'm3u8AndroidURL',
'm3u8iPhoneURL',
'iphoneM3u8Url'])
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
smil_url = dict_get(video_data, ['httpSmil', 'hdSmil', 'rtmpSmil'])
if smil_url:
transform_source = None
if ref_id == 'nhkworld':
# TODO: figure out if this is something to be fixed in urljoin,
# _parse_smil_formats or keep it here
transform_source = lambda x: x.replace('src="/', 'src="').replace('/media"', '/media/"')
formats.extend(self._extract_smil_formats(
re.sub(r'/od/[^/]+/', '/od/http/', smil_url), video_id,
transform_source=transform_source, fatal=False))
self._sort_formats(formats)
subtitles = {}

View File

@@ -8,6 +8,7 @@ from ..utils import (
int_or_none,
float_or_none,
try_get,
unescapeHTML,
url_or_none,
)
@@ -56,7 +57,8 @@ class RedditRIE(InfoExtractor):
'id': 'zv89llsvexdz',
'ext': 'mp4',
'title': 'That small heart attack.',
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'thumbnails': 'count:4',
'timestamp': 1501941939,
'upload_date': '20170805',
'uploader': 'Antw87',
@@ -118,11 +120,34 @@ class RedditRIE(InfoExtractor):
else:
age_limit = None
thumbnails = []
def add_thumbnail(src):
if not isinstance(src, dict):
return
thumbnail_url = url_or_none(src.get('url'))
if not thumbnail_url:
return
thumbnails.append({
'url': unescapeHTML(thumbnail_url),
'width': int_or_none(src.get('width')),
'height': int_or_none(src.get('height')),
})
for image in try_get(data, lambda x: x['preview']['images']) or []:
if not isinstance(image, dict):
continue
add_thumbnail(image.get('source'))
resolutions = image.get('resolutions')
if isinstance(resolutions, list):
for resolution in resolutions:
add_thumbnail(resolution)
return {
'_type': 'url_transparent',
'url': video_url,
'title': data.get('title'),
'thumbnail': url_or_none(data.get('thumbnail')),
'thumbnails': thumbnails,
'timestamp': float_or_none(data.get('created_utc')),
'uploader': data.get('author'),
'duration': int_or_none(try_get(

View File

@@ -4,8 +4,12 @@ from __future__ import unicode_literals
import re
from .brightcove import BrightcoveNewIE
from ..compat import compat_str
from ..compat import (
compat_HTTPError,
compat_str,
)
from ..utils import (
ExtractorError,
try_get,
update_url_query,
)
@@ -41,16 +45,22 @@ class SevenPlusIE(BrightcoveNewIE):
def _real_extract(self, url):
path, episode_id = re.match(self._VALID_URL, url).groups()
media = self._download_json(
'https://videoservice.swm.digital/playback', episode_id, query={
'appId': '7plus',
'deviceType': 'web',
'platformType': 'web',
'accountId': 5303576322001,
'referenceId': 'ref:' + episode_id,
'deliveryId': 'csai',
'videoType': 'vod',
})['media']
try:
media = self._download_json(
'https://videoservice.swm.digital/playback', episode_id, query={
'appId': '7plus',
'deviceType': 'web',
'platformType': 'web',
'accountId': 5303576322001,
'referenceId': 'ref:' + episode_id,
'deliveryId': 'csai',
'videoType': 'vod',
})['media']
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
raise ExtractorError(self._parse_json(
e.cause.read().decode(), episode_id)[0]['error_code'], expected=True)
raise
for source in media.get('sources', {}):
src = source.get('src')

View File

@@ -140,7 +140,7 @@ class TeachableIE(TeachableBaseIE):
@staticmethod
def _is_teachable(webpage):
return 'teachableTracker.linker:autoLink' in webpage and re.search(
r'<link[^>]+href=["\']https?://process\.fs\.teachablecdn\.com',
r'<link[^>]+href=["\']https?://(?:process\.fs|assets)\.teachablecdn\.com',
webpage)
@staticmethod

View File

@@ -5,14 +5,11 @@ import json
import re
from .common import InfoExtractor
from .ooyala import OoyalaIE
from ..utils import (
clean_html,
determine_ext,
int_or_none,
str_or_none,
try_get,
urljoin,
)
@@ -28,7 +25,7 @@ class TelecincoIE(InfoExtractor):
'description': 'md5:716caf5601e25c3c5ab6605b1ae71529',
},
'playlist': [{
'md5': 'adb28c37238b675dad0f042292f209a7',
'md5': '7ee56d665cfd241c0e6d80fd175068b0',
'info_dict': {
'id': 'JEA5ijCnF6p5W08A1rNKn7',
'ext': 'mp4',
@@ -38,7 +35,7 @@ class TelecincoIE(InfoExtractor):
}]
}, {
'url': 'http://www.cuatro.com/deportes/futbol/barcelona/Leo_Messi-Champions-Roma_2_2052780128.html',
'md5': '9468140ebc300fbb8b9d65dc6e5c4b43',
'md5': 'c86fe0d99e3bdb46b7950d38bf6ef12a',
'info_dict': {
'id': 'jn24Od1zGLG4XUZcnUnZB6',
'ext': 'mp4',
@@ -48,7 +45,7 @@ class TelecincoIE(InfoExtractor):
},
}, {
'url': 'http://www.mediaset.es/12meses/campanas/doylacara/conlatratanohaytrato/Ayudame-dar-cara-trata-trato_2_1986630220.html',
'md5': 'ae2dc6b7b50b2392076a51c0f70e01f6',
'md5': 'eddb50291df704ce23c74821b995bcac',
'info_dict': {
'id': 'aywerkD2Sv1vGNqq9b85Q2',
'ext': 'mp4',
@@ -90,58 +87,24 @@ class TelecincoIE(InfoExtractor):
def _parse_content(self, content, url):
video_id = content['dataMediaId']
if content.get('dataCmsId') == 'ooyala':
return self.url_result(
'ooyala:%s' % video_id, OoyalaIE.ie_key(), video_id)
config_url = urljoin(url, content['dataConfig'])
config = self._download_json(
config_url, video_id, 'Downloading config JSON')
content['dataConfig'], video_id, 'Downloading config JSON')
title = config['info']['title']
def mmc_url(mmc_type):
return re.sub(
r'/(?:flash|html5)\.json', '/%s.json' % mmc_type,
config['services']['mmc'])
duration = None
formats = []
for mmc_type in ('flash', 'html5'):
mmc = self._download_json(
mmc_url(mmc_type), video_id,
'Downloading %s mmc JSON' % mmc_type, fatal=False)
if not mmc:
continue
if not duration:
duration = int_or_none(mmc.get('duration'))
for location in mmc['locations']:
gat = self._proto_relative_url(location.get('gat'), 'http:')
gcp = location.get('gcp')
ogn = location.get('ogn')
if None in (gat, gcp, ogn):
continue
token_data = {
'gcp': gcp,
'ogn': ogn,
'sta': 0,
}
media = self._download_json(
gat, video_id, data=json.dumps(token_data).encode('utf-8'),
headers={
'Content-Type': 'application/json;charset=utf-8',
'Referer': url,
}, fatal=False) or {}
stream = media.get('stream') or media.get('file')
if not stream:
continue
ext = determine_ext(stream)
if ext == 'f4m':
formats.extend(self._extract_f4m_formats(
stream + '&hdcore=3.2.0&plugin=aasp-3.2.0.77.18',
video_id, f4m_id='hds', fatal=False))
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
stream, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
services = config['services']
caronte = self._download_json(services['caronte'], video_id)
stream = caronte['dls'][0]['stream']
headers = self.geo_verification_headers()
headers.update({
'Content-Type': 'application/json;charset=UTF-8',
'Origin': re.match(r'https?://[^/]+', url).group(0),
})
cdn = self._download_json(
caronte['cerbero'], video_id, data=json.dumps({
'bbx': caronte['bbx'],
'gbx': self._download_json(services['gbx'], video_id)['gbx'],
}).encode(), headers=headers)['tokens']['1']['cdn']
formats = self._extract_m3u8_formats(
stream + '?' + cdn, video_id, 'mp4', 'm3u8_native', m3u8_id='hls')
self._sort_formats(formats)
return {
@@ -149,7 +112,7 @@ class TelecincoIE(InfoExtractor):
'title': title,
'formats': formats,
'thumbnail': content.get('dataPoster') or config.get('poster', {}).get('imageUrl'),
'duration': duration,
'duration': int_or_none(content.get('dataDuration')),
}
def _real_extract(self, url):

View File

@@ -3,9 +3,10 @@ from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
HEADRequest,
parse_age_limit,
parse_iso8601,
smuggle_url,
# smuggle_url,
)
@@ -24,14 +25,16 @@ class TenPlayIE(InfoExtractor):
'uploader_id': '2199827728001',
},
'params': {
'format': 'bestvideo',
# 'format': 'bestvideo',
'skip_download': True,
}
}, {
'url': 'https://10play.com.au/how-to-stay-married/web-extras/season-1/terrys-talks-ep-1-embracing-change/tpv190915ylupc',
'only_matching': True,
}]
BRIGHTCOVE_URL_TEMPLATE = 'https://players.brightcove.net/2199827728001/cN6vRtRQt_default/index.html?videoId=%s'
# BRIGHTCOVE_URL_TEMPLATE = 'https://players.brightcove.net/2199827728001/cN6vRtRQt_default/index.html?videoId=%s'
_GEO_BYPASS = False
_FASTLY_URL_TEMPL = 'https://10-selector.global.ssl.fastly.net/s/kYEXFC/media/%s?mbr=true&manifest=m3u&format=redirect'
def _real_extract(self, url):
content_id = self._match_id(url)
@@ -40,19 +43,28 @@ class TenPlayIE(InfoExtractor):
video = data.get('video') or {}
metadata = data.get('metaData') or {}
brightcove_id = video.get('videoId') or metadata['showContentVideoId']
brightcove_url = smuggle_url(
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
{'geo_countries': ['AU']})
# brightcove_url = smuggle_url(
# self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
# {'geo_countries': ['AU']})
m3u8_url = self._request_webpage(HEADRequest(
self._FASTLY_URL_TEMPL % brightcove_id), brightcove_id).geturl()
if '10play-not-in-oz' in m3u8_url:
self.raise_geo_restricted(countries=['AU'])
formats = self._extract_m3u8_formats(m3u8_url, brightcove_id, 'mp4')
self._sort_formats(formats)
return {
'_type': 'url_transparent',
'url': brightcove_url,
'id': content_id,
'title': video.get('title') or metadata.get('pageContentName') or metadata.get('showContentName'),
# '_type': 'url_transparent',
# 'url': brightcove_url,
'formats': formats,
'id': brightcove_id,
'title': video.get('title') or metadata.get('pageContentName') or metadata['showContentName'],
'description': video.get('description'),
'age_limit': parse_age_limit(video.get('showRatingClassification') or metadata.get('showProgramClassification')),
'series': metadata.get('showName'),
'season': metadata.get('showContentSeason'),
'timestamp': parse_iso8601(metadata.get('contentPublishDate') or metadata.get('pageContentPublishDate')),
'ie_key': 'BrightcoveNew',
'thumbnail': video.get('poster'),
'uploader_id': '2199827728001',
# 'ie_key': 'BrightcoveNew',
}

View File

@@ -200,7 +200,7 @@ class ToggleIE(InfoExtractor):
class MeWatchIE(InfoExtractor):
IE_NAME = 'mewatch'
_VALID_URL = r'https?://(?:www\.)?mewatch\.sg/watch/[^/?#&]+-(?P<id>[0-9]+)'
_VALID_URL = r'https?://(?:(?:www|live)\.)?mewatch\.sg/watch/[^/?#&]+-(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://www.mewatch.sg/watch/Recipe-Of-Life-E1-179371',
'info_dict': {
@@ -220,6 +220,9 @@ class MeWatchIE(InfoExtractor):
}, {
'url': 'https://www.mewatch.sg/watch/Little-Red-Dot-Detectives-S2-%E6%90%9C%E5%AF%86%E3%80%82%E6%89%93%E5%8D%A1%E3%80%82%E5%B0%8F%E7%BA%A2%E7%82%B9-S2-E1-176232',
'only_matching': True,
}, {
'url': 'https://live.mewatch.sg/watch/Recipe-Of-Life-E41-189759',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -5,10 +5,9 @@ from .common import InfoExtractor
class UKTVPlayIE(InfoExtractor):
_VALID_URL = r'https?://uktvplay\.uktv\.co\.uk/.+?\?.*?\bvideo=(?P<id>\d+)'
_TEST = {
_VALID_URL = r'https?://uktvplay\.uktv\.co\.uk/(?:.+?\?.*?\bvideo=|([^/]+/)*watch-online/)(?P<id>\d+)'
_TESTS = [{
'url': 'https://uktvplay.uktv.co.uk/shows/world-at-war/c/200/watch-online/?video=2117008346001',
'md5': '',
'info_dict': {
'id': '2117008346001',
'ext': 'mp4',
@@ -23,7 +22,11 @@ class UKTVPlayIE(InfoExtractor):
'skip_download': True,
},
'expected_warnings': ['Failed to download MPD manifest']
}
}, {
'url': 'https://uktvplay.uktv.co.uk/shows/africa/watch-online/5983349675001',
'only_matching': True,
}]
# BRIGHTCOVE_URL_TEMPLATE = 'https://players.brightcove.net/1242911124001/OrCyvJ2gyL_default/index.html?videoId=%s'
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1242911124001/H1xnMOqP_default/index.html?videoId=%s'
def _real_extract(self, url):

View File

@@ -1119,6 +1119,12 @@ class VHXEmbedIE(VimeoBaseInfoExtractor):
IE_NAME = 'vhx:embed'
_VALID_URL = r'https?://embed\.vhx\.tv/videos/(?P<id>\d+)'
@staticmethod
def _extract_url(webpage):
mobj = re.search(
r'<iframe[^>]+src="(https?://embed\.vhx\.tv/videos/\d+[^"]*)"', webpage)
return unescapeHTML(mobj.group(1)) if mobj else None
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
@@ -1127,5 +1133,6 @@ class VHXEmbedIE(VimeoBaseInfoExtractor):
'ott data'), video_id, js_to_json)['config_url']
config = self._download_json(config_url, video_id)
info = self._parse_config(config, video_id)
info['id'] = video_id
self._vimeo_sort_formats(info['formats'])
return info

View File

@@ -12,7 +12,8 @@ from ..utils import (
class VVVVIDIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?vvvvid\.it/(?:#!)?(?:show|anime|film|series)/(?P<show_id>\d+)/[^/]+/(?P<season_id>\d+)/(?P<id>[0-9]+)'
_VALID_URL_BASE = r'https?://(?:www\.)?vvvvid\.it/(?:#!)?(?:show|anime|film|series)/'
_VALID_URL = r'%s(?P<show_id>\d+)/[^/]+/(?P<season_id>\d+)/(?P<id>[0-9]+)' % _VALID_URL_BASE
_TESTS = [{
# video_type == 'video/vvvvid'
'url': 'https://www.vvvvid.it/#!show/434/perche-dovrei-guardarlo-di-dario-moccia/437/489048/ping-pong',
@@ -21,6 +22,16 @@ class VVVVIDIE(InfoExtractor):
'id': '489048',
'ext': 'mp4',
'title': 'Ping Pong',
'duration': 239,
'series': '"Perché dovrei guardarlo?" di Dario Moccia',
'season_id': '437',
'season_number': 1,
'episode': 'Ping Pong',
'episode_number': 1,
'episode_id': '3334',
'view_count': int,
'like_count': int,
'repost_count': int,
},
'params': {
'skip_download': True,
@@ -37,6 +48,9 @@ class VVVVIDIE(InfoExtractor):
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.vvvvid.it/show/434/perche-dovrei-guardarlo-di-dario-moccia/437/489048',
'only_matching': True
}]
_conn_id = None
@@ -45,20 +59,36 @@ class VVVVIDIE(InfoExtractor):
'https://www.vvvvid.it/user/login',
None, headers=self.geo_verification_headers())['data']['conn_id']
def _real_extract(self, url):
show_id, season_id, video_id = re.match(self._VALID_URL, url).groups()
def _download_info(self, show_id, path, video_id, fatal=True):
response = self._download_json(
'https://www.vvvvid.it/vvvvid/ondemand/%s/season/%s' % (show_id, season_id),
'https://www.vvvvid.it/vvvvid/ondemand/%s/%s' % (show_id, path),
video_id, headers=self.geo_verification_headers(), query={
'conn_id': self._conn_id,
})
if response['result'] == 'error':
}, fatal=fatal)
if not (response or fatal):
return
if response.get('result') == 'error':
raise ExtractorError('%s said: %s' % (
self.IE_NAME, response['message']), expected=True)
return response['data']
def _extract_common_video_info(self, video_data):
return {
'thumbnail': video_data.get('thumbnail'),
'episode_number': int_or_none(video_data.get('number')),
'episode_id': str_or_none(video_data.get('id')),
}
def _real_extract(self, url):
show_id, season_id, video_id = re.match(self._VALID_URL, url).groups()
response = self._download_info(
show_id, 'season/%s' % season_id, video_id)
vid = int(video_id)
video_data = list(filter(
lambda episode: episode.get('video_id') == vid, response['data']))[0]
lambda episode: episode.get('video_id') == vid, response))[0]
title = video_data['title']
formats = []
# vvvvid embed_info decryption algorithm is reverse engineered from function $ds(h) at vvvvid.js
@@ -141,18 +171,67 @@ class VVVVIDIE(InfoExtractor):
'http://sb.top-ix.org/videomg/_definst_/mp4:%s/playlist.m3u8' % embed_code, video_id))
self._sort_formats(formats)
return {
info = self._extract_common_video_info(video_data)
info.update({
'id': video_id,
'title': video_data['title'],
'title': title,
'formats': formats,
'thumbnail': video_data.get('thumbnail'),
'duration': int_or_none(video_data.get('length')),
'series': video_data.get('show_title'),
'season_id': season_id,
'season_number': video_data.get('season_number'),
'episode_id': str_or_none(video_data.get('id')),
'episode_number': int_or_none(video_data.get('number')),
'episode_title': video_data['title'],
'episode': title,
'view_count': int_or_none(video_data.get('views')),
'like_count': int_or_none(video_data.get('video_likes')),
}
'repost_count': int_or_none(video_data.get('video_shares')),
})
return info
class VVVVIDShowIE(VVVVIDIE):
_VALID_URL = r'(?P<base_url>%s(?P<id>\d+)(?:/(?P<show_title>[^/?&#]+))?)/?(?:[?#&]|$)' % VVVVIDIE._VALID_URL_BASE
_TESTS = [{
'url': 'https://www.vvvvid.it/show/156/psyco-pass',
'info_dict': {
'id': '156',
'title': 'Psycho-Pass',
'description': 'md5:94d572c0bd85894b193b8aebc9a3a806',
},
'playlist_count': 46,
}, {
'url': 'https://www.vvvvid.it/show/156',
'only_matching': True,
}]
def _real_extract(self, url):
base_url, show_id, show_title = re.match(self._VALID_URL, url).groups()
seasons = self._download_info(
show_id, 'seasons/', show_title)
show_info = self._download_info(
show_id, 'info/', show_title, fatal=False)
entries = []
for season in (seasons or []):
season_number = int_or_none(season.get('number'))
episodes = season.get('episodes') or []
for episode in episodes:
season_id = str_or_none(episode.get('season_id'))
video_id = str_or_none(episode.get('video_id'))
if not (season_id and video_id):
continue
info = self._extract_common_video_info(episode)
info.update({
'_type': 'url',
'ie_key': VVVVIDIE.ie_key(),
'url': '/'.join([base_url, season_id, video_id]),
'title': episode.get('title'),
'description': episode.get('description'),
'season_number': season_number,
'season_id': season_id,
})
entries.append(info)
return self.playlist_result(
entries, show_id, show_info.get('title'), show_info.get('description'))

View File

@@ -1,23 +1,43 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
determine_ext,
float_or_none,
int_or_none,
mimetype2ext,
try_get,
urlencode_postdata,
urljoin,
)
class YandexDiskIE(InfoExtractor):
_VALID_URL = r'https?://yadi\.sk/[di]/(?P<id>[^/?#&]+)'
_VALID_URL = r'''(?x)https?://
(?P<domain>
yadi\.sk|
disk\.yandex\.
(?:
az|
by|
co(?:m(?:\.(?:am|ge|tr))?|\.il)|
ee|
fr|
k[gz]|
l[tv]|
md|
t[jm]|
u[az]|
ru
)
)/(?:[di]/|public.*?\bhash=)(?P<id>[^/?#&]+)'''
_TESTS = [{
'url': 'https://yadi.sk/i/VdOeDou8eZs6Y',
'md5': '33955d7ae052f15853dc41f35f17581c',
'md5': 'a4a8d52958c8fddcf9845935070402ae',
'info_dict': {
'id': 'VdOeDou8eZs6Y',
'ext': 'mp4',
@@ -27,92 +47,101 @@ class YandexDiskIE(InfoExtractor):
'uploader_id': '300043621',
'view_count': int,
},
'expected_warnings': ['Unable to download JSON metadata'],
}, {
'url': 'https://yadi.sk/d/h3WAXvDS3Li3Ce',
'only_matching': True,
}, {
'url': 'https://yadi.sk/public?hash=5DZ296JK9GWCLp02f6jrObjnctjRxMs8L6%2B%2FuhNqk38%3D',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
status = self._download_webpage(
'https://disk.yandex.com/auth/status', video_id, query={
'urlOrigin': url,
'source': 'public',
'md5': 'false',
})
sk = self._search_regex(
r'(["\'])sk(?:External)?\1\s*:\s*(["\'])(?P<value>(?:(?!\2).)+)\2',
status, 'sk', group='value')
domain, video_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, video_id)
store = self._parse_json(self._search_regex(
r'<script[^>]+id="store-prefetch"[^>]*>\s*({.+?})\s*</script>',
webpage, 'store'), video_id)
resource = store['resources'][store['rootResourceId']]
models = self._parse_json(
self._search_regex(
r'<script[^>]+id=["\']models-client[^>]+>\s*(\[.+?\])\s*</script',
webpage, 'video JSON'),
video_id)
title = resource['name']
meta = resource.get('meta') or {}
data = next(
model['data'] for model in models
if model.get('model') == 'resource')
public_url = meta.get('short_url')
if public_url:
video_id = self._match_id(public_url)
video_hash = data['id']
title = data['name']
source_url = (self._download_json(
'https://cloud-api.yandex.net/v1/disk/public/resources/download',
video_id, query={'public_key': url}, fatal=False) or {}).get('href')
video_streams = resource.get('videoStreams') or {}
video_hash = resource.get('hash') or url
environment = store.get('environment') or {}
sk = environment.get('sk')
yandexuid = environment.get('yandexuid')
if sk and yandexuid and not (source_url and video_streams):
self._set_cookie(domain, 'yandexuid', yandexuid)
models = self._download_json(
'https://disk.yandex.com/models/', video_id,
data=urlencode_postdata({
'_model.0': 'videoInfo',
'id.0': video_hash,
'_model.1': 'do-get-resource-url',
'id.1': video_hash,
'version': '13.6',
'sk': sk,
}), query={'_m': 'videoInfo'})['models']
videos = try_get(models, lambda x: x[0]['data']['videos'], list) or []
source_url = try_get(
models, lambda x: x[1]['data']['file'], compat_str)
def call_api(action):
return (self._download_json(
urljoin(url, '/public/api/') + action, video_id, data=json.dumps({
'hash': video_hash,
'sk': sk,
}).encode(), headers={
'Content-Type': 'text/plain',
}, fatal=False) or {}).get('data') or {}
if not source_url:
# TODO: figure out how to detect if download limit has
# been reached and then avoid unnecessary source format
# extraction requests
source_url = call_api('download-url').get('url')
if not video_streams:
video_streams = call_api('get-video-streams')
formats = []
if source_url:
formats.append({
'url': source_url,
'format_id': 'source',
'ext': determine_ext(title, 'mp4'),
'ext': determine_ext(title, meta.get('ext') or mimetype2ext(meta.get('mime_type')) or 'mp4'),
'quality': 1,
'filesize': int_or_none(meta.get('size'))
})
for video in videos:
for video in (video_streams.get('videos') or []):
format_url = video.get('url')
if not format_url:
continue
if determine_ext(format_url) == 'm3u8':
if video.get('dimension') == 'adaptive':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', entry_protocol='m3u8_native',
format_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
else:
size = video.get('size') or {}
height = int_or_none(size.get('height'))
format_id = 'hls'
if height:
format_id += '-%dp' % height
formats.append({
'ext': 'mp4',
'format_id': format_id,
'height': height,
'protocol': 'm3u8_native',
'url': format_url,
'width': int_or_none(size.get('width')),
})
self._sort_formats(formats)
duration = float_or_none(try_get(
models, lambda x: x[0]['data']['duration']), 1000)
uploader = try_get(
data, lambda x: x['user']['display_name'], compat_str)
uploader_id = try_get(
data, lambda x: x['user']['uid'], compat_str)
view_count = int_or_none(try_get(
data, lambda x: x['meta']['views_counter']))
uid = resource.get('uid')
display_name = try_get(store, lambda x: x['users'][uid]['displayName'])
return {
'id': video_id,
'title': title,
'duration': duration,
'uploader': uploader,
'uploader_id': uploader_id,
'view_count': view_count,
'duration': float_or_none(video_streams.get('duration'), 1000),
'uploader': display_name,
'uploader_id': uid,
'view_count': int_or_none(meta.get('views_counter')),
'formats': formats,
}

View File

@@ -5,6 +5,7 @@ from .common import InfoExtractor
from ..utils import (
determine_ext,
int_or_none,
try_get,
url_or_none,
)
@@ -13,26 +14,30 @@ class YandexVideoIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
yandex\.ru(?:/portal/(?:video|efir))?/?\?.*?stream_id=|
yandex\.ru(?:/(?:portal/(?:video|efir)|efir))?/?\?.*?stream_id=|
frontend\.vh\.yandex\.ru/player/
)
(?P<id>[\da-f]+)
(?P<id>(?:[\da-f]{32}|[\w-]{12}))
'''
_TESTS = [{
'url': 'https://yandex.ru/portal/video?stream_id=4dbb262b4fe5cf15a215de4f34eee34d',
'md5': '33955d7ae052f15853dc41f35f17581c',
'url': 'https://yandex.ru/portal/video?stream_id=4dbb36ec4e0526d58f9f2dc8f0ecf374',
'md5': 'e02a05bfaf0d9615ef07ae3a10f4faf4',
'info_dict': {
'id': '4dbb262b4fe5cf15a215de4f34eee34d',
'id': '4dbb36ec4e0526d58f9f2dc8f0ecf374',
'ext': 'mp4',
'title': 'В Нью-Йорке баржи и теплоход оторвались от причала и расплылись по Гудзону',
'description': '',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 0,
'duration': 30,
'title': 'Русский Вудсток - главный рок-фест в истории СССР / вДудь',
'description': 'md5:7d6b8d4bc4a3b9a56499916c1ea5b5fa',
'thumbnail': r're:^https?://',
'timestamp': 1549972939,
'duration': 5575,
'age_limit': 18,
'upload_date': '20190212',
'view_count': int,
'like_count': int,
'dislike_count': int,
},
}, {
'url': 'https://yandex.ru/portal/efir?stream_id=4dbb36ec4e0526d58f9f2dc8f0ecf374&from=morda',
'url': 'https://yandex.ru/portal/efir?stream_id=4dbb262b4fe5cf15a215de4f34eee34d&from=morda',
'only_matching': True,
}, {
'url': 'https://yandex.ru/?stream_id=4dbb262b4fe5cf15a215de4f34eee34d',
@@ -52,53 +57,88 @@ class YandexVideoIE(InfoExtractor):
# DASH with DRM
'url': 'https://yandex.ru/portal/video?from=morda&stream_id=485a92d94518d73a9d0ff778e13505f8',
'only_matching': True,
}, {
'url': 'https://yandex.ru/efir?stream_active=watching&stream_id=v7a2dZ-v5mSI&from_block=efir_newtab',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
content = self._download_json(
'https://frontend.vh.yandex.ru/v22/player/%s.json' % video_id,
video_id, query={
'stream_options': 'hires',
'disable_trackings': 1,
})['content']
player = try_get((self._download_json(
'https://frontend.vh.yandex.ru/graphql', video_id, data=b'''{
player(content_id: "%s") {
computed_title
content_url
description
dislikes
duration
likes
program_title
release_date
release_date_ut
release_year
restriction_age
season
start_time
streams
thumbnail
title
views_count
}
}''' % video_id.encode(), fatal=False)), lambda x: x['player']['content'])
if not player or player.get('error'):
player = self._download_json(
'https://frontend.vh.yandex.ru/v23/player/%s.json' % video_id,
video_id, query={
'stream_options': 'hires',
'disable_trackings': 1,
})
content = player['content']
content_url = url_or_none(content.get('content_url')) or url_or_none(
content['streams'][0]['url'])
title = content.get('title') or content.get('computed_title')
title = content.get('title') or content['computed_title']
ext = determine_ext(content_url)
if ext == 'm3u8':
formats = self._extract_m3u8_formats(
content_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls')
elif ext == 'mpd':
formats = self._extract_mpd_formats(
content_url, video_id, mpd_id='dash')
else:
formats = [{'url': content_url}]
formats = []
streams = content.get('streams') or []
streams.append({'url': content.get('content_url')})
for stream in streams:
content_url = url_or_none(stream.get('url'))
if not content_url:
continue
ext = determine_ext(content_url)
if ext == 'ismc':
continue
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
content_url, video_id, 'mp4',
'm3u8_native', m3u8_id='hls', fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
content_url, video_id, mpd_id='dash', fatal=False))
else:
formats.append({'url': content_url})
self._sort_formats(formats)
description = content.get('description')
thumbnail = content.get('thumbnail')
timestamp = (int_or_none(content.get('release_date'))
or int_or_none(content.get('release_date_ut'))
or int_or_none(content.get('start_time')))
duration = int_or_none(content.get('duration'))
series = content.get('program_title')
age_limit = int_or_none(content.get('restriction_age'))
season = content.get('season') or {}
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'description': content.get('description'),
'thumbnail': content.get('thumbnail'),
'timestamp': timestamp,
'duration': duration,
'series': series,
'age_limit': age_limit,
'duration': int_or_none(content.get('duration')),
'series': content.get('program_title'),
'age_limit': int_or_none(content.get('restriction_age')),
'view_count': int_or_none(content.get('views_count')),
'like_count': int_or_none(content.get('likes')),
'dislike_count': int_or_none(content.get('dislikes')),
'season_number': int_or_none(season.get('season_number')),
'season_id': season.get('id'),
'release_year': int_or_none(content.get('release_year')),
'formats': formats,
}

View File

@@ -16,6 +16,7 @@ from ..jsinterp import JSInterpreter
from ..swfinterp import SWFInterpreter
from ..compat import (
compat_chr,
compat_HTTPError,
compat_parse_qs,
compat_urllib_parse_unquote,
compat_urllib_parse_unquote_plus,
@@ -279,6 +280,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
_YT_INITIAL_DATA_RE = r'(?:window\s*\[\s*["\']ytInitialData["\']\s*\]|ytInitialData)\s*=\s*({.+?})\s*;'
_YT_INITIAL_PLAYER_RESPONSE_RE = r'ytInitialPlayerResponse\s*=\s*({.+?})\s*;'
_YT_INITIAL_BOUNDARY_RE = r'(?:var\s+meta|</script|\n)'
def _call_api(self, ep, query, video_id):
data = self._DEFAULT_API_DATA.copy()
@@ -296,7 +298,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
def _extract_yt_initial_data(self, video_id, webpage):
return self._parse_json(
self._search_regex(
(r'%s\s*\n' % self._YT_INITIAL_DATA_RE,
(r'%s\s*%s' % (self._YT_INITIAL_DATA_RE, self._YT_INITIAL_BOUNDARY_RE),
self._YT_INITIAL_DATA_RE), webpage, 'yt initial data'),
video_id)
@@ -321,7 +323,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# Invidious instances taken from https://github.com/omarroth/invidious/wiki/Invidious-Instances
(?:(?:www|dev)\.)?invidio\.us/|
(?:(?:www|no)\.)?invidiou\.sh/|
(?:(?:www|fi|de)\.)?invidious\.snopyta\.org/|
(?:(?:www|fi)\.)?invidious\.snopyta\.org/|
(?:www\.)?invidious\.kabi\.tk/|
(?:www\.)?invidious\.13ad\.de/|
(?:www\.)?invidious\.mastodon\.host/|
@@ -1102,6 +1104,15 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'skip_download': True,
},
},
{
# another example of '};' in ytInitialData
'url': 'https://www.youtube.com/watch?v=gVfgbahppCY',
'only_matching': True,
},
{
'url': 'https://www.youtube.com/watch_popup?v=63RmMXCd_bQ',
'only_matching': True,
},
]
def __init__(self, *args, **kwargs):
@@ -1701,7 +1712,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if not video_info and not player_response:
player_response = extract_player_response(
self._search_regex(
(r'%s\s*(?:var\s+meta|</script|\n)' % self._YT_INITIAL_PLAYER_RESPONSE_RE,
(r'%s\s*%s' % (self._YT_INITIAL_PLAYER_RESPONSE_RE, self._YT_INITIAL_BOUNDARY_RE),
self._YT_INITIAL_PLAYER_RESPONSE_RE), video_webpage,
'initial player response', default='{}'),
video_id)
@@ -2729,6 +2740,11 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if YoutubeIE.suitable(url) else super(
YoutubeTabIE, cls).suitable(url)
def _extract_channel_id(self, webpage):
channel_id = self._html_search_meta(
'channelId', webpage, 'channel id', default=None)
@@ -3009,10 +3025,24 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
for page_num in itertools.count(1):
if not continuation:
break
browse = self._download_json(
'https://www.youtube.com/browse_ajax', None,
'Downloading page %d' % page_num,
headers=headers, query=continuation, fatal=False)
count = 0
retries = 3
while count <= retries:
try:
# Downloading page may result in intermittent 5xx HTTP error
# that is usually worked around with a retry
browse = self._download_json(
'https://www.youtube.com/browse_ajax', None,
'Downloading page %d%s'
% (page_num, ' (retry #%d)' % count if count else ''),
headers=headers, query=continuation)
break
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (500, 503):
count += 1
if count <= retries:
continue
raise
if not browse:
break
response = try_get(browse, lambda x: x[1]['response'], dict)

View File

@@ -85,7 +85,13 @@ class ZypeIE(InfoExtractor):
else:
m3u8_url = self._search_regex(
r'(["\'])(?P<url>(?:(?!\1).)+\.m3u8(?:(?!\1).)*)\1',
body, 'm3u8 url', group='url')
body, 'm3u8 url', group='url', default=None)
if not m3u8_url:
source = self._parse_json(self._search_regex(
r'(?s)sources\s*:\s*\[\s*({.+?})\s*\]', body,
'source'), video_id, js_to_json)
if source.get('integration') == 'verizon-media':
m3u8_url = 'https://content.uplynk.com/%s.m3u8' % source['id']
formats = self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls')
text_tracks = self._search_regex(

View File

@@ -3640,7 +3640,7 @@ def url_or_none(url):
if not url or not isinstance(url, compat_str):
return None
url = url.strip()
return url if re.match(r'^(?:[a-zA-Z][\da-zA-Z.+-]*:)?//', url) else None
return url if re.match(r'^(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?):)?//', url) else None
def parse_duration(s):

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2020.12.26'
__version__ = '2020.12.31'