Compare commits

..

36 Commits

Author SHA1 Message Date
Sergey M․
e8dfecb384 release 2018.04.03 2018-04-03 00:26:11 +07:00
Sergey M․
10f9caec04 [ChangeLog] Actualize
[ci skip]
2018-04-03 00:23:03 +07:00
Sergey M․
ea6679fbeb [tvnow] Fix issues, simplify and improve (closes #15837) 2018-04-03 00:08:22 +07:00
AndroKev
3acae1e031 [tvnow] Add support for shows 2018-04-03 00:06:47 +07:00
Sergey M․
8bd1df3c31 [dramafever] Fix authentication (closes #16067) 2018-04-02 22:19:42 +07:00
Sergey M․
86693c4930 [afreecatv] Use partial view only when necessary (closes #14450) 2018-04-02 00:00:45 +07:00
Sergey M․
d563fb32ba [afreecatv] Remove debug output 2018-04-01 23:07:54 +07:00
Sergey M․
e51762be19 [afreecatv] Add support for authentication (#14450) 2018-04-01 22:47:39 +07:00
kenavera
03fcde10ce [nationalgeographic] Add support for new URL schema (closes #16001) 2018-04-01 21:22:51 +07:00
Sergey M․
95a1322bc1 [bilibili] Remove debug from player params regexes 2018-04-01 02:06:58 +07:00
Parmjit Virk
0669f8fd8f [xvideos] Fix thumbnail extraction (closes #15978) 2018-03-31 23:46:08 +07:00
kenavera
0b4bbcdcb6 [medialaan] Fix vod id 2018-03-31 22:14:49 +07:00
Luca Steeb
3e78d23b57 [openload] Add support for oload.site 2018-03-30 23:25:43 +07:00
Sergey M․
190f6c936b [naver] Fix extraction (closes #16029) 2018-03-29 23:49:09 +07:00
Sergey M․
02f6ccbce3 [dramafever] Partially switch to API v5 (closes #16026) 2018-03-29 23:06:13 +07:00
Arend v. Reinersdorff
5d60b99717 [options] Mention comments support in --batch-file 2018-03-27 22:25:29 +07:00
xofe
9e6a418015 [abc:iview] Unescape title and series meta fields 2018-03-27 22:08:40 +07:00
Attila-Mihaly Balazs
99c3091850 [videa] Extend _VALID_URL 2018-03-27 22:02:04 +07:00
Sergey M․
bbd9d8c170 release 2018.03.26.1 2018-03-26 22:32:03 +07:00
Sergey M․
c3cfc71a0c [ChangeLog] Actualize
[ci skip]
2018-03-26 22:30:11 +07:00
Sergey M․
671e241bfb release 2018.03.26 2018-03-26 05:03:47 +07:00
Sergey M․
29d9594561 [ChangeLog] Actualize
[ci skip]
2018-03-26 22:11:01 +07:00
Sergey M․
f0298f653e [downloader/external] Simplify finished progress hook reporting and add elapsed time (closes #10876) 2018-03-24 16:35:21 +07:00
Sergey M․
2ea212628e [downloader/common] Improve progress reporting when no total bytes available 2018-03-24 16:35:15 +07:00
John Hawkinson
80aa246094 [downloader/external] Fix download finalization when writing file to stdout (closes #10809)
An OSError or IOError generally indicates something a little more
wrong than a "simple" UnavailableVideoError, so print the actual
traceback that leads to the exception. Otherwise meaningful postmortem
debugging a bug report is essentially infeasible.
2018-03-24 16:34:55 +07:00
Sergey M․
0ff2c1ecb6 [downloader/fragment] Fix download finalization when writing file to stdout (closes #15799) 2018-03-24 16:04:23 +07:00
Joseph Spiros
16132cff72 [vrv] Fix extraction on python2 (closes #15928) 2018-03-24 14:57:34 +07:00
Sergey M․
86e1958944 [afreecatv] Update referrer (closes #15947) 2018-03-24 14:21:08 +07:00
Sergey M․
b015cb1af3 [24video] Add support for 24video.sexy (closes #15973) 2018-03-24 14:11:27 +07:00
Sergey M․
7d34016fb0 [crackle] Bypass geo restriction 2018-03-24 01:49:50 +07:00
Sergey M․
b9f5a41207 [crackle] Fix extraction (closes #15969) 2018-03-23 23:53:18 +07:00
Sergey M․
8b7340a45e [lenta] Add extractor (closes #15953) 2018-03-22 23:07:31 +07:00
Chih-Hsuan Yen
1d4a0520ba Merge pull request #15939 from sudovijay/patch-11
[Youku] Update ccode
2018-03-22 14:42:35 +08:00
Sergey M․
cba5d1b6b3 [instagram:user] Add pagination (closes #15934) 2018-03-21 23:43:03 +07:00
Vijay Singh
328ddf56a1 [Youku] Update ccode 2018-03-21 12:13:31 +05:30
Philipp Hagemeister
3395958d2b libsyn: adapt to new page structure and replace testcase 2018-03-20 23:07:11 +01:00
29 changed files with 753 additions and 356 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.03.20*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.03.20**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.04.03*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.04.03**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -36,7 +36,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2018.03.20
[debug] youtube-dl version 2018.04.03
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -1,3 +1,39 @@
version 2018.04.03
Extractors
+ [tvnow] Add support for shows (#15837)
* [dramafever] Fix authentication (#16067)
* [afreecatv] Use partial view only when necessary (#14450)
+ [afreecatv] Add support for authentication (#14450)
+ [nationalgeographic] Add support for new URL schema (#16001, #16054)
* [xvideos] Fix thumbnail extraction (#15978, #15979)
* [medialaan] Fix vod id (#16038)
+ [openload] Add support for oload.site (#16039)
* [naver] Fix extraction (#16029)
* [dramafever] Partially switch to API v5 (#16026)
* [abc:iview] Unescape title and series meta fields (#15994)
* [videa] Extend URL regular expression (#16003)
version 2018.03.26.1
Core
+ [downloader/external] Add elapsed time to progress hook (#10876)
* [downloader/external,fragment] Fix download finalization when writing file
to stdout (#10809, #10876, #15799)
Extractors
* [vrv] Fix extraction on python2 (#15928)
* [afreecatv] Update referrer (#15947)
+ [24video] Add support for 24video.sexy (#15973)
* [crackle] Bypass geo restriction
* [crackle] Fix extraction (#15969)
+ [lenta] Add support for lenta.ru (#15953)
+ [instagram:user] Add pagination (#15934)
* [youku] Update ccode (#15939)
* [libsyn] Adapt to new page structure
version 2018.03.20
Core

View File

@@ -223,7 +223,9 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
## Filesystem Options:
-a, --batch-file FILE File containing URLs to download ('-' for
stdin)
stdin), one URL per line. Lines starting
with '#', ';' or ']' are considered as
comments and ignored.
--id Use only video ID in file name
-o, --output TEMPLATE Output filename template, see the "OUTPUT
TEMPLATE" for all the info

View File

@@ -419,6 +419,7 @@
- **Lecture2Go**
- **LEGO**
- **Lemonde**
- **Lenta**
- **LePlaylist**
- **LetvCloud**: 乐视云
- **Libsyn**
@@ -886,6 +887,7 @@
- **TVNoe**
- **TVNow**
- **TVNowList**
- **TVNowShow**
- **tvp**: Telewizja Polska
- **tvp:embed**: Telewizja Polska
- **tvp:series**

View File

@@ -249,12 +249,13 @@ class FileDownloader(object):
if self.params.get('noprogress', False):
self.to_screen('[download] Download completed')
else:
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
msg_template = '100%%'
if s.get('total_bytes') is not None:
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
msg_template += ' of %(_total_bytes_str)s'
if s.get('elapsed') is not None:
s['_elapsed_str'] = self.format_seconds(s['elapsed'])
msg_template = '100%% of %(_total_bytes_str)s in %(_elapsed_str)s'
else:
msg_template = '100%% of %(_total_bytes_str)s'
msg_template += ' in %(_elapsed_str)s'
self._report_progress_status(
msg_template % s, is_last_line=True)

View File

@@ -1,9 +1,10 @@
from __future__ import unicode_literals
import os.path
import re
import subprocess
import sys
import re
import time
from .common import FileDownloader
from ..compat import (
@@ -30,6 +31,7 @@ class ExternalFD(FileDownloader):
tmpfilename = self.temp_name(filename)
try:
started = time.time()
retval = self._call_downloader(tmpfilename, info_dict)
except KeyboardInterrupt:
if not info_dict.get('is_live'):
@@ -41,15 +43,20 @@ class ExternalFD(FileDownloader):
self.to_screen('[%s] Interrupted by user' % self.get_basename())
if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize))
self.try_rename(tmpfilename, filename)
self._hook_progress({
'downloaded_bytes': fsize,
'total_bytes': fsize,
status = {
'filename': filename,
'status': 'finished',
})
'elapsed': time.time() - started,
}
if filename != '-':
fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize))
self.try_rename(tmpfilename, filename)
status.update({
'downloaded_bytes': fsize,
'total_bytes': fsize,
})
self._hook_progress(status)
return True
else:
self.to_stderr('\n')

View File

@@ -241,12 +241,16 @@ class FragmentFD(FileDownloader):
if os.path.isfile(ytdl_filename):
os.remove(ytdl_filename)
elapsed = time.time() - ctx['started']
self.try_rename(ctx['tmpfilename'], ctx['filename'])
fsize = os.path.getsize(encodeFilename(ctx['filename']))
if ctx['tmpfilename'] == '-':
downloaded_bytes = ctx['complete_frags_downloaded_bytes']
else:
self.try_rename(ctx['tmpfilename'], ctx['filename'])
downloaded_bytes = os.path.getsize(encodeFilename(ctx['filename']))
self._hook_progress({
'downloaded_bytes': fsize,
'total_bytes': fsize,
'downloaded_bytes': downloaded_bytes,
'total_bytes': downloaded_bytes,
'filename': ctx['filename'],
'status': 'finished',
'elapsed': elapsed,

View File

@@ -13,6 +13,7 @@ from ..utils import (
int_or_none,
parse_iso8601,
try_get,
unescapeHTML,
update_url_query,
)
@@ -109,16 +110,17 @@ class ABCIViewIE(InfoExtractor):
# ABC iview programs are normally available for 14 days only.
_TESTS = [{
'url': 'http://iview.abc.net.au/programs/call-the-midwife/ZW0898A003S00',
'url': 'https://iview.abc.net.au/programs/ben-and-hollys-little-kingdom/ZY9247A021S00',
'md5': 'cde42d728b3b7c2b32b1b94b4a548afc',
'info_dict': {
'id': 'ZW0898A003S00',
'id': 'ZY9247A021S00',
'ext': 'mp4',
'title': 'Series 5 Ep 3',
'description': 'md5:e0ef7d4f92055b86c4f33611f180ed79',
'upload_date': '20171228',
'uploader_id': 'abc1',
'timestamp': 1514499187,
'title': "Gaston's Visit",
'series': "Ben And Holly's Little Kingdom",
'description': 'md5:18db170ad71cf161e006a4c688e33155',
'upload_date': '20180318',
'uploader_id': 'abc4kids',
'timestamp': 1521400959,
},
'params': {
'skip_download': True,
@@ -169,12 +171,12 @@ class ABCIViewIE(InfoExtractor):
return {
'id': video_id,
'title': title,
'title': unescapeHTML(title),
'description': self._html_search_meta(['og:description', 'twitter:description'], webpage),
'thumbnail': self._html_search_meta(['og:image', 'twitter:image:src'], webpage),
'duration': int_or_none(video_params.get('eventDuration')),
'timestamp': parse_iso8601(video_params.get('pubDate'), ' '),
'series': video_params.get('seriesTitle'),
'series': unescapeHTML(video_params.get('seriesTitle')),
'series_id': video_params.get('seriesHouseNumber') or video_id[:7],
'episode_number': int_or_none(self._html_search_meta('episodeNumber', webpage, default=None)),
'episode': self._html_search_meta('episode_title', webpage, default=None),

View File

@@ -9,6 +9,7 @@ from ..utils import (
determine_ext,
ExtractorError,
int_or_none,
urlencode_postdata,
xpath_text,
)
@@ -28,6 +29,7 @@ class AfreecaTVIE(InfoExtractor):
)
(?P<id>\d+)
'''
_NETRC_MACHINE = 'afreecatv'
_TESTS = [{
'url': 'http://live.afreecatv.com:8079/app/index.cgi?szType=read_ucc_bbs&szBjId=dailyapril&nStationNo=16711924&nBbsNo=18605867&nTitleNo=36164052&szSkin=',
'md5': 'f72c89fe7ecc14c1b5ce506c4996046e',
@@ -139,22 +141,22 @@ class AfreecaTVIE(InfoExtractor):
'skip_download': True,
},
}, {
# adult video
'url': 'http://vod.afreecatv.com/PLAYER/STATION/26542731',
# PARTIAL_ADULT
'url': 'http://vod.afreecatv.com/PLAYER/STATION/32028439',
'info_dict': {
'id': '20171001_F1AE1711_196617479_1',
'id': '20180327_27901457_202289533_1',
'ext': 'mp4',
'title': '[생]서아 초심 찾기 방송 (part 1)',
'title': '[생]빨개요♥ (part 1)',
'thumbnail': 're:^https?://(?:video|st)img.afreecatv.com/.*$',
'uploader': 'BJ서아',
'uploader': '[SA]서아',
'uploader_id': 'bjdyrksu',
'upload_date': '20171001',
'duration': 3600,
'age_limit': 18,
'upload_date': '20180327',
'duration': 3601,
},
'params': {
'skip_download': True,
},
'expected_warnings': ['adult content'],
}, {
'url': 'http://www.afreecatv.com/player/Player.swf?szType=szBjId=djleegoon&nStationNo=11273158&nBbsNo=13161095&nTitleNo=36327652',
'only_matching': True,
@@ -172,6 +174,51 @@ class AfreecaTVIE(InfoExtractor):
video_key['part'] = int(m.group('part'))
return video_key
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
if username is None:
return
login_form = {
'szWork': 'login',
'szType': 'json',
'szUid': username,
'szPassword': password,
'isSaveId': 'false',
'szScriptVar': 'oLoginRet',
'szAction': '',
}
response = self._download_json(
'https://login.afreecatv.com/app/LoginAction.php', None,
'Logging in', data=urlencode_postdata(login_form))
_ERRORS = {
-4: 'Your account has been suspended due to a violation of our terms and policies.',
-5: 'https://member.afreecatv.com/app/user_delete_progress.php',
-6: 'https://login.afreecatv.com/membership/changeMember.php',
-8: "Hello! AfreecaTV here.\nThe username you have entered belongs to \n an account that requires a legal guardian's consent. \nIf you wish to use our services without restriction, \nplease make sure to go through the necessary verification process.",
-9: 'https://member.afreecatv.com/app/pop_login_block.php',
-11: 'https://login.afreecatv.com/afreeca/second_login.php',
-12: 'https://member.afreecatv.com/app/user_security.php',
0: 'The username does not exist or you have entered the wrong password.',
-1: 'The username does not exist or you have entered the wrong password.',
-3: 'You have entered your username/password incorrectly.',
-7: 'You cannot use your Global AfreecaTV account to access Korean AfreecaTV.',
-10: 'Sorry for the inconvenience. \nYour account has been blocked due to an unauthorized access. \nPlease contact our Help Center for assistance.',
-32008: 'You have failed to log in. Please contact our Help Center.',
}
result = int_or_none(response.get('RESULT'))
if result != 1:
error = _ERRORS.get(result, 'You have failed to log in.')
raise ExtractorError(
'Unable to login: %s said: %s' % (self.IE_NAME, error),
expected=True)
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -188,21 +235,41 @@ class AfreecaTVIE(InfoExtractor):
video_id = self._search_regex(
r'nTitleNo\s*=\s*(\d+)', webpage, 'title', default=video_id)
video_xml = self._download_xml(
'http://afbbs.afreecatv.com:8080/api/video/get_video_info.php',
video_id, headers={
'Referer': 'http://vod.afreecatv.com/embed.php',
}, query={
partial_view = False
for _ in range(2):
query = {
'nTitleNo': video_id,
'nStationNo': station_id,
'nBbsNo': bbs_id,
'partialView': 'SKIP_ADULT',
})
}
if partial_view:
query['partialView'] = 'SKIP_ADULT'
video_xml = self._download_xml(
'http://afbbs.afreecatv.com:8080/api/video/get_video_info.php',
video_id, 'Downloading video info XML%s'
% (' (skipping adult)' if partial_view else ''),
video_id, headers={
'Referer': url,
}, query=query)
flag = xpath_text(video_xml, './track/flag', 'flag', default=None)
if flag and flag != 'SUCCEED':
flag = xpath_text(video_xml, './track/flag', 'flag', default=None)
if flag and flag == 'SUCCEED':
break
if flag == 'PARTIAL_ADULT':
self._downloader.report_warning(
'In accordance with local laws and regulations, underage users are restricted from watching adult content. '
'Only content suitable for all ages will be downloaded. '
'Provide account credentials if you wish to download restricted content.')
partial_view = True
continue
elif flag == 'ADULT':
error = 'Only users older than 19 are able to watch this video. Provide account credentials to download this content.'
else:
error = flag
raise ExtractorError(
'%s said: %s' % (self.IE_NAME, flag), expected=True)
'%s said: %s' % (self.IE_NAME, error), expected=True)
else:
raise ExtractorError('Unable to download video info')
video_element = video_xml.findall(compat_xpath('./track/video'))[-1]
if video_element is None or video_element.text is None:

View File

@@ -117,9 +117,9 @@ class BiliBiliIE(InfoExtractor):
r'cid(?:["\']:|=)(\d+)', webpage, 'cid',
default=None
) or compat_parse_qs(self._search_regex(
[r'1EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
r'1EmbedPlayer\([^)]+,\s*\\"([^"]+)\\"\)',
r'1<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
[r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
r'EmbedPlayer\([^)]+,\s*\\"([^"]+)\\"\)',
r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
webpage, 'player parameters'))['cid'][0]
else:
if 'no_bangumi_tip' not in smuggled_data:

View File

@@ -1,31 +1,45 @@
# coding: utf-8
from __future__ import unicode_literals, division
import re
from .common import InfoExtractor
from ..utils import int_or_none
from ..compat import (
compat_str,
compat_HTTPError,
)
from ..utils import (
determine_ext,
float_or_none,
int_or_none,
parse_age_limit,
parse_duration,
ExtractorError
)
class CrackleIE(InfoExtractor):
_GEO_COUNTRIES = ['US']
_VALID_URL = r'(?:crackle:|https?://(?:(?:www|m)\.)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)'
_TEST = {
'url': 'http://www.crackle.com/comedians-in-cars-getting-coffee/2498934',
# geo restricted to CA
'url': 'https://www.crackle.com/andromeda/2502343',
'info_dict': {
'id': '2498934',
'id': '2502343',
'ext': 'mp4',
'title': 'Everybody Respects A Bloody Nose',
'description': 'Jerry is kaffeeklatsching in L.A. with funnyman J.B. Smoove (Saturday Night Live, Real Husbands of Hollywood). Theyre headed for brew at 10 Speed Coffee in a 1964 Studebaker Avanti.',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 906,
'series': 'Comedians In Cars Getting Coffee',
'season_number': 8,
'episode_number': 4,
'subtitles': {
'en-US': [
{'ext': 'vtt'},
{'ext': 'tt'},
]
},
'title': 'Under The Night',
'description': 'md5:d2b8ca816579ae8a7bf28bfff8cefc8a',
'duration': 2583,
'view_count': int,
'average_rating': 0,
'age_limit': 14,
'genre': 'Action, Sci-Fi',
'creator': 'Allan Kroeker',
'artist': 'Keith Hamilton Cobb, Kevin Sorbo, Lisa Ryder, Lexa Doig, Robert Hewitt Wolfe',
'release_year': 2000,
'series': 'Andromeda',
'episode': 'Under The Night',
'season_number': 1,
'episode_number': 1,
},
'params': {
# m3u8 download
@@ -33,109 +47,118 @@ class CrackleIE(InfoExtractor):
}
}
_THUMBNAIL_RES = [
(120, 90),
(208, 156),
(220, 124),
(220, 220),
(240, 180),
(250, 141),
(315, 236),
(320, 180),
(360, 203),
(400, 300),
(421, 316),
(460, 330),
(460, 460),
(462, 260),
(480, 270),
(587, 330),
(640, 480),
(700, 330),
(700, 394),
(854, 480),
(1024, 1024),
(1920, 1080),
]
# extracted from http://legacyweb-us.crackle.com/flash/ReferrerRedirect.ashx
_MEDIA_FILE_SLOTS = {
'c544.flv': {
'width': 544,
'height': 306,
},
'360p.mp4': {
'width': 640,
'height': 360,
},
'480p.mp4': {
'width': 852,
'height': 478,
},
'480p_1mbps.mp4': {
'width': 852,
'height': 478,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
config_doc = self._download_xml(
'http://legacyweb-us.crackle.com/flash/QueryReferrer.ashx?site=16',
video_id, 'Downloading config')
country_code = self._downloader.params.get('geo_bypass_country', None)
countries = [country_code] if country_code else (
'US', 'AU', 'CA', 'AS', 'FM', 'GU', 'MP', 'PR', 'PW', 'MH', 'VI')
item = self._download_xml(
'http://legacyweb-us.crackle.com/app/revamp/vidwallcache.aspx?flags=-1&fm=%s' % video_id,
video_id, headers=self.geo_verification_headers()).find('i')
title = item.attrib['t']
last_e = None
subtitles = {}
formats = self._extract_m3u8_formats(
'http://content.uplynk.com/ext/%s/%s.m3u8' % (config_doc.attrib['strUplynkOwnerId'], video_id),
video_id, 'mp4', m3u8_id='hls', fatal=None)
thumbnails = []
path = item.attrib.get('p')
if path:
for width, height in self._THUMBNAIL_RES:
res = '%dx%d' % (width, height)
thumbnails.append({
'id': res,
'url': 'http://images-us-am.crackle.com/%stnl_%s.jpg' % (path, res),
'width': width,
'height': height,
'resolution': res,
})
http_base_url = 'http://ahttp.crackle.com/' + path
for mfs_path, mfs_info in self._MEDIA_FILE_SLOTS.items():
formats.append({
'url': http_base_url + mfs_path,
'format_id': 'http-' + mfs_path.split('.')[0],
'width': mfs_info['width'],
'height': mfs_info['height'],
})
for cc in item.findall('cc'):
locale = cc.attrib.get('l')
v = cc.attrib.get('v')
if locale and v:
if locale not in subtitles:
subtitles[locale] = []
for url_ext, ext in (('vtt', 'vtt'), ('xml', 'tt')):
subtitles.setdefault(locale, []).append({
'url': '%s/%s%s_%s.%s' % (config_doc.attrib['strSubtitleServer'], path, locale, v, url_ext),
'ext': ext,
})
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))
for country in countries:
try:
media = self._download_json(
'https://web-api-us.crackle.com/Service.svc/details/media/%s/%s'
% (video_id, country), video_id,
'Downloading media JSON as %s' % country,
'Unable to download media JSON', query={
'disableProtocols': 'true',
'format': 'json'
})
except ExtractorError as e:
# 401 means geo restriction, trying next country
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
last_e = e
continue
raise
return {
'id': video_id,
'title': title,
'description': item.attrib.get('d'),
'duration': int(item.attrib.get('r'), 16) / 1000 if item.attrib.get('r') else None,
'series': item.attrib.get('sn'),
'season_number': int_or_none(item.attrib.get('se')),
'episode_number': int_or_none(item.attrib.get('ep')),
'thumbnails': thumbnails,
'subtitles': subtitles,
'formats': formats,
}
media_urls = media.get('MediaURLs')
if not media_urls or not isinstance(media_urls, list):
continue
title = media['Title']
formats = []
for e in media['MediaURLs']:
if e.get('UseDRM') is True:
continue
format_url = e.get('Path')
if not format_url or not isinstance(format_url, compat_str):
continue
ext = determine_ext(format_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
format_url, video_id, mpd_id='dash', fatal=False))
self._sort_formats(formats)
description = media.get('Description')
duration = int_or_none(media.get(
'DurationInSeconds')) or parse_duration(media.get('Duration'))
view_count = int_or_none(media.get('CountViews'))
average_rating = float_or_none(media.get('UserRating'))
age_limit = parse_age_limit(media.get('Rating'))
genre = media.get('Genre')
release_year = int_or_none(media.get('ReleaseYear'))
creator = media.get('Directors')
artist = media.get('Cast')
if media.get('MediaTypeDisplayValue') == 'Full Episode':
series = media.get('ShowName')
episode = title
season_number = int_or_none(media.get('Season'))
episode_number = int_or_none(media.get('Episode'))
else:
series = episode = season_number = episode_number = None
subtitles = {}
cc_files = media.get('ClosedCaptionFiles')
if isinstance(cc_files, list):
for cc_file in cc_files:
if not isinstance(cc_file, dict):
continue
cc_url = cc_file.get('Path')
if not cc_url or not isinstance(cc_url, compat_str):
continue
lang = cc_file.get('Locale') or 'en'
subtitles.setdefault(lang, []).append({'url': cc_url})
thumbnails = []
images = media.get('Images')
if isinstance(images, list):
for image_key, image_url in images.items():
mobj = re.search(r'Img_(\d+)[xX](\d+)', image_key)
if not mobj:
continue
thumbnails.append({
'url': image_url,
'width': int(mobj.group(1)),
'height': int(mobj.group(2)),
})
return {
'id': video_id,
'title': title,
'description': description,
'duration': duration,
'view_count': view_count,
'average_rating': average_rating,
'age_limit': age_limit,
'genre': genre,
'creator': creator,
'artist': artist,
'release_year': release_year,
'series': series,
'episode': episode,
'season_number': season_number,
'episode_number': episode_number,
'thumbnails': thumbnails,
'subtitles': subtitles,
'formats': formats,
}
raise last_e

View File

@@ -2,26 +2,26 @@
from __future__ import unicode_literals
import itertools
import json
from .amp import AMPIE
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_str,
compat_urlparse,
)
from ..utils import (
ExtractorError,
clean_html,
ExtractorError,
int_or_none,
remove_end,
sanitized_Request,
urlencode_postdata
parse_age_limit,
parse_duration,
unified_timestamp,
)
class DramaFeverBaseIE(AMPIE):
_LOGIN_URL = 'https://www.dramafever.com/accounts/login/'
class DramaFeverBaseIE(InfoExtractor):
_NETRC_MACHINE = 'dramafever'
_GEO_COUNTRIES = ['US', 'CA']
_CONSUMER_SECRET = 'DA59dtVXYLxajktV'
@@ -38,8 +38,8 @@ class DramaFeverBaseIE(AMPIE):
'consumer secret', default=self._CONSUMER_SECRET)
def _real_initialize(self):
self._login()
self._consumer_secret = self._get_consumer_secret()
self._login()
def _login(self):
(username, password) = self._get_login_info()
@@ -51,37 +51,49 @@ class DramaFeverBaseIE(AMPIE):
'password': password,
}
request = sanitized_Request(
self._LOGIN_URL, urlencode_postdata(login_form))
response = self._download_webpage(
request, None, 'Logging in')
try:
response = self._download_json(
'https://www.dramafever.com/api/users/login', None, 'Logging in',
data=json.dumps(login_form).encode('utf-8'), headers={
'x-consumer-key': self._consumer_secret,
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (403, 404):
response = self._parse_json(
e.cause.read().decode('utf-8'), None)
else:
raise
if all(logout_pattern not in response
for logout_pattern in ['href="/accounts/logout/"', '>Log out<']):
error = self._html_search_regex(
r'(?s)<h\d[^>]+\bclass="hidden-xs prompt"[^>]*>(.+?)</h\d',
response, 'error message', default=None)
if error:
raise ExtractorError('Unable to login: %s' % error, expected=True)
raise ExtractorError('Unable to log in')
# Successful login
if response.get('result') or response.get('guid') or response.get('user_guid'):
return
errors = response.get('errors')
if errors and isinstance(errors, list):
error = errors[0]
message = error.get('message') or error['reason']
raise ExtractorError('Unable to login: %s' % message, expected=True)
raise ExtractorError('Unable to log in')
class DramaFeverIE(DramaFeverBaseIE):
IE_NAME = 'dramafever'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
_TESTS = [{
'url': 'http://www.dramafever.com/drama/4512/1/Cooking_with_Shin/',
'url': 'https://www.dramafever.com/drama/4274/1/Heirs/',
'info_dict': {
'id': '4512.1',
'ext': 'flv',
'title': 'Cooking with Shin',
'description': 'md5:a8eec7942e1664a6896fcd5e1287bfd0',
'id': '4274.1',
'ext': 'wvm',
'title': 'Heirs - Episode 1',
'description': 'md5:362a24ba18209f6276e032a651c50bc2',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3783,
'timestamp': 1381354993,
'upload_date': '20131009',
'series': 'Heirs',
'season_number': 1,
'episode': 'Episode 1',
'episode_number': 1,
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1404336058,
'upload_date': '20140702',
'duration': 344,
},
'params': {
# m3u8 download
@@ -110,50 +122,95 @@ class DramaFeverIE(DramaFeverBaseIE):
'only_matching': True,
}]
def _call_api(self, path, video_id, note, fatal=False):
return self._download_json(
'https://www.dramafever.com/api/5/' + path,
video_id, note=note, headers={
'x-consumer-key': self._consumer_secret,
}, fatal=fatal)
def _get_subtitles(self, video_id):
subtitles = {}
subs = self._call_api(
'video/%s/subtitles/webvtt/' % video_id, video_id,
'Downloading subtitles JSON', fatal=False)
if not subs or not isinstance(subs, list):
return subtitles
for sub in subs:
if not isinstance(sub, dict):
continue
sub_url = sub.get('url')
if not sub_url or not isinstance(sub_url, compat_str):
continue
subtitles.setdefault(
sub.get('code') or sub.get('language') or 'en', []).append({
'url': sub_url
})
return subtitles
def _real_extract(self, url):
video_id = self._match_id(url).replace('/', '.')
try:
info = self._extract_feed_info(
'http://www.dramafever.com/amp/episode/feed.json?guid=%s' % video_id)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError):
self.raise_geo_restricted(
msg='Currently unavailable in your country',
countries=self._GEO_COUNTRIES)
raise
# title is postfixed with video id for some reason, removing
if info.get('title'):
info['title'] = remove_end(info['title'], video_id).strip()
series_id, episode_number = video_id.split('.')
episode_info = self._download_json(
# We only need a single episode info, so restricting page size to one episode
# and dealing with page number as with episode number
r'http://www.dramafever.com/api/4/episode/series/?cs=%s&series_id=%s&page_number=%s&page_size=1'
% (self._consumer_secret, series_id, episode_number),
video_id, 'Downloading episode info JSON', fatal=False)
if episode_info:
value = episode_info.get('value')
if isinstance(value, list):
for v in value:
if v.get('type') == 'Episode':
subfile = v.get('subfile') or v.get('new_subfile')
if subfile and subfile != 'http://www.dramafever.com/st/':
info.setdefault('subtitles', {}).setdefault('English', []).append({
'ext': 'srt',
'url': subfile,
})
episode_number = int_or_none(v.get('number'))
episode_fallback = 'Episode'
if episode_number:
episode_fallback += ' %d' % episode_number
info['episode'] = v.get('title') or episode_fallback
info['episode_number'] = episode_number
break
return info
video = self._call_api(
'series/%s/episodes/%s/' % (series_id, episode_number), video_id,
'Downloading video JSON')
formats = []
download_assets = video.get('download_assets')
if download_assets and isinstance(download_assets, dict):
for format_id, format_dict in download_assets.items():
if not isinstance(format_dict, dict):
continue
format_url = format_dict.get('url')
if not format_url or not isinstance(format_url, compat_str):
continue
formats.append({
'url': format_url,
'format_id': format_id,
'filesize': int_or_none(video.get('filesize')),
})
stream = self._call_api(
'video/%s/stream/' % video_id, video_id, 'Downloading stream JSON',
fatal=False)
if stream:
stream_url = stream.get('stream_url')
if stream_url:
formats.extend(self._extract_m3u8_formats(
stream_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
self._sort_formats(formats)
title = video.get('title') or 'Episode %s' % episode_number
description = video.get('description')
thumbnail = video.get('thumbnail')
timestamp = unified_timestamp(video.get('release_date'))
duration = parse_duration(video.get('duration'))
age_limit = parse_age_limit(video.get('tv_rating'))
series = video.get('series_title')
season_number = int_or_none(video.get('season'))
if series:
title = '%s - %s' % (series, title)
subtitles = self.extract_subtitles(video_id)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'timestamp': timestamp,
'age_limit': age_limit,
'series': series,
'season_number': season_number,
'episode_number': int_or_none(episode_number),
'formats': formats,
'subtitles': subtitles,
}
class DramaFeverSeriesIE(DramaFeverBaseIE):

View File

@@ -532,13 +532,14 @@ from .lcp import (
)
from .learnr import LearnrIE
from .lecture2go import Lecture2GoIE
from .lego import LEGOIE
from .lemonde import LemondeIE
from .leeco import (
LeIE,
LePlaylistIE,
LetvCloudIE,
)
from .lego import LEGOIE
from .lemonde import LemondeIE
from .lenta import LentaIE
from .libraryofcongress import LibraryOfCongressIE
from .libsyn import LibsynIE
from .lifenews import (
@@ -1135,6 +1136,7 @@ from .tvnoe import TVNoeIE
from .tvnow import (
TVNowIE,
TVNowListIE,
TVNowShowIE,
)
from .tvp import (
TVPEmbedIE,

View File

@@ -1270,24 +1270,6 @@ class GenericIE(InfoExtractor):
},
'add_ie': ['Kaltura'],
},
# EaglePlatform embed (generic URL)
{
'url': 'http://lenta.ru/news/2015/03/06/navalny/',
# Not checking MD5 as sometimes the direct HTTP link results in 404 and HLS is used
'info_dict': {
'id': '227304',
'ext': 'mp4',
'title': 'Навальный вышел на свободу',
'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 87,
'view_count': int,
'age_limit': 0,
},
'params': {
'skip_download': True,
},
},
# referrer protected EaglePlatform embed
{
'url': 'https://tvrain.ru/lite/teleshow/kak_vse_nachinalos/namin-418921/',

View File

@@ -1,5 +1,6 @@
from __future__ import unicode_literals
import itertools
import json
import re
@@ -242,48 +243,69 @@ class InstagramUserIE(InfoExtractor):
return int_or_none(try_get(
node, lambda x: x['edge_media_' + suffix]['count']))
edges = self._download_json(
'https://www.instagram.com/graphql/query/', uploader_id, query={
'query_hash': '472f257a40c653c64c666ce877d59d2b',
'variables': json.dumps({
'id': uploader_id,
'first': 999999999,
cursor = ''
for page_num in itertools.count(1):
media = self._download_json(
'https://www.instagram.com/graphql/query/', uploader_id,
'Downloading JSON page %d' % page_num, query={
'query_hash': '472f257a40c653c64c666ce877d59d2b',
'variables': json.dumps({
'id': uploader_id,
'first': 100,
'after': cursor,
})
})['data']['user']['edge_owner_to_timeline_media']
edges = media.get('edges')
if not edges or not isinstance(edges, list):
break
for edge in edges:
node = edge.get('node')
if not node or not isinstance(node, dict):
continue
if node.get('__typename') != 'GraphVideo' and node.get('is_video') is not True:
continue
video_id = node.get('shortcode')
if not video_id:
continue
info = self.url_result(
'https://instagram.com/p/%s/' % video_id,
ie=InstagramIE.ie_key(), video_id=video_id)
description = try_get(
node, lambda x: x['edge_media_to_caption']['edges'][0]['node']['text'],
compat_str)
thumbnail = node.get('thumbnail_src') or node.get('display_src')
timestamp = int_or_none(node.get('taken_at_timestamp'))
comment_count = get_count('to_comment')
like_count = get_count('preview_like')
view_count = int_or_none(node.get('video_view_count'))
info.update({
'description': description,
'thumbnail': thumbnail,
'timestamp': timestamp,
'comment_count': comment_count,
'like_count': like_count,
'view_count': view_count,
})
})['data']['user']['edge_owner_to_timeline_media']['edges']
for edge in edges:
node = edge['node']
yield info
if node.get('__typename') != 'GraphVideo' and node.get('is_video') is not True:
continue
video_id = node.get('shortcode')
if not video_id:
continue
page_info = media.get('page_info')
if not page_info or not isinstance(page_info, dict):
break
info = self.url_result(
'https://instagram.com/p/%s/' % video_id,
ie=InstagramIE.ie_key(), video_id=video_id)
has_next_page = page_info.get('has_next_page')
if not has_next_page:
break
description = try_get(
node, lambda x: x['edge_media_to_caption']['edges'][0]['node']['text'],
compat_str)
thumbnail = node.get('thumbnail_src') or node.get('display_src')
timestamp = int_or_none(node.get('taken_at_timestamp'))
comment_count = get_count('to_comment')
like_count = get_count('preview_like')
view_count = int_or_none(node.get('video_view_count'))
info.update({
'description': description,
'thumbnail': thumbnail,
'timestamp': timestamp,
'comment_count': comment_count,
'like_count': like_count,
'view_count': view_count,
})
yield info
cursor = page_info.get('end_cursor')
if not cursor or not isinstance(cursor, compat_str):
break
def _real_extract(self, url):
username = self._match_id(url)

View File

@@ -0,0 +1,53 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class LentaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?lenta\.ru/[^/]+/\d+/\d+/\d+/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://lenta.ru/news/2018/03/22/savshenko_go/',
'info_dict': {
'id': '964400',
'ext': 'mp4',
'title': 'Надежду Савченко задержали',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 61,
'view_count': int,
},
'params': {
'skip_download': True,
},
}, {
# EaglePlatform iframe embed
'url': 'http://lenta.ru/news/2015/03/06/navalny/',
'info_dict': {
'id': '227304',
'ext': 'mp4',
'title': 'Навальный вышел на свободу',
'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 87,
'view_count': int,
'age_limit': 0,
},
'params': {
'skip_download': True,
},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r'vid\s*:\s*["\']?(\d+)', webpage, 'eagleplatform id',
default=None)
if video_id:
return self.url_result(
'eagleplatform:lentaru.media.eagleplatform.com:%s' % video_id,
ie='EaglePlatform', video_id=video_id)
return self.url_result(url, ie='Generic')

View File

@@ -1,24 +1,28 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..utils import unified_strdate
from ..utils import (
parse_duration,
unified_strdate,
)
class LibsynIE(InfoExtractor):
_VALID_URL = r'(?P<mainurl>https?://html5-player\.libsyn\.com/embed/episode/id/(?P<id>[0-9]+))'
_TESTS = [{
'url': 'http://html5-player.libsyn.com/embed/episode/id/3377616/',
'md5': '443360ee1b58007bc3dcf09b41d093bb',
'url': 'http://html5-player.libsyn.com/embed/episode/id/6385796/',
'md5': '2a55e75496c790cdeb058e7e6c087746',
'info_dict': {
'id': '3377616',
'id': '6385796',
'ext': 'mp3',
'title': "The Daily Show Podcast without Jon Stewart - Episode 12: Bassem Youssef: Egypt's Jon Stewart",
'description': 'md5:601cb790edd05908957dae8aaa866465',
'upload_date': '20150220',
'title': "Champion Minded - Developing a Growth Mindset",
'description': 'In this episode, Allistair talks about the importance of developing a growth mindset, not only in sports, but in life too.',
'upload_date': '20180320',
'thumbnail': 're:^https?://.*',
},
}, {
@@ -39,31 +43,45 @@ class LibsynIE(InfoExtractor):
url = m.group('mainurl')
webpage = self._download_webpage(url, video_id)
formats = [{
'url': media_url,
} for media_url in set(re.findall(r'var\s+mediaURL(?:Libsyn)?\s*=\s*"([^"]+)"', webpage))]
podcast_title = self._search_regex(
r'<h2>([^<]+)</h2>', webpage, 'podcast title', default=None)
r'<h3>([^<]+)</h3>', webpage, 'podcast title', default=None)
if podcast_title:
podcast_title = podcast_title.strip()
episode_title = self._search_regex(
r'(?:<div class="episode-title">|<h3>)([^<]+)</', webpage, 'episode title')
r'(?:<div class="episode-title">|<h4>)([^<]+)</', webpage, 'episode title')
if episode_title:
episode_title = episode_title.strip()
title = '%s - %s' % (podcast_title, episode_title) if podcast_title else episode_title
description = self._html_search_regex(
r'<div id="info_text_body">(.+?)</div>', webpage,
r'<p\s+id="info_text_body">(.+?)</p>', webpage,
'description', default=None)
thumbnail = self._search_regex(
r'<img[^>]+class="info-show-icon"[^>]+src="([^"]+)"',
webpage, 'thumbnail', fatal=False)
if description:
# Strip non-breaking and normal spaces
description = description.replace('\u00A0', ' ').strip()
release_date = unified_strdate(self._search_regex(
r'<div class="release_date">Released: ([^<]+)<', webpage, 'release date', fatal=False))
data_json = self._search_regex(r'var\s+playlistItem\s*=\s*(\{.*?\});\n', webpage, 'JSON data block')
data = json.loads(data_json)
formats = [{
'url': data['media_url'],
'format_id': 'main',
}, {
'url': data['media_url_libsyn'],
'format_id': 'libsyn',
}]
thumbnail = data.get('thumbnail_url')
duration = parse_duration(data.get('duration'))
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'upload_date': release_date,
'duration': duration,
'formats': formats,
}

View File

@@ -141,6 +141,7 @@ class MedialaanIE(GigyaBaseIE):
vod_id = config.get('vodId') or self._search_regex(
(r'\\"vodId\\"\s*:\s*\\"(.+?)\\"',
r'"vodId"\s*:\s*"(.+?)"',
r'<[^>]+id=["\']vod-(\d+)'),
webpage, 'video_id', default=None)

View File

@@ -68,11 +68,11 @@ class NationalGeographicVideoIE(InfoExtractor):
class NationalGeographicIE(ThePlatformIE, AdobePassIE):
IE_NAME = 'natgeo'
_VALID_URL = r'https?://channel\.nationalgeographic\.com/(?:(?:wild/)?[^/]+/)?(?:videos|episodes)/(?P<id>[^/?]+)'
_VALID_URL = r'https?://channel\.nationalgeographic\.com/(?:(?:(?:wild/)?[^/]+/)?(?:videos|episodes)|u)/(?P<id>[^/?]+)'
_TESTS = [
{
'url': 'http://channel.nationalgeographic.com/the-story-of-god-with-morgan-freeman/videos/uncovering-a-universal-knowledge/',
'url': 'http://channel.nationalgeographic.com/u/kdi9Ld0PN2molUUIMSBGxoeDhD729KRjQcnxtetilWPMevo8ZwUBIDuPR0Q3D2LVaTsk0MPRkRWDB8ZhqWVeyoxfsZZm36yRp1j-zPfsHEyI_EgAeFY/',
'md5': '518c9aa655686cf81493af5cc21e2a04',
'info_dict': {
'id': 'vKInpacll2pC',
@@ -86,7 +86,7 @@ class NationalGeographicIE(ThePlatformIE, AdobePassIE):
'add_ie': ['ThePlatform'],
},
{
'url': 'http://channel.nationalgeographic.com/wild/destination-wild/videos/the-stunning-red-bird-of-paradise/',
'url': 'http://channel.nationalgeographic.com/u/kdvOstqYaBY-vSBPyYgAZRUL4sWUJ5XUUPEhc7ISyBHqoIO4_dzfY3K6EjHIC0hmFXoQ7Cpzm6RkET7S3oMlm6CFnrQwSUwo/',
'md5': 'c4912f656b4cbe58f3e000c489360989',
'info_dict': {
'id': 'Pok5lWCkiEFA',
@@ -106,6 +106,14 @@ class NationalGeographicIE(ThePlatformIE, AdobePassIE):
{
'url': 'http://channel.nationalgeographic.com/videos/treasures-rediscovered/',
'only_matching': True,
},
{
'url': 'http://channel.nationalgeographic.com/the-story-of-god-with-morgan-freeman/videos/uncovering-a-universal-knowledge/',
'only_matching': True,
},
{
'url': 'http://channel.nationalgeographic.com/wild/destination-wild/videos/the-stunning-red-bird-of-paradise/',
'only_matching': True,
}
]

View File

@@ -1,8 +1,6 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
@@ -43,9 +41,14 @@ class NaverIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
m_id = re.search(r'var rmcPlayer = new nhn\.rmcnmv\.RMCVideoPlayer\("(.+?)", "(.+?)"',
webpage)
if m_id is None:
vid = self._search_regex(
r'videoId["\']\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1', webpage,
'video id', fatal=None, group='value')
in_key = self._search_regex(
r'inKey["\']\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1', webpage,
'key', default=None, group='value')
if not vid or not in_key:
error = self._html_search_regex(
r'(?s)<div class="(?:nation_error|nation_box|error_box)">\s*(?:<!--.*?-->)?\s*<p class="[^"]+">(?P<msg>.+?)</p>\s*</div>',
webpage, 'error', default=None)
@@ -53,9 +56,9 @@ class NaverIE(InfoExtractor):
raise ExtractorError(error, expected=True)
raise ExtractorError('couldn\'t extract vid and key')
video_data = self._download_json(
'http://play.rmcnmv.naver.com/vod/play/v2.0/' + m_id.group(1),
'http://play.rmcnmv.naver.com/vod/play/v2.0/' + vid,
video_id, query={
'key': m_id.group(2),
'key': in_key,
})
meta = video_data['meta']
title = meta['subject']

View File

@@ -243,7 +243,7 @@ class PhantomJSwrapper(object):
class OpenloadIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?:openload\.(?:co|io|link)|oload\.(?:tv|stream))/(?:f|embed)/(?P<id>[a-zA-Z0-9-_]+)'
_VALID_URL = r'https?://(?:www\.)?(?:openload\.(?:co|io|link)|oload\.(?:tv|stream|site))/(?:f|embed)/(?P<id>[a-zA-Z0-9-_]+)'
_TESTS = [{
'url': 'https://openload.co/f/kUEfGclsU9o',

View File

@@ -10,6 +10,7 @@ from ..utils import (
int_or_none,
parse_iso8601,
parse_duration,
try_get,
update_url_query,
)
@@ -58,14 +59,22 @@ class TVNowBaseIE(InfoExtractor):
duration = parse_duration(info.get('duration'))
f = info.get('format', {})
thumbnails = [{
'url': 'https://aistvnow-a.akamaihd.net/tvnow/movie/%s' % video_id,
}]
thumbnail = f.get('defaultImage169Format') or f.get('defaultImage169Logo')
if thumbnail:
thumbnails.append({
'url': thumbnail,
})
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'thumbnails': thumbnails,
'timestamp': timestamp,
'duration': duration,
'series': f.get('title'),
@@ -77,7 +86,12 @@ class TVNowBaseIE(InfoExtractor):
class TVNowIE(TVNowBaseIE):
_VALID_URL = r'https?://(?:www\.)?tvnow\.(?:de|at|ch)/(?:rtl(?:2|plus)?|nitro|superrtl|ntv|vox)/(?P<show_id>[^/]+)/(?:(?:list/[^/]+|jahr/\d{4}/\d{1,2})/)?(?P<id>[^/]+)/(?:player|preview)'
_VALID_URL = r'''(?x)
https?://
(?:www\.)?tvnow\.(?:de|at|ch)/[^/]+/
(?P<show_id>[^/]+)/
(?!(?:list|jahr)(?:/|$))(?P<id>[^/?\#&]+)
'''
_TESTS = [{
'url': 'https://www.tvnow.de/rtl2/grip-das-motormagazin/der-neue-porsche-911-gt-3/player',
@@ -99,27 +113,30 @@ class TVNowIE(TVNowBaseIE):
}, {
# rtl2
'url': 'https://www.tvnow.de/rtl2/armes-deutschland/episode-0008/player',
'only_matching': 'True',
'only_matching': True,
}, {
# rtlnitro
'url': 'https://www.tvnow.de/nitro/alarm-fuer-cobra-11-die-autobahnpolizei/auf-eigene-faust-pilot/player',
'only_matching': 'True',
'only_matching': True,
}, {
# superrtl
'url': 'https://www.tvnow.de/superrtl/die-lustigsten-schlamassel-der-welt/u-a-ketchup-effekt/player',
'only_matching': 'True',
'only_matching': True,
}, {
# ntv
'url': 'https://www.tvnow.de/ntv/startup-news/goetter-in-weiss/player',
'only_matching': 'True',
'only_matching': True,
}, {
# vox
'url': 'https://www.tvnow.de/vox/auto-mobil/neues-vom-automobilmarkt-2017-11-19-17-00-00/player',
'only_matching': 'True',
'only_matching': True,
}, {
# rtlplus
'url': 'https://www.tvnow.de/rtlplus/op-ruft-dr-bruckner/die-vernaehte-frau/player',
'only_matching': 'True',
'only_matching': True,
}, {
'url': 'https://www.tvnow.de/rtl2/grip-das-motormagazin/der-neue-porsche-911-gt-3',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -133,8 +150,30 @@ class TVNowIE(TVNowBaseIE):
return self._extract_video(info, display_id)
class TVNowListIE(TVNowBaseIE):
_VALID_URL = r'(?P<base_url>https?://(?:www\.)?tvnow\.(?:de|at|ch)/(?:rtl(?:2|plus)?|nitro|superrtl|ntv|vox)/(?P<show_id>[^/]+)/)list/(?P<id>[^?/#&]+)$'
class TVNowListBaseIE(TVNowBaseIE):
_SHOW_VALID_URL = r'''(?x)
(?P<base_url>
https?://
(?:www\.)?tvnow\.(?:de|at|ch)/[^/]+/
(?P<show_id>[^/]+)
)
'''
def _extract_list_info(self, display_id, show_id):
fields = list(self._SHOW_FIELDS)
fields.extend('formatTabs.%s' % field for field in self._SEASON_FIELDS)
fields.extend(
'formatTabs.formatTabPages.container.movies.%s' % field
for field in self._VIDEO_FIELDS)
return self._call_api(
'formats/seo', display_id, query={
'fields': ','.join(fields),
'name': show_id + '.php'
})
class TVNowListIE(TVNowListBaseIE):
_VALID_URL = r'%s/(?:list|jahr)/(?P<id>[^?\#&]+)' % TVNowListBaseIE._SHOW_VALID_URL
_SHOW_FIELDS = ('title', )
_SEASON_FIELDS = ('id', 'headline', 'seoheadline', )
@@ -147,38 +186,94 @@ class TVNowListIE(TVNowBaseIE):
'title': '30 Minuten Deutschland - Aktuell',
},
'playlist_mincount': 1,
}, {
'url': 'https://www.tvnow.de/vox/ab-ins-beet/list/staffel-14',
'only_matching': True,
}, {
'url': 'https://www.tvnow.de/rtl2/grip-das-motormagazin/jahr/2018/3',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return (False if TVNowIE.suitable(url)
else super(TVNowListIE, cls).suitable(url))
def _real_extract(self, url):
base_url, show_id, season_id = re.match(self._VALID_URL, url).groups()
fields = []
fields.extend(self._SHOW_FIELDS)
fields.extend('formatTabs.%s' % field for field in self._SEASON_FIELDS)
fields.extend(
'formatTabs.formatTabPages.container.movies.%s' % field
for field in self._VIDEO_FIELDS)
list_info = self._call_api(
'formats/seo', season_id, query={
'fields': ','.join(fields),
'name': show_id + '.php'
})
list_info = self._extract_list_info(season_id, show_id)
season = next(
season for season in list_info['formatTabs']['items']
if season.get('seoheadline') == season_id)
title = '%s - %s' % (list_info['title'], season['headline'])
title = list_info.get('title')
headline = season.get('headline')
if title and headline:
title = '%s - %s' % (title, headline)
else:
title = headline or title
entries = []
for container in season['formatTabPages']['items']:
for info in ((container.get('container') or {}).get('movies') or {}).get('items') or []:
items = try_get(
container, lambda x: x['container']['movies']['items'],
list) or []
for info in items:
seo_url = info.get('seoUrl')
if not seo_url:
continue
video_id = info.get('id')
entries.append(self.url_result(
base_url + seo_url + '/player', 'TVNow', info.get('id')))
'%s/%s/player' % (base_url, seo_url), TVNowIE.ie_key(),
compat_str(video_id) if video_id else None))
return self.playlist_result(
entries, compat_str(season.get('id') or season_id), title)
class TVNowShowIE(TVNowListBaseIE):
_VALID_URL = TVNowListBaseIE._SHOW_VALID_URL
_SHOW_FIELDS = ('id', 'title', )
_SEASON_FIELDS = ('id', 'headline', 'seoheadline', )
_VIDEO_FIELDS = ()
_TESTS = [{
'url': 'https://www.tvnow.at/vox/ab-ins-beet',
'info_dict': {
'id': 'ab-ins-beet',
'title': 'Ab ins Beet!',
},
'playlist_mincount': 7,
}, {
'url': 'https://www.tvnow.at/vox/ab-ins-beet/list',
'only_matching': True,
}, {
'url': 'https://www.tvnow.de/rtl2/grip-das-motormagazin/jahr/',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return (False if TVNowIE.suitable(url) or TVNowListIE.suitable(url)
else super(TVNowShowIE, cls).suitable(url))
def _real_extract(self, url):
base_url, show_id = re.match(self._VALID_URL, url).groups()
list_info = self._extract_list_info(show_id, show_id)
entries = []
for season_info in list_info['formatTabs']['items']:
season_url = season_info.get('seoheadline')
if not season_url:
continue
season_id = season_info.get('id')
entries.append(self.url_result(
'%s/list/%s' % (base_url, season_url), TVNowListIE.ie_key(),
compat_str(season_id) if season_id else None,
season_info.get('headline')))
return self.playlist_result(entries, show_id, list_info.get('title'))

View File

@@ -14,7 +14,7 @@ from ..utils import (
class TwentyFourVideoIE(InfoExtractor):
IE_NAME = '24video'
_VALID_URL = r'https?://(?P<host>(?:www\.)?24video\.(?:net|me|xxx|sex|tube|adult))/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'
_VALID_URL = r'https?://(?P<host>(?:www\.)?24video\.(?:net|me|xxx|sexy?|tube|adult))/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.24video.net/video/view/1044982',

View File

@@ -16,7 +16,7 @@ from ..utils import (
class VideaIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
videa\.hu/
videa(?:kid)?\.hu/
(?:
videok/(?:[^/]+/)*[^?#&]+-|
player\?.*?\bv=|
@@ -31,7 +31,7 @@ class VideaIE(InfoExtractor):
'id': '8YfIAjxwWGwT8HVQ',
'ext': 'mp4',
'title': 'Az őrült kígyász 285 kígyót enged szabadon',
'thumbnail': 'http://videa.hu/static/still/1.4.1.1007274.1204470.3',
'thumbnail': r're:^https?://.*',
'duration': 21,
},
}, {
@@ -43,6 +43,15 @@ class VideaIE(InfoExtractor):
}, {
'url': 'http://videa.hu/player/v/8YfIAjxwWGwT8HVQ?autoplay=1',
'only_matching': True,
}, {
'url': 'https://videakid.hu/videok/origo/jarmuvek/supercars-elozes-jAHDWfWSJH5XuFhH',
'only_matching': True,
}, {
'url': 'https://videakid.hu/player?v=8YfIAjxwWGwT8HVQ',
'only_matching': True,
}, {
'url': 'https://videakid.hu/player/v/8YfIAjxwWGwT8HVQ?autoplay=1',
'only_matching': True,
}]
@staticmethod

View File

@@ -12,7 +12,7 @@ import time
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse_urlencode,
compat_urlparse,
compat_urllib_parse,
)
from ..utils import (
float_or_none,
@@ -39,11 +39,11 @@ class VRVBaseIE(InfoExtractor):
data = json.dumps(data).encode()
headers['Content-Type'] = 'application/json'
method = 'POST' if data else 'GET'
base_string = '&'.join([method, compat_urlparse.quote(base_url, ''), compat_urlparse.quote(encoded_query, '')])
base_string = '&'.join([method, compat_urllib_parse.quote(base_url, ''), compat_urllib_parse.quote(encoded_query, '')])
oauth_signature = base64.b64encode(hmac.new(
(self._API_PARAMS['oAuthSecret'] + '&').encode('ascii'),
base_string.encode(), hashlib.sha1).digest()).decode()
encoded_query += '&oauth_signature=' + compat_urlparse.quote(oauth_signature, '')
encoded_query += '&oauth_signature=' + compat_urllib_parse.quote(oauth_signature, '')
return self._download_json(
'?'.join([base_url, encoded_query]), video_id,
note='Downloading %s JSON metadata' % note, headers=headers, data=data)

View File

@@ -58,7 +58,9 @@ class XVideosIE(InfoExtractor):
group='title') or self._og_search_title(webpage)
thumbnail = self._search_regex(
r'url_bigthumb=(.+?)&amp', webpage, 'thumbnail', fatal=False)
(r'setThumbUrl\(\s*(["\'])(?P<thumbnail>(?:(?!\1).)+)\1',
r'url_bigthumb=(?P<thumbnail>.+?)&amp'),
webpage, 'thumbnail', fatal=False, group='thumbnail')
duration = int_or_none(self._og_search_property(
'duration', webpage, default=None)) or parse_duration(
self._search_regex(

View File

@@ -154,7 +154,7 @@ class YoukuIE(InfoExtractor):
# request basic data
basic_data_params = {
'vid': video_id,
'ccode': '0507',
'ccode': '0590',
'client_ip': '192.168.1.1',
'utid': cna,
'client_ts': time.time() / 1000,

View File

@@ -676,7 +676,8 @@ def parseOpts(overrideArguments=None):
filesystem.add_option(
'-a', '--batch-file',
dest='batchfile', metavar='FILE',
help='File containing URLs to download (\'-\' for stdin)')
help="File containing URLs to download ('-' for stdin), one URL per line. "
"Lines starting with '#', ';' or ']' are considered as comments and ignored.")
filesystem.add_option(
'--id', default=False,
action='store_true', dest='useid', help='Use only video ID in file name')

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2018.03.20'
__version__ = '2018.04.03'