Mirror of https://github.com/ytdl-org/youtube-dl, synced 2024-12-23 04:30:10 +09:00
Merge branch 'master' into oreilly-login
Commit 899af2ef61
24
README.md
@ -918,7 +918,7 @@ Either prepend `https://www.youtube.com/watch?v=` or separate the ID from the op
|
|||||||
|
|
||||||
Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
|
Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
|
||||||
|
|
||||||
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [Get cookies.txt](https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid/) (for Chrome) or [cookies.txt](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) (for Firefox).
|
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [Get cookies.txt LOCALLY](https://chrome.google.com/webstore/detail/get-cookiestxt-locally/cclelndahbckbenkjhflpdbgdldlbecc) (for Chrome) or [cookies.txt](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) (for Firefox).
|
||||||
|
|
||||||
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
|
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
|
||||||
|
|
||||||
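The cookie-file requirements stated in the README text of this hunk (a Mozilla/Netscape header on the first line, and newlines matching the OS) can be checked before passing a file to `--cookies`. The following is a minimal sketch only; the helper name `inspect_cookie_file` is hypothetical and is not part of youtube-dl.

```python
# Accepted first lines, as listed in the README text.
VALID_HEADERS = (b'# HTTP Cookie File', b'# Netscape HTTP Cookie File')


def inspect_cookie_file(data):
    """Return (header_ok, newline_style) for the raw bytes of a cookies file.

    newline_style is 'CRLF', 'LF', 'mixed', or None when no newline is present.
    """
    lines = data.splitlines()
    first_line = lines[0] if lines else b''
    header_ok = first_line.strip() in VALID_HEADERS

    crlf = data.count(b'\r\n')
    bare_lf = data.count(b'\n') - crlf  # LF not preceded by CR
    if crlf and bare_lf:
        newline = 'mixed'
    elif crlf:
        newline = 'CRLF'
    elif bare_lf:
        newline = 'LF'
    else:
        newline = None
    return header_ok, newline
```

To use it, read the file in binary mode (`open(path, 'rb').read()`) so that the newline bytes are not translated before inspection.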
@ -1408,7 +1408,11 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
|
|||||||
|
|
||||||
# BUGS
|
# BUGS
|
||||||
|
|
||||||
Bugs and suggestions should be reported at: <https://github.com/ytdl-org/youtube-dl/issues>. Unless you were prompted to or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel [#youtube-dl](irc://chat.freenode.net/#youtube-dl) on freenode ([webchat](https://webchat.freenode.net/?randomnick=1&channels=youtube-dl)).
|
Bugs and suggestions should be reported in the issue tracker: <https://github.com/ytdl-org/youtube-dl/issues> (<https://yt-dl.org/bug> is an alias for this). Unless you were prompted to or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel [#youtube-dl](irc://chat.freenode.net/#youtube-dl) on freenode ([webchat](https://webchat.freenode.net/?randomnick=1&channels=youtube-dl)).
|
||||||
|
|
||||||
|
## Opening a bug report or suggestion
|
||||||
|
|
||||||
|
Be sure to follow instructions provided **below** and **in the issue tracker**. Complete the appropriate issue template fully. Consider whether your problem is covered by an existing issue: if so, follow the discussion there. Avoid commenting on existing duplicate issues as such comments do not add to the discussion of the issue and are liable to be treated as spam.
|
||||||
|
|
||||||
**Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
|
**Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
|
||||||
```
|
```
|
||||||
@ -1428,17 +1432,17 @@ $ youtube-dl -v <your command line>
|
|||||||
|
|
||||||
The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
|
The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
|
||||||
|
|
||||||
Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):
|
Finally please review your issue to avoid various common mistakes (you can and should use this as a checklist) listed below.
|
||||||
|
|
||||||
### Is the description of the issue itself sufficient?
|
### Is the description of the issue itself sufficient?
|
||||||
|
|
||||||
We often get issue reports that we cannot really decipher. While in most cases we eventually get the required information after asking back multiple times, this poses an unnecessary drain on our resources. Many contributors, including myself, are also not native speakers, so we may misread some parts.
|
We often get issue reports that are hard to understand. To avoid subsequent clarifications, and to assist participants who are not native English speakers, please elaborate on what feature you are requesting, or what bug you want to be fixed.
|
||||||
|
|
||||||
So please elaborate on what feature you are requesting, or what bug you want to be fixed. Make sure that it's obvious
|
Make sure that it's obvious
|
||||||
|
|
||||||
- What the problem is
|
- What the problem is
|
||||||
- How it could be fixed
|
- How it could be fixed
|
||||||
- How your proposed solution would look like
|
- How your proposed solution would look
|
||||||
|
|
||||||
If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
|
If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
|
||||||
|
|
||||||
@ -1448,14 +1452,14 @@ If your server has multiple IPs or you suspect censorship, adding `--call-home`
|
|||||||
|
|
||||||
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like `https://www.youtube.com/watch?v=BaW_jenozKc`. There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. `https://www.youtube.com/`) is *not* an example URL.
|
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like `https://www.youtube.com/watch?v=BaW_jenozKc`. There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. `https://www.youtube.com/`) is *not* an example URL.
|
||||||
|
|
||||||
|
### Is the issue already documented?
|
||||||
|
|
||||||
|
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/ytdl-org/youtube-dl/search?type=Issues) of this repository. Initially, at least, use the search term `-label:duplicate` to focus on active issues. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
|
||||||
|
|
||||||
### Are you using the latest version?
|
### Are you using the latest version?
|
||||||
|
|
||||||
Before reporting any issue, type `youtube-dl -U`. This should report that you're up-to-date. About 20% of the reports we receive are already fixed, but people are using outdated versions. This goes for feature requests as well.
|
Before reporting any issue, type `youtube-dl -U`. This should report that you're up-to-date. About 20% of the reports we receive are already fixed, but people are using outdated versions. This goes for feature requests as well.
|
||||||
|
|
||||||
### Is the issue already documented?
|
|
||||||
|
|
||||||
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/ytdl-org/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
|
|
||||||
|
|
||||||
### Why are existing options not enough?
|
### Why are existing options not enough?
|
||||||
|
|
||||||
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
|
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
|
||||||
|
64
devscripts/cli_to_api.py
Executable file
@ -0,0 +1,64 @@
+#!/usr/bin/env python
+# coding: utf-8
+
+from __future__ import unicode_literals
+
+"""
+This script displays the API parameters corresponding to a yt-dl command line
+
+Example:
+$ ./cli_to_api.py -f best
+{u'format': 'best'}
+$
+"""
+
+# Allow direct execution
+import os
+import sys
+sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+
+import youtube_dl
+from types import MethodType
+
+
+def cli_to_api(*opts):
+    YDL = youtube_dl.YoutubeDL
+
+    # to extract the parsed options, break out of YoutubeDL instantiation
+
+    # return options via this Exception
+    class ParseYTDLResult(Exception):
+        def __init__(self, result):
+            super(ParseYTDLResult, self).__init__('result')
+            self.opts = result
+
+    # replacement constructor that raises ParseYTDLResult
+    def ytdl_init(ydl, ydl_opts):
+        super(YDL, ydl).__init__(ydl_opts)
+        raise ParseYTDLResult(ydl_opts)
+
+    # patch in the constructor
+    YDL.__init__ = MethodType(ytdl_init, YDL)
+
+    # core parser
+    def parsed_options(argv):
+        try:
+            youtube_dl._real_main(list(argv))
+        except ParseYTDLResult as result:
+            return result.opts
+
+    # from https://github.com/yt-dlp/yt-dlp/issues/5859#issuecomment-1363938900
+    default = parsed_options([])
+    diff = dict((k, v) for k, v in parsed_options(opts).items() if default[k] != v)
+    if 'postprocessors' in diff:
+        diff['postprocessors'] = [pp for pp in diff['postprocessors'] if pp not in default['postprocessors']]
+    return diff
+
+
+def main():
+    from pprint import pprint
+    pprint(cli_to_api(*sys.argv))
+
+
+if __name__ == '__main__':
+    main()
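The script above captures parsed options by replacing a constructor with one that raises an exception carrying its argument. The same idea can be shown without youtube_dl. This is a sketch under stated assumptions: `Engine`, `run_cli`, `cli_to_opts`, and `changed_opts` are hypothetical names invented for illustration, and the tiny argument parser stands in for the real option parser.

```python
class Engine(object):
    """Hypothetical stand-in for an object that is costly to construct."""

    def __init__(self, opts):
        raise RuntimeError('the real constructor should not run in this sketch')


def run_cli(argv):
    """Hypothetical entry point: turn argv into an options dict, then build Engine."""
    opts = {'format': None, 'quiet': False}
    args = list(argv)
    while args:
        arg = args.pop(0)
        if arg == '-f':
            opts['format'] = args.pop(0)
        elif arg == '-q':
            opts['quiet'] = True
    Engine(opts)


class _Captured(Exception):
    """Carries the options out of the patched constructor."""

    def __init__(self, opts):
        super(_Captured, self).__init__('captured')
        self.opts = opts


def cli_to_opts(argv):
    def fake_init(self, opts):
        raise _Captured(opts)

    original = Engine.__init__
    Engine.__init__ = fake_init  # patch in the replacement constructor
    try:
        run_cli(argv)
    except _Captured as exc:
        return exc.opts
    finally:
        Engine.__init__ = original  # unlike the script, undo the patch


def changed_opts(argv):
    """Keep only the options that differ from the defaults."""
    default = cli_to_opts([])
    return dict((k, v) for k, v in cli_to_opts(argv).items() if default[k] != v)
```

Restoring the constructor in `finally` is a choice made for this sketch so the class remains usable afterwards; a one-shot developer script may not need it.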
@ -35,13 +35,13 @@ class InfoExtractorTestRequestHandler(compat_http_server.BaseHTTPRequestHandler)
|
|||||||
assert False
|
assert False
|
||||||
|
|
||||||
|
|
||||||
class TestIE(InfoExtractor):
|
class DummyIE(InfoExtractor):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
|
|
||||||
class TestInfoExtractor(unittest.TestCase):
|
class TestInfoExtractor(unittest.TestCase):
|
||||||
def setUp(self):
|
def setUp(self):
|
||||||
self.ie = TestIE(FakeYDL())
|
self.ie = DummyIE(FakeYDL())
|
||||||
|
|
||||||
def test_ie_key(self):
|
def test_ie_key(self):
|
||||||
self.assertEqual(get_info_extractor(YoutubeIE.ie_key()), YoutubeIE)
|
self.assertEqual(get_info_extractor(YoutubeIE.ie_key()), YoutubeIE)
|
||||||
@ -62,6 +62,7 @@ class TestInfoExtractor(unittest.TestCase):
|
|||||||
<meta name="og:test1" content='foo > < bar'/>
|
<meta name="og:test1" content='foo > < bar'/>
|
||||||
<meta name="og:test2" content="foo >//< bar"/>
|
<meta name="og:test2" content="foo >//< bar"/>
|
||||||
<meta property=og-test3 content='Ill-formatted opengraph'/>
|
<meta property=og-test3 content='Ill-formatted opengraph'/>
|
||||||
|
<meta property=og:test4 content=unquoted-value/>
|
||||||
'''
|
'''
|
||||||
self.assertEqual(ie._og_search_title(html), 'Foo')
|
self.assertEqual(ie._og_search_title(html), 'Foo')
|
||||||
self.assertEqual(ie._og_search_description(html), 'Some video\'s description ')
|
self.assertEqual(ie._og_search_description(html), 'Some video\'s description ')
|
||||||
@ -74,6 +75,7 @@ class TestInfoExtractor(unittest.TestCase):
|
|||||||
self.assertEqual(ie._og_search_property(('test0', 'test1'), html), 'foo > < bar')
|
self.assertEqual(ie._og_search_property(('test0', 'test1'), html), 'foo > < bar')
|
||||||
self.assertRaises(RegexNotFoundError, ie._og_search_property, 'test0', html, None, fatal=True)
|
self.assertRaises(RegexNotFoundError, ie._og_search_property, 'test0', html, None, fatal=True)
|
||||||
self.assertRaises(RegexNotFoundError, ie._og_search_property, ('test0', 'test00'), html, None, fatal=True)
|
self.assertRaises(RegexNotFoundError, ie._og_search_property, ('test0', 'test00'), html, None, fatal=True)
|
||||||
|
self.assertEqual(ie._og_search_property('test4', html), 'unquoted-value')
|
||||||
|
|
||||||
def test_html_search_meta(self):
|
def test_html_search_meta(self):
|
||||||
ie = self.ie
|
ie = self.ie
|
||||||
|
@ -148,6 +148,7 @@ def generator(test_case, tname):
|
|||||||
try_rm(tc_filename)
|
try_rm(tc_filename)
|
||||||
try_rm(tc_filename + '.part')
|
try_rm(tc_filename + '.part')
|
||||||
try_rm(os.path.splitext(tc_filename)[0] + '.info.json')
|
try_rm(os.path.splitext(tc_filename)[0] + '.info.json')
|
||||||
|
|
||||||
try_rm_tcs_files()
|
try_rm_tcs_files()
|
||||||
try:
|
try:
|
||||||
try_num = 1
|
try_num = 1
|
||||||
@ -213,7 +214,15 @@ def generator(test_case, tname):
|
|||||||
# First, check test cases' data against extracted data alone
|
# First, check test cases' data against extracted data alone
|
||||||
expect_info_dict(self, tc_res_dict, tc.get('info_dict', {}))
|
expect_info_dict(self, tc_res_dict, tc.get('info_dict', {}))
|
||||||
# Now, check downloaded file consistency
|
# Now, check downloaded file consistency
|
||||||
|
# support test-case with volatile ID, signalled by regexp value
|
||||||
|
if tc.get('info_dict', {}).get('id', '').startswith('re:'):
|
||||||
|
test_id = tc['info_dict']['id']
|
||||||
|
tc['info_dict']['id'] = tc_res_dict['id']
|
||||||
|
else:
|
||||||
|
test_id = None
|
||||||
tc_filename = get_tc_filename(tc)
|
tc_filename = get_tc_filename(tc)
|
||||||
|
if test_id:
|
||||||
|
tc['info_dict']['id'] = test_id
|
||||||
if not test_case.get('params', {}).get('skip_download', False):
|
if not test_case.get('params', {}).get('skip_download', False):
|
||||||
self.assertTrue(os.path.exists(tc_filename), msg='Missing file ' + tc_filename)
|
self.assertTrue(os.path.exists(tc_filename), msg='Missing file ' + tc_filename)
|
||||||
self.assertTrue(tc_filename in finished_hook_called)
|
self.assertTrue(tc_filename in finished_hook_called)
|
||||||
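The hunk above temporarily substitutes the extracted ID for an expected `id` that starts with `re:`, computes the filename, and then puts the pattern back. A minimal sketch of that swap-and-restore step follows; `matches_expected` and `with_actual_id` are hypothetical helpers, and `make_filename` stands in for `get_tc_filename`.

```python
import re


def matches_expected(expected, got):
    """Compare a value with an expectation that may be a 're:'-prefixed pattern."""
    if isinstance(expected, str) and expected.startswith('re:'):
        return re.match(expected[len('re:'):], got) is not None
    return expected == got


def with_actual_id(test_case, actual_id, make_filename):
    """Call make_filename(test_case) with a regexp-valued id swapped for the real one."""
    info = test_case.setdefault('info_dict', {})
    expected_id = info.get('id', '')
    volatile = isinstance(expected_id, str) and expected_id.startswith('re:')
    if volatile:
        info['id'] = actual_id
    try:
        return make_filename(test_case)
    finally:
        if volatile:
            info['id'] = expected_id  # restore the pattern for later checks
```

The `try`/`finally` here is an addition for the sketch; it guarantees the pattern is restored even if the filename computation raises.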
|
@ -139,21 +139,16 @@ class TestJSInterpreter(unittest.TestCase):
|
|||||||
self.assertTrue(math.isnan(jsi.call_function('x')))
|
self.assertTrue(math.isnan(jsi.call_function('x')))
|
||||||
|
|
||||||
def test_Date(self):
|
def test_Date(self):
|
||||||
jsi = JSInterpreter('''
|
|
||||||
function x() { return new Date('Wednesday 31 December 1969 18:01:26 MDT') - 0; }
|
|
||||||
''')
|
|
||||||
self.assertEqual(jsi.call_function('x'), 86000)
|
|
||||||
|
|
||||||
jsi = JSInterpreter('''
|
jsi = JSInterpreter('''
|
||||||
function x(dt) { return new Date(dt) - 0; }
|
function x(dt) { return new Date(dt) - 0; }
|
||||||
''')
|
''')
|
||||||
self.assertEqual(jsi.call_function('x', 'Wednesday 31 December 1969 18:01:26 MDT'), 86000)
|
self.assertEqual(jsi.call_function('x', 'Wednesday 31 December 1969 18:01:26 MDT'), 86000)
|
||||||
|
|
||||||
# date format m/d/y
|
# date format m/d/y
|
||||||
jsi = JSInterpreter('''
|
self.assertEqual(jsi.call_function('x', '12/31/1969 18:01:26 MDT'), 86000)
|
||||||
function x() { return new Date('12/31/1969 18:01:26 MDT') - 0; }
|
|
||||||
''')
|
# epoch 0
|
||||||
self.assertEqual(jsi.call_function('x'), 86000)
|
self.assertEqual(jsi.call_function('x', '1 January 1970 00:00:00 UTC'), 0)
|
||||||
|
|
||||||
def test_call(self):
|
def test_call(self):
|
||||||
jsi = JSInterpreter('''
|
jsi = JSInterpreter('''
|
||||||
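The expected value `86000` in `test_Date` follows from the timezone arithmetic: MDT is UTC-6, so 18:01:26 MDT on 31 December 1969 is 00:01:26 UTC on 1 January 1970, which is 86 seconds after the epoch. The check below reproduces this with Python's `datetime`; `js_time_ms` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

MDT = timezone(timedelta(hours=-6), 'MDT')  # Mountain Daylight Time, UTC-6
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def js_time_ms(dt):
    """Milliseconds since the Unix epoch, as JavaScript's `date - 0` yields."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


ms = js_time_ms(datetime(1969, 12, 31, 18, 1, 26, tzinfo=MDT))
```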
@ -445,7 +440,7 @@ class TestJSInterpreter(unittest.TestCase):
|
|||||||
self.assertIs(jsi.call_function('x'), None)
|
self.assertIs(jsi.call_function('x'), None)
|
||||||
|
|
||||||
jsi = JSInterpreter('''
|
jsi = JSInterpreter('''
|
||||||
function x() { let a=/,,[/,913,/](,)}/; return a; }
|
function x() { let a=/,,[/,913,/](,)}/; "".replace(a, ""); return a; }
|
||||||
''')
|
''')
|
||||||
attrs = set(('findall', 'finditer', 'flags', 'groupindex',
|
attrs = set(('findall', 'finditer', 'flags', 'groupindex',
|
||||||
'groups', 'match', 'pattern', 'scanner',
|
'groups', 'match', 'pattern', 'scanner',
|
||||||
@ -457,6 +452,31 @@ class TestJSInterpreter(unittest.TestCase):
|
|||||||
''')
|
''')
|
||||||
self.assertEqual(jsi.call_function('x').flags & ~re.U, re.I)
|
self.assertEqual(jsi.call_function('x').flags & ~re.U, re.I)
|
||||||
|
|
||||||
|
jsi = JSInterpreter(r'''
|
||||||
|
function x() { let a="data-name".replace("data-", ""); return a }
|
||||||
|
''')
|
||||||
|
self.assertEqual(jsi.call_function('x'), 'name')
|
||||||
|
|
||||||
|
jsi = JSInterpreter(r'''
|
||||||
|
function x() { let a="data-name".replace(new RegExp("^.+-"), ""); return a; }
|
||||||
|
''')
|
||||||
|
self.assertEqual(jsi.call_function('x'), 'name')
|
||||||
|
|
||||||
|
jsi = JSInterpreter(r'''
|
||||||
|
function x() { let a="data-name".replace(/^.+-/, ""); return a; }
|
||||||
|
''')
|
||||||
|
self.assertEqual(jsi.call_function('x'), 'name')
|
||||||
|
|
||||||
|
jsi = JSInterpreter(r'''
|
||||||
|
function x() { let a="data-name".replace(/a/g, "o"); return a; }
|
||||||
|
''')
|
||||||
|
self.assertEqual(jsi.call_function('x'), 'doto-nome')
|
||||||
|
|
||||||
|
jsi = JSInterpreter(r'''
|
||||||
|
function x() { let a="data-name".replaceAll("a", "o"); return a; }
|
||||||
|
''')
|
||||||
|
self.assertEqual(jsi.call_function('x'), 'doto-nome')
|
||||||
|
|
||||||
jsi = JSInterpreter(r'''
|
jsi = JSInterpreter(r'''
|
||||||
function x() { let a=[/[)\\]/]; return a[0]; }
|
function x() { let a=[/[)\\]/]; return a[0]; }
|
||||||
''')
|
''')
|
||||||
@ -485,6 +505,12 @@ class TestJSInterpreter(unittest.TestCase):
|
|||||||
jsi = JSInterpreter('function x(){return 1236566549 << 5}')
|
jsi = JSInterpreter('function x(){return 1236566549 << 5}')
|
||||||
self.assertEqual(jsi.call_function('x'), 915423904)
|
self.assertEqual(jsi.call_function('x'), 915423904)
|
||||||
|
|
||||||
|
""" # fails so far
|
||||||
|
def test_packed(self):
|
||||||
|
jsi = JSInterpreter('''function x(p,a,c,k,e,d){while(c--)if(k[c])p=p.replace(new RegExp('\\b'+c.toString(a)+'\\b','g'),k[c]);return p}''')
|
||||||
|
self.assertEqual(jsi.call_function('x', '''h 7=g("1j");7.7h({7g:[{33:"w://7f-7e-7d-7c.v.7b/7a/79/78/77/76.74?t=73&s=2s&e=72&f=2t&71=70.0.0.1&6z=6y&6x=6w"}],6v:"w://32.v.u/6u.31",16:"r%",15:"r%",6t:"6s",6r:"",6q:"l",6p:"l",6o:"6n",6m:\'6l\',6k:"6j",9:[{33:"/2u?b=6i&n=50&6h=w://32.v.u/6g.31",6f:"6e"}],1y:{6d:1,6c:\'#6b\',6a:\'#69\',68:"67",66:30,65:r,},"64":{63:"%62 2m%m%61%5z%5y%5x.u%5w%5v%5u.2y%22 2k%m%1o%22 5t%m%1o%22 5s%m%1o%22 2j%m%5r%22 16%m%5q%22 15%m%5p%22 5o%2z%5n%5m%2z",5l:"w://v.u/d/1k/5k.2y",5j:[]},\'5i\':{"5h":"5g"},5f:"5e",5d:"w://v.u",5c:{},5b:l,1x:[0.25,0.50,0.75,1,1.25,1.5,2]});h 1m,1n,5a;h 59=0,58=0;h 7=g("1j");h 2x=0,57=0,56=0;$.55({54:{\'53-52\':\'2i-51\'}});7.j(\'4z\',6(x){c(5>0&&x.1l>=5&&1n!=1){1n=1;$(\'q.4y\').4x(\'4w\')}});7.j(\'13\',6(x){2x=x.1l});7.j(\'2g\',6(x){2w(x)});7.j(\'4v\',6(){$(\'q.2v\').4u()});6 2w(x){$(\'q.2v\').4t();c(1m)19;1m=1;17=0;c(4s.4r===l){17=1}$.4q(\'/2u?b=4p&2l=1k&4o=2t-4n-4m-2s-4l&4k=&4j=&4i=&17=\'+17,6(2r){$(\'#4h\').4g(2r)});$(\'.3-8-4f-4e:4d("4c")\').2h(6(e){2q();g().4b(0);g().4a(l)});6 2q(){h $14=$("<q />").2p({1l:"49",16:"r%",15:"r%",48:0,2n:0,2o:47,46:"45(10%, 10%, 10%, 0.4)","44-43":"42"});$("<41 />").2p({16:"60%",15:"60%",2o:40,"3z-2n":"3y"}).3x({\'2m\':\'/?b=3w&2l=1k\',\'2k\':\'0\',\'2j\':\'2i\'}).2f($14);$14.2h(6(){$(3v).3u();g().2g()});$14.2f($(\'#1j\'))}g().13(0);}6 3t(){h 9=7.1b(2e);2d.2c(9);c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==2e){2d.2c(\'!!=\'+i);7.1p(i)}}}}7.j(\'3s\',6(){g().1h("/2a/3r.29","3q 10 28",6(){g().13(g().27()+10)},"2b");$("q[26=2b]").23().21(\'.3-20-1z\');g().1h("/2a/3p.29","3o 10 28",6(){h 12=g().27()-10;c(12<0)12=0;g().13(12)},"24");$("q[26=24]").23().21(\'.3-20-1z\');});6 1i(){}7.j(\'3n\',6(){1i()});7.j(\'3m\',6(){1i()});7.j("k",6(y){h 9=7.1b();c(9.n<2)19;$(\'.3-8-3l-3k\').3j(6(){$(\'#3-8-a-k\').1e(\'3-8-a-z\');$(\'.3-a-k\').p(\'o-1f\',\'11\')});7.1h("/3i/3h.3g","3f 3e",6(){$(\'.3-1w\').3d(\'3-8-1v\');$(\'.3-8-1y, 
.3-8-1x\').p(\'o-1g\',\'11\');c($(\'.3-1w\').3c(\'3-8-1v\')){$(\'.3-a-k\').p(\'o-1g\',\'l\');$(\'.3-a-k\').p(\'o-1f\',\'l\');$(\'.3-8-a\').1e(\'3-8-a-z\');$(\'.3-8-a:1u\').3b(\'3-8-a-z\')}3a{$(\'.3-a-k\').p(\'o-1g\',\'11\');$(\'.3-a-k\').p(\'o-1f\',\'11\');$(\'.3-8-a:1u\').1e(\'3-8-a-z\')}},"39");7.j("38",6(y){1d.37(\'1c\',y.9[y.36].1a)});c(1d.1t(\'1c\')){35("1s(1d.1t(\'1c\'));",34)}});h 18;6 1s(1q){h 9=7.1b();c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==1q){c(i==18){19}18=i;7.1p(i)}}}}',36,270,'|||jw|||function|player|settings|tracks|submenu||if||||jwplayer|var||on|audioTracks|true|3D|length|aria|attr|div|100|||sx|filemoon|https||event|active||false|tt|seek|dd|height|width|adb|current_audio|return|name|getAudioTracks|default_audio|localStorage|removeClass|expanded|checked|addButton|callMeMaybe|vplayer|0fxcyc2ajhp1|position|vvplay|vvad|220|setCurrentAudioTrack|audio_name|for|audio_set|getItem|last|open|controls|playbackRates|captions|rewind|icon|insertAfter||detach|ff00||button|getPosition|sec|png|player8|ff11|log|console|track_name|appendTo|play|click|no|scrolling|frameborder|file_code|src|top|zIndex|css|showCCform|data|1662367683|383371|dl|video_ad|doPlay|prevt|mp4|3E||jpg|thumbs|file|300|setTimeout|currentTrack|setItem|audioTrackChanged|dualSound|else|addClass|hasClass|toggleClass|Track|Audio|svg|dualy|images|mousedown|buttons|topbar|playAttemptFailed|beforePlay|Rewind|fr|Forward|ff|ready|set_audio_track|remove|this|upload_srt|prop|50px|margin|1000001|iframe|center|align|text|rgba|background|1000000|left|absolute|pause|setCurrentCaptions|Upload|contains|item|content|html|fviews|referer|prem|embed|3e57249ef633e0d03bf76ceb8d8a4b65|216|83|hash|view|get|TokenZir|window|hide|show|complete|slow|fadeIn|video_ad_fadein|time||cache|Cache|Content|headers|ajaxSetup|v2done|tott|vastdone2|vastdone1|vvbefore|playbackRateControls|cast|aboutlink|FileMoon|abouttext|UHD|1870|qualityLabels|sites|GNOME_POWER|link|2Fiframe|3C|allowfullscreen|22360|22640|22no|marginheight|marginwidth|2FGNOME
_POWER|2F0fxcyc2ajhp1|2Fe|2Ffilemoon|2F|3A||22https|3Ciframe|code|sharing|fontOpacity|backgroundOpacity|Tahoma|fontFamily|303030|backgroundColor|FFFFFF|color|userFontScale|thumbnails|kind|0fxcyc2ajhp10000|url|get_slides|start|startparam|none|preload|html5|primary|hlshtml|androidhls|duration|uniform|stretching|0fxcyc2ajhp1_xt|image|2048|sp|6871|asn|127|srv|43200|_g3XlBcu2lmD9oDexD2NLWSmah2Nu3XcDrl93m9PwXY|m3u8||master|0fxcyc2ajhp1_x|00076|01|hls2|to|s01|delivery|storage|moon|sources|setup'''.split('|')))
|
||||||
|
"""
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
unittest.main()
|
unittest.main()
|
||||||
|
@ -250,6 +250,7 @@ class TestUtil(unittest.TestCase):
|
|||||||
self.assertEqual(sanitize_url('httpss://foo.bar'), 'https://foo.bar')
|
self.assertEqual(sanitize_url('httpss://foo.bar'), 'https://foo.bar')
|
||||||
self.assertEqual(sanitize_url('rmtps://foo.bar'), 'rtmps://foo.bar')
|
self.assertEqual(sanitize_url('rmtps://foo.bar'), 'rtmps://foo.bar')
|
||||||
self.assertEqual(sanitize_url('https://foo.bar'), 'https://foo.bar')
|
self.assertEqual(sanitize_url('https://foo.bar'), 'https://foo.bar')
|
||||||
|
self.assertEqual(sanitize_url('foo bar'), 'foo bar')
|
||||||
|
|
||||||
def test_expand_path(self):
|
def test_expand_path(self):
|
||||||
def env(var):
|
def env(var):
|
||||||
|
@ -67,6 +67,10 @@ _SIG_TESTS = [
|
|||||||
]
|
]
|
||||||
|
|
||||||
_NSIG_TESTS = [
|
_NSIG_TESTS = [
|
||||||
|
(
|
||||||
|
'https://www.youtube.com/s/player/7862ca1f/player_ias.vflset/en_US/base.js',
|
||||||
|
'X_LCxVDjAavgE5t', 'yxJ1dM6iz5ogUg',
|
||||||
|
),
|
||||||
(
|
(
|
||||||
'https://www.youtube.com/s/player/9216d1f7/player_ias.vflset/en_US/base.js',
|
'https://www.youtube.com/s/player/9216d1f7/player_ias.vflset/en_US/base.js',
|
||||||
'SLp9F5bwjAdhE9F-', 'gWnb9IK2DJ8Q1w',
|
'SLp9F5bwjAdhE9F-', 'gWnb9IK2DJ8Q1w',
|
||||||
|
@ -39,6 +39,7 @@ from .compat import (
|
|||||||
compat_str,
|
compat_str,
|
||||||
compat_tokenize_tokenize,
|
compat_tokenize_tokenize,
|
||||||
compat_urllib_error,
|
compat_urllib_error,
|
||||||
|
compat_urllib_parse,
|
||||||
compat_urllib_request,
|
compat_urllib_request,
|
||||||
compat_urllib_request_DataHandler,
|
compat_urllib_request_DataHandler,
|
||||||
)
|
)
|
||||||
@ -60,6 +61,7 @@ from .utils import (
|
|||||||
format_bytes,
|
format_bytes,
|
||||||
formatSeconds,
|
formatSeconds,
|
||||||
GeoRestrictedError,
|
GeoRestrictedError,
|
||||||
|
HEADRequest,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
ISO3166Utils,
|
ISO3166Utils,
|
||||||
locked_file,
|
locked_file,
|
||||||
@ -74,6 +76,7 @@ from .utils import (
|
|||||||
preferredencoding,
|
preferredencoding,
|
||||||
prepend_extension,
|
prepend_extension,
|
||||||
process_communicate_or_kill,
|
process_communicate_or_kill,
|
||||||
|
PUTRequest,
|
||||||
register_socks_protocols,
|
register_socks_protocols,
|
||||||
render_table,
|
render_table,
|
||||||
replace_extension,
|
replace_extension,
|
||||||
@ -2297,6 +2300,27 @@ class YoutubeDL(object):
         """ Start an HTTP download """
         if isinstance(req, compat_basestring):
             req = sanitized_Request(req)
+        # an embedded /../ sequence is not automatically handled by urllib2
+        # see https://github.com/yt-dlp/yt-dlp/issues/3355
+        url = req.get_full_url()
+        parts = url.partition('/../')
+        if parts[1]:
+            url = compat_urllib_parse.urljoin(parts[0] + parts[1][:1], parts[1][1:] + parts[2])
+        if url:
+            # worse, URL path may have initial /../ against RFCs: work-around
+            # by stripping such prefixes, like eg Firefox
+            parts = compat_urllib_parse.urlsplit(url)
+            path = parts.path
+            while path.startswith('/../'):
+                path = path[3:]
+            url = parts._replace(path=path).geturl()
+        # get a new Request with the munged URL
+        if url != req.get_full_url():
+            req_type = {'HEAD': HEADRequest, 'PUT': PUTRequest}.get(
+                req.get_method(), compat_urllib_request.Request)
+            req = req_type(
+                url, data=req.data, headers=dict(req.header_items()),
+                origin_req_host=req.origin_req_host, unverifiable=req.unverifiable)
         return self._opener.open(req, timeout=self._socket_timeout)
 
     def print_debug_header(self):
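The URL handling added to `urlopen` in this hunk can be tried in isolation. The sketch below restates the same two steps (resolve an embedded `/../` with `urljoin`, then strip any leading `/../` from the path) using Python 3's `urllib.parse` directly instead of the `compat_urllib_parse` shim; the function name `collapse_dot_segments` and the example URLs are assumptions made for illustration.

```python
from urllib.parse import urljoin, urlsplit


def collapse_dot_segments(url):
    """Resolve '/../' sequences in a URL, following the approach in the hunk."""
    head, sep, tail = url.partition('/../')
    if sep:
        # Treat everything before '/../' as a base directory and the rest
        # as a relative reference starting with '../'.
        url = urljoin(head + '/', '../' + tail)
    if url:
        parts = urlsplit(url)
        path = parts.path
        # A path must not start with '/../'; drop such prefixes.
        while path.startswith('/../'):
            path = path[3:]
        url = parts._replace(path=path).geturl()
    return url
```

For example, `collapse_dot_segments('https://example.com/a/b/../c.mp4')` yields `'https://example.com/a/c.mp4'`, and a URL without any `/../` is returned unchanged.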
|
@ -88,17 +88,21 @@ class FileDownloader(object):
             return '---.-%'
         return '%6s' % ('%3.1f%%' % percent)
 
-    @staticmethod
-    def calc_eta(start, now, total, current):
+    @classmethod
+    def calc_eta(cls, start_or_rate, now_or_remaining, *args):
+        if len(args) < 2:
+            rate, remaining = (start_or_rate, now_or_remaining)
+            if None in (rate, remaining):
+                return None
+            return int(float(remaining) / rate)
+        start, now = (start_or_rate, now_or_remaining)
+        total, current = args
         if total is None:
             return None
         if now is None:
             now = time.time()
-        dif = now - start
-        if current == 0 or dif < 0.001:  # One millisecond
-            return None
-        rate = float(current) / dif
-        return int((float(total) - float(current)) / rate)
+        rate = cls.calc_speed(start, now, current)
+        return rate and int((float(total) - float(current)) / rate)
 
     @staticmethod
     def format_eta(eta):
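After this change `calc_eta` accepts two call forms: `(rate, remaining)` and the original `(start, now, total, current)`. The standalone sketch below mirrors that dispatch with plain functions. `calc_speed` is not shown in this diff, so the version here is an assumption modelled on the lines the hunk removes (no rate when nothing has been downloaded or less than a millisecond has elapsed).

```python
import time


def calc_speed(start, now, downloaded):
    # Assumed behaviour, based on the logic removed from calc_eta.
    elapsed = now - start
    if downloaded == 0 or elapsed < 0.001:  # one millisecond
        return None
    return float(downloaded) / elapsed


def calc_eta(start_or_rate, now_or_remaining, *args):
    if len(args) < 2:
        # Two-argument form: (rate in bytes/s, remaining bytes)
        rate, remaining = start_or_rate, now_or_remaining
        if None in (rate, remaining):
            return None
        return int(float(remaining) / rate)
    # Four-argument form: (start time, current time, total bytes, bytes so far)
    start, now = start_or_rate, now_or_remaining
    total, current = args
    if total is None:
        return None
    if now is None:
        now = time.time()
    rate = calc_speed(start, now, current)
    return rate and int((float(total) - float(current)) / rate)
```

Under these assumptions `calc_eta(100.0, 500)` gives `5`, and `calc_eta(0.0, 10.0, 1000, 250)` gives `30` (a rate of 25 bytes/s with 750 bytes left). As in the hunk, the two-argument form does not guard against a rate of zero.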
@ -123,6 +127,12 @@ class FileDownloader(object):
|
|||||||
def format_retries(retries):
|
def format_retries(retries):
|
||||||
return 'inf' if retries == float('inf') else '%.0f' % retries
|
return 'inf' if retries == float('inf') else '%.0f' % retries
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def filesize_or_none(unencoded_filename):
|
||||||
|
fn = encodeFilename(unencoded_filename)
|
||||||
|
if os.path.isfile(fn):
|
||||||
|
return os.path.getsize(fn)
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def best_block_size(elapsed_time, bytes):
|
def best_block_size(elapsed_time, bytes):
|
||||||
new_min = max(bytes / 2.0, 1.0)
|
new_min = max(bytes / 2.0, 1.0)
|
||||||
|
@ -38,8 +38,7 @@ class DashSegmentsFD(FragmentFD):
|
|||||||
# In DASH, the first segment contains necessary headers to
|
# In DASH, the first segment contains necessary headers to
|
||||||
# generate a valid MP4 file, so always abort for the first segment
|
# generate a valid MP4 file, so always abort for the first segment
|
||||||
fatal = i == 0 or not skip_unavailable_fragments
|
fatal = i == 0 or not skip_unavailable_fragments
|
||||||
count = 0
|
for count in range(fragment_retries + 1):
|
||||||
while count <= fragment_retries:
|
|
||||||
try:
|
try:
|
||||||
fragment_url = fragment.get('url')
|
fragment_url = fragment.get('url')
|
||||||
if not fragment_url:
|
if not fragment_url:
|
||||||
@ -57,9 +56,8 @@ class DashSegmentsFD(FragmentFD):
|
|||||||
# is usually enough) thus allowing to download the whole file successfully.
|
# is usually enough) thus allowing to download the whole file successfully.
|
||||||
# To be future-proof we will retry all fragments that fail with any
|
# To be future-proof we will retry all fragments that fail with any
|
||||||
# HTTP error.
|
# HTTP error.
|
||||||
count += 1
|
if count < fragment_retries:
|
||||||
if count <= fragment_retries:
|
self.report_retry_fragment(err, frag_index, count + 1, fragment_retries)
|
||||||
self.report_retry_fragment(err, frag_index, count, fragment_retries)
|
|
||||||
except DownloadError:
|
except DownloadError:
|
||||||
# Don't retry fragment if error occurred during HTTP downloading
|
# Don't retry fragment if error occurred during HTTP downloading
|
||||||
# itself since it has own retry settings
|
# itself since it has own retry settings
|
||||||
@@ -68,7 +66,7 @@ class DashSegmentsFD(FragmentFD):
|
|||||||
break
|
break
|
||||||
raise
|
raise
|
||||||
|
|
||||||
if count > fragment_retries:
|
if count >= fragment_retries:
|
||||||
if not fatal:
|
if not fatal:
|
||||||
self.report_skip_fragment(frag_index)
|
self.report_skip_fragment(frag_index)
|
||||||
continue
|
continue
|
||||||
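The hunks above turn the fragment retry loop from a manually incremented `while` into `for count in range(fragment_retries + 1)`. The sketch below shows a bounded retry loop of the same general shape. It is not the code from this commit: `download_with_retries`, `fetch` and `on_retry` are hypothetical names, `IOError` stands in for the HTTP error caught in the original, and success is signalled by returning from the loop rather than by checking `count` afterwards.

```python
def download_with_retries(fetch, retries, on_retry=None):
    """Call fetch() up to retries + 1 times.

    Returns (True, result) on the first success, or (False, None) once
    every attempt has failed.
    """
    for count in range(retries + 1):
        try:
            result = fetch()
        except IOError as err:
            # Report a retry only when another attempt will actually follow.
            if count < retries and on_retry:
                on_retry(err, count + 1, retries)
        else:
            return True, result
    return False, None
```

Returning from inside the loop means a success on the final permitted attempt cannot be mistaken for exhaustion, which a post-loop comparison against the counter would otherwise have to handle.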
|
@@ -273,7 +273,7 @@ class HttpieFD(ExternalFD):
|
|||||||
class FFmpegFD(ExternalFD):
|
class FFmpegFD(ExternalFD):
|
||||||
@classmethod
|
@classmethod
|
||||||
def supports(cls, info_dict):
|
def supports(cls, info_dict):
|
||||||
return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps', 'm3u8', 'rtsp', 'rtmp', 'mms')
|
return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps', 'm3u8', 'rtsp', 'rtmp', 'mms', 'http_dash_segments')
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def available(cls):
|
def available(cls):
|
||||||
|
@@ -71,7 +71,7 @@ class FragmentFD(FileDownloader):
|
|||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def __do_ytdl_file(ctx):
|
def __do_ytdl_file(ctx):
|
||||||
return not ctx['live'] and not ctx['tmpfilename'] == '-'
|
return ctx['live'] is not True and ctx['tmpfilename'] != '-'
|
||||||
|
|
||||||
def _read_ytdl_file(self, ctx):
|
def _read_ytdl_file(self, ctx):
|
||||||
assert 'ytdl_corrupt' not in ctx
|
assert 'ytdl_corrupt' not in ctx
|
||||||
@@ -101,6 +101,13 @@ class FragmentFD(FileDownloader):
|
|||||||
'url': frag_url,
|
'url': frag_url,
|
||||||
'http_headers': headers or info_dict.get('http_headers'),
|
'http_headers': headers or info_dict.get('http_headers'),
|
||||||
}
|
}
|
||||||
|
frag_resume_len = 0
|
||||||
|
if ctx['dl'].params.get('continuedl', True):
|
||||||
|
frag_resume_len = self.filesize_or_none(
|
||||||
|
self.temp_name(fragment_filename))
|
||||||
|
fragment_info_dict['frag_resume_len'] = frag_resume_len
|
||||||
|
ctx['frag_resume_len'] = frag_resume_len or 0
|
||||||
|
|
||||||
success = ctx['dl'].download(fragment_filename, fragment_info_dict)
|
success = ctx['dl'].download(fragment_filename, fragment_info_dict)
|
||||||
if not success:
|
if not success:
|
||||||
return False, None
|
return False, None
|
||||||
@@ -124,9 +131,7 @@ class FragmentFD(FileDownloader):
|
|||||||
del ctx['fragment_filename_sanitized']
|
del ctx['fragment_filename_sanitized']
|
||||||
|
|
||||||
def _prepare_frag_download(self, ctx):
|
def _prepare_frag_download(self, ctx):
|
||||||
if 'live' not in ctx:
|
if not ctx.setdefault('live', False):
|
||||||
ctx['live'] = False
|
|
||||||
if not ctx['live']:
|
|
||||||
total_frags_str = '%d' % ctx['total_frags']
|
total_frags_str = '%d' % ctx['total_frags']
|
||||||
ad_frags = ctx.get('ad_frags', 0)
|
ad_frags = ctx.get('ad_frags', 0)
|
||||||
if ad_frags:
|
if ad_frags:
|
||||||
@@ -136,10 +141,11 @@ class FragmentFD(FileDownloader):
|
|||||||
self.to_screen(
|
self.to_screen(
|
||||||
'[%s] Total fragments: %s' % (self.FD_NAME, total_frags_str))
|
'[%s] Total fragments: %s' % (self.FD_NAME, total_frags_str))
|
||||||
self.report_destination(ctx['filename'])
|
self.report_destination(ctx['filename'])
|
||||||
|
continuedl = self.params.get('continuedl', True)
|
||||||
dl = HttpQuietDownloader(
|
dl = HttpQuietDownloader(
|
||||||
self.ydl,
|
self.ydl,
|
||||||
{
|
{
|
||||||
'continuedl': True,
|
'continuedl': continuedl,
|
||||||
'quiet': True,
|
'quiet': True,
|
||||||
'noprogress': True,
|
'noprogress': True,
|
||||||
'ratelimit': self.params.get('ratelimit'),
|
'ratelimit': self.params.get('ratelimit'),
|
||||||
@@ -150,12 +156,11 @@ class FragmentFD(FileDownloader):
|
|||||||
)
|
)
|
||||||
tmpfilename = self.temp_name(ctx['filename'])
|
tmpfilename = self.temp_name(ctx['filename'])
|
||||||
open_mode = 'wb'
|
open_mode = 'wb'
|
||||||
resume_len = 0
|
|
||||||
|
|
||||||
# Establish possible resume length
|
# Establish possible resume length
|
||||||
if os.path.isfile(encodeFilename(tmpfilename)):
|
resume_len = self.filesize_or_none(tmpfilename) or 0
|
||||||
|
if resume_len > 0:
|
||||||
open_mode = 'ab'
|
open_mode = 'ab'
|
||||||
resume_len = os.path.getsize(encodeFilename(tmpfilename))
|
|
||||||
|
|
||||||
# Should be initialized before ytdl file check
|
# Should be initialized before ytdl file check
|
||||||
ctx.update({
|
ctx.update({
|
||||||
@@ -164,7 +169,8 @@ class FragmentFD(FileDownloader):
|
|||||||
})
|
})
|
||||||
|
|
||||||
if self.__do_ytdl_file(ctx):
|
if self.__do_ytdl_file(ctx):
|
||||||
if os.path.isfile(encodeFilename(self.ytdl_filename(ctx['filename']))):
|
ytdl_file_exists = os.path.isfile(encodeFilename(self.ytdl_filename(ctx['filename'])))
|
||||||
|
if continuedl and ytdl_file_exists:
|
||||||
self._read_ytdl_file(ctx)
|
self._read_ytdl_file(ctx)
|
||||||
is_corrupt = ctx.get('ytdl_corrupt') is True
|
is_corrupt = ctx.get('ytdl_corrupt') is True
|
||||||
is_inconsistent = ctx['fragment_index'] > 0 and resume_len == 0
|
is_inconsistent = ctx['fragment_index'] > 0 and resume_len == 0
|
||||||
@@ -178,7 +184,12 @@ class FragmentFD(FileDownloader):
|
|||||||
if 'ytdl_corrupt' in ctx:
|
if 'ytdl_corrupt' in ctx:
|
||||||
del ctx['ytdl_corrupt']
|
del ctx['ytdl_corrupt']
|
||||||
self._write_ytdl_file(ctx)
|
self._write_ytdl_file(ctx)
|
||||||
|
|
||||||
else:
|
else:
|
||||||
|
if not continuedl:
|
||||||
|
if ytdl_file_exists:
|
||||||
|
self._read_ytdl_file(ctx)
|
||||||
|
ctx['fragment_index'] = resume_len = 0
|
||||||
self._write_ytdl_file(ctx)
|
self._write_ytdl_file(ctx)
|
||||||
assert ctx['fragment_index'] == 0
|
assert ctx['fragment_index'] == 0
|
||||||
|
|
||||||
@@ -209,6 +220,7 @@ class FragmentFD(FileDownloader):
|
|||||||
start = time.time()
|
start = time.time()
|
||||||
ctx.update({
|
ctx.update({
|
||||||
'started': start,
|
'started': start,
|
||||||
|
'fragment_started': start,
|
||||||
# Amount of fragment's bytes downloaded by the time of the previous
|
# Amount of fragment's bytes downloaded by the time of the previous
|
||||||
# frag progress hook invocation
|
# frag progress hook invocation
|
||||||
'prev_frag_downloaded_bytes': 0,
|
'prev_frag_downloaded_bytes': 0,
|
||||||
@@ -218,6 +230,9 @@ class FragmentFD(FileDownloader):
|
|||||||
if s['status'] not in ('downloading', 'finished'):
|
if s['status'] not in ('downloading', 'finished'):
|
||||||
return
|
return
|
||||||
|
|
||||||
|
if not total_frags and ctx.get('fragment_count'):
|
||||||
|
state['fragment_count'] = ctx['fragment_count']
|
||||||
|
|
||||||
time_now = time.time()
|
time_now = time.time()
|
||||||
state['elapsed'] = time_now - start
|
state['elapsed'] = time_now - start
|
||||||
frag_total_bytes = s.get('total_bytes') or 0
|
frag_total_bytes = s.get('total_bytes') or 0
|
||||||
@@ -232,16 +247,17 @@ class FragmentFD(FileDownloader):
|
|||||||
ctx['fragment_index'] = state['fragment_index']
|
ctx['fragment_index'] = state['fragment_index']
|
||||||
state['downloaded_bytes'] += frag_total_bytes - ctx['prev_frag_downloaded_bytes']
|
state['downloaded_bytes'] += frag_total_bytes - ctx['prev_frag_downloaded_bytes']
|
||||||
ctx['complete_frags_downloaded_bytes'] = state['downloaded_bytes']
|
ctx['complete_frags_downloaded_bytes'] = state['downloaded_bytes']
|
||||||
|
ctx['speed'] = state['speed'] = self.calc_speed(
|
||||||
|
ctx['fragment_started'], time_now, frag_total_bytes)
|
||||||
|
ctx['fragment_started'] = time.time()
|
||||||
ctx['prev_frag_downloaded_bytes'] = 0
|
ctx['prev_frag_downloaded_bytes'] = 0
|
||||||
else:
|
else:
|
||||||
frag_downloaded_bytes = s['downloaded_bytes']
|
frag_downloaded_bytes = s['downloaded_bytes']
|
||||||
state['downloaded_bytes'] += frag_downloaded_bytes - ctx['prev_frag_downloaded_bytes']
|
state['downloaded_bytes'] += frag_downloaded_bytes - ctx['prev_frag_downloaded_bytes']
|
||||||
|
ctx['speed'] = state['speed'] = self.calc_speed(
|
||||||
|
ctx['fragment_started'], time_now, frag_downloaded_bytes - ctx['frag_resume_len'])
|
||||||
if not ctx['live']:
|
if not ctx['live']:
|
||||||
state['eta'] = self.calc_eta(
|
state['eta'] = self.calc_eta(state['speed'], estimated_size - state['downloaded_bytes'])
|
||||||
start, time_now, estimated_size - resume_len,
|
|
||||||
state['downloaded_bytes'] - resume_len)
|
|
||||||
state['speed'] = s.get('speed') or ctx.get('speed')
|
|
||||||
ctx['speed'] = state['speed']
|
|
||||||
ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes
|
ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes
|
||||||
self._hook_progress(state)
|
self._hook_progress(state)
|
||||||
|
|
||||||
@@ -268,7 +284,7 @@ class FragmentFD(FileDownloader):
|
|||||||
os.utime(ctx['filename'], (time.time(), filetime))
|
os.utime(ctx['filename'], (time.time(), filetime))
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
pass
|
||||||
downloaded_bytes = os.path.getsize(encodeFilename(ctx['filename']))
|
downloaded_bytes = self.filesize_or_none(ctx['filename']) or 0
|
||||||
|
|
||||||
self._hook_progress({
|
self._hook_progress({
|
||||||
'downloaded_bytes': downloaded_bytes,
|
'downloaded_bytes': downloaded_bytes,
|
||||||
|
@@ -58,9 +58,9 @@ class HttpFD(FileDownloader):
|
|||||||
|
|
||||||
if self.params.get('continuedl', True):
|
if self.params.get('continuedl', True):
|
||||||
# Establish possible resume length
|
# Establish possible resume length
|
||||||
if os.path.isfile(encodeFilename(ctx.tmpfilename)):
|
ctx.resume_len = info_dict.get('frag_resume_len')
|
||||||
ctx.resume_len = os.path.getsize(
|
if ctx.resume_len is None:
|
||||||
encodeFilename(ctx.tmpfilename))
|
ctx.resume_len = self.filesize_or_none(ctx.tmpfilename) or 0
|
||||||
|
|
||||||
ctx.is_resume = ctx.resume_len > 0
|
ctx.is_resume = ctx.resume_len > 0
|
||||||
|
|
||||||
@@ -115,9 +115,9 @@ class HttpFD(FileDownloader):
|
|||||||
raise RetryDownload(err)
|
raise RetryDownload(err)
|
||||||
raise err
|
raise err
|
||||||
# When trying to resume, Content-Range HTTP header of response has to be checked
|
# When trying to resume, Content-Range HTTP header of response has to be checked
|
||||||
# to match the value of requested Range HTTP header. This is due to a webservers
|
# to match the value of requested Range HTTP header. This is due to webservers
|
||||||
# that don't support resuming and serve a whole file with no Content-Range
|
# that don't support resuming and serve a whole file with no Content-Range
|
||||||
# set in response despite of requested Range (see
|
# set in response despite requested Range (see
|
||||||
# https://github.com/ytdl-org/youtube-dl/issues/6057#issuecomment-126129799)
|
# https://github.com/ytdl-org/youtube-dl/issues/6057#issuecomment-126129799)
|
||||||
if has_range:
|
if has_range:
|
||||||
content_range = ctx.data.headers.get('Content-Range')
|
content_range = ctx.data.headers.get('Content-Range')
|
||||||
@@ -293,10 +293,7 @@ class HttpFD(FileDownloader):
|
|||||||
|
|
||||||
# Progress message
|
# Progress message
|
||||||
speed = self.calc_speed(start, now, byte_counter - ctx.resume_len)
|
speed = self.calc_speed(start, now, byte_counter - ctx.resume_len)
|
||||||
if ctx.data_len is None:
|
eta = self.calc_eta(speed, ctx.data_len and (ctx.data_len - ctx.resume_len))
|
||||||
eta = None
|
|
||||||
else:
|
|
||||||
eta = self.calc_eta(start, time.time(), ctx.data_len - ctx.resume_len, byte_counter - ctx.resume_len)
|
|
||||||
|
|
||||||
self._hook_progress({
|
self._hook_progress({
|
||||||
'status': 'downloading',
|
'status': 'downloading',
|
||||||
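The progress code in the hunk above now passes a speed and a remaining byte count to `calc_eta`, where it previously passed start time, current time, total and downloaded bytes. The two-argument branch of `calc_eta` is not visible in this part of the diff, so the sketch below is an assumption about its behaviour rather than a copy of it: `eta_from_rate` is a hypothetical name, and treating a zero or unknown rate as "no estimate" is a choice made here.

```python
def eta_from_rate(rate, remaining):
    """Estimate seconds left from a transfer rate (bytes/s) and remaining bytes.

    Returns None when no estimate is possible: the rate is unknown or zero,
    or the remaining size is unknown.
    """
    if not rate or remaining is None:
        return None
    # Truncate to whole seconds, as the four-argument form in the diff does.
    return int(float(remaining) / rate)
```

For example, `eta_from_rate(100.0, 1000)` gives `10`, while an unknown total size (`remaining=None`) yields `None` without the caller needing a separate `if ... is None` branch.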
|
@@ -8,6 +8,8 @@ from ..utils import (
|
|||||||
ExtractorError,
|
ExtractorError,
|
||||||
GeoRestrictedError,
|
GeoRestrictedError,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
|
remove_start,
|
||||||
|
traverse_obj,
|
||||||
update_url_query,
|
update_url_query,
|
||||||
urlencode_postdata,
|
urlencode_postdata,
|
||||||
)
|
)
|
||||||
@@ -33,14 +35,17 @@ class AENetworksBaseIE(ThePlatformIE):
|
|||||||
}
|
}
|
||||||
|
|
||||||
def _extract_aen_smil(self, smil_url, video_id, auth=None):
|
def _extract_aen_smil(self, smil_url, video_id, auth=None):
|
||||||
query = {'mbr': 'true'}
|
query = {
|
||||||
|
'mbr': 'true',
|
||||||
|
'formats': 'M3U+none,MPEG-DASH+none,MPEG4,MP3',
|
||||||
|
}
|
||||||
if auth:
|
if auth:
|
||||||
query['auth'] = auth
|
query['auth'] = auth
|
||||||
TP_SMIL_QUERY = [{
|
TP_SMIL_QUERY = [{
|
||||||
'assetTypes': 'high_video_ak',
|
'assetTypes': 'high_video_ak',
|
||||||
'switch': 'hls_high_ak'
|
'switch': 'hls_high_ak',
|
||||||
}, {
|
}, {
|
||||||
'assetTypes': 'high_video_s3'
|
'assetTypes': 'high_video_s3',
|
||||||
}, {
|
}, {
|
||||||
'assetTypes': 'high_video_s3',
|
'assetTypes': 'high_video_s3',
|
||||||
'switch': 'hls_high_fastly',
|
'switch': 'hls_high_fastly',
|
||||||
@@ -75,7 +80,14 @@ class AENetworksBaseIE(ThePlatformIE):
|
|||||||
requestor_id, brand = self._DOMAIN_MAP[domain]
|
requestor_id, brand = self._DOMAIN_MAP[domain]
|
||||||
result = self._download_json(
|
result = self._download_json(
|
||||||
'https://feeds.video.aetnd.com/api/v2/%s/videos' % brand,
|
'https://feeds.video.aetnd.com/api/v2/%s/videos' % brand,
|
||||||
filter_value, query={'filter[%s]' % filter_key: filter_value})['results'][0]
|
filter_value, query={'filter[%s]' % filter_key: filter_value})
|
||||||
|
result = traverse_obj(
|
||||||
|
result, ('results',
|
||||||
|
lambda k, v: k == 0 and v[filter_key] == filter_value),
|
||||||
|
get_all=False)
|
||||||
|
if not result:
|
||||||
|
raise ExtractorError('Show not found in A&E feed (too new?)', expected=True,
|
||||||
|
video_id=remove_start(filter_value, '/'))
|
||||||
title = result['title']
|
title = result['title']
|
||||||
video_id = result['id']
|
video_id = result['id']
|
||||||
media_url = result['publicUrl']
|
media_url = result['publicUrl']
|
||||||
@@ -126,7 +138,7 @@ class AENetworksIE(AENetworksBaseIE):
|
|||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
'add_ie': ['ThePlatform'],
|
'add_ie': ['ThePlatform'],
|
||||||
'skip': 'This video is only available for users of participating TV providers.',
|
'skip': 'Geo-restricted - This content is not available in your location.'
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.aetv.com/shows/duck-dynasty/season-9/episode-1',
|
'url': 'http://www.aetv.com/shows/duck-dynasty/season-9/episode-1',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@@ -143,6 +155,7 @@ class AENetworksIE(AENetworksBaseIE):
|
|||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
'add_ie': ['ThePlatform'],
|
'add_ie': ['ThePlatform'],
|
||||||
|
'skip': 'This video is only available for users of participating TV providers.',
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.fyi.tv/shows/tiny-house-nation/season-1/episode-8',
|
'url': 'http://www.fyi.tv/shows/tiny-house-nation/season-1/episode-8',
|
||||||
'only_matching': True
|
'only_matching': True
|
||||||
|
@@ -1087,7 +1087,7 @@ class InfoExtractor(object):
|
|||||||
# Helper functions for extracting OpenGraph info
|
# Helper functions for extracting OpenGraph info
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def _og_regexes(prop):
|
def _og_regexes(prop):
|
||||||
content_re = r'content=(?:"([^"]+?)"|\'([^\']+?)\'|\s*([^\s"\'=<>`]+?))'
|
content_re = r'content=(?:"([^"]+?)"|\'([^\']+?)\'|\s*([^\s"\'=<>`]+?)(?=\s|/?>))'
|
||||||
property_re = (r'(?:name|property)=(?:\'og[:-]%(prop)s\'|"og[:-]%(prop)s"|\s*og[:-]%(prop)s\b)'
|
property_re = (r'(?:name|property)=(?:\'og[:-]%(prop)s\'|"og[:-]%(prop)s"|\s*og[:-]%(prop)s\b)'
|
||||||
% {'prop': re.escape(prop)})
|
% {'prop': re.escape(prop)})
|
||||||
template = r'<meta[^>]+?%s[^>]+?%s'
|
template = r'<meta[^>]+?%s[^>]+?%s'
|
||||||
|
@@ -2320,6 +2320,25 @@ class GenericIE(InfoExtractor):
|
|||||||
'height': 720,
|
'height': 720,
|
||||||
'age_limit': 18,
|
'age_limit': 18,
|
||||||
},
|
},
|
||||||
|
}, {
|
||||||
|
# would like to use the yt-dl test video but searching for
|
||||||
|
# '"\'/\\ä↭𝕐' fails, so using an old vid from YouTube Korea
|
||||||
|
'note': 'Test default search',
|
||||||
|
'url': 'Shorts로 허락 필요없이 놀자! (BTS편)',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'usDGO4Zb-dc',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'YouTube Shorts로 허락 필요없이 놀자! (BTS편)',
|
||||||
|
'description': 'md5:96e31607eba81ab441567b5e289f4716',
|
||||||
|
'upload_date': '20211107',
|
||||||
|
'uploader': 'YouTube Korea',
|
||||||
|
'location': '대한민국',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'default_search': 'ytsearch',
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
|
'expected_warnings': ['uploader id'],
|
||||||
},
|
},
|
||||||
]
|
]
|
||||||
|
|
||||||
|
@ -1,19 +1,29 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
import re
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..compat import (
|
from ..compat import (
|
||||||
|
compat_filter as filter,
|
||||||
|
compat_HTTPError,
|
||||||
compat_parse_qs,
|
compat_parse_qs,
|
||||||
compat_urllib_parse_urlparse,
|
compat_urlparse,
|
||||||
)
|
)
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
HEADRequest,
|
|
||||||
determine_ext,
|
determine_ext,
|
||||||
|
error_to_compat_str,
|
||||||
|
extract_attributes,
|
||||||
|
ExtractorError,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
|
merge_dicts,
|
||||||
|
orderedSet,
|
||||||
parse_iso8601,
|
parse_iso8601,
|
||||||
strip_or_none,
|
strip_or_none,
|
||||||
try_get,
|
traverse_obj,
|
||||||
|
url_or_none,
|
||||||
|
urljoin,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@@ -22,14 +32,102 @@ class IGNBaseIE(InfoExtractor):
|
|||||||
return self._download_json(
|
return self._download_json(
|
||||||
'http://apis.ign.com/{0}/v3/{0}s/slug/{1}'.format(self._PAGE_TYPE, slug), slug)
|
'http://apis.ign.com/{0}/v3/{0}s/slug/{1}'.format(self._PAGE_TYPE, slug), slug)
|
||||||
|
|
||||||
|
def _checked_call_api(self, slug):
|
||||||
|
try:
|
||||||
|
return self._call_api(slug)
|
||||||
|
except ExtractorError as e:
|
||||||
|
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
|
||||||
|
e.cause.args = e.cause.args or [
|
||||||
|
e.cause.geturl(), e.cause.getcode(), e.cause.reason]
|
||||||
|
raise ExtractorError(
|
||||||
|
'Content not found: expired?', cause=e.cause,
|
||||||
|
expected=True)
|
||||||
|
raise
|
||||||
|
|
||||||
|
def _extract_video_info(self, video, fatal=True):
|
||||||
|
video_id = video['videoId']
|
||||||
|
|
||||||
|
formats = []
|
||||||
|
refs = traverse_obj(video, 'refs', expected_type=dict) or {}
|
||||||
|
|
||||||
|
m3u8_url = url_or_none(refs.get('m3uUrl'))
|
||||||
|
if m3u8_url:
|
||||||
|
formats.extend(self._extract_m3u8_formats(
|
||||||
|
m3u8_url, video_id, 'mp4', 'm3u8_native',
|
||||||
|
m3u8_id='hls', fatal=False))
|
||||||
|
|
||||||
|
f4m_url = url_or_none(refs.get('f4mUrl'))
|
||||||
|
if f4m_url:
|
||||||
|
formats.extend(self._extract_f4m_formats(
|
||||||
|
f4m_url, video_id, f4m_id='hds', fatal=False))
|
||||||
|
|
||||||
|
for asset in (video.get('assets') or []):
|
||||||
|
asset_url = url_or_none(asset.get('url'))
|
||||||
|
if not asset_url:
|
||||||
|
continue
|
||||||
|
formats.append({
|
||||||
|
'url': asset_url,
|
||||||
|
'tbr': int_or_none(asset.get('bitrate'), 1000),
|
||||||
|
'fps': int_or_none(asset.get('frame_rate')),
|
||||||
|
'height': int_or_none(asset.get('height')),
|
||||||
|
'width': int_or_none(asset.get('width')),
|
||||||
|
})
|
||||||
|
|
||||||
|
mezzanine_url = traverse_obj(
|
||||||
|
video, ('system', 'mezzanineUrl'), expected_type=url_or_none)
|
||||||
|
if mezzanine_url:
|
||||||
|
formats.append({
|
||||||
|
'ext': determine_ext(mezzanine_url, 'mp4'),
|
||||||
|
'format_id': 'mezzanine',
|
||||||
|
'preference': 1,
|
||||||
|
'url': mezzanine_url,
|
||||||
|
})
|
||||||
|
|
||||||
|
if formats or fatal:
|
||||||
|
self._sort_formats(formats)
|
||||||
|
else:
|
||||||
|
return
|
||||||
|
|
||||||
|
thumbnails = traverse_obj(
|
||||||
|
video, ('thumbnails', Ellipsis, {'url': 'url'}), expected_type=url_or_none)
|
||||||
|
tags = traverse_obj(
|
||||||
|
video, ('tags', Ellipsis, 'displayName'),
|
||||||
|
expected_type=lambda x: x.strip() or None)
|
||||||
|
|
||||||
|
metadata = traverse_obj(video, 'metadata', expected_type=dict) or {}
|
||||||
|
title = traverse_obj(
|
||||||
|
metadata, 'longTitle', 'title', 'name',
|
||||||
|
expected_type=lambda x: x.strip() or None)
|
||||||
|
|
||||||
|
return {
|
||||||
|
'id': video_id,
|
||||||
|
'title': title,
|
||||||
|
'description': strip_or_none(metadata.get('description')),
|
||||||
|
'timestamp': parse_iso8601(metadata.get('publishDate')),
|
||||||
|
'duration': int_or_none(metadata.get('duration')),
|
||||||
|
'thumbnails': thumbnails,
|
||||||
|
'formats': formats,
|
||||||
|
'tags': tags,
|
||||||
|
}
|
||||||
|
|
||||||
|
# yt-dlp shim
|
||||||
|
@classmethod
|
||||||
|
def _extract_from_webpage(cls, url, webpage):
|
||||||
|
for embed_url in orderedSet(
|
||||||
|
cls._extract_embed_urls(url, webpage) or [], lazy=True):
|
||||||
|
yield cls.url_result(embed_url, None if cls._VALID_URL is False else cls)
|
||||||
|
|
||||||
|
|
||||||
class IGNIE(IGNBaseIE):
|
class IGNIE(IGNBaseIE):
|
||||||
"""
|
"""
|
||||||
Extractor for some of the IGN sites, like www.ign.com, es.ign.com de.ign.com.
|
Extractor for some of the IGN sites, like www.ign.com, es.ign.com de.ign.com.
|
||||||
Some videos of it.ign.com are also supported
|
Some videos of it.ign.com are also supported
|
||||||
"""
|
"""
|
||||||
|
_VIDEO_PATH_RE = r'/(?:\d{4}/\d{2}/\d{2}/)?(?P<id>.+?)'
|
||||||
_VALID_URL = r'https?://(?:.+?\.ign|www\.pcmag)\.com/videos/(?:\d{4}/\d{2}/\d{2}/)?(?P<id>[^/?&#]+)'
|
_PLAYLIST_PATH_RE = r'(?:/?\?(?P<filt>[^&#]+))?'
|
||||||
|
_VALID_URL = (
|
||||||
|
r'https?://(?:.+?\.ign|www\.pcmag)\.com/videos(?:%s)'
|
||||||
|
% '|'.join((_VIDEO_PATH_RE + r'(?:[/?&#]|$)', _PLAYLIST_PATH_RE)))
|
||||||
IE_NAME = 'ign.com'
|
IE_NAME = 'ign.com'
|
||||||
_PAGE_TYPE = 'video'
|
_PAGE_TYPE = 'video'
|
||||||
|
|
||||||
@@ -44,7 +142,10 @@ class IGNIE(IGNBaseIE):
|
|||||||
'timestamp': 1370440800,
|
'timestamp': 1370440800,
|
||||||
'upload_date': '20130605',
|
'upload_date': '20130605',
|
||||||
'tags': 'count:9',
|
'tags': 'count:9',
|
||||||
}
|
},
|
||||||
|
'params': {
|
||||||
|
'nocheckcertificate': True,
|
||||||
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.pcmag.com/videos/2015/01/06/010615-whats-new-now-is-gogo-snooping-on-your-data',
|
'url': 'http://www.pcmag.com/videos/2015/01/06/010615-whats-new-now-is-gogo-snooping-on-your-data',
|
||||||
'md5': 'f1581a6fe8c5121be5b807684aeac3f6',
|
'md5': 'f1581a6fe8c5121be5b807684aeac3f6',
|
||||||
@@ -56,86 +157,51 @@ class IGNIE(IGNBaseIE):
|
|||||||
'timestamp': 1420571160,
|
'timestamp': 1420571160,
|
||||||
'upload_date': '20150106',
|
'upload_date': '20150106',
|
||||||
'tags': 'count:4',
|
'tags': 'count:4',
|
||||||
}
|
},
|
||||||
|
'skip': '404 Not Found',
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.ign.com/videos/is-a-resident-evil-4-remake-on-the-way-ign-daily-fix',
|
'url': 'https://www.ign.com/videos/is-a-resident-evil-4-remake-on-the-way-ign-daily-fix',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def _extract_embed_urls(cls, url, webpage):
|
||||||
|
grids = re.findall(
|
||||||
|
r'''(?s)<section\b[^>]+\bclass\s*=\s*['"](?:[\w-]+\s+)*?content-feed-grid(?!\B|-)[^>]+>(.+?)</section[^>]*>''',
|
||||||
|
webpage)
|
||||||
|
return filter(None,
|
||||||
|
(urljoin(url, m.group('path')) for m in re.finditer(
|
||||||
|
r'''<a\b[^>]+\bhref\s*=\s*('|")(?P<path>/videos%s)\1'''
|
||||||
|
% cls._VIDEO_PATH_RE, grids[0] if grids else '')))
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
|
m = re.match(self._VALID_URL, url)
|
||||||
|
display_id = m.group('id')
|
||||||
|
if display_id:
|
||||||
|
return self._extract_video(url, display_id)
|
||||||
|
display_id = m.group('filt') or 'all'
|
||||||
|
return self._extract_playlist(url, display_id)
|
||||||
|
|
||||||
|
def _extract_playlist(self, url, display_id):
|
||||||
|
webpage = self._download_webpage(url, display_id)
|
||||||
|
|
||||||
|
return self.playlist_result(
|
||||||
|
(self.url_result(u, ie=self.ie_key())
|
||||||
|
for u in self._extract_embed_urls(url, webpage)),
|
||||||
|
playlist_id=display_id)
|
||||||
|
|
||||||
|
def _extract_video(self, url, display_id):
|
||||||
display_id = self._match_id(url)
|
display_id = self._match_id(url)
|
||||||
video = self._call_api(display_id)
|
video = self._checked_call_api(display_id)
|
||||||
video_id = video['videoId']
|
|
||||||
metadata = video['metadata']
|
|
||||||
title = metadata.get('longTitle') or metadata.get('title') or metadata['name']
|
|
||||||
|
|
||||||
formats = []
|
info = self._extract_video_info(video)
|
||||||
refs = video.get('refs') or {}
|
|
||||||
|
|
||||||
m3u8_url = refs.get('m3uUrl')
|
return merge_dicts({
|
||||||
if m3u8_url:
|
|
||||||
formats.extend(self._extract_m3u8_formats(
|
|
||||||
m3u8_url, video_id, 'mp4', 'm3u8_native',
|
|
||||||
m3u8_id='hls', fatal=False))
|
|
||||||
|
|
||||||
f4m_url = refs.get('f4mUrl')
|
|
||||||
if f4m_url:
|
|
||||||
formats.extend(self._extract_f4m_formats(
|
|
||||||
f4m_url, video_id, f4m_id='hds', fatal=False))
|
|
||||||
|
|
||||||
for asset in (video.get('assets') or []):
|
|
||||||
asset_url = asset.get('url')
|
|
||||||
if not asset_url:
|
|
||||||
continue
|
|
||||||
formats.append({
|
|
||||||
'url': asset_url,
|
|
||||||
'tbr': int_or_none(asset.get('bitrate'), 1000),
|
|
||||||
'fps': int_or_none(asset.get('frame_rate')),
|
|
||||||
'height': int_or_none(asset.get('height')),
|
|
||||||
'width': int_or_none(asset.get('width')),
|
|
||||||
})
|
|
||||||
|
|
||||||
mezzanine_url = try_get(video, lambda x: x['system']['mezzanineUrl'])
|
|
||||||
if mezzanine_url:
|
|
||||||
formats.append({
|
|
||||||
'ext': determine_ext(mezzanine_url, 'mp4'),
|
|
||||||
'format_id': 'mezzanine',
|
|
||||||
'preference': 1,
|
|
||||||
'url': mezzanine_url,
|
|
||||||
})
|
|
||||||
|
|
||||||
self._sort_formats(formats)
|
|
||||||
|
|
||||||
thumbnails = []
|
|
||||||
for thumbnail in (video.get('thumbnails') or []):
|
|
||||||
thumbnail_url = thumbnail.get('url')
|
|
||||||
if not thumbnail_url:
|
|
||||||
continue
|
|
||||||
thumbnails.append({
|
|
||||||
'url': thumbnail_url,
|
|
||||||
})
|
|
||||||
|
|
||||||
tags = []
|
|
||||||
for tag in (video.get('tags') or []):
|
|
||||||
display_name = tag.get('displayName')
|
|
||||||
if not display_name:
|
|
||||||
continue
|
|
||||||
tags.append(display_name)
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': video_id,
|
|
||||||
'title': title,
|
|
||||||
'description': strip_or_none(metadata.get('description')),
|
|
||||||
'timestamp': parse_iso8601(metadata.get('publishDate')),
|
|
||||||
'duration': int_or_none(metadata.get('duration')),
|
|
||||||
'display_id': display_id,
|
'display_id': display_id,
|
||||||
'thumbnails': thumbnails,
|
}, info)
|
||||||
'formats': formats,
|
|
||||||
'tags': tags,
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
class IGNVideoIE(InfoExtractor):
|
class IGNVideoIE(IGNBaseIE):
|
||||||
_VALID_URL = r'https?://.+?\.ign\.com/(?:[a-z]{2}/)?[^/]+/(?P<id>\d+)/(?:video|trailer)/'
|
_VALID_URL = r'https?://.+?\.ign\.com/(?:[a-z]{2}/)?[^/]+/(?P<id>\d+)/(?:video|trailer)/'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://me.ign.com/en/videos/112203/video/how-hitman-aims-to-be-different-than-every-other-s',
|
'url': 'http://me.ign.com/en/videos/112203/video/how-hitman-aims-to-be-different-than-every-other-s',
|
||||||
@@ -147,7 +213,8 @@ class IGNVideoIE(InfoExtractor):
|
|||||||
'description': 'Taking out assassination targets in Hitman has never been more stylish.',
|
'description': 'Taking out assassination targets in Hitman has never been more stylish.',
|
||||||
'timestamp': 1444665600,
|
'timestamp': 1444665600,
|
||||||
'upload_date': '20151012',
|
'upload_date': '20151012',
|
||||||
}
|
},
|
||||||
|
'expected_warnings': ['HTTP Error 400: Bad Request'],
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://me.ign.com/ar/angry-birds-2/106533/video/lrd-ldyy-lwl-lfylm-angry-birds',
|
'url': 'http://me.ign.com/ar/angry-birds-2/106533/video/lrd-ldyy-lwl-lfylm-angry-birds',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@@ -167,22 +234,38 @@ class IGNVideoIE(InfoExtractor):
|
|||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
req = HEADRequest(url.rsplit('/', 1)[0] + '/embed')
|
parsed_url = compat_urlparse.urlparse(url)
|
||||||
url = self._request_webpage(req, video_id).geturl()
|
embed_url = compat_urlparse.urlunparse(
|
||||||
|
parsed_url._replace(path=parsed_url.path.rsplit('/', 1)[0] + '/embed'))
|
||||||
|
|
||||||
|
webpage, urlh = self._download_webpage_handle(embed_url, video_id)
|
||||||
|
new_url = urlh.geturl()
|
||||||
ign_url = compat_parse_qs(
|
ign_url = compat_parse_qs(
|
||||||
compat_urllib_parse_urlparse(url).query).get('url', [None])[0]
|
compat_urlparse.urlparse(new_url).query).get('url', [None])[-1]
|
||||||
if ign_url:
|
if ign_url:
|
||||||
return self.url_result(ign_url, IGNIE.ie_key())
|
return self.url_result(ign_url, IGNIE.ie_key())
|
||||||
return self.url_result(url)
|
video = self._search_regex(r'(<div\b[^>]+\bdata-video-id\s*=\s*[^>]+>)', webpage, 'video element', fatal=False)
|
||||||
|
if not video:
|
||||||
|
if new_url == url:
|
||||||
|
raise ExtractorError('Redirect loop: ' + url)
|
||||||
|
return self.url_result(new_url)
|
||||||
|
video = extract_attributes(video)
|
||||||
|
video_data = video.get('data-settings') or '{}'
|
||||||
|
video_data = self._parse_json(video_data, video_id)['video']
|
||||||
|
info = self._extract_video_info(video_data)
|
||||||
|
|
||||||
|
return merge_dicts({
|
||||||
|
'display_id': video_id,
|
||||||
|
}, info)
|
||||||
|
|
||||||
|
|
||||||
class IGNArticleIE(IGNBaseIE):
|
class IGNArticleIE(IGNBaseIE):
|
||||||
_VALID_URL = r'https?://.+?\.ign\.com/(?:articles(?:/\d{4}/\d{2}/\d{2})?|(?:[a-z]{2}/)?feature/\d+)/(?P<id>[^/?&#]+)'
|
_VALID_URL = r'https?://.+?\.ign\.com/(?:articles(?:/\d{4}/\d{2}/\d{2})?|(?:[a-z]{2}/)?(?:[\w-]+/)*?feature/\d+)/(?P<id>[^/?&#]+)'
|
||||||
_PAGE_TYPE = 'article'
|
_PAGE_TYPE = 'article'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://me.ign.com/en/feature/15775/100-little-things-in-gta-5-that-will-blow-your-mind',
|
'url': 'http://me.ign.com/en/feature/15775/100-little-things-in-gta-5-that-will-blow-your-mind',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '524497489e4e8ff5848ece34',
|
'id': '72113',
|
||||||
'title': '100 Little Things in GTA 5 That Will Blow Your Mind',
|
'title': '100 Little Things in GTA 5 That Will Blow Your Mind',
|
||||||
},
|
},
|
||||||
'playlist': [
|
'playlist': [
|
||||||
@@ -190,7 +273,7 @@ class IGNArticleIE(IGNBaseIE):
|
|||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '5ebbd138523268b93c9141af17bec937',
|
'id': '5ebbd138523268b93c9141af17bec937',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'GTA 5 Video Review',
|
'title': 'Grand Theft Auto V Video Review',
|
||||||
'description': 'Rockstar drops the mic on this generation of games. Watch our review of the masterly Grand Theft Auto V.',
|
'description': 'Rockstar drops the mic on this generation of games. Watch our review of the masterly Grand Theft Auto V.',
|
||||||
'timestamp': 1379339880,
|
'timestamp': 1379339880,
|
||||||
'upload_date': '20130916',
|
'upload_date': '20130916',
|
||||||
@ -200,7 +283,7 @@ class IGNArticleIE(IGNBaseIE):
|
|||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '638672ee848ae4ff108df2a296418ee2',
|
'id': '638672ee848ae4ff108df2a296418ee2',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': '26 Twisted Moments from GTA 5 in Slow Motion',
|
'title': 'GTA 5 In Slow Motion',
|
||||||
'description': 'The twisted beauty of GTA 5 in stunning slow motion.',
|
'description': 'The twisted beauty of GTA 5 in stunning slow motion.',
|
||||||
'timestamp': 1386878820,
|
'timestamp': 1386878820,
|
||||||
'upload_date': '20131212',
|
'upload_date': '20131212',
|
||||||
@ -208,16 +291,17 @@ class IGNArticleIE(IGNBaseIE):
|
|||||||
},
|
},
|
||||||
],
|
],
|
||||||
'params': {
|
'params': {
|
||||||
'playlist_items': '2-3',
|
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
|
'expected_warnings': ['Backend fetch failed'],
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.ign.com/articles/2014/08/15/rewind-theater-wild-trailer-gamescom-2014?watch',
|
'url': 'http://www.ign.com/articles/2014/08/15/rewind-theater-wild-trailer-gamescom-2014?watch',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '53ee806780a81ec46e0790f8',
|
'id': '53ee806780a81ec46e0790f8',
|
||||||
'title': 'Rewind Theater - Wild Trailer Gamescom 2014',
|
'title': 'Rewind Theater - Wild Trailer Gamescom 2014',
|
||||||
},
|
},
|
||||||
'playlist_count': 2,
|
'playlist_count': 1,
|
||||||
|
'expected_warnings': ['Backend fetch failed'],
|
||||||
}, {
|
}, {
|
||||||
# videoId pattern
|
# videoId pattern
|
||||||
'url': 'http://www.ign.com/articles/2017/06/08/new-ducktales-short-donalds-birthday-doesnt-go-as-planned',
|
'url': 'http://www.ign.com/articles/2017/06/08/new-ducktales-short-donalds-birthday-doesnt-go-as-planned',
|
||||||
@ -240,18 +324,91 @@ class IGNArticleIE(IGNBaseIE):
|
|||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
|
def _checked_call_api(self, slug):
|
||||||
|
try:
|
||||||
|
return self._call_api(slug)
|
||||||
|
except ExtractorError as e:
|
||||||
|
if isinstance(e.cause, compat_HTTPError):
|
||||||
|
e.cause.args = e.cause.args or [
|
||||||
|
e.cause.geturl(), e.cause.getcode(), e.cause.reason]
|
||||||
|
if e.cause.code == 404:
|
||||||
|
raise ExtractorError(
|
||||||
|
'Content not found: expired?', cause=e.cause,
|
||||||
|
expected=True)
|
||||||
|
elif e.cause.code == 503:
|
||||||
|
self.report_warning(error_to_compat_str(e.cause))
|
||||||
|
return
|
||||||
|
raise
|
||||||
|
|
||||||
|
def _search_nextjs_data(self, webpage, video_id, **kw):
|
||||||
|
return self._parse_json(
|
||||||
|
self._search_regex(
|
||||||
|
r'(?s)<script[^>]+id=[\'"]__NEXT_DATA__[\'"][^>]*>([^<]+)</script>',
|
||||||
|
webpage, 'next.js data', **kw),
|
||||||
|
video_id, **kw)
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url)
|
display_id = self._match_id(url)
|
||||||
article = self._call_api(display_id)
|
article = self._checked_call_api(display_id)
|
||||||
|
|
||||||
|
if article:
|
||||||
|
# obsolete ?
|
||||||
def entries():
|
def entries():
|
||||||
media_url = try_get(article, lambda x: x['mediaRelations'][0]['media']['metadata']['url'])
|
media_url = traverse_obj(
|
||||||
|
article, ('mediaRelations', 0, 'media', 'metadata', 'url'),
|
||||||
|
expected_type=url_or_none)
|
||||||
if media_url:
|
if media_url:
|
||||||
yield self.url_result(media_url, IGNIE.ie_key())
|
yield self.url_result(media_url, IGNIE.ie_key())
|
||||||
for content in (article.get('content') or []):
|
for content in (article.get('content') or []):
|
||||||
for video_url in re.findall(r'(?:\[(?:ignvideo\s+url|youtube\s+clip_id)|<iframe[^>]+src)="([^"]+)"', content):
|
for video_url in re.findall(r'(?:\[(?:ignvideo\s+url|youtube\s+clip_id)|<iframe[^>]+src)="([^"]+)"', content):
|
||||||
|
if url_or_none(video_url):
|
||||||
yield self.url_result(video_url)
|
yield self.url_result(video_url)
|
||||||
|
|
||||||
return self.playlist_result(
|
return self.playlist_result(
|
||||||
entries(), article.get('articleId'),
|
entries(), article.get('articleId'),
|
||||||
strip_or_none(try_get(article, lambda x: x['metadata']['headline'])))
|
traverse_obj(
|
||||||
|
article, ('metadata', 'headline'),
|
||||||
|
expected_type=lambda x: x.strip() or None))
|
||||||
|
|
||||||
|
webpage = self._download_webpage(url, display_id)
|
||||||
|
|
||||||
|
playlist_id = self._html_search_meta('dable:item_id', webpage, default=None)
|
||||||
|
if playlist_id:
|
||||||
|
|
||||||
|
def entries():
|
||||||
|
for m in re.finditer(
|
||||||
|
r'''(?s)<object\b[^>]+\bclass\s*=\s*("|')ign-videoplayer\1[^>]*>(?P<params>.+?)</object''',
|
||||||
|
webpage):
|
||||||
|
flashvars = self._search_regex(
|
||||||
|
r'''(<param\b[^>]+\bname\s*=\s*("|')flashvars\2[^>]*>)''',
|
||||||
|
m.group('params'), 'flashvars', default='')
|
||||||
|
flashvars = compat_parse_qs(extract_attributes(flashvars).get('value') or '')
|
||||||
|
v_url = url_or_none((flashvars.get('url') or [None])[-1])
|
||||||
|
if v_url:
|
||||||
|
yield self.url_result(v_url)
|
||||||
|
else:
|
||||||
|
playlist_id = self._search_regex(
|
||||||
|
r'''\bdata-post-id\s*=\s*("|')(?P<id>[\da-f]+)\1''',
|
||||||
|
webpage, 'id', group='id', default=None)
|
||||||
|
|
||||||
|
nextjs_data = self._search_nextjs_data(webpage, display_id)
|
||||||
|
|
||||||
|
def entries():
|
||||||
|
for player in traverse_obj(
|
||||||
|
nextjs_data,
|
||||||
|
('props', 'apolloState', 'ROOT_QUERY', lambda k, _: k.startswith('videoPlayerProps('), '__ref')):
|
||||||
|
# skip promo links (which may not always be served, eg GH CI servers)
|
||||||
|
if traverse_obj(nextjs_data,
|
||||||
|
('props', 'apolloState', player.replace('PlayerProps', 'ModernContent')),
|
||||||
|
expected_type=dict):
|
||||||
|
continue
|
||||||
|
video = traverse_obj(nextjs_data, ('props', 'apolloState', player), expected_type=dict) or {}
|
||||||
|
info = self._extract_video_info(video, fatal=False)
|
||||||
|
if info:
|
||||||
|
yield merge_dicts({
|
||||||
|
'display_id': display_id,
|
||||||
|
}, info)
|
||||||
|
|
||||||
|
return self.playlist_result(
|
||||||
|
entries(), playlist_id or display_id,
|
||||||
|
re.sub(r'\s+-\s+IGN\s*$', '', self._og_search_title(webpage, default='')) or None)
|
||||||
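The `__NEXT_DATA__` handling added above can be approximated outside the extractor. This is a minimal sketch, not the extractor's code: it reuses the regex from `_search_nextjs_data`, but replaces `self._parse_json` with `json.loads` and `traverse_obj` with plain dict access. The sample page, the key `videoPlayerProps(demo)` and the helper names are made up for illustration.

```python
import json
import re

# Regex copied from _search_nextjs_data in the diff.
NEXT_DATA_RE = r'(?s)<script[^>]+id=[\'"]__NEXT_DATA__[\'"][^>]*>([^<]+)</script>'


def search_nextjs_data(webpage):
    """Return the parsed __NEXT_DATA__ JSON, or None if the script tag is absent."""
    m = re.search(NEXT_DATA_RE, webpage)
    return json.loads(m.group(1)) if m else None


def player_refs(nextjs_data):
    """Collect '__ref' values of ROOT_QUERY keys starting with 'videoPlayerProps('."""
    apollo = ((nextjs_data or {}).get('props') or {}).get('apolloState') or {}
    root = apollo.get('ROOT_QUERY') or {}
    return [
        value['__ref'] for key, value in root.items()
        if key.startswith('videoPlayerProps(')
        and isinstance(value, dict) and '__ref' in value]


# Hypothetical page data shaped like the structure the diff traverses.
SAMPLE_DATA = {'props': {'apolloState': {
    'ROOT_QUERY': {
        'videoPlayerProps(demo)': {'__ref': 'VideoPlayerProps:1'},
        'somethingElse': {'__ref': 'Other:2'},
    },
    'VideoPlayerProps:1': {'title': 'Demo'},
}}}
SAMPLE_PAGE = (
    '<html><script id="__NEXT_DATA__" type="application/json">%s</script></html>'
    % json.dumps(SAMPLE_DATA))

print(player_refs(search_nextjs_data(SAMPLE_PAGE)))
```

Each collected reference is then looked up again under `apolloState`, which is what the `traverse_obj(nextjs_data, ('props', 'apolloState', player), ...)` call in the diff appears to do.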
|
@ -270,17 +270,23 @@ class VimeoIE(VimeoBaseInfoExtractor):
|
|||||||
\.
|
\.
|
||||||
)?
|
)?
|
||||||
vimeo(?:pro)?\.com/
|
vimeo(?:pro)?\.com/
|
||||||
|
(?:
|
||||||
|
(?P<u>user)|
|
||||||
(?!(?:channels|album|showcase)/[^/?#]+/?(?:$|[?#])|[^/]+/review/|ondemand/)
|
(?!(?:channels|album|showcase)/[^/?#]+/?(?:$|[?#])|[^/]+/review/|ondemand/)
|
||||||
(?:.*?/)??
|
(?:.*?/)??
|
||||||
(?:
|
(?P<q>
|
||||||
(?:
|
(?:
|
||||||
play_redirect_hls|
|
play_redirect_hls|
|
||||||
moogaloop\.swf)\?clip_id=
|
moogaloop\.swf)\?clip_id=
|
||||||
)?
|
)?
|
||||||
(?:videos?/)?
|
(?:videos?/)?
|
||||||
|
)
|
||||||
(?P<id>[0-9]+)
|
(?P<id>[0-9]+)
|
||||||
(?:/(?P<unlisted_hash>[\da-f]{10}))?
|
(?(u)
|
||||||
/?(?:[?&].*)?(?:[#].*)?$
|
/(?!videos|likes)[^/?#]+/?|
|
||||||
|
(?(q)|/(?P<unlisted_hash>[\da-f]{10}))?
|
||||||
|
)
|
||||||
|
(?:(?(q)[&]|(?(u)|/?)[?]).+?)?(?:[#].*)?$
|
||||||
'''
|
'''
|
||||||
IE_NAME = 'vimeo'
|
IE_NAME = 'vimeo'
|
||||||
_TESTS = [
|
_TESTS = [
|
||||||
@ -539,7 +545,12 @@ class VimeoIE(VimeoBaseInfoExtractor):
|
|||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
}
|
},
|
||||||
|
{
|
||||||
|
# user playlist alias -> https://vimeo.com/258705797
|
||||||
|
'url': 'https://vimeo.com/user26785108/newspiritualguide',
|
||||||
|
'only_matching': True,
|
||||||
|
},
|
||||||
# https://gettingthingsdone.com/workflowmap/
|
# https://gettingthingsdone.com/workflowmap/
|
||||||
# vimeo embed with check-password page protected by Referer header
|
# vimeo embed with check-password page protected by Referer header
|
||||||
]
|
]
|
||||||
@ -663,7 +674,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
|
|||||||
|
|
||||||
if '//player.vimeo.com/video/' in url:
|
if '//player.vimeo.com/video/' in url:
|
||||||
config = self._parse_json(self._search_regex(
|
config = self._parse_json(self._search_regex(
|
||||||
r'\b(?:playerC|c)onfig\s*=\s*({.+?})\s*;', webpage, 'info section'), video_id)
|
r'(?s)\b(?:playerC|c)onfig\s*=\s*({.+?})\s*[;\n]', webpage, 'info section'), video_id)
|
||||||
if config.get('view') == 4:
|
if config.get('view') == 4:
|
||||||
config = self._verify_player_video_password(
|
config = self._verify_player_video_password(
|
||||||
redirect_url, video_id, headers)
|
redirect_url, video_id, headers)
|
||||||
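The effect of adding `(?s)` and `[;\n]` to the player config regex can be seen on a small input. The sketch below copies both patterns from the diff; the sample page text is invented, with the config object spread over two lines and terminated by a newline rather than a semicolon, and `json.loads` stands in for the extractor's JSON parsing.

```python
import json
import re

# Patterns copied from the old and new sides of the diff.
OLD_CONFIG_RE = r'\b(?:playerC|c)onfig\s*=\s*({.+?})\s*;'
NEW_CONFIG_RE = r'(?s)\b(?:playerC|c)onfig\s*=\s*({.+?})\s*[;\n]'

# Made-up page fragment: multi-line object, no trailing semicolon.
SAMPLE_PAGE = 'window.playerConfig = {"view": 1,\n "video": {"id": 1}}\nvar x = 1;'


def parse_config(pattern, webpage):
    """Return the parsed config object, or None if the pattern does not match."""
    m = re.search(pattern, webpage)
    return json.loads(m.group(1)) if m else None


print(parse_config(OLD_CONFIG_RE, SAMPLE_PAGE))
print(parse_config(NEW_CONFIG_RE, SAMPLE_PAGE))
```

Without `(?s)` the lazy `.+?` cannot cross the newline inside the object, so the old pattern finds nothing here; the new pattern keeps extending the match until a closing brace is followed by `;` or a newline.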
|
@ -31,6 +31,7 @@ from ..utils import (
|
|||||||
get_element_by_attribute,
|
get_element_by_attribute,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
js_to_json,
|
js_to_json,
|
||||||
|
merge_dicts,
|
||||||
mimetype2ext,
|
mimetype2ext,
|
||||||
parse_codecs,
|
parse_codecs,
|
||||||
parse_duration,
|
parse_duration,
|
||||||
@ -400,6 +401,62 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
|
|||||||
break
|
break
|
||||||
data['continuation'] = token
|
data['continuation'] = token
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _owner_endpoints_path():
|
||||||
|
return [
|
||||||
|
Ellipsis,
|
||||||
|
lambda k, _: k.endswith('SecondaryInfoRenderer'),
|
||||||
|
('owner', 'videoOwner'), 'videoOwnerRenderer', 'title',
|
||||||
|
'runs', Ellipsis]
|
||||||
|
|
||||||
|
def _extract_channel_id(self, webpage, videodetails={}, metadata={}, renderers=[]):
|
||||||
|
channel_id = None
|
||||||
|
if any((videodetails, metadata, renderers)):
|
||||||
|
channel_id = (
|
||||||
|
traverse_obj(videodetails, 'channelId')
|
||||||
|
or traverse_obj(metadata, 'externalChannelId', 'externalId')
|
||||||
|
or traverse_obj(renderers,
|
||||||
|
self._owner_endpoints_path() + [
|
||||||
|
'navigationEndpoint', 'browseEndpoint', 'browseId'],
|
||||||
|
get_all=False)
|
||||||
|
)
|
||||||
|
return channel_id or self._html_search_meta(
|
||||||
|
'channelId', webpage, 'channel id', default=None)
|
||||||
|
|
||||||
|
def _extract_author_var(self, webpage, var_name,
|
||||||
|
videodetails={}, metadata={}, renderers=[]):
|
||||||
|
result = None
|
||||||
|
paths = {
|
||||||
|
# (HTML, videodetails, metadata, renderers)
|
||||||
|
'name': ('content', 'author', (('ownerChannelName', None), 'title'), ['text']),
|
||||||
|
'url': ('href', 'ownerProfileUrl', 'vanityChannelUrl',
|
||||||
|
['navigationEndpoint', 'browseEndpoint', 'canonicalBaseUrl'])
|
||||||
|
}
|
||||||
|
if any((videodetails, metadata, renderers)):
|
||||||
|
result = (
|
||||||
|
traverse_obj(videodetails, paths[var_name][1], get_all=False)
|
||||||
|
or traverse_obj(metadata, paths[var_name][2], get_all=False)
|
||||||
|
or traverse_obj(renderers,
|
||||||
|
self._owner_endpoints_path() + paths[var_name][3],
|
||||||
|
get_all=False)
|
||||||
|
)
|
||||||
|
return result or traverse_obj(
|
||||||
|
extract_attributes(self._search_regex(
|
||||||
|
r'''(?s)(<link\b[^>]+\bitemprop\s*=\s*("|')%s\2[^>]*>)'''
|
||||||
|
% re.escape(var_name),
|
||||||
|
get_element_by_attribute('itemprop', 'author', webpage) or '',
|
||||||
|
'author link', default='')),
|
||||||
|
paths[var_name][0])
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _yt_urljoin(url_or_path):
|
||||||
|
return urljoin('https://www.youtube.com', url_or_path)
|
||||||
|
|
||||||
|
def _extract_uploader_id(self, uploader_url):
|
||||||
|
return self._search_regex(
|
||||||
|
r'/(?:(?:channel|user)/|(?=@))([^/?&#]+)', uploader_url or '',
|
||||||
|
'uploader id', default=None)
|
||||||
|
|
||||||
|
|
||||||
class YoutubeIE(YoutubeBaseInfoExtractor):
|
class YoutubeIE(YoutubeBaseInfoExtractor):
|
||||||
IE_DESC = 'YouTube.com'
|
IE_DESC = 'YouTube.com'
|
||||||
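The regex in `_extract_uploader_id` accepts both the older `/user/` and `/channel/` URLs and the newer `@handle` form, which explains the changed `uploader_id` values in the tests that follow. A standalone sketch, assuming `re.search` semantics and using the standard library's `urljoin` as a stand-in for the youtube-dl utility of the same name:

```python
import re
from urllib.parse import urljoin

# Regex copied from _extract_uploader_id in the diff.
UPLOADER_ID_RE = r'/(?:(?:channel|user)/|(?=@))([^/?&#]+)'


def extract_uploader_id(uploader_url):
    """Return the uploader id found in the URL, or None."""
    m = re.search(UPLOADER_ID_RE, uploader_url or '')
    return m.group(1) if m else None


def yt_urljoin(url_or_path):
    """Resolve a path against the YouTube origin (stdlib stand-in)."""
    return urljoin('https://www.youtube.com', url_or_path)


for url in (
        'https://www.youtube.com/@PhilippHagemeister',
        'https://www.youtube.com/user/phihag',
        'https://www.youtube.com/channel/UCLqxVugv74EIW3VWh2NOa3Q',
        yt_urljoin('/@Vsauce')):
    print(url, '->', extract_uploader_id(url))
```

The lookahead `(?=@)` keeps the `@` as part of the captured id, so handle-style URLs yield ids such as `@PhilippHagemeister`, while `/user/` and `/channel/` URLs yield the bare name or channel id.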
@ -516,8 +573,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'youtube-dl test video "\'/\\ä↭𝕐',
|
'title': 'youtube-dl test video "\'/\\ä↭𝕐',
|
||||||
'uploader': 'Philipp Hagemeister',
|
'uploader': 'Philipp Hagemeister',
|
||||||
'uploader_id': 'phihag',
|
'uploader_id': '@PhilippHagemeister',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/phihag',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@PhilippHagemeister',
|
||||||
'channel': 'Philipp Hagemeister',
|
'channel': 'Philipp Hagemeister',
|
||||||
'channel_id': 'UCLqxVugv74EIW3VWh2NOa3Q',
|
'channel_id': 'UCLqxVugv74EIW3VWh2NOa3Q',
|
||||||
'channel_url': r're:https?://(?:www\.)?youtube\.com/channel/UCLqxVugv74EIW3VWh2NOa3Q',
|
'channel_url': r're:https?://(?:www\.)?youtube\.com/channel/UCLqxVugv74EIW3VWh2NOa3Q',
|
||||||
@ -557,8 +614,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'youtube-dl test video "\'/\\ä↭𝕐',
|
'title': 'youtube-dl test video "\'/\\ä↭𝕐',
|
||||||
'uploader': 'Philipp Hagemeister',
|
'uploader': 'Philipp Hagemeister',
|
||||||
'uploader_id': 'phihag',
|
'uploader_id': '@PhilippHagemeister',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/phihag',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@PhilippHagemeister',
|
||||||
'upload_date': '20121002',
|
'upload_date': '20121002',
|
||||||
'description': 'test chars: "\'/\\ä↭𝕐\ntest URL: https://github.com/rg3/youtube-dl/issues/1892\n\nThis is a test video for youtube-dl.\n\nFor more information, contact phihag@phihag.de .',
|
'description': 'test chars: "\'/\\ä↭𝕐\ntest URL: https://github.com/rg3/youtube-dl/issues/1892\n\nThis is a test video for youtube-dl.\n\nFor more information, contact phihag@phihag.de .',
|
||||||
'categories': ['Science & Technology'],
|
'categories': ['Science & Technology'],
|
||||||
@ -588,7 +645,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'youtube_include_dash_manifest': True,
|
'youtube_include_dash_manifest': True,
|
||||||
'format': '141',
|
'format': '141',
|
||||||
},
|
},
|
||||||
'skip': 'format 141 not served anymore',
|
'skip': 'format 141 not served any more',
|
||||||
},
|
},
|
||||||
# DASH manifest with encrypted signature
|
# DASH manifest with encrypted signature
|
||||||
{
|
{
|
||||||
@ -600,7 +657,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'description': 'md5:8f5e2b82460520b619ccac1f509d43bf',
|
'description': 'md5:8f5e2b82460520b619ccac1f509d43bf',
|
||||||
'duration': 244,
|
'duration': 244,
|
||||||
'uploader': 'AfrojackVEVO',
|
'uploader': 'AfrojackVEVO',
|
||||||
'uploader_id': 'AfrojackVEVO',
|
'uploader_id': '@AfrojackVEVO',
|
||||||
'upload_date': '20131011',
|
'upload_date': '20131011',
|
||||||
'abr': 129.495,
|
'abr': 129.495,
|
||||||
},
|
},
|
||||||
@ -618,8 +675,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'duration': 219,
|
'duration': 219,
|
||||||
'upload_date': '20100909',
|
'upload_date': '20100909',
|
||||||
'uploader': 'Amazing Atheist',
|
'uploader': 'Amazing Atheist',
|
||||||
'uploader_id': 'TheAmazingAtheist',
|
'uploader_id': '@theamazingatheist',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/TheAmazingAtheist',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@theamazingatheist',
|
||||||
'title': 'Burning Everyone\'s Koran',
|
'title': 'Burning Everyone\'s Koran',
|
||||||
'description': 'SUBSCRIBE: http://www.youtube.com/saturninefilms \r\n\r\nEven Obama has taken a stand against freedom on this issue: http://www.huffingtonpost.com/2010/09/09/obama-gma-interview-quran_n_710282.html',
|
'description': 'SUBSCRIBE: http://www.youtube.com/saturninefilms \r\n\r\nEven Obama has taken a stand against freedom on this issue: http://www.huffingtonpost.com/2010/09/09/obama-gma-interview-quran_n_710282.html',
|
||||||
}
|
}
|
||||||
@ -635,8 +692,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'description': r're:(?s).{100,}About the Game\n.*?The Witcher 3: Wild Hunt.{100,}',
|
'description': r're:(?s).{100,}About the Game\n.*?The Witcher 3: Wild Hunt.{100,}',
|
||||||
'duration': 142,
|
'duration': 142,
|
||||||
'uploader': 'The Witcher',
|
'uploader': 'The Witcher',
|
||||||
'uploader_id': 'WitcherGame',
|
'uploader_id': '@thewitcher',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/WitcherGame',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@thewitcher',
|
||||||
'upload_date': '20140605',
|
'upload_date': '20140605',
|
||||||
'thumbnail': 'https://i.ytimg.com/vi/HtVdAasjOgU/maxresdefault.jpg',
|
'thumbnail': 'https://i.ytimg.com/vi/HtVdAasjOgU/maxresdefault.jpg',
|
||||||
'age_limit': 18,
|
'age_limit': 18,
|
||||||
@ -659,7 +716,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'description': 'md5:bf77e03fcae5529475e500129b05668a',
|
'description': 'md5:bf77e03fcae5529475e500129b05668a',
|
||||||
'duration': 177,
|
'duration': 177,
|
||||||
'uploader': 'FlyingKitty',
|
'uploader': 'FlyingKitty',
|
||||||
'uploader_id': 'FlyingKitty900',
|
'uploader_id': '@FlyingKitty900',
|
||||||
'upload_date': '20200408',
|
'upload_date': '20200408',
|
||||||
'thumbnail': 'https://i.ytimg.com/vi/HsUATh_Nc2U/maxresdefault.jpg',
|
'thumbnail': 'https://i.ytimg.com/vi/HsUATh_Nc2U/maxresdefault.jpg',
|
||||||
'age_limit': 18,
|
'age_limit': 18,
|
||||||
@ -682,7 +739,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'description': 'md5:17eccca93a786d51bc67646756894066',
|
'description': 'md5:17eccca93a786d51bc67646756894066',
|
||||||
'duration': 106,
|
'duration': 106,
|
||||||
'uploader': 'Projekt Melody',
|
'uploader': 'Projekt Melody',
|
||||||
'uploader_id': 'UC1yoRdFoFJaCY-AGfD9W0wQ',
|
'uploader_id': '@ProjektMelody',
|
||||||
'upload_date': '20191227',
|
'upload_date': '20191227',
|
||||||
'age_limit': 18,
|
'age_limit': 18,
|
||||||
'thumbnail': 'https://i.ytimg.com/vi/Tq92D6wQ1mg/sddefault.jpg',
|
'thumbnail': 'https://i.ytimg.com/vi/Tq92D6wQ1mg/sddefault.jpg',
|
||||||
@ -704,10 +761,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'title': 'OOMPH! - Such Mich Find Mich (Lyrics)',
|
'title': 'OOMPH! - Such Mich Find Mich (Lyrics)',
|
||||||
'description': 'Fan Video. Music & Lyrics by OOMPH!.',
|
'description': 'Fan Video. Music & Lyrics by OOMPH!.',
|
||||||
'duration': 210,
|
'duration': 210,
|
||||||
'uploader': 'Herr Lurik',
|
|
||||||
'uploader_id': 'st3in234',
|
|
||||||
'upload_date': '20130730',
|
'upload_date': '20130730',
|
||||||
'uploader_url': 'http://www.youtube.com/user/st3in234',
|
'uploader': 'Herr Lurik',
|
||||||
|
'uploader_id': '@HerrLurik',
|
||||||
|
'uploader_url': 'http://www.youtube.com/@HerrLurik',
|
||||||
'age_limit': 0,
|
'age_limit': 0,
|
||||||
'thumbnail': 'https://i.ytimg.com/vi/MeJVWBSsPAY/hqdefault.jpg',
|
'thumbnail': 'https://i.ytimg.com/vi/MeJVWBSsPAY/hqdefault.jpg',
|
||||||
'tags': ['oomph', 'such mich find mich', 'lyrics', 'german industrial', 'musica industrial'],
|
'tags': ['oomph', 'such mich find mich', 'lyrics', 'german industrial', 'musica industrial'],
|
||||||
@ -740,8 +797,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'duration': 266,
|
'duration': 266,
|
||||||
'upload_date': '20100430',
|
'upload_date': '20100430',
|
||||||
'uploader_id': 'deadmau5',
|
'uploader_id': '@deadmau5',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/deadmau5',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@deadmau5',
|
||||||
'creator': 'deadmau5',
|
'creator': 'deadmau5',
|
||||||
'description': 'md5:6cbcd3a92ce1bc676fc4d6ab4ace2336',
|
'description': 'md5:6cbcd3a92ce1bc676fc4d6ab4ace2336',
|
||||||
'uploader': 'deadmau5',
|
'uploader': 'deadmau5',
|
||||||
@ -762,8 +819,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'description': r're:(?s)(?:.+\s)?HO09 - Women - GER-AUS - Hockey - 31 July 2012 - London 2012 Olympic Games\s*',
|
'description': r're:(?s)(?:.+\s)?HO09 - Women - GER-AUS - Hockey - 31 July 2012 - London 2012 Olympic Games\s*',
|
||||||
'duration': 6085,
|
'duration': 6085,
|
||||||
'upload_date': '20150827',
|
'upload_date': '20150827',
|
||||||
'uploader_id': 'olympic',
|
'uploader_id': '@Olympics',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/olympic',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@Olympics',
|
||||||
'uploader': r're:Olympics?',
|
'uploader': r're:Olympics?',
|
||||||
'age_limit': 0,
|
'age_limit': 0,
|
||||||
'thumbnail': 'https://i.ytimg.com/vi/lqQg6PlCWgI/maxresdefault.jpg',
|
'thumbnail': 'https://i.ytimg.com/vi/lqQg6PlCWgI/maxresdefault.jpg',
|
||||||
@ -785,8 +842,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'stretched_ratio': 16 / 9.,
|
'stretched_ratio': 16 / 9.,
|
||||||
'duration': 85,
|
'duration': 85,
|
||||||
'upload_date': '20110310',
|
'upload_date': '20110310',
|
||||||
'uploader_id': 'AllenMeow',
|
'uploader_id': '@AllenMeow',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/AllenMeow',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@AllenMeow',
|
||||||
'description': 'made by Wacom from Korea | 字幕&加油添醋 by TY\'s Allen | 感謝heylisa00cavey1001同學熱情提供梗及翻譯',
|
'description': 'made by Wacom from Korea | 字幕&加油添醋 by TY\'s Allen | 感謝heylisa00cavey1001同學熱情提供梗及翻譯',
|
||||||
'uploader': '孫ᄋᄅ',
|
'uploader': '孫ᄋᄅ',
|
||||||
'title': '[A-made] 變態妍字幕版 太妍 我就是這樣的人',
|
'title': '[A-made] 變態妍字幕版 太妍 我就是這樣的人',
|
||||||
@ -824,7 +881,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'uploader': 'dorappi2000',
|
'uploader': 'dorappi2000',
|
||||||
'formats': 'mincount:31',
|
'formats': 'mincount:31',
|
||||||
},
|
},
|
||||||
'skip': 'not actual anymore',
|
'skip': 'not actual any more',
|
||||||
},
|
},
|
||||||
# DASH manifest with segment_list
|
# DASH manifest with segment_list
|
||||||
{
|
{
|
||||||
@ -905,6 +962,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
|
'skip': 'Not multifeed any more',
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
# Multifeed video with comma in title (see https://github.com/ytdl-org/youtube-dl/issues/8536)
|
# Multifeed video with comma in title (see https://github.com/ytdl-org/youtube-dl/issues/8536)
|
||||||
@ -914,7 +972,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'title': 'DevConf.cz 2016 Day 2 Workshops 1 14:00 - 15:30',
|
'title': 'DevConf.cz 2016 Day 2 Workshops 1 14:00 - 15:30',
|
||||||
},
|
},
|
||||||
'playlist_count': 2,
|
'playlist_count': 2,
|
||||||
'skip': 'Not multifeed anymore',
|
'skip': 'Not multifeed any more',
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
'url': 'https://vid.plus/FlRa-iH7PGw',
|
'url': 'https://vid.plus/FlRa-iH7PGw',
|
||||||
@ -938,8 +996,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'description': 'md5:8085699c11dc3f597ce0410b0dcbb34a',
|
'description': 'md5:8085699c11dc3f597ce0410b0dcbb34a',
|
||||||
'duration': 133,
|
'duration': 133,
|
||||||
'upload_date': '20151119',
|
'upload_date': '20151119',
|
||||||
'uploader_id': 'IronSoulElf',
|
'uploader_id': '@IronSoulElf',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/IronSoulElf',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@IronSoulElf',
|
||||||
'uploader': 'IronSoulElf',
|
'uploader': 'IronSoulElf',
|
||||||
'creator': r're:Todd Haberman[;,]\s+Daniel Law Heath and Aaron Kaplan',
|
'creator': r're:Todd Haberman[;,]\s+Daniel Law Heath and Aaron Kaplan',
|
||||||
'track': 'Dark Walk',
|
'track': 'Dark Walk',
|
||||||
@ -987,8 +1045,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'description': 'md5:a677553cf0840649b731a3024aeff4cc',
|
'description': 'md5:a677553cf0840649b731a3024aeff4cc',
|
||||||
'duration': 721,
|
'duration': 721,
|
||||||
'upload_date': '20150127',
|
'upload_date': '20150127',
|
||||||
'uploader_id': 'BerkmanCenter',
|
'uploader_id': '@BKCHarvard',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/BerkmanCenter',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@BKCHarvard',
|
||||||
'uploader': 'The Berkman Klein Center for Internet & Society',
|
'uploader': 'The Berkman Klein Center for Internet & Society',
|
||||||
'license': 'Creative Commons Attribution license (reuse allowed)',
|
'license': 'Creative Commons Attribution license (reuse allowed)',
|
||||||
},
|
},
|
||||||
@ -1007,8 +1065,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'duration': 4060,
|
'duration': 4060,
|
||||||
'upload_date': '20151119',
|
'upload_date': '20151119',
|
||||||
'uploader': 'Bernie Sanders',
|
'uploader': 'Bernie Sanders',
|
||||||
'uploader_id': 'UCH1dpzjCEiGAt8CXkryhkZg',
|
'uploader_id': '@BernieSanders',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/channel/UCH1dpzjCEiGAt8CXkryhkZg',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@BernieSanders',
|
||||||
'license': 'Creative Commons Attribution license (reuse allowed)',
|
'license': 'Creative Commons Attribution license (reuse allowed)',
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
@ -1054,8 +1112,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'duration': 2085,
|
'duration': 2085,
|
||||||
'upload_date': '20170118',
|
'upload_date': '20170118',
|
||||||
'uploader': 'Vsauce',
|
'uploader': 'Vsauce',
|
||||||
'uploader_id': 'Vsauce',
|
'uploader_id': '@Vsauce',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/Vsauce',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@Vsauce',
|
||||||
'series': 'Mind Field',
|
'series': 'Mind Field',
|
||||||
'season_number': 1,
|
'season_number': 1,
|
||||||
'episode_number': 1,
|
'episode_number': 1,
|
||||||
@ -1134,7 +1192,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
'youtube_include_dash_manifest': False,
|
'youtube_include_dash_manifest': False,
|
||||||
},
|
},
|
||||||
'skip': 'not actual anymore',
|
'skip': 'not actual any more',
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
# Youtube Music Auto-generated description
|
# Youtube Music Auto-generated description
|
||||||
@ -1191,8 +1249,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'title': 'IMG 3456',
|
'title': 'IMG 3456',
|
||||||
'description': '',
|
'description': '',
|
||||||
'upload_date': '20170613',
|
'upload_date': '20170613',
|
||||||
'uploader_id': 'ElevageOrVert',
|
|
||||||
'uploader': 'ElevageOrVert',
|
'uploader': 'ElevageOrVert',
|
||||||
|
'uploader_id': '@ElevageOrVert',
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
@ -1210,8 +1268,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'title': 'Part 77 Sort a list of simple types in c#',
|
'title': 'Part 77 Sort a list of simple types in c#',
|
||||||
'description': 'md5:b8746fa52e10cdbf47997903f13b20dc',
|
'description': 'md5:b8746fa52e10cdbf47997903f13b20dc',
|
||||||
'upload_date': '20130831',
|
'upload_date': '20130831',
|
||||||
'uploader_id': 'kudvenkat',
|
|
||||||
'uploader': 'kudvenkat',
|
'uploader': 'kudvenkat',
|
||||||
|
'uploader_id': '@Csharp-video-tutorialsBlogspot',
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
@ -1263,8 +1321,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'description': 'md5:ea770e474b7cd6722b4c95b833c03630',
|
'description': 'md5:ea770e474b7cd6722b4c95b833c03630',
|
||||||
'upload_date': '20201120',
|
'upload_date': '20201120',
|
||||||
'uploader': 'Walk around Japan',
|
'uploader': 'Walk around Japan',
|
||||||
'uploader_id': 'UC3o_t8PzBmXf5S9b7GLx1Mw',
|
'uploader_id': '@walkaroundjapan7124',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/channel/UC3o_t8PzBmXf5S9b7GLx1Mw',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@walkaroundjapan7124',
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
@ -1276,11 +1334,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '4L2J27mJ3Dc',
|
'id': '4L2J27mJ3Dc',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
|
'title': 'Midwest Squid Game #Shorts',
|
||||||
|
'description': 'md5:976512b8a29269b93bbd8a61edc45a6d',
|
||||||
'upload_date': '20211025',
|
'upload_date': '20211025',
|
||||||
'uploader': 'Charlie Berens',
|
'uploader': 'Charlie Berens',
|
||||||
'description': 'md5:976512b8a29269b93bbd8a61edc45a6d',
|
'uploader_id': '@CharlieBerens',
|
||||||
'uploader_id': 'fivedlrmilkshake',
|
|
||||||
'title': 'Midwest Squid Game #Shorts',
|
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
@ -1636,8 +1694,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
if n_response is None:
|
if n_response is None:
|
||||||
# give up if descrambling failed
|
# give up if descrambling failed
|
||||||
break
|
break
|
||||||
fmt['url'] = update_url(
|
for fmt_dct in traverse_obj(fmt, (None, (None, ('fragments', Ellipsis))), expected_type=dict):
|
||||||
parsed_fmt_url, query_update={'n': [n_response]})
|
fmt_dct['url'] = update_url(
|
||||||
|
fmt_dct['url'], query_update={'n': [n_response]})
|
||||||
|
|
||||||
# from yt-dlp, with tweaks
|
# from yt-dlp, with tweaks
|
||||||
def _extract_signature_timestamp(self, video_id, player_url, ytcfg=None, fatal=False):
|
def _extract_signature_timestamp(self, video_id, player_url, ytcfg=None, fatal=False):
|
||||||
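The new loop in this hunk applies the descrambled `n` value to the format's own URL and, as far as the `traverse_obj` path suggests, to every entry in its `fragments` list. The sketch below shows that idea with the standard library only; `set_query_param` and `update_n` are hypothetical names, not youtube-dl functions, and the URLs are placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def set_query_param(url, key, value):
    """Return url with the query parameter `key` set to `value`."""
    parts = urlparse(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    query[key] = [value]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


def update_n(fmt, n_response):
    """Rewrite 'n' on the format URL and on each fragment URL, in place."""
    for dct in [fmt] + list(fmt.get('fragments') or []):
        if isinstance(dct, dict) and dct.get('url'):
            dct['url'] = set_query_param(dct['url'], 'n', n_response)


fmt = {
    'url': 'https://example.com/videoplayback?n=scrambled&itag=137',
    'fragments': [
        {'url': 'https://example.com/videoplayback?n=scrambled&itag=137&range=0-9'},
        {'url': 'https://example.com/videoplayback?n=scrambled&itag=137&range=10-19'},
    ],
}
update_n(fmt, 'descrambled')
print(fmt['url'])
```

Updating the fragments as well matters once formats carry pre-built range URLs, since each fragment URL is requested on its own.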
@ -1989,10 +2048,19 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
if no_video:
|
if no_video:
|
||||||
dct['abr'] = tbr
|
dct['abr'] = tbr
|
||||||
if no_audio or no_video:
|
if no_audio or no_video:
|
||||||
dct['downloader_options'] = {
|
CHUNK_SIZE = 10 << 20
|
||||||
# Youtube throttles chunks >~10M
|
# avoid Youtube throttling
|
||||||
'http_chunk_size': 10485760,
|
dct.update({
|
||||||
}
|
'protocol': 'http_dash_segments',
|
||||||
|
'fragments': [{
|
||||||
|
'url': update_url_query(dct['url'], {
|
||||||
|
'range': '{0}-{1}'.format(range_start, min(range_start + CHUNK_SIZE - 1, dct['filesize']))
|
||||||
|
})
|
||||||
|
} for range_start in range(0, dct['filesize'], CHUNK_SIZE)]
|
||||||
|
} if dct['filesize'] else {
|
||||||
|
'downloader_options': {'http_chunk_size': CHUNK_SIZE} # No longer useful?
|
||||||
|
})
|
||||||
|
|
||||||
if dct.get('ext'):
|
if dct.get('ext'):
|
||||||
dct['container'] = dct['ext'] + '_dash'
|
dct['container'] = dct['ext'] + '_dash'
|
||||||
formats.append(dct)
|
formats.append(dct)
|
||||||
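The fragment list built in this hunk splits a file of known size into fixed-size byte ranges. A minimal sketch of the range arithmetic, mirroring the expression in the diff (including its cap of the final upper bound at `filesize`); the function name `range_strings` is hypothetical, and the small chunk size in the usage line is chosen only to keep the output short.

```python
CHUNK_SIZE = 10 << 20  # 10485760 bytes, as in the diff


def range_strings(filesize, chunk_size=CHUNK_SIZE):
    """Return 'start-end' strings covering filesize in chunk_size steps."""
    return [
        '{0}-{1}'.format(start, min(start + chunk_size - 1, filesize))
        for start in range(0, filesize, chunk_size)]


# Small numbers for readability: a 25-byte "file" in 10-byte chunks.
print(range_strings(25, chunk_size=10))
```

In the diff each such string becomes the `range` query parameter of one fragment URL, and the fallback to `http_chunk_size` is used when `filesize` is missing or zero, in which case no ranges can be computed.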
@ -2088,25 +2156,19 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
thumbnails = [{'url': thumbnail}]
|
thumbnails = [{'url': thumbnail}]
|
||||||
|
|
||||||
category = microformat.get('category') or search_meta('genre')
|
category = microformat.get('category') or search_meta('genre')
|
||||||
channel_id = video_details.get('channelId') \
|
channel_id = self._extract_channel_id(
|
||||||
or microformat.get('externalChannelId') \
|
webpage, videodetails=video_details, metadata=microformat)
|
||||||
or search_meta('channelId')
|
|
||||||
duration = int_or_none(
|
duration = int_or_none(
|
||||||
video_details.get('lengthSeconds')
|
video_details.get('lengthSeconds')
|
||||||
or microformat.get('lengthSeconds')) \
|
or microformat.get('lengthSeconds')) \
|
||||||
or parse_duration(search_meta('duration'))
|
or parse_duration(search_meta('duration'))
|
||||||
is_live = video_details.get('isLive')
|
is_live = video_details.get('isLive')
|
||||||
|
|
||||||
def gen_owner_profile_url():
|
owner_profile_url = self._yt_urljoin(self._extract_author_var(
|
||||||
yield microformat.get('ownerProfileUrl')
|
webpage, 'url', videodetails=video_details, metadata=microformat))
|
||||||
yield extract_attributes(self._search_regex(
|
|
||||||
r'''(?s)(<link\b[^>]+\bitemprop\s*=\s*("|')url\2[^>]*>)''',
|
|
||||||
get_element_by_attribute('itemprop', 'author', webpage),
|
|
||||||
'owner_profile_url', default='')).get('href')
|
|
||||||
|
|
||||||
owner_profile_url = next(
|
uploader = self._extract_author_var(
|
||||||
(x for x in map(url_or_none, gen_owner_profile_url()) if x),
|
webpage, 'name', videodetails=video_details, metadata=microformat)
|
||||||
None)
|
|
||||||
|
|
||||||
if not player_url:
|
if not player_url:
|
||||||
player_url = self._extract_player_url(webpage)
|
player_url = self._extract_player_url(webpage)
|
||||||
@ -2121,11 +2183,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
'upload_date': unified_strdate(
|
'upload_date': unified_strdate(
|
||||||
microformat.get('uploadDate')
|
microformat.get('uploadDate')
|
||||||
or search_meta('uploadDate')),
|
or search_meta('uploadDate')),
|
||||||
'uploader': video_details['author'],
|
'uploader': uploader,
|
||||||
'uploader_id': self._search_regex(r'/(?:channel|user)/([^/?&#]+)', owner_profile_url, 'uploader id') if owner_profile_url else None,
|
|
||||||
'uploader_url': owner_profile_url,
|
|
||||||
'channel_id': channel_id,
|
'channel_id': channel_id,
|
||||||
'channel_url': 'https://www.youtube.com/channel/' + channel_id if channel_id else None,
|
|
||||||
'duration': duration,
|
'duration': duration,
|
||||||
'view_count': int_or_none(
|
'view_count': int_or_none(
|
||||||
video_details.get('viewCount')
|
video_details.get('viewCount')
|
||||||
@ -2255,6 +2314,13 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
initial_data,
|
initial_data,
|
||||||
lambda x: x['contents']['twoColumnWatchNextResults']['results']['results']['contents'],
|
lambda x: x['contents']['twoColumnWatchNextResults']['results']['results']['contents'],
|
||||||
list) or []
|
list) or []
|
||||||
|
if not info['channel_id']:
|
||||||
|
channel_id = self._extract_channel_id('', renderers=contents)
|
||||||
|
if not info['uploader']:
|
||||||
|
info['uploader'] = self._extract_author_var('', 'name', renderers=contents)
|
||||||
|
if not owner_profile_url:
|
||||||
|
owner_profile_url = self._yt_urljoin(self._extract_author_var('', 'url', renderers=contents))
|
||||||
|
|
||||||
for content in contents:
|
for content in contents:
|
||||||
vpir = content.get('videoPrimaryInfoRenderer')
|
vpir = content.get('videoPrimaryInfoRenderer')
|
||||||
if vpir:
|
if vpir:
|
||||||
@ -2302,10 +2368,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
})
|
})
|
||||||
vsir = content.get('videoSecondaryInfoRenderer')
|
vsir = content.get('videoSecondaryInfoRenderer')
|
||||||
if vsir:
|
if vsir:
|
||||||
info['channel'] = get_text(try_get(
|
|
||||||
vsir,
|
|
||||||
lambda x: x['owner']['videoOwnerRenderer']['title'],
|
|
||||||
dict))
|
|
||||||
rows = try_get(
|
rows = try_get(
|
||||||
vsir,
|
vsir,
|
||||||
lambda x: x['metadataRowContainer']['metadataRowContainerRenderer']['rows'],
|
lambda x: x['metadataRowContainer']['metadataRowContainerRenderer']['rows'],
|
||||||
@ -2363,7 +2425,14 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
|
|
||||||
self.mark_watched(video_id, player_response)
|
self.mark_watched(video_id, player_response)
|
||||||
|
|
||||||
return info
|
return merge_dicts(
|
||||||
|
info, {
|
||||||
|
'uploader_id': self._extract_uploader_id(owner_profile_url),
|
||||||
|
'uploader_url': owner_profile_url,
|
||||||
|
'channel_id': channel_id,
|
||||||
|
'channel_url': channel_id and self._yt_urljoin('/channel/' + channel_id),
|
||||||
|
'channel': info['uploader'],
|
||||||
|
})
|
||||||
|
|
||||||
|
|
||||||
class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
||||||
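The hunk above returns `merge_dicts(info, {...})` instead of `info`. The sketch below shows the precedence this relies on, assuming the utility keeps the first non-`None` value seen for each key. `merge_first_wins` is a simplified, hypothetical stand-in and does not reproduce any special handling the real `merge_dicts` may have for empty strings.

```python
def merge_first_wins(*dicts):
    """Merge dicts left to right; earlier non-None values take priority.

    Hypothetical stand-in for illustration only.
    """
    merged = {}
    for d in dicts:
        for key, value in d.items():
            if value is None:
                continue  # a None never overrides or fills a slot
            if key not in merged:
                merged[key] = value
    return merged


if __name__ == '__main__':
    info = {'uploader': 'Sky News', 'channel_id': None}
    extra = {'uploader': 'ignored', 'channel_id': 'UCxyz', 'channel': 'Sky News'}
    print(merge_first_wins(info, extra))
```

Under that assumption, values already present in `info` are kept and the second dict only fills in keys that are missing or `None`.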
@ -2392,6 +2461,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'description': 'Short clips from Super Cooper Sundays!',
|
'description': 'Short clips from Super Cooper Sundays!',
|
||||||
'id': 'UCKMA8kHZ8bPYpnMNaUSxfEQ',
|
'id': 'UCKMA8kHZ8bPYpnMNaUSxfEQ',
|
||||||
'title': 'Super Cooper Shorts - Shorts',
|
'title': 'Super Cooper Shorts - Shorts',
|
||||||
|
'uploader': 'Super Cooper Shorts',
|
||||||
|
'uploader_id': '@SuperCooperShorts',
|
||||||
}
|
}
|
||||||
}, {
|
}, {
|
||||||
# Channel that does not have a Shorts tab. Test should just download videos on Home tab instead
|
# Channel that does not have a Shorts tab. Test should just download videos on Home tab instead
|
||||||
@ -2402,14 +2473,17 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'title': 'Emergency Awesome - Home',
|
'title': 'Emergency Awesome - Home',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 5,
|
'playlist_mincount': 5,
|
||||||
|
'skip': 'new test page needed to replace `Emergency Awesome - Shorts`',
|
||||||
}, {
|
}, {
|
||||||
# playlists, multipage
|
# playlists, multipage
|
||||||
'url': 'https://www.youtube.com/c/ИгорьКлейнер/playlists?view=1&flow=grid',
|
'url': 'https://www.youtube.com/c/ИгорьКлейнер/playlists?view=1&flow=grid',
|
||||||
'playlist_mincount': 94,
|
'playlist_mincount': 94,
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
'id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
||||||
'title': 'Игорь Клейнер - Playlists',
|
'title': 'Igor Kleiner - Playlists',
|
||||||
'description': 'md5:be97ee0f14ee314f1f002cf187166ee2',
|
'description': 'md5:be97ee0f14ee314f1f002cf187166ee2',
|
||||||
|
'uploader': 'Igor Kleiner',
|
||||||
|
'uploader_id': '@IgorDataScience',
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
# playlists, multipage, different order
|
# playlists, multipage, different order
|
||||||
@ -2417,8 +2491,10 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'playlist_mincount': 94,
|
'playlist_mincount': 94,
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
'id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
||||||
'title': 'Игорь Клейнер - Playlists',
|
'title': 'Igor Kleiner - Playlists',
|
||||||
'description': 'md5:be97ee0f14ee314f1f002cf187166ee2',
|
'description': 'md5:be97ee0f14ee314f1f002cf187166ee2',
|
||||||
|
'uploader': 'Igor Kleiner',
|
||||||
|
'uploader_id': '@IgorDataScience',
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
# playlists, series
|
# playlists, series
|
||||||
@ -2428,6 +2504,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'id': 'UCYO_jab_esuFRV4b17AJtAw',
|
'id': 'UCYO_jab_esuFRV4b17AJtAw',
|
||||||
'title': '3Blue1Brown - Playlists',
|
'title': '3Blue1Brown - Playlists',
|
||||||
'description': 'md5:e1384e8a133307dd10edee76e875d62f',
|
'description': 'md5:e1384e8a133307dd10edee76e875d62f',
|
||||||
|
'uploader': '3Blue1Brown',
|
||||||
|
'uploader_id': '@3blue1brown',
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
# playlists, singlepage
|
# playlists, singlepage
|
||||||
@ -2437,6 +2515,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'id': 'UCAEtajcuhQ6an9WEzY9LEMQ',
|
'id': 'UCAEtajcuhQ6an9WEzY9LEMQ',
|
||||||
'title': 'ThirstForScience - Playlists',
|
'title': 'ThirstForScience - Playlists',
|
||||||
'description': 'md5:609399d937ea957b0f53cbffb747a14c',
|
'description': 'md5:609399d937ea957b0f53cbffb747a14c',
|
||||||
|
'uploader': 'ThirstForScience',
|
||||||
|
'uploader_id': '@ThirstForScience',
|
||||||
}
|
}
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.youtube.com/c/ChristophLaimer/playlists',
|
'url': 'https://www.youtube.com/c/ChristophLaimer/playlists',
|
||||||
@ -2445,20 +2525,22 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
# basic, single video playlist
|
# basic, single video playlist
|
||||||
'url': 'https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc',
|
'url': 'https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'uploader_id': 'UCmlqkdCBesrv2Lak1mF_MxA',
|
|
||||||
'uploader': 'Sergey M.',
|
|
||||||
'id': 'PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc',
|
'id': 'PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc',
|
||||||
'title': 'youtube-dl public playlist',
|
'title': 'youtube-dl public playlist',
|
||||||
|
'uploader': 'Sergey M.',
|
||||||
|
'uploader_id': '@sergeym.6173',
|
||||||
|
'channel_id': 'UCmlqkdCBesrv2Lak1mF_MxA',
|
||||||
},
|
},
|
||||||
'playlist_count': 1,
|
'playlist_count': 1,
|
||||||
}, {
|
}, {
|
||||||
# empty playlist
|
# empty playlist
|
||||||
'url': 'https://www.youtube.com/playlist?list=PL4lCao7KL_QFodcLWhDpGCYnngnHtQ-Xf',
|
'url': 'https://www.youtube.com/playlist?list=PL4lCao7KL_QFodcLWhDpGCYnngnHtQ-Xf',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'uploader_id': 'UCmlqkdCBesrv2Lak1mF_MxA',
|
|
||||||
'uploader': 'Sergey M.',
|
|
||||||
'id': 'PL4lCao7KL_QFodcLWhDpGCYnngnHtQ-Xf',
|
'id': 'PL4lCao7KL_QFodcLWhDpGCYnngnHtQ-Xf',
|
||||||
'title': 'youtube-dl empty playlist',
|
'title': 'youtube-dl empty playlist',
|
||||||
|
'uploader': 'Sergey M.',
|
||||||
|
'uploader_id': '@sergeym.6173',
|
||||||
|
'channel_id': 'UCmlqkdCBesrv2Lak1mF_MxA',
|
||||||
},
|
},
|
||||||
'playlist_count': 0,
|
'playlist_count': 0,
|
||||||
}, {
|
}, {
|
||||||
@ -2468,6 +2550,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
'title': 'lex will - Home',
|
'title': 'lex will - Home',
|
||||||
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
||||||
|
'uploader': 'lex will',
|
||||||
|
'uploader_id': '@lexwill718',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 2,
|
'playlist_mincount': 2,
|
||||||
}, {
|
}, {
|
||||||
@ -2477,6 +2561,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
'title': 'lex will - Videos',
|
'title': 'lex will - Videos',
|
||||||
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
||||||
|
'uploader': 'lex will',
|
||||||
|
'uploader_id': '@lexwill718',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 975,
|
'playlist_mincount': 975,
|
||||||
}, {
|
}, {
|
||||||
@ -2486,6 +2572,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
'title': 'lex will - Videos',
|
'title': 'lex will - Videos',
|
||||||
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
||||||
|
'uploader': 'lex will',
|
||||||
|
'uploader_id': '@lexwill718',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 199,
|
'playlist_mincount': 199,
|
||||||
}, {
|
}, {
|
||||||
@ -2495,6 +2583,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
'title': 'lex will - Playlists',
|
'title': 'lex will - Playlists',
|
||||||
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
||||||
|
'uploader': 'lex will',
|
||||||
|
'uploader_id': '@lexwill718',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 17,
|
'playlist_mincount': 17,
|
||||||
}, {
|
}, {
|
||||||
@ -2504,6 +2594,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
'title': 'lex will - Community',
|
'title': 'lex will - Community',
|
||||||
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
||||||
|
'uploader': 'lex will',
|
||||||
|
'uploader_id': '@lexwill718',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 18,
|
'playlist_mincount': 18,
|
||||||
}, {
|
}, {
|
||||||
@ -2513,8 +2605,10 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
'title': 'lex will - Channels',
|
'title': 'lex will - Channels',
|
||||||
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
||||||
|
'uploader': 'lex will',
|
||||||
|
'uploader_id': '@lexwill718',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 138,
|
'playlist_mincount': 75,
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://invidio.us/channel/UCmlqkdCBesrv2Lak1mF_MxA',
|
'url': 'https://invidio.us/channel/UCmlqkdCBesrv2Lak1mF_MxA',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -2531,7 +2625,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'title': '29C3: Not my department',
|
'title': '29C3: Not my department',
|
||||||
'id': 'PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
|
'id': 'PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
|
||||||
'uploader': 'Christiaan008',
|
'uploader': 'Christiaan008',
|
||||||
'uploader_id': 'UCEPzS1rYsrkqzSLNp76nrcg',
|
'uploader_id': '@ChRiStIaAn008',
|
||||||
|
'channel_id': 'UCEPzS1rYsrkqzSLNp76nrcg',
|
||||||
},
|
},
|
||||||
'playlist_count': 96,
|
'playlist_count': 96,
|
||||||
}, {
|
}, {
|
||||||
@ -2541,7 +2636,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'title': 'Uploads from Cauchemar',
|
'title': 'Uploads from Cauchemar',
|
||||||
'id': 'UUBABnxM4Ar9ten8Mdjj1j0Q',
|
'id': 'UUBABnxM4Ar9ten8Mdjj1j0Q',
|
||||||
'uploader': 'Cauchemar',
|
'uploader': 'Cauchemar',
|
||||||
'uploader_id': 'UCBABnxM4Ar9ten8Mdjj1j0Q',
|
'uploader_id': '@Cauchemar89',
|
||||||
|
'channel_id': 'UCBABnxM4Ar9ten8Mdjj1j0Q',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 1123,
|
'playlist_mincount': 1123,
|
||||||
}, {
|
}, {
|
||||||
@ -2555,7 +2651,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'title': 'Uploads from Interstellar Movie',
|
'title': 'Uploads from Interstellar Movie',
|
||||||
'id': 'UUXw-G3eDE9trcvY2sBMM_aA',
|
'id': 'UUXw-G3eDE9trcvY2sBMM_aA',
|
||||||
'uploader': 'Interstellar Movie',
|
'uploader': 'Interstellar Movie',
|
||||||
'uploader_id': 'UCXw-G3eDE9trcvY2sBMM_aA',
|
'uploader_id': '@InterstellarMovie',
|
||||||
|
'channel_id': 'UCXw-G3eDE9trcvY2sBMM_aA',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 21,
|
'playlist_mincount': 21,
|
||||||
}, {
|
}, {
|
||||||
@ -2564,8 +2661,9 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
'info_dict': {
|
'info_dict': {
|
||||||
'title': 'Data Analysis with Dr Mike Pound',
|
'title': 'Data Analysis with Dr Mike Pound',
|
||||||
'id': 'PLzH6n4zXuckpfMu_4Ff8E7Z1behQks5ba',
|
'id': 'PLzH6n4zXuckpfMu_4Ff8E7Z1behQks5ba',
|
||||||
'uploader_id': 'UC9-y-6csu5WGm29I7JiwpnA',
|
|
||||||
'uploader': 'Computerphile',
|
'uploader': 'Computerphile',
|
||||||
|
'uploader_id': '@Computerphile',
|
||||||
|
'channel_id': 'UC9-y-6csu5WGm29I7JiwpnA',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 11,
|
'playlist_mincount': 11,
|
||||||
}, {
|
}, {
|
||||||
@ -2603,14 +2701,14 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.youtube.com/channel/UCoMdktPbSTixAyNGwb-UYkQ/live',
|
'url': 'https://www.youtube.com/channel/UCoMdktPbSTixAyNGwb-UYkQ/live',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '9Auq9mYxFEE',
|
'id': r're:[\da-zA-Z_-]{8,}',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Watch Sky News live',
|
'title': r're:(?s)[A-Z].{20,}',
|
||||||
'uploader': 'Sky News',
|
'uploader': 'Sky News',
|
||||||
'uploader_id': 'skynews',
|
'uploader_id': '@SkyNews',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/skynews',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@SkyNews',
|
||||||
'upload_date': '20191102',
|
'upload_date': r're:\d{8}',
|
||||||
'description': 'md5:78de4e1c2359d0ea3ed829678e38b662',
|
'description': r're:(?s)(?:.*\n)+SUBSCRIBE to our YouTube channel for more videos: http://www\.youtube\.com/skynews *\n.*',
|
||||||
'categories': ['News & Politics'],
|
'categories': ['News & Politics'],
|
||||||
'tags': list,
|
'tags': list,
|
||||||
'like_count': int,
|
'like_count': int,
|
||||||
@ -2699,34 +2797,22 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'note': 'Search tab',
|
'note': 'Search tab',
|
||||||
'url': 'https://www.youtube.com/c/3blue1brown/search?query=linear%20algebra',
|
'url': 'https://www.youtube.com/c/3blue1brown/search?query=linear%20algebra',
|
||||||
'playlist_mincount': 40,
|
'playlist_mincount': 20,
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCYO_jab_esuFRV4b17AJtAw',
|
'id': 'UCYO_jab_esuFRV4b17AJtAw',
|
||||||
'title': '3Blue1Brown - Search - linear algebra',
|
'title': '3Blue1Brown - Search - linear algebra',
|
||||||
'description': 'md5:e1384e8a133307dd10edee76e875d62f',
|
'description': 'md5:e1384e8a133307dd10edee76e875d62f',
|
||||||
'uploader': '3Blue1Brown',
|
'uploader': '3Blue1Brown',
|
||||||
'uploader_id': 'UCYO_jab_esuFRV4b17AJtAw',
|
'uploader_id': '@3blue1brown',
|
||||||
|
'channel_id': 'UCYO_jab_esuFRV4b17AJtAw',
|
||||||
}
|
}
|
||||||
}]
|
}]
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def suitable(cls, url):
|
def suitable(cls, url):
|
||||||
return False if YoutubeIE.suitable(url) else super(
|
return not YoutubeIE.suitable(url) and super(
|
||||||
YoutubeTabIE, cls).suitable(url)
|
YoutubeTabIE, cls).suitable(url)
|
||||||
|
|
||||||
def _extract_channel_id(self, webpage):
|
|
||||||
channel_id = self._html_search_meta(
|
|
||||||
'channelId', webpage, 'channel id', default=None)
|
|
||||||
if channel_id:
|
|
||||||
return channel_id
|
|
||||||
channel_url = self._html_search_meta(
|
|
||||||
('og:url', 'al:ios:url', 'al:android:url', 'al:web:url',
|
|
||||||
'twitter:url', 'twitter:app:url:iphone', 'twitter:app:url:ipad',
|
|
||||||
'twitter:app:url:googleplay'), webpage, 'channel url')
|
|
||||||
return self._search_regex(
|
|
||||||
r'https?://(?:www\.)?youtube\.com/channel/([^/?#&])+',
|
|
||||||
channel_url, 'channel id')
|
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def _extract_grid_item_renderer(item):
|
def _extract_grid_item_renderer(item):
|
||||||
assert isinstance(item, dict)
|
assert isinstance(item, dict)
|
||||||
@ -3114,27 +3200,18 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
else:
|
else:
|
||||||
raise ExtractorError('Unable to find selected tab')
|
raise ExtractorError('Unable to find selected tab')
|
||||||
|
|
||||||
@staticmethod
|
def _extract_uploader(self, metadata, data):
|
||||||
def _extract_uploader(data):
|
|
||||||
uploader = {}
|
uploader = {}
|
||||||
sidebar_renderer = try_get(
|
renderers = traverse_obj(data,
|
||||||
data, lambda x: x['sidebar']['playlistSidebarRenderer']['items'], list)
|
('sidebar', 'playlistSidebarRenderer', 'items'))
|
||||||
if sidebar_renderer:
|
uploader['channel_id'] = self._extract_channel_id('', metadata=metadata, renderers=renderers)
|
||||||
for item in sidebar_renderer:
|
uploader['uploader'] = (
|
||||||
if not isinstance(item, dict):
|
self._extract_author_var('', 'name', renderers=renderers)
|
||||||
continue
|
or self._extract_author_var('', 'name', metadata=metadata))
|
||||||
renderer = item.get('playlistSidebarSecondaryInfoRenderer')
|
uploader['uploader_url'] = self._yt_urljoin(
|
||||||
if not isinstance(renderer, dict):
|
self._extract_author_var('', 'url', metadata=metadata, renderers=renderers))
|
||||||
continue
|
uploader['uploader_id'] = self._extract_uploader_id(uploader['uploader_url'])
|
||||||
owner = try_get(
|
uploader['channel'] = uploader['uploader']
|
||||||
renderer, lambda x: x['videoOwner']['videoOwnerRenderer']['title']['runs'][0], dict)
|
|
||||||
if owner:
|
|
||||||
uploader['uploader'] = owner.get('text')
|
|
||||||
uploader['uploader_id'] = try_get(
|
|
||||||
owner, lambda x: x['navigationEndpoint']['browseEndpoint']['browseId'], compat_str)
|
|
||||||
uploader['uploader_url'] = urljoin(
|
|
||||||
'https://www.youtube.com/',
|
|
||||||
try_get(owner, lambda x: x['navigationEndpoint']['browseEndpoint']['canonicalBaseUrl'], compat_str))
|
|
||||||
return uploader
|
return uploader
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
@ -3185,8 +3262,7 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
|
|||||||
self._entries(selected_tab, item_id, webpage),
|
self._entries(selected_tab, item_id, webpage),
|
||||||
playlist_id=playlist_id, playlist_title=title,
|
playlist_id=playlist_id, playlist_title=title,
|
||||||
playlist_description=description)
|
playlist_description=description)
|
||||||
playlist.update(self._extract_uploader(data))
|
return merge_dicts(playlist, self._extract_uploader(renderer, data))
|
||||||
return playlist
|
|
||||||
|
|
||||||
def _extract_from_playlist(self, item_id, url, data, playlist):
|
def _extract_from_playlist(self, item_id, url, data, playlist):
|
||||||
title = playlist.get('title') or try_get(
|
title = playlist.get('title') or try_get(
|
||||||
@ -3273,8 +3349,9 @@ class YoutubePlaylistIE(InfoExtractor):
|
|||||||
'info_dict': {
|
'info_dict': {
|
||||||
'title': '[OLD]Team Fortress 2 (Class-based LP)',
|
'title': '[OLD]Team Fortress 2 (Class-based LP)',
|
||||||
'id': 'PLBB231211A4F62143',
|
'id': 'PLBB231211A4F62143',
|
||||||
'uploader': 'Wickydoo',
|
'uploader': 'Wickman',
|
||||||
'uploader_id': 'UCKSpbfbl5kRQpTdL7kMc-1Q',
|
'uploader_id': '@WickmanVT',
|
||||||
|
'channel_id': 'UCKSpbfbl5kRQpTdL7kMc-1Q',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 29,
|
'playlist_mincount': 29,
|
||||||
}, {
|
}, {
|
||||||
@ -3288,21 +3365,25 @@ class YoutubePlaylistIE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'note': 'embedded',
|
'note': 'embedded',
|
||||||
'url': 'https://www.youtube.com/embed/videoseries?list=PL6IaIsEjSbf96XFRuNccS_RuEXwNdsoEu',
|
'url': 'https://www.youtube.com/embed/videoseries?list=PL6IaIsEjSbf96XFRuNccS_RuEXwNdsoEu',
|
||||||
'playlist_count': 4,
|
# TODO: full playlist requires _reload_with_unavailable_videos()
|
||||||
|
# 'playlist_count': 4,
|
||||||
|
'playlist_mincount': 1,
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'title': 'JODA15',
|
'title': 'JODA15',
|
||||||
'id': 'PL6IaIsEjSbf96XFRuNccS_RuEXwNdsoEu',
|
'id': 'PL6IaIsEjSbf96XFRuNccS_RuEXwNdsoEu',
|
||||||
'uploader': 'milan',
|
'uploader': 'milan',
|
||||||
'uploader_id': 'UCEI1-PVPcYXjB73Hfelbmaw',
|
'uploader_id': '@milan5503',
|
||||||
|
'channel_id': 'UCEI1-PVPcYXjB73Hfelbmaw',
|
||||||
}
|
}
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.youtube.com/embed/_xDOZElKyNU?list=PLsyOSbh5bs16vubvKePAQ1x3PhKavfBIl',
|
'url': 'http://www.youtube.com/embed/_xDOZElKyNU?list=PLsyOSbh5bs16vubvKePAQ1x3PhKavfBIl',
|
||||||
'playlist_mincount': 982,
|
'playlist_mincount': 455,
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'title': '2018 Chinese New Singles (11/6 updated)',
|
'title': '2018 Chinese New Singles (11/6 updated)',
|
||||||
'id': 'PLsyOSbh5bs16vubvKePAQ1x3PhKavfBIl',
|
'id': 'PLsyOSbh5bs16vubvKePAQ1x3PhKavfBIl',
|
||||||
'uploader': 'LBK',
|
'uploader': 'LBK',
|
||||||
'uploader_id': 'UC21nz3_MesPLqtDqwdvnoxA',
|
'uploader_id': '@music_king',
|
||||||
|
'channel_id': 'UC21nz3_MesPLqtDqwdvnoxA',
|
||||||
}
|
}
|
||||||
}, {
|
}, {
|
||||||
'url': 'TLGGrESM50VT6acwMjAyMjAxNw',
|
'url': 'TLGGrESM50VT6acwMjAyMjAxNw',
|
||||||
@ -3340,8 +3421,8 @@ class YoutubeYtBeIE(InfoExtractor):
|
|||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Small Scale Baler and Braiding Rugs',
|
'title': 'Small Scale Baler and Braiding Rugs',
|
||||||
'uploader': 'Backus-Page House Museum',
|
'uploader': 'Backus-Page House Museum',
|
||||||
'uploader_id': 'backuspagemuseum',
|
'uploader_id': '@backuspagemuseum',
|
||||||
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/backuspagemuseum',
|
'uploader_url': r're:https?://(?:www\.)?youtube\.com/@backuspagemuseum',
|
||||||
'upload_date': '20161008',
|
'upload_date': '20161008',
|
||||||
'description': 'md5:800c0c78d5eb128500bffd4f0b4f2e8a',
|
'description': 'md5:800c0c78d5eb128500bffd4f0b4f2e8a',
|
||||||
'categories': ['Nonprofits & Activism'],
|
'categories': ['Nonprofits & Activism'],
|
||||||
|
@ -12,9 +12,11 @@ from .utils import (
|
|||||||
js_to_json,
|
js_to_json,
|
||||||
remove_quotes,
|
remove_quotes,
|
||||||
unified_timestamp,
|
unified_timestamp,
|
||||||
|
variadic,
|
||||||
)
|
)
|
||||||
from .compat import (
|
from .compat import (
|
||||||
compat_basestring,
|
compat_basestring,
|
||||||
|
compat_chr,
|
||||||
compat_collections_chain_map as ChainMap,
|
compat_collections_chain_map as ChainMap,
|
||||||
compat_itertools_zip_longest as zip_longest,
|
compat_itertools_zip_longest as zip_longest,
|
||||||
compat_str,
|
compat_str,
|
||||||
@ -201,14 +203,14 @@ class JSInterpreter(object):
|
|||||||
def __init__(self, msg, *args, **kwargs):
|
def __init__(self, msg, *args, **kwargs):
|
||||||
expr = kwargs.pop('expr', None)
|
expr = kwargs.pop('expr', None)
|
||||||
if expr is not None:
|
if expr is not None:
|
||||||
msg = '{0} in: {1!r}'.format(msg.rstrip(), expr[:100])
|
msg = '{0} in: {1!r:.100}'.format(msg.rstrip(), expr)
|
||||||
super(JSInterpreter.Exception, self).__init__(msg, *args, **kwargs)
|
super(JSInterpreter.Exception, self).__init__(msg, *args, **kwargs)
|
||||||
|
|
||||||
class JS_RegExp(object):
|
class JS_RegExp(object):
|
||||||
_RE_FLAGS = {
|
RE_FLAGS = {
|
||||||
# special knowledge: Python's re flags are bitmask values, current max 128
|
# special knowledge: Python's re flags are bitmask values, current max 128
|
||||||
# invent new bitmask values well above that for literal parsing
|
# invent new bitmask values well above that for literal parsing
|
||||||
# TODO: new pattern class to execute matches with these flags
|
# TODO: execute matches with these flags (remaining: d, y)
|
||||||
'd': 1024, # Generate indices for substring matches
|
'd': 1024, # Generate indices for substring matches
|
||||||
'g': 2048, # Global search
|
'g': 2048, # Global search
|
||||||
'i': re.I, # Case-insensitive search
|
'i': re.I, # Case-insensitive search
|
||||||
@ -218,12 +220,19 @@ class JSInterpreter(object):
|
|||||||
'y': 4096, # Perform a "sticky" search that matches starting at the current position in the target string
|
'y': 4096, # Perform a "sticky" search that matches starting at the current position in the target string
|
||||||
}
|
}
|
||||||
|
|
||||||
def __init__(self, pattern_txt, flags=''):
|
def __init__(self, pattern_txt, flags=0):
|
||||||
if isinstance(flags, compat_str):
|
if isinstance(flags, compat_str):
|
||||||
flags, _ = self.regex_flags(flags)
|
flags, _ = self.regex_flags(flags)
|
||||||
# Thx: https://stackoverflow.com/questions/44773522/setattr-on-python2-sre-sre-pattern
|
|
||||||
# First, avoid https://github.com/python/cpython/issues/74534
|
# First, avoid https://github.com/python/cpython/issues/74534
|
||||||
self.__self = re.compile(pattern_txt.replace('[[', r'[\['), flags)
|
self.__self = None
|
||||||
|
self.__pattern_txt = pattern_txt.replace('[[', r'[\[')
|
||||||
|
self.__flags = flags
|
||||||
|
|
||||||
|
def __instantiate(self):
|
||||||
|
if self.__self:
|
||||||
|
return
|
||||||
|
self.__self = re.compile(self.__pattern_txt, self.__flags)
|
||||||
|
# Thx: https://stackoverflow.com/questions/44773522/setattr-on-python2-sre-sre-pattern
|
||||||
for name in dir(self.__self):
|
for name in dir(self.__self):
|
||||||
# Only these? Obviously __class__, __init__.
|
# Only these? Obviously __class__, __init__.
|
||||||
# PyPy creates a __weakref__ attribute with value None
|
# PyPy creates a __weakref__ attribute with value None
|
||||||
@ -232,15 +241,21 @@ class JSInterpreter(object):
|
|||||||
continue
|
continue
|
||||||
setattr(self, name, getattr(self.__self, name))
|
setattr(self, name, getattr(self.__self, name))
|
||||||
|
|
||||||
|
def __getattr__(self, name):
|
||||||
|
self.__instantiate()
|
||||||
|
if hasattr(self, name):
|
||||||
|
return getattr(self, name)
|
||||||
|
return super(JSInterpreter.JS_RegExp, self).__getattr__(name)
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def regex_flags(cls, expr):
|
def regex_flags(cls, expr):
|
||||||
flags = 0
|
flags = 0
|
||||||
if not expr:
|
if not expr:
|
||||||
return flags, expr
|
return flags, expr
|
||||||
for idx, ch in enumerate(expr):
|
for idx, ch in enumerate(expr):
|
||||||
if ch not in cls._RE_FLAGS:
|
if ch not in cls.RE_FLAGS:
|
||||||
break
|
break
|
||||||
flags |= cls._RE_FLAGS[ch]
|
flags |= cls.RE_FLAGS[ch]
|
||||||
return flags, expr[idx + 1:]
|
return flags, expr[idx + 1:]
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
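The `JS_RegExp` changes above defer `re.compile` until an attribute of the pattern is first needed. The sketch below shows that lazy-compilation idea in isolation. `LazyPattern` and its attribute names are hypothetical, and it delegates through `__getattr__` rather than copying attributes with `setattr` as the diff does.

```python
import re


class LazyPattern(object):
    """Compile the regex only when a pattern attribute is first used."""

    def __init__(self, pattern_txt, flags=0):
        self._pattern_txt = pattern_txt
        self._flags = flags
        self._compiled = None  # nothing compiled yet

    def _instantiate(self):
        if self._compiled is None:
            self._compiled = re.compile(self._pattern_txt, self._flags)
        return self._compiled

    def __getattr__(self, name):
        # Called only when normal lookup fails, so the attributes set in
        # __init__ never reach this point. Private names are refused to
        # avoid recursing if the instance is only partly initialised.
        if name.startswith('_'):
            raise AttributeError(name)
        return getattr(self._instantiate(), name)


if __name__ == '__main__':
    pattern = LazyPattern(r'a+', re.I)
    print(pattern._compiled is None)      # still uncompiled here
    print(pattern.match('AAA').group(0))  # first use triggers compilation
```

Patterns that are parsed but never executed then cost nothing beyond storing the text and flags.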
@ -262,20 +277,20 @@ class JSInterpreter(object):
|
|||||||
if not expr:
|
if not expr:
|
||||||
return
|
return
|
||||||
# collections.Counter() is ~10% slower in both 2.7 and 3.9
|
# collections.Counter() is ~10% slower in both 2.7 and 3.9
|
||||||
counters = {k: 0 for k in _MATCHING_PARENS.values()}
|
counters = dict((k, 0) for k in _MATCHING_PARENS.values())
|
||||||
start, splits, pos, delim_len = 0, 0, 0, len(delim) - 1
|
start, splits, pos, delim_len = 0, 0, 0, len(delim) - 1
|
||||||
in_quote, escaping, skipping = None, False, 0
|
in_quote, escaping, skipping = None, False, 0
|
||||||
after_op, in_regex_char_group, skip_re = True, False, 0
|
after_op, in_regex_char_group = True, False
|
||||||
|
|
||||||
for idx, char in enumerate(expr):
|
for idx, char in enumerate(expr):
|
||||||
if skip_re > 0:
|
paren_delta = 0
|
||||||
skip_re -= 1
|
|
||||||
continue
|
|
||||||
if not in_quote:
|
if not in_quote:
|
||||||
if char in _MATCHING_PARENS:
|
if char in _MATCHING_PARENS:
|
||||||
counters[_MATCHING_PARENS[char]] += 1
|
counters[_MATCHING_PARENS[char]] += 1
|
||||||
|
paren_delta = 1
|
||||||
elif char in counters:
|
elif char in counters:
|
||||||
counters[char] -= 1
|
counters[char] -= 1
|
||||||
|
paren_delta = -1
|
||||||
if not escaping:
|
if not escaping:
|
||||||
if char in _QUOTES and in_quote in (char, None):
|
if char in _QUOTES and in_quote in (char, None):
|
||||||
if in_quote or after_op or char != '/':
|
if in_quote or after_op or char != '/':
|
||||||
@ -283,7 +298,7 @@ class JSInterpreter(object):
|
|||||||
elif in_quote == '/' and char in '[]':
|
elif in_quote == '/' and char in '[]':
|
||||||
in_regex_char_group = char == '['
|
in_regex_char_group = char == '['
|
||||||
escaping = not escaping and in_quote and char == '\\'
|
escaping = not escaping and in_quote and char == '\\'
|
||||||
after_op = not in_quote and (char in cls.OP_CHARS or (char.isspace() and after_op))
|
after_op = not in_quote and (char in cls.OP_CHARS or paren_delta > 0 or (after_op and char.isspace()))
|
||||||
|
|
||||||
if char != delim[pos] or any(counters.values()) or in_quote:
|
if char != delim[pos] or any(counters.values()) or in_quote:
|
||||||
pos = skipping = 0
|
pos = skipping = 0
|
||||||
@ -293,7 +308,7 @@ class JSInterpreter(object):
|
|||||||
continue
|
continue
|
||||||
elif pos == 0 and skip_delims:
|
elif pos == 0 and skip_delims:
|
||||||
here = expr[idx:]
|
here = expr[idx:]
|
||||||
for s in skip_delims if isinstance(skip_delims, (list, tuple)) else [skip_delims]:
|
for s in variadic(skip_delims):
|
||||||
if here.startswith(s) and s:
|
if here.startswith(s) and s:
|
||||||
skipping = len(s) - 1
|
skipping = len(s) - 1
|
||||||
break
|
break
|
||||||
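The `_separate` hunks above split an expression on a delimiter while tracking bracket depth and quoting. The sketch below is a much reduced version of that technique, assuming a single-character delimiter and leaving out regex literals, `skip_delims` and `max_split`. `separate` and `MATCHING` are hypothetical names.

```python
MATCHING = {'(': ')', '{': '}', '[': ']'}


def separate(expr, delim=','):
    """Yield pieces of expr split on delim, ignoring delimiters that sit
    inside brackets or inside single- or double-quoted strings."""
    counters = dict((close, 0) for close in MATCHING.values())
    start, in_quote, escaping = 0, None, False
    for idx, char in enumerate(expr):
        if in_quote:
            if escaping:
                escaping = False      # this character was escaped
            elif char == '\\':
                escaping = True       # the next character is escaped
            elif char == in_quote:
                in_quote = None       # closing quote
            continue
        if char in '"\'':
            in_quote = char
        elif char in MATCHING:
            counters[MATCHING[char]] += 1
        elif char in counters:
            counters[char] -= 1
        elif char == delim and not any(counters.values()):
            yield expr[start:idx]
            start = idx + 1
    yield expr[start:]


if __name__ == '__main__':
    print(list(separate('a,(b,c),"d,e",f')))
```

The version in the diff also records whether the previous character was an operator or an opening bracket (`paren_delta > 0`), which it uses to decide whether a `/` starts a regex literal. This sketch has no equivalent of that step.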
@ -316,7 +331,7 @@ class JSInterpreter(object):
|
|||||||
separated = list(cls._separate(expr, delim, 1))
|
separated = list(cls._separate(expr, delim, 1))
|
||||||
|
|
||||||
if len(separated) < 2:
|
if len(separated) < 2:
|
||||||
raise cls.Exception('No terminating paren {delim} in {expr}'.format(**locals()))
|
raise cls.Exception('No terminating paren {delim} in {expr!r:.5500}'.format(**locals()))
|
||||||
return separated[0][1:].strip(), separated[1].strip()
|
return separated[0][1:].strip(), separated[1].strip()
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
@ -361,6 +376,20 @@ class JSInterpreter(object):
|
|||||||
except TypeError:
|
except TypeError:
|
||||||
return self._named_object(namespace, obj)
|
return self._named_object(namespace, obj)
|
||||||
|
|
||||||
|
# used below
|
||||||
|
_VAR_RET_THROW_RE = re.compile(r'''(?x)
|
||||||
|
(?P<var>(?:var|const|let)\s)|return(?:\s+|(?=["'])|$)|(?P<throw>throw\s+)
|
||||||
|
''')
|
||||||
|
_COMPOUND_RE = re.compile(r'''(?x)
|
||||||
|
(?P<try>try)\s*\{|
|
||||||
|
(?P<if>if)\s*\(|
|
||||||
|
(?P<switch>switch)\s*\(|
|
||||||
|
(?P<for>for)\s*\(|
|
||||||
|
(?P<while>while)\s*\(
|
||||||
|
''')
|
||||||
|
_FINALLY_RE = re.compile(r'finally\s*\{')
|
||||||
|
_SWITCH_RE = re.compile(r'switch\s*\(')
|
||||||
|
|
||||||
def interpret_statement(self, stmt, local_vars, allow_recursion=100):
|
def interpret_statement(self, stmt, local_vars, allow_recursion=100):
|
||||||
if allow_recursion < 0:
|
if allow_recursion < 0:
|
||||||
raise self.Exception('Recursion limit reached')
|
raise self.Exception('Recursion limit reached')
|
||||||
@ -375,7 +404,7 @@ class JSInterpreter(object):
|
|||||||
if should_return:
|
if should_return:
|
||||||
return ret, should_return
|
return ret, should_return
|
||||||
|
|
||||||
m = re.match(r'(?P<var>(?:var|const|let)\s)|return(?:\s+|(?=["\'])|$)|(?P<throw>throw\s+)', stmt)
|
m = self._VAR_RET_THROW_RE.match(stmt)
|
||||||
if m:
|
if m:
|
||||||
expr = stmt[len(m.group(0)):].strip()
|
expr = stmt[len(m.group(0)):].strip()
|
||||||
if m.group('throw'):
|
if m.group('throw'):
|
||||||
@ -405,7 +434,7 @@ class JSInterpreter(object):
|
|||||||
left, right = self._separate_at_paren(obj[len(klass):])
|
left, right = self._separate_at_paren(obj[len(klass):])
|
||||||
argvals = self.interpret_iter(left, local_vars, allow_recursion)
|
argvals = self.interpret_iter(left, local_vars, allow_recursion)
|
||||||
expr = konstr(*argvals)
|
expr = konstr(*argvals)
|
||||||
if not expr:
|
if expr is None:
|
||||||
raise self.Exception('Failed to parse {klass} {left!r:.100}'.format(**locals()), expr=expr)
|
raise self.Exception('Failed to parse {klass} {left!r:.100}'.format(**locals()), expr=expr)
|
||||||
expr = self._dump(expr, local_vars) + right
|
expr = self._dump(expr, local_vars) + right
|
||||||
break
|
break
|
||||||
@ -447,13 +476,7 @@ class JSInterpreter(object):
|
|||||||
for item in self._separate(inner)])
|
for item in self._separate(inner)])
|
||||||
expr = name + outer
|
expr = name + outer
|
||||||
|
|
||||||
m = re.match(r'''(?x)
|
m = self._COMPOUND_RE.match(expr)
|
||||||
(?P<try>try)\s*\{|
|
|
||||||
(?P<if>if)\s*\(|
|
|
||||||
(?P<switch>switch)\s*\(|
|
|
||||||
(?P<for>for)\s*\(|
|
|
||||||
(?P<while>while)\s*\(
|
|
||||||
''', expr)
|
|
||||||
md = m.groupdict() if m else {}
|
md = m.groupdict() if m else {}
|
||||||
if md.get('if'):
|
if md.get('if'):
|
||||||
cndn, expr = self._separate_at_paren(expr[m.end() - 1:])
|
cndn, expr = self._separate_at_paren(expr[m.end() - 1:])
|
||||||
@ -512,7 +535,7 @@ class JSInterpreter(object):
|
|||||||
err = None
|
err = None
|
||||||
pending = self.interpret_statement(sub_expr, catch_vars, allow_recursion)
|
pending = self.interpret_statement(sub_expr, catch_vars, allow_recursion)
|
||||||
|
|
||||||
m = re.match(r'finally\s*\{', expr)
|
m = self._FINALLY_RE.match(expr)
|
||||||
if m:
|
if m:
|
||||||
sub_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
|
sub_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
|
||||||
ret, should_abort = self.interpret_statement(sub_expr, local_vars, allow_recursion)
|
ret, should_abort = self.interpret_statement(sub_expr, local_vars, allow_recursion)
|
||||||
@ -531,7 +554,7 @@ class JSInterpreter(object):
|
|||||||
if remaining.startswith('{'):
|
if remaining.startswith('{'):
|
||||||
body, expr = self._separate_at_paren(remaining)
|
body, expr = self._separate_at_paren(remaining)
|
||||||
else:
|
else:
|
||||||
switch_m = re.match(r'switch\s*\(', remaining) # FIXME
|
switch_m = self._SWITCH_RE.match(remaining) # FIXME
|
||||||
if switch_m:
|
if switch_m:
|
||||||
switch_val, remaining = self._separate_at_paren(remaining[switch_m.end() - 1:])
|
switch_val, remaining = self._separate_at_paren(remaining[switch_m.end() - 1:])
|
||||||
body, expr = self._separate_at_paren(remaining, '}')
|
body, expr = self._separate_at_paren(remaining, '}')
|
||||||
@ -699,7 +722,7 @@ class JSInterpreter(object):
|
|||||||
""" assert, but without risk of getting optimized out """
|
""" assert, but without risk of getting optimized out """
|
||||||
if not cndn:
|
if not cndn:
|
||||||
memb = member
|
memb = member
|
||||||
raise self.Exception('{member} {msg}'.format(**locals()), expr=expr)
|
raise self.Exception('{memb} {msg}'.format(**locals()), expr=expr)
|
||||||
|
|
||||||
def eval_method():
|
def eval_method():
|
||||||
if (variable, member) == ('console', 'debug'):
|
if (variable, member) == ('console', 'debug'):
|
||||||
@ -735,7 +758,7 @@ class JSInterpreter(object):
|
|||||||
if obj == compat_str:
|
if obj == compat_str:
|
||||||
if member == 'fromCharCode':
|
if member == 'fromCharCode':
|
||||||
assertion(argvals, 'takes one or more arguments')
|
assertion(argvals, 'takes one or more arguments')
|
||||||
return ''.join(map(chr, argvals))
|
return ''.join(map(compat_chr, argvals))
|
||||||
raise self.Exception('Unsupported string method ' + member, expr=expr)
|
raise self.Exception('Unsupported string method ' + member, expr=expr)
|
||||||
elif obj == float:
|
elif obj == float:
|
||||||
if member == 'pow':
|
if member == 'pow':
|
||||||
@ -808,10 +831,17 @@ class JSInterpreter(object):
|
|||||||
if idx >= len(obj):
|
if idx >= len(obj):
|
||||||
return None
|
return None
|
||||||
return ord(obj[idx])
|
return ord(obj[idx])
|
||||||
elif member == 'replace':
|
elif member in ('replace', 'replaceAll'):
|
||||||
assertion(isinstance(obj, compat_str), 'must be applied on a string')
|
assertion(isinstance(obj, compat_str), 'must be applied on a string')
|
||||||
assertion(len(argvals) == 2, 'takes exactly two arguments')
|
assertion(len(argvals) == 2, 'takes exactly two arguments')
|
||||||
return re.sub(argvals[0], argvals[1], obj)
|
# TODO: argvals[1] callable, other Py vs JS edge cases
|
||||||
|
if isinstance(argvals[0], self.JS_RegExp):
|
||||||
|
count = 0 if argvals[0].flags & self.JS_RegExp.RE_FLAGS['g'] else 1
|
||||||
|
assertion(member != 'replaceAll' or count == 0,
|
||||||
|
'replaceAll must be called with a global RegExp')
|
||||||
|
return argvals[0].sub(argvals[1], obj, count=count)
|
||||||
|
count = ('replaceAll', 'replace').index(member)
|
||||||
|
return re.sub(re.escape(argvals[0]), argvals[1], obj, count=count)
|
||||||
|
|
||||||
idx = int(member) if isinstance(obj, list) else member
|
idx = int(member) if isinstance(obj, list) else member
|
||||||
return obj[idx](argvals, allow_recursion=allow_recursion)
|
return obj[idx](argvals, allow_recursion=allow_recursion)
|
||||||
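The `replace`/`replaceAll` hunk above maps the JS method name onto the `count` argument of Python's `re.sub`, where `count=0` means "replace every match". Below is a minimal sketch of the string-pattern branch only. The helper name `js_string_replace` is hypothetical and does not appear in the commit. The `JS_RegExp` branch is not reproduced, and a callable replacement is not handled, as the hunk's TODO notes.

```python
import re


def js_string_replace(obj, pattern, repl, member='replace'):
    """Hypothetical stand-in for the plain-string branch of the hunk."""
    # ('replaceAll', 'replace').index(member) yields:
    #   0 for 'replaceAll' -> re.sub replaces all occurrences
    #   1 for 'replace'    -> re.sub replaces only the first occurrence
    count = ('replaceAll', 'replace').index(member)
    # re.escape makes the JS string pattern match literally, not as a regex
    return re.sub(re.escape(pattern), repl, obj, count=count)


first_only = js_string_replace('a.b.c', '.', '-')
all_matches = js_string_replace('a.b.c', '.', '-', member='replaceAll')
```

Without `re.escape`, the `.` pattern would match any character, which is what the old `re.sub(argvals[0], argvals[1], obj)` line did for string arguments.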
|
@ -2176,11 +2176,11 @@ def sanitize_url(url):
|
|||||||
for mistake, fixup in COMMON_TYPOS:
|
for mistake, fixup in COMMON_TYPOS:
|
||||||
if re.match(mistake, url):
|
if re.match(mistake, url):
|
||||||
return re.sub(mistake, fixup, url)
|
return re.sub(mistake, fixup, url)
|
||||||
return escape_url(url)
|
return url
|
||||||
|
|
||||||
|
|
||||||
def sanitized_Request(url, *args, **kwargs):
|
def sanitized_Request(url, *args, **kwargs):
|
||||||
return compat_urllib_request.Request(sanitize_url(url), *args, **kwargs)
|
return compat_urllib_request.Request(escape_url(sanitize_url(url)), *args, **kwargs)
|
||||||
|
|
||||||
|
|
||||||
def expand_path(s):
|
def expand_path(s):
|
||||||
|