Merge branch 'master' into fixes_for_vier_extractor

Cedric Nugteren 2021-02-06 13:27:46 +01:00
commit 95e0844f0c
63 changed files with 3443 additions and 3367 deletions


@ -18,7 +18,7 @@ title: ''
<!-- <!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl: Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.01.08. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED. - First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.02.04.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser. - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape. - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates. - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -26,7 +26,7 @@ Carefully read and work through this check list in order to prevent the most com
--> -->
- [ ] I'm reporting a broken site support - [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2021.01.08** - [ ] I've verified that I'm running youtube-dl version **2021.02.04.1**
- [ ] I've checked that all provided URLs are alive and playable in a browser - [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones - [ ] I've searched the bugtracker for similar issues including closed ones
@ -41,7 +41,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
[debug] User config: [] [debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2021.01.08 [debug] youtube-dl version 2021.02.04.1
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {} [debug] Proxy map: {}


@ -19,7 +19,7 @@ labels: 'site-support-request'
<!-- <!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl: Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.01.08. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED. - First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.02.04.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser. - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights. - Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates. - Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
--> -->
- [ ] I'm reporting a new site support request - [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2021.01.08** - [ ] I've verified that I'm running youtube-dl version **2021.02.04.1**
- [ ] I've checked that all provided URLs are alive and playable in a browser - [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights - [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones - [ ] I've searched the bugtracker for similar site support requests including closed ones


@ -18,13 +18,13 @@ title: ''
<!-- <!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl: Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.01.08. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED. - First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.02.04.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates. - Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x]) - Finally, put x into all relevant boxes (like this [x])
--> -->
- [ ] I'm reporting a site feature request - [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2021.01.08** - [ ] I've verified that I'm running youtube-dl version **2021.02.04.1**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones - [ ] I've searched the bugtracker for similar site feature requests including closed ones


@ -18,7 +18,7 @@ title: ''
<!-- <!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl: Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.01.08. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED. - First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.02.04.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser. - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape. - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates. - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
--> -->
- [ ] I'm reporting a broken site support issue - [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2021.01.08** - [ ] I've verified that I'm running youtube-dl version **2021.02.04.1**
- [ ] I've checked that all provided URLs are alive and playable in a browser - [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones - [ ] I've searched the bugtracker for similar bug reports including closed ones
@ -43,7 +43,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
[debug] User config: [] [debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2021.01.08 [debug] youtube-dl version 2021.02.04.1
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {} [debug] Proxy map: {}


@ -19,13 +19,13 @@ labels: 'request'
<!-- <!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl: Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.01.08. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED. - First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.02.04.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates. - Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x]) - Finally, put x into all relevant boxes (like this [x])
--> -->
- [ ] I'm reporting a feature request - [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2021.01.08** - [ ] I've verified that I'm running youtube-dl version **2021.02.04.1**
- [ ] I've searched the bugtracker for similar feature requests including closed ones - [ ] I've searched the bugtracker for similar feature requests including closed ones


@ -246,3 +246,4 @@ Enes Solak
Nathan Rossi Nathan Rossi
Thomas van der Berg Thomas van der Berg
Luca Cherubin Luca Cherubin
Adrian Heine

104
ChangeLog

@ -1,3 +1,107 @@
version 2021.02.04.1
Extractors
* [youtube] Prefer DASH formats (#28070)
* [azmedien] Fix extraction (#28064)
version 2021.02.04
Extractors
* [pornhub] Implement lazy playlist extraction
* [svtplay] Fix video id extraction (#28058)
+ [pornhub] Add support for authentication (#18797, #21416, #24294)
* [pornhub:user] Improve paging
+ [pornhub:user] Add support for URLs unavailable via /videos page (#27853)
+ [bravotv] Add support for oxygen.com (#13357, #22500)
+ [youtube] Pass embed URL to get_video_info request
* [ccma] Improve metadata extraction (#27994)
+ Extract age limit, alt title, categories, series and episode number
* Fix timestamp multiple subtitles extraction
* [egghead] Update API domain (#28038)
- [vidzi] Remove extractor (#12629)
* [vidio] Improve metadata extraction
* [youtube] Improve subtitles extraction
* [youtube] Fix chapter extraction fallback
* [youtube] Rewrite extractor
* Improve format sorting
* Remove unused code
* Fix series metadata extraction
* Fix trailer video extraction
* Improve error reporting
+ Extract video location
+ [vvvvid] Add support for youtube embeds (#27825)
* [googledrive] Report download page errors (#28005)
* [vlive] Fix error message decoding for python 2 (#28004)
* [youtube] Improve DASH formats file size extraction
* [cda] Improve birth validation detection (#14022, #27929)
+ [awaan] Extract uploader id (#27963)
+ [medialaan] Add support DPG Media MyChannels based websites (#14871, #15597,
#16106, #16489)
* [abcnews] Fix extraction (#12394, #27920)
* [AMP] Fix upload date and timestamp extraction (#27970)
* [tv4] Relax URL regular expression (#27964)
+ [tv2] Add support for mtvuutiset.fi (#27744)
* [adn] Improve login warning reporting
* [zype] Fix uplynk id extraction (#27956)
+ [adn] Add support for authentication (#17091, #27841, #27937)
version 2021.01.24.1
Core
* Introduce --output-na-placeholder (#27896)
Extractors
* [franceculture] Make thumbnail optional (#18807)
* [franceculture] Fix extraction (#27891, #27903)
* [njpwworld] Fix extraction (#27890)
* [comedycentral] Fix extraction (#27905)
* [wat] Fix format extraction (#27901)
+ [americastestkitchen:season] Add support for seasons (#27861)
+ [trovo] Add support for trovo.live (#26125)
+ [aol] Add support for yahoo videos (#26650)
* [yahoo] Fix single video extraction
* [lbry] Unescape lbry URI (#27872)
* [9gag] Fix and improve extraction (#23022)
* [americastestkitchen] Improve metadata extraction for ATK episodes (#27860)
* [aljazeera] Fix extraction (#20911, #27779)
+ [minds] Add support for minds.com (#17934)
* [ard] Fix title and description extraction (#27761)
+ [spotify] Add support for Spotify Podcasts (#27443)
version 2021.01.16
Core
* [YoutubeDL] Protect from infinite recursion due to recursively nested
playlists (#27833)
* [YoutubeDL] Ignore failure to create existing directory (#27811)
* [YoutubeDL] Raise syntax error for format selection expressions with multiple
+ operators (#27803)
Extractors
+ [animeondemand] Add support for lazy playlist extraction (#27829)
* [youporn] Restrict fallback download URL (#27822)
* [youporn] Improve height and tbr extraction (#20425, #23659)
* [youporn] Fix extraction (#27822)
+ [twitter] Add support for unified cards (#27826)
+ [twitch] Add Authorization header with OAuth token for GraphQL requests
(#27790)
* [mixcloud:playlist:base] Extract video id in flat playlist mode (#27787)
* [cspan] Improve info extraction (#27791)
* [adn] Improve info extraction
* [adn] Fix extraction (#26963, #27732)
* [youtube:search] Extract from all sections (#27604)
* [youtube:search] fix viewcount and try to extract all video sections (#27604)
* [twitch] Improve login error extraction
* [twitch] Fix authentication (#27743)
* [3qsdn] Improve extraction (#21058)
* [peertube] Extract formats from streamingPlaylists (#26002, #27586, #27728)
* [khanacademy] Fix extraction (#2887, #26803)
* [spike] Update Paramount Network feed URL (#27715)
version 2021.01.08 version 2021.01.08
Core Core

763
README.md

@ -52,394 +52,431 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
youtube-dl [OPTIONS] URL [URL...] youtube-dl [OPTIONS] URL [URL...]
# OPTIONS # OPTIONS
-h, --help Print this help text and exit -h, --help Print this help text and exit
--version Print program version and exit --version Print program version and exit
-U, --update Update this program to latest version. Make -U, --update Update this program to latest version.
sure that you have sufficient permissions Make sure that you have sufficient
(run with sudo if needed) permissions (run with sudo if needed)
-i, --ignore-errors Continue on download errors, for example to -i, --ignore-errors Continue on download errors, for
skip unavailable videos in a playlist example to skip unavailable videos in a
--abort-on-error Abort downloading of further videos (in the playlist
playlist or the command line) if an error --abort-on-error Abort downloading of further videos (in
occurs the playlist or the command line) if an
--dump-user-agent Display the current browser identification error occurs
--list-extractors List all supported extractors --dump-user-agent Display the current browser
--extractor-descriptions Output descriptions of all supported identification
extractors --list-extractors List all supported extractors
--force-generic-extractor Force extraction to use the generic --extractor-descriptions Output descriptions of all supported
extractor extractors
--default-search PREFIX Use this prefix for unqualified URLs. For --force-generic-extractor Force extraction to use the generic
example "gvsearch2:" downloads two videos extractor
from google videos for youtube-dl "large --default-search PREFIX Use this prefix for unqualified URLs.
apple". Use the value "auto" to let For example "gvsearch2:" downloads two
youtube-dl guess ("auto_warning" to emit a videos from google videos for youtube-
warning when guessing). "error" just throws dl "large apple". Use the value "auto"
an error. The default value "fixup_error" to let youtube-dl guess ("auto_warning"
repairs broken URLs, but emits an error if to emit a warning when guessing).
this is not possible instead of searching. "error" just throws an error. The
--ignore-config Do not read configuration files. When given default value "fixup_error" repairs
in the global configuration file broken URLs, but emits an error if this
/etc/youtube-dl.conf: Do not read the user is not possible instead of searching.
configuration in ~/.config/youtube- --ignore-config Do not read configuration files. When
dl/config (%APPDATA%/youtube-dl/config.txt given in the global configuration file
on Windows) /etc/youtube-dl.conf: Do not read the
--config-location PATH Location of the configuration file; either user configuration in
the path to the config or its containing ~/.config/youtube-dl/config
directory. (%APPDATA%/youtube-dl/config.txt on
--flat-playlist Do not extract the videos of a playlist, Windows)
only list them. --config-location PATH Location of the configuration file;
--mark-watched Mark videos watched (YouTube only) either the path to the config or its
--no-mark-watched Do not mark videos watched (YouTube only) containing directory.
--no-color Do not emit color codes in output --flat-playlist Do not extract the videos of a
playlist, only list them.
--mark-watched Mark videos watched (YouTube only)
--no-mark-watched Do not mark videos watched (YouTube
only)
--no-color Do not emit color codes in output
## Network Options: ## Network Options:
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy. --proxy URL Use the specified HTTP/HTTPS/SOCKS
To enable SOCKS proxy, specify a proper proxy. To enable SOCKS proxy, specify a
scheme. For example proper scheme. For example
socks5://127.0.0.1:1080/. Pass in an empty socks5://127.0.0.1:1080/. Pass in an
string (--proxy "") for direct connection empty string (--proxy "") for direct
--socket-timeout SECONDS Time to wait before giving up, in seconds connection
--source-address IP Client-side IP address to bind to --socket-timeout SECONDS Time to wait before giving up, in
-4, --force-ipv4 Make all connections via IPv4 seconds
-6, --force-ipv6 Make all connections via IPv6 --source-address IP Client-side IP address to bind to
-4, --force-ipv4 Make all connections via IPv4
-6, --force-ipv6 Make all connections via IPv6
## Geo Restriction: ## Geo Restriction:
--geo-verification-proxy URL Use this proxy to verify the IP address for --geo-verification-proxy URL Use this proxy to verify the IP address
some geo-restricted sites. The default for some geo-restricted sites. The
proxy specified by --proxy (or none, if the default proxy specified by --proxy (or
option is not present) is used for the none, if the option is not present) is
actual downloading. used for the actual downloading.
--geo-bypass Bypass geographic restriction via faking --geo-bypass Bypass geographic restriction via
X-Forwarded-For HTTP header faking X-Forwarded-For HTTP header
--no-geo-bypass Do not bypass geographic restriction via --no-geo-bypass Do not bypass geographic restriction
faking X-Forwarded-For HTTP header via faking X-Forwarded-For HTTP header
--geo-bypass-country CODE Force bypass geographic restriction with --geo-bypass-country CODE Force bypass geographic restriction
explicitly provided two-letter ISO 3166-2 with explicitly provided two-letter ISO
country code 3166-2 country code
--geo-bypass-ip-block IP_BLOCK Force bypass geographic restriction with --geo-bypass-ip-block IP_BLOCK Force bypass geographic restriction
explicitly provided IP block in CIDR with explicitly provided IP block in
notation CIDR notation
## Video Selection: ## Video Selection:
--playlist-start NUMBER Playlist video to start at (default is 1) --playlist-start NUMBER Playlist video to start at (default is
--playlist-end NUMBER Playlist video to end at (default is last) 1)
--playlist-items ITEM_SPEC Playlist video items to download. Specify --playlist-end NUMBER Playlist video to end at (default is
indices of the videos in the playlist last)
separated by commas like: "--playlist-items --playlist-items ITEM_SPEC Playlist video items to download.
1,2,5,8" if you want to download videos Specify indices of the videos in the
indexed 1, 2, 5, 8 in the playlist. You can playlist separated by commas like: "--
specify range: "--playlist-items playlist-items 1,2,5,8" if you want to
1-3,7,10-13", it will download the videos download videos indexed 1, 2, 5, 8 in
at index 1, 2, 3, 7, 10, 11, 12 and 13. the playlist. You can specify range: "
--match-title REGEX Download only matching titles (regex or --playlist-items 1-3,7,10-13", it will
caseless sub-string) download the videos at index 1, 2, 3,
--reject-title REGEX Skip download for matching titles (regex or 7, 10, 11, 12 and 13.
caseless sub-string) --match-title REGEX Download only matching titles (regex or
--max-downloads NUMBER Abort after downloading NUMBER files caseless sub-string)
--min-filesize SIZE Do not download any videos smaller than --reject-title REGEX Skip download for matching titles
SIZE (e.g. 50k or 44.6m) (regex or caseless sub-string)
--max-filesize SIZE Do not download any videos larger than SIZE --max-downloads NUMBER Abort after downloading NUMBER files
(e.g. 50k or 44.6m) --min-filesize SIZE Do not download any videos smaller than
--date DATE Download only videos uploaded in this date SIZE (e.g. 50k or 44.6m)
--datebefore DATE Download only videos uploaded on or before --max-filesize SIZE Do not download any videos larger than
this date (i.e. inclusive) SIZE (e.g. 50k or 44.6m)
--dateafter DATE Download only videos uploaded on or after --date DATE Download only videos uploaded in this
this date (i.e. inclusive) date
--min-views COUNT Do not download any videos with less than --datebefore DATE Download only videos uploaded on or
COUNT views before this date (i.e. inclusive)
--max-views COUNT Do not download any videos with more than --dateafter DATE Download only videos uploaded on or
COUNT views after this date (i.e. inclusive)
--match-filter FILTER Generic video filter. Specify any key (see --min-views COUNT Do not download any videos with less
the "OUTPUT TEMPLATE" for a list of than COUNT views
available keys) to match if the key is --max-views COUNT Do not download any videos with more
present, !key to check if the key is not than COUNT views
present, key > NUMBER (like "comment_count --match-filter FILTER Generic video filter. Specify any key
> 12", also works with >=, <, <=, !=, =) to (see the "OUTPUT TEMPLATE" for a list
compare against a number, key = 'LITERAL' of available keys) to match if the key
(like "uploader = 'Mike Smith'", also works is present, !key to check if the key is
with !=) to match against a string literal not present, key > NUMBER (like
and & to require multiple matches. Values "comment_count > 12", also works with
which are not known are excluded unless you >=, <, <=, !=, =) to compare against a
put a question mark (?) after the operator. number, key = 'LITERAL' (like "uploader
For example, to only match videos that have = 'Mike Smith'", also works with !=) to
been liked more than 100 times and disliked match against a string literal and & to
less than 50 times (or the dislike require multiple matches. Values which
functionality is not available at the given are not known are excluded unless you
service), but who also have a description, put a question mark (?) after the
use --match-filter "like_count > 100 & operator. For example, to only match
dislike_count <? 50 & description" . videos that have been liked more than
--no-playlist Download only the video, if the URL refers 100 times and disliked less than 50
to a video and a playlist. times (or the dislike functionality is
--yes-playlist Download the playlist, if the URL refers to not available at the given service),
a video and a playlist. but who also have a description, use
--age-limit YEARS Download only videos suitable for the given --match-filter "like_count > 100 &
age dislike_count <? 50 & description" .
--download-archive FILE Download only videos not listed in the --no-playlist Download only the video, if the URL
archive file. Record the IDs of all refers to a video and a playlist.
downloaded videos in it. --yes-playlist Download the playlist, if the URL
--include-ads Download advertisements as well refers to a video and a playlist.
(experimental) --age-limit YEARS Download only videos suitable for the
given age
--download-archive FILE Download only videos not listed in the
archive file. Record the IDs of all
downloaded videos in it.
--include-ads Download advertisements as well
(experimental)
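
Reviewer note on the `--playlist-items` ITEM_SPEC syntax in the Video Selection options above: the help text says a spec such as `1-3,7,10-13` selects the videos at index 1, 2, 3, 7, 10, 11, 12 and 13. The Python sketch below expands a spec the same way. The function name `expand_playlist_items` is hypothetical, and this is not youtube-dl's own parser. It only mirrors the behaviour the help text describes.

```python
def expand_playlist_items(spec):
    """Expand a spec like "1-3,7,10-13" into a list of 1-based indices.

    Hypothetical helper for illustration; not youtube-dl's implementation.
    """
    indices = []
    for part in spec.split(','):
        part = part.strip()
        if '-' in part:
            # "A-B" is an inclusive range of indices.
            start, end = part.split('-', 1)
            indices.extend(range(int(start), int(end) + 1))
        else:
            # A bare number selects a single playlist entry.
            indices.append(int(part))
    return indices


print(expand_playlist_items('1-3,7,10-13'))
# prints [1, 2, 3, 7, 10, 11, 12, 13]
```

The sketch does no validation (reversed ranges, zero or negative indices), which a real parser would presumably need.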
## Download Options: ## Download Options:
-r, --limit-rate RATE Maximum download rate in bytes per second -r, --limit-rate RATE Maximum download rate in bytes per
(e.g. 50K or 4.2M) second (e.g. 50K or 4.2M)
-R, --retries RETRIES Number of retries (default is 10), or -R, --retries RETRIES Number of retries (default is 10), or
"infinite". "infinite".
--fragment-retries RETRIES Number of retries for a fragment (default --fragment-retries RETRIES Number of retries for a fragment
is 10), or "infinite" (DASH, hlsnative and (default is 10), or "infinite" (DASH,
ISM) hlsnative and ISM)
--skip-unavailable-fragments Skip unavailable fragments (DASH, hlsnative --skip-unavailable-fragments Skip unavailable fragments (DASH,
and ISM) hlsnative and ISM)
--abort-on-unavailable-fragment Abort downloading when some fragment is not --abort-on-unavailable-fragment Abort downloading when some fragment is
available not available
--keep-fragments Keep downloaded fragments on disk after --keep-fragments Keep downloaded fragments on disk after
downloading is finished; fragments are downloading is finished; fragments are
erased by default erased by default
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K) --buffer-size SIZE Size of download buffer (e.g. 1024 or
(default is 1024) 16K) (default is 1024)
--no-resize-buffer Do not automatically adjust the buffer --no-resize-buffer Do not automatically adjust the buffer
size. By default, the buffer size is size. By default, the buffer size is
automatically resized from an initial value automatically resized from an initial
of SIZE. value of SIZE.
--http-chunk-size SIZE Size of a chunk for chunk-based HTTP --http-chunk-size SIZE Size of a chunk for chunk-based HTTP
downloading (e.g. 10485760 or 10M) (default downloading (e.g. 10485760 or 10M)
is disabled). May be useful for bypassing (default is disabled). May be useful
bandwidth throttling imposed by a webserver for bypassing bandwidth throttling
(experimental) imposed by a webserver (experimental)
--playlist-reverse Download playlist videos in reverse order --playlist-reverse Download playlist videos in reverse
--playlist-random Download playlist videos in random order order
--xattr-set-filesize Set file xattribute ytdl.filesize with --playlist-random Download playlist videos in random
expected file size order
--hls-prefer-native Use the native HLS downloader instead of --xattr-set-filesize Set file xattribute ytdl.filesize with
ffmpeg expected file size
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS --hls-prefer-native Use the native HLS downloader instead
downloader of ffmpeg
--hls-use-mpegts Use the mpegts container for HLS videos, --hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
allowing to play the video while downloader
downloading (some players may not be able --hls-use-mpegts Use the mpegts container for HLS
to play it) videos, allowing to play the video
--external-downloader COMMAND Use the specified external downloader. while downloading (some players may not
Currently supports be able to play it)
aria2c,avconv,axel,curl,ffmpeg,httpie,wget --external-downloader COMMAND Use the specified external downloader.
--external-downloader-args ARGS Give these arguments to the external Currently supports aria2c,avconv,axel,c
downloader url,ffmpeg,httpie,wget
--external-downloader-args ARGS Give these arguments to the external
downloader
## Filesystem Options: ## Filesystem Options:
-a, --batch-file FILE File containing URLs to download ('-' for -a, --batch-file FILE File containing URLs to download ('-'
stdin), one URL per line. Lines starting for stdin), one URL per line. Lines
with '#', ';' or ']' are considered as starting with '#', ';' or ']' are
comments and ignored. considered as comments and ignored.
--id Use only video ID in file name --id Use only video ID in file name
-o, --output TEMPLATE Output filename template, see the "OUTPUT -o, --output TEMPLATE Output filename template, see the
TEMPLATE" for all the info "OUTPUT TEMPLATE" for all the info
--autonumber-start NUMBER Specify the start value for %(autonumber)s --output-na-placeholder PLACEHOLDER Placeholder value for unavailable meta
(default is 1) fields in output filename template
--restrict-filenames Restrict filenames to only ASCII (default is "NA")
characters, and avoid "&" and spaces in --autonumber-start NUMBER Specify the start value for
filenames %(autonumber)s (default is 1)
-w, --no-overwrites Do not overwrite files --restrict-filenames Restrict filenames to only ASCII
-c, --continue Force resume of partially downloaded files. characters, and avoid "&" and spaces in
By default, youtube-dl will resume filenames
downloads if possible. -w, --no-overwrites Do not overwrite files
--no-continue Do not resume partially downloaded files -c, --continue Force resume of partially downloaded
(restart from beginning) files. By default, youtube-dl will
--no-part Do not use .part files - write directly resume downloads if possible.
into output file --no-continue Do not resume partially downloaded
--no-mtime Do not use the Last-modified header to set files (restart from beginning)
the file modification time --no-part Do not use .part files - write directly
--write-description Write video description to a .description into output file
file --no-mtime Do not use the Last-modified header to
--write-info-json Write video metadata to a .info.json file set the file modification time
--write-annotations Write video annotations to a --write-description Write video description to a
.annotations.xml file .description file
--load-info-json FILE JSON file containing the video information --write-info-json Write video metadata to a .info.json
(created with the "--write-info-json" file
option) --write-annotations Write video annotations to a
--cookies FILE File to read cookies from and dump cookie .annotations.xml file
jar in --load-info-json FILE JSON file containing the video
--cache-dir DIR Location in the filesystem where youtube-dl information (created with the "--write-
can store some downloaded information info-json" option)
permanently. By default --cookies FILE File to read cookies from and dump
$XDG_CACHE_HOME/youtube-dl or cookie jar in
~/.cache/youtube-dl . At the moment, only --cache-dir DIR Location in the filesystem where
YouTube player files (for videos with youtube-dl can store some downloaded
obfuscated signatures) are cached, but that information permanently. By default
may change. $XDG_CACHE_HOME/youtube-dl or
--no-cache-dir Disable filesystem caching ~/.cache/youtube-dl . At the moment,
--rm-cache-dir Delete all filesystem cache files only YouTube player files (for videos
with obfuscated signatures) are cached,
but that may change.
--no-cache-dir Disable filesystem caching
--rm-cache-dir Delete all filesystem cache files
## Thumbnail images: ## Thumbnail images:
--write-thumbnail Write thumbnail image to disk --write-thumbnail Write thumbnail image to disk
--write-all-thumbnails Write all thumbnail image formats to disk --write-all-thumbnails Write all thumbnail image formats to
--list-thumbnails Simulate and list all available thumbnail disk
formats --list-thumbnails Simulate and list all available
thumbnail formats
## Verbosity / Simulation Options: ## Verbosity / Simulation Options:
-q, --quiet Activate quiet mode -q, --quiet Activate quiet mode
--no-warnings Ignore warnings --no-warnings Ignore warnings
-s, --simulate Do not download the video and do not write -s, --simulate Do not download the video and do not
anything to disk write anything to disk
--skip-download Do not download the video --skip-download Do not download the video
-g, --get-url Simulate, quiet but print URL -g, --get-url Simulate, quiet but print URL
-e, --get-title Simulate, quiet but print title -e, --get-title Simulate, quiet but print title
--get-id Simulate, quiet but print id --get-id Simulate, quiet but print id
--get-thumbnail Simulate, quiet but print thumbnail URL --get-thumbnail Simulate, quiet but print thumbnail URL
--get-description Simulate, quiet but print video description --get-description Simulate, quiet but print video
--get-duration Simulate, quiet but print video length description
--get-filename Simulate, quiet but print output filename --get-duration Simulate, quiet but print video length
--get-format Simulate, quiet but print output format --get-filename Simulate, quiet but print output
-j, --dump-json Simulate, quiet but print JSON information. filename
See the "OUTPUT TEMPLATE" for a description --get-format Simulate, quiet but print output format
of available keys. -j, --dump-json Simulate, quiet but print JSON
-J, --dump-single-json Simulate, quiet but print JSON information information. See the "OUTPUT TEMPLATE"
for each command-line argument. If the URL for a description of available keys.
refers to a playlist, dump the whole -J, --dump-single-json Simulate, quiet but print JSON
playlist information in a single line. information for each command-line
--print-json Be quiet and print the video information as argument. If the URL refers to a
JSON (video is still being downloaded). playlist, dump the whole playlist
--newline Output progress bar as new lines information in a single line.
--no-progress Do not print progress bar --print-json Be quiet and print the video
--console-title Display progress in console titlebar information as JSON (video is still
-v, --verbose Print various debugging information being downloaded).
--dump-pages Print downloaded pages encoded using base64 --newline Output progress bar as new lines
to debug problems (very verbose) --no-progress Do not print progress bar
--write-pages Write downloaded intermediary pages to --console-title Display progress in console titlebar
files in the current directory to debug -v, --verbose Print various debugging information
problems --dump-pages Print downloaded pages encoded using
--print-traffic Display sent and read HTTP traffic base64 to debug problems (very verbose)
-C, --call-home Contact the youtube-dl server for debugging --write-pages Write downloaded intermediary pages to
--no-call-home Do NOT contact the youtube-dl server for files in the current directory to debug
debugging problems
--print-traffic Display sent and read HTTP traffic
-C, --call-home Contact the youtube-dl server for
debugging
--no-call-home Do NOT contact the youtube-dl server
for debugging
## Workarounds: ## Workarounds:
--encoding ENCODING Force the specified encoding (experimental) --encoding ENCODING Force the specified encoding
--no-check-certificate Suppress HTTPS certificate validation (experimental)
--prefer-insecure Use an unencrypted connection to retrieve --no-check-certificate Suppress HTTPS certificate validation
information about the video. (Currently --prefer-insecure Use an unencrypted connection to
supported only for YouTube) retrieve information about the video.
--user-agent UA Specify a custom user agent (Currently supported only for YouTube)
--referer URL Specify a custom referer, use if the video --user-agent UA Specify a custom user agent
access is restricted to one domain --referer URL Specify a custom referer, use if the
--add-header FIELD:VALUE Specify a custom HTTP header and its value, video access is restricted to one
separated by a colon ':'. You can use this domain
option multiple times --add-header FIELD:VALUE Specify a custom HTTP header and its
--bidi-workaround Work around terminals that lack value, separated by a colon ':'. You
bidirectional text support. Requires bidiv can use this option multiple times
or fribidi executable in PATH --bidi-workaround Work around terminals that lack
--sleep-interval SECONDS Number of seconds to sleep before each bidirectional text support. Requires
download when used alone or a lower bound bidiv or fribidi executable in PATH
of a range for randomized sleep before each --sleep-interval SECONDS Number of seconds to sleep before each
download (minimum possible number of download when used alone or a lower
seconds to sleep) when used along with bound of a range for randomized sleep
--max-sleep-interval. before each download (minimum possible
--max-sleep-interval SECONDS Upper bound of a range for randomized sleep number of seconds to sleep) when used
before each download (maximum possible along with --max-sleep-interval.
number of seconds to sleep). Must only be --max-sleep-interval SECONDS Upper bound of a range for randomized
used along with --min-sleep-interval. sleep before each download (maximum
possible number of seconds to sleep).
Must only be used along with --min-
sleep-interval.
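The interaction between `--sleep-interval` and `--max-sleep-interval` described above can be sketched in a few lines. This is only an illustration of the documented behaviour (a fixed delay when used alone, a random delay within the range when both are given); the function name `pick_sleep_seconds` is hypothetical and is not youtube-dl's own code.

```python
import random


def pick_sleep_seconds(sleep_interval, max_sleep_interval=None):
    """Choose how long to wait before a download.

    Hypothetical helper: with only a lower bound the delay is fixed;
    with both bounds the delay is drawn uniformly from the range.
    """
    if max_sleep_interval is None:
        return float(sleep_interval)
    return random.uniform(sleep_interval, max_sleep_interval)


# Fixed delay of 5 seconds, then a random delay between 2 and 4 seconds.
print(pick_sleep_seconds(5))
print(pick_sleep_seconds(2, 4))
```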
## Video Format Options: ## Video Format Options:
-f, --format FORMAT Video format code, see the "FORMAT -f, --format FORMAT Video format code, see the "FORMAT
SELECTION" for all the info SELECTION" for all the info
--all-formats Download all available video formats --all-formats Download all available video formats
--prefer-free-formats Prefer free video formats unless a specific --prefer-free-formats Prefer free video formats unless a
one is requested specific one is requested
-F, --list-formats List all available formats of requested -F, --list-formats List all available formats of requested
videos videos
--youtube-skip-dash-manifest Do not download the DASH manifests and --youtube-skip-dash-manifest Do not download the DASH manifests and
related data on YouTube videos related data on YouTube videos
--merge-output-format FORMAT If a merge is required (e.g. --merge-output-format FORMAT If a merge is required (e.g.
bestvideo+bestaudio), output to given bestvideo+bestaudio), output to given
container format. One of mkv, mp4, ogg, container format. One of mkv, mp4, ogg,
webm, flv. Ignored if no merge is required webm, flv. Ignored if no merge is
required
## Subtitle Options: ## Subtitle Options:
--write-sub Write subtitle file --write-sub Write subtitle file
--write-auto-sub Write automatically generated subtitle file --write-auto-sub Write automatically generated subtitle
(YouTube only) file (YouTube only)
--all-subs Download all the available subtitles of the --all-subs Download all the available subtitles of
video the video
--list-subs List all available subtitles for the video --list-subs List all available subtitles for the
--sub-format FORMAT Subtitle format, accepts formats video
preference, for example: "srt" or --sub-format FORMAT Subtitle format, accepts formats
"ass/srt/best" preference, for example: "srt" or
--sub-lang LANGS Languages of the subtitles to download "ass/srt/best"
(optional) separated by commas, use --list- --sub-lang LANGS Languages of the subtitles to download
subs for available language tags (optional) separated by commas, use
--list-subs for available language tags
## Authentication Options: ## Authentication Options:
-u, --username USERNAME Login with this account ID -u, --username USERNAME Login with this account ID
-p, --password PASSWORD Account password. If this option is left -p, --password PASSWORD Account password. If this option is
out, youtube-dl will ask interactively. left out, youtube-dl will ask
-2, --twofactor TWOFACTOR Two-factor authentication code interactively.
-n, --netrc Use .netrc authentication data -2, --twofactor TWOFACTOR Two-factor authentication code
--video-password PASSWORD Video password (vimeo, youku) -n, --netrc Use .netrc authentication data
--video-password PASSWORD Video password (vimeo, youku)
## Adobe Pass Options: ## Adobe Pass Options:
--ap-mso MSO Adobe Pass multiple-system operator (TV --ap-mso MSO Adobe Pass multiple-system operator (TV
provider) identifier, use --ap-list-mso for provider) identifier, use --ap-list-mso
a list of available MSOs for a list of available MSOs
--ap-username USERNAME Multiple-system operator account login --ap-username USERNAME Multiple-system operator account login
--ap-password PASSWORD Multiple-system operator account password. --ap-password PASSWORD Multiple-system operator account
If this option is left out, youtube-dl will password. If this option is left out,
ask interactively. youtube-dl will ask interactively.
--ap-list-mso List all supported multiple-system --ap-list-mso List all supported multiple-system
operators operators
## Post-processing Options: ## Post-processing Options:
-x, --extract-audio Convert video files to audio-only files -x, --extract-audio Convert video files to audio-only files
(requires ffmpeg or avconv and ffprobe or (requires ffmpeg/avconv and
avprobe) ffprobe/avprobe)
--audio-format FORMAT Specify audio format: "best", "aac", --audio-format FORMAT Specify audio format: "best", "aac",
"flac", "mp3", "m4a", "opus", "vorbis", or "flac", "mp3", "m4a", "opus", "vorbis",
"wav"; "best" by default; No effect without or "wav"; "best" by default; No effect
-x without -x
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert --audio-quality QUALITY Specify ffmpeg/avconv audio quality,
a value between 0 (better) and 9 (worse) insert a value between 0 (better) and 9
for VBR or a specific bitrate like 128K (worse) for VBR or a specific bitrate
(default 5) like 128K (default 5)
--recode-video FORMAT Encode the video to another format if --recode-video FORMAT Encode the video to another format if
necessary (currently supported: necessary (currently supported:
mp4|flv|ogg|webm|mkv|avi) mp4|flv|ogg|webm|mkv|avi)
--postprocessor-args ARGS Give these arguments to the postprocessor --postprocessor-args ARGS Give these arguments to the
-k, --keep-video Keep the video file on disk after the post- postprocessor
processing; the video is erased by default -k, --keep-video Keep the video file on disk after the
--no-post-overwrites Do not overwrite post-processed files; the post-processing; the video is erased by
post-processed files are overwritten by default
default --no-post-overwrites Do not overwrite post-processed files;
--embed-subs Embed subtitles in the video (only for mp4, the post-processed files are
webm and mkv videos) overwritten by default
--embed-thumbnail Embed thumbnail in the audio as cover art --embed-subs Embed subtitles in the video (only for
--add-metadata Write metadata to the video file mp4, webm and mkv videos)
--metadata-from-title FORMAT Parse additional metadata like song title / --embed-thumbnail Embed thumbnail in the audio as cover
artist from the video title. The format art
syntax is the same as --output. Regular --add-metadata Write metadata to the video file
expression with named capture groups may --metadata-from-title FORMAT Parse additional metadata like song
also be used. The parsed parameters replace title / artist from the video title.
existing values. Example: --metadata-from- The format syntax is the same as
title "%(artist)s - %(title)s" matches a --output. Regular expression with named
title like "Coldplay - Paradise". Example capture groups may also be used. The
(regex): --metadata-from-title parsed parameters replace existing
"(?P<artist>.+?) - (?P<title>.+)" values. Example: --metadata-from-title
--xattrs Write metadata to the video file's xattrs "%(artist)s - %(title)s" matches a
(using dublin core and xdg standards) title like "Coldplay - Paradise".
--fixup POLICY Automatically correct known faults of the Example (regex): --metadata-from-title
file. One of never (do nothing), warn (only "(?P<artist>.+?) - (?P<title>.+)"
emit a warning), detect_or_warn (the --xattrs Write metadata to the video file's
default; fix file if we can, warn xattrs (using dublin core and xdg
otherwise) standards)
--prefer-avconv Prefer avconv over ffmpeg for running the --fixup POLICY Automatically correct known faults of
postprocessors the file. One of never (do nothing),
--prefer-ffmpeg Prefer ffmpeg over avconv for running the warn (only emit a warning),
postprocessors (default) detect_or_warn (the default; fix file
--ffmpeg-location PATH Location of the ffmpeg/avconv binary; if we can, warn otherwise)
either the path to the binary or its --prefer-avconv Prefer avconv over ffmpeg for running
containing directory. the postprocessors
--exec CMD Execute a command on the file after --prefer-ffmpeg Prefer ffmpeg over avconv for running
downloading and post-processing, similar to the postprocessors (default)
find's -exec syntax. Example: --exec 'adb --ffmpeg-location PATH Location of the ffmpeg/avconv binary;
push {} /sdcard/Music/ && rm {}' either the path to the binary or its
--convert-subs FORMAT Convert the subtitles to other format containing directory.
(currently supported: srt|ass|vtt|lrc) --exec CMD Execute a command on the file after
downloading and post-processing,
similar to find's -exec syntax.
Example: --exec 'adb push {}
/sdcard/Music/ && rm {}'
--convert-subs FORMAT Convert the subtitles to other format
(currently supported: srt|ass|vtt|lrc)
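The regex form of `--metadata-from-title` shown above can be tried out directly in Python. The following is a minimal sketch using only the standard `re` module; the helper name `parse_title` is hypothetical and not part of youtube-dl.

```python
import re

# Pattern taken from the --metadata-from-title example above.
TITLE_PATTERN = r'(?P<artist>.+?) - (?P<title>.+)'


def parse_title(video_title, pattern=TITLE_PATTERN):
    """Return the named groups captured from video_title, or {} if no match."""
    match = re.match(pattern, video_title)
    return match.groupdict() if match else {}


print(parse_title('Coldplay - Paradise'))
# {'artist': 'Coldplay', 'title': 'Paradise'}
```

A title without the ` - ` separator does not match, so the sketch returns an empty dict rather than guessing.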
# CONFIGURATION # CONFIGURATION
@ -583,7 +620,7 @@ Available for the media that is a track or a part of a music album:
- `disc_number` (numeric): Number of the disc or other physical medium the track belongs to - `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
- `release_year` (numeric): Year (YYYY) when the album was released - `release_year` (numeric): Year (YYYY) when the album was released
Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with `NA`. Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with placeholder value provided with `--output-na-placeholder` (`NA` by default).
For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `youtube-dl test video` and id `BaW_jenozKcj`, this will result in a `youtube-dl test video-BaW_jenozKcj.mp4` file created in the current directory. For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `youtube-dl test video` and id `BaW_jenozKcj`, this will result in a `youtube-dl test video-BaW_jenozKcj.mp4` file created in the current directory.
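The placeholder substitution described above can be approximated with Python's `%` formatting and a `defaultdict`. This is a simplified sketch, not the actual youtube-dl implementation: `render_template` is a hypothetical name, and it only handles `%(name)s` fields (a numeric conversion such as `%(width)d` would need extra handling when the field is missing).

```python
import collections


def render_template(template, info, na_placeholder='NA'):
    """Fill an output template, using na_placeholder for missing fields."""
    # A defaultdict returns the placeholder for any key absent from info.
    fields = collections.defaultdict(lambda: na_placeholder, info)
    return template % fields


info = {'title': 'youtube-dl test video', 'id': 'BaW_jenozKcj', 'ext': 'mp4'}
print(render_template('%(title)s-%(id)s.%(ext)s', info))
# youtube-dl test video-BaW_jenozKcj.mp4
print(render_template('%(uploader)s-%(id)s.%(ext)s', info, na_placeholder='none'))
# none-BaW_jenozKcj.mp4
```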

@ -46,10 +46,11 @@
- **Amara** - **Amara**
- **AMCNetworks** - **AMCNetworks**
- **AmericasTestKitchen** - **AmericasTestKitchen**
- **AmericasTestKitchenSeason**
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **AnimeOnDemand** - **AnimeOnDemand**
- **Anvato** - **Anvato**
- **aol.com** - **aol.com**: Yahoo screen and movies
- **APA** - **APA**
- **Aparat** - **Aparat**
- **AppleConnect** - **AppleConnect**
@ -192,8 +193,6 @@
- **CNNArticle** - **CNNArticle**
- **CNNBlogs** - **CNNBlogs**
- **ComedyCentral** - **ComedyCentral**
- **ComedyCentralFullEpisodes**
- **ComedyCentralShortname**
- **ComedyCentralTV** - **ComedyCentralTV**
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED - **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
- **CONtv** - **CONtv**
@ -418,7 +417,8 @@
- **Katsomo** - **Katsomo**
- **KeezMovies** - **KeezMovies**
- **Ketnet** - **Ketnet**
- **KhanAcademy** - **khanacademy**
- **khanacademy:unit**
- **KickStarter** - **KickStarter**
- **KinjaEmbed** - **KinjaEmbed**
- **KinoPoisk** - **KinoPoisk**
@ -505,6 +505,9 @@
- **Mgoon** - **Mgoon**
- **MGTV**: 芒果TV - **MGTV**: 芒果TV
- **MiaoPai** - **MiaoPai**
- **minds**
- **minds:channel**
- **minds:group**
- **MinistryGrid** - **MinistryGrid**
- **Minoto** - **Minoto**
- **miomio.tv** - **miomio.tv**
@ -534,6 +537,7 @@
- **mtv:video** - **mtv:video**
- **mtvjapan** - **mtvjapan**
- **mtvservices:embedded** - **mtvservices:embedded**
- **MTVUutisetArticle**
- **MuenchenTV**: münchen.tv - **MuenchenTV**: münchen.tv
- **mva**: Microsoft Virtual Academy videos - **mva**: Microsoft Virtual Academy videos
- **mva:course**: Microsoft Virtual Academy courses - **mva:course**: Microsoft Virtual Academy courses
@ -858,6 +862,8 @@
- **Sport5** - **Sport5**
- **SportBox** - **SportBox**
- **SportDeutschland** - **SportDeutschland**
- **spotify**
- **spotify:show**
- **Spreaker** - **Spreaker**
- **SpreakerPage** - **SpreakerPage**
- **SpreakerShow** - **SpreakerShow**
@ -939,12 +945,13 @@
- **TNAFlixNetworkEmbed** - **TNAFlixNetworkEmbed**
- **toggle** - **toggle**
- **ToonGoggles** - **ToonGoggles**
- **Tosh**: Tosh.0
- **tou.tv** - **tou.tv**
- **Toypics**: Toypics video - **Toypics**: Toypics video
- **ToypicsUser**: Toypics user profile - **ToypicsUser**: Toypics user profile
- **TrailerAddict** (Currently broken) - **TrailerAddict** (Currently broken)
- **Trilulilu** - **Trilulilu**
- **Trovo**
- **TrovoVod**
- **TruNews** - **TruNews**
- **TruTV** - **TruTV**
- **Tube8** - **Tube8**
@ -1052,7 +1059,6 @@
- **vidme** - **vidme**
- **vidme:user** - **vidme:user**
- **vidme:user:likes** - **vidme:user:likes**
- **Vidzi**
- **vier**: vier.be and vijf.be - **vier**: vier.be and vijf.be
- **vier:videos** - **vier:videos**
- **viewlift** - **viewlift**
@ -1097,6 +1103,7 @@
- **vrv** - **vrv**
- **vrv:series** - **vrv:series**
- **VShare** - **VShare**
- **VTM**
- **VTXTV** - **VTXTV**
- **vube**: Vube.com - **vube**: Vube.com
- **VuClip** - **VuClip**

@ -633,13 +633,20 @@ class TestYoutubeDL(unittest.TestCase):
'title2': '%PATH%', 'title2': '%PATH%',
} }
def fname(templ): def fname(templ, na_placeholder='NA'):
ydl = YoutubeDL({'outtmpl': templ}) params = {'outtmpl': templ}
if na_placeholder != 'NA':
params['outtmpl_na_placeholder'] = na_placeholder
ydl = YoutubeDL(params)
return ydl.prepare_filename(info) return ydl.prepare_filename(info)
self.assertEqual(fname('%(id)s.%(ext)s'), '1234.mp4') self.assertEqual(fname('%(id)s.%(ext)s'), '1234.mp4')
self.assertEqual(fname('%(id)s-%(width)s.%(ext)s'), '1234-NA.mp4') self.assertEqual(fname('%(id)s-%(width)s.%(ext)s'), '1234-NA.mp4')
# Replace missing fields with 'NA' NA_TEST_OUTTMPL = '%(uploader_date)s-%(width)d-%(id)s.%(ext)s'
self.assertEqual(fname('%(uploader_date)s-%(id)s.%(ext)s'), 'NA-1234.mp4') # Replace missing fields with 'NA' by default
self.assertEqual(fname(NA_TEST_OUTTMPL), 'NA-NA-1234.mp4')
# Or by provided placeholder
self.assertEqual(fname(NA_TEST_OUTTMPL, na_placeholder='none'), 'none-none-1234.mp4')
self.assertEqual(fname(NA_TEST_OUTTMPL, na_placeholder=''), '--1234.mp4')
self.assertEqual(fname('%(height)d.%(ext)s'), '1080.mp4') self.assertEqual(fname('%(height)d.%(ext)s'), '1080.mp4')
self.assertEqual(fname('%(height)6d.%(ext)s'), ' 1080.mp4') self.assertEqual(fname('%(height)6d.%(ext)s'), ' 1080.mp4')
self.assertEqual(fname('%(height)-6d.%(ext)s'), '1080 .mp4') self.assertEqual(fname('%(height)-6d.%(ext)s'), '1080 .mp4')

@ -1,275 +0,0 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import expect_value
from youtube_dl.extractor import YoutubeIE
class TestYoutubeChapters(unittest.TestCase):
_TEST_CASES = [
(
# https://www.youtube.com/watch?v=A22oy8dFjqc
# pattern: 00:00 - <title>
'''This is the absolute ULTIMATE experience of Queen's set at LIVE AID, this is the best video mixed to the absolutely superior stereo radio broadcast. This vastly superior audio mix takes a huge dump on all of the official mixes. Best viewed in 1080p. ENJOY! ***MAKE SURE TO READ THE DESCRIPTION***<br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+36);return false;">00:36</a> - Bohemian Rhapsody<br /><a href="#" onclick="yt.www.watch.player.seekTo(02*60+42);return false;">02:42</a> - Radio Ga Ga<br /><a href="#" onclick="yt.www.watch.player.seekTo(06*60+53);return false;">06:53</a> - Ay Oh!<br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+34);return false;">07:34</a> - Hammer To Fall<br /><a href="#" onclick="yt.www.watch.player.seekTo(12*60+08);return false;">12:08</a> - Crazy Little Thing Called Love<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+03);return false;">16:03</a> - We Will Rock You<br /><a href="#" onclick="yt.www.watch.player.seekTo(17*60+18);return false;">17:18</a> - We Are The Champions<br /><a href="#" onclick="yt.www.watch.player.seekTo(21*60+12);return false;">21:12</a> - Is This The World We Created...?<br /><br />Short song analysis:<br /><br />- "Bohemian Rhapsody": Although it's a short medley version, it's one of the best performances of the ballad section, with Freddie nailing the Bb4s with the correct studio phrasing (for the first time ever!).<br /><br />- "Radio Ga Ga": Although it's missing one chorus, this is one of - if not the best - the best versions ever, Freddie nails all the Bb4s and sounds very clean! Spike Edney's Roland Jupiter 8 also really shines through on this mix, compared to the DVD releases!<br /><br />- "Audience Improv": A great improv, Freddie sounds strong and confident. You gotta love when he sustains that A4 for 4 seconds!<br /><br />- "Hammer To Fall": Despite missing a verse and a chorus, it's a strong version (possibly the best ever). 
Freddie sings the song amazingly, and even ad-libs a C#5 and a C5! Also notice how heavy Brian's guitar sounds compared to the thin DVD mixes - it roars!<br /><br />- "Crazy Little Thing Called Love": A great version, the crowd loves the song, the jam is great as well! Only downside to this is the slight feedback issues.<br /><br />- "We Will Rock You": Although cut down to the 1st verse and chorus, Freddie sounds strong. He nails the A4, and the solo from Dr. May is brilliant!<br /><br />- "We Are the Champions": Perhaps the high-light of the performance - Freddie is very daring on this version, he sustains the pre-chorus Bb4s, nails the 1st C5, belts great A4s, but most importantly: He nails the chorus Bb4s, in all 3 choruses! This is the only time he has ever done so! It has to be said though, the last one sounds a bit rough, but that's a side effect of belting high notes for the past 18 minutes, with nodules AND laryngitis!<br /><br />- "Is This The World We Created... ?": Freddie and Brian perform a beautiful version of this, and it is one of the best versions ever. It's both sad and hilarious that a couple of BBC engineers are talking over the song, one of them being completely oblivious of the fact that he is interrupting the performance, on live television... Which was being televised to almost 2 billion homes.<br /><br /><br />All rights go to their respective owners!<br />-----Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for fair use for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use''',
1477,
[{
'start_time': 36,
'end_time': 162,
'title': 'Bohemian Rhapsody',
}, {
'start_time': 162,
'end_time': 413,
'title': 'Radio Ga Ga',
}, {
'start_time': 413,
'end_time': 454,
'title': 'Ay Oh!',
}, {
'start_time': 454,
'end_time': 728,
'title': 'Hammer To Fall',
}, {
'start_time': 728,
'end_time': 963,
'title': 'Crazy Little Thing Called Love',
}, {
'start_time': 963,
'end_time': 1038,
'title': 'We Will Rock You',
}, {
'start_time': 1038,
'end_time': 1272,
'title': 'We Are The Champions',
}, {
'start_time': 1272,
'end_time': 1477,
'title': 'Is This The World We Created...?',
}]
),
(
# https://www.youtube.com/watch?v=ekYlRhALiRQ
# pattern: <num>. <title> 0:00
'1. Those Beaten Paths of Confusion <a href="#" onclick="yt.www.watch.player.seekTo(0*60+00);return false;">0:00</a><br />2. Beyond the Shadows of Emptiness & Nothingness <a href="#" onclick="yt.www.watch.player.seekTo(11*60+47);return false;">11:47</a><br />3. Poison Yourself...With Thought <a href="#" onclick="yt.www.watch.player.seekTo(26*60+30);return false;">26:30</a><br />4. The Agents of Transformation <a href="#" onclick="yt.www.watch.player.seekTo(35*60+57);return false;">35:57</a><br />5. Drowning in the Pain of Consciousness <a href="#" onclick="yt.www.watch.player.seekTo(44*60+32);return false;">44:32</a><br />6. Deny the Disease of Life <a href="#" onclick="yt.www.watch.player.seekTo(53*60+07);return false;">53:07</a><br /><br />More info/Buy: http://crepusculonegro.storenvy.com/products/257645-cn-03-arizmenda-within-the-vacuum-of-infinity<br /><br />No copyright is intended. The rights to this video are assumed by the owner and its affiliates.',
4009,
[{
'start_time': 0,
'end_time': 707,
'title': '1. Those Beaten Paths of Confusion',
}, {
'start_time': 707,
'end_time': 1590,
'title': '2. Beyond the Shadows of Emptiness & Nothingness',
}, {
'start_time': 1590,
'end_time': 2157,
'title': '3. Poison Yourself...With Thought',
}, {
'start_time': 2157,
'end_time': 2672,
'title': '4. The Agents of Transformation',
}, {
'start_time': 2672,
'end_time': 3187,
'title': '5. Drowning in the Pain of Consciousness',
}, {
'start_time': 3187,
'end_time': 4009,
'title': '6. Deny the Disease of Life',
}]
),
(
# https://www.youtube.com/watch?v=WjL4pSzog9w
# pattern: 00:00 <title>
'<a href="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" class="yt-uix-servicelink " data-target-new-window="True" data-servicelink="CDAQ6TgYACITCNf1raqT2dMCFdRjGAod_o0CBSj4HQ" data-url="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" rel="nofollow noopener" target="_blank">https://arizmenda.bandcamp.com/merch/...</a><br /><br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+00);return false;">00:00</a> Christening Unborn Deformities <br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+08);return false;">07:08</a> Taste of Purity<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+16);return false;">16:16</a> Sculpting Sins of a Universal Tongue<br /><a href="#" onclick="yt.www.watch.player.seekTo(24*60+45);return false;">24:45</a> Birth<br /><a href="#" onclick="yt.www.watch.player.seekTo(31*60+24);return false;">31:24</a> Neves<br /><a href="#" onclick="yt.www.watch.player.seekTo(37*60+55);return false;">37:55</a> Libations in Limbo',
2705,
[{
'start_time': 0,
'end_time': 428,
'title': 'Christening Unborn Deformities',
}, {
'start_time': 428,
'end_time': 976,
'title': 'Taste of Purity',
}, {
'start_time': 976,
'end_time': 1485,
'title': 'Sculpting Sins of a Universal Tongue',
}, {
'start_time': 1485,
'end_time': 1884,
'title': 'Birth',
}, {
'start_time': 1884,
'end_time': 2275,
'title': 'Neves',
}, {
'start_time': 2275,
'end_time': 2705,
'title': 'Libations in Limbo',
}]
),
(
# https://www.youtube.com/watch?v=o3r1sn-t3is
# pattern: <title> 00:00 <note>
'Download this show in MP3: <a href="http://sh.st/njZKK" class="yt-uix-servicelink " data-url="http://sh.st/njZKK" data-target-new-window="True" data-servicelink="CDAQ6TgYACITCK3j8_6o2dMCFVDCGAoduVAKKij4HQ" rel="nofollow noopener" target="_blank">http://sh.st/njZKK</a><br /><br />Setlist:<br />I-E-A-I-A-I-O <a href="#" onclick="yt.www.watch.player.seekTo(00*60+45);return false;">00:45</a><br />Suite-Pee <a href="#" onclick="yt.www.watch.player.seekTo(4*60+26);return false;">4:26</a> (Incomplete)<br />Attack <a href="#" onclick="yt.www.watch.player.seekTo(5*60+31);return false;">5:31</a> (First live performance since 2011)<br />Prison Song <a href="#" onclick="yt.www.watch.player.seekTo(8*60+42);return false;">8:42</a><br />Know <a href="#" onclick="yt.www.watch.player.seekTo(12*60+32);return false;">12:32</a> (First live performance since 2011)<br />Aerials <a href="#" onclick="yt.www.watch.player.seekTo(15*60+32);return false;">15:32</a><br />Soldier Side - Intro <a href="#" onclick="yt.www.watch.player.seekTo(19*60+13);return false;">19:13</a><br />B.Y.O.B. <a href="#" onclick="yt.www.watch.player.seekTo(20*60+09);return false;">20:09</a><br />Soil <a href="#" onclick="yt.www.watch.player.seekTo(24*60+32);return false;">24:32</a><br />Darts <a href="#" onclick="yt.www.watch.player.seekTo(27*60+48);return false;">27:48</a><br />Radio/Video <a href="#" onclick="yt.www.watch.player.seekTo(30*60+38);return false;">30:38</a><br />Hypnotize <a href="#" onclick="yt.www.watch.player.seekTo(35*60+05);return false;">35:05</a><br />Temper <a href="#" onclick="yt.www.watch.player.seekTo(38*60+08);return false;">38:08</a> (First live performance since 1999)<br />CUBErt <a href="#" onclick="yt.www.watch.player.seekTo(41*60+00);return false;">41:00</a><br />Needles <a href="#" onclick="yt.www.watch.player.seekTo(42*60+57);return false;">42:57</a><br />Deer Dance <a href="#" onclick="yt.www.watch.player.seekTo(46*60+27);return false;">46:27</a><br />Bounce <a href="#" onclick="yt.www.watch.player.seekTo(49*60+38);return false;">49:38</a><br />Suggestions <a href="#" onclick="yt.www.watch.player.seekTo(51*60+25);return false;">51:25</a><br />Psycho <a href="#" onclick="yt.www.watch.player.seekTo(53*60+52);return false;">53:52</a><br />Chop Suey! <a href="#" onclick="yt.www.watch.player.seekTo(58*60+13);return false;">58:13</a><br />Lonely Day <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+01*60+15);return false;">1:01:15</a><br />Question! <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+04*60+14);return false;">1:04:14</a><br />Lost in Hollywood <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+08*60+10);return false;">1:08:10</a><br />Vicinity of Obscenity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+13*60+40);return false;">1:13:40</a>(First live performance since 2012)<br />Forest <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+16*60+17);return false;">1:16:17</a><br />Cigaro <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+20*60+02);return false;">1:20:02</a><br />Toxicity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+23*60+57);return false;">1:23:57</a>(with Chino Moreno)<br />Sugar <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+27*60+53);return false;">1:27:53</a>',
5640,
[{
'start_time': 45,
'end_time': 266,
'title': 'I-E-A-I-A-I-O',
}, {
'start_time': 266,
'end_time': 331,
'title': 'Suite-Pee (Incomplete)',
}, {
'start_time': 331,
'end_time': 522,
'title': 'Attack (First live performance since 2011)',
}, {
'start_time': 522,
'end_time': 752,
'title': 'Prison Song',
}, {
'start_time': 752,
'end_time': 932,
'title': 'Know (First live performance since 2011)',
}, {
'start_time': 932,
'end_time': 1153,
'title': 'Aerials',
}, {
'start_time': 1153,
'end_time': 1209,
'title': 'Soldier Side - Intro',
}, {
'start_time': 1209,
'end_time': 1472,
'title': 'B.Y.O.B.',
}, {
'start_time': 1472,
'end_time': 1668,
'title': 'Soil',
}, {
'start_time': 1668,
'end_time': 1838,
'title': 'Darts',
}, {
'start_time': 1838,
'end_time': 2105,
'title': 'Radio/Video',
}, {
'start_time': 2105,
'end_time': 2288,
'title': 'Hypnotize',
}, {
'start_time': 2288,
'end_time': 2460,
'title': 'Temper (First live performance since 1999)',
}, {
'start_time': 2460,
'end_time': 2577,
'title': 'CUBErt',
}, {
'start_time': 2577,
'end_time': 2787,
'title': 'Needles',
}, {
'start_time': 2787,
'end_time': 2978,
'title': 'Deer Dance',
}, {
'start_time': 2978,
'end_time': 3085,
'title': 'Bounce',
}, {
'start_time': 3085,
'end_time': 3232,
'title': 'Suggestions',
}, {
'start_time': 3232,
'end_time': 3493,
'title': 'Psycho',
}, {
'start_time': 3493,
'end_time': 3675,
'title': 'Chop Suey!',
}, {
'start_time': 3675,
'end_time': 3854,
'title': 'Lonely Day',
}, {
'start_time': 3854,
'end_time': 4090,
'title': 'Question!',
}, {
'start_time': 4090,
'end_time': 4420,
'title': 'Lost in Hollywood',
}, {
'start_time': 4420,
'end_time': 4577,
'title': 'Vicinity of Obscenity (First live performance since 2012)',
}, {
'start_time': 4577,
'end_time': 4802,
'title': 'Forest',
}, {
'start_time': 4802,
'end_time': 5037,
'title': 'Cigaro',
}, {
'start_time': 5037,
'end_time': 5273,
'title': 'Toxicity (with Chino Moreno)',
}, {
'start_time': 5273,
'end_time': 5640,
'title': 'Sugar',
}]
),
(
# https://www.youtube.com/watch?v=PkYLQbsqCE8
# pattern: <num> - <title> [<latinized title>] 0:00:00
'''Затемно (Zatemno) is an Obscure Black Metal Band from Russia.<br /><br />"Во прах (Vo prakh)'' Into The Ashes", Debut mini-album released may 6, 2016, by Death Knell Productions<br />Released on 6 panel digipak CD, limited to 100 copies only<br />And digital format on Bandcamp<br /><br />Tracklist<br /><br />1 - Во прах [Vo prakh] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+00*60+00);return false;">0:00:00</a><br />2 - Искупление [Iskupleniye] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+08*60+10);return false;">0:08:10</a><br />3 - Из серпов луны...[Iz serpov luny] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+14*60+30);return false;">0:14:30</a><br /><br />Links:<br /><a href="https://deathknellprod.bandcamp.com/album/--2" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://deathknellprod.bandcamp.com/album/--2" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://deathknellprod.bandcamp.com/a...</a><br /><a href="https://www.facebook.com/DeathKnellProd/" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://www.facebook.com/DeathKnellProd/" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://www.facebook.com/DeathKnellProd/</a><br /><br /><br />I don't have any right about this artifact, my only intention is to spread the music of the band, all rights are reserved to the Затемно (Zatemno) and his producers, Death Knell Productions.<br /><br />------------------------------------------------------------------<br /><br />Subscribe for more videos like this.<br />My link: <a href="https://web.facebook.com/AttackOfTheDragons" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://web.facebook.com/AttackOfTheDragons" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow 
noopener">https://web.facebook.com/AttackOfTheD...</a>''',
1138,
[{
'start_time': 0,
'end_time': 490,
'title': '1 - Во прах [Vo prakh]',
}, {
'start_time': 490,
'end_time': 870,
'title': '2 - Искупление [Iskupleniye]',
}, {
'start_time': 870,
'end_time': 1138,
'title': '3 - Из серпов луны...[Iz serpov luny]',
}]
),
(
# https://www.youtube.com/watch?v=xZW70zEasOk
# time point more than duration
'''● LCS Spring finals: Saturday and Sunday from <a href="#" onclick="yt.www.watch.player.seekTo(13*60+30);return false;">13:30</a> outside the venue! <br />● PAX East: Fri, Sat & Sun - more info in tomorrows video on the main channel!''',
283,
[]
),
]
def test_youtube_chapters(self):
for description, duration, expected_chapters in self._TEST_CASES:
ie = YoutubeIE()
expect_value(
self, ie._extract_chapters_from_description(description, duration),
expected_chapters, None)
if __name__ == '__main__':
unittest.main()
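The expected values in these test cases follow a simple rule: each chapter ends where the next one starts, the last chapter ends at the video duration, and a time point beyond the duration yields no chapter. A minimal sketch of that rule, assuming the titles and time points have already been pulled out of the description HTML. The helper names `to_seconds` and `build_chapters` are hypothetical and are not the extractor's real implementation.

```python
import re

# Hypothetical helpers for illustration only.
_TIME_RE = re.compile(r'(?:(\d+):)?(\d{1,2}):(\d{2})')


def to_seconds(time_point):
    """Convert 'M:SS', 'MM:SS' or 'H:MM:SS' into a number of seconds."""
    hours, minutes, seconds = _TIME_RE.fullmatch(time_point).groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)


def build_chapters(titled_time_points, duration):
    """Build chapter dicts from (title, time point) pairs in playback order."""
    chapters = []
    for index, (title, time_point) in enumerate(titled_time_points):
        start_time = to_seconds(time_point)
        if start_time > duration:
            # A time point past the end of the video cannot start a chapter.
            break
        if index + 1 < len(titled_time_points):
            # A chapter ends where the next one starts, capped at the duration.
            end_time = min(to_seconds(titled_time_points[index + 1][1]), duration)
        else:
            end_time = duration
        if start_time > end_time:
            break
        chapters.append({
            'start_time': start_time,
            'end_time': end_time,
            'title': title,
        })
    return chapters
```

Fed the three-track listing from the second case above with a duration of 1138, this gives the boundaries 0, 490, 870 and 1138.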


@ -86,13 +86,9 @@ class TestPlayerInfo(unittest.TestCase):
('https://www.youtube.com/yts/jsbin/player-en_US-vflaxXRn1/base.js', 'vflaxXRn1'), ('https://www.youtube.com/yts/jsbin/player-en_US-vflaxXRn1/base.js', 'vflaxXRn1'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'), ('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'), ('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'),
('http://s.ytimg.com/yt/swfbin/watch_as3-vflrEm9Nq.swf', 'vflrEm9Nq'),
('https://s.ytimg.com/yts/swfbin/player-vflenCdZL/watch_as3.swf', 'vflenCdZL'),
) )
for player_url, expected_player_id in PLAYER_URLS: for player_url, expected_player_id in PLAYER_URLS:
expected_player_type = player_url.split('.')[-1] player_id = YoutubeIE._extract_player_info(player_url)
player_type, player_id = YoutubeIE._extract_player_info(player_url)
self.assertEqual(player_type, expected_player_type)
self.assertEqual(player_id, expected_player_id) self.assertEqual(player_id, expected_player_id)
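After this change the test only checks the player ID, since the SWF URLs and the player type are gone. A rough sketch of an ID-only lookup that satisfies the three remaining URLs. The pattern and the function name `extract_player_id` are assumptions written to fit these URLs; they are not the regex used by `YoutubeIE._extract_player_info`.

```python
import re

# Assumed pattern: matches only the JavaScript player URL shapes listed in the test.
_PLAYER_ID_RE = re.compile(
    r'/(?:html5)?player-[a-zA-Z]{2}_[a-zA-Z]{2}-(?P<id>[0-9A-Za-z_-]+)(?:/|\.js$)')


def extract_player_id(player_url):
    """Return the player ID embedded in a player URL, or raise ValueError."""
    mobj = _PLAYER_ID_RE.search(player_url)
    if not mobj:
        raise ValueError('Cannot identify player %r' % player_url)
    return mobj.group('id')
```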


@ -163,6 +163,7 @@ class YoutubeDL(object):
simulate: Do not download the video files. simulate: Do not download the video files.
format: Video format code. See options.py for more information. format: Video format code. See options.py for more information.
outtmpl: Template for output names. outtmpl: Template for output names.
outtmpl_na_placeholder: Placeholder for unavailable meta fields.
restrictfilenames: Do not allow "&" and spaces in file names restrictfilenames: Do not allow "&" and spaces in file names
ignoreerrors: Do not stop on download errors. ignoreerrors: Do not stop on download errors.
force_generic_extractor: Force downloader to use the generic extractor force_generic_extractor: Force downloader to use the generic extractor
@ -338,6 +339,8 @@ class YoutubeDL(object):
_pps = [] _pps = []
_download_retcode = None _download_retcode = None
_num_downloads = None _num_downloads = None
_playlist_level = 0
_playlist_urls = set()
_screen_file = None _screen_file = None
def __init__(self, params=None, auto_init=True): def __init__(self, params=None, auto_init=True):
@ -656,7 +659,7 @@ class YoutubeDL(object):
template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v)) template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
for k, v in template_dict.items() for k, v in template_dict.items()
if v is not None and not isinstance(v, (list, tuple, dict))) if v is not None and not isinstance(v, (list, tuple, dict)))
template_dict = collections.defaultdict(lambda: 'NA', template_dict) template_dict = collections.defaultdict(lambda: self.params.get('outtmpl_na_placeholder', 'NA'), template_dict)
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL) outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
@ -676,8 +679,8 @@ class YoutubeDL(object):
# Missing numeric fields used together with integer presentation types # Missing numeric fields used together with integer presentation types
# in format specification will break the argument substitution since # in format specification will break the argument substitution since
# string 'NA' is returned for missing fields. We will patch output # string NA placeholder is returned for missing fields. We will patch
# template for missing fields to meet string presentation type. # output template for missing fields to meet string presentation type.
for numeric_field in self._NUMERIC_FIELDS: for numeric_field in self._NUMERIC_FIELDS:
if numeric_field not in template_dict: if numeric_field not in template_dict:
# As of [1] format syntax is: # As of [1] format syntax is:
@ -906,115 +909,23 @@ class YoutubeDL(object):
return self.process_ie_result( return self.process_ie_result(
new_result, download=download, extra_info=extra_info) new_result, download=download, extra_info=extra_info)
elif result_type in ('playlist', 'multi_video'): elif result_type in ('playlist', 'multi_video'):
# We process each entry in the playlist # Protect from infinite recursion due to recursively nested playlists
playlist = ie_result.get('title') or ie_result.get('id') # (see https://github.com/ytdl-org/youtube-dl/issues/27833)
self.to_screen('[download] Downloading playlist: %s' % playlist) webpage_url = ie_result['webpage_url']
if webpage_url in self._playlist_urls:
playlist_results = []
playliststart = self.params.get('playliststart', 1) - 1
playlistend = self.params.get('playlistend')
# For backwards compatibility, interpret -1 as whole list
if playlistend == -1:
playlistend = None
playlistitems_str = self.params.get('playlist_items')
playlistitems = None
if playlistitems_str is not None:
def iter_playlistitems(format):
for string_segment in format.split(','):
if '-' in string_segment:
start, end = string_segment.split('-')
for item in range(int(start), int(end) + 1):
yield int(item)
else:
yield int(string_segment)
playlistitems = orderedSet(iter_playlistitems(playlistitems_str))
ie_entries = ie_result['entries']
def make_playlistitems_entries(list_ie_entries):
num_entries = len(list_ie_entries)
return [
list_ie_entries[i - 1] for i in playlistitems
if -num_entries <= i - 1 < num_entries]
def report_download(num_entries):
self.to_screen( self.to_screen(
'[%s] playlist %s: Downloading %d videos' % '[download] Skipping already downloaded playlist: %s'
(ie_result['extractor'], playlist, num_entries)) % (ie_result.get('title') or ie_result.get('id')))
return
if isinstance(ie_entries, list): self._playlist_level += 1
n_all_entries = len(ie_entries) self._playlist_urls.add(webpage_url)
if playlistitems: try:
entries = make_playlistitems_entries(ie_entries) return self.__process_playlist(ie_result, download)
else: finally:
entries = ie_entries[playliststart:playlistend] self._playlist_level -= 1
n_entries = len(entries) if not self._playlist_level:
self.to_screen( self._playlist_urls.clear()
'[%s] playlist %s: Collected %d video ids (downloading %d of them)' %
(ie_result['extractor'], playlist, n_all_entries, n_entries))
elif isinstance(ie_entries, PagedList):
if playlistitems:
entries = []
for item in playlistitems:
entries.extend(ie_entries.getslice(
item - 1, item
))
else:
entries = ie_entries.getslice(
playliststart, playlistend)
n_entries = len(entries)
report_download(n_entries)
else: # iterable
if playlistitems:
entries = make_playlistitems_entries(list(itertools.islice(
ie_entries, 0, max(playlistitems))))
else:
entries = list(itertools.islice(
ie_entries, playliststart, playlistend))
n_entries = len(entries)
report_download(n_entries)
if self.params.get('playlistreverse', False):
entries = entries[::-1]
if self.params.get('playlistrandom', False):
random.shuffle(entries)
x_forwarded_for = ie_result.get('__x_forwarded_for_ip')
for i, entry in enumerate(entries, 1):
self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
# This __x_forwarded_for_ip thing is a bit ugly but requires
# minimal changes
if x_forwarded_for:
entry['__x_forwarded_for_ip'] = x_forwarded_for
extra = {
'n_entries': n_entries,
'playlist': playlist,
'playlist_id': ie_result.get('id'),
'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'),
'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_index': playlistitems[i - 1] if playlistitems else i + playliststart,
'extractor': ie_result['extractor'],
'webpage_url': ie_result['webpage_url'],
'webpage_url_basename': url_basename(ie_result['webpage_url']),
'extractor_key': ie_result['extractor_key'],
}
reason = self._match_entry(entry, incomplete=True)
if reason is not None:
self.to_screen('[download] ' + reason)
continue
entry_result = self.__process_iterable_entry(entry, download, extra)
# TODO: skip failed (empty) entries?
playlist_results.append(entry_result)
ie_result['entries'] = playlist_results
self.to_screen('[download] Finished downloading playlist: %s' % playlist)
return ie_result
elif result_type == 'compat_list': elif result_type == 'compat_list':
self.report_warning( self.report_warning(
'Extractor %s returned a compat_list result. ' 'Extractor %s returned a compat_list result. '
@ -1039,6 +950,118 @@ class YoutubeDL(object):
else: else:
raise Exception('Invalid result type: %s' % result_type) raise Exception('Invalid result type: %s' % result_type)
def __process_playlist(self, ie_result, download):
# We process each entry in the playlist
playlist = ie_result.get('title') or ie_result.get('id')
self.to_screen('[download] Downloading playlist: %s' % playlist)
playlist_results = []
playliststart = self.params.get('playliststart', 1) - 1
playlistend = self.params.get('playlistend')
# For backwards compatibility, interpret -1 as whole list
if playlistend == -1:
playlistend = None
playlistitems_str = self.params.get('playlist_items')
playlistitems = None
if playlistitems_str is not None:
def iter_playlistitems(format):
for string_segment in format.split(','):
if '-' in string_segment:
start, end = string_segment.split('-')
for item in range(int(start), int(end) + 1):
yield int(item)
else:
yield int(string_segment)
playlistitems = orderedSet(iter_playlistitems(playlistitems_str))
ie_entries = ie_result['entries']
def make_playlistitems_entries(list_ie_entries):
num_entries = len(list_ie_entries)
return [
list_ie_entries[i - 1] for i in playlistitems
if -num_entries <= i - 1 < num_entries]
def report_download(num_entries):
self.to_screen(
'[%s] playlist %s: Downloading %d videos' %
(ie_result['extractor'], playlist, num_entries))
if isinstance(ie_entries, list):
n_all_entries = len(ie_entries)
if playlistitems:
entries = make_playlistitems_entries(ie_entries)
else:
entries = ie_entries[playliststart:playlistend]
n_entries = len(entries)
self.to_screen(
'[%s] playlist %s: Collected %d video ids (downloading %d of them)' %
(ie_result['extractor'], playlist, n_all_entries, n_entries))
elif isinstance(ie_entries, PagedList):
if playlistitems:
entries = []
for item in playlistitems:
entries.extend(ie_entries.getslice(
item - 1, item
))
else:
entries = ie_entries.getslice(
playliststart, playlistend)
n_entries = len(entries)
report_download(n_entries)
else: # iterable
if playlistitems:
entries = make_playlistitems_entries(list(itertools.islice(
ie_entries, 0, max(playlistitems))))
else:
entries = list(itertools.islice(
ie_entries, playliststart, playlistend))
n_entries = len(entries)
report_download(n_entries)
if self.params.get('playlistreverse', False):
entries = entries[::-1]
if self.params.get('playlistrandom', False):
random.shuffle(entries)
x_forwarded_for = ie_result.get('__x_forwarded_for_ip')
for i, entry in enumerate(entries, 1):
self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
# This __x_forwarded_for_ip thing is a bit ugly but requires
# minimal changes
if x_forwarded_for:
entry['__x_forwarded_for_ip'] = x_forwarded_for
extra = {
'n_entries': n_entries,
'playlist': playlist,
'playlist_id': ie_result.get('id'),
'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'),
'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_index': playlistitems[i - 1] if playlistitems else i + playliststart,
'extractor': ie_result['extractor'],
'webpage_url': ie_result['webpage_url'],
'webpage_url_basename': url_basename(ie_result['webpage_url']),
'extractor_key': ie_result['extractor_key'],
}
reason = self._match_entry(entry, incomplete=True)
if reason is not None:
self.to_screen('[download] ' + reason)
continue
entry_result = self.__process_iterable_entry(entry, download, extra)
# TODO: skip failed (empty) entries?
playlist_results.append(entry_result)
ie_result['entries'] = playlist_results
self.to_screen('[download] Finished downloading playlist: %s' % playlist)
return ie_result
@__handle_extraction_exceptions @__handle_extraction_exceptions
def __process_iterable_entry(self, entry, download, extra_info): def __process_iterable_entry(self, entry, download, extra_info):
return self.process_ie_result( return self.process_ie_result(
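The guard added around `__process_playlist` can be read in isolation: a set of playlist URLs currently being processed, plus a nesting counter that clears the set once the outermost playlist finishes. A minimal sketch of that pattern, using a hypothetical `PlaylistWalker` class in place of `YoutubeDL` and plain dicts in place of real extractor results.

```python
class PlaylistWalker(object):
    """Hypothetical stand-in for the downloader; only the guard is modelled."""

    def __init__(self):
        self._playlist_level = 0
        self._playlist_urls = set()
        self.visited = []

    def process(self, playlist):
        webpage_url = playlist['webpage_url']
        if webpage_url in self._playlist_urls:
            # This playlist is already being processed further up the stack.
            return
        self._playlist_level += 1
        self._playlist_urls.add(webpage_url)
        try:
            self.visited.append(webpage_url)
            for entry in playlist.get('entries', []):
                self.process(entry)
        finally:
            self._playlist_level -= 1
            if not self._playlist_level:
                # Outermost playlist finished: forget everything seen so far.
                self._playlist_urls.clear()
```

Keeping the state on the instance in this sketch avoids sharing one set between several walker objects; the diff above declares both attributes at class level.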


@ -340,6 +340,7 @@ def _real_main(argv=None):
'format': opts.format, 'format': opts.format,
'listformats': opts.listformats, 'listformats': opts.listformats,
'outtmpl': outtmpl, 'outtmpl': outtmpl,
'outtmpl_na_placeholder': opts.outtmpl_na_placeholder,
'autonumber_size': opts.autonumber_size, 'autonumber_size': opts.autonumber_size,
'autonumber_start': opts.autonumber_start, 'autonumber_start': opts.autonumber_start,
'restrictfilenames': opts.restrictfilenames, 'restrictfilenames': opts.restrictfilenames,
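The effect of `outtmpl_na_placeholder` is easiest to see on a small template. A minimal sketch, assuming a hypothetical `render` helper that mirrors only the `defaultdict` line from the `YoutubeDL.py` hunk; sanitizing and the numeric-field patching are left out.

```python
import collections


def render(outtmpl, info, params):
    """Fill an output template, substituting a placeholder for missing fields."""
    placeholder = params.get('outtmpl_na_placeholder', 'NA')
    # Any key absent from info resolves to the placeholder during % formatting.
    template_dict = collections.defaultdict(lambda: placeholder, info)
    return outtmpl % template_dict


info = {'title': 'clip', 'ext': 'mp4'}
default_name = render('%(title)s-%(uploader)s.%(ext)s', info, {})
custom_name = render(
    '%(title)s-%(uploader)s.%(ext)s', info, {'outtmpl_na_placeholder': 'unknown'})
```

Without the option the missing `uploader` field falls back to `NA`, as before the change.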


@ -1,14 +1,15 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import calendar
import re import re
import time
from .amp import AMPIE from .amp import AMPIE
from .common import InfoExtractor from .common import InfoExtractor
from .youtube import YoutubeIE from ..utils import (
from ..compat import compat_urlparse parse_duration,
parse_iso8601,
try_get,
)
class AbcNewsVideoIE(AMPIE): class AbcNewsVideoIE(AMPIE):
@ -18,8 +19,8 @@ class AbcNewsVideoIE(AMPIE):
(?: (?:
abcnews\.go\.com/ abcnews\.go\.com/
(?: (?:
[^/]+/video/(?P<display_id>[0-9a-z-]+)-| (?:[^/]+/)*video/(?P<display_id>[0-9a-z-]+)-|
video/embed\?.*?\bid= video/(?:embed|itemfeed)\?.*?\bid=
)| )|
fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/ fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/
) )
@ -36,6 +37,8 @@ class AbcNewsVideoIE(AMPIE):
'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.', 'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.',
'duration': 180, 'duration': 180,
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1380454200,
'upload_date': '20130929',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@ -47,6 +50,12 @@ class AbcNewsVideoIE(AMPIE):
}, { }, {
'url': 'http://abcnews.go.com/2020/video/2020-husband-stands-teacher-jail-student-affairs-26119478', 'url': 'http://abcnews.go.com/2020/video/2020-husband-stands-teacher-jail-student-affairs-26119478',
'only_matching': True, 'only_matching': True,
}, {
'url': 'http://abcnews.go.com/video/itemfeed?id=46979033',
'only_matching': True,
}, {
'url': 'https://abcnews.go.com/GMA/News/video/history-christmas-story-67894761',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -67,28 +76,23 @@ class AbcNewsIE(InfoExtractor):
_VALID_URL = r'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)' _VALID_URL = r'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY', # Youtube Embeds
'url': 'https://abcnews.go.com/Entertainment/peter-billingsley-child-actor-christmas-story-hollywood-power/story?id=51286501',
'info_dict': { 'info_dict': {
'id': '10505354', 'id': '51286501',
'ext': 'flv', 'title': "Peter Billingsley: From child actor in 'A Christmas Story' to Hollywood power player",
'display_id': 'dramatic-video-rare-death-job-america', 'description': 'Billingsley went from a child actor to Hollywood power player.',
'title': 'Occupational Hazards',
'description': 'Nightline investigates the dangers that lurk at various jobs.',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20100428',
'timestamp': 1272412800,
}, },
'add_ie': ['AbcNewsVideo'], 'playlist_count': 5,
}, { }, {
'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818', 'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818',
'info_dict': { 'info_dict': {
'id': '38897857', 'id': '38897857',
'ext': 'mp4', 'ext': 'mp4',
'display_id': 'justin-timberlake-performs-stop-feeling-eurovision-2016',
'title': 'Justin Timberlake Drops Hints For Secret Single', 'title': 'Justin Timberlake Drops Hints For Secret Single',
'description': 'Lara Spencer reports the buzziest stories of the day in "GMA" Pop News.', 'description': 'Lara Spencer reports the buzziest stories of the day in "GMA" Pop News.',
'upload_date': '20160515', 'upload_date': '20160505',
'timestamp': 1463329500, 'timestamp': 1462442280,
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@ -100,49 +104,55 @@ class AbcNewsIE(InfoExtractor):
}, { }, {
'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343', 'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343',
'only_matching': True, 'only_matching': True,
}, {
# inline.type == 'video'
'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) story_id = self._match_id(url)
display_id = mobj.group('display_id') webpage = self._download_webpage(url, story_id)
video_id = mobj.group('id') story = self._parse_json(self._search_regex(
r"window\['__abcnews__'\]\s*=\s*({.+?});",
webpage, 'data'), story_id)['page']['content']['story']['everscroll'][0]
article_contents = story.get('articleContents') or {}
webpage = self._download_webpage(url, video_id) def entries():
video_url = self._search_regex( featured_video = story.get('featuredVideo') or {}
r'window\.abcnvideo\.url\s*=\s*"([^"]+)"', webpage, 'video URL') feed = try_get(featured_video, lambda x: x['video']['feed'])
full_video_url = compat_urlparse.urljoin(url, video_url) if feed:
yield {
'_type': 'url',
'id': featured_video.get('id'),
'title': featured_video.get('name'),
'url': feed,
'thumbnail': featured_video.get('images'),
'description': featured_video.get('description'),
'timestamp': parse_iso8601(featured_video.get('uploadDate')),
'duration': parse_duration(featured_video.get('duration')),
'ie_key': AbcNewsVideoIE.ie_key(),
}
youtube_url = YoutubeIE._extract_url(webpage) for inline in (article_contents.get('inlines') or []):
inline_type = inline.get('type')
if inline_type == 'iframe':
iframe_url = try_get(inline, lambda x: x['attrs']['src'])
if iframe_url:
yield self.url_result(iframe_url)
elif inline_type == 'video':
video_id = inline.get('id')
if video_id:
yield {
'_type': 'url',
'id': video_id,
'url': 'http://abcnews.go.com/video/embed?id=' + video_id,
'thumbnail': inline.get('imgSrc') or inline.get('imgDefault'),
'description': inline.get('description'),
'duration': parse_duration(inline.get('duration')),
'ie_key': AbcNewsVideoIE.ie_key(),
}
timestamp = None return self.playlist_result(
date_str = self._html_search_regex( entries(), story_id, article_contents.get('headline'),
r'<span[^>]+class="timestamp">([^<]+)</span>', article_contents.get('subHead'))
webpage, 'timestamp', fatal=False)
if date_str:
tz_offset = 0
if date_str.endswith(' ET'): # Eastern Time
tz_offset = -5
date_str = date_str[:-3]
date_formats = ['%b. %d, %Y', '%b %d, %Y, %I:%M %p']
for date_format in date_formats:
try:
timestamp = calendar.timegm(time.strptime(date_str.strip(), date_format))
except ValueError:
continue
if timestamp is not None:
timestamp -= tz_offset * 3600
entry = {
'_type': 'url_transparent',
'ie_key': AbcNewsVideoIE.ie_key(),
'url': full_video_url,
'id': video_id,
'display_id': display_id,
'timestamp': timestamp,
}
if youtube_url:
entries = [entry, self.url_result(youtube_url, ie=YoutubeIE.ie_key())]
return self.playlist_result(entries)
return entry
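The rewritten `AbcNewsIE` reads the story from a JSON blob assigned to `window['__abcnews__']` and then walks its `inlines`. A standalone sketch of that walk using only the standard library. The function name `iter_inline_urls` is hypothetical and the sample page in the usage lines is made up; the real extractor goes through `_search_regex`, `_parse_json` and `try_get`, and also handles the featured video.

```python
import json
import re


def iter_inline_urls(webpage):
    """Yield URLs for iframe and video inlines found in the page's story JSON."""
    data = json.loads(re.search(
        r"window\['__abcnews__'\]\s*=\s*(\{.+?\});", webpage).group(1))
    story = data['page']['content']['story']['everscroll'][0]
    article_contents = story.get('articleContents') or {}
    for inline in article_contents.get('inlines') or []:
        inline_type = inline.get('type')
        if inline_type == 'iframe':
            iframe_url = (inline.get('attrs') or {}).get('src')
            if iframe_url:
                yield iframe_url
        elif inline_type == 'video' and inline.get('id'):
            yield 'http://abcnews.go.com/video/embed?id=' + inline['id']


# Made-up page content, used only to exercise the function.
sample_story = {'page': {'content': {'story': {'everscroll': [{
    'articleContents': {'inlines': [
        {'type': 'iframe', 'attrs': {'src': 'https://www.youtube.com/embed/xyz'}},
        {'type': 'video', 'id': '12345'},
        {'type': 'image'},
    ]},
}]}}}}
sample_page = "<script>window['__abcnews__'] = %s;</script>" % json.dumps(sample_story)
inline_urls = list(iter_inline_urls(sample_page))
```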


@ -26,6 +26,7 @@ from ..utils import (
strip_or_none, strip_or_none,
try_get, try_get,
unified_strdate, unified_strdate,
urlencode_postdata,
) )
@ -51,9 +52,12 @@ class ADNIE(InfoExtractor):
} }
} }
_NETRC_MACHINE = 'animedigitalnetwork'
_BASE_URL = 'http://animedigitalnetwork.fr' _BASE_URL = 'http://animedigitalnetwork.fr'
_API_BASE_URL = 'https://gw.api.animedigitalnetwork.fr/' _API_BASE_URL = 'https://gw.api.animedigitalnetwork.fr/'
_PLAYER_BASE_URL = _API_BASE_URL + 'player/' _PLAYER_BASE_URL = _API_BASE_URL + 'player/'
_HEADERS = {}
_LOGIN_ERR_MESSAGE = 'Unable to log in'
_RSA_KEY = (0x9B42B08905199A5CCE2026274399CA560ECB209EE9878A708B1C0812E1BB8CB5D1FB7441861147C1A1F2F3A0476DD63A9CAC20D3E983613346850AA6CB38F16DC7D720FD7D86FC6E5B3D5BBC72E14CD0BF9E869F2CEA2CCAD648F1DCE38F1FF916CEFB2D339B64AA0264372344BC775E265E8A852F88144AB0BD9AA06C1A4ABB, 65537) _RSA_KEY = (0x9B42B08905199A5CCE2026274399CA560ECB209EE9878A708B1C0812E1BB8CB5D1FB7441861147C1A1F2F3A0476DD63A9CAC20D3E983613346850AA6CB38F16DC7D720FD7D86FC6E5B3D5BBC72E14CD0BF9E869F2CEA2CCAD648F1DCE38F1FF916CEFB2D339B64AA0264372344BC775E265E8A852F88144AB0BD9AA06C1A4ABB, 65537)
_POS_ALIGN_MAP = { _POS_ALIGN_MAP = {
'start': 1, 'start': 1,
@ -129,19 +133,42 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
}]) }])
return subtitles return subtitles
def _real_initialize(self):
username, password = self._get_login_info()
if not username:
return
try:
access_token = (self._download_json(
self._API_BASE_URL + 'authentication/login', None,
'Logging in', self._LOGIN_ERR_MESSAGE, fatal=False,
data=urlencode_postdata({
'password': password,
'rememberMe': False,
'source': 'Web',
'username': username,
})) or {}).get('accessToken')
if access_token:
self._HEADERS = {'authorization': 'Bearer ' + access_token}
except ExtractorError as e:
message = None
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
resp = self._parse_json(
e.cause.read().decode(), None, fatal=False) or {}
message = resp.get('message') or resp.get('code')
self.report_warning(message or self._LOGIN_ERR_MESSAGE)
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
video_base_url = self._PLAYER_BASE_URL + 'video/%s/' % video_id video_base_url = self._PLAYER_BASE_URL + 'video/%s/' % video_id
player = self._download_json( player = self._download_json(
video_base_url + 'configuration', video_id, video_base_url + 'configuration', video_id,
'Downloading player config JSON metadata')['player'] 'Downloading player config JSON metadata',
headers=self._HEADERS)['player']
options = player['options'] options = player['options']
user = options['user'] user = options['user']
if not user.get('hasAccess'): if not user.get('hasAccess'):
raise ExtractorError( self.raise_login_required()
'This video is only available for paying users', expected=True)
# self.raise_login_required() # FIXME: Login is not implemented
token = self._download_json( token = self._download_json(
user.get('refreshTokenUrl') or (self._PLAYER_BASE_URL + 'refresh/token'), user.get('refreshTokenUrl') or (self._PLAYER_BASE_URL + 'refresh/token'),
@ -188,8 +215,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
message = error.get('message') message = error.get('message')
if e.cause.code == 403 and error.get('code') == 'player-bad-geolocation-country': if e.cause.code == 403 and error.get('code') == 'player-bad-geolocation-country':
self.raise_geo_restricted(msg=message) self.raise_geo_restricted(msg=message)
else: raise ExtractorError(message)
raise ExtractorError(message)
else: else:
raise ExtractorError('Giving up retrying') raise ExtractorError('Giving up retrying')


@ -252,11 +252,11 @@ class AENetworksShowIE(AENetworksListBaseIE):
_TESTS = [{ _TESTS = [{
'url': 'http://www.history.com/shows/ancient-aliens', 'url': 'http://www.history.com/shows/ancient-aliens',
'info_dict': { 'info_dict': {
'id': 'SH012427480000', 'id': 'SERIES1574',
'title': 'Ancient Aliens', 'title': 'Ancient Aliens',
'description': 'md5:3f6d74daf2672ff3ae29ed732e37ea7f', 'description': 'md5:3f6d74daf2672ff3ae29ed732e37ea7f',
}, },
'playlist_mincount': 168, 'playlist_mincount': 150,
}] }]
_RESOURCE = 'series' _RESOURCE = 'series'
_ITEMS_KEY = 'episodes' _ITEMS_KEY = 'episodes'


@ -1,13 +1,16 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor from .common import InfoExtractor
class AlJazeeraIE(InfoExtractor): class AlJazeeraIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?aljazeera\.com/(?:programmes|video)/.*?/(?P<id>[^/]+)\.html' _VALID_URL = r'https?://(?:www\.)?aljazeera\.com/(?P<type>program/[^/]+|(?:feature|video)s)/\d{4}/\d{1,2}/\d{1,2}/(?P<id>[^/?&#]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html', 'url': 'https://www.aljazeera.com/program/episode/2014/9/19/deliverance',
'info_dict': { 'info_dict': {
'id': '3792260579001', 'id': '3792260579001',
'ext': 'mp4', 'ext': 'mp4',
@ -20,14 +23,34 @@ class AlJazeeraIE(InfoExtractor):
'add_ie': ['BrightcoveNew'], 'add_ie': ['BrightcoveNew'],
'skip': 'Not accessible from Travis CI server', 'skip': 'Not accessible from Travis CI server',
}, { }, {
'url': 'http://www.aljazeera.com/video/news/2017/05/sierra-leone-709-carat-diamond-auctioned-170511100111930.html', 'url': 'https://www.aljazeera.com/videos/2017/5/11/sierra-leone-709-carat-diamond-to-be-auctioned-off',
'only_matching': True,
}, {
'url': 'https://www.aljazeera.com/features/2017/8/21/transforming-pakistans-buses-into-art',
'only_matching': True, 'only_matching': True,
}] }]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/665003303001/default_default/index.html?videoId=%s' BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_default/index.html?videoId=%s'
def _real_extract(self, url): def _real_extract(self, url):
program_name = self._match_id(url) post_type, name = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, program_name) post_type = {
brightcove_id = self._search_regex( 'features': 'post',
r'RenderPagesVideo\(\'(.+?)\'', webpage, 'brightcove id') 'program': 'episode',
return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id) 'videos': 'video',
}[post_type.split('/')[0]]
video = self._download_json(
'https://www.aljazeera.com/graphql', name, query={
'operationName': 'SingleArticleQuery',
'variables': json.dumps({
'name': name,
'postType': post_type,
}),
}, headers={
'wp-site': 'aje',
})['data']['article']['video']
video_id = video['id']
account_id = video.get('accountId') or '665003303001'
player_id = video.get('playerId') or 'BkeSH5BDb'
return self.url_result(
self.BRIGHTCOVE_URL_TEMPLATE % (account_id, player_id, video_id),
'BrightcoveNew', video_id)
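
Reviewer sketch (not part of the commit): the rewritten `_real_extract` maps the first path segment of the URL to a GraphQL `postType` and then fills the three-slot Brightcove template. The snippet below reproduces only that mapping, using the regex, template, and fallback IDs from the diff. The helper names `build_query` and `brightcove_url` are hypothetical, and no network request is made.

```python
import json
import re

# Regex, template and fallback IDs copied from the new side of the diff.
VALID_URL = r'https?://(?:www\.)?aljazeera\.com/(?P<type>program/[^/]+|(?:feature|video)s)/\d{4}/\d{1,2}/\d{1,2}/(?P<id>[^/?&#]+)'
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_default/index.html?videoId=%s'
POST_TYPES = {'features': 'post', 'program': 'episode', 'videos': 'video'}


def build_query(url):
    """Return the GraphQL query parameters for an article URL (hypothetical helper)."""
    m = re.match(VALID_URL, url)
    if m is None:
        raise ValueError('unsupported URL: %s' % url)
    post_type, name = m.group('type', 'id')
    # 'program/<show>' collapses to 'program' before the lookup.
    post_type = POST_TYPES[post_type.split('/')[0]]
    return {
        'operationName': 'SingleArticleQuery',
        'variables': json.dumps({'name': name, 'postType': post_type}),
    }


def brightcove_url(video):
    """Fill the player template, falling back to the defaults used in the diff."""
    account_id = video.get('accountId') or '665003303001'
    player_id = video.get('playerId') or 'BkeSH5BDb'
    return BRIGHTCOVE_URL_TEMPLATE % (account_id, player_id, video['id'])
```

The `or` fallbacks cover both a missing key and an explicit `null` in the JSON response, which `dict.get` with a default argument alone would not.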


@ -1,13 +1,16 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import json
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
clean_html, clean_html,
int_or_none,
try_get, try_get,
unified_strdate, unified_strdate,
unified_timestamp,
) )
@ -22,8 +25,8 @@ class AmericasTestKitchenIE(InfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'description': 'md5:64e606bfee910627efc4b5f050de92b3', 'description': 'md5:64e606bfee910627efc4b5f050de92b3',
'thumbnail': r're:^https?://', 'thumbnail': r're:^https?://',
'timestamp': 1523664000, 'timestamp': 1523318400,
'upload_date': '20180414', 'upload_date': '20180410',
'release_date': '20180410', 'release_date': '20180410',
'series': "America's Test Kitchen", 'series': "America's Test Kitchen",
'season_number': 18, 'season_number': 18,
@ -33,6 +36,27 @@ class AmericasTestKitchenIE(InfoExtractor):
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
}, {
# Metadata parsing behaves differently for newer episodes (705) as opposed to older episodes (582 above)
'url': 'https://www.americastestkitchen.com/episode/705-simple-chicken-dinner',
'md5': '06451608c57651e985a498e69cec17e5',
'info_dict': {
'id': '5fbe8c61bda2010001c6763b',
'title': 'Simple Chicken Dinner',
'ext': 'mp4',
'description': 'md5:eb68737cc2fd4c26ca7db30139d109e7',
'thumbnail': r're:^https?://',
'timestamp': 1610755200,
'upload_date': '20210116',
'release_date': '20210116',
'series': "America's Test Kitchen",
'season_number': 21,
'episode': 'Simple Chicken Dinner',
'episode_number': 3,
},
'params': {
'skip_download': True,
},
}, { }, {
'url': 'https://www.americastestkitchen.com/videos/3420-pan-seared-salmon', 'url': 'https://www.americastestkitchen.com/videos/3420-pan-seared-salmon',
'only_matching': True, 'only_matching': True,
@ -60,7 +84,76 @@ class AmericasTestKitchenIE(InfoExtractor):
'url': 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % video['zypeId'], 'url': 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % video['zypeId'],
'ie_key': 'Zype', 'ie_key': 'Zype',
'description': clean_html(video.get('description')), 'description': clean_html(video.get('description')),
'timestamp': unified_timestamp(video.get('publishDate')),
'release_date': unified_strdate(video.get('publishDate')), 'release_date': unified_strdate(video.get('publishDate')),
'episode_number': int_or_none(episode.get('number')),
'season_number': int_or_none(episode.get('season')),
'series': try_get(episode, lambda x: x['show']['title']), 'series': try_get(episode, lambda x: x['show']['title']),
'episode': episode.get('title'), 'episode': episode.get('title'),
} }
class AmericasTestKitchenSeasonIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?P<show>americastestkitchen|cookscountry)\.com/episodes/browse/season_(?P<id>\d+)'
_TESTS = [{
# ATK Season
'url': 'https://www.americastestkitchen.com/episodes/browse/season_1',
'info_dict': {
'id': 'season_1',
'title': 'Season 1',
},
'playlist_count': 13,
}, {
# Cooks Country Season
'url': 'https://www.cookscountry.com/episodes/browse/season_12',
'info_dict': {
'id': 'season_12',
'title': 'Season 12',
},
'playlist_count': 13,
}]
def _real_extract(self, url):
show_name, season_number = re.match(self._VALID_URL, url).groups()
season_number = int(season_number)
slug = 'atk' if show_name == 'americastestkitchen' else 'cco'
season = 'Season %d' % season_number
season_search = self._download_json(
'https://y1fnzxui30-dsn.algolia.net/1/indexes/everest_search_%s_season_desc_production' % slug,
season, headers={
'Origin': 'https://www.%s.com' % show_name,
'X-Algolia-API-Key': '8d504d0099ed27c1b73708d22871d805',
'X-Algolia-Application-Id': 'Y1FNZXUI30',
}, query={
'facetFilters': json.dumps([
'search_season_list:' + season,
'search_document_klass:episode',
'search_show_slug:' + slug,
]),
'attributesToRetrieve': 'description,search_%s_episode_number,search_document_date,search_url,title' % slug,
'attributesToHighlight': '',
'hitsPerPage': 1000,
})
def entries():
for episode in (season_search.get('hits') or []):
search_url = episode.get('search_url')
if not search_url:
continue
yield {
'_type': 'url',
'url': 'https://www.%s.com%s' % (show_name, search_url),
'id': try_get(episode, lambda e: e['objectID'].split('_')[-1]),
'title': episode.get('title'),
'description': episode.get('description'),
'timestamp': unified_timestamp(episode.get('search_document_date')),
'season_number': season_number,
'episode_number': int_or_none(episode.get('search_%s_episode_number' % slug)),
'ie_key': AmericasTestKitchenIE.ie_key(),
}
return self.playlist_result(
entries(), 'season_%d' % season_number, season)


@ -8,6 +8,7 @@ from ..utils import (
int_or_none, int_or_none,
mimetype2ext, mimetype2ext,
parse_iso8601, parse_iso8601,
unified_timestamp,
url_or_none, url_or_none,
) )
@ -88,7 +89,7 @@ class AMPIE(InfoExtractor):
self._sort_formats(formats) self._sort_formats(formats)
timestamp = parse_iso8601(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date')) timestamp = unified_timestamp(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
return { return {
'id': video_id, 'id': video_id,


@ -116,8 +116,6 @@ class AnimeOnDemandIE(InfoExtractor):
r'(?s)<div[^>]+itemprop="description"[^>]*>(.+?)</div>', r'(?s)<div[^>]+itemprop="description"[^>]*>(.+?)</div>',
webpage, 'anime description', default=None) webpage, 'anime description', default=None)
entries = []
def extract_info(html, video_id, num=None): def extract_info(html, video_id, num=None):
title, description = [None] * 2 title, description = [None] * 2
formats = [] formats = []
@ -233,7 +231,7 @@ class AnimeOnDemandIE(InfoExtractor):
self._sort_formats(info['formats']) self._sort_formats(info['formats'])
f = common_info.copy() f = common_info.copy()
f.update(info) f.update(info)
entries.append(f) yield f
# Extract teaser/trailer only when full episode is not available # Extract teaser/trailer only when full episode is not available
if not info['formats']: if not info['formats']:
@ -247,7 +245,7 @@ class AnimeOnDemandIE(InfoExtractor):
'title': m.group('title'), 'title': m.group('title'),
'url': urljoin(url, m.group('href')), 'url': urljoin(url, m.group('href')),
}) })
entries.append(f) yield f
def extract_episodes(html): def extract_episodes(html):
for num, episode_html in enumerate(re.findall( for num, episode_html in enumerate(re.findall(
@ -275,7 +273,8 @@ class AnimeOnDemandIE(InfoExtractor):
'episode_number': episode_number, 'episode_number': episode_number,
} }
extract_entries(episode_html, video_id, common_info) for e in extract_entries(episode_html, video_id, common_info):
yield e
def extract_film(html, video_id): def extract_film(html, video_id):
common_info = { common_info = {
@ -283,11 +282,18 @@ class AnimeOnDemandIE(InfoExtractor):
'title': anime_title, 'title': anime_title,
'description': anime_description, 'description': anime_description,
} }
extract_entries(html, video_id, common_info) for e in extract_entries(html, video_id, common_info):
yield e
extract_episodes(webpage) def entries():
has_episodes = False
for e in extract_episodes(webpage):
has_episodes = True
yield e
if not entries: if not has_episodes:
extract_film(webpage, anime_id) for e in extract_film(webpage, anime_id):
yield e
return self.playlist_result(entries, anime_id, anime_title, anime_description) return self.playlist_result(
entries(), anime_id, anime_title, anime_description)
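
Reviewer sketch (not part of the commit): the hunks above replace a shared `entries` list with generators, so the "fall back to the film when there are no episodes" check can no longer test the list for emptiness and uses a flag instead. A minimal standalone version of that pattern, with the hypothetical name `entries_with_fallback` and plain callables standing in for `extract_episodes` and `extract_film`:

```python
def entries_with_fallback(primary, fallback):
    """Yield everything from primary(); if it yields nothing, yield from fallback().

    Both arguments are zero-argument callables returning iterables.
    The fallback is only invoked once the primary source is exhausted and empty.
    """
    has_primary = False
    for entry in primary():
        has_primary = True
        yield entry
    if not has_primary:
        # Explicit loop instead of `yield from`, which Python 2 lacks.
        for entry in fallback():
            yield entry
```

Because the result is lazy, nothing is extracted until the playlist is iterated, and the fallback is never evaluated when at least one episode exists.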


@ -3,7 +3,7 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .yahoo import YahooIE
from ..compat import ( from ..compat import (
compat_parse_qs, compat_parse_qs,
compat_urllib_parse_urlparse, compat_urllib_parse_urlparse,
@ -15,9 +15,9 @@ from ..utils import (
) )
class AolIE(InfoExtractor): class AolIE(YahooIE):
IE_NAME = 'aol.com' IE_NAME = 'aol.com'
_VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>[0-9a-f]+)' _VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>\d{9}|[0-9a-f]{24}|[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12})'
_TESTS = [{ _TESTS = [{
# video with 5min ID # video with 5min ID
@ -76,10 +76,16 @@ class AolIE(InfoExtractor):
}, { }, {
'url': 'https://www.aol.jp/video/playlist/5a28e936a1334d000137da0c/5a28f3151e642219fde19831/', 'url': 'https://www.aol.jp/video/playlist/5a28e936a1334d000137da0c/5a28f3151e642219fde19831/',
'only_matching': True, 'only_matching': True,
}, {
# Yahoo video
'url': 'https://www.aol.com/video/play/991e6700-ac02-11ea-99ff-357400036f61/24bbc846-3e30-3c46-915e-fe8ccd7fcc46/',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
if '-' in video_id:
return self._extract_yahoo_video(video_id, 'us')
response = self._download_json( response = self._download_json(
'https://feedapi.b2c.on.aol.com/v1.0/app/videos/aolon/%s/details' % video_id, 'https://feedapi.b2c.on.aol.com/v1.0/app/videos/aolon/%s/details' % video_id,
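
Reviewer sketch (not part of the commit): the tightened `_VALID_URL` accepts three ID shapes, and `_real_extract` routes IDs containing a hyphen to the Yahoo code path. The snippet below exercises only the regex and that dispatch rule. `classify` is a hypothetical helper, and the `'yahoo'` and `'aol'` labels are illustrative stand-ins for the two branches.

```python
import re

# Pattern copied from the new side of the diff.
VALID_URL = (
    r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)'
    r'(?P<id>\d{9}|[0-9a-f]{24}|[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12})'
)


def classify(url):
    """Return (branch, video_id), or None if the URL is not matched."""
    m = re.match(VALID_URL, url)
    if m is None:
        return None
    video_id = m.group('id')
    # Only the UUID alternative can contain a hyphen.
    return ('yahoo' if '-' in video_id else 'aol', video_id)
```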


@ -187,13 +187,13 @@ class ARDMediathekIE(ARDMediathekBaseIE):
if doc.tag == 'rss': if doc.tag == 'rss':
return GenericIE()._extract_rss(url, video_id, doc) return GenericIE()._extract_rss(url, video_id, doc)
title = self._html_search_regex( title = self._og_search_title(webpage, default=None) or self._html_search_regex(
[r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>', [r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>',
r'<meta name="dcterms\.title" content="(.*?)"/>', r'<meta name="dcterms\.title" content="(.*?)"/>',
r'<h4 class="headline">(.*?)</h4>', r'<h4 class="headline">(.*?)</h4>',
r'<title[^>]*>(.*?)</title>'], r'<title[^>]*>(.*?)</title>'],
webpage, 'title') webpage, 'title')
description = self._html_search_meta( description = self._og_search_description(webpage, default=None) or self._html_search_meta(
'dcterms.abstract', webpage, 'description', default=None) 'dcterms.abstract', webpage, 'description', default=None)
if description is None: if description is None:
description = self._html_search_meta( description = self._html_search_meta(
@ -249,18 +249,18 @@ class ARDMediathekIE(ARDMediathekBaseIE):
class ARDIE(InfoExtractor): class ARDIE(InfoExtractor):
_VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos(?:extern)?/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html' _VALID_URL = r'(?P<mainurl>https?://(?:www\.)?daserste\.de/[^?#]+/videos(?:extern)?/(?P<display_id>[^/?#]+)-(?:video-?)?(?P<id>[0-9]+))\.html'
_TESTS = [{ _TESTS = [{
# available till 14.02.2019 # available till 7.01.2022
'url': 'http://www.daserste.de/information/talk/maischberger/videos/das-groko-drama-zerlegen-sich-die-volksparteien-video-102.html', 'url': 'https://www.daserste.de/information/talk/maischberger/videos/maischberger-die-woche-video100.html',
'md5': '8e4ec85f31be7c7fc08a26cdbc5a1f49', 'md5': '867d8aa39eeaf6d76407c5ad1bb0d4c1',
'info_dict': { 'info_dict': {
'display_id': 'das-groko-drama-zerlegen-sich-die-volksparteien-video', 'display_id': 'maischberger-die-woche',
'id': '102', 'id': '100',
'ext': 'mp4', 'ext': 'mp4',
'duration': 4435.0, 'duration': 3687.0,
'title': 'Das GroKo-Drama: Zerlegen sich die Volksparteien?', 'title': 'maischberger. die woche vom 7. Januar 2021',
'upload_date': '20180214', 'upload_date': '20210107',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
}, },
}, { }, {
@ -315,17 +315,17 @@ class ARDIE(InfoExtractor):
class ARDBetaMediathekIE(ARDMediathekBaseIE): class ARDBetaMediathekIE(ARDMediathekBaseIE):
_VALID_URL = r'https://(?:(?:beta|www)\.)?ardmediathek\.de/(?P<client>[^/]+)/(?:player|live|video)/(?P<display_id>(?:[^/]+/)*)(?P<video_id>[a-zA-Z0-9]+)' _VALID_URL = r'https://(?:(?:beta|www)\.)?ardmediathek\.de/(?P<client>[^/]+)/(?:player|live|video)/(?P<display_id>(?:[^/]+/)*)(?P<video_id>[a-zA-Z0-9]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://ardmediathek.de/ard/video/die-robuste-roswita/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE', 'url': 'https://www.ardmediathek.de/mdr/video/die-robuste-roswita/Y3JpZDovL21kci5kZS9iZWl0cmFnL2Ntcy84MWMxN2MzZC0wMjkxLTRmMzUtODk4ZS0wYzhlOWQxODE2NGI/',
'md5': 'dfdc87d2e7e09d073d5a80770a9ce88f', 'md5': 'a1dc75a39c61601b980648f7c9f9f71d',
'info_dict': { 'info_dict': {
'display_id': 'die-robuste-roswita', 'display_id': 'die-robuste-roswita',
'id': '70153354', 'id': '78566716',
'title': 'Die robuste Roswita', 'title': 'Die robuste Roswita',
'description': r're:^Der Mord.*trüber ist als die Ilm.', 'description': r're:^Der Mord.*totgeglaubte Ehefrau Roswita',
'duration': 5316, 'duration': 5316,
'thumbnail': 'https://img.ardmediathek.de/standard/00/70/15/33/90/-1852531467/16x9/960?mandant=ard', 'thumbnail': 'https://img.ardmediathek.de/standard/00/78/56/67/84/575672121/16x9/960?mandant=ard',
'timestamp': 1577047500, 'timestamp': 1596658200,
'upload_date': '20191222', 'upload_date': '20200805',
'ext': 'mp4', 'ext': 'mp4',
}, },
}, { }, {


@ -48,6 +48,7 @@ class AWAANBaseIE(InfoExtractor):
'duration': int_or_none(video_data.get('duration')), 'duration': int_or_none(video_data.get('duration')),
'timestamp': parse_iso8601(video_data.get('create_time'), ' '), 'timestamp': parse_iso8601(video_data.get('create_time'), ' '),
'is_live': is_live, 'is_live': is_live,
'uploader_id': video_data.get('user_id'),
} }
@ -107,6 +108,7 @@ class AWAANLiveIE(AWAANBaseIE):
'title': 're:Dubai Al Oula [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$', 'title': 're:Dubai Al Oula [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'upload_date': '20150107', 'upload_date': '20150107',
'timestamp': 1420588800, 'timestamp': 1420588800,
'uploader_id': '71',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download


@ -47,7 +47,7 @@ class AZMedienIE(InfoExtractor):
'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1', 'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
'only_matching': True 'only_matching': True
}] }]
_API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/cb9f2f81ed22e9b47f4ca64ea3cc5a5d13e88d1d' _API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/a4016f65fe62b81dc6664dd9f4910e4ab40383be'
_PARTNER_ID = '1719221' _PARTNER_ID = '1719221'
def _real_extract(self, url): def _real_extract(self, url):


@ -90,13 +90,19 @@ class BleacherReportCMSIE(AMPIE):
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36}|\d{5})' _VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36}|\d{5})'
_TESTS = [{ _TESTS = [{
'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1&library=video-cms', 'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1&library=video-cms',
'md5': '2e4b0a997f9228ffa31fada5c53d1ed1', 'md5': '670b2d73f48549da032861130488c681',
'info_dict': { 'info_dict': {
'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1', 'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
'ext': 'flv', 'ext': 'mp4',
'title': 'Cena vs. Rollins Would Expose the Heavyweight Division', 'title': 'Cena vs. Rollins Would Expose the Heavyweight Division',
'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e', 'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e',
'upload_date': '20150723',
'timestamp': 1437679032,
}, },
'expected_warnings': [
'Unable to download f4m manifest'
]
}] }]
def _real_extract(self, url): def _real_extract(self, url):


@ -12,7 +12,7 @@ from ..utils import (
class BravoTVIE(AdobePassIE): class BravoTVIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?(?P<req_id>bravotv|oxygen)\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is', 'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is',
'md5': 'e34684cfea2a96cd2ee1ef3a60909de9', 'md5': 'e34684cfea2a96cd2ee1ef3a60909de9',
@ -28,10 +28,13 @@ class BravoTVIE(AdobePassIE):
}, { }, {
'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1', 'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.oxygen.com/in-ice-cold-blood/season-2/episode-16/videos/handling-the-horwitz-house-after-the-murder-season-2',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) site, display_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
settings = self._parse_json(self._search_regex( settings = self._parse_json(self._search_regex(
r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'), r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'),
@ -53,11 +56,14 @@ class BravoTVIE(AdobePassIE):
tp_path = release_pid = tve['release_pid'] tp_path = release_pid = tve['release_pid']
if tve.get('entitlement') == 'auth': if tve.get('entitlement') == 'auth':
adobe_pass = settings.get('tve_adobe_auth', {}) adobe_pass = settings.get('tve_adobe_auth', {})
if site == 'bravotv':
site = 'bravo'
resource = self._get_mvpd_resource( resource = self._get_mvpd_resource(
adobe_pass.get('adobePassResourceId', 'bravo'), adobe_pass.get('adobePassResourceId') or site,
tve['title'], release_pid, tve.get('rating')) tve['title'], release_pid, tve.get('rating'))
query['auth'] = self._extract_mvpd_auth( query['auth'] = self._extract_mvpd_auth(
url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource) url, release_pid,
adobe_pass.get('adobePassRequestorId') or site, resource)
else: else:
shared_playlist = settings['ls_playlist'] shared_playlist = settings['ls_playlist']
account_pid = shared_playlist['account_pid'] account_pid = shared_playlist['account_pid']


@ -1,6 +1,7 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import datetime
import re import re
from .common import InfoExtractor from .common import InfoExtractor
@ -8,8 +9,8 @@ from ..utils import (
clean_html, clean_html,
int_or_none, int_or_none,
parse_duration, parse_duration,
parse_iso8601,
parse_resolution, parse_resolution,
try_get,
url_or_none, url_or_none,
) )
@ -24,8 +25,9 @@ class CCMAIE(InfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'title': 'L\'espot de La Marató de TV3', 'title': 'L\'espot de La Marató de TV3',
'description': 'md5:f12987f320e2f6e988e9908e4fe97765', 'description': 'md5:f12987f320e2f6e988e9908e4fe97765',
'timestamp': 1470918540, 'timestamp': 1478608140,
'upload_date': '20160811', 'upload_date': '20161108',
'age_limit': 0,
} }
}, { }, {
'url': 'http://www.ccma.cat/catradio/alacarta/programa/el-consell-de-savis-analitza-el-derbi/audio/943685/', 'url': 'http://www.ccma.cat/catradio/alacarta/programa/el-consell-de-savis-analitza-el-derbi/audio/943685/',
@ -35,8 +37,24 @@ class CCMAIE(InfoExtractor):
'ext': 'mp3', 'ext': 'mp3',
'title': 'El Consell de Savis analitza el derbi', 'title': 'El Consell de Savis analitza el derbi',
'description': 'md5:e2a3648145f3241cb9c6b4b624033e53', 'description': 'md5:e2a3648145f3241cb9c6b4b624033e53',
'upload_date': '20171205', 'upload_date': '20170512',
'timestamp': 1512507300, 'timestamp': 1494622500,
'vcodec': 'none',
'categories': ['Esports'],
}
}, {
'url': 'http://www.ccma.cat/tv3/alacarta/crims/crims-josep-tallada-lespereu-me-capitol-1/video/6031387/',
'md5': 'b43c3d3486f430f3032b5b160d80cbc3',
'info_dict': {
'id': '6031387',
'ext': 'mp4',
'title': 'Crims - Josep Talleda, l\'"Espereu-me" (capítol 1)',
'description': 'md5:7cbdafb640da9d0d2c0f62bad1e74e60',
'timestamp': 1582577700,
'upload_date': '20200224',
'subtitles': 'mincount:4',
'age_limit': 16,
'series': 'Crims',
} }
}] }]
@ -72,17 +90,27 @@ class CCMAIE(InfoExtractor):
informacio = media['informacio'] informacio = media['informacio']
title = informacio['titol'] title = informacio['titol']
durada = informacio.get('durada', {}) durada = informacio.get('durada') or {}
duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text')) duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
timestamp = parse_iso8601(informacio.get('data_emissio', {}).get('utc')) tematica = try_get(informacio, lambda x: x['tematica']['text'])
timestamp = None
data_utc = try_get(informacio, lambda x: x['data_emissio']['utc'])
try:
timestamp = datetime.datetime.strptime(
data_utc, '%Y-%d-%mT%H:%M:%S%z').timestamp()
except TypeError:
pass
subtitles = {} subtitles = {}
subtitols = media.get('subtitols', {}) subtitols = media.get('subtitols') or []
if subtitols: if isinstance(subtitols, dict):
sub_url = subtitols.get('url') subtitols = [subtitols]
for st in subtitols:
sub_url = st.get('url')
if sub_url: if sub_url:
subtitles.setdefault( subtitles.setdefault(
subtitols.get('iso') or subtitols.get('text') or 'ca', []).append({ st.get('iso') or st.get('text') or 'ca', []).append({
'url': sub_url, 'url': sub_url,
}) })
@ -97,6 +125,16 @@ class CCMAIE(InfoExtractor):
'height': int_or_none(imatges.get('alcada')), 'height': int_or_none(imatges.get('alcada')),
}] }]
age_limit = None
codi_etic = try_get(informacio, lambda x: x['codi_etic']['id'])
if codi_etic:
codi_etic_s = codi_etic.split('_')
if len(codi_etic_s) == 2:
if codi_etic_s[1] == 'TP':
age_limit = 0
else:
age_limit = int_or_none(codi_etic_s[1])
return { return {
'id': media_id, 'id': media_id,
'title': title, 'title': title,
@ -106,4 +144,9 @@ class CCMAIE(InfoExtractor):
'thumbnails': thumbnails, 'thumbnails': thumbnails,
'subtitles': subtitles, 'subtitles': subtitles,
'formats': formats, 'formats': formats,
'age_limit': age_limit,
'alt_title': informacio.get('titol_complet'),
'episode_number': int_or_none(informacio.get('capitol')),
'categories': [tematica] if tematica else None,
'series': informacio.get('programa'),
} }
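
Reviewer sketch (not part of the commit): two of the CCMA changes are small normalisation steps, turning the `codi_etic` ID into an `age_limit` and accepting `subtitols` as either a single dict or a list. The standalone functions below mirror that logic. The names `parse_age_limit` and `collect_subtitles` are hypothetical, and the `PREFIX_VALUE` shape of the ID is an assumption inferred from the `split('_')` call in the diff.

```python
def parse_age_limit(codi_etic):
    """Map an ID of the assumed form '<prefix>_<rating>' to an age limit.

    'TP' is treated as 0, as in the diff; anything unparseable gives None.
    """
    if not codi_etic:
        return None
    parts = codi_etic.split('_')
    if len(parts) != 2:
        return None
    if parts[1] == 'TP':
        return 0
    try:
        return int(parts[1])
    except ValueError:
        return None


def collect_subtitles(subtitols):
    """Group subtitle URLs by language from a dict, a list of dicts, or None."""
    subtitols = subtitols or []
    if isinstance(subtitols, dict):
        subtitols = [subtitols]
    subtitles = {}
    for st in subtitols:
        sub_url = st.get('url')
        if sub_url:
            lang = st.get('iso') or st.get('text') or 'ca'
            subtitles.setdefault(lang, []).append({'url': sub_url})
    return subtitles
```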


@ -96,7 +96,7 @@ class CDAIE(InfoExtractor):
raise ExtractorError('This video is only available for premium users.', expected=True) raise ExtractorError('This video is only available for premium users.', expected=True)
need_confirm_age = False need_confirm_age = False
if self._html_search_regex(r'(<form[^>]+action="/a/validatebirth")', if self._html_search_regex(r'(<form[^>]+action="[^"]*/a/validatebirth[^"]*")',
webpage, 'birthday validate form', default=None): webpage, 'birthday validate form', default=None):
webpage = self._download_age_confirm_page( webpage = self._download_age_confirm_page(
url, video_id, note='Confirming age') url, video_id, note='Confirming age')


@ -1,142 +1,51 @@
from __future__ import unicode_literals from __future__ import unicode_literals
from .mtv import MTVServicesInfoExtractor from .mtv import MTVServicesInfoExtractor
from .common import InfoExtractor
class ComedyCentralIE(MTVServicesInfoExtractor): class ComedyCentralIE(MTVServicesInfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/ _VALID_URL = r'https?://(?:www\.)?cc\.com/(?:episodes|video(?:-clips)?)/(?P<id>[0-9a-z]{6})'
(video-clips|episodes|cc-studios|video-collections|shows(?=/[^/]+/(?!full-episodes)))
/(?P<title>.*)'''
_FEED_URL = 'http://comedycentral.com/feeds/mrss/' _FEED_URL = 'http://comedycentral.com/feeds/mrss/'
_TESTS = [{ _TESTS = [{
'url': 'http://www.cc.com/video-clips/kllhuv/stand-up-greg-fitzsimmons--uncensored---too-good-of-a-mother', 'url': 'http://www.cc.com/video-clips/5ke9v2/the-daily-show-with-trevor-noah-doc-rivers-and-steve-ballmer---the-nba-player-strike',
'md5': 'c4f48e9eda1b16dd10add0744344b6d8', 'md5': 'b8acb347177c680ff18a292aa2166f80',
'info_dict': { 'info_dict': {
'id': 'cef0cbb3-e776-4bc9-b62e-8016deccb354', 'id': '89ccc86e-1b02-4f83-b0c9-1d9592ecd025',
'ext': 'mp4', 'ext': 'mp4',
'title': 'CC:Stand-Up|August 18, 2013|1|0101|Uncensored - Too Good of a Mother', 'title': 'The Daily Show with Trevor Noah|August 28, 2020|25|25149|Doc Rivers and Steve Ballmer - The NBA Player Strike',
'description': 'After a certain point, breastfeeding becomes c**kblocking.', 'description': 'md5:5334307c433892b85f4f5e5ac9ef7498',
'timestamp': 1376798400, 'timestamp': 1598670000,
'upload_date': '20130818', 'upload_date': '20200829',
}, },
}, { }, {
'url': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/interviews/6yx39d/exclusive-rand-paul-extended-interview', 'url': 'http://www.cc.com/episodes/pnzzci/drawn-together--american-idol--parody-clip-show-season-3-ep-314',
'only_matching': True, 'only_matching': True,
}]
class ComedyCentralFullEpisodesIE(MTVServicesInfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
(?:full-episodes|shows(?=/[^/]+/full-episodes))
/(?P<id>[^?]+)'''
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
_TESTS = [{
'url': 'http://www.cc.com/full-episodes/pv391a/the-daily-show-with-trevor-noah-november-28--2016---ryan-speedo-green-season-22-ep-22028',
'info_dict': {
'description': 'Donald Trump is accused of exploiting his president-elect status for personal gain, Cuban leader Fidel Castro dies, and Ryan Speedo Green discusses "Sing for Your Life."',
'title': 'November 28, 2016 - Ryan Speedo Green',
},
'playlist_count': 4,
}, { }, {
'url': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes', 'url': 'https://www.cc.com/video/k3sdvm/the-daily-show-with-jon-stewart-exclusive-the-fourth-estate',
'only_matching': True,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
mgid = self._extract_triforce_mgid(webpage, data_zone='t2_lc_promo1')
videos_info = self._get_videos_info(mgid)
return videos_info
class ToshIE(MTVServicesInfoExtractor):
IE_DESC = 'Tosh.0'
_VALID_URL = r'^https?://tosh\.cc\.com/video-(?:clips|collections)/[^/]+/(?P<videotitle>[^/?#]+)'
_FEED_URL = 'http://tosh.cc.com/feeds/mrss'
_TESTS = [{
'url': 'http://tosh.cc.com/video-clips/68g93d/twitter-users-share-summer-plans',
'info_dict': {
'description': 'Tosh asked fans to share their summer plans.',
'title': 'Twitter Users Share Summer Plans',
},
'playlist': [{
'md5': 'f269e88114c1805bb6d7653fecea9e06',
'info_dict': {
'id': '90498ec2-ed00-11e0-aca6-0026b9414f30',
'ext': 'mp4',
'title': 'Tosh.0|June 9, 2077|2|211|Twitter Users Share Summer Plans',
'description': 'Tosh asked fans to share their summer plans.',
'thumbnail': r're:^https?://.*\.jpg',
# It's really reported to be published on year 2077
'upload_date': '20770610',
'timestamp': 3390510600,
'subtitles': {
'en': 'mincount:3',
},
},
}]
}, {
'url': 'http://tosh.cc.com/video-collections/x2iz7k/just-plain-foul/m5q4fp',
'only_matching': True, 'only_matching': True,
}] }]
class ComedyCentralTVIE(MTVServicesInfoExtractor): class ComedyCentralTVIE(MTVServicesInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/(?:staffeln|shows)/(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/folgen/(?P<id>[0-9a-z]{6})'
_TESTS = [{ _TESTS = [{
'url': 'http://www.comedycentral.tv/staffeln/7436-the-mindy-project-staffel-4', 'url': 'https://www.comedycentral.tv/folgen/pxdpec/josh-investigates-klimawandel-staffel-1-ep-1',
'info_dict': { 'info_dict': {
'id': 'local_playlist-f99b626bdfe13568579a', 'id': '15907dc3-ec3c-11e8-a442-0e40cf2fc285',
'ext': 'flv', 'ext': 'mp4',
'title': 'Episode_the-mindy-project_shows_season-4_episode-3_full-episode_part1', 'title': 'Josh Investigates',
'description': 'Steht uns das Ende der Welt bevor?',
}, },
'params': {
# rtmp download
'skip_download': True,
},
}, {
'url': 'http://www.comedycentral.tv/shows/1074-workaholics',
'only_matching': True,
}, {
'url': 'http://www.comedycentral.tv/shows/1727-the-mindy-project/bonus',
'only_matching': True,
}] }]
_FEED_URL = 'http://feeds.mtvnservices.com/od/feed/intl-mrss-player-feed'
_GEO_COUNTRIES = ['DE']
def _real_extract(self, url): def _get_feed_query(self, uri):
video_id = self._match_id(url) return {
'accountOverride': 'intl.mtvi.com',
webpage = self._download_webpage(url, video_id) 'arcEp': 'web.cc.tv',
'ep': 'b9032c3a',
mrss_url = self._search_regex( 'imageEp': 'web.cc.tv',
r'data-mrss=(["\'])(?P<url>(?:(?!\1).)+)\1', 'mgid': uri,
webpage, 'mrss url', group='url')
return self._get_videos_info_from_url(mrss_url, video_id)
class ComedyCentralShortnameIE(InfoExtractor):
_VALID_URL = r'^:(?P<id>tds|thedailyshow|theopposition)$'
_TESTS = [{
'url': ':tds',
'only_matching': True,
}, {
'url': ':thedailyshow',
'only_matching': True,
}, {
'url': ':theopposition',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
shortcut_map = {
'tds': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
'thedailyshow': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
'theopposition': 'http://www.cc.com/shows/the-opposition-with-jordan-klepper/full-episodes',
} }
return self.url_result(shortcut_map[video_id])


@ -2064,7 +2064,7 @@ class InfoExtractor(object):
}) })
return entries return entries
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}, data=None, headers={}, query={}): def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, data=None, headers={}, query={}):
res = self._download_xml_handle( res = self._download_xml_handle(
mpd_url, video_id, mpd_url, video_id,
note=note or 'Downloading MPD manifest', note=note or 'Downloading MPD manifest',
@ -2078,10 +2078,9 @@ class InfoExtractor(object):
mpd_base_url = base_url(urlh.geturl()) mpd_base_url = base_url(urlh.geturl())
return self._parse_mpd_formats( return self._parse_mpd_formats(
mpd_doc, mpd_id=mpd_id, mpd_base_url=mpd_base_url, mpd_doc, mpd_id, mpd_base_url, mpd_url)
formats_dict=formats_dict, mpd_url=mpd_url)
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', formats_dict={}, mpd_url=None): def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', mpd_url=None):
""" """
Parse formats from MPD manifest. Parse formats from MPD manifest.
References: References:
@ -2359,15 +2358,7 @@ class InfoExtractor(object):
else: else:
# Assuming direct URL to unfragmented media. # Assuming direct URL to unfragmented media.
f['url'] = base_url f['url'] = base_url
formats.append(f)
# According to [1, 5.3.5.2, Table 7, page 35] @id of Representation
# is not necessarily unique within a Period thus formats with
# the same `format_id` are quite possible. There are numerous examples
# of such manifests (see https://github.com/ytdl-org/youtube-dl/issues/15111,
# https://github.com/ytdl-org/youtube-dl/issues/13919)
full_info = formats_dict.get(representation_id, {}).copy()
full_info.update(f)
formats.append(full_info)
else: else:
self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type) self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
return formats return formats


@ -12,7 +12,14 @@ from ..utils import (
) )
class EggheadCourseIE(InfoExtractor): class EggheadBaseIE(InfoExtractor):
def _call_api(self, path, video_id, resource, fatal=True):
return self._download_json(
'https://app.egghead.io/api/v1/' + path,
video_id, 'Downloading %s JSON' % resource, fatal=fatal)
class EggheadCourseIE(EggheadBaseIE):
IE_DESC = 'egghead.io course' IE_DESC = 'egghead.io course'
IE_NAME = 'egghead:course' IE_NAME = 'egghead:course'
_VALID_URL = r'https://egghead\.io/courses/(?P<id>[^/?#&]+)' _VALID_URL = r'https://egghead\.io/courses/(?P<id>[^/?#&]+)'
@ -28,10 +35,9 @@ class EggheadCourseIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
playlist_id = self._match_id(url) playlist_id = self._match_id(url)
series_path = 'series/' + playlist_id
lessons = self._download_json( lessons = self._call_api(
'https://egghead.io/api/v1/series/%s/lessons' % playlist_id, series_path + '/lessons', playlist_id, 'course lessons')
playlist_id, 'Downloading course lessons JSON')
entries = [] entries = []
for lesson in lessons: for lesson in lessons:
@ -44,9 +50,8 @@ class EggheadCourseIE(InfoExtractor):
entries.append(self.url_result( entries.append(self.url_result(
lesson_url, ie=EggheadLessonIE.ie_key(), video_id=lesson_id)) lesson_url, ie=EggheadLessonIE.ie_key(), video_id=lesson_id))
course = self._download_json( course = self._call_api(
'https://egghead.io/api/v1/series/%s' % playlist_id, series_path, playlist_id, 'course', False) or {}
playlist_id, 'Downloading course JSON', fatal=False) or {}
playlist_id = course.get('id') playlist_id = course.get('id')
if playlist_id: if playlist_id:
@ -57,7 +62,7 @@ class EggheadCourseIE(InfoExtractor):
course.get('description')) course.get('description'))
class EggheadLessonIE(InfoExtractor): class EggheadLessonIE(EggheadBaseIE):
IE_DESC = 'egghead.io lesson' IE_DESC = 'egghead.io lesson'
IE_NAME = 'egghead:lesson' IE_NAME = 'egghead:lesson'
_VALID_URL = r'https://egghead\.io/(?:api/v1/)?lessons/(?P<id>[^/?#&]+)' _VALID_URL = r'https://egghead\.io/(?:api/v1/)?lessons/(?P<id>[^/?#&]+)'
@ -74,7 +79,7 @@ class EggheadLessonIE(InfoExtractor):
'upload_date': '20161209', 'upload_date': '20161209',
'duration': 304, 'duration': 304,
'view_count': 0, 'view_count': 0,
'tags': ['javascript', 'free'], 'tags': 'count:2',
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
@ -88,8 +93,8 @@ class EggheadLessonIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) display_id = self._match_id(url)
lesson = self._download_json( lesson = self._call_api(
'https://egghead.io/api/v1/lessons/%s' % display_id, display_id) 'lessons/' + display_id, display_id, 'lesson')
lesson_id = compat_str(lesson['id']) lesson_id = compat_str(lesson['id'])
title = lesson['title'] title = lesson['title']
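
The new `_call_api` helper centralizes the API host and the progress note. A rough sketch of the URL and note it ends up building, with the download itself left out; `build_api_call` and the course slug are hypothetical, while the base URL and the note format are copied from the diff:

```python
API_BASE = 'https://app.egghead.io/api/v1/'


def build_api_call(path, resource):
    """Return the (url, note) pair that _call_api would hand to _download_json."""
    return API_BASE + path, 'Downloading %s JSON' % resource


# 'some-course' is a made-up slug used only for illustration.
series_path = 'series/' + 'some-course'
url, note = build_api_call(series_path + '/lessons', 'course lessons')
```

Keeping the host in one place means a future host change, like the `egghead.io` to `app.egghead.io` move in this diff, touches a single line.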


@ -42,7 +42,10 @@ from .aljazeera import AlJazeeraIE
from .alphaporno import AlphaPornoIE from .alphaporno import AlphaPornoIE
from .amara import AmaraIE from .amara import AmaraIE
from .amcnetworks import AMCNetworksIE from .amcnetworks import AMCNetworksIE
from .americastestkitchen import AmericasTestKitchenIE from .americastestkitchen import (
AmericasTestKitchenIE,
AmericasTestKitchenSeasonIE,
)
from .animeondemand import AnimeOnDemandIE from .animeondemand import AnimeOnDemandIE
from .anvato import AnvatoIE from .anvato import AnvatoIE
from .aol import AolIE from .aol import AolIE
@ -232,11 +235,8 @@ from .cnn import (
) )
from .coub import CoubIE from .coub import CoubIE
from .comedycentral import ( from .comedycentral import (
ComedyCentralFullEpisodesIE,
ComedyCentralIE, ComedyCentralIE,
ComedyCentralShortnameIE,
ComedyCentralTVIE, ComedyCentralTVIE,
ToshIE,
) )
from .commonmistakes import CommonMistakesIE, UnicodeBOMIE from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
from .commonprotocols import ( from .commonprotocols import (
@ -651,6 +651,11 @@ from .microsoftvirtualacademy import (
MicrosoftVirtualAcademyIE, MicrosoftVirtualAcademyIE,
MicrosoftVirtualAcademyCourseIE, MicrosoftVirtualAcademyCourseIE,
) )
from .minds import (
MindsIE,
MindsChannelIE,
MindsGroupIE,
)
from .ministrygrid import MinistryGridIE from .ministrygrid import MinistryGridIE
from .minoto import MinotoIE from .minoto import MinotoIE
from .miomio import MioMioIE from .miomio import MioMioIE
@ -1116,6 +1121,10 @@ from .stitcher import (
from .sport5 import Sport5IE from .sport5 import Sport5IE
from .sportbox import SportBoxIE from .sportbox import SportBoxIE
from .sportdeutschland import SportDeutschlandIE from .sportdeutschland import SportDeutschlandIE
from .spotify import (
SpotifyIE,
SpotifyShowIE,
)
from .spreaker import ( from .spreaker import (
SpreakerIE, SpreakerIE,
SpreakerPageIE, SpreakerPageIE,
@ -1229,6 +1238,10 @@ from .toutv import TouTvIE
from .toypics import ToypicsUserIE, ToypicsIE from .toypics import ToypicsUserIE, ToypicsIE
from .traileraddict import TrailerAddictIE from .traileraddict import TrailerAddictIE
from .trilulilu import TriluliluIE from .trilulilu import TriluliluIE
from .trovo import (
TrovoIE,
TrovoVodIE,
)
from .trunews import TruNewsIE from .trunews import TruNewsIE
from .trutv import TruTVIE from .trutv import TruTVIE
from .tube8 import Tube8IE from .tube8 import Tube8IE
@ -1247,6 +1260,7 @@ from .tv2 import (
TV2IE, TV2IE,
TV2ArticleIE, TV2ArticleIE,
KatsomoIE, KatsomoIE,
MTVUutisetArticleIE,
) )
from .tv2dk import ( from .tv2dk import (
TV2DKIE, TV2DKIE,
@ -1385,7 +1399,6 @@ from .vidme import (
VidmeUserIE, VidmeUserIE,
VidmeUserLikesIE, VidmeUserLikesIE,
) )
from .vidzi import VidziIE
from .vier import VierIE, VierVideosIE, VierVijfKijkOnlineIE from .vier import VierIE, VierVideosIE, VierVijfKijkOnlineIE
from .viewlift import ( from .viewlift import (
ViewLiftIE, ViewLiftIE,
@ -1445,6 +1458,7 @@ from .vrv import (
VRVSeriesIE, VRVSeriesIE,
) )
from .vshare import VShareIE from .vshare import VShareIE
from .vtm import VTMIE
from .medialaan import MedialaanIE from .medialaan import MedialaanIE
from .vube import VubeIE from .vube import VubeIE
from .vuclip import VuClipIE from .vuclip import VuClipIE


@ -11,7 +11,7 @@ from ..utils import (
class FranceCultureIE(InfoExtractor): class FranceCultureIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?franceculture\.fr/emissions/(?:[^/]+/)*(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?:www\.)?franceculture\.fr/emissions/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TEST = { _TESTS = [{
'url': 'http://www.franceculture.fr/emissions/carnet-nomade/rendez-vous-au-pays-des-geeks', 'url': 'http://www.franceculture.fr/emissions/carnet-nomade/rendez-vous-au-pays-des-geeks',
'info_dict': { 'info_dict': {
'id': 'rendez-vous-au-pays-des-geeks', 'id': 'rendez-vous-au-pays-des-geeks',
@ -20,10 +20,14 @@ class FranceCultureIE(InfoExtractor):
'title': 'Rendez-vous au pays des geeks', 'title': 'Rendez-vous au pays des geeks',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140301', 'upload_date': '20140301',
'timestamp': 1393642916, 'timestamp': 1393700400,
'vcodec': 'none', 'vcodec': 'none',
} }
} }, {
# no thumbnail
'url': 'https://www.franceculture.fr/emissions/la-recherche-montre-en-main/la-recherche-montre-en-main-du-mercredi-10-octobre-2018',
'only_matching': True,
}]
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) display_id = self._match_id(url)
@ -36,19 +40,19 @@ class FranceCultureIE(InfoExtractor):
</h1>| </h1>|
<div[^>]+class="[^"]*?(?:title-zone-diffusion|heading-zone-(?:wrapper|player-button))[^"]*?"[^>]*> <div[^>]+class="[^"]*?(?:title-zone-diffusion|heading-zone-(?:wrapper|player-button))[^"]*?"[^>]*>
).*? ).*?
(<button[^>]+data-asset-source="[^"]+"[^>]+>) (<button[^>]+data-(?:url|asset-source)="[^"]+"[^>]+>)
''', ''',
webpage, 'video data')) webpage, 'video data'))
video_url = video_data['data-asset-source'] video_url = video_data.get('data-url') or video_data['data-asset-source']
title = video_data.get('data-asset-title') or self._og_search_title(webpage) title = video_data.get('data-asset-title') or video_data.get('data-diffusion-title') or self._og_search_title(webpage)
description = self._html_search_regex( description = self._html_search_regex(
r'(?s)<div[^>]+class="intro"[^>]*>.*?<h2>(.+?)</h2>', r'(?s)<div[^>]+class="intro"[^>]*>.*?<h2>(.+?)</h2>',
webpage, 'description', default=None) webpage, 'description', default=None)
thumbnail = self._search_regex( thumbnail = self._search_regex(
r'(?s)<figure[^>]+itemtype="https://schema.org/ImageObject"[^>]*>.*?<img[^>]+(?:data-dejavu-)?src="([^"]+)"', r'(?s)<figure[^>]+itemtype="https://schema.org/ImageObject"[^>]*>.*?<img[^>]+(?:data-dejavu-)?src="([^"]+)"',
webpage, 'thumbnail', fatal=False) webpage, 'thumbnail', default=None)
uploader = self._html_search_regex( uploader = self._html_search_regex(
r'(?s)<span class="author">(.*?)</span>', r'(?s)<span class="author">(.*?)</span>',
webpage, 'uploader', default=None) webpage, 'uploader', default=None)
@ -64,6 +68,6 @@ class FranceCultureIE(InfoExtractor):
'ext': ext, 'ext': ext,
'vcodec': 'none' if ext == 'mp3' else None, 'vcodec': 'none' if ext == 'mp3' else None,
'uploader': uploader, 'uploader': uploader,
'timestamp': int_or_none(video_data.get('data-asset-created-date')), 'timestamp': int_or_none(video_data.get('data-start-time')) or int_or_none(video_data.get('data-asset-created-date')),
'duration': int_or_none(video_data.get('data-duration')), 'duration': int_or_none(video_data.get('data-duration')),
} }
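
The new `timestamp` expression prefers `data-start-time` and falls back to `data-asset-created-date`. A minimal sketch of that fallback chain; the `int_or_none` here is a simplified stand-in written for this example, not the real helper from `..utils`, and `pick_timestamp` is a hypothetical name:

```python
def int_or_none(v):
    """Simplified stand-in: int(v) when possible, otherwise None."""
    if v is None or v == '':
        return None
    try:
        return int(v)
    except (ValueError, TypeError):
        return None


def pick_timestamp(video_data):
    # `or` falls through on None (and also on 0), so the second
    # attribute is only consulted when the first yields nothing usable.
    return (int_or_none(video_data.get('data-start-time'))
            or int_or_none(video_data.get('data-asset-created-date')))
```

Because the chain uses `or`, a start time of `0` would also fall through to the creation date; that is probably harmless for real broadcast timestamps.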


@ -128,6 +128,7 @@ from .zype import ZypeIE
from .odnoklassniki import OdnoklassnikiIE from .odnoklassniki import OdnoklassnikiIE
from .kinja import KinjaEmbedIE from .kinja import KinjaEmbedIE
from .arcpublishing import ArcPublishingIE from .arcpublishing import ArcPublishingIE
from .medialaan import MedialaanIE
class GenericIE(InfoExtractor): class GenericIE(InfoExtractor):
@ -2223,6 +2224,20 @@ class GenericIE(InfoExtractor):
'duration': 1581, 'duration': 1581,
}, },
}, },
{
# MyChannels SDK embed
# https://www.24kitchen.nl/populair/deskundige-dit-waarom-sommigen-gevoelig-zijn-voor-voedselallergieen
'url': 'https://www.demorgen.be/nieuws/burgemeester-rotterdam-richt-zich-in-videoboodschap-tot-relschoppers-voelt-het-goed~b0bcfd741/',
'md5': '90c0699c37006ef18e198c032d81739c',
'info_dict': {
'id': '194165',
'ext': 'mp4',
'title': 'Burgemeester Aboutaleb spreekt relschoppers toe',
'timestamp': 1611740340,
'upload_date': '20210127',
'duration': 159,
},
},
] ]
def report_following_redirect(self, new_url): def report_following_redirect(self, new_url):
@ -2462,6 +2477,9 @@ class GenericIE(InfoExtractor):
webpage = self._webpage_read_content( webpage = self._webpage_read_content(
full_response, url, video_id, prefix=first_bytes) full_response, url, video_id, prefix=first_bytes)
if '<title>DPG Media Privacy Gate</title>' in webpage:
webpage = self._download_webpage(url, video_id)
self.report_extraction(video_id) self.report_extraction(video_id)
# Is it an RSS feed, a SMIL file, an XSPF playlist or an MPD manifest? # Is it an RSS feed, a SMIL file, an XSPF playlist or an MPD manifest?
@ -2593,6 +2611,11 @@ class GenericIE(InfoExtractor):
if arc_urls: if arc_urls:
return self.playlist_from_matches(arc_urls, video_id, video_title, ie=ArcPublishingIE.ie_key()) return self.playlist_from_matches(arc_urls, video_id, video_title, ie=ArcPublishingIE.ie_key())
mychannels_urls = MedialaanIE._extract_urls(webpage)
if mychannels_urls:
return self.playlist_from_matches(
mychannels_urls, video_id, video_title, ie=MedialaanIE.ie_key())
# Look for embedded rtl.nl player # Look for embedded rtl.nl player
matches = re.findall( matches = re.findall(
r'<iframe[^>]+?src="((?:https?:)?//(?:(?:www|static)\.)?rtl\.nl/(?:system/videoplayer/[^"]+(?:video_)?)?embed[^"]+)"', r'<iframe[^>]+?src="((?:https?:)?//(?:(?:www|static)\.)?rtl\.nl/(?:system/videoplayer/[^"]+(?:video_)?)?embed[^"]+)"',


@ -7,6 +7,7 @@ from ..compat import compat_parse_qs
from ..utils import ( from ..utils import (
determine_ext, determine_ext,
ExtractorError, ExtractorError,
get_element_by_class,
int_or_none, int_or_none,
lowercase_escape, lowercase_escape,
try_get, try_get,
@ -237,7 +238,7 @@ class GoogleDriveIE(InfoExtractor):
if confirmation_webpage: if confirmation_webpage:
confirm = self._search_regex( confirm = self._search_regex(
r'confirm=([^&"\']+)', confirmation_webpage, r'confirm=([^&"\']+)', confirmation_webpage,
'confirmation code', fatal=False) 'confirmation code', default=None)
if confirm: if confirm:
confirmed_source_url = update_url_query(source_url, { confirmed_source_url = update_url_query(source_url, {
'confirm': confirm, 'confirm': confirm,
@ -245,6 +246,11 @@ class GoogleDriveIE(InfoExtractor):
urlh = request_source_file(confirmed_source_url, 'confirmed source') urlh = request_source_file(confirmed_source_url, 'confirmed source')
if urlh and urlh.headers.get('Content-Disposition'): if urlh and urlh.headers.get('Content-Disposition'):
add_source_format(urlh) add_source_format(urlh)
else:
self.report_warning(
get_element_by_class('uc-error-subcaption', confirmation_webpage)
or get_element_by_class('uc-error-caption', confirmation_webpage)
or 'unable to extract confirmation code')
if not formats and reason: if not formats and reason:
raise ExtractorError(reason, expected=True) raise ExtractorError(reason, expected=True)


@ -5,7 +5,10 @@ import functools
import json import json
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import (
compat_str,
compat_urllib_parse_unquote,
)
from ..utils import ( from ..utils import (
determine_ext, determine_ext,
ExtractorError, ExtractorError,
@ -131,6 +134,9 @@ class LBRYIE(LBRYBaseIE):
}, { }, {
'url': 'https://lbry.tv/$/download/Episode-1/e7d93d772bd87e2b62d5ab993c1c3ced86ebb396', 'url': 'https://lbry.tv/$/download/Episode-1/e7d93d772bd87e2b62d5ab993c1c3ced86ebb396',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://lbry.tv/@lacajadepandora:a/TRUMP-EST%C3%81-BIEN-PUESTO-con-Pilar-Baselga,-Carlos-Senra,-Luis-Palacios-(720p_30fps_H264-192kbit_AAC):1',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -139,6 +145,7 @@ class LBRYIE(LBRYBaseIE):
display_id = display_id.split('/', 2)[-1].replace('/', ':') display_id = display_id.split('/', 2)[-1].replace('/', ':')
else: else:
display_id = display_id.replace(':', '#') display_id = display_id.replace(':', '#')
display_id = compat_urllib_parse_unquote(display_id)
uri = 'lbry://' + display_id uri = 'lbry://' + display_id
result = self._resolve_url(uri, display_id, 'stream') result = self._resolve_url(uri, display_id, 'stream')
result_value = result['value'] result_value = result['value']
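
The added `compat_urllib_parse_unquote` call decodes percent-escapes in the display ID before the `lbry://` URI is built. A small sketch of the `else` branch plus the new step, assuming Python 3 and using `urllib.parse.unquote` in place of the compat wrapper; `to_lbry_uri` is a hypothetical name and the input is a shortened form of the test URL added above:

```python
from urllib.parse import unquote


def to_lbry_uri(display_id):
    """Turn a URL path fragment into an lbry:// URI (channel-style IDs only)."""
    # ':' separates the name from the claim ID in the URL; LBRY URIs use '#'.
    display_id = display_id.replace(':', '#')
    # New in this commit: decode percent-escapes such as %C3%81.
    display_id = unquote(display_id)
    return 'lbry://' + display_id


uri = to_lbry_uri('@lacajadepandora:a/TRUMP-EST%C3%81-BIEN-PUESTO:1')
```

Without the unquote step the resolver would be asked for the literal `%C3%81` sequence, which is presumably why the accented title in the new test URL failed before.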


@ -2,268 +2,113 @@ from __future__ import unicode_literals
import re import re
from .gigya import GigyaBaseIE from .common import InfoExtractor
from ..compat import compat_str
from ..utils import ( from ..utils import (
extract_attributes,
int_or_none, int_or_none,
parse_duration, mimetype2ext,
try_get, parse_iso8601,
unified_timestamp,
) )
class MedialaanIE(GigyaBaseIE): class MedialaanIE(InfoExtractor):
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?:// https?://
(?:www\.|nieuws\.)?
(?: (?:
(?P<site_id>vtm|q2|vtmkzoom)\.be/ (?:embed\.)?mychannels\.video/embed/|
(?: embed\.mychannels\.video/(?:s(?:dk|cript)/)?production/|
video(?:/[^/]+/id/|/?\?.*?\baid=)| (?:www\.)?(?:
(?:[^/]+/)* (?:
) 7sur7|
demorgen|
hln|
joe|
qmusic
)\.be|
(?:
[abe]d|
bndestem|
destentor|
gelderlander|
pzc|
tubantia|
volkskrant
)\.nl
)/video/(?:[^/]+/)*[^/?&#]+~p
) )
(?P<id>[^/?#&]+) (?P<id>\d+)
''' '''
_NETRC_MACHINE = 'medialaan'
_APIKEY = '3_HZ0FtkMW_gOyKlqQzW5_0FHRC7Nd5XpXJZcDdXY4pk5eES2ZWmejRW5egwVm4ug-'
_SITE_TO_APP_ID = {
'vtm': 'vtm_watch',
'q2': 'q2',
'vtmkzoom': 'vtmkzoom',
}
_TESTS = [{ _TESTS = [{
# vod 'url': 'https://www.bndestem.nl/video/de-terugkeer-van-ally-de-aap-en-wie-vertrekt-er-nog-bij-nac~p193993',
'url': 'http://vtm.be/video/volledige-afleveringen/id/vtm_20170219_VM0678361_vtmwatch',
'info_dict': { 'info_dict': {
'id': 'vtm_20170219_VM0678361_vtmwatch', 'id': '193993',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Allemaal Chris afl. 6', 'title': 'De terugkeer van Ally de Aap en wie vertrekt er nog bij NAC?',
'description': 'md5:4be86427521e7b07e0adb0c9c554ddb2', 'timestamp': 1611663540,
'timestamp': 1487533280, 'upload_date': '20210126',
'upload_date': '20170219', 'duration': 238,
'duration': 2562,
'series': 'Allemaal Chris',
'season': 'Allemaal Chris',
'season_number': 1,
'season_id': '256936078124527',
'episode': 'Allemaal Chris afl. 6',
'episode_number': 6,
'episode_id': '256936078591527',
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
'skip': 'Requires account credentials',
}, { }, {
# clip 'url': 'https://www.gelderlander.nl/video/kanalen/degelderlander~c320/series/snel-nieuws~s984/noodbevel-in-doetinchem-politie-stuurt-mensen-centrum-uit~p194093',
'url': 'http://vtm.be/video?aid=168332',
'info_dict': {
'id': '168332',
'ext': 'mp4',
'title': '"Veronique liegt!"',
'description': 'md5:1385e2b743923afe54ba4adc38476155',
'timestamp': 1489002029,
'upload_date': '20170308',
'duration': 96,
},
}, {
# vod
'url': 'http://vtm.be/video/volledige-afleveringen/id/257107153551000',
'only_matching': True, 'only_matching': True,
}, { }, {
# vod 'url': 'https://embed.mychannels.video/sdk/production/193993?options=TFTFF_default',
'url': 'http://vtm.be/video?aid=163157',
'only_matching': True, 'only_matching': True,
}, { }, {
# vod 'url': 'https://embed.mychannels.video/script/production/193993',
'url': 'http://www.q2.be/video/volledige-afleveringen/id/2be_20170301_VM0684442_q2',
'only_matching': True, 'only_matching': True,
}, { }, {
# clip 'url': 'https://embed.mychannels.video/production/193993',
'url': 'http://vtmkzoom.be/k3-dansstudio/een-nieuw-seizoen-van-k3-dansstudio',
'only_matching': True, 'only_matching': True,
}, { }, {
# http/s redirect 'url': 'https://mychannels.video/embed/193993',
'url': 'https://vtmkzoom.be/video?aid=45724', 'only_matching': True,
'info_dict': {
'id': '257136373657000',
'ext': 'mp4',
'title': 'K3 Dansstudio Ushuaia afl.6',
},
'params': {
'skip_download': True,
},
'skip': 'Requires account credentials',
}, { }, {
# nieuws.vtm.be 'url': 'https://embed.mychannels.video/embed/193993',
'url': 'https://nieuws.vtm.be/stadion/stadion/genk-nog-moeilijk-programma',
'only_matching': True, 'only_matching': True,
}] }]
def _real_initialize(self): @staticmethod
self._logged_in = False def _extract_urls(webpage):
entries = []
def _login(self): for element in re.findall(r'(<div[^>]+data-mychannels-type="video"[^>]*>)', webpage):
username, password = self._get_login_info() mychannels_id = extract_attributes(element).get('data-mychannels-id')
if username is None: if mychannels_id:
self.raise_login_required() entries.append('https://mychannels.video/embed/' + mychannels_id)
return entries
auth_data = {
'APIKey': self._APIKEY,
'sdk': 'js_6.1',
'format': 'json',
'loginID': username,
'password': password,
}
auth_info = self._gigya_login(auth_data)
self._uid = auth_info['UID']
self._uid_signature = auth_info['UIDSignature']
self._signature_timestamp = auth_info['signatureTimestamp']
self._logged_in = True
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) production_id = self._match_id(url)
video_id, site_id = mobj.group('id', 'site_id') production = self._download_json(
'https://embed.mychannels.video/sdk/production/' + production_id,
production_id, query={'options': 'UUUU_default'})['productions'][0]
title = production['title']
webpage = self._download_webpage(url, video_id) formats = []
for source in (production.get('sources') or []):
config = self._parse_json( src = source.get('src')
self._search_regex( if not src:
r'videoJSConfig\s*=\s*JSON\.parse\(\'({.+?})\'\);', continue
webpage, 'config', default='{}'), video_id, ext = mimetype2ext(source.get('type'))
transform_source=lambda s: s.replace( if ext == 'm3u8':
'\\\\', '\\').replace(r'\"', '"').replace(r"\'", "'")) formats.extend(self._extract_m3u8_formats(
src, production_id, 'mp4', 'm3u8_native',
vod_id = config.get('vodId') or self._search_regex( m3u8_id='hls', fatal=False))
(r'\\"vodId\\"\s*:\s*\\"(.+?)\\"',
r'"vodId"\s*:\s*"(.+?)"',
r'<[^>]+id=["\']vod-(\d+)'),
webpage, 'video_id', default=None)
# clip, no authentication required
if not vod_id:
player = self._parse_json(
self._search_regex(
r'vmmaplayer\(({.+?})\);', webpage, 'vmma player',
default=''),
video_id, transform_source=lambda s: '[%s]' % s, fatal=False)
if player:
video = player[-1]
if video['videoUrl'] in ('http', 'https'):
return self.url_result(video['url'], MedialaanIE.ie_key())
info = {
'id': video_id,
'url': video['videoUrl'],
'title': video['title'],
'thumbnail': video.get('imageUrl'),
'timestamp': int_or_none(video.get('createdDate')),
'duration': int_or_none(video.get('duration')),
}
else: else:
info = self._parse_html5_media_entries( formats.append({
url, webpage, video_id, m3u8_id='hls')[0] 'ext': ext,
info.update({ 'url': src,
'id': video_id,
'title': self._html_search_meta('description', webpage),
'duration': parse_duration(self._html_search_meta('duration', webpage)),
}) })
# vod, authentication required self._sort_formats(formats)
else:
if not self._logged_in:
self._login()
settings = self._parse_json( return {
self._search_regex( 'id': production_id,
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);', 'title': title,
webpage, 'drupal settings', default='{}'), 'formats': formats,
video_id) 'thumbnail': production.get('posterUrl'),
'timestamp': parse_iso8601(production.get('publicationDate'), ' '),
def get(container, item): 'duration': int_or_none(production.get('duration')) or None,
return try_get( }
settings, lambda x: x[container][item],
compat_str) or self._search_regex(
r'"%s"\s*:\s*"([^"]+)' % item, webpage, item,
default=None)
app_id = get('vod', 'app_id') or self._SITE_TO_APP_ID.get(site_id, 'vtm_watch')
sso = get('vod', 'gigyaDatabase') or 'vtm-sso'
data = self._download_json(
'http://vod.medialaan.io/api/1.0/item/%s/video' % vod_id,
video_id, query={
'app_id': app_id,
'user_network': sso,
'UID': self._uid,
'UIDSignature': self._uid_signature,
'signatureTimestamp': self._signature_timestamp,
})
formats = self._extract_m3u8_formats(
data['response']['uri'], video_id, entry_protocol='m3u8_native',
ext='mp4', m3u8_id='hls')
self._sort_formats(formats)
info = {
'id': vod_id,
'formats': formats,
}
api_key = get('vod', 'apiKey')
channel = get('medialaanGigya', 'channel')
if api_key:
videos = self._download_json(
'http://vod.medialaan.io/vod/v2/videos', video_id, fatal=False,
query={
'channels': channel,
'ids': vod_id,
'limit': 1,
'apikey': api_key,
})
if videos:
video = try_get(
videos, lambda x: x['response']['videos'][0], dict)
if video:
def get(container, item, expected_type=None):
return try_get(
video, lambda x: x[container][item], expected_type)
def get_string(container, item):
return get(container, item, compat_str)
info.update({
'series': get_string('program', 'title'),
'season': get_string('season', 'title'),
'season_number': int_or_none(get('season', 'number')),
'season_id': get_string('season', 'id'),
'episode': get_string('episode', 'title'),
'episode_number': int_or_none(get('episode', 'number')),
'episode_id': get_string('episode', 'id'),
'duration': int_or_none(
video.get('duration')) or int_or_none(
video.get('durationMillis'), scale=1000),
'title': get_string('episode', 'title'),
'description': get_string('episode', 'text'),
'timestamp': unified_timestamp(get_string(
'publication', 'begin')),
})
if not info.get('title'):
info['title'] = try_get(
config, lambda x: x['videoConfig']['title'],
compat_str) or self._html_search_regex(
r'\\"title\\"\s*:\s*\\"(.+?)\\"', webpage, 'title',
default=None) or self._og_search_title(webpage)
if not info.get('description'):
info['description'] = self._html_search_regex(
r'<div[^>]+class="field-item\s+even">\s*<p>(.+?)</p>',
webpage, 'description', default=None)
return info
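
The rewritten extractor exposes `_extract_urls` so the generic extractor can find MyChannels embeds in arbitrary pages. A self-contained sketch of that scan; the outer regex is copied from the diff, but the attribute lookup uses a plain regex as a stand-in for `extract_attributes`, and `extract_mychannels_urls` and the sample HTML are hypothetical:

```python
import re


def extract_mychannels_urls(webpage):
    """Collect embed URLs for every MyChannels video <div> in the page."""
    entries = []
    for element in re.findall(
            r'(<div[^>]+data-mychannels-type="video"[^>]*>)', webpage):
        # Stand-in for extract_attributes(element).get('data-mychannels-id').
        m = re.search(r'data-mychannels-id="([^"]+)"', element)
        if m:
            entries.append('https://mychannels.video/embed/' + m.group(1))
    return entries


html = '''
<div class="x" data-mychannels-type="video" data-mychannels-id="194165"></div>
<div data-mychannels-type="video"></div>
<div data-mychannels-type="audio" data-mychannels-id="1"></div>
'''
urls = extract_mychannels_urls(html)
```

Elements without an ID, or with a type other than `video`, are skipped, so only the first `<div>` in the sample contributes a URL.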


@ -0,0 +1,196 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
clean_html,
int_or_none,
str_or_none,
strip_or_none,
)
class MindsBaseIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?minds\.com/'
def _call_api(self, path, video_id, resource, query=None):
api_url = 'https://www.minds.com/api/' + path
token = self._get_cookies(api_url).get('XSRF-TOKEN')
return self._download_json(
api_url, video_id, 'Downloading %s JSON metadata' % resource, headers={
'Referer': 'https://www.minds.com/',
'X-XSRF-TOKEN': token.value if token else '',
}, query=query)
class MindsIE(MindsBaseIE):
IE_NAME = 'minds'
_VALID_URL = MindsBaseIE._VALID_URL_BASE + r'(?:media|newsfeed|archive/view)/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://www.minds.com/media/100000000000086822',
'md5': '215a658184a419764852239d4970b045',
'info_dict': {
'id': '100000000000086822',
'ext': 'mp4',
'title': 'Minds intro sequence',
'thumbnail': r're:https?://.+\.png',
'uploader_id': 'ottman',
'upload_date': '20130524',
'timestamp': 1369404826,
'uploader': 'Bill Ottman',
'view_count': int,
'like_count': int,
'dislike_count': int,
'tags': ['animation'],
'comment_count': int,
'license': 'attribution-cc',
},
}, {
# entity.type == 'activity' and empty title
'url': 'https://www.minds.com/newsfeed/798025111988506624',
'md5': 'b2733a74af78d7fd3f541c4cbbaa5950',
'info_dict': {
'id': '798022190320226304',
'ext': 'mp4',
'title': '798022190320226304',
'uploader': 'ColinFlaherty',
'upload_date': '20180111',
'timestamp': 1515639316,
'uploader_id': 'ColinFlaherty',
},
}, {
'url': 'https://www.minds.com/archive/view/715172106794442752',
'only_matching': True,
}, {
# youtube perma_url
'url': 'https://www.minds.com/newsfeed/1197131838022602752',
'only_matching': True,
}]
def _real_extract(self, url):
entity_id = self._match_id(url)
entity = self._call_api(
'v1/entities/entity/' + entity_id, entity_id, 'entity')['entity']
if entity.get('type') == 'activity':
if entity.get('custom_type') == 'video':
video_id = entity['entity_guid']
else:
return self.url_result(entity['perma_url'])
else:
assert(entity['subtype'] == 'video')
video_id = entity_id
# 1080p and webm formats are available only in the sources array
video = self._call_api(
'v2/media/video/' + video_id, video_id, 'video')
formats = []
for source in (video.get('sources') or []):
src = source.get('src')
if not src:
continue
formats.append({
'format_id': source.get('label'),
'height': int_or_none(source.get('size')),
'url': src,
})
self._sort_formats(formats)
entity = video.get('entity') or entity
owner = entity.get('ownerObj') or {}
uploader_id = owner.get('username')
tags = entity.get('tags')
if tags and isinstance(tags, compat_str):
tags = [tags]
thumbnail = None
poster = video.get('poster') or entity.get('thumbnail_src')
if poster:
urlh = self._request_webpage(poster, video_id, fatal=False)
if urlh:
thumbnail = urlh.geturl()
return {
'id': video_id,
'title': entity.get('title') or video_id,
'formats': formats,
'description': clean_html(entity.get('description')) or None,
'license': str_or_none(entity.get('license')),
'timestamp': int_or_none(entity.get('time_created')),
'uploader': strip_or_none(owner.get('name')),
'uploader_id': uploader_id,
'uploader_url': 'https://www.minds.com/' + uploader_id if uploader_id else None,
'view_count': int_or_none(entity.get('play:count')),
'like_count': int_or_none(entity.get('thumbs:up:count')),
'dislike_count': int_or_none(entity.get('thumbs:down:count')),
'tags': tags,
'comment_count': int_or_none(entity.get('comments:count')),
'thumbnail': thumbnail,
}
class MindsFeedBaseIE(MindsBaseIE):
_PAGE_SIZE = 150
def _entries(self, feed_id):
query = {'limit': self._PAGE_SIZE, 'sync': 1}
i = 1
while True:
data = self._call_api(
'v2/feeds/container/%s/videos' % feed_id,
feed_id, 'page %s' % i, query)
entities = data.get('entities') or []
for entity in entities:
guid = entity.get('guid')
if not guid:
continue
yield self.url_result(
'https://www.minds.com/newsfeed/' + guid,
MindsIE.ie_key(), guid)
query['from_timestamp'] = data['load-next']
if not (query['from_timestamp'] and len(entities) == self._PAGE_SIZE):
break
i += 1
def _real_extract(self, url):
feed_id = self._match_id(url)
feed = self._call_api(
'v1/%s/%s' % (self._FEED_PATH, feed_id),
feed_id, self._FEED_TYPE)[self._FEED_TYPE]
return self.playlist_result(
self._entries(feed['guid']), feed_id,
strip_or_none(feed.get('name')),
feed.get('briefdescription'))
class MindsChannelIE(MindsFeedBaseIE):
_FEED_TYPE = 'channel'
IE_NAME = 'minds:' + _FEED_TYPE
_VALID_URL = MindsBaseIE._VALID_URL_BASE + r'(?!(?:newsfeed|media|api|archive|groups)/)(?P<id>[^/?&#]+)'
_FEED_PATH = 'channel'
_TEST = {
'url': 'https://www.minds.com/ottman',
'info_dict': {
'id': 'ottman',
'title': 'Bill Ottman',
'description': 'Co-creator & CEO @minds',
},
'playlist_mincount': 54,
}
class MindsGroupIE(MindsFeedBaseIE):
_FEED_TYPE = 'group'
IE_NAME = 'minds:' + _FEED_TYPE
_VALID_URL = MindsBaseIE._VALID_URL_BASE + r'groups/profile/(?P<id>[0-9]+)'
_FEED_PATH = 'groups/group'
_TEST = {
'url': 'https://www.minds.com/groups/profile/785582576369672204/feed/videos',
'info_dict': {
'id': '785582576369672204',
'title': 'Cooking Videos',
},
'playlist_mincount': 1,
}
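
`MindsFeedBaseIE._entries` pages through a feed by passing the previous response's `load-next` value back as `from_timestamp`, and stops on a short page or a missing cursor. A minimal sketch of that loop against a fake API; `iter_guids`, `fake_fetch` and the `PAGES` data are hypothetical, and the sketch reads the cursor with `.get()` where the extractor indexes `data['load-next']` directly:

```python
def iter_guids(fetch_page, page_size):
    """Yield entity GUIDs page by page until the feed is exhausted."""
    query = {'limit': page_size, 'sync': 1}
    while True:
        data = fetch_page(dict(query))
        entities = data.get('entities') or []
        for entity in entities:
            guid = entity.get('guid')
            if guid:  # entities without a GUID are skipped
                yield guid
        # Assumption: .get() instead of data['load-next'] as in the extractor.
        query['from_timestamp'] = data.get('load-next')
        # Stop when there is no cursor or the page came back short.
        if not (query['from_timestamp'] and len(entities) == page_size):
            break


# Fake three-page feed keyed by the cursor that requests each page.
PAGES = {
    None: {'entities': [{'guid': '1'}, {'guid': '2'}], 'load-next': 't1'},
    't1': {'entities': [{'guid': '3'}, {}], 'load-next': 't2'},
    't2': {'entities': [{'guid': '4'}], 'load-next': ''},
}


def fake_fetch(query):
    return PAGES[query.get('from_timestamp')]


guids = list(iter_guids(fake_fetch, 2))
```

The page-length check means a feed whose size is an exact multiple of the page size costs one extra request, which then returns an empty or short page.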


@ -251,11 +251,9 @@ class MixcloudPlaylistBaseIE(MixcloudBaseIE):
cloudcast_url = cloudcast.get('url') cloudcast_url = cloudcast.get('url')
if not cloudcast_url: if not cloudcast_url:
continue continue
video_id = cloudcast.get('slug') slug = try_get(cloudcast, lambda x: x['slug'], compat_str)
if video_id: owner_username = try_get(cloudcast, lambda x: x['owner']['username'], compat_str)
owner_username = try_get(cloudcast, lambda x: x['owner']['username'], compat_str) video_id = '%s_%s' % (owner_username, slug) if slug and owner_username else None
if owner_username:
video_id = '%s_%s' % (owner_username, video_id)
entries.append(self.url_result( entries.append(self.url_result(
cloudcast_url, MixcloudIE.ie_key(), video_id)) cloudcast_url, MixcloudIE.ie_key(), video_id))
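
After this change the playlist entry ID is only set when both the slug and the owner username are present; previously a bare slug could be used on its own. A short sketch of the new rule, assuming Python 3 and using `isinstance` checks in place of `try_get(..., compat_str)`; `make_video_id` and the sample names are hypothetical:

```python
def make_video_id(cloudcast):
    """Return '<owner>_<slug>' or None when either part is missing."""
    slug = cloudcast.get('slug')
    owner = cloudcast.get('owner')
    username = owner.get('username') if isinstance(owner, dict) else None
    # Mirror try_get's type check: non-string values count as missing.
    if not isinstance(slug, str):
        slug = None
    if not isinstance(username, str):
        username = None
    return '%s_%s' % (username, slug) if slug and username else None
```

Returning `None` rather than a partial ID leaves the final ID to `MixcloudIE` when the entry is resolved.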


@ -253,6 +253,10 @@ class MTVServicesInfoExtractor(InfoExtractor):
return try_get(feed, lambda x: x['result']['data']['id'], compat_str) return try_get(feed, lambda x: x['result']['data']['id'], compat_str)
@staticmethod
def _extract_child_with_type(parent, t):
return next(c for c in parent['children'] if c.get('type') == t)
def _extract_mgid(self, webpage): def _extract_mgid(self, webpage):
try: try:
# the url can be http://media.mtvnservices.com/fb/{mgid}.swf # the url can be http://media.mtvnservices.com/fb/{mgid}.swf
@ -278,6 +282,13 @@ class MTVServicesInfoExtractor(InfoExtractor):
if not mgid: if not mgid:
mgid = self._extract_triforce_mgid(webpage) mgid = self._extract_triforce_mgid(webpage)
if not mgid:
data = self._parse_json(self._search_regex(
r'__DATA__\s*=\s*({.+?});', webpage, 'data'), None)
main_container = self._extract_child_with_type(data, 'MainContainer')
video_player = self._extract_child_with_type(main_container, 'VideoPlayer')
mgid = video_player['props']['media']['video']['config']['uri']
return mgid return mgid
def _real_extract(self, url): def _real_extract(self, url):
@ -349,18 +360,6 @@ class MTVIE(MTVServicesInfoExtractor):
'only_matching': True, 'only_matching': True,
}] }]
@staticmethod
def extract_child_with_type(parent, t):
children = parent['children']
return next(c for c in children if c.get('type') == t)
def _extract_mgid(self, webpage):
data = self._parse_json(self._search_regex(
r'__DATA__\s*=\s*({.+?});', webpage, 'data'), None)
main_container = self.extract_child_with_type(data, 'MainContainer')
video_player = self.extract_child_with_type(main_container, 'VideoPlayer')
return video_player['props']['media']['video']['config']['uri']
class MTVJapanIE(MTVServicesInfoExtractor): class MTVJapanIE(MTVServicesInfoExtractor):
IE_NAME = 'mtvjapan' IE_NAME = 'mtvjapan'
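
The `__DATA__` fallback moved into the base class walks a tree of typed children to reach the video player's `uri`. A self-contained sketch of that walk; the helper body matches the diff, while the sample tree and the mgid value are invented for illustration:

```python
def extract_child_with_type(parent, t):
    """Return the first child of `parent` whose 'type' equals `t`."""
    return next(c for c in parent['children'] if c.get('type') == t)


# Hypothetical page data shaped like the structure the extractor expects.
data = {'children': [
    {'type': 'Header'},
    {'type': 'MainContainer', 'children': [
        {'type': 'VideoPlayer', 'props': {'media': {'video': {
            'config': {'uri': 'mgid:example:video:hypothetical'}}}}},
    ]},
]}

main_container = extract_child_with_type(data, 'MainContainer')
video_player = extract_child_with_type(main_container, 'VideoPlayer')
mgid = video_player['props']['media']['video']['config']['uri']
```

Note that `next()` is called without a default, so a page lacking the expected child raises `StopIteration` instead of returning `None`; callers that need a soft failure would have to catch it.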


@ -1,104 +1,125 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import str_to_int from ..utils import (
determine_ext,
ExtractorError,
int_or_none,
try_get,
url_or_none,
)
class NineGagIE(InfoExtractor): class NineGagIE(InfoExtractor):
IE_NAME = '9gag' IE_NAME = '9gag'
_VALID_URL = r'https?://(?:www\.)?9gag(?:\.com/tv|\.tv)/(?:p|embed)/(?P<id>[a-zA-Z0-9]+)(?:/(?P<display_id>[^?#/]+))?' _VALID_URL = r'https?://(?:www\.)?9gag\.com/gag/(?P<id>[^/?&#]+)'
_TESTS = [{ _TEST = {
'url': 'http://9gag.com/tv/p/Kk2X5/people-are-awesome-2013-is-absolutely-awesome', 'url': 'https://9gag.com/gag/ae5Ag7B',
'info_dict': { 'info_dict': {
'id': 'kXzwOKyGlSA', 'id': 'ae5Ag7B',
'ext': 'mp4', 'ext': 'mp4',
'description': 'This 3-minute video will make you smile and then make you feel untalented and insignificant. Anyway, you should share this awesomeness. (Thanks, Dino!)', 'title': 'Capybara Agility Training',
'title': '\"People Are Awesome 2013\" Is Absolutely Awesome', 'upload_date': '20191108',
'uploader_id': 'UCdEH6EjDKwtTe-sO2f0_1XA', 'timestamp': 1573237208,
'uploader': 'CompilationChannel', 'categories': ['Awesome'],
'upload_date': '20131110', 'tags': ['Weimaraner', 'American Pit Bull Terrier'],
'view_count': int, 'duration': 44,
}, 'like_count': int,
'add_ie': ['Youtube'], 'dislike_count': int,
}, { 'comment_count': int,
'url': 'http://9gag.com/tv/p/aKolP3', }
'info_dict': {
'id': 'aKolP3',
'ext': 'mp4',
'title': 'This Guy Travelled 11 countries In 44 days Just To Make This Amazing Video',
'description': "I just saw more in 1 minute than I've seen in 1 year. This guy's video is epic!!",
'uploader_id': 'rickmereki',
'uploader': 'Rick Mereki',
'upload_date': '20110803',
'view_count': int,
},
'add_ie': ['Vimeo'],
}, {
'url': 'http://9gag.com/tv/p/KklwM',
'only_matching': True,
}, {
'url': 'http://9gag.tv/p/Kk2X5',
'only_matching': True,
}, {
'url': 'http://9gag.com/tv/embed/a5Dmvl',
'only_matching': True,
}]
_EXTERNAL_VIDEO_PROVIDER = {
'1': {
'url': '%s',
'ie_key': 'Youtube',
},
'2': {
'url': 'http://player.vimeo.com/video/%s',
'ie_key': 'Vimeo',
},
'3': {
'url': 'http://instagram.com/p/%s',
'ie_key': 'Instagram',
},
'4': {
'url': 'http://vine.co/v/%s',
'ie_key': 'Vine',
},
} }
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) post_id = self._match_id(url)
video_id = mobj.group('id') post = self._download_json(
display_id = mobj.group('display_id') or video_id 'https://9gag.com/v1/post', post_id, query={
'id': post_id
})['data']['post']
webpage = self._download_webpage(url, display_id) if post.get('type') != 'Animated':
raise ExtractorError(
'The given url does not contain a video',
expected=True)
post_view = self._parse_json( title = post['title']
self._search_regex(
r'var\s+postView\s*=\s*new\s+app\.PostView\({\s*post:\s*({.+?})\s*,\s*posts:\s*prefetchedCurrentPost',
webpage, 'post view'),
display_id)
ie_key = None duration = None
source_url = post_view.get('sourceUrl') formats = []
if not source_url: thumbnails = []
external_video_id = post_view['videoExternalId'] for key, image in (post.get('images') or {}).items():
external_video_provider = post_view['videoExternalProvider'] image_url = url_or_none(image.get('url'))
source_url = self._EXTERNAL_VIDEO_PROVIDER[external_video_provider]['url'] % external_video_id if not image_url:
ie_key = self._EXTERNAL_VIDEO_PROVIDER[external_video_provider]['ie_key'] continue
title = post_view['title'] ext = determine_ext(image_url)
description = post_view.get('description') image_id = key.strip('image')
view_count = str_to_int(post_view.get('externalView')) common = {
thumbnail = post_view.get('thumbnail_700w') or post_view.get('ogImageUrl') or post_view.get('thumbnail_300w') 'url': image_url,
'width': int_or_none(image.get('width')),
'height': int_or_none(image.get('height')),
}
if ext in ('jpg', 'png'):
webp_url = image.get('webpUrl')
if webp_url:
t = common.copy()
t.update({
'id': image_id + '-webp',
'url': webp_url,
})
thumbnails.append(t)
common.update({
'id': image_id,
'ext': ext,
})
thumbnails.append(common)
elif ext in ('webm', 'mp4'):
if not duration:
duration = int_or_none(image.get('duration'))
common['acodec'] = 'none' if image.get('hasAudio') == 0 else None
for vcodec in ('vp8', 'vp9', 'h265'):
c_url = image.get(vcodec + 'Url')
if not c_url:
continue
c_f = common.copy()
c_f.update({
'format_id': image_id + '-' + vcodec,
'url': c_url,
'vcodec': vcodec,
})
formats.append(c_f)
common.update({
'ext': ext,
'format_id': image_id,
})
formats.append(common)
self._sort_formats(formats)
section = try_get(post, lambda x: x['postSection']['name'])
tags = None
post_tags = post.get('tags')
if post_tags:
tags = []
for tag in post_tags:
tag_key = tag.get('key')
if not tag_key:
continue
tags.append(tag_key)
get_count = lambda x: int_or_none(post.get(x + 'Count'))
return { return {
'_type': 'url_transparent', 'id': post_id,
'url': source_url,
'ie_key': ie_key,
'id': video_id,
'display_id': display_id,
'title': title, 'title': title,
'description': description, 'timestamp': int_or_none(post.get('creationTs')),
'view_count': view_count, 'duration': duration,
'thumbnail': thumbnail, 'formats': formats,
'thumbnails': thumbnails,
'like_count': get_count('upVote'),
'dislike_count': get_count('downVote'),
'comment_count': get_count('comments'),
'age_limit': 18 if post.get('nsfw') == 1 else None,
'categories': [section] if section else None,
'tags': tags,
} }
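The rewritten 9gag extractor derives `image_id` with `key.strip('image')`. In Python, `str.strip` removes any of the given characters from both ends rather than a literal prefix. The sketch below contrasts the two behaviours. The key names are illustrative assumptions (not confirmed against the API response), and `strip_prefix` is a hypothetical alternative, not part of the commit.

```python
def strip_chars(key):
    # What the diff does: remove any of the characters
    # 'i', 'm', 'a', 'g', 'e' from both ends of the key.
    return key.strip('image')


def strip_prefix(key, prefix='image'):
    # Hypothetical alternative: drop the literal prefix only.
    return key[len(prefix):] if key.startswith(prefix) else key


# Illustrative keys; assumed for the example.
keys = ['image700', 'image460sv', 'image460svwm']
char_ids = [strip_chars(k) for k in keys]
prefix_ids = [strip_prefix(k) for k in keys]
print(char_ids)
print(prefix_ids)
```

The two functions agree unless a key ends in one of the stripped characters, in which case the trailing character is lost as well.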


@ -6,30 +6,40 @@ import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_urlparse from ..compat import compat_urlparse
from ..utils import ( from ..utils import (
extract_attributes,
get_element_by_class, get_element_by_class,
urlencode_postdata, urlencode_postdata,
) )
class NJPWWorldIE(InfoExtractor): class NJPWWorldIE(InfoExtractor):
_VALID_URL = r'https?://njpwworld\.com/p/(?P<id>[a-z0-9_]+)' _VALID_URL = r'https?://(front\.)?njpwworld\.com/p/(?P<id>[a-z0-9_]+)'
IE_DESC = '新日本プロレスワールド' IE_DESC = '新日本プロレスワールド'
_NETRC_MACHINE = 'njpwworld' _NETRC_MACHINE = 'njpwworld'
_TEST = { _TESTS = [{
'url': 'http://njpwworld.com/p/s_series_00155_1_9/', 'url': 'http://njpwworld.com/p/s_series_00155_1_9/',
'info_dict': { 'info_dict': {
'id': 's_series_00155_1_9', 'id': 's_series_00155_1_9',
'ext': 'mp4', 'ext': 'mp4',
'title': '第9試合 ランディ・サベージ vs リック・スタイナー', 'title': '闘強導夢2000 2000年1月4日 東京ドーム 第9試合 ランディ・サベージ VS リック・スタイナー',
'tags': list, 'tags': list,
}, },
'params': { 'params': {
'skip_download': True, # AES-encrypted m3u8 'skip_download': True, # AES-encrypted m3u8
}, },
'skip': 'Requires login', 'skip': 'Requires login',
} }, {
'url': 'https://front.njpwworld.com/p/s_series_00563_16_bs',
'info_dict': {
'id': 's_series_00563_16_bs',
'ext': 'mp4',
'title': 'WORLD TAG LEAGUE 2020 & BEST OF THE SUPER Jr.27 2020年12月6日 福岡・福岡国際センター バックステージコメント(字幕あり)',
'tags': ["福岡・福岡国際センター", "バックステージコメント", "2020", "20年代"],
},
'params': {
'skip_download': True,
},
}]
_LOGIN_URL = 'https://front.njpwworld.com/auth/login' _LOGIN_URL = 'https://front.njpwworld.com/auth/login'
@ -64,35 +74,27 @@ class NJPWWorldIE(InfoExtractor):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
formats = [] formats = []
for mobj in re.finditer(r'<a[^>]+\bhref=(["\'])/player.+?[^>]*>', webpage): for kind, vid in re.findall(r'if\s+\(\s*imageQualityType\s*==\s*\'([^\']+)\'\s*\)\s*{\s*video_id\s*=\s*"(\d+)"', webpage):
player = extract_attributes(mobj.group(0)) player_path = '/intent?id=%s&type=url' % vid
player_path = player.get('href')
if not player_path:
continue
kind = self._search_regex(
r'(low|high)$', player.get('class') or '', 'kind',
default='low')
player_url = compat_urlparse.urljoin(url, player_path) player_url = compat_urlparse.urljoin(url, player_path)
player_page = self._download_webpage( formats.append({
player_url, video_id, note='Downloading player page') 'url': player_url,
entries = self._parse_html5_media_entries( 'format_id': kind,
player_url, player_page, video_id, m3u8_id='hls-%s' % kind, 'ext': 'mp4',
m3u8_entry_protocol='m3u8_native') 'protocol': 'm3u8',
kind_formats = entries[0]['formats'] 'quality': 2 if kind == 'high' else 1,
for f in kind_formats: })
f['quality'] = 2 if kind == 'high' else 1
formats.extend(kind_formats)
self._sort_formats(formats) self._sort_formats(formats)
post_content = get_element_by_class('post-content', webpage) tag_block = get_element_by_class('tag-block', webpage)
tags = re.findall( tags = re.findall(
r'<li[^>]+class="tag-[^"]+"><a[^>]*>([^<]+)</a></li>', post_content r'<a[^>]+class="tag-[^"]+"[^>]*>([^<]+)</a>', tag_block
) if post_content else None ) if tag_block else None
return { return {
'id': video_id, 'id': video_id,
'title': self._og_search_title(webpage), 'title': get_element_by_class('article-title', webpage) or self._og_search_title(webpage),
'formats': formats, 'formats': formats,
'tags': tags, 'tags': tags,
} }
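The NJPW World change stops following player links and instead reads `(kind, video_id)` pairs out of inline JavaScript, then builds an `/intent` URL per pair. A rough sketch of that step, using the regular expression from the diff against a hypothetical page fragment (the script text and the numeric IDs are made up; `urllib.parse.urljoin` stands in for `compat_urlparse.urljoin`):

```python
import re
from urllib.parse import urljoin

# Hypothetical page fragment; the IDs are invented for the example.
webpage = '''
if (imageQualityType == 'high') {
    video_id = "1001";
} else if (imageQualityType == 'low') {
    video_id = "1002";
}
'''

# Pattern copied from the diff.
PATTERN = r'if\s+\(\s*imageQualityType\s*==\s*\'([^\']+)\'\s*\)\s*{\s*video_id\s*=\s*"(\d+)"'

page_url = 'https://front.njpwworld.com/p/s_series_00563_16_bs'
formats = []
for kind, vid in re.findall(PATTERN, webpage):
    formats.append({
        # The leading slash makes the path resolve against the site root.
        'url': urljoin(page_url, '/intent?id=%s&type=url' % vid),
        'format_id': kind,
        'quality': 2 if kind == 'high' else 1,
    })

for f in formats:
    print(f['format_id'], f['url'])
```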


@ -22,11 +22,15 @@ from ..utils import (
orderedSet, orderedSet,
remove_quotes, remove_quotes,
str_to_int, str_to_int,
update_url_query,
urlencode_postdata,
url_or_none, url_or_none,
) )
class PornHubBaseIE(InfoExtractor): class PornHubBaseIE(InfoExtractor):
_NETRC_MACHINE = 'pornhub'
def _download_webpage_handle(self, *args, **kwargs): def _download_webpage_handle(self, *args, **kwargs):
def dl(*args, **kwargs): def dl(*args, **kwargs):
return super(PornHubBaseIE, self)._download_webpage_handle(*args, **kwargs) return super(PornHubBaseIE, self)._download_webpage_handle(*args, **kwargs)
@ -52,6 +56,66 @@ class PornHubBaseIE(InfoExtractor):
return webpage, urlh return webpage, urlh
def _real_initialize(self):
self._logged_in = False
def _login(self, host):
if self._logged_in:
return
site = host.split('.')[0]
# Both sites pornhub and pornhubpremium have separate accounts
# so there should be an option to provide credentials for both.
# At the same time some videos are available under the same video id
# on both sites so that we have to identify them as the same video.
# For that purpose we have to keep both in the same extractor
# but under different netrc machines.
username, password = self._get_login_info(netrc_machine=site)
if username is None:
return
login_url = 'https://www.%s/%slogin' % (host, 'premium/' if 'premium' in host else '')
login_page = self._download_webpage(
login_url, None, 'Downloading %s login page' % site)
def is_logged(webpage):
return any(re.search(p, webpage) for p in (
r'class=["\']signOut',
r'>Sign\s+[Oo]ut\s*<'))
if is_logged(login_page):
self._logged_in = True
return
login_form = self._hidden_inputs(login_page)
login_form.update({
'username': username,
'password': password,
})
response = self._download_json(
'https://www.%s/front/authenticate' % host, None,
'Logging in to %s' % site,
data=urlencode_postdata(login_form),
headers={
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'Referer': login_url,
'X-Requested-With': 'XMLHttpRequest',
})
if response.get('success') == '1':
self._logged_in = True
return
message = response.get('message')
if message is not None:
raise ExtractorError(
'Unable to login: %s' % message, expected=True)
raise ExtractorError('Unable to log in')
class PornHubIE(PornHubBaseIE): class PornHubIE(PornHubBaseIE):
IE_DESC = 'PornHub and Thumbzilla' IE_DESC = 'PornHub and Thumbzilla'
@ -163,12 +227,20 @@ class PornHubIE(PornHubBaseIE):
}, { }, {
'url': 'https://www.pornhubpremium.com/view_video.php?viewkey=ph5e4acdae54a82', 'url': 'https://www.pornhubpremium.com/view_video.php?viewkey=ph5e4acdae54a82',
'only_matching': True, 'only_matching': True,
}, {
# Some videos are available with the same id on both premium
# and non-premium sites (e.g. this and the following test)
'url': 'https://www.pornhub.com/view_video.php?viewkey=ph5f75b0f4b18e3',
'only_matching': True,
}, {
'url': 'https://www.pornhubpremium.com/view_video.php?viewkey=ph5f75b0f4b18e3',
'only_matching': True,
}] }]
@staticmethod @staticmethod
def _extract_urls(webpage): def _extract_urls(webpage):
return re.findall( return re.findall(
r'<iframe[^>]+?src=["\'](?P<url>(?:https?:)?//(?:www\.)?pornhub\.(?:com|net|org)/embed/[\da-z]+)', r'<iframe[^>]+?src=["\'](?P<url>(?:https?:)?//(?:www\.)?pornhub(?:premium)?\.(?:com|net|org)/embed/[\da-z]+)',
webpage) webpage)
def _extract_count(self, pattern, webpage, name): def _extract_count(self, pattern, webpage, name):
@ -180,12 +252,7 @@ class PornHubIE(PornHubBaseIE):
host = mobj.group('host') or 'pornhub.com' host = mobj.group('host') or 'pornhub.com'
video_id = mobj.group('id') video_id = mobj.group('id')
if 'premium' in host: self._login(host)
if not self._downloader.params.get('cookiefile'):
raise ExtractorError(
'PornHub Premium requires authentication.'
' You may want to use --cookies.',
expected=True)
self._set_cookie(host, 'age_verified', '1') self._set_cookie(host, 'age_verified', '1')
@ -405,6 +472,10 @@ class PornHubIE(PornHubBaseIE):
class PornHubPlaylistBaseIE(PornHubBaseIE): class PornHubPlaylistBaseIE(PornHubBaseIE):
def _extract_page(self, url):
return int_or_none(self._search_regex(
r'\bpage=(\d+)', url, 'page', default=None))
def _extract_entries(self, webpage, host): def _extract_entries(self, webpage, host):
# Only process container div with main playlist content skipping # Only process container div with main playlist content skipping
# drop-down menu that uses similar pattern for videos (see # drop-down menu that uses similar pattern for videos (see
@ -422,26 +493,6 @@ class PornHubPlaylistBaseIE(PornHubBaseIE):
container)) container))
] ]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
playlist_id = mobj.group('id')
webpage = self._download_webpage(url, playlist_id)
entries = self._extract_entries(webpage, host)
playlist = self._parse_json(
self._search_regex(
r'(?:playlistObject|PLAYLIST_VIEW)\s*=\s*({.+?});', webpage,
'playlist', default='{}'),
playlist_id, fatal=False)
title = playlist.get('title') or self._search_regex(
r'>Videos\s+in\s+(.+?)\s+[Pp]laylist<', webpage, 'title', fatal=False)
return self.playlist_result(
entries, playlist_id, title, playlist.get('description'))
class PornHubUserIE(PornHubPlaylistBaseIE): class PornHubUserIE(PornHubPlaylistBaseIE):
_VALID_URL = r'(?P<url>https?://(?:[^/]+\.)?(?P<host>pornhub(?:premium)?\.(?:com|net|org))/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/?#&]+))(?:[?#&]|/(?!videos)|$)' _VALID_URL = r'(?P<url>https?://(?:[^/]+\.)?(?P<host>pornhub(?:premium)?\.(?:com|net|org))/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/?#&]+))(?:[?#&]|/(?!videos)|$)'
@ -463,14 +514,27 @@ class PornHubUserIE(PornHubPlaylistBaseIE):
}, { }, {
'url': 'https://www.pornhub.com/model/zoe_ph?abc=1', 'url': 'https://www.pornhub.com/model/zoe_ph?abc=1',
'only_matching': True, 'only_matching': True,
}, {
# Unavailable via /videos page, but available with direct pagination
# on pornstar page (see [1]), requires premium
# 1. https://github.com/ytdl-org/youtube-dl/issues/27853
'url': 'https://www.pornhubpremium.com/pornstar/sienna-west',
'only_matching': True,
}, {
# Same as before, multi page
'url': 'https://www.pornhubpremium.com/pornstar/lily-labeau',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
user_id = mobj.group('id') user_id = mobj.group('id')
videos_url = '%s/videos' % mobj.group('url')
page = self._extract_page(url)
if page:
videos_url = update_url_query(videos_url, {'page': page})
return self.url_result( return self.url_result(
'%s/videos' % mobj.group('url'), ie=PornHubPagedVideoListIE.ie_key(), videos_url, ie=PornHubPagedVideoListIE.ie_key(), video_id=user_id)
video_id=user_id)
class PornHubPagedPlaylistBaseIE(PornHubPlaylistBaseIE): class PornHubPagedPlaylistBaseIE(PornHubPlaylistBaseIE):
@ -483,32 +547,55 @@ class PornHubPagedPlaylistBaseIE(PornHubPlaylistBaseIE):
<button[^>]+\bid=["\']moreDataBtn <button[^>]+\bid=["\']moreDataBtn
''', webpage) is not None ''', webpage) is not None
def _real_extract(self, url): def _entries(self, url, host, item_id):
mobj = re.match(self._VALID_URL, url) page = self._extract_page(url)
host = mobj.group('host')
item_id = mobj.group('id')
page = int_or_none(self._search_regex( VIDEOS = '/videos'
r'\bpage=(\d+)', url, 'page', default=None))
entries = [] def download_page(base_url, num, fallback=False):
for page_num in (page, ) if page is not None else itertools.count(1): note = 'Downloading page %d%s' % (num, ' (switch to fallback)' if fallback else '')
return self._download_webpage(
base_url, item_id, note, query={'page': num})
def is_404(e):
return isinstance(e.cause, compat_HTTPError) and e.cause.code == 404
base_url = url
has_page = page is not None
first_page = page if has_page else 1
for page_num in (first_page, ) if has_page else itertools.count(first_page):
try: try:
webpage = self._download_webpage( try:
url, item_id, 'Downloading page %d' % page_num, webpage = download_page(base_url, page_num)
query={'page': page_num}) except ExtractorError as e:
# Some sources may not be available via /videos page,
# trying to fallback to main page pagination (see [1])
# 1. https://github.com/ytdl-org/youtube-dl/issues/27853
if is_404(e) and page_num == first_page and VIDEOS in base_url:
base_url = base_url.replace(VIDEOS, '')
webpage = download_page(base_url, page_num, fallback=True)
else:
raise
except ExtractorError as e: except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404: if is_404(e) and page_num != first_page:
break break
raise raise
page_entries = self._extract_entries(webpage, host) page_entries = self._extract_entries(webpage, host)
if not page_entries: if not page_entries:
break break
entries.extend(page_entries) for e in page_entries:
yield e
if not self._has_more(webpage): if not self._has_more(webpage):
break break
return self.playlist_result(orderedSet(entries), item_id) def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
item_id = mobj.group('id')
self._login(host)
return self.playlist_result(self._entries(url, host, item_id), item_id)
class PornHubPagedVideoListIE(PornHubPagedPlaylistBaseIE): class PornHubPagedVideoListIE(PornHubPagedPlaylistBaseIE):
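The new `_entries` generator retries the first page without the `/videos` suffix when it gets a 404, and treats a 404 on any later page as the end of the list. The control flow can be sketched without any network access as below. `NotFound`, `fake_fetch`, the `PAGES` table, and the `example.com` URLs are all hypothetical stand-ins; the real code inspects `ExtractorError.cause` and parses HTML instead.

```python
import itertools


class NotFound(Exception):
    """Stand-in for an HTTP 404 (hypothetical)."""


VIDEOS = '/videos'


def iter_entries(url, fetch, page=None):
    base_url = url
    has_page = page is not None
    first_page = page if has_page else 1
    pages = (first_page,) if has_page else itertools.count(first_page)
    for page_num in pages:
        try:
            entries, has_more = fetch(base_url, page_num)
        except NotFound:
            if page_num == first_page and VIDEOS in base_url:
                # Fall back to main-page pagination; a second 404 propagates.
                base_url = base_url.replace(VIDEOS, '')
                entries, has_more = fetch(base_url, page_num)
            elif page_num != first_page:
                break  # ran past the last page
            else:
                raise
        if not entries:
            break
        for e in entries:
            yield e
        if not has_more:
            break


# Fake site: only the main page paginates, '/videos' always 404s.
PAGES = {
    ('https://example.com/model/someone', 1): (['a', 'b'], True),
    ('https://example.com/model/someone', 2): (['c'], True),
}


def fake_fetch(base_url, num):
    try:
        return PAGES[(base_url, num)]
    except KeyError:
        raise NotFound('%s?page=%d' % (base_url, num))


result = list(iter_entries('https://example.com/model/someone/videos', fake_fetch))
single = list(iter_entries('https://example.com/model/someone/videos', fake_fetch, page=2))
print(result, single)
```

Yielding entries lazily, as the diff does, means the playlist can start being processed before the last page has been requested.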


@ -20,9 +20,6 @@ class BellatorIE(MTVServicesInfoExtractor):
     _FEED_URL = 'http://www.bellator.com/feeds/mrss/'
     _GEO_COUNTRIES = ['US']

-    def _extract_mgid(self, webpage):
-        return self._extract_triforce_mgid(webpage)
-

 class ParamountNetworkIE(MTVServicesInfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?paramountnetwork\.com/[^/]+/[\da-z]{6}(?:[/?#&]|$)'
@ -46,16 +43,6 @@ class ParamountNetworkIE(MTVServicesInfoExtractor):
def _get_feed_query(self, uri): def _get_feed_query(self, uri):
return { return {
'arcEp': 'paramountnetwork.com', 'arcEp': 'paramountnetwork.com',
'imageEp': 'paramountnetwork.com',
'mgid': uri, 'mgid': uri,
} }
def _extract_mgid(self, webpage):
root_data = self._parse_json(self._search_regex(
r'window\.__DATA__\s*=\s*({.+})',
webpage, 'data'), None)
def find_sub_data(data, data_type):
return next(c for c in data['children'] if c.get('type') == data_type)
c = find_sub_data(find_sub_data(root_data, 'MainContainer'), 'VideoPlayer')
return c['props']['media']['video']['config']['uri']


@ -0,0 +1,156 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..utils import (
clean_podcast_url,
float_or_none,
int_or_none,
strip_or_none,
try_get,
unified_strdate,
)
class SpotifyBaseIE(InfoExtractor):
_ACCESS_TOKEN = None
_OPERATION_HASHES = {
'Episode': '8276d4423d709ae9b68ec1b74cc047ba0f7479059a37820be730f125189ac2bf',
'MinimalShow': '13ee079672fad3f858ea45a55eb109553b4fb0969ed793185b2e34cbb6ee7cc0',
'ShowEpisodes': 'e0e5ce27bd7748d2c59b4d44ba245a8992a05be75d6fabc3b20753fc8857444d',
}
_VALID_URL_TEMPL = r'https?://open\.spotify\.com/%s/(?P<id>[^/?&#]+)'
def _real_initialize(self):
self._ACCESS_TOKEN = self._download_json(
'https://open.spotify.com/get_access_token', None)['accessToken']
def _call_api(self, operation, video_id, variables):
return self._download_json(
'https://api-partner.spotify.com/pathfinder/v1/query', video_id, query={
'operationName': 'query' + operation,
'variables': json.dumps(variables),
'extensions': json.dumps({
'persistedQuery': {
'sha256Hash': self._OPERATION_HASHES[operation],
},
})
}, headers={'authorization': 'Bearer ' + self._ACCESS_TOKEN})['data']
def _extract_episode(self, episode, series):
episode_id = episode['id']
title = episode['name'].strip()
formats = []
audio_preview = episode.get('audioPreview') or {}
audio_preview_url = audio_preview.get('url')
if audio_preview_url:
f = {
'url': audio_preview_url.replace('://p.scdn.co/mp3-preview/', '://anon-podcast.scdn.co/'),
'vcodec': 'none',
}
audio_preview_format = audio_preview.get('format')
if audio_preview_format:
f['format_id'] = audio_preview_format
mobj = re.match(r'([0-9A-Z]{3})_(?:[A-Z]+_)?(\d+)', audio_preview_format)
if mobj:
f.update({
'abr': int(mobj.group(2)),
'ext': mobj.group(1).lower(),
})
formats.append(f)
for item in (try_get(episode, lambda x: x['audio']['items']) or []):
item_url = item.get('url')
if not (item_url and item.get('externallyHosted')):
continue
formats.append({
'url': clean_podcast_url(item_url),
'vcodec': 'none',
})
thumbnails = []
for source in (try_get(episode, lambda x: x['coverArt']['sources']) or []):
source_url = source.get('url')
if not source_url:
continue
thumbnails.append({
'url': source_url,
'width': int_or_none(source.get('width')),
'height': int_or_none(source.get('height')),
})
return {
'id': episode_id,
'title': title,
'formats': formats,
'thumbnails': thumbnails,
'description': strip_or_none(episode.get('description')),
'duration': float_or_none(try_get(
episode, lambda x: x['duration']['totalMilliseconds']), 1000),
'release_date': unified_strdate(try_get(
episode, lambda x: x['releaseDate']['isoString'])),
'series': series,
}
class SpotifyIE(SpotifyBaseIE):
IE_NAME = 'spotify'
_VALID_URL = SpotifyBaseIE._VALID_URL_TEMPL % 'episode'
_TEST = {
'url': 'https://open.spotify.com/episode/4Z7GAJ50bgctf6uclHlWKo',
'md5': '74010a1e3fa4d9e1ab3aa7ad14e42d3b',
'info_dict': {
'id': '4Z7GAJ50bgctf6uclHlWKo',
'ext': 'mp3',
'title': 'From the archive: Why time management is ruining our lives',
'description': 'md5:b120d9c4ff4135b42aa9b6d9cde86935',
'duration': 2083.605,
'release_date': '20201217',
'series': "The Guardian's Audio Long Reads",
}
}
def _real_extract(self, url):
episode_id = self._match_id(url)
episode = self._call_api('Episode', episode_id, {
'uri': 'spotify:episode:' + episode_id
})['episode']
return self._extract_episode(
episode, try_get(episode, lambda x: x['podcast']['name']))
class SpotifyShowIE(SpotifyBaseIE):
IE_NAME = 'spotify:show'
_VALID_URL = SpotifyBaseIE._VALID_URL_TEMPL % 'show'
_TEST = {
'url': 'https://open.spotify.com/show/4PM9Ke6l66IRNpottHKV9M',
'info_dict': {
'id': '4PM9Ke6l66IRNpottHKV9M',
'title': 'The Story from the Guardian',
'description': 'The Story podcast is dedicated to our finest audio documentaries, investigations and long form stories',
},
'playlist_mincount': 36,
}
def _real_extract(self, url):
show_id = self._match_id(url)
podcast = self._call_api('ShowEpisodes', show_id, {
'limit': 1000000000,
'offset': 0,
'uri': 'spotify:show:' + show_id,
})['podcast']
podcast_name = podcast.get('name')
entries = []
for item in (try_get(podcast, lambda x: x['episodes']['items']) or []):
episode = item.get('episode')
if not episode:
continue
entries.append(self._extract_episode(episode, podcast_name))
return self.playlist_result(
entries, show_id, podcast_name, podcast.get('description'))
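In `_extract_episode`, the preview format string is split into an extension and an audio bitrate by a single regular expression. A small sketch of just that parsing step is shown here. The regex is the one from the diff; the sample strings (`MP3_96`, `AAC_HE_24`, `unknown`) are assumptions chosen to exercise each branch, not values confirmed from the API.

```python
import re

# Pattern copied from the diff: three-character codec, an optional
# upper-case qualifier, then the bitrate.
FORMAT_RE = r'([0-9A-Z]{3})_(?:[A-Z]+_)?(\d+)'


def parse_preview_format(audio_preview_format):
    f = {'format_id': audio_preview_format, 'vcodec': 'none'}
    mobj = re.match(FORMAT_RE, audio_preview_format)
    if mobj:
        f.update({
            'abr': int(mobj.group(2)),
            'ext': mobj.group(1).lower(),
        })
    return f


mp3 = parse_preview_format('MP3_96')
aac = parse_preview_format('AAC_HE_24')
unknown = parse_preview_format('unknown')
print(mp3, aac, unknown)
```

When the string does not match, the format keeps its `format_id` but carries no `abr` or `ext`, which mirrors how the extractor leaves those fields unset.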


@ -255,8 +255,10 @@ class SVTPlayIE(SVTPlayBaseIE):
         svt_id = self._search_regex(
             (r'<video[^>]+data-video-id=["\']([\da-zA-Z-]+)',
              r'["\']videoSvtId["\']\s*:\s*["\']([\da-zA-Z-]+)',
+             r'["\']videoSvtId\\?["\']\s*:\s*\\?["\']([\da-zA-Z-]+)',
              r'"content"\s*:\s*{.*?"id"\s*:\s*"([\da-zA-Z-]+)"',
-             r'["\']svtId["\']\s*:\s*["\']([\da-zA-Z-]+)'),
+             r'["\']svtId["\']\s*:\s*["\']([\da-zA-Z-]+)',
+             r'["\']svtId\\?["\']\s*:\s*\\?["\']([\da-zA-Z-]+)'),
             webpage, 'video id')

         info_dict = self._extract_by_video_id(svt_id, webpage)
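The two patterns added for SVT Play differ from their neighbours only by the optional backslashes (`\\?`), which lets them match JSON that is itself embedded in a string with escaped quotes. A quick check of that difference, using the old and new `videoSvtId` patterns from the diff against made-up input (the ID `jExample1` and the `find_id` helper are hypothetical):

```python
import re

OLD = r'["\']videoSvtId["\']\s*:\s*["\']([\da-zA-Z-]+)'
NEW = r'["\']videoSvtId\\?["\']\s*:\s*\\?["\']([\da-zA-Z-]+)'

plain = '{"videoSvtId":"jExample1"}'
escaped = r'{\"videoSvtId\":\"jExample1\"}'  # quotes preceded by backslashes


def find_id(pattern, text):
    m = re.search(pattern, text)
    return m.group(1) if m else None


results = {
    'old/plain': find_id(OLD, plain),
    'old/escaped': find_id(OLD, escaped),
    'new/plain': find_id(NEW, plain),
    'new/escaped': find_id(NEW, escaped),
}
print(results)
```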


@ -0,0 +1,193 @@
# coding: utf-8
from __future__ import unicode_literals
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
str_or_none,
try_get,
)
class TrovoBaseIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?trovo\.live/'
def _extract_streamer_info(self, data):
streamer_info = data.get('streamerInfo') or {}
username = streamer_info.get('userName')
return {
'uploader': streamer_info.get('nickName'),
'uploader_id': str_or_none(streamer_info.get('uid')),
'uploader_url': 'https://trovo.live/' + username if username else None,
}
class TrovoIE(TrovoBaseIE):
_VALID_URL = TrovoBaseIE._VALID_URL_BASE + r'(?!(?:clip|video)/)(?P<id>[^/?&#]+)'
def _real_extract(self, url):
username = self._match_id(url)
live_info = self._download_json(
'https://gql.trovo.live/', username, query={
'query': '''{
getLiveInfo(params: {userName: "%s"}) {
isLive
programInfo {
coverUrl
id
streamInfo {
desc
playUrl
}
title
}
streamerInfo {
nickName
uid
userName
}
}
}''' % username,
})['data']['getLiveInfo']
if live_info.get('isLive') == 0:
raise ExtractorError('%s is offline' % username, expected=True)
program_info = live_info['programInfo']
program_id = program_info['id']
title = self._live_title(program_info['title'])
formats = []
for stream_info in (program_info.get('streamInfo') or []):
play_url = stream_info.get('playUrl')
if not play_url:
continue
format_id = stream_info.get('desc')
formats.append({
'format_id': format_id,
'height': int_or_none(format_id[:-1]) if format_id else None,
'url': play_url,
})
self._sort_formats(formats)
info = {
'id': program_id,
'title': title,
'formats': formats,
'thumbnail': program_info.get('coverUrl'),
'is_live': True,
}
info.update(self._extract_streamer_info(live_info))
return info
class TrovoVodIE(TrovoBaseIE):
_VALID_URL = TrovoBaseIE._VALID_URL_BASE + r'(?:clip|video)/(?P<id>[^/?&#]+)'
_TESTS = [{
'url': 'https://trovo.live/video/ltv-100095501_100095501_1609596043',
'info_dict': {
'id': 'ltv-100095501_100095501_1609596043',
'ext': 'mp4',
'title': 'Spontaner 12 Stunden Stream! - Ok Boomer!',
'uploader': 'Exsl',
'timestamp': 1609640305,
'upload_date': '20210103',
'uploader_id': '100095501',
'duration': 43977,
'view_count': int,
'like_count': int,
'comment_count': int,
'comments': 'mincount:8',
'categories': ['Grand Theft Auto V'],
},
}, {
'url': 'https://trovo.live/clip/lc-5285890810184026005',
'only_matching': True,
}]
def _real_extract(self, url):
vid = self._match_id(url)
resp = self._download_json(
'https://gql.trovo.live/', vid, data=json.dumps([{
'query': '''{
batchGetVodDetailInfo(params: {vids: ["%s"]}) {
VodDetailInfos
}
}''' % vid,
}, {
'query': '''{
getCommentList(params: {appInfo: {postID: "%s"}, pageSize: 1000000000, preview: {}}) {
commentList {
author {
nickName
uid
}
commentID
content
createdAt
parentID
}
}
}''' % vid,
}]).encode(), headers={
'Content-Type': 'application/json',
})
vod_detail_info = resp[0]['data']['batchGetVodDetailInfo']['VodDetailInfos'][vid]
vod_info = vod_detail_info['vodInfo']
title = vod_info['title']
language = vod_info.get('languageName')
formats = []
for play_info in (vod_info.get('playInfos') or []):
play_url = play_info.get('playUrl')
if not play_url:
continue
format_id = play_info.get('desc')
formats.append({
'ext': 'mp4',
'filesize': int_or_none(play_info.get('fileSize')),
'format_id': format_id,
'height': int_or_none(format_id[:-1]) if format_id else None,
'language': language,
'protocol': 'm3u8_native',
'tbr': int_or_none(play_info.get('bitrate')),
'url': play_url,
})
self._sort_formats(formats)
category = vod_info.get('categoryName')
get_count = lambda x: int_or_none(vod_info.get(x + 'Num'))
comment_list = try_get(resp, lambda x: x[1]['data']['getCommentList']['commentList'], list) or []
comments = []
for comment in comment_list:
content = comment.get('content')
if not content:
continue
author = comment.get('author') or {}
parent = comment.get('parentID')
comments.append({
'author': author.get('nickName'),
'author_id': str_or_none(author.get('uid')),
'id': str_or_none(comment.get('commentID')),
'text': content,
'timestamp': int_or_none(comment.get('createdAt')),
'parent': 'root' if parent == 0 else str_or_none(parent),
})
info = {
'id': vid,
'title': title,
'formats': formats,
'thumbnail': vod_info.get('coverUrl'),
'timestamp': int_or_none(vod_info.get('publishTs')),
'duration': int_or_none(vod_info.get('duration')),
'view_count': get_count('watch'),
'like_count': get_count('like'),
'comment_count': get_count('comment'),
'comments': comments,
'categories': [category] if category else None,
}
info.update(self._extract_streamer_info(vod_detail_info))
return info
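Both Trovo extractors derive `height` by dropping the last character of the stream description, and the VOD extractor maps a `parentID` of `0` to `'root'`. The sketch below isolates those two conversions. `int_or_none` is a simplified stand-in for the helper in `youtube_dl.utils`, and the sample descriptions and parent IDs are assumptions.

```python
def int_or_none(v):
    # Simplified stand-in for youtube_dl.utils.int_or_none.
    try:
        return int(v)
    except (TypeError, ValueError):
        return None


def height_from_desc(format_id):
    # '1080p' -> 1080; anything that is not digits plus one suffix -> None.
    return int_or_none(format_id[:-1]) if format_id else None


def comment_parent(parent):
    # 0 marks a top-level comment; other IDs are stringified.
    if parent == 0:
        return 'root'
    return None if parent is None else str(parent)


heights = [height_from_desc(d) for d in ('1080p', '720p', 'Source', '', None)]
parents = [comment_parent(p) for p in (0, 12345, None)]
print(heights, parents)
```

A description without a numeric prefix simply yields no height, so such formats are still listed but cannot be ranked by resolution.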


@ -20,7 +20,7 @@ from ..utils import (
 class TV2IE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?tv2\.no/v/(?P<id>\d+)'
-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.tv2.no/v/916509/',
         'info_dict': {
             'id': '916509',
@ -33,7 +33,7 @@ class TV2IE(InfoExtractor):
             'view_count': int,
             'categories': list,
         },
-    }
+    }]
     _API_DOMAIN = 'sumo.tv2.no'
     _PROTOCOLS = ('HDS', 'HLS', 'DASH')
     _GEO_COUNTRIES = ['NO']
@ -42,6 +42,12 @@ class TV2IE(InfoExtractor):
         video_id = self._match_id(url)
         api_base = 'http://%s/api/web/asset/%s' % (self._API_DOMAIN, video_id)

+        asset = self._download_json(
+            api_base + '.json', video_id,
+            'Downloading metadata JSON')['asset']
+        title = asset.get('subtitle') or asset['title']
+        is_live = asset.get('live') is True
+
         formats = []
         format_urls = []
         for protocol in self._PROTOCOLS:
@ -81,7 +87,8 @@ class TV2IE(InfoExtractor):
                 elif ext == 'm3u8':
                     if not data.get('drmProtected'):
                         formats.extend(self._extract_m3u8_formats(
-                            video_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                            video_url, video_id, 'mp4',
+                            'm3u8' if is_live else 'm3u8_native',
                             m3u8_id=format_id, fatal=False))
                 elif ext == 'mpd':
                     formats.extend(self._extract_mpd_formats(
@ -99,11 +106,6 @@ class TV2IE(InfoExtractor):
             raise ExtractorError('This video is DRM protected.', expected=True)
         self._sort_formats(formats)

-        asset = self._download_json(
-            api_base + '.json', video_id,
-            'Downloading metadata JSON')['asset']
-        title = asset['title']
-
         thumbnails = [{
             'id': thumbnail.get('@type'),
             'url': thumbnail.get('url'),
@ -112,7 +114,7 @@ class TV2IE(InfoExtractor):
         return {
             'id': video_id,
             'url': video_url,
-            'title': title,
+            'title': self._live_title(title) if is_live else title,
             'description': strip_or_none(asset.get('description')),
             'thumbnails': thumbnails,
             'timestamp': parse_iso8601(asset.get('createTime')),
@ -120,6 +122,7 @@ class TV2IE(InfoExtractor):
             'view_count': int_or_none(asset.get('views')),
             'categories': asset.get('keywords', '').split(','),
             'formats': formats,
+            'is_live': is_live,
         }
@ -168,13 +171,13 @@ class TV2ArticleIE(InfoExtractor):
 class KatsomoIE(TV2IE):
-    _VALID_URL = r'https?://(?:www\.)?(?:katsomo|mtv)\.fi/(?:#!/)?(?:[^/]+/[0-9a-z-]+-\d+/[0-9a-z-]+-|[^/]+/\d+/[^/]+/)(?P<id>\d+)'
-    _TEST = {
+    _VALID_URL = r'https?://(?:www\.)?(?:katsomo|mtv(uutiset)?)\.fi/(?:sarja/[0-9a-z-]+-\d+/[0-9a-z-]+-|(?:#!/)?jakso/(?:\d+/[^/]+/)?|video/prog)(?P<id>\d+)'
+    _TESTS = [{
         'url': 'https://www.mtv.fi/sarja/mtv-uutiset-live-33001002003/lahden-pelicans-teki-kovan-ratkaisun-ville-nieminen-pihalle-1181321',
         'info_dict': {
             'id': '1181321',
             'ext': 'mp4',
-            'title': 'MTV Uutiset Live',
+            'title': 'Lahden Pelicans teki kovan ratkaisun Ville Nieminen pihalle',
             'description': 'Päätöksen teki Pelicansin hallitus.',
             'timestamp': 1575116484,
             'upload_date': '20191130',
@ -186,7 +189,60 @@ class KatsomoIE(TV2IE):
             # m3u8 download
             'skip_download': True,
         },
-    }
+    }, {
+        'url': 'http://www.katsomo.fi/#!/jakso/33001005/studio55-fi/658521/jukka-kuoppamaki-tekee-yha-lauluja-vaikka-lentokoneessa',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.mtvuutiset.fi/video/prog1311159',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.katsomo.fi/#!/jakso/1311159',
+        'only_matching': True,
+    }]
     _API_DOMAIN = 'api.katsomo.fi'
_PROTOCOLS = ('HLS', 'MPD') _PROTOCOLS = ('HLS', 'MPD')
_GEO_COUNTRIES = ['FI'] _GEO_COUNTRIES = ['FI']
class MTVUutisetArticleIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)mtvuutiset\.fi/artikkeli/[^/]+/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.mtvuutiset.fi/artikkeli/tallaisia-vaurioita-viking-amorellassa-on-useamman-osaston-alla-vetta/7931384',
'info_dict': {
'id': '1311159',
'ext': 'mp4',
'title': 'Viking Amorellan matkustajien evakuointi on alkanut tältä operaatio näyttää laivalla',
'description': 'Viking Amorellan matkustajien evakuointi on alkanut tältä operaatio näyttää laivalla',
'timestamp': 1600608966,
'upload_date': '20200920',
'duration': 153.7886666,
'view_count': int,
'categories': list,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
# multiple Youtube embeds
'url': 'https://www.mtvuutiset.fi/artikkeli/50-vuotta-subarun-vastaiskua/6070962',
'only_matching': True,
}]
def _real_extract(self, url):
article_id = self._match_id(url)
article = self._download_json(
'http://api.mtvuutiset.fi/mtvuutiset/api/json/' + article_id,
article_id)
def entries():
for video in (article.get('videos') or []):
video_type = video.get('videotype')
video_url = video.get('url')
if not (video_url and video_type in ('katsomo', 'youtube')):
continue
yield self.url_result(
video_url, video_type.capitalize(), video.get('video_id'))
return self.playlist_result(
entries(), article_id, article.get('title'), article.get('description'))
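The widened `KatsomoIE._VALID_URL` above can be checked outside the extractor with plain `re`. This is a minimal sketch, assuming only the pattern and the test URLs shown in the diff; `match_id` is a hypothetical helper that imitates what `_match_id` does and is not part of youtube-dl.

```python
import re

# Pattern copied from the updated KatsomoIE._VALID_URL in the diff.
VALID_URL = (
    r'https?://(?:www\.)?(?:katsomo|mtv(uutiset)?)\.fi/'
    r'(?:sarja/[0-9a-z-]+-\d+/[0-9a-z-]+-|(?:#!/)?jakso/(?:\d+/[^/]+/)?|video/prog)'
    r'(?P<id>\d+)'
)

# URLs taken from the _TESTS list in the diff.
TEST_URLS = [
    'https://www.mtv.fi/sarja/mtv-uutiset-live-33001002003/lahden-pelicans-teki-kovan-ratkaisun-ville-nieminen-pihalle-1181321',
    'http://www.katsomo.fi/#!/jakso/33001005/studio55-fi/658521/jukka-kuoppamaki-tekee-yha-lauluja-vaikka-lentokoneessa',
    'https://www.mtvuutiset.fi/video/prog1311159',
    'https://www.katsomo.fi/#!/jakso/1311159',
]


def match_id(url):
    """Hypothetical helper: return the numeric video id, or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


ids = [match_id(u) for u in TEST_URLS]
print(ids)
```

The named group `id` is used instead of a positional index because the pattern also contains the unnamed group `(uutiset)`, which would shift positional numbering.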


@ -17,7 +17,7 @@ class TV4IE(InfoExtractor):
 tv4\.se/(?:[^/]+)/klipp/(?:.*)-|
 tv4play\.se/
 (?:
-(?:program|barn)/(?:[^/]+/|(?:[^\?]+)\?video_id=)|
+(?:program|barn)/(?:(?:[^/]+/){1,2}|(?:[^\?]+)\?video_id=)|
 iframe/video/|
 film/|
 sport/|
@ -65,6 +65,10 @@ class TV4IE(InfoExtractor):
 {
 'url': 'http://www.tv4play.se/program/farang/3922081',
 'only_matching': True,
+},
+{
+'url': 'https://www.tv4play.se/program/nyheterna/avsnitt/13315940',
+'only_matching': True,
 }
 ]


@ -373,6 +373,24 @@ class TwitterIE(TwitterBaseIE):
'uploader_id': '1eVjYOLGkGrQL', 'uploader_id': '1eVjYOLGkGrQL',
}, },
'add_ie': ['TwitterBroadcast'], 'add_ie': ['TwitterBroadcast'],
}, {
# unified card
'url': 'https://twitter.com/BrooklynNets/status/1349794411333394432?s=20',
'info_dict': {
'id': '1349794411333394432',
'ext': 'mp4',
'title': 'md5:d1c4941658e4caaa6cb579260d85dcba',
'thumbnail': r're:^https?://.*\.jpg',
'description': 'md5:71ead15ec44cee55071547d6447c6a3e',
'uploader': 'Brooklyn Nets',
'uploader_id': 'BrooklynNets',
'duration': 324.484,
'timestamp': 1610651040,
'upload_date': '20210114',
},
'params': {
'skip_download': True,
},
}, { }, {
# Twitch Clip Embed # Twitch Clip Embed
'url': 'https://twitter.com/GunB1g/status/1163218564784017422', 'url': 'https://twitter.com/GunB1g/status/1163218564784017422',
@ -389,6 +407,22 @@ class TwitterIE(TwitterBaseIE):
# appplayer card # appplayer card
'url': 'https://twitter.com/poco_dandy/status/1150646424461176832', 'url': 'https://twitter.com/poco_dandy/status/1150646424461176832',
'only_matching': True, 'only_matching': True,
}, {
# video_direct_message card
'url': 'https://twitter.com/qarev001/status/1348948114569269251',
'only_matching': True,
}, {
# poll2choice_video card
'url': 'https://twitter.com/CAF_Online/status/1349365911120195585',
'only_matching': True,
}, {
# poll3choice_video card
'url': 'https://twitter.com/SamsungMobileSA/status/1348609186725289984',
'only_matching': True,
}, {
# poll4choice_video card
'url': 'https://twitter.com/SouthamptonFC/status/1347577658079641604',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -433,8 +467,7 @@ class TwitterIE(TwitterBaseIE):
'tags': tags, 'tags': tags,
} }
media = try_get(status, lambda x: x['extended_entities']['media'][0]) def extract_from_video_info(media):
if media and media.get('type') != 'photo':
video_info = media.get('video_info') or {} video_info = media.get('video_info') or {}
formats = [] formats = []
@ -461,6 +494,10 @@ class TwitterIE(TwitterBaseIE):
'thumbnails': thumbnails, 'thumbnails': thumbnails,
'duration': float_or_none(video_info.get('duration_millis'), 1000), 'duration': float_or_none(video_info.get('duration_millis'), 1000),
}) })
media = try_get(status, lambda x: x['extended_entities']['media'][0])
if media and media.get('type') != 'photo':
extract_from_video_info(media)
else: else:
card = status.get('card') card = status.get('card')
if card: if card:
@ -493,7 +530,12 @@ class TwitterIE(TwitterBaseIE):
'_type': 'url', '_type': 'url',
'url': get_binding_value('card_url'), 'url': get_binding_value('card_url'),
}) })
# amplify, promo_video_website, promo_video_convo, appplayer, ... elif card_name == 'unified_card':
media_entities = self._parse_json(get_binding_value('unified_card'), twid)['media_entities']
extract_from_video_info(next(iter(media_entities.values())))
# amplify, promo_video_website, promo_video_convo, appplayer,
# video_direct_message, poll2choice_video, poll3choice_video,
# poll4choice_video, ...
else: else:
is_amplify = card_name == 'amplify' is_amplify = card_name == 'amplify'
vmap_url = get_binding_value('amplify_url_vmap') if is_amplify else get_binding_value('player_stream_url') vmap_url = get_binding_value('amplify_url_vmap') if is_amplify else get_binding_value('player_stream_url')
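The `unified_card` branch above decodes a JSON string from the card bindings and passes the first entry of `media_entities` to `extract_from_video_info`. The sketch below illustrates that step in isolation with the standard `json` module. The sample payload is invented for illustration: only the `media_entities` key and the `video_info` / `duration_millis` fields appear in the diff, and `first_media_entity` is a hypothetical name.

```python
import json


def first_media_entity(unified_card_json):
    """Hypothetical helper: decode the card JSON and return its first media entity."""
    card = json.loads(unified_card_json)
    media_entities = card['media_entities']
    # dicts keep insertion order, so this picks the first entity as listed in the JSON
    return next(iter(media_entities.values()))


# Invented sample payload; real cards carry many more fields.
sample = json.dumps({
    'media_entities': {
        '111': {'type': 'video', 'video_info': {'duration_millis': 324484}},
        '222': {'type': 'photo'},
    },
})

media = first_media_entity(sample)
# Same millisecond-to-second scaling the diff applies via float_or_none(..., 1000)
duration = media['video_info']['duration_millis'] / 1000.0
print(media['type'], duration)
```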

View File

@ -4,7 +4,13 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import int_or_none from ..utils import (
int_or_none,
parse_iso8601,
str_or_none,
strip_or_none,
try_get,
)
class VidioIE(InfoExtractor): class VidioIE(InfoExtractor):
@ -21,57 +27,63 @@ class VidioIE(InfoExtractor):
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 149, 'duration': 149,
'like_count': int, 'like_count': int,
'uploader': 'TWELVE Pic',
'timestamp': 1444902800,
'upload_date': '20151015',
'uploader_id': 'twelvepictures',
'channel': 'Cover Music Video',
'channel_id': '280236',
'view_count': int,
'dislike_count': int,
'comment_count': int,
'tags': 'count:4',
}, },
}, { }, {
'url': 'https://www.vidio.com/watch/77949-south-korea-test-fires-missile-that-can-strike-all-of-the-north', 'url': 'https://www.vidio.com/watch/77949-south-korea-test-fires-missile-that-can-strike-all-of-the-north',
'only_matching': True, 'only_matching': True,
}] }]
def _real_initialize(self):
self._api_key = self._download_json(
'https://www.vidio.com/auth', None, data=b'')['api_key']
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) video_id, display_id = re.match(self._VALID_URL, url).groups()
video_id, display_id = mobj.group('id', 'display_id') data = self._download_json(
'https://api.vidio.com/videos/' + video_id, display_id, headers={
'Content-Type': 'application/vnd.api+json',
'X-API-KEY': self._api_key,
})
video = data['videos'][0]
title = video['title'].strip()
webpage = self._download_webpage(url, display_id)
title = self._og_search_title(webpage)
m3u8_url, duration, thumbnail = [None] * 3
clips = self._parse_json(
self._html_search_regex(
r'data-json-clips\s*=\s*(["\'])(?P<data>\[.+?\])\1',
webpage, 'video data', default='[]', group='data'),
display_id, fatal=False)
if clips:
clip = clips[0]
m3u8_url = clip.get('sources', [{}])[0].get('file')
duration = clip.get('clip_duration')
thumbnail = clip.get('image')
m3u8_url = m3u8_url or self._search_regex(
r'data(?:-vjs)?-clip-hls-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
webpage, 'hls url', group='url')
formats = self._extract_m3u8_formats( formats = self._extract_m3u8_formats(
m3u8_url, display_id, 'mp4', entry_protocol='m3u8_native') data['clips'][0]['hls_url'], display_id, 'mp4', 'm3u8_native')
self._sort_formats(formats) self._sort_formats(formats)
duration = int_or_none(duration or self._search_regex( get_first = lambda x: try_get(data, lambda y: y[x + 's'][0], dict) or {}
r'data-video-duration=(["\'])(?P<duration>\d+)\1', webpage, channel = get_first('channel')
'duration', fatal=False, group='duration')) user = get_first('user')
thumbnail = thumbnail or self._og_search_thumbnail(webpage) username = user.get('username')
get_count = lambda x: int_or_none(video.get('total_' + x))
like_count = int_or_none(self._search_regex(
(r'<span[^>]+data-comment-vote-count=["\'](\d+)',
r'<span[^>]+class=["\'].*?\blike(?:__|-)count\b.*?["\'][^>]*>\s*(\d+)'),
webpage, 'like count', fatal=False))
return { return {
'id': video_id, 'id': video_id,
'display_id': display_id, 'display_id': display_id,
'title': title, 'title': title,
'description': self._og_search_description(webpage), 'description': strip_or_none(video.get('description')),
'thumbnail': thumbnail, 'thumbnail': video.get('image_url_medium'),
'duration': duration, 'duration': int_or_none(video.get('duration')),
'like_count': like_count, 'like_count': get_count('likes'),
'formats': formats, 'formats': formats,
'uploader': user.get('name'),
'timestamp': parse_iso8601(video.get('created_at')),
'uploader_id': username,
'uploader_url': 'https://www.vidio.com/@' + username if username else None,
'channel': channel.get('name'),
'channel_id': str_or_none(channel.get('id')),
'view_count': get_count('view_count'),
'dislike_count': get_count('dislikes'),
'comment_count': get_count('comments'),
'tags': video.get('tag_list'),
} }
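The rewritten `VidioIE` reads everything from the API response through two small lambdas, `get_first` and `get_count`, which fall back to `{}` or `None` when a field is missing. A rough sketch of that pattern follows. `try_get` and `int_or_none` here are simplified stand-ins written for this example, not the real `youtube_dl.utils` implementations, and the `data` dictionary is an invented response.

```python
def try_get(src, getter, expected_type=None):
    """Simplified stand-in for youtube_dl.utils.try_get (assumption, not the real code)."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


def int_or_none(value):
    """Simplified stand-in for youtube_dl.utils.int_or_none."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


# Invented API response; 'channels' is deliberately absent.
data = {
    'videos': [{'title': ' Demo ', 'total_likes': '12', 'total_view_count': 340}],
    'users': [{'username': 'demo_user', 'name': 'Demo User'}],
}
video = data['videos'][0]

# Same shape as the lambdas in the diff: pluralise the key, take the first item, default to {}
get_first = lambda x: try_get(data, lambda y: y[x + 's'][0], dict) or {}
get_count = lambda x: int_or_none(video.get('total_' + x))

user = get_first('user')
channel = get_first('channel')
print(user.get('username'), channel.get('name'), get_count('likes'), get_count('comments'))
```

Defaulting to an empty dict lets later `.get()` calls stay unconditional, which is why the returned info dict in the diff can reference `channel.get('name')` without a guard.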

View File

@ -1,68 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
decode_packed_codes,
js_to_json,
NO_DEFAULT,
PACKED_CODES_RE,
)
class VidziIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?vidzi\.(?:tv|cc|si|nu)/(?:embed-)?(?P<id>[0-9a-zA-Z]+)'
_TESTS = [{
'url': 'http://vidzi.tv/cghql9yq6emu.html',
'md5': '4f16c71ca0c8c8635ab6932b5f3f1660',
'info_dict': {
'id': 'cghql9yq6emu',
'ext': 'mp4',
'title': 'youtube-dl test video 1\\\\2\'3/4<5\\\\6ä7↭',
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://vidzi.tv/embed-4z2yb0rzphe9-600x338.html',
'only_matching': True,
}, {
'url': 'http://vidzi.cc/cghql9yq6emu.html',
'only_matching': True,
}, {
'url': 'https://vidzi.si/rph9gztxj1et.html',
'only_matching': True,
}, {
'url': 'http://vidzi.nu/cghql9yq6emu.html',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'http://vidzi.tv/%s' % video_id, video_id)
title = self._html_search_regex(
r'(?s)<h2 class="video-title">(.*?)</h2>', webpage, 'title')
codes = [webpage]
codes.extend([
decode_packed_codes(mobj.group(0)).replace('\\\'', '\'')
for mobj in re.finditer(PACKED_CODES_RE, webpage)])
for num, code in enumerate(codes, 1):
jwplayer_data = self._parse_json(
self._search_regex(
r'setup\(([^)]+)\)', code, 'jwplayer data',
default=NO_DEFAULT if num == len(codes) else '{}'),
video_id, transform_source=lambda s: js_to_json(
re.sub(r'\s*\+\s*window\[.+?\]', '', s)))
if jwplayer_data:
break
info_dict = self._parse_jwplayer_data(jwplayer_data, video_id, require_title=False)
info_dict['title'] = title
return info_dict

View File

@ -116,7 +116,7 @@ class VLiveIE(VLiveBaseIE):
 headers={'Referer': 'https://www.vlive.tv/'}, query=query)
 except ExtractorError as e:
 if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
-self.raise_login_required(json.loads(e.cause.read().decode())['message'])
+self.raise_login_required(json.loads(e.cause.read().decode('utf-8'))['message'])
 raise
 def _real_extract(self, url):

View File

@ -0,0 +1,62 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
try_get,
)
class VTMIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?vtm\.be/([^/?&#]+)~v(?P<id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})'
_TEST = {
'url': 'https://vtm.be/gast-vernielt-genkse-hotelkamer~ve7534523-279f-4b4d-a5c9-a33ffdbe23e1',
'md5': '37dca85fbc3a33f2de28ceb834b071f8',
'info_dict': {
'id': '192445',
'ext': 'mp4',
'title': 'Gast vernielt Genkse hotelkamer',
'timestamp': 1611060180,
'upload_date': '20210119',
'duration': 74,
# TODO: fix url _type result processing
# 'series': 'Op Interventie',
}
}
def _real_extract(self, url):
uuid = self._match_id(url)
video = self._download_json(
'https://omc4vm23offuhaxx6hekxtzspi.appsync-api.eu-west-1.amazonaws.com/graphql',
uuid, query={
'query': '''{
getComponent(type: Video, uuid: "%s") {
... on Video {
description
duration
myChannelsVideo
program {
title
}
publishedAt
title
}
}
}''' % uuid,
}, headers={
'x-api-key': 'da2-lz2cab4tfnah3mve6wiye4n77e',
})['data']['getComponent']
return {
'_type': 'url',
'id': uuid,
'title': video.get('title'),
'url': 'http://mychannels.video/embed/%d' % video['myChannelsVideo'],
'description': video.get('description'),
'timestamp': parse_iso8601(video.get('publishedAt')),
'duration': int_or_none(video.get('duration')),
'series': try_get(video, lambda x: x['program']['title']),
'ie_key': 'Medialaan',
}

View File

@ -4,6 +4,7 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from .youtube import YoutubeIE
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
@ -47,6 +48,22 @@ class VVVVIDIE(InfoExtractor):
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
}, {
# video_type == 'video/youtube'
'url': 'https://www.vvvvid.it/show/404/one-punch-man/406/486683/trailer',
'md5': '33e0edfba720ad73a8782157fdebc648',
'info_dict': {
'id': 'RzmFKUDOUgw',
'ext': 'mp4',
'title': 'Trailer',
'upload_date': '20150906',
'description': 'md5:a5e802558d35247fee285875328c0b80',
'uploader_id': 'BandaiVisual',
'uploader': 'BANDAI NAMCO Arts Channel',
},
'params': {
'skip_download': True,
},
}, { }, {
'url': 'https://www.vvvvid.it/show/434/perche-dovrei-guardarlo-di-dario-moccia/437/489048', 'url': 'https://www.vvvvid.it/show/434/perche-dovrei-guardarlo-di-dario-moccia/437/489048',
'only_matching': True 'only_matching': True
@ -154,12 +171,13 @@ class VVVVIDIE(InfoExtractor):
if season_number: if season_number:
info['season_number'] = int(season_number) info['season_number'] = int(season_number)
for quality in ('_sd', ''): video_type = video_data.get('video_type')
is_youtube = False
for quality in ('', '_sd'):
embed_code = video_data.get('embed_info' + quality) embed_code = video_data.get('embed_info' + quality)
if not embed_code: if not embed_code:
continue continue
embed_code = ds(embed_code) embed_code = ds(embed_code)
video_type = video_data.get('video_type')
if video_type in ('video/rcs', 'video/kenc'): if video_type in ('video/rcs', 'video/kenc'):
if video_type == 'video/kenc': if video_type == 'video/kenc':
kenc = self._download_json( kenc = self._download_json(
@ -172,19 +190,28 @@ class VVVVIDIE(InfoExtractor):
if kenc_message: if kenc_message:
embed_code += '?' + ds(kenc_message) embed_code += '?' + ds(kenc_message)
formats.extend(self._extract_akamai_formats(embed_code, video_id)) formats.extend(self._extract_akamai_formats(embed_code, video_id))
elif video_type == 'video/youtube':
info.update({
'_type': 'url_transparent',
'ie_key': YoutubeIE.ie_key(),
'url': embed_code,
})
is_youtube = True
break
else: else:
formats.extend(self._extract_wowza_formats( formats.extend(self._extract_wowza_formats(
'http://sb.top-ix.org/videomg/_definst_/mp4:%s/playlist.m3u8' % embed_code, video_id)) 'http://sb.top-ix.org/videomg/_definst_/mp4:%s/playlist.m3u8' % embed_code, video_id))
metadata_from_url(embed_code) metadata_from_url(embed_code)
self._sort_formats(formats) if not is_youtube:
self._sort_formats(formats)
info['formats'] = formats
metadata_from_url(video_data.get('thumbnail')) metadata_from_url(video_data.get('thumbnail'))
info.update(self._extract_common_video_info(video_data)) info.update(self._extract_common_video_info(video_data))
info.update({ info.update({
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'formats': formats,
'duration': int_or_none(video_data.get('length')), 'duration': int_or_none(video_data.get('length')),
'series': video_data.get('show_title'), 'series': video_data.get('show_title'),
'season_id': season_id, 'season_id': season_id,

View File

@ -1,12 +1,9 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import compat_str
from ..utils import ( from ..utils import (
ExtractorError,
unified_strdate, unified_strdate,
HEADRequest, HEADRequest,
int_or_none, int_or_none,
@ -46,15 +43,6 @@ class WatIE(InfoExtractor):
}, },
] ]
_FORMATS = (
(200, 416, 234),
(400, 480, 270),
(600, 640, 360),
(1200, 640, 360),
(1800, 960, 540),
(2500, 1280, 720),
)
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
video_id = video_id if video_id.isdigit() and len(video_id) > 6 else compat_str(int(video_id, 36)) video_id = video_id if video_id.isdigit() and len(video_id) > 6 else compat_str(int(video_id, 36))
@ -97,46 +85,20 @@ class WatIE(InfoExtractor):
return red_url return red_url
return None return None
def remove_bitrate_limit(manifest_url):
return re.sub(r'(?:max|min)_bitrate=\d+&?', '', manifest_url)
formats = [] formats = []
try: manifest_urls = self._download_json(
alt_urls = lambda manifest_url: [re.sub(r'(?:wdv|ssm)?\.ism/', repl + '.ism/', manifest_url) for repl in ('', 'ssm')] 'http://www.wat.tv/get/webhtml/' + video_id, video_id)
manifest_urls = self._download_json( m3u8_url = manifest_urls.get('hls')
'http://www.wat.tv/get/webhtml/' + video_id, video_id) if m3u8_url:
m3u8_url = manifest_urls.get('hls') formats.extend(self._extract_m3u8_formats(
if m3u8_url: m3u8_url, video_id, 'mp4',
m3u8_url = remove_bitrate_limit(m3u8_url) 'm3u8_native', m3u8_id='hls', fatal=False))
for m3u8_alt_url in alt_urls(m3u8_url): mpd_url = manifest_urls.get('mpd')
formats.extend(self._extract_m3u8_formats( if mpd_url:
m3u8_alt_url, video_id, 'mp4', formats.extend(self._extract_mpd_formats(
'm3u8_native', m3u8_id='hls', fatal=False)) mpd_url.replace('://das-q1.tf1.fr/', '://das-q1-ssl.tf1.fr/'),
formats.extend(self._extract_f4m_formats( video_id, mpd_id='dash', fatal=False))
m3u8_alt_url.replace('ios', 'web').replace('.m3u8', '.f4m'), self._sort_formats(formats)
video_id, f4m_id='hds', fatal=False))
mpd_url = manifest_urls.get('mpd')
if mpd_url:
mpd_url = remove_bitrate_limit(mpd_url)
for mpd_alt_url in alt_urls(mpd_url):
formats.extend(self._extract_mpd_formats(
mpd_alt_url, video_id, mpd_id='dash', fatal=False))
self._sort_formats(formats)
except ExtractorError:
abr = 64
for vbr, width, height in self._FORMATS:
tbr = vbr + abr
format_id = 'http-%s' % tbr
fmt_url = 'http://dnl.adv.tf1.fr/2/USP-0x0/%s/%s/%s/ssm/%s-%s-64k.mp4' % (video_id[-4:-2], video_id[-2:], video_id, video_id, vbr)
if self._is_valid_url(fmt_url, video_id, format_id):
formats.append({
'format_id': format_id,
'url': fmt_url,
'vbr': vbr,
'abr': abr,
'width': width,
'height': height,
})
date_diffusion = first_chapter.get('date_diffusion') or video_data.get('configv4', {}).get('estatS4') date_diffusion = first_chapter.get('date_diffusion') or video_data.get('configv4', {}).get('estatS4')
upload_date = unified_strdate(date_diffusion) if date_diffusion else None upload_date = unified_strdate(date_diffusion) if date_diffusion else None

View File

@ -177,46 +177,9 @@ class YahooIE(InfoExtractor):
'only_matching': True, 'only_matching': True,
}] }]
def _real_extract(self, url): def _extract_yahoo_video(self, video_id, country):
url, country, display_id = re.match(self._VALID_URL, url).groups()
if not country:
country = 'us'
else:
country = country.split('-')[0]
api_base = 'https://%s.yahoo.com/_td/api/resource/' % country
for i, uuid in enumerate(['url=' + url, 'ymedia-alias=' + display_id]):
content = self._download_json(
api_base + 'content;getDetailView=true;uuids=["%s"]' % uuid,
display_id, 'Downloading content JSON metadata', fatal=i == 1)
if content:
item = content['items'][0]
break
if item.get('type') != 'video':
entries = []
cover = item.get('cover') or {}
if cover.get('type') == 'yvideo':
cover_url = cover.get('url')
if cover_url:
entries.append(self.url_result(
cover_url, 'Yahoo', cover.get('uuid')))
for e in item.get('body', []):
if e.get('type') == 'videoIframe':
iframe_url = e.get('url')
if not iframe_url:
continue
entries.append(self.url_result(iframe_url))
return self.playlist_result(
entries, item.get('uuid'),
item.get('title'), item.get('summary'))
video_id = item['uuid']
video = self._download_json( video = self._download_json(
api_base + 'VideoService.videos;view=full;video_ids=["%s"]' % video_id, 'https://%s.yahoo.com/_td/api/resource/VideoService.videos;view=full;video_ids=["%s"]' % (country, video_id),
video_id, 'Downloading video JSON metadata')[0] video_id, 'Downloading video JSON metadata')[0]
title = video['title'] title = video['title']
@ -298,7 +261,6 @@ class YahooIE(InfoExtractor):
'id': video_id, 'id': video_id,
'title': self._live_title(title) if is_live else title, 'title': self._live_title(title) if is_live else title,
'formats': formats, 'formats': formats,
'display_id': display_id,
'thumbnails': thumbnails, 'thumbnails': thumbnails,
'description': clean_html(video.get('description')), 'description': clean_html(video.get('description')),
'timestamp': parse_iso8601(video.get('publish_time')), 'timestamp': parse_iso8601(video.get('publish_time')),
@ -311,6 +273,44 @@ class YahooIE(InfoExtractor):
'episode_number': int_or_none(series_info.get('episode_number')), 'episode_number': int_or_none(series_info.get('episode_number')),
} }
def _real_extract(self, url):
url, country, display_id = re.match(self._VALID_URL, url).groups()
if not country:
country = 'us'
else:
country = country.split('-')[0]
item = self._download_json(
'https://%s.yahoo.com/caas/content/article' % country, display_id,
'Downloading content JSON metadata', query={
'url': url
})['items'][0]['data']['partnerData']
if item.get('type') != 'video':
entries = []
cover = item.get('cover') or {}
if cover.get('type') == 'yvideo':
cover_url = cover.get('url')
if cover_url:
entries.append(self.url_result(
cover_url, 'Yahoo', cover.get('uuid')))
for e in (item.get('body') or []):
if e.get('type') == 'videoIframe':
iframe_url = e.get('url')
if not iframe_url:
continue
entries.append(self.url_result(iframe_url))
return self.playlist_result(
entries, item.get('uuid'),
item.get('title'), item.get('summary'))
info = self._extract_yahoo_video(item['uuid'], country)
info['display_id'] = display_id
return info
class YahooSearchIE(SearchInfoExtractor): class YahooSearchIE(SearchInfoExtractor):
IE_DESC = 'Yahoo screen search' IE_DESC = 'Yahoo screen search'

View File

@ -60,6 +60,9 @@ class YouPornIE(InfoExtractor):
 }, {
 'url': 'http://www.youporn.com/watch/505835',
 'only_matching': True,
+}, {
+'url': 'https://www.youporn.com/watch/13922959/femdom-principal/',
+'only_matching': True,
 }]
 @staticmethod
@staticmethod @staticmethod
@ -88,7 +91,7 @@ class YouPornIE(InfoExtractor):
 # Main source
 definitions = self._parse_json(
 self._search_regex(
-r'mediaDefinition\s*=\s*(\[.+?\]);', webpage,
+r'mediaDefinition\s*[=:]\s*(\[.+?\])\s*[;,]', webpage,
 'media definitions', default='[]'),
 video_id, fatal=False)
 if definitions:
@ -100,7 +103,7 @@ class YouPornIE(InfoExtractor):
 links.append(video_url)
 # Fallback #1, this also contains extra low quality 180p format
-for _, link in re.findall(r'<a[^>]+href=(["\'])(http.+?)\1[^>]+title=["\']Download [Vv]ideo', webpage):
+for _, link in re.findall(r'<a[^>]+href=(["\'])(http(?:(?!\1).)+\.mp4(?:(?!\1).)*)\1[^>]+title=["\']Download [Vv]ideo', webpage):
 links.append(link)
 # Fallback #2 (unavailable as at 22.06.2017)
@ -128,8 +131,9 @@ class YouPornIE(InfoExtractor):
 # Video URL's path looks like this:
 # /201012/17/505835/720p_1500k_505835/YouPorn%20-%20Sex%20Ed%20Is%20It%20Safe%20To%20Masturbate%20Daily.mp4
 # /201012/17/505835/vl_240p_240k_505835/YouPorn%20-%20Sex%20Ed%20Is%20It%20Safe%20To%20Masturbate%20Daily.mp4
+# /videos/201703/11/109285532/1080P_4000K_109285532.mp4
 # We will benefit from it by extracting some metadata
-mobj = re.search(r'(?P<height>\d{3,4})[pP]_(?P<bitrate>\d+)[kK]_\d+/', video_url)
+mobj = re.search(r'(?P<height>\d{3,4})[pP]_(?P<bitrate>\d+)[kK]_\d+', video_url)
 if mobj:
 height = int(mobj.group('height'))
 bitrate = int(mobj.group('bitrate'))
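The metadata regex change above only drops a trailing `/`. A short comparison of the old and new patterns against the path shapes quoted in the comments shows why this matters for the newer flat `.mp4` layout. This is an illustrative sketch; `parse_quality` is a hypothetical helper, and only the two patterns and the sample paths come from the diff.

```python
import re

# Patterns copied from the diff: the old one requires a trailing '/', the new one does not.
OLD_RE = r'(?P<height>\d{3,4})[pP]_(?P<bitrate>\d+)[kK]_\d+/'
NEW_RE = r'(?P<height>\d{3,4})[pP]_(?P<bitrate>\d+)[kK]_\d+'

# Path shapes quoted in the extractor comments.
OLD_STYLE_PATH = '/201012/17/505835/720p_1500k_505835/YouPorn%20-%20Sex%20Ed%20Is%20It%20Safe%20To%20Masturbate%20Daily.mp4'
NEW_STYLE_PATH = '/videos/201703/11/109285532/1080P_4000K_109285532.mp4'


def parse_quality(pattern, path):
    """Hypothetical helper: return (height, bitrate) as ints, or None if the pattern does not match."""
    mobj = re.search(pattern, path)
    if not mobj:
        return None
    return int(mobj.group('height')), int(mobj.group('bitrate'))


old_on_old = parse_quality(OLD_RE, OLD_STYLE_PATH)
old_on_new = parse_quality(OLD_RE, NEW_STYLE_PATH)
new_on_old = parse_quality(NEW_RE, OLD_STYLE_PATH)
new_on_new = parse_quality(NEW_RE, NEW_STYLE_PATH)
print(old_on_old, old_on_new, new_on_old, new_on_new)
```

In the new layout the quality token is followed by `.mp4` rather than a directory separator, so the old pattern finds nothing there while the relaxed one handles both shapes.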

File diff suppressed because it is too large


@ -87,11 +87,16 @@ class ZypeIE(InfoExtractor):
 r'(["\'])(?P<url>(?:(?!\1).)+\.m3u8(?:(?!\1).)*)\1',
 body, 'm3u8 url', group='url', default=None)
 if not m3u8_url:
-source = self._parse_json(self._search_regex(
-r'(?s)sources\s*:\s*\[\s*({.+?})\s*\]', body,
-'source'), video_id, js_to_json)
-if source.get('integration') == 'verizon-media':
-m3u8_url = 'https://content.uplynk.com/%s.m3u8' % source['id']
+source = self._search_regex(
+r'(?s)sources\s*:\s*\[\s*({.+?})\s*\]', body, 'source')
+
+def get_attr(key):
+return self._search_regex(
+r'\b%s\s*:\s*([\'"])(?P<val>(?:(?!\1).)+)\1' % key,
+source, key, group='val')
+
+if get_attr('integration') == 'verizon-media':
+m3u8_url = 'https://content.uplynk.com/%s.m3u8' % get_attr('id')
 formats = self._extract_m3u8_formats(
 m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls')
 text_tracks = self._search_regex(
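The Zype change above stops parsing the `sources` entry as JSON and instead pulls individual quoted attributes out of the raw JavaScript text with a regex. The following sketch reproduces that lookup with plain `re`. The sample `source` string is invented, presumably resembling an object literal that would not survive JSON conversion, and this standalone `get_attr` takes the text as an argument and returns `None` on a miss, which differs from `_search_regex` raising by default.

```python
import re


def get_attr(source, key):
    """Return the quoted value of `key` in a JS object literal, or None if absent."""
    mobj = re.search(
        r'\b%s\s*:\s*([\'"])(?P<val>(?:(?!\1).)+)\1' % key,  # same pattern as the diff
        source)
    return mobj.group('val') if mobj else None


# Invented sample; the function body makes it invalid as JSON.
source = '''{
    integration: "verizon-media",
    id: 'abc123',
    onReady: function() { return 1; }
}'''

integration = get_attr(source, 'integration')
m3u8_url = None
if integration == 'verizon-media':
    m3u8_url = 'https://content.uplynk.com/%s.m3u8' % get_attr(source, 'id')
print(integration, m3u8_url)
```

The backreference `\1` ties the closing quote to whichever quote character opened the value, so both single- and double-quoted attributes are accepted.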

View File

@ -689,6 +689,10 @@ def parseOpts(overrideArguments=None):
 '-o', '--output',
 dest='outtmpl', metavar='TEMPLATE',
 help=('Output filename template, see the "OUTPUT TEMPLATE" for all the info'))
+filesystem.add_option(
+'--output-na-placeholder',
+dest='outtmpl_na_placeholder', metavar='PLACEHOLDER', default='NA',
+help=('Placeholder value for unavailable meta fields in output filename template (default is "%default")'))
 filesystem.add_option(
 '--autonumber-size',
 dest='autonumber_size', metavar='NUMBER', type=int,
@ -782,7 +786,7 @@ def parseOpts(overrideArguments=None):
 postproc.add_option(
 '-x', '--extract-audio',
 action='store_true', dest='extractaudio', default=False,
-help='Convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)')
+help='Convert video files to audio-only files (requires ffmpeg/avconv and ffprobe/avprobe)')
 postproc.add_option(
 '--audio-format', metavar='FORMAT', dest='audioformat', default='best',
 help='Specify audio format: "best", "aac", "flac", "mp3", "m4a", "opus", "vorbis", or "wav"; "%default" by default; No effect without -x')
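To illustrate what the new `--output-na-placeholder` option is for, here is a small standalone sketch using `optparse`. The option definition mirrors the diff; the `render` function and the `defaultdict` substitution are assumptions about how a placeholder could be applied to `%(field)s` templates, since the code that consumes the option is not part of this section.

```python
import collections
import optparse

parser = optparse.OptionParser()
parser.add_option(
    '--output-na-placeholder',
    dest='outtmpl_na_placeholder', metavar='PLACEHOLDER', default='NA',
    help='Placeholder value for unavailable meta fields in output filename template (default is "%default")')


def render(template, info, argv):
    """Hypothetical helper: fill a %(field)s template, using the placeholder for missing fields."""
    opts, _ = parser.parse_args(argv)
    # Missing keys fall back to the placeholder instead of raising KeyError
    fields = collections.defaultdict(lambda: opts.outtmpl_na_placeholder, info)
    return template % fields


info = {'title': 'clip', 'ext': 'mp4'}  # 'uploader' is intentionally missing
default_name = render('%(title)s-%(uploader)s.%(ext)s', info, [])
custom_name = render('%(title)s-%(uploader)s.%(ext)s', info, ['--output-na-placeholder', 'unknown'])
print(default_name, custom_name)
```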

View File

@ -1,3 +1,3 @@
 from __future__ import unicode_literals
-__version__ = '2021.01.08'
+__version__ = '2021.02.04.1'