Compare commits

...

114 Commits

Author SHA1 Message Date
Sergey M․
091200c368 release 2019.04.30 2019-04-30 06:11:50 +07:00
Sergey M․
67bfbe4942 [ChangeLog] Actualize
[ci skip]
2019-04-30 06:08:51 +07:00
Sergey M․
54f3b61216 [openload] Use real Chrome versions (closes #20902) 2019-04-30 05:59:12 +07:00
Sergey M․
a61ce71468 [youtube] Remove info el for get_video_info request
Since it does not work for quite a long time
2019-04-30 04:49:12 +07:00
Sergey M․
026fbedc85 [youtube] Improve extraction robustness
Fail on missing token only when no formats found
2019-04-30 04:32:55 +07:00
Remita Amine
6e07b5a6d5 [dramafever] Remove extractor(closes #20868) 2019-04-28 18:02:41 +01:00
Remita Amine
c464e1df2c [adn] fix subtitle extraction(#12724) 2019-04-28 17:50:47 +01:00
Remita Amine
92bc97d398 [youtube] extract album from Music in this video section(#20301) 2019-04-28 17:38:20 +01:00
Sergey M․
f916abc0ac [ccc] Improve extraction (closes #14601, closes #20355) 2019-04-28 23:08:09 +07:00
Tobias Gruetzmacher
24510bdcfa [ccc] Extract creator 2019-04-28 23:07:41 +07:00
Tobias Kunze
ae8c13565e [ccc:playlist] Add extractor 2019-04-28 23:07:01 +07:00
Remita Amine
280913800d [sverigesradio] improve extraction(closes #18635) 2019-04-28 12:03:39 +01:00
Mattias Wadman
7ff8ad80f1 [sverigesradio] Add extractor 2019-04-28 10:07:06 +01:00
Remita Amine
4e4db743e7 [cinemax] Add new extractor 2019-04-28 00:42:55 +01:00
Remita Amine
3545d38bfb [sixplay] add missing parenthesis 2019-04-27 10:32:53 +01:00
Remita Amine
2309d6bf92 [sixplay] try to extract non drm protected manifests(closes #20849) 2019-04-27 10:17:34 +01:00
Remita Amine
822b9d9cb0 [youtube] improve Youtube Music Auto-generated description parsing(closes #20742) 2019-04-27 09:16:17 +01:00
quinlander
5caabd3c70 [youtube] Extract additional meta data from video description on youtube music videos
YouTube music videos often have auto-generated video descriptions that can be
utilized to extract additional information about the video. This is desirable
in order to provide the user with as much meta data as possible. This commit
adds extraction methods for the following fields for youtube music videos:
- artist (fallback extraction methods added)
- track (fallback extraction methods added)
- album (new in this commit)
- release_date (new in this commit)
- release_year (new in this commit)

4 tests have been added to test this new functionality:
- YoutubeIE tests 27, 28, 29, and 30

Resolves: #20599
2019-04-27 09:09:54 +01:00
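For context, the auto-generated descriptions this commit targets follow a fairly fixed layout ("Provided to YouTube by <label>", then "<track> · <artist>", then "<album>", then "Released on: <YYYY-MM-DD>"). Below is a minimal sketch of the kind of parsing involved; the helper name and regexes are illustrative, not the actual extractor code:

```
import re


def parse_music_description(description):
    # Hypothetical helper: recover track/artist/album and release date from
    # an auto-generated YouTube Music description of the usual layout.
    meta = {}
    mobj = re.search(
        r'(?m)^(?P<track>[^·\n]+)·(?P<artist>[^\n]+)\n+(?P<album>[^\n]+)',
        description)
    if mobj:
        meta.update((k, mobj.group(k).strip()) for k in ('track', 'artist', 'album'))
    mobj = re.search(r'Released on\s*:\s*(\d{4})-(\d{2})-(\d{2})', description)
    if mobj:
        meta['release_date'] = ''.join(mobj.groups())  # YYYYMMDD, youtube-dl's convention
        meta['release_year'] = int(mobj.group(1))
    return meta
```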
Jakub Wilk
aa05a093bb [wrzuta] Remove extractor (closes #20684) (#20801)
Wrzuta.pl was shut down in 2017.
2019-04-27 05:12:15 +07:00
Sergey M․
60e67c5b2c [twitch] Prefer source format (closes #20850) 2019-04-27 05:08:27 +07:00
Sergey M․
eefa0f2157 Move issue template templates into separate folder 2019-04-27 04:55:30 +07:00
Sergey M․
6f366ef30c Issue template overhaul 2019-04-27 04:50:47 +07:00
Mao Zedong
88b547492f [twitcasting] Add support for private videos (#20843) 2019-04-26 16:17:40 +00:00
Mao Zedong
00a9a25cf9 [twitcasting] Fix test: video title (#20840) 2019-04-26 09:34:23 +00:00
Remita Amine
97abf05ad3 [reddit] check thumbnail URL(closes #20030) 2019-04-26 10:26:51 +01:00
Sergey M․
da668a23bd [ISSUE_TEMPLATE.md] Add entry on argument escaping in make-sure checklist
[ci skip]
2019-04-26 00:46:41 +07:00
Remita Amine
58ef5e7881 [yandexmusic] fix track url extraction(closes #20820) 2019-04-25 11:36:44 +01:00
Sergey M․
3e7ec5330a release 2019.04.24 2019-04-24 10:05:54 +07:00
Sergey M․
98933c14e1 [ChangeLog] Actualize
[ci skip]
2019-04-24 10:05:08 +07:00
Sergey M․
56667d622c [youtube] Fix extraction (closes #20758, closes #20759, closes #20761, closes #20762, closes #20764, closes #20766, closes #20767, closes #20769, closes #20771, closes #20768, closes #20770) 2019-04-24 09:58:00 +07:00
Remita Amine
50d660479d [toutv] fix extraction and extract series info(closes #20757) 2019-04-24 00:28:00 +01:00
Remita Amine
1fa8893734 [vrv] add support for movie listings(closes #19229) 2019-04-22 23:50:37 +01:00
Remita Amine
15be3eb5e5 [youtube] raise ExtractorError when no data available(#20737) 2019-04-22 20:52:43 +01:00
Sergey M․
e09965d550 [soundcloud] Add support for new rendition and improve extraction (closes #20699) 2019-04-23 00:39:36 +07:00
Remita Amine
3fd86cfe13 [ooyala] add support for geo verification proxy 2019-04-22 10:04:56 +01:00
Remita Amine
fdc2183650 [nrl] Add new extractor(closes #15991) 2019-04-22 10:04:00 +01:00
Remita Amine
85b6335d55 [vimeo] extract live archive source format(#19144) 2019-04-21 21:05:58 +01:00
Remita Amine
c25720ef6a [vimeo] add support live streams and improve info extraction(closes #19144) 2019-04-21 17:20:52 +01:00
Remita Amine
c9b19d7a55 [ntvcojp] Add new extractor 2019-04-21 14:51:26 +01:00
Remita Amine
47cfa00516 [nhk] extract rtmpt format 2019-04-21 13:25:04 +01:00
Remita Amine
061d1cd948 [nhk] add support for audio URLs 2019-04-21 13:17:22 +01:00
Remita Amine
5de538787d [udemy] add another course id extraction pattern(closes #20491) 2019-04-19 20:44:59 +01:00
Sergey M․
9abeefd527 [openload] Add support for oload.services (closes #20691) 2019-04-18 23:56:20 +07:00
ealgase
f3914b06a0 [openload] Add support for openloed.co (closes #20691)
While the .co could be captured directly, I anticipate that there will be more TLD's for openloed in the future.
2019-04-18 01:51:32 +07:00
Remita Amine
81d989c21e [bravotv] fix extraction(closes #19213) 2019-04-18 01:50:30 +07:00
Sergey M․
cd6c75b05f release 2019.04.17 2019-04-18 01:50:25 +07:00
Sergey M․
9846935256 [ChangeLog] Actualize
[ci skip]
2019-04-17 00:15:48 +07:00
Sergey M․
7fc3b68ad3 [openload] Randomize User-Agent (closes #20688) 2019-04-17 00:08:50 +07:00
Sergey M
c4341ea47e [openload] Add support for oladblock domains (#20471) 2019-04-16 23:50:04 +07:00
Remita Amine
e6c9ae31df [adn] fix subtitle extraction(#12724) 2019-04-16 13:04:13 +01:00
ealgase
6104cc1591 [openload] add test for oladblock.me 2019-04-15 23:09:05 -04:00
ealgase
f114e43d38 [openload] add oladblock.me domain 2019-04-15 23:08:28 -04:00
Remita Amine
cb6cd76f7b [aol] add support for localized websites 2019-04-14 23:18:36 +01:00
Remita Amine
0b758fea1c [yahoo] add support GYAO episode URLs 2019-04-14 15:01:01 +01:00
Remita Amine
3534b6329a [yahoo] add support for streaming.yahoo.co.jp(closes #5811)(closes #7098) 2019-04-14 14:39:20 +01:00
Remita Amine
174f62992d [yahoo] add support for gyao.yahoo.co.jp 2019-04-14 14:29:04 +01:00
Remita Amine
1038532213 [aenetworks] add encoding declaration 2019-04-14 13:18:16 +01:00
Remita Amine
4f1e02ad60 [aenetworks] fix history topic extraction and extract more formats 2019-04-14 11:46:33 +01:00
Remita Amine
180a9dff1f [cbs] extract smpte and vtt subtitles 2019-04-13 17:02:22 +01:00
Sergey M
972d2dd0bc [streamango] add support for streamcherry.com (#20592) 2019-04-13 15:05:24 +07:00
DaMightyZombie
11edb76610 [README.md] Rephrase usage example comment (#20614) 2019-04-13 15:03:42 +07:00
JChris246
8721b09751 [yourporn] Add support for sxyprn.com (#20646) 2019-04-13 15:02:09 +07:00
Remita Amine
dc27fd8bb8 [mgtv] fix extraction(closes #20650) 2019-04-12 09:19:09 +01:00
Remita Amine
c912029480 [linkedin:learning] use urljoin for form action url(closes #20431) 2019-04-11 08:44:58 +01:00
Remita Amine
118f7add3b [gdc] add support for kaltura embeds and update tests(closes #20575) 2019-04-09 11:23:47 +01:00
Remita Amine
4bc12b8f81 [dispeak] improve mp4 bitrate extraction 2019-04-09 11:21:46 +01:00
Remita Amine
5ca3459828 [kaltura] sanitize embed URLs 2019-04-09 11:20:26 +01:00
Remita Amine
9c017253e8 [jwplatfom] do not match manifest URLs(#20596) 2019-04-08 16:34:03 +01:00
Remita Amine
9045d28b5e [aol] restrict url regex and improve format extraction 2019-04-07 21:31:26 +01:00
Sergey M․
7c2ecbc1cc [tiktok] Add support for new URL schema (closes #20573) 2019-04-07 21:06:09 +07:00
Remita Amine
d562cac9dc [stv:player] Add new extractor(closes #20586) 2019-04-07 12:40:14 +01:00
ealgase
9ed06812ec [streamango] add support for streamcherry.com 2019-04-06 23:59:41 -04:00
ealgase
bf6fb8b9dc [openload] add tests 2019-04-06 23:38:40 -04:00
Sergey M․
a46d9e5b41 release 2019.04.07 2019-04-07 04:19:46 +07:00
Sergey M․
aa5338118e [ChangeLog] Actualize
[ci skip]
2019-04-07 04:16:45 +07:00
Sergey M․
8410653f24 [ruutu] Add support for audio podcasts (closes #20473, closes #20545) 2019-04-07 03:18:10 +07:00
Sergey M․
f4da808036 [xvideos] Extract all thumbnails (closes #20432) 2019-04-07 02:59:09 +07:00
Martin Michlmayr
f412970164 [README.md] Fix lists formatting (closes #20558)
Lists have to be separated from the previous paragraph by a blank line
in certain variants of Markdown, otherwise they are not interpreted as
lists.

This change ensures that that the youtube-dl.1 man page, which is
generated from README.md with the help of pandoc, is formatted
correctly.
2019-04-07 02:28:31 +07:00
Sergey M․
059cd768b9 [vk] Remove unused import 2019-04-07 02:17:54 +07:00
Sergey M․
c701472fc9 [platzi] Add extractor (closes #20562) 2019-04-07 02:15:52 +07:00
Remita Amine
19591facea [dvtv] remove unnecessary comments and spaces 2019-04-06 16:45:11 +01:00
Jan Friesse
b9aad6c427 [dvtv] Fix extraction (closes #18514) 2019-04-06 16:37:07 +01:00
Remita Amine
9f182c23ba [vrv] add basic support for individual movie links(#19229) 2019-04-06 09:22:25 +01:00
Remita Amine
4810655cd6 [bfi:player] Add new extractor(#19235) 2019-04-05 19:35:35 +01:00
Remita Amine
a7978f8e2a [hbo] fix extraction and extract subtitles(closes #14629)(closes #13709) 2019-04-05 18:08:43 +01:00
Remita Amine
19041a3877 [youtube] extract srv[1-3] subtitle formats(#20566) 2019-04-05 16:18:57 +01:00
Remita Amine
afb7496416 [adultswim] fix extraction(closes #18025) 2019-04-05 11:45:49 +01:00
Remita Amine
69e6efac16 [teamcoco] fix extraction and add suport for subdomains(closes #17099)(closes #20339) 2019-04-05 08:26:04 +01:00
Remita Amine
2bbde1d09a [adn] fix subtitle compatibility with ffmpeg 2019-04-04 17:59:20 +01:00
Remita Amine
b966740cf7 [adn] fix extraction and add support for positioning styles(closes #20549) 2019-04-04 14:50:16 +01:00
Remita Amine
220828f2d6 [vk] use a more unique video id(closes #17848) 2019-04-03 11:08:42 +01:00
Remita Amine
977a782110 [rtl2] update player_url 2019-04-03 10:20:01 +01:00
Remita Amine
a2b6f946f1 [newstube] fix extraction 2019-04-03 10:19:36 +01:00
Remita Amine
4f7db46887 [rtl2] improve _VALID_URL regex 2019-04-03 01:00:02 +01:00
Remita Amine
d7d86fdd49 [download/external] pass rtmp_conn to ffmpeg 2019-04-02 22:41:23 +01:00
Remita Amine
f8987163fb [adobeconnect] Add new extractor(closes #20283) 2019-04-02 22:40:39 +01:00
Remita Amine
313e8b2b18 [gaia] add support for authentication(closes #14605) 2019-04-02 15:50:06 +01:00
Sergey M․
c0b7d11713 [YoutubeDL] Add ffmpeg_location to post processor options (closes #20532) 2019-04-02 01:29:44 +07:00
Sergey M․
efee62ac7f [mediasite] Add support for dashed ids and named catalogs (closes #20531) 2019-04-02 01:13:52 +07:00
Sergey M․
38287d251d release 2019.04.01 2019-04-01 23:55:17 +07:00
Sergey M․
25d9243141 [ChangeLog] Actualize
[ci skip]
2019-04-01 23:53:28 +07:00
RexYuan
93bb6b1bae [weibo] Extend _VALID_URL (#20496) 2019-03-31 01:31:33 +07:00
Sergey M․
b43c5f474a [xhamster] Add support for xhamster.one (closes #20508) 2019-03-31 01:27:45 +07:00
Sergey M․
4014a48622 [mediasite:catalog] Add extractor (closes #20507) 2019-03-31 01:21:53 +07:00
Remita Amine
99fe330070 [teamtreehouse] Add new extractor(closes #9836) 2019-03-28 16:55:57 +01:00
Remita Amine
c4c888697e [ina] add support for audio URLs 2019-03-27 18:49:29 +01:00
Remita Amine
b27a71e66c [ina] improve extraction 2019-03-27 18:29:24 +01:00
Remita Amine
de74ef83b7 [cwtv] fix episode number extraction(closes #20461) 2019-03-27 18:01:51 +01:00
ealgase
cf3d399727 [openload] add support for oladblock.services and oladblock.xyz domains 2019-03-25 12:04:31 -04:00
Sergey M․
8cb10807ed [npo] Improve DRM detection 2019-03-23 21:43:50 +07:00
Sergey M․
b8526c78f9 [pornhub] Add support for DASH formats (closes #20403) 2019-03-23 01:09:33 +07:00
Sergey M․
5e1271c56d [utils] Improve int_or_none and float_or_none (#20403) 2019-03-23 01:08:54 +07:00
Jesse de Zwart
050afa60c6 Check for valid --min-sleep-interval when --max-sleep-interval is specified 2019-03-21 22:55:03 +07:00
Sergey M․
c4580580f5 [svtplay] Update API endpoint (closes #20430) 2019-03-21 22:39:35 +07:00
86 changed files with 4673 additions and 1300 deletions

.github/ISSUE_TEMPLATE.md

@@ -1,61 +0,0 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use the *Preview* tab to see what your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2019.03.18*. If it's not, read [this FAQ entry](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2019.03.18**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/ytdl-org/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/ytdl-org/youtube-dl#faq) and [BUGS](https://github.com/ytdl-org/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2019.03.18
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

.github/ISSUE_TEMPLATE/1_broken_site.md vendored Normal file

@@ -0,0 +1,63 @@
---
name: Broken site support
about: Report broken or misfunctioning site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.04.30. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2019.04.30**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2019.04.30
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE/2_site_support_request.md

@@ -0,0 +1,54 @@
---
name: Site support request
about: Request support for a new site
title: ''
labels: 'site-support-request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.04.30. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2019.04.30**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones
## Example URLs
<!--
Provide all kinds of example URLs support for which should be included. Replace following example URLs by yours.
-->
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
## Description
<!--
Provide any additional information.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE/3_site_feature_request.md

@@ -0,0 +1,37 @@
---
name: Site feature request
about: Request a new functionality for a site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.04.30. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2019.04.30**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
## Description
<!--
Provide an explanation of your site feature request in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE/4_bug_report.md vendored Normal file

@@ -0,0 +1,65 @@
---
name: Bug report
about: Report a bug unrelated to any particular site or extractor
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.04.30. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Read bugs section in FAQ: http://yt-dl.org/reporting
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2019.04.30**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
- [ ] I've read bugs section in FAQ
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2019.04.30
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE/5_feature_request.md

@@ -0,0 +1,38 @@
---
name: Feature request
about: Request a new functionality unrelated to any particular site or extractor
title: ''
labels: 'request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.04.30. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2019.04.30**
- [ ] I've searched the bugtracker for similar feature requests including closed ones
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE/6_question.md vendored Normal file

@@ -0,0 +1,38 @@
---
name: Ask question
about: Ask youtube-dl related question
title: ''
labels: 'question'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- Look through the README (http://yt-dl.org/readme) and FAQ (http://yt-dl.org/faq) for similar questions
- Search the bugtracker for similar questions: http://yt-dl.org/search-issues
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm asking a question
- [ ] I've looked through the README and FAQ for similar questions
- [ ] I've searched the bugtracker for similar questions including closed ones
## Question
<!--
Ask your question in an arbitrary form. Please make sure it's worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient.
-->
WRITE QUESTION HERE

.github/ISSUE_TEMPLATE_tmpl.md

@@ -1,61 +0,0 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use the *Preview* tab to see what your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not, read [this FAQ entry](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/ytdl-org/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/ytdl-org/youtube-dl#faq) and [BUGS](https://github.com/ytdl-org/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

.github/ISSUE_TEMPLATE_tmpl/1_broken_site.md

@@ -0,0 +1,63 @@
---
name: Broken site support
about: Report broken or misfunctioning site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md

@@ -0,0 +1,54 @@
---
name: Site support request
about: Request support for a new site
title: ''
labels: 'site-support-request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones
## Example URLs
<!--
Provide all kinds of example URLs support for which should be included. Replace following example URLs by yours.
-->
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
## Description
<!--
Provide any additional information.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md

@@ -0,0 +1,37 @@
---
name: Site feature request
about: Request a new functionality for a site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
## Description
<!--
Provide an explanation of your site feature request in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE_tmpl/4_bug_report.md

@@ -0,0 +1,65 @@
---
name: Bug report
about: Report a bug unrelated to any particular site or extractor
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Read bugs section in FAQ: http://yt-dl.org/reporting
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
- [ ] I've read bugs section in FAQ
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

.github/ISSUE_TEMPLATE_tmpl/5_feature_request.md

@@ -0,0 +1,38 @@
---
name: Feature request
about: Request a new functionality unrelated to any particular site or extractor
title: ''
labels: 'request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've searched the bugtracker for similar feature requests including closed ones
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

ChangeLog

@@ -1,3 +1,117 @@
version 2019.04.30

Extractors
* [openload] Use real Chrome versions (#20902)
- [youtube] Remove info el for get_video_info request
* [youtube] Improve extraction robustness
- [dramafever] Remove extractor (#20868)
* [adn] Fix subtitle extraction (#12724)
+ [ccc] Extract creator (#20355)
+ [ccc:playlist] Add support for media.ccc.de playlists (#14601, #20355)
+ [sverigesradio] Add support for sverigesradio.se (#18635)
+ [cinemax] Add support for cinemax.com
* [sixplay] Try extracting non-DRM protected manifests (#20849)
+ [youtube] Extract Youtube Music Auto-generated metadata (#20599, #20742)
- [wrzuta] Remove extractor (#20684, #20801)
* [twitch] Prefer source format (#20850)
+ [twitcasting] Add support for private videos (#20843)
* [reddit] Validate thumbnail URL (#20030)
* [yandexmusic] Fix track URL extraction (#20820)


version 2019.04.24

Extractors
* [youtube] Fix extraction (#20758, #20759, #20761, #20762, #20764, #20766,
  #20767, #20769, #20771, #20768, #20770)
* [toutv] Fix extraction and extract series info (#20757)
+ [vrv] Add support for movie listings (#19229)
+ [youtube] Print error when no data is available (#20737)
+ [soundcloud] Add support for new rendition and improve extraction (#20699)
+ [ooyala] Add support for geo verification proxy
+ [nrl] Add support for nrl.com (#15991)
+ [vimeo] Extract live archive source format (#19144)
+ [vimeo] Add support for live streams and improve info extraction (#19144)
+ [ntvcojp] Add support for cu.ntv.co.jp
+ [nhk] Extract RTMPT format
+ [nhk] Add support for audio URLs
+ [udemy] Add another course id extraction pattern (#20491)
+ [openload] Add support for oload.services (#20691)
+ [openload] Add support for openloed.co (#20691, #20693)
* [bravotv] Fix extraction (#19213)


version 2019.04.17

Extractors
* [openload] Randomize User-Agent (closes #20688)
+ [openload] Add support for oladblock domains (#20471)
* [adn] Fix subtitle extraction (#12724)
+ [aol] Add support for localized websites
+ [yahoo] Add support GYAO episode URLs
+ [yahoo] Add support for streaming.yahoo.co.jp (#5811, #7098)
+ [yahoo] Add support for gyao.yahoo.co.jp
* [aenetworks] Fix history topic extraction and extract more formats
+ [cbs] Extract smpte and vtt subtitles
+ [streamango] Add support for streamcherry.com (#20592)
+ [yourporn] Add support for sxyprn.com (#20646)
* [mgtv] Fix extraction (#20650)
* [linkedin:learning] Use urljoin for form action URL (#20431)
+ [gdc] Add support for kaltura embeds (#20575)
* [dispeak] Improve mp4 bitrate extraction
* [kaltura] Sanitize embed URLs
* [jwplatfom] Do not match manifest URLs (#20596)
* [aol] Restrict URL regular expression and improve format extraction
+ [tiktok] Add support for new URL schema (#20573)
+ [stv:player] Add support for player.stv.tv (#20586)


version 2019.04.07

Core
+ [downloader/external] Pass rtmp_conn to ffmpeg

Extractors
+ [ruutu] Add support for audio podcasts (#20473, #20545)
+ [xvideos] Extract all thumbnails (#20432)
+ [platzi] Add support for platzi.com (#20562)
* [dvtv] Fix extraction (#18514, #19174)
+ [vrv] Add basic support for individual movie links (#19229)
+ [bfi:player] Add support for player.bfi.org.uk (#19235)
* [hbo] Fix extraction and extract subtitles (#14629, #13709)
* [youtube] Extract srv[1-3] subtitle formats (#20566)
* [adultswim] Fix extraction (#18025)
* [teamcoco] Fix extraction and add suport for subdomains (#17099, #20339)
* [adn] Fix subtitle compatibility with ffmpeg
* [adn] Fix extraction and add support for positioning styles (#20549)
* [vk] Use unique video id (#17848)
* [newstube] Fix extraction
* [rtl2] Actualize extraction
+ [adobeconnect] Add support for adobeconnect.com (#20283)
+ [gaia] Add support for authentication (#14605)
+ [mediasite] Add support for dashed ids and named catalogs (#20531)


version 2019.04.01

Core
* [utils] Improve int_or_none and float_or_none (#20403)
* Check for valid --min-sleep-interval when --max-sleep-interval is
  specified (#20435)

Extractors
+ [weibo] Extend URL regular expression (#20496)
+ [xhamster] Add support for xhamster.one (#20508)
+ [mediasite] Add support for catalogs (#20507)
+ [teamtreehouse] Add support for teamtreehouse.com (#9836)
+ [ina] Add support for audio URLs
* [ina] Improve extraction
* [cwtv] Fix episode number extraction (#20461)
* [npo] Improve DRM detection
+ [pornhub] Add support for DASH formats (#20403)
* [svtplay] Update API endpoint (#20430)


version 2019.03.18

Core

Makefile

@@ -1,7 +1,7 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites

clean:
	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
	find . -name "*.pyc" -delete
	find . -name "*.class" -delete
@@ -78,8 +78,12 @@ README.md: youtube_dl/*.py youtube_dl/*/*.py
CONTRIBUTING.md: README.md
	$(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md

.github/ISSUE_TEMPLATE.md: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md youtube_dl/version.py
	$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md .github/ISSUE_TEMPLATE.md

issuetemplates: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md youtube_dl/version.py
	$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE/1_broken_site.md
	$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE/2_site_support_request.md
	$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md
	$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE/4_bug_report.md
	$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md .github/ISSUE_TEMPLATE/5_feature_request.md

supportedsites:
	$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md
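Each issuetemplates recipe above runs the same small generator once per template. A minimal sketch of what devscripts/make_issue_template.py is expected to do, assuming it only fills the %(version)s placeholders (the real script may differ in details):

```
from __future__ import unicode_literals

import io
import optparse

from youtube_dl.version import __version__


def main():
    parser = optparse.OptionParser(usage='%prog INFILE OUTFILE')
    _, args = parser.parse_args()
    if len(args) != 2:
        parser.error('Expected an input and an output filename')
    infile, outfile = args

    with io.open(infile, encoding='utf-8') as inf:
        issue_template_tmpl = inf.read()

    # %(version)s in the _tmpl files becomes the concrete release version
    with io.open(outfile, 'w', encoding='utf-8') as outf:
        outf.write(issue_template_tmpl % {'version': __version__})


if __name__ == '__main__':
    main()
```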

README.md

@@ -642,6 +642,7 @@ The simplest case is requesting a specific format, for example with `-f 22` you
You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download the best quality format of a particular file extension served as a single file, e.g. `-f webm` will download the best quality format with the `webm` extension served as a single file.
You can also use special names to select particular edge case formats:
- `best`: Select the best quality format represented by a single file with video and audio.
- `worst`: Select the worst quality format represented by a single file with video and audio.
- `bestvideo`: Select the best quality video-only format (e.g. DASH video). May not be available.
@@ -658,6 +659,7 @@ If you want to download several formats of the same video use a comma as a separ
You can also filter the video formats by putting a condition in brackets, as in `-f "best[height=720]"` (or `-f "[filesize>10M]"`).
The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals):
- `filesize`: The number of bytes, if known in advance
- `width`: Width of the video, if known
- `height`: Height of the video, if known
@@ -668,6 +670,7 @@ The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `
- `fps`: Frame rate
Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains) and following string meta fields:
- `ext`: File extension
- `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use
@@ -697,7 +700,7 @@ Note that on Windows you may need to use double quotes instead of single.
# Download best mp4 format available or any other best if no mp4 available
$ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
# Download best format available but not better that 480p
# Download best format available but no better than 480p
$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
# Download best video only format but no bigger than 50 MB
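The same selector syntax works when youtube-dl is embedded as a Python library; a quick sketch (the URL is the test video used in the issue templates above):

```
import youtube_dl

ydl_opts = {
    # identical to the -f argument on the command line
    'format': 'bestvideo[height<=480]+bestaudio/best[height<=480]',
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
```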

devscripts/release.sh

@@ -78,8 +78,8 @@ sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
sed -i "s/<unreleased>/$version/" ChangeLog
/bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py ChangeLog
make README.md CONTRIBUTING.md issuetemplates supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE/1_broken_site.md .github/ISSUE_TEMPLATE/2_site_support_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md .github/ISSUE_TEMPLATE/4_bug_report.md .github/ISSUE_TEMPLATE/5_feature_request.md .github/ISSUE_TEMPLATE/6_question.md docs/supportedsites.md youtube_dl/version.py ChangeLog
git commit $gpg_sign_commits -m "release $version"
/bin/echo -e "\n### Now tagging, signing and pushing..."

docs/supportedsites.md

@@ -28,6 +28,7 @@
- **acast:channel**
- **AddAnime**
- **ADN**: Anime Digital Network
- **AdobeConnect**
- **AdobeTV**
- **AdobeTVChannel**
- **AdobeTVShow**
@@ -45,6 +46,7 @@
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **AnimeOnDemand**
- **Anvato**
- **aol.com**
- **APA**
- **Aparat**
- **AppleConnect**
@@ -101,6 +103,7 @@
- **Bellator**
- **BellMedia**
- **Bet**
- **bfi:player**
- **Bigflix**
- **Bild**: Bild.de
- **BiliBili**
@@ -161,6 +164,7 @@
- **chirbit**
- **chirbit:profile**
- **Cinchcast**
- **Cinemax**
- **CiscoLiveSearch**
- **CiscoLiveSession**
- **CJSW**
@@ -198,6 +202,7 @@
- **CSpan**: C-SPAN
- **CtsNews**: 華視新聞
- **CTVNews**
- **cu.ntv.co.jp**: Nippon Television Network
- **Culturebox**
- **CultureUnplugged**
- **curiositystream**
@@ -233,8 +238,6 @@
- **DouyuTV**: 斗鱼
- **DPlay**
- **DPlayIt**
- **dramafever**
- **dramafever:series**
- **DRBonanza**
- **Dropbox**
- **DrTuber**
@@ -345,7 +348,6 @@
- **Groupon**
- **Hark**
- **hbo**
- **hbo:episode**
- **HearThisAt**
- **Heise**
- **HellPorno**
@@ -485,9 +487,12 @@
- **MatchTV**
- **MDR**: MDR.DE and KiKA
- **media.ccc.de**
- **media.ccc.de:lists**
- **Medialaan**
- **Mediaset**
- **Mediasite**
- **MediasiteCatalog**
- **MediasiteNamedCatalog**
- **Medici**
- **megaphone.fm**: megaphone.fm embedded players
- **Meipai**: 美拍
@@ -620,6 +625,7 @@
- **NRKTVEpisodes**
- **NRKTVSeason**
- **NRKTVSeries**
- **NRLTV**
- **ntv.ru**
- **Nuvid**
- **NYTimes**
@@ -629,7 +635,6 @@
- **OdaTV**
- **Odnoklassniki**
- **OktoberfestTV**
- **on.aol.com**
- **OnDemandKorea**
- **onet.pl**
- **onet.tv**
@@ -670,6 +675,8 @@
- **Piksel**
- **Pinkbike**
- **Pladform**
- **Platzi**
- **PlatziCourse**
- **play.fm**
- **PlayPlusTV**
- **PlaysTV**
@@ -848,7 +855,10 @@
- **StreamCZ**
- **StreetVoice**
- **StretchInternet**
- **stv:player**
- **SunPorno**
- **sverigesradio:episode**
- **sverigesradio:publication**
- **SVT**
- **SVTPage**
- **SVTPlay**: SVT Play and Öppet arkiv
@@ -869,6 +879,7 @@
- **teachertube:user:collection**: teachertube.com user and collection videos
- **TeachingChannel**
- **Teamcoco**
- **TeamTreeHouse**
- **TechTalks**
- **techtv.mit.edu**
- **ted**
@@ -1093,8 +1104,6 @@
- **Wistia**
- **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **WorldStarHipHop**
- **wrzuta.pl**
- **wrzuta.pl:playlist**
- **WSJ**: Wall Street Journal
- **WSJArticle**
- **WWE**
@@ -1118,6 +1127,8 @@
- **XVideos**
- **XXXYMovies**
- **Yahoo**: Yahoo screen and movies
- **yahoo:gyao**
- **yahoo:gyao:player**
- **YandexDisk**
- **yandexmusic:album**: Яндекс.Музыка - Альбом
- **yandexmusic:playlist**: Яндекс.Музыка - Плейлист

test/test_utils.py

@@ -33,11 +33,13 @@ from youtube_dl.utils import (
    ExtractorError,
    find_xpath_attr,
    fix_xml_ampersands,
    float_or_none,
    get_element_by_class,
    get_element_by_attribute,
    get_elements_by_class,
    get_elements_by_attribute,
    InAdvancePagedList,
    int_or_none,
    intlist_to_bytes,
    is_html,
    js_to_json,
@@ -468,6 +470,21 @@ class TestUtil(unittest.TestCase):
            shell_quote(args),
            """ffmpeg -i 'ñ€ß'"'"'.mp4'""" if compat_os_name != 'nt' else '''ffmpeg -i "ñ€ß'.mp4"''')

    def test_float_or_none(self):
        self.assertEqual(float_or_none('42.42'), 42.42)
        self.assertEqual(float_or_none('42'), 42.0)
        self.assertEqual(float_or_none(''), None)
        self.assertEqual(float_or_none(None), None)
        self.assertEqual(float_or_none([]), None)
        self.assertEqual(float_or_none(set()), None)

    def test_int_or_none(self):
        self.assertEqual(int_or_none('42'), 42)
        self.assertEqual(int_or_none(''), None)
        self.assertEqual(int_or_none(None), None)
        self.assertEqual(int_or_none([]), None)
        self.assertEqual(int_or_none(set()), None)

    def test_str_to_int(self):
        self.assertEqual(str_to_int('123,456'), 123456)
        self.assertEqual(str_to_int('123.456'), 123456)
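Implementations consistent with these tests look roughly like the sketch below (the real youtube_dl.utils versions take extra scale/default parameters); the new tests pin down the behavior that non-numeric inputs such as lists and sets return None instead of raising:

```
def int_or_none(v, default=None):
    if v is None:
        return default
    try:
        return int(v)
    except (ValueError, TypeError):  # '' raises ValueError, [] raises TypeError
        return default


def float_or_none(v, default=None):
    if v is None:
        return default
    try:
        return float(v)
    except (ValueError, TypeError):
        return default
```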

youtube_dl/YoutubeDL.py

@@ -309,6 +309,8 @@ class YoutubeDL(object):
    The following options are used by the post processors:
    prefer_ffmpeg:     If False, use avconv instead of ffmpeg if both are available,
                       otherwise prefer ffmpeg.
    ffmpeg_location:   Location of the ffmpeg/avconv binary; either the path
                       to the binary or its containing directory.
    postprocessor_args: A list of additional command-line arguments for the
                        postprocessor.

youtube_dl/__init__.py

@@ -166,6 +166,8 @@ def _real_main(argv=None):
if opts.max_sleep_interval is not None:
if opts.max_sleep_interval < 0:
parser.error('max sleep interval must be positive or 0')
if opts.sleep_interval is None:
parser.error('min sleep interval must be specified, use --min-sleep-interval')
if opts.max_sleep_interval < opts.sleep_interval:
parser.error('max sleep interval must be greater than or equal to min sleep interval')
else:
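The checks above encode a small contract between the two sleep options: a max implies a non-negative value, an explicitly set min, and min <= max. A standalone restatement (function name is illustrative):

    def check_sleep_opts(min_sleep, max_sleep):
        # Mirrors the validation added to _real_main above.
        if max_sleep is not None:
            if max_sleep < 0:
                raise ValueError('max sleep interval must be positive or 0')
            if min_sleep is None:
                raise ValueError('min sleep interval must be specified, use --min-sleep-interval')
            if max_sleep < min_sleep:
                raise ValueError('max sleep interval must be greater than or equal to min sleep interval')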

View File

@@ -289,6 +289,7 @@ class FFmpegFD(ExternalFD):
tc_url = info_dict.get('tc_url')
flash_version = info_dict.get('flash_version')
live = info_dict.get('rtmp_live', False)
conn = info_dict.get('rtmp_conn')
if player_url is not None:
args += ['-rtmp_swfverify', player_url]
if page_url is not None:
@@ -303,6 +304,11 @@ class FFmpegFD(ExternalFD):
args += ['-rtmp_flashver', flash_version]
if live:
args += ['-rtmp_live', 'live']
if isinstance(conn, list):
for entry in conn:
args += ['-rtmp_conn', entry]
elif isinstance(conn, compat_str):
args += ['-rtmp_conn', conn]
args += ['-i', url, '-c', 'copy']
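The rtmp_conn handling above accepts either a single string or a list; each entry becomes its own -rtmp_conn flag on the ffmpeg command line. A quick illustration (values invented):

    conn = ['S:ticket123', 'O:1']
    args = []
    for entry in conn:
        args += ['-rtmp_conn', entry]
    # args == ['-rtmp_conn', 'S:ticket123', '-rtmp_conn', 'O:1']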

View File

@@ -21,7 +21,6 @@ from ..utils import (
intlist_to_bytes,
long_to_bytes,
pkcs1pad,
srt_subtitles_timecode,
strip_or_none,
urljoin,
)
@@ -42,6 +41,18 @@ class ADNIE(InfoExtractor):
}
_BASE_URL = 'http://animedigitalnetwork.fr'
_RSA_KEY = (0xc35ae1e4356b65a73b551493da94b8cb443491c0aa092a357a5aee57ffc14dda85326f42d716e539a34542a0d3f363adf16c5ec222d713d5997194030ee2e4f0d1fb328c01a81cf6868c090d50de8e169c6b13d1675b9eeed1cbc51e1fffca9b38af07f37abd790924cd3bee59d0257cfda4fe5f3f0534877e21ce5821447d1b, 65537)
_POS_ALIGN_MAP = {
'start': 1,
'end': 3,
}
_LINE_ALIGN_MAP = {
'middle': 8,
'end': 4,
}
@staticmethod
def _ass_subtitles_timecode(seconds):
return '%01d:%02d:%02d.%02d' % (seconds / 3600, (seconds % 3600) / 60, seconds % 60, (seconds % 1) * 100)
def _get_subtitles(self, sub_path, video_id):
if not sub_path:
@@ -49,14 +60,20 @@ class ADNIE(InfoExtractor):
enc_subtitles = self._download_webpage(
urljoin(self._BASE_URL, sub_path),
video_id, fatal=False)
video_id, 'Downloading subtitles location', fatal=False) or '{}'
subtitle_location = (self._parse_json(enc_subtitles, video_id, fatal=False) or {}).get('location')
if subtitle_location:
enc_subtitles = self._download_webpage(
urljoin(self._BASE_URL, subtitle_location),
video_id, 'Downloading subtitles data', fatal=False,
headers={'Origin': 'https://animedigitalnetwork.fr'})
if not enc_subtitles:
return None
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(compat_b64decode(enc_subtitles[24:])),
bytes_to_intlist(binascii.unhexlify(self._K + '9032ad7083106400')),
bytes_to_intlist(binascii.unhexlify(self._K + '4b8ef13ec1872730')),
bytes_to_intlist(compat_b64decode(enc_subtitles[:24]))
))
subtitles_json = self._parse_json(
@@ -67,23 +84,27 @@ class ADNIE(InfoExtractor):
subtitles = {}
for sub_lang, sub in subtitles_json.items():
srt = ''
for num, current in enumerate(sub):
start, end, text = (
ssa = '''[Script Info]
ScriptType:V4.00
[V4 Styles]
Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,TertiaryColour,BackColour,Bold,Italic,BorderStyle,Outline,Shadow,Alignment,MarginL,MarginR,MarginV,AlphaLevel,Encoding
Style: Default,Arial,18,16777215,16777215,16777215,0,-1,0,1,1,0,2,20,20,20,0,0
[Events]
Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
for current in sub:
start, end, text, line_align, position_align = (
float_or_none(current.get('startTime')),
float_or_none(current.get('endTime')),
current.get('text'))
current.get('text'), current.get('lineAlign'),
current.get('positionAlign'))
if start is None or end is None or text is None:
continue
srt += os.linesep.join(
(
'%d' % num,
'%s --> %s' % (
srt_subtitles_timecode(start),
srt_subtitles_timecode(end)),
text,
os.linesep,
))
alignment = self._POS_ALIGN_MAP.get(position_align, 2) + self._LINE_ALIGN_MAP.get(line_align, 0)
ssa += os.linesep + 'Dialogue: Marked=0,%s,%s,Default,,0,0,0,,%s%s' % (
self._ass_subtitles_timecode(start),
self._ass_subtitles_timecode(end),
'{\\a%d}' % alignment if alignment != 2 else '',
text.replace('\n', '\\N').replace('<i>', '{\\i1}').replace('</i>', '{\\i0}'))
if sub_lang == 'vostf':
sub_lang = 'fr'
@@ -91,8 +112,8 @@ class ADNIE(InfoExtractor):
'ext': 'json',
'data': json.dumps(sub),
}, {
'ext': 'srt',
'data': srt,
'ext': 'ssa',
'data': ssa,
}])
return subtitles
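The alignment arithmetic above combines the two maps: a horizontal component (default 2, centre) plus a vertical component (default 0, bottom), and the legacy {\aN} override tag is only emitted when the sum differs from the plain default of 2. A worked sketch using the same map values:

    _POS_ALIGN_MAP = {'start': 1, 'end': 3}    # horizontal; default 2 (centre)
    _LINE_ALIGN_MAP = {'middle': 8, 'end': 4}  # vertical; default 0 (bottom)

    def ssa_alignment(position_align, line_align):
        return _POS_ALIGN_MAP.get(position_align, 2) + _LINE_ALIGN_MAP.get(line_align, 0)

    ssa_alignment('start', 'middle')  # 9 -> '{\a9}' (middle-left in legacy SSA numbering)
    ssa_alignment(None, None)         # 2 -> the default, so no override tag is written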
@@ -100,7 +121,15 @@ class ADNIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player_config = self._parse_json(self._search_regex(
r'playerConfig\s*=\s*({.+});', webpage, 'player config'), video_id)
r'playerConfig\s*=\s*({.+});', webpage,
'player config', default='{}'), video_id, fatal=False)
if not player_config:
config_url = urljoin(self._BASE_URL, self._search_regex(
r'(?:id="player"|class="[^"]*adn-player-container[^"]*")[^>]+data-url="([^"]+)"',
webpage, 'config url'))
player_config = self._download_json(
config_url, video_id,
'Downloading player config JSON metadata')['player']
video_info = {}
video_info_str = self._search_regex(
@@ -129,12 +158,15 @@ class ADNIE(InfoExtractor):
encrypted_message = long_to_bytes(pow(bytes_to_long(padded_message), e, n))
authorization = base64.b64encode(encrypted_message).decode()
links_data = self._download_json(
urljoin(self._BASE_URL, links_url), video_id, headers={
urljoin(self._BASE_URL, links_url), video_id,
'Downloading links JSON metadata', headers={
'Authorization': 'Bearer ' + authorization,
})
links = links_data.get('links') or {}
metas = metas or links_data.get('meta') or {}
sub_path = (sub_path or links_data.get('subtitles')) + '&token=' + token
sub_path = sub_path or links_data.get('subtitles') or \
'index.php?option=com_vodapi&task=subtitles.getJSON&format=json&id=' + video_id
sub_path += '&token=' + token
error = links_data.get('error')
title = metas.get('title') or video_info['title']
@@ -142,9 +174,11 @@ class ADNIE(InfoExtractor):
for format_id, qualities in links.items():
if not isinstance(qualities, dict):
continue
for load_balancer_url in qualities.values():
for quality, load_balancer_url in qualities.items():
load_balancer_data = self._download_json(
load_balancer_url, video_id, fatal=False) or {}
load_balancer_url, video_id,
'Downloading %s %s JSON metadata' % (format_id, quality),
fatal=False) or {}
m3u8_url = load_balancer_data.get('location')
if not m3u8_url:
continue

View File

@@ -0,0 +1,37 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urlparse,
)
class AdobeConnectIE(InfoExtractor):
_VALID_URL = r'https?://\w+\.adobeconnect\.com/(?P<id>[\w-]+)'
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title')
qs = compat_parse_qs(self._search_regex(r"swfUrl\s*=\s*'([^']+)'", webpage, 'swf url').split('?')[1])
is_live = qs.get('isLive', ['false'])[0] == 'true'
formats = []
for con_string in qs['conStrings'][0].split(','):
formats.append({
'format_id': con_string.split('://')[0],
'app': compat_urlparse.quote('?' + con_string.split('?')[1] + 'flvplayerapp/' + qs['appInstance'][0]),
'ext': 'flv',
'play_path': 'mp4:' + qs['streamName'][0],
'rtmp_conn': 'S:' + qs['ticket'][0],
'rtmp_live': is_live,
'url': con_string,
})
return {
'id': video_id,
'title': self._live_title(title) if is_live else title,
'formats': formats,
'is_live': is_live,
}
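A rough sketch of what the parsed swfUrl query supplies to the loop above (field names from the code; values are invented):

    from youtube_dl.compat import compat_parse_qs

    swf_query = ('conStrings=rtmp://cdn.example.com/app?session=1'
                 '&streamName=stream1&ticket=abc&appInstance=inst1&isLive=false')
    qs = compat_parse_qs(swf_query)
    # One RTMP format dict is built per comma-separated connection string:
    con_strings = qs['conStrings'][0].split(',')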

View File

@@ -1,13 +1,19 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from .turner import TurnerBaseIE
from ..utils import (
determine_ext,
float_or_none,
int_or_none,
mimetype2ext,
parse_age_limit,
parse_iso8601,
strip_or_none,
url_or_none,
try_get,
)
@@ -21,8 +27,8 @@ class AdultSwimIE(TurnerBaseIE):
'ext': 'mp4',
'title': 'Rick and Morty - Pilot',
'description': 'Rick moves in with his daughter\'s family and establishes himself as a bad influence on his grandson, Morty.',
'timestamp': 1493267400,
'upload_date': '20170427',
'timestamp': 1543294800,
'upload_date': '20181127',
},
'params': {
# m3u8 download
@@ -43,6 +49,7 @@ class AdultSwimIE(TurnerBaseIE):
# m3u8 download
'skip_download': True,
},
'skip': '404 Not Found',
}, {
'url': 'http://www.adultswim.com/videos/decker/inside-decker-a-new-hero/',
'info_dict': {
@@ -61,9 +68,9 @@ class AdultSwimIE(TurnerBaseIE):
}, {
'url': 'http://www.adultswim.com/videos/attack-on-titan',
'info_dict': {
'id': 'b7A69dzfRzuaXIECdxW8XQ',
'id': 'attack-on-titan',
'title': 'Attack on Titan',
'description': 'md5:6c8e003ea0777b47013e894767f5e114',
'description': 'md5:41caa9416906d90711e31dc00cb7db7e',
},
'playlist_mincount': 12,
}, {
@@ -78,83 +85,118 @@ class AdultSwimIE(TurnerBaseIE):
# m3u8 download
'skip_download': True,
},
'skip': '404 Not Found',
}]
def _real_extract(self, url):
show_path, episode_path = re.match(self._VALID_URL, url).groups()
display_id = episode_path or show_path
webpage = self._download_webpage(url, display_id)
initial_data = self._parse_json(self._search_regex(
r'AS_INITIAL_DATA(?:__)?\s*=\s*({.+?});',
webpage, 'initial data'), display_id)
is_stream = show_path == 'streams'
if is_stream:
if not episode_path:
episode_path = 'live-stream'
video_data = next(stream for stream_path, stream in initial_data['streams'].items() if stream_path == episode_path)
video_id = video_data.get('stream')
if not video_id:
entries = []
for episode in video_data.get('archiveEpisodes', []):
episode_url = url_or_none(episode.get('url'))
if not episode_url:
continue
entries.append(self.url_result(
episode_url, 'AdultSwim', episode.get('id')))
return self.playlist_result(
entries, video_data.get('id'), video_data.get('title'),
strip_or_none(video_data.get('description')))
query = '''query {
getShowBySlug(slug:"%s") {
%%s
}
}''' % show_path
if episode_path:
query = query % '''title
getVideoBySlug(slug:"%s") {
_id
auth
description
duration
episodeNumber
launchDate
mediaID
seasonNumber
poster
title
tvRating
}''' % episode_path
['getVideoBySlug']
else:
show_data = initial_data['show']
query = query % '''metaDescription
title
videos(first:1000,sort:["episode_number"]) {
edges {
node {
_id
slug
}
}
}'''
show_data = self._download_json(
'https://www.adultswim.com/api/search', display_id,
data=json.dumps({'query': query}).encode(),
headers={'Content-Type': 'application/json'})['data']['getShowBySlug']
if episode_path:
video_data = show_data['getVideoBySlug']
video_id = video_data['_id']
episode_title = title = video_data['title']
series = show_data.get('title')
if series:
title = '%s - %s' % (series, title)
info = {
'id': video_id,
'title': title,
'description': strip_or_none(video_data.get('description')),
'duration': float_or_none(video_data.get('duration')),
'formats': [],
'subtitles': {},
'age_limit': parse_age_limit(video_data.get('tvRating')),
'thumbnail': video_data.get('poster'),
'timestamp': parse_iso8601(video_data.get('launchDate')),
'series': series,
'season_number': int_or_none(video_data.get('seasonNumber')),
'episode': episode_title,
'episode_number': int_or_none(video_data.get('episodeNumber')),
}
if not episode_path:
entries = []
for video in show_data.get('videos', []):
slug = video.get('slug')
if not slug:
auth = video_data.get('auth')
media_id = video_data.get('mediaID')
if media_id:
info.update(self._extract_ngtv_info(media_id, {
# CDN_TOKEN_APP_ID from:
# https://d2gg02c3xr550i.cloudfront.net/assets/asvp.e9c8bef24322d060ef87.bundle.js
'appId': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhcHBJZCI6ImFzLXR2ZS1kZXNrdG9wLXB0enQ2bSIsInByb2R1Y3QiOiJ0dmUiLCJuZXR3b3JrIjoiYXMiLCJwbGF0Zm9ybSI6ImRlc2t0b3AiLCJpYXQiOjE1MzI3MDIyNzl9.BzSCk-WYOZ2GMCIaeVb8zWnzhlgnXuJTCu0jGp_VaZE',
}, {
'url': url,
'site_name': 'AdultSwim',
'auth_required': auth,
}))
if not auth:
extract_data = self._download_json(
'https://www.adultswim.com/api/shows/v1/videos/' + video_id,
video_id, query={'fields': 'stream'}, fatal=False) or {}
assets = try_get(extract_data, lambda x: x['data']['video']['stream']['assets'], list) or []
for asset in assets:
asset_url = asset.get('url')
if not asset_url:
continue
entries.append(self.url_result(
'http://adultswim.com/videos/%s/%s' % (show_path, slug),
'AdultSwim', video.get('id')))
return self.playlist_result(
entries, show_data.get('id'), show_data.get('title'),
strip_or_none(show_data.get('metadata', {}).get('description')))
ext = determine_ext(asset_url, mimetype2ext(asset.get('mime_type')))
if ext == 'm3u8':
info['formats'].extend(self._extract_m3u8_formats(
asset_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
elif ext == 'f4m':
continue
# info['formats'].extend(self._extract_f4m_formats(
# asset_url, video_id, f4m_id='hds', fatal=False))
elif ext in ('scc', 'ttml', 'vtt'):
info['subtitles'].setdefault('en', []).append({
'url': asset_url,
})
self._sort_formats(info['formats'])
video_data = show_data['sluggedVideo']
video_id = video_data['id']
info = self._extract_cvp_info(
'http://www.adultswim.com/videos/api/v0/assets?platform=desktop&id=' + video_id,
video_id, {
'secure': {
'media_src': 'http://androidhls-secure.cdn.turner.com/adultswim/big',
'tokenizer_src': 'http://www.adultswim.com/astv/mvpd/processors/services/token_ipadAdobe.do',
},
}, {
'url': url,
'site_name': 'AdultSwim',
'auth_required': video_data.get('auth'),
})
info.update({
'id': video_id,
'display_id': display_id,
'description': info.get('description') or strip_or_none(video_data.get('description')),
})
if not is_stream:
info.update({
'duration': info.get('duration') or int_or_none(video_data.get('duration')),
'timestamp': info.get('timestamp') or int_or_none(video_data.get('launch_date')),
'season_number': info.get('season_number') or int_or_none(video_data.get('season_number')),
'episode': info['title'],
'episode_number': info.get('episode_number') or int_or_none(video_data.get('episode_number')),
})
info['series'] = video_data.get('collection_title') or info.get('series')
if info['series'] and info['series'] != info['title']:
info['title'] = '%s - %s' % (info['series'], info['title'])
return info
return info
else:
entries = []
for edge in show_data.get('videos', {}).get('edges', []):
video = edge.get('node') or {}
slug = video.get('slug')
if not slug:
continue
entries.append(self.url_result(
'http://adultswim.com/videos/%s/%s' % (show_path, slug),
'AdultSwim', video.get('_id')))
return self.playlist_result(
entries, show_path, show_data.get('title'),
strip_or_none(show_data.get('metaDescription')))
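For orientation, the episode branch above ends up POSTing a composed GraphQL document to /api/search. Roughly (slugs invented; the field list is taken from the query template in the code):

    import json

    show_path, episode_path = 'rick-and-morty', 'pilot'
    query = '''query {
      getShowBySlug(slug:"%s") {
        title
        getVideoBySlug(slug:"%s") {
          _id
          auth
          description
          duration
          episodeNumber
          launchDate
          mediaID
          seasonNumber
          poster
          title
          tvRating
        }
      }
    }''' % (show_path, episode_path)
    payload = json.dumps({'query': query}).encode()  # request body for /api/search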

View File

@@ -1,14 +1,15 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .theplatform import ThePlatformIE
from ..utils import (
extract_attributes,
ExtractorError,
int_or_none,
smuggle_url,
update_url_query,
unescapeHTML,
extract_attributes,
get_element_by_attribute,
)
from ..compat import (
compat_urlparse,
@@ -19,6 +20,43 @@ class AENetworksBaseIE(ThePlatformIE):
_THEPLATFORM_KEY = 'crazyjava'
_THEPLATFORM_SECRET = 's3cr3t'
def _extract_aen_smil(self, smil_url, video_id, auth=None):
query = {'mbr': 'true'}
if auth:
query['auth'] = auth
TP_SMIL_QUERY = [{
'assetTypes': 'high_video_ak',
'switch': 'hls_high_ak'
}, {
'assetTypes': 'high_video_s3'
}, {
'assetTypes': 'high_video_s3',
'switch': 'hls_ingest_fastly'
}]
formats = []
subtitles = {}
last_e = None
for q in TP_SMIL_QUERY:
q.update(query)
m_url = update_url_query(smil_url, q)
m_url = self._sign_url(m_url, self._THEPLATFORM_KEY, self._THEPLATFORM_SECRET)
try:
tp_formats, tp_subtitles = self._extract_theplatform_smil(
m_url, video_id, 'Downloading %s SMIL data' % (q.get('switch') or q['assetTypes']))
except ExtractorError as e:
last_e = e
continue
formats.extend(tp_formats)
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
if last_e and not formats:
raise last_e
self._sort_formats(formats)
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
}
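The loop in _extract_aen_smil implements a simple fallback chain: try each asset-type variant in order, remember the last failure, and surface it only if no variant yielded formats. The same shape in isolation (names illustrative):

    def try_variants(variants, fetch):
        last_exc, results = None, []
        for variant in variants:
            try:
                results.extend(fetch(variant))
            except Exception as exc:  # ExtractorError in the real code
                last_exc = exc
        if last_exc and not results:
            raise last_exc
        return results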
class AENetworksIE(AENetworksBaseIE):
IE_NAME = 'aenetworks'
@@ -33,22 +71,25 @@ class AENetworksIE(AENetworksBaseIE):
(?:
shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|
movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?|
specials/(?P<special_display_id>[^/]+)/full-special|
specials/(?P<special_display_id>[^/]+)/(?:full-special|preview-)|
collections/[^/]+/(?P<collection_display_id>[^/]+)
)
'''
_TESTS = [{
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
'md5': 'a97a65f7e823ae10e9244bc5433d5fe6',
'info_dict': {
'id': '22253814',
'ext': 'mp4',
'title': 'Winter Is Coming',
'title': 'Winter is Coming',
'description': 'md5:641f424b7a19d8e24f26dea22cf59d74',
'timestamp': 1338306241,
'upload_date': '20120529',
'uploader': 'AENE-NEW',
},
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['ThePlatform'],
}, {
'url': 'http://www.history.com/shows/ancient-aliens/season-1',
@@ -84,6 +125,9 @@ class AENetworksIE(AENetworksBaseIE):
}, {
'url': 'https://www.historyvault.com/collections/america-the-story-of-us/westward',
'only_matching': True
}, {
'url': 'https://www.aetv.com/specials/hunting-jonbenets-killer-the-untold-story/preview-hunting-jonbenets-killer-the-untold-story',
'only_matching': True
}]
_DOMAIN_TO_REQUESTOR_ID = {
'history.com': 'HISTORY',
@@ -124,11 +168,6 @@ class AENetworksIE(AENetworksBaseIE):
return self.playlist_result(
entries, self._html_search_meta('aetn:SeasonId', webpage))
query = {
'mbr': 'true',
'assetTypes': 'high_video_ak',
'switch': 'hls_high_ak',
}
video_id = self._html_search_meta('aetn:VideoID', webpage)
media_url = self._search_regex(
[r"media_url\s*=\s*'(?P<url>[^']+)'",
@@ -138,64 +177,39 @@ class AENetworksIE(AENetworksBaseIE):
theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
r'https?://link\.theplatform\.com/s/([^?]+)', media_url, 'theplatform_path'), video_id)
info = self._parse_theplatform_metadata(theplatform_metadata)
auth = None
if theplatform_metadata.get('AETN$isBehindWall'):
requestor_id = self._DOMAIN_TO_REQUESTOR_ID[domain]
resource = self._get_mvpd_resource(
requestor_id, theplatform_metadata['title'],
theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'),
theplatform_metadata['ratings'][0]['rating'])
query['auth'] = self._extract_mvpd_auth(
auth = self._extract_mvpd_auth(
url, video_id, requestor_id, resource)
info.update(self._search_json_ld(webpage, video_id, fatal=False))
media_url = update_url_query(media_url, query)
media_url = self._sign_url(media_url, self._THEPLATFORM_KEY, self._THEPLATFORM_SECRET)
formats, subtitles = self._extract_theplatform_smil(media_url, video_id)
self._sort_formats(formats)
info.update({
'id': video_id,
'formats': formats,
'subtitles': subtitles,
})
info.update(self._extract_aen_smil(media_url, video_id, auth))
return info
class HistoryTopicIE(AENetworksBaseIE):
IE_NAME = 'history:topic'
IE_DESC = 'History.com Topic'
_VALID_URL = r'https?://(?:www\.)?history\.com/topics/(?:[^/]+/)?(?P<topic_id>[^/]+)(?:/[^/]+(?:/(?P<video_display_id>[^/?#]+))?)?'
_VALID_URL = r'https?://(?:www\.)?history\.com/topics/[^/]+/(?P<id>[\w+-]+?)-video'
_TESTS = [{
'url': 'http://www.history.com/topics/valentines-day/history-of-valentines-day/videos/bet-you-didnt-know-valentines-day?m=528e394da93ae&s=undefined&f=1&free=false',
'url': 'https://www.history.com/topics/valentines-day/history-of-valentines-day-video',
'info_dict': {
'id': '40700995724',
'ext': 'mp4',
'title': "Bet You Didn't Know: Valentine's Day",
'title': "History of Valentines Day",
'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7',
'timestamp': 1375819729,
'upload_date': '20130806',
'uploader': 'AENE-NEW',
},
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['ThePlatform'],
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history/videos',
'info_dict':
{
'id': 'world-war-i-history',
'title': 'World War I History',
},
'playlist_mincount': 23,
}, {
'url': 'http://www.history.com/topics/world-war-i-history/videos',
'only_matching': True,
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history',
'only_matching': True,
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history/speeches',
'only_matching': True,
}]
def theplatform_url_result(self, theplatform_url, video_id, query):
@@ -215,27 +229,19 @@ class HistoryTopicIE(AENetworksBaseIE):
}
def _real_extract(self, url):
topic_id, video_display_id = re.match(self._VALID_URL, url).groups()
if video_display_id:
webpage = self._download_webpage(url, video_display_id)
release_url, video_id = re.search(r"_videoPlayer.play\('([^']+)'\s*,\s*'[^']+'\s*,\s*'(\d+)'\)", webpage).groups()
release_url = unescapeHTML(release_url)
return self.theplatform_url_result(
release_url, video_id, {
'mbr': 'true',
'switch': 'hls',
'assetTypes': 'high_video_ak',
})
else:
webpage = self._download_webpage(url, topic_id)
entries = []
for episode_item in re.findall(r'<a.+?data-release-url="[^"]+"[^>]*>', webpage):
video_attributes = extract_attributes(episode_item)
entries.append(self.theplatform_url_result(
video_attributes['data-release-url'], video_attributes['data-id'], {
'mbr': 'true',
'switch': 'hls',
'assetTypes': 'high_video_ak',
}))
return self.playlist_result(entries, topic_id, get_element_by_attribute('class', 'show-title', webpage))
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r'<phoenix-iframe[^>]+src="[^"]+\btpid=(\d+)', webpage, 'tpid')
result = self._download_json(
'https://feeds.video.aetnd.com/api/v2/history/videos',
video_id, query={'filter[id]': video_id})['results'][0]
title = result['title']
info = self._extract_aen_smil(result['publicUrl'], video_id)
info.update({
'title': title,
'description': result.get('description'),
'duration': int_or_none(result.get('duration')),
'timestamp': int_or_none(result.get('added'), 1000),
})
return info

View File

@@ -4,6 +4,10 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urllib_parse_urlparse,
)
from ..utils import (
ExtractorError,
int_or_none,
@@ -12,12 +16,12 @@ from ..utils import (
class AolIE(InfoExtractor):
IE_NAME = 'on.aol.com'
_VALID_URL = r'(?:aol-video:|https?://(?:(?:www|on)\.)?aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
IE_NAME = 'aol.com'
_VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>[0-9a-f]+)'
_TESTS = [{
# video with 5min ID
'url': 'http://on.aol.com/video/u-s--official-warns-of-largest-ever-irs-phone-scam-518167793?icid=OnHomepageC2Wide_MustSee_Img',
'url': 'https://www.aol.com/video/view/u-s--official-warns-of-largest-ever-irs-phone-scam/518167793/',
'md5': '18ef68f48740e86ae94b98da815eec42',
'info_dict': {
'id': '518167793',
@@ -34,7 +38,7 @@ class AolIE(InfoExtractor):
}
}, {
# video with vidible ID
'url': 'http://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/',
'url': 'https://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/',
'info_dict': {
'id': '5707d6b8e4b090497b04f706',
'ext': 'mp4',
@@ -49,17 +53,29 @@ class AolIE(InfoExtractor):
'skip_download': True,
}
}, {
'url': 'http://on.aol.com/partners/abc-551438d309eab105804dbfe8/sneak-peek-was-haley-really-framed-570eaebee4b0448640a5c944',
'url': 'https://www.aol.com/video/view/park-bench-season-2-trailer/559a1b9be4b0c3bfad3357a7/',
'only_matching': True,
}, {
'url': 'http://on.aol.com/shows/park-bench-shw518173474-559a1b9be4b0c3bfad3357a7?context=SH:SHW518173474:PL4327:1460619712763',
'only_matching': True,
}, {
'url': 'http://on.aol.com/video/519442220',
'url': 'https://www.aol.com/video/view/donald-trump-spokeswoman-tones-down-megyn-kelly-attacks/519442220/',
'only_matching': True,
}, {
'url': 'aol-video:5707d6b8e4b090497b04f706',
'only_matching': True,
}, {
'url': 'https://www.aol.com/video/playlist/PL8245/5ca79d19d21f1a04035db606/',
'only_matching': True,
}, {
'url': 'https://www.aol.ca/video/view/u-s-woman-s-family-arrested-for-murder-first-pinned-on-panhandler-police/5c7ccf45bc03931fa04b2fe1/',
'only_matching': True,
}, {
'url': 'https://www.aol.co.uk/video/view/-one-dead-and-22-hurt-in-bus-crash-/5cb3a6f3d21f1a072b457347/',
'only_matching': True,
}, {
'url': 'https://www.aol.de/video/view/eva-braun-privataufnahmen-von-hitlers-geliebter-werden-digitalisiert/5cb2d49de98ab54c113d3d5d/',
'only_matching': True,
}, {
'url': 'https://www.aol.jp/video/playlist/5a28e936a1334d000137da0c/5a28f3151e642219fde19831/',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -73,7 +89,7 @@ class AolIE(InfoExtractor):
video_data = response['data']
formats = []
m3u8_url = video_data.get('videoMasterPlaylist')
m3u8_url = url_or_none(video_data.get('videoMasterPlaylist'))
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
@@ -96,6 +112,12 @@ class AolIE(InfoExtractor):
'width': int(mobj.group(1)),
'height': int(mobj.group(2)),
})
else:
qs = compat_parse_qs(compat_urllib_parse_urlparse(video_url).query)
f.update({
'width': int_or_none(qs.get('w', [None])[0]),
'height': int_or_none(qs.get('h', [None])[0]),
})
formats.append(f)
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))

View File

@@ -0,0 +1,37 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import extract_attributes
class BFIPlayerIE(InfoExtractor):
IE_NAME = 'bfi:player'
_VALID_URL = r'https?://player\.bfi\.org\.uk/[^/]+/film/watch-(?P<id>[\w-]+)-online'
_TEST = {
'url': 'https://player.bfi.org.uk/free/film/watch-computer-doctor-1974-online',
'md5': 'e8783ebd8e061ec4bc6e9501ed547de8',
'info_dict': {
'id': 'htNnhlZjE60C9VySkQEIBtU-cNV1Xx63',
'ext': 'mp4',
'title': 'Computer Doctor',
'description': 'md5:fb6c240d40c4dbe40428bdd62f78203b',
},
'skip': 'BFI Player films cannot be played outside of the UK',
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
entries = []
for player_el in re.findall(r'(?s)<[^>]+class="player"[^>]*>', webpage):
player_attr = extract_attributes(player_el)
ooyala_id = player_attr.get('data-video-id')
if not ooyala_id:
continue
entries.append(self.url_result(
'ooyala:' + ooyala_id, 'Ooyala',
ooyala_id, player_attr.get('data-label')))
return self.playlist_result(entries)

View File

@@ -28,7 +28,7 @@ class BIQLEIE(InfoExtractor):
'url': 'http://biqle.org/watch/-44781847_168547604',
'md5': '7f24e72af1db0edf7c1aaba513174f97',
'info_dict': {
'id': '168547604',
'id': '-44781847_168547604',
'ext': 'mp4',
'title': 'Ребенок в шоке от автоматической мойки',
'timestamp': 1396633454,

View File

@@ -1,6 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .adobepass import AdobePassIE
from ..utils import (
smuggle_url,
@@ -12,16 +14,16 @@ from ..utils import (
class BravoTVIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://www.bravotv.com/last-chance-kitchen/season-5/videos/lck-ep-12-fishy-finale',
'md5': '9086d0b7ef0ea2aabc4781d75f4e5863',
'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is',
'md5': 'e34684cfea2a96cd2ee1ef3a60909de9',
'info_dict': {
'id': 'zHyk1_HU_mPy',
'id': 'epL0pmK1kQlT',
'ext': 'mp4',
'title': 'LCK Ep 12: Fishy Finale',
'description': 'S13/E12: Two eliminated chefs have just 12 minutes to cook up a delicious fish dish.',
'title': 'The Top Chef Season 16 Winner Is...',
'description': 'Find out who takes the title of Top Chef!',
'uploader': 'NBCU-BRAV',
'upload_date': '20160302',
'timestamp': 1456945320,
'upload_date': '20190314',
'timestamp': 1552591860,
}
}, {
'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
@@ -32,30 +34,38 @@ class BravoTVIE(AdobePassIE):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
settings = self._parse_json(self._search_regex(
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);', webpage, 'drupal settings'),
r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'),
display_id)
info = {}
query = {
'mbr': 'true',
}
account_pid, release_pid = [None] * 2
tve = settings.get('sharedTVE')
tve = settings.get('ls_tve')
if tve:
query['manifest'] = 'm3u'
account_pid = 'HNK2IC'
release_pid = tve['release_pid']
mobj = re.search(r'<[^>]+id="pdk-player"[^>]+data-url=["\']?(?:https?:)?//player\.theplatform\.com/p/([^/]+)/(?:[^/]+/)*select/([^?#&"\']+)', webpage)
if mobj:
account_pid, tp_path = mobj.groups()
release_pid = tp_path.strip('/').split('/')[-1]
else:
account_pid = 'HNK2IC'
tp_path = release_pid = tve['release_pid']
if tve.get('entitlement') == 'auth':
adobe_pass = settings.get('adobePass', {})
adobe_pass = settings.get('tve_adobe_auth', {})
resource = self._get_mvpd_resource(
adobe_pass.get('adobePassResourceId', 'bravo'),
tve['title'], release_pid, tve.get('rating'))
query['auth'] = self._extract_mvpd_auth(
url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource)
else:
shared_playlist = settings['shared_playlist']
shared_playlist = settings['ls_playlist']
account_pid = shared_playlist['account_pid']
metadata = shared_playlist['video_metadata'][shared_playlist['default_clip']]
release_pid = metadata['release_pid']
tp_path = release_pid = metadata.get('release_pid')
if not release_pid:
release_pid = metadata['guid']
tp_path = 'media/guid/2140479951/' + release_pid
info.update({
'title': metadata['title'],
'description': metadata.get('description'),
@@ -67,7 +77,7 @@ class BravoTVIE(AdobePassIE):
'_type': 'url_transparent',
'id': release_pid,
'url': smuggle_url(update_url_query(
'http://link.theplatform.com/s/%s/%s' % (account_pid, release_pid),
'http://link.theplatform.com/s/%s/%s' % (account_pid, tp_path),
query), {'force_smil_url': True}),
'ie_key': 'ThePlatform',
})

View File

@@ -13,13 +13,17 @@ from ..utils import (
class CBSBaseIE(ThePlatformFeedIE):
def _parse_smil_subtitles(self, smil, namespace=None, subtitles_lang='en'):
closed_caption_e = find_xpath_attr(smil, self._xpath_ns('.//param', namespace), 'name', 'ClosedCaptionURL')
return {
'en': [{
'ext': 'ttml',
'url': closed_caption_e.attrib['value'],
}]
} if closed_caption_e is not None and closed_caption_e.attrib.get('value') else []
subtitles = {}
for k, ext in [('sMPTE-TTCCURL', 'tt'), ('ClosedCaptionURL', 'ttml'), ('webVTTCaptionURL', 'vtt')]:
cc_e = find_xpath_attr(smil, self._xpath_ns('.//param', namespace), 'name', k)
if cc_e is not None:
cc_url = cc_e.get('value')
if cc_url:
subtitles.setdefault(subtitles_lang, []).append({
'ext': ext,
'url': cc_url,
})
return subtitles
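Compared with the old single-URL version, the new loop can collect up to three caption variants per language. A hypothetical result when all three SMIL params are present (URLs invented):

    subtitles = {
        'en': [
            {'ext': 'tt',   'url': 'https://example.com/captions.tt'},
            {'ext': 'ttml', 'url': 'https://example.com/captions.ttml'},
            {'ext': 'vtt',  'url': 'https://example.com/captions.vtt'},
        ],
    }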
class CBSIE(CBSBaseIE):

View File

@@ -1,9 +1,12 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
try_get,
url_or_none,
)
@@ -18,11 +21,13 @@ class CCCIE(InfoExtractor):
'id': '1839',
'ext': 'mp4',
'title': 'Introduction to Processor Design',
'creator': 'byterazor',
'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20131228',
'timestamp': 1388188800,
'duration': 3710,
'tags': list,
}
}, {
'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download',
@@ -68,6 +73,7 @@ class CCCIE(InfoExtractor):
'id': event_id,
'display_id': display_id,
'title': event_data['title'],
'creator': try_get(event_data, lambda x: ', '.join(x['persons'])),
'description': event_data.get('description'),
'thumbnail': event_data.get('thumb_url'),
'timestamp': parse_iso8601(event_data.get('date')),
@@ -75,3 +81,31 @@ class CCCIE(InfoExtractor):
'tags': event_data.get('tags'),
'formats': formats,
}
class CCCPlaylistIE(InfoExtractor):
IE_NAME = 'media.ccc.de:lists'
_VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/c/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://media.ccc.de/c/30c3',
'info_dict': {
'title': '30C3',
'id': '30c3',
},
'playlist_count': 135,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url).lower()
conf = self._download_json(
'https://media.ccc.de/public/conferences/' + playlist_id,
playlist_id)
entries = []
for e in conf['events']:
event_url = url_or_none(e.get('frontend_link'))
if event_url:
entries.append(self.url_result(event_url, ie=CCCIE.ie_key()))
return self.playlist_result(entries, playlist_id, conf.get('title'))
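A standalone sketch of the playlist flow, assuming the public conferences endpoint keeps the shape consumed above (only title, events, and frontend_link are used):

    import json
    from urllib.request import urlopen

    conf = json.load(urlopen('https://media.ccc.de/public/conferences/30c3'))
    entries = [e['frontend_link'] for e in conf['events'] if e.get('frontend_link')]
    print(conf.get('title'), len(entries))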

View File

@@ -0,0 +1,29 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .hbo import HBOBaseIE
class CinemaxIE(HBOBaseIE):
_VALID_URL = r'https?://(?:www\.)?cinemax\.com/(?P<path>[^/]+/video/[0-9a-z-]+-(?P<id>\d+))'
_TESTS = [{
'url': 'https://www.cinemax.com/warrior/video/s1-ep-1-recap-20126903',
'md5': '82e0734bba8aa7ef526c9dd00cf35a05',
'info_dict': {
'id': '20126903',
'ext': 'mp4',
'title': 'S1 Ep 1: Recap',
},
'expected_warnings': ['Unknown MIME type application/mp4 in DASH manifest'],
}, {
'url': 'https://www.cinemax.com/warrior/video/s1-ep-1-recap-20126903.embed',
'only_matching': True,
}]
def _real_extract(self, url):
path, video_id = re.match(self._VALID_URL, url).groups()
info = self._extract_info('https://www.cinemax.com/%s.xml' % path, video_id)
info['id'] = video_id
return info

View File

@@ -2019,6 +2019,8 @@ class InfoExtractor(object):
if res is False:
return []
mpd_doc, urlh = res
if mpd_doc is None:
return []
mpd_base_url = base_url(urlh.geturl())
return self._parse_mpd_formats(

View File

@@ -79,7 +79,7 @@ class CWTVIE(InfoExtractor):
season = str_or_none(video_data.get('season'))
episode = str_or_none(video_data.get('episode'))
if episode and season:
episode = episode.lstrip(season)
episode = episode[len(season):]
return {
'_type': 'url_transparent',
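The replaced line fixes a classic str.lstrip misuse: lstrip strips a set of characters, not a prefix, so it over-removes whenever the episode number reuses the season's digits. For example:

    season, episode = '1', '113'
    episode.lstrip(season)   # '3'  -- every leading '1' is stripped
    episode[len(season):]    # '13' -- exactly the prefix is removed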

View File

@@ -58,10 +58,17 @@ class DigitallySpeakingIE(InfoExtractor):
stream_name = xpath_text(a_format, 'streamName', fatal=True)
video_path = re.match(r'mp4\:(?P<path>.*)', stream_name).group('path')
url = video_root + video_path
vbr = xpath_text(a_format, 'bitrate')
bitrate = xpath_text(a_format, 'bitrate')
tbr = int_or_none(bitrate)
vbr = int_or_none(self._search_regex(
r'-(\d+)\.mp4', video_path, 'vbr', default=None))
abr = tbr - vbr if tbr and vbr else None
video_formats.append({
'format_id': bitrate,
'url': url,
'vbr': int_or_none(vbr),
'tbr': tbr,
'vbr': vbr,
'abr': abr,
})
return video_formats
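With the change above, the total bitrate comes from the XML while the video bitrate is read from the file-name suffix, and the difference is attributed to audio. Illustrative numbers (not from a real manifest):

    # streamName 'mp4:talks/example-2000.mp4' with <bitrate>2128</bitrate>:
    tbr = 2128          # total bitrate reported by the manifest
    vbr = 2000          # video bitrate parsed from the '-2000.mp4' suffix
    abr = tbr - vbr     # 128, assumed to be the audio share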

View File

@@ -1,266 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import itertools
import json
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_urlparse,
)
from ..utils import (
clean_html,
ExtractorError,
int_or_none,
parse_age_limit,
parse_duration,
unified_timestamp,
url_or_none,
)
class DramaFeverBaseIE(InfoExtractor):
_NETRC_MACHINE = 'dramafever'
_CONSUMER_SECRET = 'DA59dtVXYLxajktV'
_consumer_secret = None
def _get_consumer_secret(self):
mainjs = self._download_webpage(
'http://www.dramafever.com/static/51afe95/df2014/scripts/main.js',
None, 'Downloading main.js', fatal=False)
if not mainjs:
return self._CONSUMER_SECRET
return self._search_regex(
r"var\s+cs\s*=\s*'([^']+)'", mainjs,
'consumer secret', default=self._CONSUMER_SECRET)
def _real_initialize(self):
self._consumer_secret = self._get_consumer_secret()
self._login()
def _login(self):
username, password = self._get_login_info()
if username is None:
return
login_form = {
'username': username,
'password': password,
}
try:
response = self._download_json(
'https://www.dramafever.com/api/users/login', None, 'Logging in',
data=json.dumps(login_form).encode('utf-8'), headers={
'x-consumer-key': self._consumer_secret,
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (403, 404):
response = self._parse_json(
e.cause.read().decode('utf-8'), None)
else:
raise
# Successful login
if response.get('result') or response.get('guid') or response.get('user_guid'):
return
errors = response.get('errors')
if errors and isinstance(errors, list):
error = errors[0]
message = error.get('message') or error['reason']
raise ExtractorError('Unable to login: %s' % message, expected=True)
raise ExtractorError('Unable to log in')
class DramaFeverIE(DramaFeverBaseIE):
IE_NAME = 'dramafever'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
_TESTS = [{
'url': 'https://www.dramafever.com/drama/4274/1/Heirs/',
'info_dict': {
'id': '4274.1',
'ext': 'wvm',
'title': 'Heirs - Episode 1',
'description': 'md5:362a24ba18209f6276e032a651c50bc2',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3783,
'timestamp': 1381354993,
'upload_date': '20131009',
'series': 'Heirs',
'season_number': 1,
'episode': 'Episode 1',
'episode_number': 1,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://www.dramafever.com/drama/4826/4/Mnet_Asian_Music_Awards_2015/?ap=1',
'info_dict': {
'id': '4826.4',
'ext': 'flv',
'title': 'Mnet Asian Music Awards 2015',
'description': 'md5:3ff2ee8fedaef86e076791c909cf2e91',
'episode': 'Mnet Asian Music Awards 2015 - Part 3',
'episode_number': 4,
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1450213200,
'upload_date': '20151215',
'duration': 5359,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'https://www.dramafever.com/zh-cn/drama/4972/15/Doctor_Romantic/',
'only_matching': True,
}]
def _call_api(self, path, video_id, note, fatal=False):
return self._download_json(
'https://www.dramafever.com/api/5/' + path,
video_id, note=note, headers={
'x-consumer-key': self._consumer_secret,
}, fatal=fatal)
def _get_subtitles(self, video_id):
subtitles = {}
subs = self._call_api(
'video/%s/subtitles/webvtt/' % video_id, video_id,
'Downloading subtitles JSON', fatal=False)
if not subs or not isinstance(subs, list):
return subtitles
for sub in subs:
if not isinstance(sub, dict):
continue
sub_url = url_or_none(sub.get('url'))
if not sub_url:
continue
subtitles.setdefault(
sub.get('code') or sub.get('language') or 'en', []).append({
'url': sub_url
})
return subtitles
def _real_extract(self, url):
video_id = self._match_id(url).replace('/', '.')
series_id, episode_number = video_id.split('.')
video = self._call_api(
'series/%s/episodes/%s/' % (series_id, episode_number), video_id,
'Downloading video JSON')
formats = []
download_assets = video.get('download_assets')
if download_assets and isinstance(download_assets, dict):
for format_id, format_dict in download_assets.items():
if not isinstance(format_dict, dict):
continue
format_url = url_or_none(format_dict.get('url'))
if not format_url:
continue
formats.append({
'url': format_url,
'format_id': format_id,
'filesize': int_or_none(video.get('filesize')),
})
stream = self._call_api(
'video/%s/stream/' % video_id, video_id, 'Downloading stream JSON',
fatal=False)
if stream:
stream_url = stream.get('stream_url')
if stream_url:
formats.extend(self._extract_m3u8_formats(
stream_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
self._sort_formats(formats)
title = video.get('title') or 'Episode %s' % episode_number
description = video.get('description')
thumbnail = video.get('thumbnail')
timestamp = unified_timestamp(video.get('release_date'))
duration = parse_duration(video.get('duration'))
age_limit = parse_age_limit(video.get('tv_rating'))
series = video.get('series_title')
season_number = int_or_none(video.get('season'))
if series:
title = '%s - %s' % (series, title)
subtitles = self.extract_subtitles(video_id)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'timestamp': timestamp,
'age_limit': age_limit,
'series': series,
'season_number': season_number,
'episode_number': int_or_none(episode_number),
'formats': formats,
'subtitles': subtitles,
}
class DramaFeverSeriesIE(DramaFeverBaseIE):
IE_NAME = 'dramafever:series'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$'
_TESTS = [{
'url': 'http://www.dramafever.com/drama/4512/Cooking_with_Shin/',
'info_dict': {
'id': '4512',
'title': 'Cooking with Shin',
'description': 'md5:84a3f26e3cdc3fb7f500211b3593b5c1',
},
'playlist_count': 4,
}, {
'url': 'http://www.dramafever.com/drama/124/IRIS/',
'info_dict': {
'id': '124',
'title': 'IRIS',
'description': 'md5:b3a30e587cf20c59bd1c01ec0ee1b862',
},
'playlist_count': 20,
}]
_PAGE_SIZE = 60 # max is 60 (see http://api.drama9.com/#get--api-4-episode-series-)
def _real_extract(self, url):
series_id = self._match_id(url)
series = self._download_json(
'http://www.dramafever.com/api/4/series/query/?cs=%s&series_id=%s'
% (self._consumer_secret, series_id),
series_id, 'Downloading series JSON')['series'][series_id]
title = clean_html(series['name'])
description = clean_html(series.get('description') or series.get('description_short'))
entries = []
for page_num in itertools.count(1):
episodes = self._download_json(
'http://www.dramafever.com/api/4/episode/series/?cs=%s&series_id=%s&page_size=%d&page_number=%d'
% (self._consumer_secret, series_id, self._PAGE_SIZE, page_num),
series_id, 'Downloading episodes JSON page #%d' % page_num)
for episode in episodes.get('value', []):
episode_url = episode.get('episode_url')
if not episode_url:
continue
entries.append(self.url_result(
compat_urlparse.urljoin(url, episode_url),
'DramaFever', episode.get('guid')))
if page_num == episodes['num_pages']:
break
return self.playlist_result(entries, series_id, title, description)

View File

@@ -10,16 +10,16 @@ from ..utils import (
int_or_none,
js_to_json,
mimetype2ext,
try_get,
unescapeHTML,
parse_iso8601,
)
class DVTVIE(InfoExtractor):
IE_NAME = 'dvtv'
IE_DESC = 'http://video.aktualne.cz/'
_VALID_URL = r'https?://video\.aktualne\.cz/(?:[^/]+/)+r~(?P<id>[0-9a-f]{32})'
_TESTS = [{
'url': 'http://video.aktualne.cz/dvtv/vondra-o-ceskem-stoleti-pri-pohledu-na-havla-mi-bylo-trapne/r~e5efe9ca855511e4833a0025900fea04/',
'md5': '67cb83e4a955d36e1b5d31993134a0c2',
@@ -28,11 +28,13 @@ class DVTVIE(InfoExtractor):
'ext': 'mp4',
'title': 'Vondra o Českém století: Při pohledu na Havla mi bylo trapně',
'duration': 1484,
'upload_date': '20141217',
'timestamp': 1418792400,
}
}, {
'url': 'http://video.aktualne.cz/dvtv/dvtv-16-12-2014-utok-talibanu-boj-o-kliniku-uprchlici/r~973eb3bc854e11e498be002590604f2e/',
'info_dict': {
'title': r're:^DVTV 16\. 12\. 2014: útok Talibanu, boj o kliniku, uprchlíci',
'title': r'DVTV 16. 12. 2014: útok Talibanu, boj o kliniku, uprchlíci',
'id': '973eb3bc854e11e498be002590604f2e',
},
'playlist': [{
@@ -84,6 +86,8 @@ class DVTVIE(InfoExtractor):
'ext': 'mp4',
'title': 'Zeman si jen léčí mindráky, Sobotku nenávidí a Babiš se mu teď hodí, tvrdí Kmenta',
'duration': 1103,
'upload_date': '20170511',
'timestamp': 1494514200,
},
'params': {
'skip_download': True,
@@ -91,43 +95,59 @@ class DVTVIE(InfoExtractor):
}, {
'url': 'http://video.aktualne.cz/v-cechach-poprve-zazni-zelenkova-zrestaurovana-mse/r~45b4b00483ec11e4883b002590604f2e/',
'only_matching': True,
}, {
# Test live stream video (liveStarter) parsing
'url': 'https://video.aktualne.cz/dvtv/zive-mistryne-sveta-eva-samkova-po-navratu-ze-sampionatu/r~182654c2288811e990fd0cc47ab5f122/',
'md5': '2e552e483f2414851ca50467054f9d5d',
'info_dict': {
'id': '8d116360288011e98c840cc47ab5f122',
'ext': 'mp4',
'title': 'Živě: Mistryně světa Eva Samková po návratu ze šampionátu',
'upload_date': '20190204',
'timestamp': 1549289591,
},
'params': {
# Video content is no longer available
'skip_download': True,
},
}]
def _parse_video_metadata(self, js, video_id, live_js=None):
def _parse_video_metadata(self, js, video_id, timestamp):
data = self._parse_json(js, video_id, transform_source=js_to_json)
if live_js:
data.update(self._parse_json(
live_js, video_id, transform_source=js_to_json))
title = unescapeHTML(data['title'])
live_starter = try_get(data, lambda x: x['plugins']['liveStarter'], dict)
if live_starter:
data.update(live_starter)
formats = []
for video in data['sources']:
video_url = video.get('file')
if not video_url:
continue
video_type = video.get('type')
ext = determine_ext(video_url, mimetype2ext(video_type))
if video_type == 'application/vnd.apple.mpegurl' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif video_type == 'application/dash+xml' or ext == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
else:
label = video.get('label')
height = self._search_regex(
r'^(\d+)[pP]', label or '', 'height', default=None)
format_id = ['http']
for f in (ext, label):
if f:
format_id.append(f)
formats.append({
'url': video_url,
'format_id': '-'.join(format_id),
'height': int_or_none(height),
})
for tracks in data.get('tracks', {}).values():
for video in tracks:
video_url = video.get('src')
if not video_url:
continue
video_type = video.get('type')
ext = determine_ext(video_url, mimetype2ext(video_type))
if video_type == 'application/vnd.apple.mpegurl' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif video_type == 'application/dash+xml' or ext == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
else:
label = video.get('label')
height = self._search_regex(
r'^(\d+)[pP]', label or '', 'height', default=None)
format_id = ['http']
for f in (ext, label):
if f:
format_id.append(f)
formats.append({
'url': video_url,
'format_id': '-'.join(format_id),
'height': int_or_none(height),
})
self._sort_formats(formats)
return {
@@ -136,41 +156,29 @@ class DVTVIE(InfoExtractor):
'description': data.get('description'),
'thumbnail': data.get('image'),
'duration': int_or_none(data.get('duration')),
'timestamp': int_or_none(data.get('pubtime')),
'timestamp': int_or_none(timestamp),
'formats': formats
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
timestamp = parse_iso8601(self._html_search_meta(
'article:published_time', webpage, 'published time', default=None))
# live content
live_item = self._search_regex(
r'(?s)embedData[0-9a-f]{32}\.asset\.liveStarter\s*=\s*(\{.+?\});',
webpage, 'video', default=None)
# single video
item = self._search_regex(
r'(?s)embedData[0-9a-f]{32}\[["\']asset["\']\]\s*=\s*(\{.+?\});',
webpage, 'video', default=None)
if item:
return self._parse_video_metadata(item, video_id, live_item)
# playlist
items = re.findall(
r"(?s)BBX\.context\.assets\['[0-9a-f]{32}'\]\.push\(({.+?})\);",
webpage)
if not items:
items = re.findall(r'(?s)var\s+asset\s*=\s*({.+?});\n', webpage)
items = re.findall(r'(?s)playlist\.push\(({.+?})\);', webpage)
if items:
return {
'_type': 'playlist',
'id': video_id,
'title': self._og_search_title(webpage),
'entries': [self._parse_video_metadata(i, video_id) for i in items]
}
return self.playlist_result(
[self._parse_video_metadata(i, video_id, timestamp) for i in items],
video_id, self._html_search_meta('twitter:title', webpage))
item = self._search_regex(
r'(?s)BBXPlayer\.setup\((.+?)\);',
webpage, 'video', default=None)
if item:
# remove function calls (e.g. htmldeentitize)
# TODO: this should be fixed in a general way in js_to_json
item = re.sub(r'\w+?\((.+)\)', r'\1', item)
return self._parse_video_metadata(item, video_id, timestamp)
raise ExtractorError('Could not find either video or playlist')
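The regex hack above unwraps a single wrapping function call so the payload parses as JSON-ish. For instance (value invented):

    import re

    item = 'htmldeentitize({"title": "Foo &amp; Bar"})'
    re.sub(r'\w+?\((.+)\)', r'\1', item)  # -> '{"title": "Foo &amp; Bar"}'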

View File

@@ -20,6 +20,7 @@ from .acast import (
)
from .addanime import AddAnimeIE
from .adn import ADNIE
from .adobeconnect import AdobeConnectIE
from .adobetv import (
AdobeTVIE,
AdobeTVShowIE,
@@ -106,6 +107,7 @@ from .behindkink import BehindKinkIE
from .bellmedia import BellMediaIE
from .beatport import BeatportIE
from .bet import BetIE
from .bfi import BFIPlayerIE
from .bigflix import BigflixIE
from .bild import BildIE
from .bilibili import (
@@ -175,7 +177,10 @@ from .cbsnews import (
CBSNewsLiveVideoIE,
)
from .cbssports import CBSSportsIE
from .ccc import CCCIE
from .ccc import (
CCCIE,
CCCPlaylistIE,
)
from .ccma import CCMAIE
from .cctv import CCTVIE
from .cda import CDAIE
@@ -192,6 +197,7 @@ from .chirbit import (
ChirbitProfileIE,
)
from .cinchcast import CinchcastIE
from .cinemax import CinemaxIE
from .ciscolive import (
CiscoLiveSessionIE,
CiscoLiveSearchIE,
@@ -281,10 +287,6 @@ from .dplay import (
DPlayIE,
DPlayItIE,
)
from .dramafever import (
DramaFeverIE,
DramaFeverSeriesIE,
)
from .dreisat import DreiSatIE
from .drbonanza import DRBonanzaIE
from .drtuber import DrTuberIE
@@ -440,10 +442,7 @@ from .goshgay import GoshgayIE
from .gputechconf import GPUTechConfIE
from .groupon import GrouponIE
from .hark import HarkIE
from .hbo import (
HBOIE,
HBOEpisodeIE,
)
from .hbo import HBOIE
from .hearthisat import HearThisAtIE
from .heise import HeiseIE
from .hellporno import HellPornoIE
@@ -632,7 +631,11 @@ from .massengeschmacktv import MassengeschmackTVIE
from .matchtv import MatchTVIE
from .mdr import MDRIE
from .mediaset import MediasetIE
from .mediasite import MediasiteIE
from .mediasite import (
MediasiteIE,
MediasiteCatalogIE,
MediasiteNamedCatalogIE,
)
from .medici import MediciIE
from .megaphone import MegaphoneIE
from .meipai import MeipaiIE
@@ -805,6 +808,8 @@ from .nrk import (
NRKTVSeasonIE,
NRKTVSeriesIE,
)
from .nrl import NRLTVIE
from .ntvcojp import NTVCoJpCUIE
from .ntvde import NTVDeIE
from .ntvru import NTVRuIE
from .nytimes import (
@@ -865,6 +870,10 @@ from .picarto import (
from .piksel import PikselIE
from .pinkbike import PinkbikeIE
from .pladform import PladformIE
from .platzi import (
PlatziIE,
PlatziCourseIE,
)
from .playfm import PlayFMIE
from .playplustv import PlayPlusTVIE
from .plays import PlaysTVIE
@@ -1086,7 +1095,12 @@ from .streamcloud import StreamcloudIE
from .streamcz import StreamCZIE
from .streetvoice import StreetVoiceIE
from .stretchinternet import StretchInternetIE
from .stv import STVPlayerIE
from .sunporno import SunPornoIE
from .sverigesradio import (
SverigesRadioEpisodeIE,
SverigesRadioPublicationIE,
)
from .svt import (
SVTIE,
SVTPageIE,
@@ -1114,6 +1128,7 @@ from .teachertube import (
)
from .teachingchannel import TeachingChannelIE
from .teamcoco import TeamcocoIE
from .teamtreehouse import TeamTreeHouseIE
from .techtalks import TechTalksIE
from .ted import TEDIE
from .tele5 import Tele5IE
@@ -1407,10 +1422,6 @@ from .weiqitv import WeiqiTVIE
from .wimp import WimpIE
from .wistia import WistiaIE
from .worldstarhiphop import WorldStarHipHopIE
from .wrzuta import (
WrzutaIE,
WrzutaPlaylistIE,
)
from .wsj import (
WSJIE,
WSJArticleIE,
@@ -1443,6 +1454,8 @@ from .xxxymovies import XXXYMoviesIE
from .yahoo import (
YahooIE,
YahooSearchIE,
YahooGyaOPlayerIE,
YahooGyaOIE,
)
from .yandexdisk import YandexDiskIE
from .yandexmusic import (

View File

@@ -4,12 +4,17 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..compat import (
compat_str,
compat_urllib_parse_unquote,
)
from ..utils import (
ExtractorError,
int_or_none,
str_or_none,
strip_or_none,
try_get,
urlencode_postdata,
)
@@ -46,6 +51,29 @@ class GaiaIE(InfoExtractor):
'skip_download': True,
},
}]
_NETRC_MACHINE = 'gaia'
_jwt = None
def _real_initialize(self):
auth = self._get_cookies('https://www.gaia.com/').get('auth')
if auth:
auth = self._parse_json(
compat_urllib_parse_unquote(auth.value),
None, fatal=False)
if not auth:
username, password = self._get_login_info()
if username is None:
return
auth = self._download_json(
'https://auth.gaia.com/v1/login',
None, data=urlencode_postdata({
'username': username,
'password': password
}))
if auth.get('success') is False:
raise ExtractorError(', '.join(auth['messages']), expected=True)
if auth:
self._jwt = auth.get('jwt')
def _real_extract(self, url):
display_id, vtype = re.search(self._VALID_URL, url).groups()
@@ -59,8 +87,12 @@ class GaiaIE(InfoExtractor):
media_id = compat_str(vdata['nid'])
title = node['title']
headers = None
if self._jwt:
headers = {'Authorization': 'Bearer ' + self._jwt}
media = self._download_json(
'https://brooklyn.gaia.com/media/' + media_id, media_id)
'https://brooklyn.gaia.com/media/' + media_id,
media_id, headers=headers)
formats = self._extract_m3u8_formats(
media['mediaUrls']['bcHLS'], media_id, 'mp4')
self._sort_formats(formats)

View File

@@ -3,22 +3,24 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .kaltura import KalturaIE
from ..utils import (
HEADRequest,
sanitized_Request,
smuggle_url,
urlencode_postdata,
)
class GDCVaultIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?gdcvault\.com/play/(?P<id>\d+)/(?P<name>(\w|-)+)?'
_VALID_URL = r'https?://(?:www\.)?gdcvault\.com/play/(?P<id>\d+)(?:/(?P<name>[\w-]+))?'
_NETRC_MACHINE = 'gdcvault'
_TESTS = [
{
'url': 'http://www.gdcvault.com/play/1019721/Doki-Doki-Universe-Sweet-Simple',
'md5': '7ce8388f544c88b7ac11c7ab1b593704',
'info_dict': {
'id': '1019721',
'id': '201311826596_AWNY',
'display_id': 'Doki-Doki-Universe-Sweet-Simple',
'ext': 'mp4',
'title': 'Doki-Doki Universe: Sweet, Simple and Genuine (GDC Next 10)'
@@ -27,7 +29,7 @@ class GDCVaultIE(InfoExtractor):
{
'url': 'http://www.gdcvault.com/play/1015683/Embracing-the-Dark-Art-of',
'info_dict': {
'id': '1015683',
'id': '201203272_1330951438328RSXR',
'display_id': 'Embracing-the-Dark-Art-of',
'ext': 'flv',
'title': 'Embracing the Dark Art of Mathematical Modeling in AI'
@@ -56,7 +58,7 @@ class GDCVaultIE(InfoExtractor):
'url': 'http://gdcvault.com/play/1023460/Tenacious-Design-and-The-Interface',
'md5': 'a8efb6c31ed06ca8739294960b2dbabd',
'info_dict': {
'id': '1023460',
'id': '840376_BQRC',
'ext': 'mp4',
'display_id': 'Tenacious-Design-and-The-Interface',
'title': 'Tenacious Design and The Interface of \'Destiny\'',
@@ -66,26 +68,38 @@ class GDCVaultIE(InfoExtractor):
# Multiple audios
'url': 'http://www.gdcvault.com/play/1014631/Classic-Game-Postmortem-PAC',
'info_dict': {
'id': '1014631',
'ext': 'flv',
'id': '12396_1299111843500GMPX',
'ext': 'mp4',
'title': 'How to Create a Good Game - From My Experience of Designing Pac-Man',
},
'params': {
'skip_download': True, # Requires rtmpdump
'format': 'jp', # The japanese audio
}
# 'params': {
# 'skip_download': True, # Requires rtmpdump
# 'format': 'jp', # The japanese audio
# }
},
{
# gdc-player.html
'url': 'http://www.gdcvault.com/play/1435/An-American-engine-in-Tokyo',
'info_dict': {
'id': '1435',
'id': '9350_1238021887562UHXB',
'display_id': 'An-American-engine-in-Tokyo',
'ext': 'flv',
'ext': 'mp4',
'title': 'An American Engine in Tokyo:/nThe collaboration of Epic Games and Square Enix/nFor THE LAST REMINANT',
},
},
{
# Kaltura Embed
'url': 'https://www.gdcvault.com/play/1026180/Mastering-the-Apex-of-Scaling',
'info_dict': {
'id': '0_h1fg8j3p',
'ext': 'mp4',
'title': 'Mastering the Apex of Scaling Game Servers (Presented by Multiplay)',
'timestamp': 1554401811,
'upload_date': '20190404',
'uploader_id': 'joe@blazestreaming.com',
},
'params': {
'skip_download': True, # Requires rtmpdump
'format': 'mp4-408',
},
},
]
@@ -114,10 +128,8 @@ class GDCVaultIE(InfoExtractor):
return start_page
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
display_id = mobj.group('name') or video_id
video_id, name = re.match(self._VALID_URL, url).groups()
display_id = name or video_id
webpage_url = 'http://www.gdcvault.com/play/' + video_id
start_page = self._download_webpage(webpage_url, display_id)
@@ -127,12 +139,12 @@ class GDCVaultIE(InfoExtractor):
start_page, 'url', default=None)
if direct_url:
title = self._html_search_regex(
r'<td><strong>Session Name</strong></td>\s*<td>(.*?)</td>',
r'<td><strong>Session Name:?</strong></td>\s*<td>(.*?)</td>',
start_page, 'title')
video_url = 'http://www.gdcvault.com' + direct_url
# resolve the url so that we can detect the correct extension
head = self._request_webpage(HEADRequest(video_url), video_id)
video_url = head.geturl()
video_url = self._request_webpage(
HEADRequest(video_url), video_id).geturl()
return {
'id': video_id,
@@ -141,34 +153,36 @@ class GDCVaultIE(InfoExtractor):
'title': title,
}
PLAYER_REGEX = r'<iframe src="(?P<xml_root>.+?)/(?:gdc-)?player.*?\.html.*?".*?</iframe>'
embed_url = KalturaIE._extract_url(start_page)
if embed_url:
embed_url = smuggle_url(embed_url, {'source_url': url})
ie_key = 'Kaltura'
else:
PLAYER_REGEX = r'<iframe src="(?P<xml_root>.+?)/(?:gdc-)?player.*?\.html.*?".*?</iframe>'
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root', default=None)
if xml_root is None:
# Probably need to authenticate
login_res = self._login(webpage_url, display_id)
if login_res is None:
self.report_warning('Could not log in.')
else:
start_page = login_res
# Grab the url from the authenticated page
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root')
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root', default=None)
if xml_root is None:
# Probably need to authenticate
login_res = self._login(webpage_url, display_id)
if login_res is None:
self.report_warning('Could not log in.')
else:
start_page = login_res
# Grab the url from the authenticated page
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root')
xml_name = self._html_search_regex(
r'<iframe src=".*?\?xml=(.+?\.xml).*?".*?</iframe>',
start_page, 'xml filename', default=None)
if xml_name is None:
# Fallback to the older format
xml_name = self._html_search_regex(
r'<iframe src=".*?\?xmlURL=xml/(?P<xml_file>.+?\.xml).*?".*?</iframe>',
r'<iframe src=".*?\?xml(?:=|URL=xml/)(.+?\.xml).*?".*?</iframe>',
start_page, 'xml filename')
embed_url = '%s/xml/%s' % (xml_root, xml_name)
ie_key = 'DigitallySpeaking'
return {
'_type': 'url_transparent',
'id': video_id,
'display_id': display_id,
'url': '%s/xml/%s' % (xml_root, xml_name),
'ie_key': 'DigitallySpeaking',
'url': embed_url,
'ie_key': ie_key,
}
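
The direct-URL branch above issues a HEAD request so the redirect target's suffix reveals the real container. A minimal stdlib sketch of the same trick (the function name is illustrative, not part of the extractor):

import urllib.request

def resolve_final_url(url):
    # urlopen follows redirects; geturl() is the landing URL, whose
    # extension (.mp4, .flv, ...) identifies the actual container.
    req = urllib.request.Request(url, method='HEAD')
    with urllib.request.urlopen(req) as resp:
        return resp.geturl()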


@@ -4,12 +4,12 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
xpath_text,
xpath_element,
int_or_none,
parse_duration,
urljoin,
)
@@ -53,10 +53,13 @@ class HBOBaseIE(InfoExtractor):
},
}
def _extract_from_id(self, video_id):
video_data = self._download_xml(
'http://render.lv3.hbo.com/data/content/global/videos/data/%s.xml' % video_id, video_id)
title = xpath_text(video_data, 'title', 'title', True)
def _extract_info(self, url, display_id):
video_data = self._download_xml(url, display_id)
video_id = xpath_text(video_data, 'id', fatal=True)
episode_title = title = xpath_text(video_data, 'title', fatal=True)
series = xpath_text(video_data, 'program')
if series:
title = '%s - %s' % (series, title)
formats = []
for source in xpath_element(video_data, 'videos', 'sources', True):
@@ -128,68 +131,45 @@ class HBOBaseIE(InfoExtractor):
'width': width,
})
subtitles = None
caption_url = xpath_text(video_data, 'captionUrl')
if caption_url:
subtitles = {
'en': [{
'url': caption_url,
'ext': 'ttml'
}],
}
return {
'id': video_id,
'title': title,
'duration': parse_duration(xpath_text(video_data, 'duration/tv14')),
'series': series,
'episode': episode_title,
'formats': formats,
'thumbnails': thumbnails,
'subtitles': subtitles,
}
class HBOIE(HBOBaseIE):
IE_NAME = 'hbo'
_VALID_URL = r'https?://(?:www\.)?hbo\.com/video/video\.html\?.*vid=(?P<id>[0-9]+)'
_VALID_URL = r'https?://(?:www\.)?hbo\.com/(?:video|embed)(?:/[^/]+)*/(?P<id>[^/?#]+)'
_TEST = {
'url': 'http://www.hbo.com/video/video.html?autoplay=true&g=u&vid=1437839',
'md5': '2c6a6bc1222c7e91cb3334dad1746e5a',
'url': 'https://www.hbo.com/video/game-of-thrones/seasons/season-8/videos/trailer',
'md5': '8126210656f433c452a21367f9ad85b3',
'info_dict': {
'id': '1437839',
'id': '22113301',
'ext': 'mp4',
'title': 'Ep. 64 Clip: Encryption',
'thumbnail': r're:https?://.*\.jpg$',
'duration': 1072,
}
'title': 'Game of Thrones - Trailer',
},
'expected_warnings': ['Unknown MIME type application/mp4 in DASH manifest'],
}
def _real_extract(self, url):
video_id = self._match_id(url)
return self._extract_from_id(video_id)
class HBOEpisodeIE(HBOBaseIE):
IE_NAME = 'hbo:episode'
_VALID_URL = r'https?://(?:www\.)?hbo\.com/(?P<path>(?!video)(?:(?:[^/]+/)+video|watch-free-episodes)/(?P<id>[0-9a-z-]+))(?:\.html)?'
_TESTS = [{
'url': 'http://www.hbo.com/girls/episodes/5/52-i-love-you-baby/video/ep-52-inside-the-episode.html?autoplay=true',
'md5': '61ead79b9c0dfa8d3d4b07ef4ac556fb',
'info_dict': {
'id': '1439518',
'display_id': 'ep-52-inside-the-episode',
'ext': 'mp4',
'title': 'Ep. 52: Inside the Episode',
'thumbnail': r're:https?://.*\.jpg$',
'duration': 240,
},
}, {
'url': 'http://www.hbo.com/game-of-thrones/about/video/season-5-invitation-to-the-set.html?autoplay=true',
'only_matching': True,
}, {
'url': 'http://www.hbo.com/watch-free-episodes/last-week-tonight-with-john-oliver',
'only_matching': True,
}]
def _real_extract(self, url):
path, display_id = re.match(self._VALID_URL, url).groups()
content = self._download_json(
'http://www.hbo.com/api/content/' + path, display_id)['content']
video_id = compat_str((content.get('parsed', {}).get(
'common:FullBleedVideo', {}) or content['selectedEpisode'])['videoId'])
info_dict = self._extract_from_id(video_id)
info_dict['display_id'] = display_id
return info_dict
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
location_path = self._parse_json(self._html_search_regex(
r'data-state="({.+?})"', webpage, 'state'), display_id)['video']['locationUrl']
return self._extract_info(urljoin(url, location_path), display_id)
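
The rewritten HBO extractor reads the player state straight from a data-state attribute. A rough standalone sketch of that step, assuming the attribute holds HTML-escaped JSON (the helper name is hypothetical):

import json
import re
from html import unescape

def extract_location_path(webpage):
    # data-state carries the page state as escaped JSON; the video
    # metadata URL sits under state['video']['locationUrl'].
    raw = re.search(r'data-state="({.+?})"', webpage).group(1)
    return json.loads(unescape(raw))['video']['locationUrl']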


@@ -1,36 +1,83 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
int_or_none,
strip_or_none,
xpath_attr,
xpath_text,
)
class InaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?ina\.fr/video/(?P<id>I?[A-Z0-9]+)'
_TEST = {
_VALID_URL = r'https?://(?:www\.)?ina\.fr/(?:video|audio)/(?P<id>[A-Z0-9_]+)'
_TESTS = [{
'url': 'http://www.ina.fr/video/I12055569/francois-hollande-je-crois-que-c-est-clair-video.html',
'md5': 'a667021bf2b41f8dc6049479d9bb38a3',
'info_dict': {
'id': 'I12055569',
'ext': 'mp4',
'title': 'François Hollande "Je crois que c\'est clair"',
'description': 'md5:3f09eb072a06cb286b8f7e4f77109663',
}
}
}, {
'url': 'https://www.ina.fr/video/S806544_001/don-d-organes-des-avancees-mais-d-importants-besoins-video.html',
'only_matching': True,
}, {
'url': 'https://www.ina.fr/audio/P16173408',
'only_matching': True,
}, {
'url': 'https://www.ina.fr/video/P16173408-video.html',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = self._match_id(url)
info_doc = self._download_xml(
'http://player.ina.fr/notices/%s.mrss' % video_id, video_id)
item = info_doc.find('channel/item')
title = xpath_text(item, 'title', fatal=True)
media_ns_xpath = lambda x: self._xpath_ns(x, 'http://search.yahoo.com/mrss/')
content = item.find(media_ns_xpath('content'))
video_id = mobj.group('id')
mrss_url = 'http://player.ina.fr/notices/%s.mrss' % video_id
info_doc = self._download_xml(mrss_url, video_id)
get_furl = lambda x: xpath_attr(content, media_ns_xpath(x), 'url')
formats = []
for q, w, h in (('bq', 400, 300), ('mq', 512, 384), ('hq', 768, 576)):
q_url = get_furl(q)
if not q_url:
continue
formats.append({
'format_id': q,
'url': q_url,
'width': w,
'height': h,
})
if not formats:
furl = get_furl('player') or content.attrib['url']
ext = determine_ext(furl)
formats = [{
'url': furl,
'vcodec': 'none' if ext == 'mp3' else None,
'ext': ext,
}]
self.report_extraction(video_id)
video_url = info_doc.find('.//{http://search.yahoo.com/mrss/}player').attrib['url']
thumbnails = []
for thumbnail in content.findall(media_ns_xpath('thumbnail')):
thumbnail_url = thumbnail.get('url')
if not thumbnail_url:
continue
thumbnails.append({
'url': thumbnail_url,
'height': int_or_none(thumbnail.get('height')),
'width': int_or_none(thumbnail.get('width')),
})
return {
'id': video_id,
'url': video_url,
'title': info_doc.find('.//title').text,
'formats': formats,
'title': title,
'description': strip_or_none(xpath_text(item, 'description')),
'thumbnails': thumbnails,
}
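
A short sketch of walking the MRSS notice with plain ElementTree, assuming the bq/mq/hq quality variants are children of media:content as the xpath_attr calls above imply:

import xml.etree.ElementTree as ET

MEDIA_NS = '{http://search.yahoo.com/mrss/}'

def mrss_quality_urls(mrss_xml):
    item = ET.fromstring(mrss_xml).find('channel/item')
    content = item.find(MEDIA_NS + 'content')
    urls = {}
    for q in ('bq', 'mq', 'hq'):
        el = content.find(MEDIA_NS + q)
        if el is not None and el.get('url'):
            urls[q] = el.get('url')
    return urls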


@@ -7,7 +7,7 @@ from .common import InfoExtractor
class JWPlatformIE(InfoExtractor):
_VALID_URL = r'(?:https?://(?:content\.jwplatform|cdn\.jwplayer)\.com/(?:(?:feed|player|thumb|preview|video|manifest)s|jw6|v2/media)/|jwplatform:)(?P<id>[a-zA-Z0-9]{8})'
_VALID_URL = r'(?:https?://(?:content\.jwplatform|cdn\.jwplayer)\.com/(?:(?:feed|player|thumb|preview|video)s|jw6|v2/media)/|jwplatform:)(?P<id>[a-zA-Z0-9]{8})'
_TESTS = [{
'url': 'http://content.jwplatform.com/players/nPripu9l-ALJ3XQCI.js',
'md5': 'fa8899fa601eb7c83a64e9d568bdf325',


@@ -145,6 +145,8 @@ class KalturaIE(InfoExtractor):
)
if mobj:
embed_info = mobj.groupdict()
for k, v in embed_info.items():
embed_info[k] = v.strip()
url = 'kaltura:%(partner_id)s:%(id)s' % embed_info
escaped_pid = re.escape(embed_info['partner_id'])
service_url = re.search(


@@ -9,11 +9,13 @@ from ..utils import (
float_or_none,
int_or_none,
urlencode_postdata,
urljoin,
)
class LinkedInLearningBaseIE(InfoExtractor):
_NETRC_MACHINE = 'linkedin'
_LOGIN_URL = 'https://www.linkedin.com/uas/login?trk=learning'
def _call_api(self, course_slug, fields, video_slug=None, resolution=None):
query = {
@@ -50,11 +52,10 @@ class LinkedInLearningBaseIE(InfoExtractor):
return
login_page = self._download_webpage(
'https://www.linkedin.com/uas/login?trk=learning',
None, 'Downloading login page')
action_url = self._search_regex(
self._LOGIN_URL, None, 'Downloading login page')
action_url = urljoin(self._LOGIN_URL, self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', login_page, 'post url',
default='https://www.linkedin.com/uas/login-submit', group='url')
default='https://www.linkedin.com/uas/login-submit', group='url'))
data = self._hidden_inputs(login_page)
data.update({
'session_key': email,


@@ -13,6 +13,8 @@ from ..utils import (
ExtractorError,
float_or_none,
mimetype2ext,
str_or_none,
try_get,
unescapeHTML,
unsmuggle_url,
url_or_none,
@@ -20,8 +22,11 @@ from ..utils import (
)
_ID_RE = r'(?:[0-9a-f]{32,34}|[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12,14})'
class MediasiteIE(InfoExtractor):
_VALID_URL = r'(?xi)https?://[^/]+/Mediasite/(?:Play|Showcase/(?:default|livebroadcast)/Presentation)/(?P<id>[0-9a-f]{32,34})(?P<query>\?[^#]+|)'
_VALID_URL = r'(?xi)https?://[^/]+/Mediasite/(?:Play|Showcase/(?:default|livebroadcast)/Presentation)/(?P<id>%s)(?P<query>\?[^#]+|)' % _ID_RE
_TESTS = [
{
'url': 'https://hitsmediaweb.h-its.org/mediasite/Play/2db6c271681e4f199af3c60d1f82869b1d',
@@ -93,6 +98,11 @@ class MediasiteIE(InfoExtractor):
'url': 'https://mediasite.ntnu.no/Mediasite/Showcase/default/Presentation/7d8b913259334b688986e970fae6fcb31d',
'only_matching': True,
},
{
# dashed id
'url': 'https://hitsmediaweb.h-its.org/mediasite/Play/2db6c271-681e-4f19-9af3-c60d1f82869b1d',
'only_matching': True,
}
]
# look in Mediasite.Core.js (Mediasite.ContentStreamType[*])
@@ -109,7 +119,7 @@ class MediasiteIE(InfoExtractor):
return [
unescapeHTML(mobj.group('url'))
for mobj in re.finditer(
r'(?xi)<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:(?:https?:)?//[^/]+)?/Mediasite/Play/[0-9a-f]{32,34}(?:\?.*?)?)\1',
r'(?xi)<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:(?:https?:)?//[^/]+)?/Mediasite/Play/%s(?:\?.*?)?)\1' % _ID_RE,
webpage)]
def _real_extract(self, url):
@@ -221,3 +231,136 @@ class MediasiteIE(InfoExtractor):
'formats': formats,
'thumbnails': thumbnails,
}
class MediasiteCatalogIE(InfoExtractor):
_VALID_URL = r'''(?xi)
(?P<url>https?://[^/]+/Mediasite)
/Catalog/Full/
(?P<catalog_id>{0})
(?:
/(?P<current_folder_id>{0})
/(?P<root_dynamic_folder_id>{0})
)?
'''.format(_ID_RE)
_TESTS = [{
'url': 'http://events7.mediasite.com/Mediasite/Catalog/Full/631f9e48530d454381549f955d08c75e21',
'info_dict': {
'id': '631f9e48530d454381549f955d08c75e21',
'title': 'WCET Summit: Adaptive Learning in Higher Ed: Improving Outcomes Dynamically',
},
'playlist_count': 6,
'expected_warnings': ['is not a supported codec'],
}, {
# with CurrentFolderId and RootDynamicFolderId
'url': 'https://medaudio.medicine.iu.edu/Mediasite/Catalog/Full/9518c4a6c5cf4993b21cbd53e828a92521/97a9db45f7ab47428c77cd2ed74bb98f14/9518c4a6c5cf4993b21cbd53e828a92521',
'info_dict': {
'id': '9518c4a6c5cf4993b21cbd53e828a92521',
'title': 'IUSM Family and Friends Sessions',
},
'playlist_count': 2,
}, {
'url': 'http://uipsyc.mediasite.com/mediasite/Catalog/Full/d5d79287c75243c58c50fef50174ec1b21',
'only_matching': True,
}, {
# no AntiForgeryToken
'url': 'https://live.libraries.psu.edu/Mediasite/Catalog/Full/8376d4b24dd1457ea3bfe4cf9163feda21',
'only_matching': True,
}, {
'url': 'https://medaudio.medicine.iu.edu/Mediasite/Catalog/Full/9518c4a6c5cf4993b21cbd53e828a92521/97a9db45f7ab47428c77cd2ed74bb98f14/9518c4a6c5cf4993b21cbd53e828a92521',
'only_matching': True,
}, {
# dashed id
'url': 'http://events7.mediasite.com/Mediasite/Catalog/Full/631f9e48-530d-4543-8154-9f955d08c75e',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
mediasite_url = mobj.group('url')
catalog_id = mobj.group('catalog_id')
current_folder_id = mobj.group('current_folder_id') or catalog_id
root_dynamic_folder_id = mobj.group('root_dynamic_folder_id')
webpage = self._download_webpage(url, catalog_id)
# AntiForgeryToken is optional (e.g. [1])
# 1. https://live.libraries.psu.edu/Mediasite/Catalog/Full/8376d4b24dd1457ea3bfe4cf9163feda21
anti_forgery_token = self._search_regex(
r'AntiForgeryToken\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
webpage, 'anti forgery token', default=None, group='value')
if anti_forgery_token:
anti_forgery_header = self._search_regex(
r'AntiForgeryHeaderName\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
webpage, 'anti forgery header name',
default='X-SOFO-AntiForgeryHeader', group='value')
data = {
'IsViewPage': True,
'IsNewFolder': True,
'AuthTicket': None,
'CatalogId': catalog_id,
'CurrentFolderId': current_folder_id,
'RootDynamicFolderId': root_dynamic_folder_id,
'ItemsPerPage': 1000,
'PageIndex': 0,
'PermissionMask': 'Execute',
'CatalogSearchType': 'SearchInFolder',
'SortBy': 'Date',
'SortDirection': 'Descending',
'StartDate': None,
'EndDate': None,
'StatusFilterList': None,
'PreviewKey': None,
'Tags': [],
}
headers = {
'Content-Type': 'application/json; charset=UTF-8',
'Referer': url,
'X-Requested-With': 'XMLHttpRequest',
}
if anti_forgery_token:
headers[anti_forgery_header] = anti_forgery_token
catalog = self._download_json(
'%s/Catalog/Data/GetPresentationsForFolder' % mediasite_url,
catalog_id, data=json.dumps(data).encode(), headers=headers)
entries = []
for video in catalog['PresentationDetailsList']:
if not isinstance(video, dict):
continue
video_id = str_or_none(video.get('Id'))
if not video_id:
continue
entries.append(self.url_result(
'%s/Play/%s' % (mediasite_url, video_id),
ie=MediasiteIE.ie_key(), video_id=video_id))
title = try_get(
catalog, lambda x: x['CurrentFolder']['Name'], compat_str)
return self.playlist_result(entries, catalog_id, title,)
class MediasiteNamedCatalogIE(InfoExtractor):
_VALID_URL = r'(?xi)(?P<url>https?://[^/]+/Mediasite)/Catalog/catalogs/(?P<catalog_name>[^/?#&]+)'
_TESTS = [{
'url': 'https://msite.misis.ru/Mediasite/Catalog/catalogs/2016-industrial-management-skriabin-o-o',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
mediasite_url = mobj.group('url')
catalog_name = mobj.group('catalog_name')
webpage = self._download_webpage(url, catalog_name)
catalog_id = self._search_regex(
r'CatalogId\s*:\s*["\'](%s)' % _ID_RE, webpage, 'catalog id')
return self.url_result(
'%s/Catalog/Full/%s' % (mediasite_url, catalog_id),
ie=MediasiteCatalogIE.ie_key(), video_id=catalog_id)
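
The catalog request built above condenses to one JSON POST against GetPresentationsForFolder, with the anti-forgery header attached only when the page exposed a token. A sketch under those assumptions (the default header name mirrors the extractor's fallback):

import json
import urllib.request

def get_presentations(mediasite_url, payload, token=None,
                      header_name='X-SOFO-AntiForgeryHeader'):
    headers = {
        'Content-Type': 'application/json; charset=UTF-8',
        'X-Requested-With': 'XMLHttpRequest',
    }
    if token:
        headers[header_name] = token
    req = urllib.request.Request(
        mediasite_url + '/Catalog/Data/GetPresentationsForFolder',
        data=json.dumps(payload).encode(), headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)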


@@ -1,22 +1,32 @@
# coding: utf-8
from __future__ import unicode_literals
import base64
import time
import uuid
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import int_or_none
from ..compat import (
compat_HTTPError,
compat_str,
)
from ..utils import (
ExtractorError,
int_or_none,
)
class MGTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?mgtv\.com/(v|b)/(?:[^/]+/)*(?P<id>\d+)\.html'
IE_DESC = '芒果TV'
_GEO_COUNTRIES = ['CN']
_TESTS = [{
'url': 'http://www.mgtv.com/v/1/290525/f/3116640.html',
'md5': 'b1ffc0fc163152acf6beaa81832c9ee7',
'info_dict': {
'id': '3116640',
'ext': 'mp4',
'title': '我是歌手第四季双年巅峰会:韩红李玟“双王”领军对抗',
'title': '我是歌手 第四季',
'description': '我是歌手第四季双年巅峰会',
'duration': 7461,
'thumbnail': r're:^https?://.*\.jpg$',
@@ -28,16 +38,30 @@ class MGTVIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
api_data = self._download_json(
'http://pcweb.api.mgtv.com/player/video', video_id,
query={'video_id': video_id},
headers=self.geo_verification_headers())['data']
try:
api_data = self._download_json(
'https://pcweb.api.mgtv.com/player/video', video_id, query={
'tk2': base64.urlsafe_b64encode(b'did=%s|pno=1030|ver=0.3.0301|clit=%d' % (compat_str(uuid.uuid4()).encode(), time.time()))[::-1],
'video_id': video_id,
}, headers=self.geo_verification_headers())['data']
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
error = self._parse_json(e.cause.read().decode(), None)
if error.get('code') == 40005:
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
raise ExtractorError(error['msg'], expected=True)
raise
info = api_data['info']
title = info['title'].strip()
stream_domain = api_data['stream_domain'][0]
stream_data = self._download_json(
'https://pcweb.api.mgtv.com/player/getSource', video_id, query={
'pm2': api_data['atc']['pm2'],
'video_id': video_id,
}, headers=self.geo_verification_headers())['data']
stream_domain = stream_data['stream_domain'][0]
formats = []
for idx, stream in enumerate(api_data['stream']):
for idx, stream in enumerate(stream_data['stream']):
stream_path = stream.get('url')
if not stream_path:
continue
@@ -47,7 +71,7 @@ class MGTVIE(InfoExtractor):
format_url = format_data.get('info')
if not format_url:
continue
tbr = int_or_none(self._search_regex(
tbr = int_or_none(stream.get('filebitrate') or self._search_regex(
r'_(\d+)_mp4/', format_url, 'tbr', default=None))
formats.append({
'format_id': compat_str(tbr or idx),
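
The tk2 parameter introduced above is reproducible on its own: a did/pno/ver/clit string, base64url-encoded, then reversed. A sketch (int() on the timestamp keeps the bytes %d portable):

import base64
import time
import uuid

def make_tk2():
    raw = b'did=%s|pno=1030|ver=0.3.0301|clit=%d' % (
        str(uuid.uuid4()).encode(), int(time.time()))
    # The API expects the base64url string reversed end to end.
    return base64.urlsafe_b64encode(raw)[::-1].decode()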


@@ -1,12 +1,17 @@
# coding: utf-8
from __future__ import unicode_literals
import re
import base64
import hashlib
from .common import InfoExtractor
from ..aes import aes_cbc_decrypt
from ..utils import (
ExtractorError,
bytes_to_intlist,
int_or_none,
intlist_to_bytes,
parse_codecs,
parse_duration,
)
@@ -14,7 +19,7 @@ class NewstubeIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?newstube\.ru/media/(?P<id>.+)'
_TEST = {
'url': 'http://www.newstube.ru/media/telekanal-cnn-peremestil-gorod-slavyansk-v-krym',
'md5': '801eef0c2a9f4089fa04e4fe3533abdc',
'md5': '9d10320ad473444352f72f746ccb8b8c',
'info_dict': {
'id': '728e0ef2-e187-4012-bac0-5a081fdcb1f6',
'ext': 'mp4',
@@ -25,84 +30,45 @@ class NewstubeIE(InfoExtractor):
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
page = self._download_webpage(url, video_id, 'Downloading page')
page = self._download_webpage(url, video_id)
title = self._html_search_meta(['og:title', 'twitter:title'], page, fatal=True)
video_guid = self._html_search_regex(
r'<meta property="og:video:url" content="https?://(?:www\.)?newstube\.ru/freshplayer\.swf\?guid=(?P<guid>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})',
r'<meta\s+property="og:video(?::(?:(?:secure_)?url|iframe))?"\s+content="https?://(?:www\.)?newstube\.ru/embed/(?P<guid>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})',
page, 'video GUID')
player = self._download_xml(
'http://p.newstube.ru/v2/player.asmx/GetAutoPlayInfo6?state=&url=%s&sessionId=&id=%s&placement=profile&location=n2' % (url, video_guid),
video_guid, 'Downloading player XML')
def ns(s):
return s.replace('/', '/%(ns)s') % {'ns': '{http://app1.newstube.ru/N2SiteWS/player.asmx}'}
error_message = player.find(ns('./ErrorMessage'))
if error_message is not None:
raise ExtractorError('%s returned error: %s' % (self.IE_NAME, error_message.text), expected=True)
session_id = player.find(ns('./SessionId')).text
media_info = player.find(ns('./Medias/MediaInfo'))
title = media_info.find(ns('./Name')).text
description = self._og_search_description(page)
thumbnail = media_info.find(ns('./KeyFrame')).text
duration = int(media_info.find(ns('./Duration')).text) / 1000.0
enc_data = base64.b64decode(self._download_webpage(
'https://www.newstube.ru/embed/api/player/getsources2',
video_guid, query={
'guid': video_guid,
'ff': 3,
}))
key = hashlib.pbkdf2_hmac(
'sha1', video_guid.replace('-', '').encode(), enc_data[:16], 1)[:16]
dec_data = aes_cbc_decrypt(
bytes_to_intlist(enc_data[32:]), bytes_to_intlist(key),
bytes_to_intlist(enc_data[16:32]))
sources = self._parse_json(intlist_to_bytes(dec_data[:-dec_data[-1]]), video_guid)
formats = []
for stream_info in media_info.findall(ns('./Streams/StreamInfo')):
media_location = stream_info.find(ns('./MediaLocation'))
if media_location is None:
for source in sources:
source_url = source.get('Src')
if not source_url:
continue
server = media_location.find(ns('./Server')).text
app = media_location.find(ns('./App')).text
media_id = stream_info.find(ns('./Id')).text
name = stream_info.find(ns('./Name')).text
width = int(stream_info.find(ns('./Width')).text)
height = int(stream_info.find(ns('./Height')).text)
formats.append({
'url': 'rtmp://%s/%s' % (server, app),
'app': app,
'play_path': '01/%s' % video_guid.upper(),
'rtmp_conn': ['S:%s' % session_id, 'S:%s' % media_id, 'S:n2'],
'page_url': url,
'ext': 'flv',
'format_id': 'rtmp' + ('-%s' % name if name else ''),
'width': width,
height = int_or_none(source.get('Height'))
f = {
'format_id': 'http' + ('-%dp' % height if height else ''),
'url': source_url,
'width': int_or_none(source.get('Width')),
'height': height,
})
sources_data = self._download_json(
'http://www.newstube.ru/player2/getsources?guid=%s' % video_guid,
video_guid, fatal=False)
if sources_data:
for source in sources_data.get('Sources', []):
source_url = source.get('Src')
if not source_url:
continue
height = int_or_none(source.get('Height'))
f = {
'format_id': 'http' + ('-%dp' % height if height else ''),
'url': source_url,
'width': int_or_none(source.get('Width')),
'height': height,
}
source_type = source.get('Type')
if source_type:
mobj = re.search(r'codecs="([^,]+),\s*([^"]+)"', source_type)
if mobj:
vcodec, acodec = mobj.groups()
f.update({
'vcodec': vcodec,
'acodec': acodec,
})
formats.append(f)
}
source_type = source.get('Type')
if source_type:
f.update(parse_codecs(self._search_regex(
r'codecs="([^"]+)"', source_type, 'codecs', fatal=False)))
formats.append(f)
self._check_formats(formats, video_guid)
self._sort_formats(formats)
@@ -110,8 +76,8 @@ class NewstubeIE(InfoExtractor):
return {
'id': video_guid,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'description': self._html_search_meta(['description', 'og:description'], page),
'thumbnail': self._html_search_meta(['og:image:secure_url', 'og:image', 'twitter:image'], page),
'duration': parse_duration(self._html_search_meta('duration', page)),
'formats': formats,
}
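
The new getsources2 flow decrypts its payload client-side. A sketch of the same scheme using pycryptodome in place of youtube-dl's pure-Python aes_cbc_decrypt, with the layout taken from the code above (16-byte salt, 16-byte IV, then ciphertext):

import base64
import hashlib

from Crypto.Cipher import AES  # pycryptodome, assumed for this sketch

def decrypt_sources(video_guid, b64_payload):
    enc = base64.b64decode(b64_payload)
    key = hashlib.pbkdf2_hmac(
        'sha1', video_guid.replace('-', '').encode(), enc[:16], 1)[:16]
    plain = AES.new(key, AES.MODE_CBC, enc[16:32]).decrypt(enc[32:])
    return plain[:-plain[-1]]  # strip PKCS#7 padding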


@@ -1,54 +1,81 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import ExtractorError
class NhkVodIE(InfoExtractor):
_VALID_URL = r'https?://www3\.nhk\.or\.jp/nhkworld/en/(?:vod|ondemand)/(?P<id>[^/]+/[^/?#&]+)'
_VALID_URL = r'https?://www3\.nhk\.or\.jp/nhkworld/(?P<lang>[a-z]{2})/ondemand/(?P<type>video|audio)/(?P<id>\d{7}|[a-z]+-\d{8}-\d+)'
# Content available only for a limited period of time. Visit
# https://www3.nhk.or.jp/nhkworld/en/ondemand/ for working samples.
_TESTS = [{
# Videos available only for a limited period of time. Visit
# http://www3.nhk.or.jp/nhkworld/en/vod/ for working samples.
'url': 'http://www3.nhk.or.jp/nhkworld/en/vod/tokyofashion/20160815',
'info_dict': {
'id': 'A1bnNiNTE6nY3jLllS-BIISfcC_PpvF5',
'ext': 'flv',
'title': 'TOKYO FASHION EXPRESS - The Kimono as Global Fashion',
'description': 'md5:db338ee6ce8204f415b754782f819824',
'series': 'TOKYO FASHION EXPRESS',
'episode': 'The Kimono as Global Fashion',
},
'skip': 'Videos available only for a limited period of time',
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2015173/',
'only_matching': True,
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/audio/plugin-20190404-1/',
'only_matching': True,
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/fr/ondemand/audio/plugin-20190404-1/',
'only_matching': True,
}]
_API_URL = 'http://api.nhk.or.jp/nhkworld/vodesdlist/v1/all/all/all.json?apikey=EJfK8jdS57GqlupFgAfAAwr573q01y6k'
_API_URL_TEMPLATE = 'https://api.nhk.or.jp/nhkworld/%sodesdlist/v7/episode/%s/%s/all%s.json'
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(self._API_URL, video_id)
try:
episode = next(
e for e in data['data']['episodes']
if e.get('url') and video_id in e['url'])
except StopIteration:
raise ExtractorError('Unable to find episode')
embed_code = episode['vod_id']
lang, m_type, episode_id = re.match(self._VALID_URL, url).groups()
if episode_id.isdigit():
episode_id = episode_id[:4] + '-' + episode_id[4:]
is_video = m_type == 'video'
episode = self._download_json(
self._API_URL_TEMPLATE % ('v' if is_video else 'r', episode_id, lang, '/all' if is_video else ''),
episode_id, query={'apikey': 'EJfK8jdS57GqlupFgAfAAwr573q01y6k'})['data']['episodes'][0]
title = episode.get('sub_title_clean') or episode['sub_title']
description = episode.get('description_clean') or episode.get('description')
series = episode.get('title_clean') or episode.get('title')
return {
'_type': 'url_transparent',
'ie_key': 'Ooyala',
'url': 'ooyala:%s' % embed_code,
def get_clean_field(key):
return episode.get(key + '_clean') or episode.get(key)
series = get_clean_field('title')
thumbnails = []
for s, w, h in [('', 640, 360), ('_l', 1280, 720)]:
img_path = episode.get('image' + s)
if not img_path:
continue
thumbnails.append({
'id': '%dp' % h,
'height': h,
'width': w,
'url': 'https://www3.nhk.or.jp' + img_path,
})
info = {
'id': episode_id + '-' + lang,
'title': '%s - %s' % (series, title) if series and title else title,
'description': description,
'description': get_clean_field('description'),
'thumbnails': thumbnails,
'series': series,
'episode': title,
}
if is_video:
info.update({
'_type': 'url_transparent',
'ie_key': 'Ooyala',
'url': 'ooyala:' + episode['vod_id'],
})
else:
audio = episode['audio']
audio_path = audio['audio']
info['formats'] = self._extract_m3u8_formats(
'https://nhks-vh.akamaihd.net/i%s/master.m3u8' % audio_path,
episode_id, 'm4a', m3u8_id='hls', fatal=False)
for proto in ('rtmpt', 'rtmp'):
info['formats'].append({
'ext': 'flv',
'format_id': proto,
'url': '%s://flv.nhk.or.jp/ondemand/mp4:flv%s' % (proto, audio_path),
'vcodec': 'none',
})
for f in info['formats']:
f['language'] = lang
return info
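
For reference, the episode lookup above reduces to a single templated URL; a hypothetical helper, with the apikey query parameter added separately as in the extractor:

def nhk_episode_api_url(episode_id, lang, is_video):
    # 7-digit video ids are reformatted as NNNN-NNN before the lookup.
    if episode_id.isdigit():
        episode_id = episode_id[:4] + '-' + episode_id[4:]
    return ('https://api.nhk.or.jp/nhkworld/%sodesdlist/v7/episode/%s/%s/all%s.json'
            % ('v' if is_video else 'r', episode_id, lang,
               '/all' if is_video else ''))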


@@ -181,10 +181,7 @@ class NPOIE(NPOBaseIE):
def _real_extract(self, url):
video_id = self._match_id(url)
try:
return self._get_info(url, video_id)
except ExtractorError:
return self._get_old_info(video_id)
return self._get_info(url, video_id) or self._get_old_info(video_id)
def _get_info(self, url, video_id):
token = self._download_json(
@@ -206,6 +203,7 @@ class NPOIE(NPOBaseIE):
player_token = player['token']
drm = False
format_urls = set()
formats = []
for profile in ('hls', 'dash-widevine', 'dash-playready', 'smooth'):
@@ -227,7 +225,8 @@ class NPOIE(NPOBaseIE):
if not stream_url or stream_url in format_urls:
continue
format_urls.add(stream_url)
if stream.get('protection') is not None:
if stream.get('protection') is not None or stream.get('keySystemOptions') is not None:
drm = True
continue
stream_type = stream.get('type')
stream_ext = determine_ext(stream_url)
@@ -246,6 +245,11 @@ class NPOIE(NPOBaseIE):
'url': stream_url,
})
if not formats:
if drm:
raise ExtractorError('This video is DRM protected.', expected=True)
return
self._sort_formats(formats)
info = {


@@ -0,0 +1,30 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class NRLTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?nrl\.com/tv(/[^/]+)*/(?P<id>[^/?&#]+)'
_TEST = {
'url': 'https://www.nrl.com/tv/news/match-highlights-titans-v-knights-862805/',
'info_dict': {
'id': 'YyNnFuaDE6kPJqlDhG4CGQ_w89mKTau4',
'ext': 'mp4',
'title': 'Match Highlights: Titans v Knights',
},
'params': {
# m3u8 download
'skip_download': True,
'format': 'bestvideo',
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
q_data = self._parse_json(self._search_regex(
r"(?s)q-data='({.+?})'", webpage, 'player data'), display_id)
ooyala_id = q_data['videoId']
return self.url_result(
'ooyala:' + ooyala_id, 'Ooyala', ooyala_id, q_data.get('title'))
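
A compact sketch of the NRL page scrape above, assuming q-data is a single-quoted JSON attribute:

import json
import re

def extract_ooyala_id(webpage):
    q_data = json.loads(re.search(
        r"(?s)q-data='({.+?})'", webpage).group(1))
    return q_data['videoId']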


@@ -0,0 +1,49 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
js_to_json,
smuggle_url,
)
class NTVCoJpCUIE(InfoExtractor):
IE_NAME = 'cu.ntv.co.jp'
IE_DESC = 'Nippon Television Network'
_VALID_URL = r'https?://cu\.ntv\.co\.jp/(?!program)(?P<id>[^/?&#]+)'
_TEST = {
'url': 'https://cu.ntv.co.jp/televiva-chill-gohan_181031/',
'info_dict': {
'id': '5978891207001',
'ext': 'mp4',
'title': '桜エビと炒り卵がポイント! 「中華風 エビチリおにぎり」──『美虎』五十嵐美幸',
'upload_date': '20181213',
'description': 'md5:211b52f4fd60f3e0e72b68b0c6ba52a9',
'uploader_id': '3855502814001',
'timestamp': 1544669941,
},
'params': {
# m3u8 download
'skip_download': True,
},
}
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s'
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
player_config = self._parse_json(self._search_regex(
r'(?s)PLAYER_CONFIG\s*=\s*({.+?})',
webpage, 'player config'), display_id, js_to_json)
video_id = player_config['videoId']
account_id = player_config.get('account') or '3855502814001'
return {
'_type': 'url_transparent',
'id': video_id,
'display_id': display_id,
'title': self._search_regex(r'<h1[^>]+class="title"[^>]*>([^<]+)', webpage, 'title').strip(),
'description': self._html_search_meta(['description', 'og:description'], webpage),
'url': smuggle_url(self.BRIGHTCOVE_URL_TEMPLATE % (account_id, video_id), {'geo_countries': ['JP']}),
'ie_key': 'BrightcoveNew',
}
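
The geo hint handed to the Brightcove extractor travels in the URL fragment. Roughly what youtube-dl's utils.smuggle_url does (a sketch, not the exact implementation):

import json
from urllib.parse import urlencode

def smuggle_url(url, data):
    # Stash extra hints (here {'geo_countries': ['JP']}) in the
    # fragment for the downstream extractor to unsmuggle.
    return url + '#' + urlencode({'__youtubedl_smuggle': json.dumps(data)})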


@@ -36,7 +36,7 @@ class OoyalaBaseIE(InfoExtractor):
'domain': domain,
'supportedFormats': supportedformats or 'mp4,rtmp,m3u8,hds,dash,smooth',
'embedToken': embed_token,
}), video_id)
}), video_id, headers=self.geo_verification_headers())
cur_auth_data = auth_data['authorization_data'][embed_code]

File diff suppressed because it is too large


@@ -0,0 +1,217 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_b64decode,
compat_str,
)
from ..utils import (
clean_html,
ExtractorError,
int_or_none,
str_or_none,
try_get,
url_or_none,
urlencode_postdata,
urljoin,
)
class PlatziIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
platzi\.com/clases| # es version
courses\.platzi\.com/classes # en version
)/[^/]+/(?P<id>\d+)-[^/?\#&]+
'''
_LOGIN_URL = 'https://platzi.com/login/'
_NETRC_MACHINE = 'platzi'
_TESTS = [{
'url': 'https://platzi.com/clases/1311-next-js/12074-creando-nuestra-primera-pagina/',
'md5': '8f56448241005b561c10f11a595b37e3',
'info_dict': {
'id': '12074',
'ext': 'mp4',
'title': 'Creando nuestra primera página',
'description': 'md5:4c866e45034fc76412fbf6e60ae008bc',
'duration': 420,
},
'skip': 'Requires platzi account credentials',
}, {
'url': 'https://courses.platzi.com/classes/1367-communication-codestream/13430-background/',
'info_dict': {
'id': '13430',
'ext': 'mp4',
'title': 'Background',
'description': 'md5:49c83c09404b15e6e71defaf87f6b305',
'duration': 360,
},
'skip': 'Requires platzi account credentials',
'params': {
'skip_download': True,
},
}]
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
if username is None:
return
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login page')
login_form = self._hidden_inputs(login_page)
login_form.update({
'email': username,
'password': password,
})
urlh = self._request_webpage(
self._LOGIN_URL, None, 'Logging in',
data=urlencode_postdata(login_form),
headers={'Referer': self._LOGIN_URL})
# login succeeded
if 'platzi.com/login' not in compat_str(urlh.geturl()):
return
login_error = self._webpage_read_content(
urlh, self._LOGIN_URL, None, 'Downloading login error page')
login = self._parse_json(
self._search_regex(
r'login\s*=\s*({.+?})(?:\s*;|\s*</script)', login_error, 'login'),
None)
for kind in ('error', 'password', 'nonFields'):
error = str_or_none(login.get('%sError' % kind))
if error:
raise ExtractorError(
'Unable to login: %s' % error, expected=True)
raise ExtractorError('Unable to log in')
def _real_extract(self, url):
lecture_id = self._match_id(url)
webpage = self._download_webpage(url, lecture_id)
data = self._parse_json(
self._search_regex(
r'client_data\s*=\s*({.+?})\s*;', webpage, 'client data'),
lecture_id)
material = data['initialState']['material']
desc = material['description']
title = desc['title']
formats = []
for server_id, server in material['videos'].items():
if not isinstance(server, dict):
continue
for format_id in ('hls', 'dash'):
format_url = url_or_none(server.get(format_id))
if not format_url:
continue
if format_id == 'hls':
formats.extend(self._extract_m3u8_formats(
format_url, lecture_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id=format_id,
note='Downloading %s m3u8 information' % server_id,
fatal=False))
elif format_id == 'dash':
formats.extend(self._extract_mpd_formats(
format_url, lecture_id, mpd_id=format_id,
note='Downloading %s MPD manifest' % server_id,
fatal=False))
self._sort_formats(formats)
content = str_or_none(desc.get('content'))
description = (clean_html(compat_b64decode(content).decode('utf-8'))
if content else None)
duration = int_or_none(material.get('duration'), invscale=60)
return {
'id': lecture_id,
'title': title,
'description': description,
'duration': duration,
'formats': formats,
}
class PlatziCourseIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
platzi\.com/clases| # es version
courses\.platzi\.com/classes # en version
)/(?P<id>[^/?\#&]+)
'''
_TESTS = [{
'url': 'https://platzi.com/clases/next-js/',
'info_dict': {
'id': '1311',
'title': 'Curso de Next.js',
},
'playlist_count': 22,
}, {
'url': 'https://courses.platzi.com/classes/communication-codestream/',
'info_dict': {
'id': '1367',
'title': 'Codestream Course',
},
'playlist_count': 14,
}]
@classmethod
def suitable(cls, url):
return False if PlatziIE.suitable(url) else super(PlatziCourseIE, cls).suitable(url)
def _real_extract(self, url):
course_name = self._match_id(url)
webpage = self._download_webpage(url, course_name)
props = self._parse_json(
self._search_regex(r'data\s*=\s*({.+?})\s*;', webpage, 'data'),
course_name)['initialProps']
entries = []
for chapter_num, chapter in enumerate(props['concepts'], 1):
if not isinstance(chapter, dict):
continue
materials = chapter.get('materials')
if not materials or not isinstance(materials, list):
continue
chapter_title = chapter.get('title')
chapter_id = str_or_none(chapter.get('id'))
for material in materials:
if not isinstance(material, dict):
continue
if material.get('material_type') != 'video':
continue
video_url = urljoin(url, material.get('url'))
if not video_url:
continue
entries.append({
'_type': 'url_transparent',
'url': video_url,
'title': str_or_none(material.get('name')),
'id': str_or_none(material.get('id')),
'ie_key': PlatziIE.ie_key(),
'chapter': chapter_title,
'chapter_number': chapter_num,
'chapter_id': chapter_id,
})
course_id = compat_str(try_get(props, lambda x: x['course']['id']))
course_title = try_get(props, lambda x: x['course']['name'], compat_str)
return self.playlist_result(entries, course_id, course_title)
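
The lecture description arrives base64-encoded; a rough stand-in for the compat_b64decode plus clean_html pair above (clean_html normalizes more than this simple tag strip):

import base64
import re
from html import unescape

def decode_description(content):
    html = base64.b64decode(content).decode('utf-8')
    return unescape(re.sub(r'<[^>]+>', '', html)).strip()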


@@ -14,6 +14,7 @@ from ..compat import (
)
from .openload import PhantomJSwrapper
from ..utils import (
determine_ext,
ExtractorError,
int_or_none,
orderedSet,
@@ -275,6 +276,10 @@ class PornHubIE(PornHubBaseIE):
r'/(\d{6}/\d{2})/', video_url, 'upload data', default=None)
if upload_date:
upload_date = upload_date.replace('/', '')
if determine_ext(video_url) == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
continue
tbr = None
mobj = re.search(r'(?P<height>\d+)[pP]?_(?P<tbr>\d+)[kK]', video_url)
if mobj:


@@ -7,6 +7,7 @@ from ..utils import (
ExtractorError,
int_or_none,
float_or_none,
url_or_none,
)
@@ -119,7 +120,7 @@ class RedditRIE(InfoExtractor):
'_type': 'url_transparent',
'url': video_url,
'title': data.get('title'),
'thumbnail': data.get('thumbnail'),
'thumbnail': url_or_none(data.get('thumbnail')),
'timestamp': float_or_none(data.get('created_utc')),
'uploader': data.get('author'),
'like_count': int_or_none(data.get('ups')),


@@ -21,7 +21,7 @@ from ..utils import (
class RTL2IE(InfoExtractor):
IE_NAME = 'rtl2'
_VALID_URL = r'http?://(?:www\.)?rtl2\.de/[^?#]*?/(?P<id>[^?#/]*?)(?:$|/(?:$|[?#]))'
_VALID_URL = r'https?://(?:www\.)?rtl2\.de/sendung/[^/]+/(?:video/(?P<vico_id>\d+)[^/]+/(?P<vivi_id>\d+)-|folge/)(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://www.rtl2.de/sendung/grip-das-motormagazin/folge/folge-203-0',
'info_dict': {
@@ -34,10 +34,11 @@ class RTL2IE(InfoExtractor):
# rtmp download
'skip_download': True,
},
'expected_warnings': ['Unable to download f4m manifest', 'Failed to download m3u8 information'],
}, {
'url': 'http://www.rtl2.de/sendung/koeln-50667/video/5512-anna/21040-anna-erwischt-alex/',
'info_dict': {
'id': '21040-anna-erwischt-alex',
'id': 'anna-erwischt-alex',
'ext': 'mp4',
'title': 'Anna erwischt Alex!',
'description': 'Anna nimmt ihrem Vater nicht ab, dass er nicht spielt. Und tatsächlich erwischt sie ihn auf frischer Tat.'
@@ -46,31 +47,29 @@ class RTL2IE(InfoExtractor):
# rtmp download
'skip_download': True,
},
'expected_warnings': ['Unable to download f4m manifest', 'Failed to download m3u8 information'],
}]
def _real_extract(self, url):
# Some rtl2 urls have no slash at the end, so append it.
if not url.endswith('/'):
url += '/'
vico_id, vivi_id, display_id = re.match(self._VALID_URL, url).groups()
if not vico_id:
webpage = self._download_webpage(url, display_id)
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
mobj = re.search(
r'<div[^>]+data-collection="(?P<vico_id>\d+)"[^>]+data-video="(?P<vivi_id>\d+)"',
webpage)
if mobj:
vico_id = mobj.group('vico_id')
vivi_id = mobj.group('vivi_id')
else:
vico_id = self._html_search_regex(
r'vico_id\s*:\s*([0-9]+)', webpage, 'vico_id')
vivi_id = self._html_search_regex(
r'vivi_id\s*:\s*([0-9]+)', webpage, 'vivi_id')
mobj = re.search(
r'data-collection="(?P<vico_id>\d+)"[^>]+data-video="(?P<vivi_id>\d+)"',
webpage)
if mobj:
vico_id = mobj.group('vico_id')
vivi_id = mobj.group('vivi_id')
else:
vico_id = self._html_search_regex(
r'vico_id\s*:\s*([0-9]+)', webpage, 'vico_id')
vivi_id = self._html_search_regex(
r'vivi_id\s*:\s*([0-9]+)', webpage, 'vivi_id')
info = self._download_json(
'http://www.rtl2.de/sites/default/modules/rtl2/mediathek/php/get_video_jw.php',
video_id, query={
'https://service.rtl2.de/api-player-vipo/video.php',
display_id, query={
'vico_id': vico_id,
'vivi_id': vivi_id,
})
@@ -89,7 +88,7 @@ class RTL2IE(InfoExtractor):
'format_id': 'rtmp',
'url': rtmp_url,
'play_path': stream_url,
'player_url': 'http://www.rtl2.de/flashplayer/vipo_player.swf',
'player_url': 'https://www.rtl2.de/sites/default/modules/rtl2/jwplayer/jwplayer-7.6.0/jwplayer.flash.swf',
'page_url': url,
'flash_version': 'LNX 11,2,202,429',
'rtmp_conn': rtmp_conn,
@@ -99,12 +98,12 @@ class RTL2IE(InfoExtractor):
m3u8_url = video_info.get('streamurl_hls')
if m3u8_url:
formats.extend(self._extract_akamai_formats(m3u8_url, video_id))
formats.extend(self._extract_akamai_formats(m3u8_url, display_id))
self._sort_formats(formats)
return {
'id': video_id,
'id': display_id,
'title': title,
'thumbnail': video_info.get('image'),
'description': video_info.get('beschreibung'),


@@ -59,6 +59,20 @@ class RuutuIE(InfoExtractor):
'url': 'http://www.ruutu.fi/video/3193728',
'only_matching': True,
},
{
# audio podcast
'url': 'https://www.supla.fi/supla/3382410',
'md5': 'b9d7155fed37b2ebf6021d74c4b8e908',
'info_dict': {
'id': '3382410',
'ext': 'mp3',
'title': 'Mikä ihmeen poltergeist?',
'description': 'md5:bbb6963df17dfd0ecd9eb9a61bf14b52',
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 0,
},
'expected_warnings': ['HTTP Error 502: Bad Gateway'],
}
]
def _real_extract(self, url):
@@ -94,6 +108,12 @@ class RuutuIE(InfoExtractor):
continue
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
elif ext == 'mp3' or child.tag == 'AudioMediaFile':
formats.append({
'format_id': 'audio',
'url': video_url,
'vcodec': 'none',
})
else:
proto = compat_urllib_parse_urlparse(video_url).scheme
if not child.tag.startswith('HTTP') and proto != 'rtmp':


@@ -65,7 +65,7 @@ class SixPlayIE(InfoExtractor):
for asset in assets:
asset_url = asset.get('full_physical_path')
protocol = asset.get('protocol')
if not asset_url or protocol == 'primetime' or asset.get('type') == 'usp_hlsfp_h264' or asset_url in urls:
if not asset_url or ((protocol == 'primetime' or asset.get('type') == 'usp_hlsfp_h264') and not ('_drmnp.ism/' in asset_url or '_unpnp.ism/' in asset_url)) or asset_url in urls:
continue
urls.append(asset_url)
container = asset.get('video_container')
@@ -82,6 +82,7 @@ class SixPlayIE(InfoExtractor):
if not urlh:
continue
asset_url = urlh.geturl()
asset_url = asset_url.replace('_drmnp.ism/', '_unpnp.ism/')
for i in range(3, 0, -1):
asset_url = asset_url.replace('_sd1/', '_sd%d/' % i)
m3u8_formats = self._extract_m3u8_formats(


@@ -15,7 +15,12 @@ from ..compat import (
)
from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
KNOWN_EXTENSIONS,
merge_dicts,
mimetype2ext,
str_or_none,
try_get,
unified_timestamp,
update_url_query,
@@ -57,7 +62,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'E.T. ExTerrestrial Music',
'timestamp': 1349920598,
'upload_date': '20121011',
'duration': 143,
'duration': 143.216,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
@@ -100,7 +105,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'jaimeMF',
'timestamp': 1386604920,
'upload_date': '20131209',
'duration': 9,
'duration': 9.927,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
@@ -120,7 +125,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'jaimeMF',
'timestamp': 1386604920,
'upload_date': '20131209',
'duration': 9,
'duration': 9.927,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
@@ -140,7 +145,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'oddsamples',
'timestamp': 1389232924,
'upload_date': '20140109',
'duration': 17,
'duration': 17.346,
'license': 'cc-by-sa',
'view_count': int,
'like_count': int,
@@ -160,7 +165,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'Ori Uplift Music',
'timestamp': 1504206263,
'upload_date': '20170831',
'duration': 7449,
'duration': 7449.096,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
@@ -180,7 +185,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'garyvee',
'timestamp': 1488152409,
'upload_date': '20170226',
'duration': 207,
'duration': 207.012,
'thumbnail': r're:https?://.*\.jpg',
'license': 'all-rights-reserved',
'view_count': int,
@@ -192,9 +197,31 @@ class SoundcloudIE(InfoExtractor):
'skip_download': True,
},
},
# not avaialble via api.soundcloud.com/i1/tracks/id/streams
{
'url': 'https://soundcloud.com/giovannisarani/mezzo-valzer',
'md5': 'e22aecd2bc88e0e4e432d7dcc0a1abf7',
'info_dict': {
'id': '583011102',
'ext': 'mp3',
'title': 'Mezzo Valzer',
'description': 'md5:4138d582f81866a530317bae316e8b61',
'uploader': 'Giovanni Sarani',
'timestamp': 1551394171,
'upload_date': '20190228',
'duration': 180.157,
'thumbnail': r're:https?://.*\.jpg',
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
},
'expected_warnings': ['Unable to download JSON metadata'],
}
]
_CLIENT_ID = 'NmW1FlPaiL94ueEu7oziOWjYEzZzQDcK'
_CLIENT_ID = 'FweeGBOOEOYJWLJN3oEyToGLKhmSz0I7'
@staticmethod
def _extract_urls(webpage):
@@ -202,10 +229,6 @@ class SoundcloudIE(InfoExtractor):
r'<iframe[^>]+src=(["\'])(?P<url>(?:https?://)?(?:w\.)?soundcloud\.com/player.+?)\1',
webpage)]
def report_resolve(self, video_id):
"""Report information extraction."""
self.to_screen('%s: Resolving id' % video_id)
@classmethod
def _resolv_url(cls, url):
return 'https://api.soundcloud.com/resolve.json?url=' + url + '&client_id=' + cls._CLIENT_ID
@@ -224,6 +247,10 @@ class SoundcloudIE(InfoExtractor):
def extract_count(key):
return int_or_none(info.get('%s_count' % key))
like_count = extract_count('favoritings')
if like_count is None:
like_count = extract_count('likes')
result = {
'id': track_id,
'uploader': username,
@@ -231,15 +258,17 @@ class SoundcloudIE(InfoExtractor):
'title': title,
'description': info.get('description'),
'thumbnail': thumbnail,
'duration': int_or_none(info.get('duration'), 1000),
'duration': float_or_none(info.get('duration'), 1000),
'webpage_url': info.get('permalink_url'),
'license': info.get('license'),
'view_count': extract_count('playback'),
'like_count': extract_count('favoritings'),
'like_count': like_count,
'comment_count': extract_count('comment'),
'repost_count': extract_count('reposts'),
'genre': info.get('genre'),
}
format_urls = set()
formats = []
query = {'client_id': self._CLIENT_ID}
if secret_token is not None:
@@ -248,6 +277,7 @@ class SoundcloudIE(InfoExtractor):
# We can build a direct link to the song
format_url = update_url_query(
'https://api.soundcloud.com/tracks/%s/download' % track_id, query)
format_urls.add(format_url)
formats.append({
'format_id': 'download',
'ext': info.get('original_format', 'mp3'),
@@ -256,44 +286,91 @@ class SoundcloudIE(InfoExtractor):
'preference': 10,
})
# We have to retrieve the url
# Old API, does not work for some tracks (e.g.
# https://soundcloud.com/giovannisarani/mezzo-valzer)
format_dict = self._download_json(
'https://api.soundcloud.com/i1/tracks/%s/streams' % track_id,
track_id, 'Downloading track url', query=query)
track_id, 'Downloading track url', query=query, fatal=False)
for key, stream_url in format_dict.items():
ext, abr = 'mp3', None
mobj = re.search(r'_([^_]+)_(\d+)_url', key)
if mobj:
ext, abr = mobj.groups()
abr = int(abr)
if key.startswith('http'):
stream_formats = [{
'format_id': key,
'ext': ext,
'url': stream_url,
}]
elif key.startswith('rtmp'):
# The url doesn't have an rtmp app, we have to extract the playpath
url, path = stream_url.split('mp3:', 1)
stream_formats = [{
'format_id': key,
'url': url,
'play_path': 'mp3:' + path,
'ext': 'flv',
}]
elif key.startswith('hls'):
stream_formats = self._extract_m3u8_formats(
stream_url, track_id, ext, entry_protocol='m3u8_native',
m3u8_id=key, fatal=False)
else:
if format_dict:
for key, stream_url in format_dict.items():
if stream_url in format_urls:
continue
format_urls.add(stream_url)
ext, abr = 'mp3', None
mobj = re.search(r'_([^_]+)_(\d+)_url', key)
if mobj:
ext, abr = mobj.groups()
abr = int(abr)
if key.startswith('http'):
stream_formats = [{
'format_id': key,
'ext': ext,
'url': stream_url,
}]
elif key.startswith('rtmp'):
# The url doesn't have an rtmp app, we have to extract the playpath
url, path = stream_url.split('mp3:', 1)
stream_formats = [{
'format_id': key,
'url': url,
'play_path': 'mp3:' + path,
'ext': 'flv',
}]
elif key.startswith('hls'):
stream_formats = self._extract_m3u8_formats(
stream_url, track_id, ext, entry_protocol='m3u8_native',
m3u8_id=key, fatal=False)
else:
continue
if abr:
for f in stream_formats:
f['abr'] = abr
formats.extend(stream_formats)
# New API
transcodings = try_get(
info, lambda x: x['media']['transcodings'], list) or []
for t in transcodings:
if not isinstance(t, dict):
continue
if abr:
for f in stream_formats:
f['abr'] = abr
formats.extend(stream_formats)
format_url = url_or_none(t.get('url'))
if not format_url:
continue
stream = self._download_json(
update_url_query(format_url, query), track_id, fatal=False)
if not isinstance(stream, dict):
continue
stream_url = url_or_none(stream.get('url'))
if not stream_url:
continue
if stream_url in format_urls:
continue
format_urls.add(stream_url)
protocol = try_get(t, lambda x: x['format']['protocol'], compat_str)
if protocol != 'hls' and '/hls' in format_url:
protocol = 'hls'
ext = None
preset = str_or_none(t.get('preset'))
if preset:
ext = preset.split('_')[0]
if ext not in KNOWN_EXTENSIONS:
mimetype = try_get(
t, lambda x: x['format']['mime_type'], compat_str)
ext = mimetype2ext(mimetype) or 'mp3'
format_id_list = []
if protocol:
format_id_list.append(protocol)
format_id_list.append(ext)
format_id = '_'.join(format_id_list)
formats.append({
'url': stream_url,
'format_id': format_id,
'ext': ext,
'protocol': 'm3u8_native' if protocol == 'hls' else 'http',
})
if not formats:
# We fallback to the stream_url in the original info, this
@@ -303,11 +380,11 @@ class SoundcloudIE(InfoExtractor):
'url': update_url_query(info['stream_url'], query),
'ext': 'mp3',
})
self._check_formats(formats, track_id)
for f in formats:
f['vcodec'] = 'none'
self._check_formats(formats, track_id)
self._sort_formats(formats)
result['formats'] = formats
@@ -319,6 +396,7 @@ class SoundcloudIE(InfoExtractor):
raise ExtractorError('Invalid URL: %s' % url)
track_id = mobj.group('track_id')
new_info = {}
if track_id is not None:
info_json_url = 'https://api.soundcloud.com/tracks/' + track_id + '.json?client_id=' + self._CLIENT_ID
@@ -344,13 +422,31 @@ class SoundcloudIE(InfoExtractor):
if token:
resolve_title += '/%s' % token
self.report_resolve(full_title)
webpage = self._download_webpage(url, full_title, fatal=False)
if webpage:
entries = self._parse_json(
self._search_regex(
r'var\s+c\s*=\s*(\[.+?\])\s*,\s*o\s*=Date\b', webpage,
'data', default='[]'), full_title, fatal=False)
if entries:
for e in entries:
if not isinstance(e, dict):
continue
if e.get('id') != 67:
continue
data = try_get(e, lambda x: x['data'][0], dict)
if data:
new_info = data
break
info_json_url = self._resolv_url(
'https://soundcloud.com/%s' % resolve_title)
url = 'https://soundcloud.com/%s' % resolve_title
info_json_url = self._resolv_url(url)
info = self._download_json(info_json_url, full_title, 'Downloading info JSON')
# Contains some additional info missing from new_info
info = self._download_json(
info_json_url, full_title, 'Downloading info JSON')
return self._extract_info_dict(info, full_title, secret_token=token)
return self._extract_info_dict(
merge_dicts(info, new_info), full_title, secret_token=token)
class SoundcloudPlaylistBaseIE(SoundcloudIE):
@@ -396,8 +492,6 @@ class SoundcloudSetIE(SoundcloudPlaylistBaseIE):
full_title += '/' + token
url += '/' + token
self.report_resolve(full_title)
resolv_url = self._resolv_url(url)
info = self._download_json(resolv_url, full_title)
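
How the new-API branch above names a format, distilled into a hypothetical helper; KNOWN_EXTENSIONS is a small stand-in for youtube-dl's full list, and the extractor first tries mimetype2ext before falling back to 'mp3':

KNOWN_EXTENSIONS = ('mp3', 'aac', 'm4a', 'ogg', 'opus', 'flac', 'wav')

def transcoding_format_id(t):
    fmt = t.get('format') or {}
    protocol = fmt.get('protocol')
    if protocol != 'hls' and '/hls' in (t.get('url') or ''):
        protocol = 'hls'
    ext = (t.get('preset') or '').split('_')[0]
    if ext not in KNOWN_EXTENSIONS:
        ext = 'mp3'
    return '_'.join(p for p in (protocol, ext) if p)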


@@ -14,7 +14,7 @@ from ..utils import (
class StreamangoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?:streamango\.com|fruithosts\.net)/(?:f|embed)/(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:www\.)?(?:streamango\.com|fruithosts\.net|streamcherry\.com)/(?:f|embed)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://streamango.com/f/clapasobsptpkdfe/20170315_150006_mp4',
'md5': 'e992787515a182f55e38fc97588d802a',
@@ -41,6 +41,9 @@ class StreamangoIE(InfoExtractor):
}, {
'url': 'https://fruithosts.net/f/mreodparcdcmspsm/w1f1_r4lph_2018_brrs_720p_latino_mp4',
'only_matching': True,
}, {
'url': 'https://streamcherry.com/f/clapasobsptpkdfe/',
'only_matching': True,
}]
def _real_extract(self, url):


@@ -0,0 +1,94 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urllib_parse_urlparse
)
from ..utils import (
extract_attributes,
float_or_none,
int_or_none,
str_or_none,
)
class STVPlayerIE(InfoExtractor):
IE_NAME = 'stv:player'
_VALID_URL = r'https?://player\.stv\.tv/(?P<type>episode|video)/(?P<id>[a-z0-9]{4})'
_TEST = {
'url': 'https://player.stv.tv/video/7srz/victoria/interview-with-the-cast-ahead-of-new-victoria/',
'md5': '2ad867d4afd641fa14187596e0fbc91b',
'info_dict': {
'id': '6016487034001',
'ext': 'mp4',
'upload_date': '20190321',
'title': 'Interview with the cast ahead of new Victoria',
'description': 'Nell Hudson and Lily Travers tell us what to expect in the new season of Victoria.',
'timestamp': 1553179628,
'uploader_id': '1486976045',
},
'skip': 'this resource is unavailable outside of the UK',
}
_PUBLISHER_ID = '1486976045'
_PTYPE_MAP = {
'episode': 'episodes',
'video': 'shortform',
}
def _real_extract(self, url):
ptype, video_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, video_id)
qs = compat_parse_qs(compat_urllib_parse_urlparse(self._search_regex(
r'itemprop="embedURL"[^>]+href="([^"]+)',
webpage, 'embed URL', default=None)).query)
publisher_id = qs.get('publisherID', [None])[0] or self._PUBLISHER_ID
player_attr = extract_attributes(self._search_regex(
r'(<[^>]+class="bcplayer"[^>]+>)', webpage, 'player', default=None)) or {}
info = {}
duration = ref_id = series = video_id = None
api_ref_id = player_attr.get('data-player-api-refid')
if api_ref_id:
resp = self._download_json(
'https://player.api.stv.tv/v1/%s/%s' % (self._PTYPE_MAP[ptype], api_ref_id),
api_ref_id, fatal=False)
if resp:
result = resp.get('results') or {}
video = result.get('video') or {}
video_id = str_or_none(video.get('id'))
ref_id = video.get('guid')
duration = video.get('length')
programme = result.get('programme') or {}
series = programme.get('name') or programme.get('shortName')
subtitles = {}
_subtitles = result.get('_subtitles') or {}
for ext, sub_url in _subtitles.items():
subtitles.setdefault('en', []).append({
'ext': 'vtt' if ext == 'webvtt' else ext,
'url': sub_url,
})
info.update({
'description': result.get('summary'),
'subtitles': subtitles,
'view_count': int_or_none(result.get('views')),
})
if not video_id:
video_id = qs.get('videoId', [None])[0] or self._search_regex(
r'<link\s+itemprop="url"\s+href="(\d+)"',
webpage, 'video id', default=None) or 'ref:' + (ref_id or player_attr['data-refid'])
info.update({
'_type': 'url_transparent',
'duration': float_or_none(duration or player_attr.get('data-duration'), 1000),
'id': video_id,
'ie_key': 'BrightcoveNew',
'series': series or player_attr.get('data-programme-name'),
'url': 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s' % (publisher_id, video_id),
})
return info


@@ -0,0 +1,115 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
determine_ext,
int_or_none,
str_or_none,
)
class SverigesRadioBaseIE(InfoExtractor):
_BASE_URL = 'https://sverigesradio.se/sida/playerajax/'
_QUALITIES = ['low', 'medium', 'high']
_EXT_TO_CODEC_MAP = {
'mp3': 'mp3',
'm4a': 'aac',
}
_CODING_FORMAT_TO_ABR_MAP = {
5: 128,
11: 192,
12: 32,
13: 96,
}
def _real_extract(self, url):
audio_id = self._match_id(url)
query = {
'id': audio_id,
'type': self._AUDIO_TYPE,
}
item = self._download_json(
self._BASE_URL + 'audiometadata', audio_id,
'Downloading audio JSON metadata', query=query)['items'][0]
title = item['subtitle']
query['format'] = 'iis'
urls = []
formats = []
for quality in self._QUALITIES:
query['quality'] = quality
audio_url_data = self._download_json(
self._BASE_URL + 'getaudiourl', audio_id,
'Downloading %s format JSON metadata' % quality,
fatal=False, query=query) or {}
audio_url = audio_url_data.get('audioUrl')
if not audio_url or audio_url in urls:
continue
urls.append(audio_url)
ext = determine_ext(audio_url)
coding_format = audio_url_data.get('codingFormat')
abr = int_or_none(self._search_regex(
r'_a(\d+)\.m4a', audio_url, 'audio bitrate',
default=None)) or self._CODING_FORMAT_TO_ABR_MAP.get(coding_format)
formats.append({
'abr': abr,
'acodec': self._EXT_TO_CODEC_MAP.get(ext),
'ext': ext,
'format_id': str_or_none(coding_format),
'vcodec': 'none',
'url': audio_url,
})
self._sort_formats(formats)
return {
'id': audio_id,
'title': title,
'formats': formats,
'series': item.get('title'),
'duration': int_or_none(item.get('duration')),
'thumbnail': item.get('displayimageurl'),
'description': item.get('description'),
}
class SverigesRadioPublicationIE(SverigesRadioBaseIE):
IE_NAME = 'sverigesradio:publication'
_VALID_URL = r'https?://(?:www\.)?sverigesradio\.se/sida/(?:artikel|gruppsida)\.aspx\?.*?\bartikel=(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://sverigesradio.se/sida/artikel.aspx?programid=83&artikel=7038546',
'md5': '6a4917e1923fccb080e5a206a5afa542',
'info_dict': {
'id': '7038546',
'ext': 'm4a',
'duration': 132,
'series': 'Nyheter (Ekot)',
'title': 'Esa Teittinen: Sanningen har inte kommit fram',
'description': 'md5:daf7ce66a8f0a53d5465a5984d3839df',
'thumbnail': r're:^https?://.*\.jpg',
},
}, {
'url': 'https://sverigesradio.se/sida/gruppsida.aspx?programid=3304&grupp=6247&artikel=7146887',
'only_matching': True,
}]
_AUDIO_TYPE = 'publication'
class SverigesRadioEpisodeIE(SverigesRadioBaseIE):
IE_NAME = 'sverigesradio:episode'
_VALID_URL = r'https?://(?:www\.)?sverigesradio\.se/(?:sida/)?avsnitt/(?P<id>[0-9]+)'
_TEST = {
'url': 'https://sverigesradio.se/avsnitt/1140922?programid=1300',
'md5': '20dc4d8db24228f846be390b0c59a07c',
'info_dict': {
'id': '1140922',
'ext': 'mp3',
'duration': 3307,
'series': 'Konflikt',
'title': 'Metoo och valen',
'description': 'md5:fcb5c1f667f00badcc702b196f10a27e',
'thumbnail': r're:^https?://.*\.jpg',
}
}
_AUDIO_TYPE = 'episode'
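
The bitrate guess in the base class above in one helper: prefer the '_aNNN.m4a' hint embedded in the URL, else fall back to the coding-format table:

import re

CODING_FORMAT_TO_ABR = {5: 128, 11: 192, 12: 32, 13: 96}

def guess_abr(audio_url, coding_format):
    m = re.search(r'_a(\d+)\.m4a', audio_url)
    return int(m.group(1)) if m else CODING_FORMAT_TO_ABR.get(coding_format)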


@@ -185,7 +185,7 @@ class SVTPlayIE(SVTPlayBaseIE):
def _extract_by_video_id(self, video_id, webpage=None):
data = self._download_json(
'https://api.svt.se/videoplayer-api/video/%s' % video_id,
'https://api.svt.se/video/%s' % video_id,
video_id, headers=self.geo_verification_headers())
info_dict = self._extract_video(data, video_id)
if not info_dict.get('title'):

View File

@@ -16,7 +16,7 @@ from ..utils import (
class TeamcocoIE(TurnerBaseIE):
_VALID_URL = r'https?://teamcoco\.com/(?P<id>([^/]+/)*[^/?#]+)'
_VALID_URL = r'https?://(?:\w+\.)?teamcoco\.com/(?P<id>([^/]+/)*[^/?#]+)'
_TESTS = [
{
'url': 'http://teamcoco.com/video/mary-kay-remote',
@@ -79,15 +79,20 @@ class TeamcocoIE(TurnerBaseIE):
}, {
'url': 'http://teamcoco.com/israel/conan-hits-the-streets-beaches-of-tel-aviv',
'only_matching': True,
}, {
'url': 'https://conan25.teamcoco.com/video/ice-cube-kevin-hart-conan-share-lyft',
'only_matching': True,
}
]
def _graphql_call(self, query_template, object_type, object_id):
find_object = 'find' + object_type
return self._download_json(
'http://teamcoco.com/graphql/', object_id, data=json.dumps({
'https://teamcoco.com/graphql', object_id, data=json.dumps({
'query': query_template % (find_object, object_id)
}))['data'][find_object]
}).encode(), headers={
'Content-Type': 'application/json',
})['data'][find_object]
def _real_extract(self, url):
display_id = self._match_id(url)
@@ -145,7 +150,12 @@ class TeamcocoIE(TurnerBaseIE):
'accessTokenType': 'jws',
}))
else:
video_sources = self._graphql_call('''{
d = self._download_json(
'https://teamcoco.com/_truman/d/' + video_id,
video_id, fatal=False) or {}
video_sources = d.get('meta') or {}
if not video_sources:
video_sources = self._graphql_call('''{
%s(id: "%s") {
src
}
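The GraphQL fix above amounts to POSTing the query as a JSON body with an explicit Content-Type header instead of the old GET-style call. Outside the extractor framework, the equivalent request looks roughly like this (a sketch only; the object id is hypothetical and the response shape is assumed from the diff):

import json
from urllib.request import Request, urlopen

query = '{ findVideo(id: "mary-kay-remote") { src } }'  # hypothetical query
req = Request(
    'https://teamcoco.com/graphql',
    data=json.dumps({'query': query}).encode(),  # body must be bytes
    headers={'Content-Type': 'application/json'})
video = json.load(urlopen(req))['data']['findVideo']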

View File

@@ -0,0 +1,140 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
clean_html,
determine_ext,
ExtractorError,
float_or_none,
get_element_by_class,
get_element_by_id,
parse_duration,
remove_end,
urlencode_postdata,
urljoin,
)
class TeamTreeHouseIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?teamtreehouse\.com/library/(?P<id>[^/]+)'
_TESTS = [{
# Course
'url': 'https://teamtreehouse.com/library/introduction-to-user-authentication-in-php',
'info_dict': {
'id': 'introduction-to-user-authentication-in-php',
'title': 'Introduction to User Authentication in PHP',
'description': 'md5:405d7b4287a159b27ddf30ca72b5b053',
},
'playlist_mincount': 24,
}, {
# WorkShop
'url': 'https://teamtreehouse.com/library/deploying-a-react-app',
'info_dict': {
'id': 'deploying-a-react-app',
'title': 'Deploying a React App',
'description': 'md5:10a82e3ddff18c14ac13581c9b8e5921',
},
'playlist_mincount': 4,
}, {
# Video
'url': 'https://teamtreehouse.com/library/application-overview-2',
'info_dict': {
'id': 'application-overview-2',
'ext': 'mp4',
'title': 'Application Overview',
'description': 'md5:4b0a234385c27140a4378de5f1e15127',
},
'expected_warnings': ['This is just a preview'],
}]
_NETRC_MACHINE = 'teamtreehouse'
def _real_initialize(self):
email, password = self._get_login_info()
if email is None:
return
signin_page = self._download_webpage(
'https://teamtreehouse.com/signin',
None, 'Downloading signin page')
data = self._form_hidden_inputs('new_user_session', signin_page)
data.update({
'user_session[email]': email,
'user_session[password]': password,
})
error_message = get_element_by_class('error-message', self._download_webpage(
'https://teamtreehouse.com/person_session',
None, 'Logging in', data=urlencode_postdata(data)))
if error_message:
raise ExtractorError(clean_html(error_message), expected=True)
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
title = self._html_search_meta(['og:title', 'twitter:title'], webpage)
description = self._html_search_meta(
['description', 'og:description', 'twitter:description'], webpage)
entries = self._parse_html5_media_entries(url, webpage, display_id)
if entries:
info = entries[0]
for subtitles in info.get('subtitles', {}).values():
for subtitle in subtitles:
subtitle['ext'] = determine_ext(subtitle['url'], 'srt')
is_preview = 'data-preview="true"' in webpage
if is_preview:
self.report_warning(
'This is just a preview. You need to be signed in with a Basic account to download the entire video.', display_id)
duration = 30
else:
duration = float_or_none(self._search_regex(
r'data-duration="(\d+)"', webpage, 'duration'), 1000)
if not duration:
duration = parse_duration(get_element_by_id(
'video-duration', webpage))
info.update({
'id': display_id,
'title': title,
'description': description,
'duration': duration,
})
return info
else:
def extract_urls(html, extract_info=None):
for path in re.findall(r'<a[^>]+href="([^"]+)"', html):
page_url = urljoin(url, path)
entry = {
'_type': 'url_transparent',
'id': self._match_id(page_url),
'url': page_url,
'ie_key': self.ie_key(),
}
if extract_info:
entry.update(extract_info)
entries.append(entry)
workshop_videos = self._search_regex(
r'(?s)<ul[^>]+id="workshop-videos"[^>]*>(.+?)</ul>',
webpage, 'workshop videos', default=None)
if workshop_videos:
extract_urls(workshop_videos)
else:
stages_path = self._search_regex(
r'(?s)<div[^>]+id="syllabus-stages"[^>]+data-url="([^"]+)"',
webpage, 'stages path')
if stages_path:
stages_page = self._download_webpage(
urljoin(url, stages_path), display_id, 'Downloading stages page')
for chapter_number, (chapter, steps_list) in enumerate(re.findall(r'(?s)<h2[^>]*>\s*(.+?)\s*</h2>.+?<ul[^>]*>(.+?)</ul>', stages_page), 1):
extract_urls(steps_list, {
'chapter': chapter,
'chapter_number': chapter_number,
})
title = remove_end(title, ' Course')
return self.playlist_result(
entries, display_id, title, description)
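The syllabus handling above pairs each <h2> chapter heading with the <ul> of step links that follows it. A reduced demonstration of that pattern against a hypothetical stages-page fragment:

import re

# Hypothetical stages-page fragment mirroring the markup matched above
stages_page = '''
<h2>Getting Started</h2>
<p>intro</p>
<ul><li><a href="/library/application-overview-2">Application Overview</a></li></ul>
<h2>Sessions</h2>
<ul><li><a href="/library/starting-a-session">Starting a Session</a></li></ul>
'''

for chapter_number, (chapter, steps_list) in enumerate(
        re.findall(r'(?s)<h2[^>]*>\s*(.+?)\s*</h2>.+?<ul[^>]*>(.+?)</ul>',
                   stages_page), 1):
    for path in re.findall(r'<a[^>]+href="([^"]+)"', steps_list):
        print(chapter_number, chapter, path)
# 1 Getting Started /library/application-overview-2
# 2 Sessions /library/starting-a-session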

View File

@@ -65,8 +65,15 @@ class TikTokBaseIE(InfoExtractor):
class TikTokIE(TikTokBaseIE):
_VALID_URL = r'https?://(?:m\.)?tiktok\.com/v/(?P<id>\d+)'
_TEST = {
_VALID_URL = r'''(?x)
https?://
(?:
(?:m\.)?tiktok\.com/v|
(?:www\.)?tiktok\.com/share/video
)
/(?P<id>\d+)
'''
_TESTS = [{
'url': 'https://m.tiktok.com/v/6606727368545406213.html',
'md5': 'd584b572e92fcd48888051f238022420',
'info_dict': {
@@ -81,25 +88,39 @@ class TikTokIE(TikTokBaseIE):
'comment_count': int,
'repost_count': int,
}
}
}, {
'url': 'https://www.tiktok.com/share/video/6606727368545406213',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
webpage = self._download_webpage(
'https://m.tiktok.com/v/%s.html' % video_id, video_id)
data = self._parse_json(self._search_regex(
r'\bdata\s*=\s*({.+?})\s*;', webpage, 'data'), video_id)
return self._extract_aweme(data)
class TikTokUserIE(TikTokBaseIE):
_VALID_URL = r'https?://(?:m\.)?tiktok\.com/h5/share/usr/(?P<id>\d+)'
_TEST = {
_VALID_URL = r'''(?x)
https?://
(?:
(?:m\.)?tiktok\.com/h5/share/usr|
(?:www\.)?tiktok\.com/share/user
)
/(?P<id>\d+)
'''
_TESTS = [{
'url': 'https://m.tiktok.com/h5/share/usr/188294915489964032.html',
'info_dict': {
'id': '188294915489964032',
},
'playlist_mincount': 24,
}
}, {
'url': 'https://www.tiktok.com/share/user/188294915489964032',
'only_matching': True,
}]
def _real_extract(self, url):
user_id = self._match_id(url)
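Both URL shapes accepted by the widened patterns above resolve to the same numeric id; a quick check with the verbose-mode video regex copied from the diff:

import re

_VALID_URL = r'''(?x)
    https?://
        (?:
            (?:m\.)?tiktok\.com/v|
            (?:www\.)?tiktok\.com/share/video
        )
        /(?P<id>\d+)
'''

for url in ('https://m.tiktok.com/v/6606727368545406213.html',
            'https://www.tiktok.com/share/video/6606727368545406213'):
    print(re.match(_VALID_URL, url).group('id'))
# prints 6606727368545406213 for both forms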

View File

@@ -66,7 +66,12 @@ class TouTvIE(RadioCanadaIE):
def _real_extract(self, url):
path = self._match_id(url)
metadata = self._download_json('http://ici.tou.tv/presentation/%s' % path, path)
metadata = self._download_json(
'https://services.radio-canada.ca/toutv/presentation/%s' % path, path, query={
'client_key': self._CLIENT_KEY,
'device': 'web',
'version': 4,
})
# IsDrm does not necessarily mean the video is DRM protected (see
# https://github.com/ytdl-org/youtube-dl/issues/13994).
if metadata.get('IsDrm'):
@@ -77,6 +82,12 @@ class TouTvIE(RadioCanadaIE):
return merge_dicts({
'id': video_id,
'title': details.get('OriginalTitle'),
'description': details.get('Description'),
'thumbnail': details.get('ImageUrl'),
'duration': int_or_none(details.get('LengthInSeconds')),
'series': metadata.get('ProgramTitle'),
'season_number': int_or_none(metadata.get('SeasonNumber')),
'season': metadata.get('SeasonTitle'),
'episode_number': int_or_none(metadata.get('EpisodeNumber')),
'episode': metadata.get('EpisodeTitle'),
}, self._extract_info(metadata.get('AppCode', 'toutv'), video_id))

View File

@@ -2,19 +2,20 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import urlencode_postdata
import re
class TwitCastingIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?twitcasting\.tv/(?P<uploader_id>[^/]+)/movie/(?P<id>\d+)'
_TEST = {
_TESTS = [{
'url': 'https://twitcasting.tv/ivetesangalo/movie/2357609',
'md5': '745243cad58c4681dc752490f7540d7f',
'info_dict': {
'id': '2357609',
'ext': 'mp4',
'title': 'Recorded Live #2357609',
'title': 'Live #2357609',
'uploader_id': 'ivetesangalo',
'description': "Moi! I'm live on TwitCasting from my iPhone.",
'thumbnail': r're:^https?://.*\.jpg$',
@@ -22,14 +23,34 @@ class TwitCastingIE(InfoExtractor):
'params': {
'skip_download': True,
},
}
}, {
'url': 'https://twitcasting.tv/mttbernardini/movie/3689740',
'info_dict': {
'id': '3689740',
'ext': 'mp4',
'title': 'Live playing something #3689740',
'uploader_id': 'mttbernardini',
'description': "I'm live on TwitCasting from my iPad. password: abc (Santa Marinella/Lazio, Italia)",
'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
'skip_download': True,
'videopassword': 'abc',
},
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
uploader_id = mobj.group('uploader_id')
webpage = self._download_webpage(url, video_id)
video_password = self._downloader.params.get('videopassword')
request_data = None
if video_password:
request_data = urlencode_postdata({
'password': video_password,
})
webpage = self._download_webpage(url, video_id, data=request_data)
title = self._html_search_regex(
r'(?s)<[^>]+id=["\']movietitle[^>]+>(.+?)</',
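The private-video support above works by re-requesting the movie page as a POST with the secret word form-encoded in the body, which switches the page into its unlocked state. A minimal sketch of that request (helper name is mine, not the extractor's):

from urllib.parse import urlencode
from urllib.request import Request, urlopen

def fetch_twitcasting_page(url, video_password=None):
    data = None
    if video_password:
        # Supplying a body turns the request into a POST, as in the diff
        data = urlencode({'password': video_password}).encode()
    return urlopen(Request(url, data=data)).read().decode('utf-8')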

View File

@@ -134,12 +134,12 @@ class TwitchBaseIE(InfoExtractor):
def _prefer_source(self, formats):
try:
source = next(f for f in formats if f['format_id'] == 'Source')
source['preference'] = 10
source['quality'] = 10
except StopIteration:
for f in formats:
if '/chunked/' in f['url']:
f.update({
'source_preference': 10,
'quality': 10,
'format_note': 'Source',
})
self._sort_formats(formats)
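The intent of swapping preference for quality above is that the Source rendition should outrank every transcode during format sorting, regardless of nominal bitrate. A toy model of that ordering rule (this simplifies youtube-dl's real _sort_formats considerably):

formats = [
    {'format_id': '720p60', 'tbr': 3500, 'quality': 0},
    {'format_id': 'Source', 'tbr': 3000, 'quality': 10},
]
# Higher 'quality' wins before bitrate is even considered
formats.sort(key=lambda f: (f['quality'], f['tbr']), reverse=True)
print([f['format_id'] for f in formats])  # ['Source', '720p60']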

View File

@@ -76,7 +76,10 @@ class UdemyIE(InfoExtractor):
webpage, 'course', default='{}')),
video_id, fatal=False) or {}
course_id = course.get('id') or self._search_regex(
r'data-course-id=["\'](\d+)', webpage, 'course id')
[
r'data-course-id=["\'](\d+)',
r'&quot;courseId&quot;\s*:\s*(\d+)'
], webpage, 'course id')
return course_id, course.get('title')
def _enroll_course(self, base_url, webpage, course_id):
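The added fallback pattern above covers pages where the course id only appears inside HTML-escaped JSON rather than in a data attribute. For example (the page fragment is hypothetical):

import re

webpage = 'window.__data = {&quot;courseId&quot;: 123456, &quot;title&quot;: &quot;x&quot;}'
course_id = re.search(r'&quot;courseId&quot;\s*:\s*(\d+)', webpage).group(1)
print(course_id)  # 123456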

View File

@@ -109,23 +109,9 @@ class VimeoBaseInfoExtractor(InfoExtractor):
def _parse_config(self, config, video_id):
video_data = config['video']
# Extract title
video_title = video_data['title']
# Extract uploader, uploader_url and uploader_id
video_uploader = video_data.get('owner', {}).get('name')
video_uploader_url = video_data.get('owner', {}).get('url')
video_uploader_id = video_uploader_url.split('/')[-1] if video_uploader_url else None
# Extract video thumbnail
video_thumbnail = video_data.get('thumbnail')
if video_thumbnail is None:
video_thumbs = video_data.get('thumbs')
if video_thumbs and isinstance(video_thumbs, dict):
_, video_thumbnail = sorted((int(width if width.isdigit() else 0), t_url) for (width, t_url) in video_thumbs.items())[-1]
# Extract video duration
video_duration = int_or_none(video_data.get('duration'))
live_event = video_data.get('live_event') or {}
is_live = live_event.get('status') == 'started'
formats = []
config_files = video_data.get('files') or config['request'].get('files', {})
@@ -142,6 +128,7 @@ class VimeoBaseInfoExtractor(InfoExtractor):
'tbr': int_or_none(f.get('bitrate')),
})
# TODO: fix handling of 308 status code returned for live archive manifest requests
for files_type in ('hls', 'dash'):
for cdn_name, cdn_data in config_files.get(files_type, {}).get('cdns', {}).items():
manifest_url = cdn_data.get('url')
@@ -151,7 +138,7 @@ class VimeoBaseInfoExtractor(InfoExtractor):
if files_type == 'hls':
formats.extend(self._extract_m3u8_formats(
manifest_url, video_id, 'mp4',
'm3u8_native', m3u8_id=format_id,
'm3u8' if is_live else 'm3u8_native', m3u8_id=format_id,
note='Downloading %s m3u8 information' % cdn_name,
fatal=False))
elif files_type == 'dash':
@@ -164,6 +151,10 @@ class VimeoBaseInfoExtractor(InfoExtractor):
else:
mpd_manifest_urls = [(format_id, manifest_url)]
for f_id, m_url in mpd_manifest_urls:
if 'json=1' in m_url:
real_m_url = (self._download_json(m_url, video_id, fatal=False) or {}).get('url')
if real_m_url:
m_url = real_m_url
mpd_formats = self._extract_mpd_formats(
m_url.replace('/master.json', '/master.mpd'), video_id, f_id,
'Downloading %s MPD information' % cdn_name,
@@ -175,6 +166,15 @@ class VimeoBaseInfoExtractor(InfoExtractor):
f['preference'] = -40
formats.extend(mpd_formats)
live_archive = live_event.get('archive') or {}
live_archive_source_url = live_archive.get('source_url')
if live_archive_source_url and live_archive.get('status') == 'done':
formats.append({
'format_id': 'live-archive-source',
'url': live_archive_source_url,
'preference': 1,
})
subtitles = {}
text_tracks = config['request'].get('text_tracks')
if text_tracks:
@@ -184,15 +184,33 @@ class VimeoBaseInfoExtractor(InfoExtractor):
'url': 'https://vimeo.com' + tt['url'],
}]
thumbnails = []
if not is_live:
for key, thumb in video_data.get('thumbs', {}).items():
thumbnails.append({
'id': key,
'width': int_or_none(key),
'url': thumb,
})
thumbnail = video_data.get('thumbnail')
if thumbnail:
thumbnails.append({
'url': thumbnail,
})
owner = video_data.get('owner') or {}
video_uploader_url = owner.get('url')
return {
'title': video_title,
'uploader': video_uploader,
'uploader_id': video_uploader_id,
'title': self._live_title(video_title) if is_live else video_title,
'uploader': owner.get('name'),
'uploader_id': video_uploader_url.split('/')[-1] if video_uploader_url else None,
'uploader_url': video_uploader_url,
'thumbnail': video_thumbnail,
'duration': video_duration,
'thumbnails': thumbnails,
'duration': int_or_none(video_data.get('duration')),
'formats': formats,
'subtitles': subtitles,
'is_live': is_live,
}
def _extract_original_format(self, url, video_id):
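One detail worth calling out in the hunks above: when a CDN manifest URL carries json=1, it no longer points at the manifest itself but at a JSON document whose url field holds the real location. A sketch of that indirection (the real extractor does this via _download_json with fatal=False):

import json
from urllib.request import urlopen

def resolve_mpd_url(m_url):
    if 'json=1' in m_url:
        # The response is JSON wrapping the actual manifest location
        real_m_url = json.load(urlopen(m_url)).get('url')
        if real_m_url:
            m_url = real_m_url
    # MPD formats are then read from the .mpd twin of the master.json URL
    return m_url.replace('/master.json', '/master.mpd')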

View File

@@ -6,10 +6,7 @@ import re
import sys
from .common import InfoExtractor
from ..compat import (
compat_str,
compat_urlparse,
)
from ..compat import compat_urlparse
from ..utils import (
clean_html,
ExtractorError,
@@ -103,7 +100,7 @@ class VKIE(VKBaseIE):
'url': 'http://vk.com/videos-77521?z=video-77521_162222515%2Fclub77521',
'md5': '7babad3b85ea2e91948005b1b8b0cb84',
'info_dict': {
'id': '162222515',
'id': '-77521_162222515',
'ext': 'mp4',
'title': 'ProtivoGunz - Хуёвая песня',
'uploader': 're:(?:Noize MC|Alexander Ilyashenko).*',
@@ -117,7 +114,7 @@ class VKIE(VKBaseIE):
'url': 'http://vk.com/video205387401_165548505',
'md5': '6c0aeb2e90396ba97035b9cbde548700',
'info_dict': {
'id': '165548505',
'id': '205387401_165548505',
'ext': 'mp4',
'title': 'No name',
'uploader': 'Tom Cruise',
@@ -132,7 +129,7 @@ class VKIE(VKBaseIE):
'url': 'http://vk.com/video_ext.php?oid=32194266&id=162925554&hash=7d8c2e0d5e05aeaa&hd=1',
'md5': 'c7ce8f1f87bec05b3de07fdeafe21a0a',
'info_dict': {
'id': '162925554',
'id': '32194266_162925554',
'ext': 'mp4',
'uploader': 'Vladimir Gavrin',
'title': 'Lin Dan',
@@ -149,7 +146,7 @@ class VKIE(VKBaseIE):
'md5': 'a590bcaf3d543576c9bd162812387666',
'note': 'Only available for registered users',
'info_dict': {
'id': '164049491',
'id': '-8871596_164049491',
'ext': 'mp4',
'uploader': 'Триллеры',
'title': '► Бойцовский клуб / Fight Club 1999 [HD 720]',
@@ -163,7 +160,7 @@ class VKIE(VKBaseIE):
'url': 'http://vk.com/hd_kino_mania?z=video-43215063_168067957%2F15c66b9b533119788d',
'md5': '4d7a5ef8cf114dfa09577e57b2993202',
'info_dict': {
'id': '168067957',
'id': '-43215063_168067957',
'ext': 'mp4',
'uploader': 'Киномания - лучшее из мира кино',
'title': ' ',
@@ -177,7 +174,7 @@ class VKIE(VKBaseIE):
'md5': '0c45586baa71b7cb1d0784ee3f4e00a6',
'note': 'ivi.ru embed',
'info_dict': {
'id': '60690',
'id': '-43215063_169084319',
'ext': 'mp4',
'title': 'Книга Илая',
'duration': 6771,
@@ -191,7 +188,7 @@ class VKIE(VKBaseIE):
'url': 'https://vk.com/video30481095_171201961?list=8764ae2d21f14088d4',
'md5': '091287af5402239a1051c37ec7b92913',
'info_dict': {
'id': '171201961',
'id': '30481095_171201961',
'ext': 'mp4',
'title': 'ТюменцевВВ_09.07.2015',
'uploader': 'Anton Ivanov',
@@ -206,10 +203,10 @@ class VKIE(VKBaseIE):
'url': 'https://vk.com/video276849682_170681728',
'info_dict': {
'id': 'V3K4mi0SYkc',
'ext': 'webm',
'ext': 'mp4',
'title': "DSWD Awards 'Children's Joy Foundation, Inc.' Certificate of Registration and License to Operate",
'description': 'md5:bf9c26cfa4acdfb146362682edd3827a',
'duration': 179,
'duration': 178,
'upload_date': '20130116',
'uploader': "Children's Joy Foundation Inc.",
'uploader_id': 'thecjf',
@@ -239,7 +236,7 @@ class VKIE(VKBaseIE):
'url': 'http://vk.com/video-110305615_171782105',
'md5': 'e13fcda136f99764872e739d13fac1d1',
'info_dict': {
'id': '171782105',
'id': '-110305615_171782105',
'ext': 'mp4',
'title': 'S-Dance, репетиции к The way show',
'uploader': 'THE WAY SHOW | 17 апреля',
@@ -254,14 +251,17 @@ class VKIE(VKBaseIE):
{
# finished live stream, postlive_mp4
'url': 'https://vk.com/videos-387766?z=video-387766_456242764%2Fpl_-387766_-2',
'md5': '90d22d051fccbbe9becfccc615be6791',
'info_dict': {
'id': '456242764',
'id': '-387766_456242764',
'ext': 'mp4',
'title': 'ИгроМир 2016 — день 1',
'title': 'ИгроМир 2016 День 1 — Игромания Утром',
'uploader': 'Игромания',
'duration': 5239,
'view_count': int,
# TODO: use act=show to extract view_count
# 'view_count': int,
'upload_date': '20160929',
'uploader_id': '-387766',
'timestamp': 1475137527,
},
},
{
@@ -465,7 +465,7 @@ class VKIE(VKBaseIE):
self._sort_formats(formats)
return {
'id': compat_str(data.get('vid') or video_id),
'id': video_id,
'formats': formats,
'title': title,
'thumbnail': data.get('jpg'),

View File

@@ -102,6 +102,15 @@ class VRVIE(VRVBaseIE):
# m3u8 download
'skip_download': True,
},
}, {
# movie listing
'url': 'https://vrv.co/watch/G6NQXZ1J6/Lily-CAT',
'info_dict': {
'id': 'G6NQXZ1J6',
'title': 'Lily C.A.T',
'description': 'md5:988b031e7809a6aeb60968be4af7db07',
},
'playlist_count': 2,
}]
_NETRC_MACHINE = 'vrv'
@@ -123,23 +132,23 @@ class VRVIE(VRVBaseIE):
def _extract_vrv_formats(self, url, video_id, stream_format, audio_lang, hardsub_lang):
if not url or stream_format not in ('hls', 'dash'):
return []
assert audio_lang or hardsub_lang
stream_id_list = []
if audio_lang:
stream_id_list.append('audio-%s' % audio_lang)
if hardsub_lang:
stream_id_list.append('hardsub-%s' % hardsub_lang)
stream_id = '-'.join(stream_id_list)
format_id = '%s-%s' % (stream_format, stream_id)
format_id = stream_format
if stream_id_list:
format_id += '-' + '-'.join(stream_id_list)
if stream_format == 'hls':
adaptive_formats = self._extract_m3u8_formats(
url, video_id, 'mp4', m3u8_id=format_id,
note='Downloading %s m3u8 information' % stream_id,
note='Downloading %s information' % format_id,
fatal=False)
elif stream_format == 'dash':
adaptive_formats = self._extract_mpd_formats(
url, video_id, mpd_id=format_id,
note='Downloading %s MPD information' % stream_id,
note='Downloading %s information' % format_id,
fatal=False)
if audio_lang:
for f in adaptive_formats:
@@ -150,10 +159,28 @@ class VRVIE(VRVBaseIE):
def _real_extract(self, url):
video_id = self._match_id(url)
episode_path = self._get_cms_resource(
'cms:/episodes/' + video_id, video_id)
video_data = self._call_cms(episode_path, video_id, 'video')
object_data = self._call_cms(self._get_cms_resource(
'cms:/objects/' + video_id, video_id), video_id, 'object')['items'][0]
resource_path = object_data['__links__']['resource']['href']
video_data = self._call_cms(resource_path, video_id, 'video')
title = video_data['title']
description = video_data.get('description')
if video_data.get('__class__') == 'movie_listing':
items = self._call_cms(
video_data['__links__']['movie_listing/movies']['href'],
video_id, 'movie listing').get('items') or []
if len(items) != 1:
entries = []
for item in items:
item_id = item.get('id')
if not item_id:
continue
entries.append(self.url_result(
'https://vrv.co/watch/' + item_id,
self.ie_key(), item_id, item.get('title')))
return self.playlist_result(entries, video_id, title, description)
video_data = items[0]
streams_path = video_data['__links__'].get('streams', {}).get('href')
if not streams_path:
@@ -197,7 +224,7 @@ class VRVIE(VRVBaseIE):
'formats': formats,
'subtitles': subtitles,
'thumbnails': thumbnails,
'description': video_data.get('description'),
'description': description,
'duration': float_or_none(video_data.get('duration_ms'), 1000),
'uploader_id': video_data.get('channel_id'),
'series': video_data.get('series_title'),
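The format_id change above makes the stream format the leading component and appends language parts only when present, instead of asserting that one of them exists. Reproduced in isolation:

def build_format_id(stream_format, audio_lang=None, hardsub_lang=None):
    stream_id_list = []
    if audio_lang:
        stream_id_list.append('audio-%s' % audio_lang)
    if hardsub_lang:
        stream_id_list.append('hardsub-%s' % hardsub_lang)
    format_id = stream_format
    if stream_id_list:
        format_id += '-' + '-'.join(stream_id_list)
    return format_id

print(build_format_id('hls', audio_lang='ja-JP'))  # hls-audio-ja-JP
print(build_format_id('dash'))                     # dash (no language parts)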

View File

@@ -19,7 +19,7 @@ from ..utils import (
class WeiboIE(InfoExtractor):
_VALID_URL = r'https?://weibo\.com/[0-9]+/(?P<id>[a-zA-Z0-9]+)'
_VALID_URL = r'https?://(?:www\.)?weibo\.com/[0-9]+/(?P<id>[a-zA-Z0-9]+)'
_TEST = {
'url': 'https://weibo.com/6275294458/Fp6RGfbff?type=comment',
'info_dict': {

View File

@@ -1,158 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
qualities,
remove_start,
)
class WrzutaIE(InfoExtractor):
IE_NAME = 'wrzuta.pl'
_VALID_URL = r'https?://(?P<uploader>[0-9a-zA-Z]+)\.wrzuta\.pl/(?P<typ>film|audio)/(?P<id>[0-9a-zA-Z]+)'
_TESTS = [{
'url': 'http://laboratoriumdextera.wrzuta.pl/film/aq4hIZWrkBu/nike_football_the_last_game',
'md5': '9e67e05bed7c03b82488d87233a9efe7',
'info_dict': {
'id': 'aq4hIZWrkBu',
'ext': 'mp4',
'title': 'Nike Football: The Last Game',
'duration': 307,
'uploader_id': 'laboratoriumdextera',
'description': 'md5:7fb5ef3c21c5893375fda51d9b15d9cd',
},
'skip': 'Redirected to wrzuta.pl',
}, {
'url': 'http://vexling.wrzuta.pl/audio/01xBFabGXu6/james_horner_-_into_the_na_39_vi_world_bonus',
'md5': 'f80564fb5a2ec6ec59705ae2bf2ba56d',
'info_dict': {
'id': '01xBFabGXu6',
'ext': 'mp3',
'title': 'James Horner - Into The Na\'vi World [Bonus]',
'description': 'md5:30a70718b2cd9df3120fce4445b0263b',
'duration': 95,
'uploader_id': 'vexling',
},
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
typ = mobj.group('typ')
uploader = mobj.group('uploader')
webpage, urlh = self._download_webpage_handle(url, video_id)
if urlh.geturl() == 'http://www.wrzuta.pl/':
raise ExtractorError('Video removed', expected=True)
quality = qualities(['SD', 'MQ', 'HQ', 'HD'])
audio_table = {'flv': 'mp3', 'webm': 'ogg', '???': 'mp3'}
embedpage = self._download_json('http://www.wrzuta.pl/npp/embed/%s/%s' % (uploader, video_id), video_id)
formats = []
for media in embedpage['url']:
fmt = media['type'].split('@')[0]
if typ == 'audio':
ext = audio_table.get(fmt, fmt)
else:
ext = fmt
formats.append({
'format_id': '%s_%s' % (ext, media['quality'].lower()),
'url': media['url'],
'ext': ext,
'quality': quality(media['quality']),
})
self._sort_formats(formats)
return {
'id': video_id,
'title': self._og_search_title(webpage),
'thumbnail': self._og_search_thumbnail(webpage),
'formats': formats,
'duration': int_or_none(embedpage['duration']),
'uploader_id': uploader,
'description': self._og_search_description(webpage),
'age_limit': embedpage.get('minimalAge', 0),
}
class WrzutaPlaylistIE(InfoExtractor):
"""
this class covers extraction of wrzuta playlist entries
the extraction process bases on following steps:
* collect information of playlist size
* download all entries provided on
the playlist webpage (the playlist is split
on two pages: first directly reached from webpage
second: downloaded on demand by ajax call and rendered
using the ajax call response)
* in case size of extracted entries not reached total number of entries
use the ajax call to collect the remaining entries
"""
IE_NAME = 'wrzuta.pl:playlist'
_VALID_URL = r'https?://(?P<uploader>[0-9a-zA-Z]+)\.wrzuta\.pl/playlista/(?P<id>[0-9a-zA-Z]+)'
_TESTS = [{
'url': 'http://miromak71.wrzuta.pl/playlista/7XfO4vE84iR/moja_muza',
'playlist_mincount': 14,
'info_dict': {
'id': '7XfO4vE84iR',
'title': 'Moja muza',
},
}, {
'url': 'http://heroesf70.wrzuta.pl/playlista/6Nj3wQHx756/lipiec_-_lato_2015_muzyka_swiata',
'playlist_mincount': 144,
'info_dict': {
'id': '6Nj3wQHx756',
'title': 'Lipiec - Lato 2015 Muzyka Świata',
},
}, {
'url': 'http://miromak71.wrzuta.pl/playlista/7XfO4vE84iR',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
playlist_id = mobj.group('id')
uploader = mobj.group('uploader')
webpage = self._download_webpage(url, playlist_id)
playlist_size = int_or_none(self._html_search_regex(
(r'<div[^>]+class=["\']playlist-counter["\'][^>]*>\d+/(\d+)',
r'<div[^>]+class=["\']all-counter["\'][^>]*>(.+?)</div>'),
webpage, 'playlist size', default=None))
playlist_title = remove_start(
self._og_search_title(webpage), 'Playlista: ')
entries = []
if playlist_size:
entries = [
self.url_result(entry_url)
for _, entry_url in re.findall(
r'<a[^>]+href=(["\'])(http.+?)\1[^>]+class=["\']playlist-file-page',
webpage)]
if playlist_size > len(entries):
playlist_content = self._download_json(
'http://%s.wrzuta.pl/xhr/get_playlist_offset/%s' % (uploader, playlist_id),
playlist_id,
'Downloading playlist JSON',
'Unable to download playlist JSON')
entries.extend([
self.url_result(entry['filelink'])
for entry in playlist_content.get('files', []) if entry.get('filelink')])
return self.playlist_result(entries, playlist_id, playlist_title)

View File

@@ -20,7 +20,7 @@ from ..utils import (
class XHamsterIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:.+?\.)?xhamster\.com/
(?:.+?\.)?xhamster\.(?:com|one)/
(?:
movies/(?P<id>\d+)/(?P<display_id>[^/]*)\.html|
videos/(?P<display_id_2>[^/]*)-(?P<id_2>\d+)
@@ -91,6 +91,9 @@ class XHamsterIE(InfoExtractor):
# new URL schema
'url': 'https://pt.xhamster.com/videos/euro-pedal-pumping-7937821',
'only_matching': True,
}, {
'url': 'https://xhamster.one/videos/femaleagent-shy-beauty-takes-the-bait-1509445',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -57,10 +57,17 @@ class XVideosIE(InfoExtractor):
webpage, 'title', default=None,
group='title') or self._og_search_title(webpage)
thumbnail = self._search_regex(
(r'setThumbUrl\(\s*(["\'])(?P<thumbnail>(?:(?!\1).)+)\1',
r'url_bigthumb=(?P<thumbnail>.+?)&amp'),
webpage, 'thumbnail', fatal=False, group='thumbnail')
thumbnails = []
for preference, thumbnail in enumerate(('', '169')):
thumbnail_url = self._search_regex(
r'setThumbUrl%s\(\s*(["\'])(?P<thumbnail>(?:(?!\1).)+)\1' % thumbnail,
webpage, 'thumbnail', default=None, group='thumbnail')
if thumbnail_url:
thumbnails.append({
'url': thumbnail_url,
'preference': preference,
})
duration = int_or_none(self._og_search_property(
'duration', webpage, default=None)) or parse_duration(
self._search_regex(
@@ -98,6 +105,6 @@ class XVideosIE(InfoExtractor):
'formats': formats,
'title': title,
'duration': duration,
'thumbnail': thumbnail,
'thumbnails': thumbnails,
'age_limit': 18,
}
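The thumbnail rework above probes both the default setThumbUrl(...) setter and the 16:9 setThumbUrl169(...) variant, giving the latter the higher preference. A self-contained run against a hypothetical page fragment:

import re

webpage = """setThumbUrl('https://example.com/thumb.jpg');
setThumbUrl169('https://example.com/thumb-169.jpg');"""

thumbnails = []
for preference, suffix in enumerate(('', '169')):
    mobj = re.search(
        r'setThumbUrl%s\(\s*(["\'])(?P<thumbnail>(?:(?!\1).)+)\1' % suffix,
        webpage)
    if mobj:
        thumbnails.append({'url': mobj.group('thumbnail'), 'preference': preference})
print(thumbnails)
# [{'url': 'https://example.com/thumb.jpg', 'preference': 0},
#  {'url': 'https://example.com/thumb-169.jpg', 'preference': 1}]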

View File

@@ -477,3 +477,77 @@ class YahooSearchIE(SearchInfoExtractor):
'id': query,
'entries': entries,
}
class YahooGyaOPlayerIE(InfoExtractor):
IE_NAME = 'yahoo:gyao:player'
_VALID_URL = r'https?://(?:gyao\.yahoo\.co\.jp/(?:player|episode/[^/]+)|streaming\.yahoo\.co\.jp/c/y)/(?P<id>\d+/v\d+/v\d+|[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
_TESTS = [{
'url': 'https://gyao.yahoo.co.jp/player/00998/v00818/v0000000000000008564/',
'info_dict': {
'id': '5993125228001',
'ext': 'mp4',
'title': 'フューリー 【字幕版】',
'description': 'md5:21e691c798a15330eda4db17a8fe45a5',
'uploader_id': '4235717419001',
'upload_date': '20190124',
'timestamp': 1548294365,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'https://streaming.yahoo.co.jp/c/y/01034/v00133/v0000000000000000706/',
'only_matching': True,
}, {
'url': 'https://gyao.yahoo.co.jp/episode/%E3%81%8D%E3%81%AE%E3%81%86%E4%BD%95%E9%A3%9F%E3%81%B9%E3%81%9F%EF%BC%9F%20%E7%AC%AC2%E8%A9%B1%202019%2F4%2F12%E6%94%BE%E9%80%81%E5%88%86/5cb02352-b725-409e-9f8d-88f947a9f682',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url).replace('/', ':')
video = self._download_json(
'https://gyao.yahoo.co.jp/dam/v1/videos/' + video_id,
video_id, query={
'fields': 'longDescription,title,videoId',
})
return {
'_type': 'url_transparent',
'id': video_id,
'title': video['title'],
'url': smuggle_url(
'http://players.brightcove.net/4235717419001/default_default/index.html?videoId=' + video['videoId'],
{'geo_countries': ['JP']}),
'description': video.get('longDescription'),
'ie_key': BrightcoveNewIE.ie_key(),
}
class YahooGyaOIE(InfoExtractor):
IE_NAME = 'yahoo:gyao'
_VALID_URL = r'https?://(?:gyao\.yahoo\.co\.jp/p|streaming\.yahoo\.co\.jp/p/y)/(?P<id>\d+/v\d+)'
_TESTS = [{
'url': 'https://gyao.yahoo.co.jp/p/00449/v03102/',
'info_dict': {
'id': '00449:v03102',
},
'playlist_count': 2,
}, {
'url': 'https://streaming.yahoo.co.jp/p/y/01034/v00133/',
'only_matching': True,
}]
def _real_extract(self, url):
program_id = self._match_id(url).replace('/', ':')
videos = self._download_json(
'https://gyao.yahoo.co.jp/api/programs/%s/videos' % program_id, program_id)['videos']
entries = []
for video in videos:
video_id = video.get('id')
if not video_id:
continue
entries.append(self.url_result(
'https://gyao.yahoo.co.jp/player/%s/' % video_id.replace(':', '/'),
YahooGyaOPlayerIE.ie_key(), video_id))
return self.playlist_result(entries, program_id)

View File

@@ -69,25 +69,28 @@ class YandexMusicTrackIE(YandexMusicBaseIE):
'skip': 'Travis CI servers blocked by YandexMusic',
}
def _get_track_url(self, storage_dir, track_id):
data = self._download_json(
'http://music.yandex.ru/api/v1.5/handlers/api-jsonp.jsx?action=getTrackSrc&p=download-info/%s'
% storage_dir,
track_id, 'Downloading track location JSON')
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
album_id, track_id = mobj.group('album_id'), mobj.group('id')
# Each string is now wrapped in a list; this is probably only temporary, so
# support both scenarios (see https://github.com/ytdl-org/youtube-dl/issues/10193)
for k, v in data.items():
if v and isinstance(v, list):
data[k] = v[0]
track = self._download_json(
'http://music.yandex.ru/handlers/track.jsx?track=%s:%s' % (track_id, album_id),
track_id, 'Downloading track JSON')['track']
track_title = track['title']
key = hashlib.md5(('XGRlBW9FXlekgbPrRHuSiA' + data['path'][1:] + data['s']).encode('utf-8')).hexdigest()
storage = storage_dir.split('.')
download_data = self._download_json(
'https://music.yandex.ru/api/v2.1/handlers/track/%s:%s/web-album_track-track-track-main/download/m' % (track_id, album_id),
track_id, 'Downloading track location url JSON',
headers={'X-Retpath-Y': url})
return ('http://%s/get-mp3/%s/%s?track-id=%s&from=service-10-track&similarities-experiment=default'
% (data['host'], key, data['ts'] + data['path'], storage[1]))
fd_data = self._download_json(
download_data['src'], track_id,
'Downloading track location JSON',
query={'format': 'json'})
key = hashlib.md5(('XGRlBW9FXlekgbPrRHuSiA' + fd_data['path'][1:] + fd_data['s']).encode('utf-8')).hexdigest()
storage = track['storageDir'].split('.')
f_url = 'http://%s/get-mp3/%s/%s?track-id=%s' % (fd_data['host'], key, fd_data['ts'] + fd_data['path'], storage[1])
def _get_track_info(self, track):
thumbnail = None
cover_uri = track.get('albums', [{}])[0].get('coverUri')
if cover_uri:
@@ -95,15 +98,16 @@ class YandexMusicTrackIE(YandexMusicBaseIE):
if not thumbnail.startswith('http'):
thumbnail = 'http://' + thumbnail
track_title = track['title']
track_info = {
'id': track['id'],
'id': track_id,
'ext': 'mp3',
'url': self._get_track_url(track['storageDir'], track['id']),
'url': f_url,
'filesize': int_or_none(track.get('fileSize')),
'duration': float_or_none(track.get('durationMs'), 1000),
'thumbnail': thumbnail,
'track': track_title,
'acodec': download_data.get('codec'),
'abr': int_or_none(download_data.get('bitrate')),
}
def extract_artist(artist_list):
@@ -131,18 +135,9 @@ class YandexMusicTrackIE(YandexMusicBaseIE):
})
else:
track_info['title'] = track_title
return track_info
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
album_id, track_id = mobj.group('album_id'), mobj.group('id')
track = self._download_json(
'http://music.yandex.ru/handlers/track.jsx?track=%s:%s' % (track_id, album_id),
track_id, 'Downloading track JSON')['track']
return self._get_track_info(track)
class YandexMusicPlaylistBaseIE(YandexMusicBaseIE):
def _build_playlist(self, tracks):
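The new download flow above resolves the track URL in two hops (the download-info endpoint, then the src it returns) and still signs the final path the same way: an md5 over a fixed salt, the path minus its leading character, and the s token. The signing step in isolation (input values are hypothetical):

import hashlib

def sign_track_path(path, s):
    # Salt string taken verbatim from the extractor code above
    return hashlib.md5(
        ('XGRlBW9FXlekgbPrRHuSiA' + path[1:] + s).encode('utf-8')).hexdigest()

print(sign_track_path('/rmusic/U2FsdGVk/track.mp3', '6e3c91'))  # 32 hex chars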

View File

@@ -8,8 +8,8 @@ from ..utils import (
class YourPornIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?yourporn\.sexy/post/(?P<id>[^/?#&.]+)'
_TEST = {
_VALID_URL = r'https?://(?:www\.)?(?:yourporn\.sexy|sxyprn\.com)/post/(?P<id>[^/?#&.]+)'
_TESTS = [{
'url': 'https://yourporn.sexy/post/57ffcb2e1179b.html',
'md5': '6f8682b6464033d87acaa7a8ff0c092e',
'info_dict': {
@@ -23,7 +23,10 @@ class YourPornIE(InfoExtractor):
'params': {
'skip_download': True,
},
}
}, {
'url': 'https://sxyprn.com/post/57ffcb2e1179b.html',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)

View File

@@ -27,6 +27,7 @@ from ..compat import (
)
from ..utils import (
clean_html,
dict_get,
error_to_compat_str,
ExtractorError,
float_or_none,
@@ -484,7 +485,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# RTMP (unnamed)
'_rtmp': {'protocol': 'rtmp'},
}
_SUBTITLE_FORMATS = ('ttml', 'vtt')
_SUBTITLE_FORMATS = ('srv1', 'srv2', 'srv3', 'ttml', 'vtt')
_GEO_BYPASS = False
@@ -908,6 +909,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'creator': 'Todd Haberman, Daniel Law Heath and Aaron Kaplan',
'track': 'Dark Walk - Position Music',
'artist': 'Todd Haberman, Daniel Law Heath and Aaron Kaplan',
'album': 'Position Music - Production Music Vol. 143 - Dark Walk',
},
'params': {
'skip_download': True,
@@ -1086,7 +1088,95 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'skip_download': True,
'youtube_include_dash_manifest': False,
},
}
},
{
# Youtube Music Auto-generated description
'url': 'https://music.youtube.com/watch?v=MgNrAu2pzNs',
'info_dict': {
'id': 'MgNrAu2pzNs',
'ext': 'mp4',
'title': 'Voyeur Girl',
'description': 'md5:7ae382a65843d6df2685993e90a8628f',
'upload_date': '20190312',
'uploader': 'Various Artists - Topic',
'uploader_id': 'UCVWKBi1ELZn0QX2CBLSkiyw',
'artist': 'Stephen',
'track': 'Voyeur Girl',
'album': 'it\'s too much love to know my dear',
'release_date': '20190313',
'release_year': 2019,
},
'params': {
'skip_download': True,
},
},
{
# Youtube Music Auto-generated description
# Retrieve 'artist' field from 'Artist:' in video description
# when it is present on youtube music video
'url': 'https://www.youtube.com/watch?v=k0jLE7tTwjY',
'info_dict': {
'id': 'k0jLE7tTwjY',
'ext': 'mp4',
'title': 'Latch Feat. Sam Smith',
'description': 'md5:3cb1e8101a7c85fcba9b4fb41b951335',
'upload_date': '20150110',
'uploader': 'Various Artists - Topic',
'uploader_id': 'UCNkEcmYdjrH4RqtNgh7BZ9w',
'artist': 'Disclosure',
'track': 'Latch Feat. Sam Smith',
'album': 'Latch Featuring Sam Smith',
'release_date': '20121008',
'release_year': 2012,
},
'params': {
'skip_download': True,
},
},
{
# Youtube Music Auto-generated description
# handle multiple artists on youtube music video
'url': 'https://www.youtube.com/watch?v=74qn0eJSjpA',
'info_dict': {
'id': '74qn0eJSjpA',
'ext': 'mp4',
'title': 'Eastside',
'description': 'md5:290516bb73dcbfab0dcc4efe6c3de5f2',
'upload_date': '20180710',
'uploader': 'Benny Blanco - Topic',
'uploader_id': 'UCzqz_ksRu_WkIzmivMdIS7A',
'artist': 'benny blanco, Halsey, Khalid',
'track': 'Eastside',
'album': 'Eastside',
'release_date': '20180713',
'release_year': 2018,
},
'params': {
'skip_download': True,
},
},
{
# Youtube Music Auto-generated description
# handle youtube music video with release_year and no release_date
'url': 'https://www.youtube.com/watch?v=-hcAI0g-f5M',
'info_dict': {
'id': '-hcAI0g-f5M',
'ext': 'mp4',
'title': 'Put It On Me',
'description': 'md5:93c55acc682ae7b0c668f2e34e1c069e',
'upload_date': '20180426',
'uploader': 'Matt Maeson - Topic',
'uploader_id': 'UCnEkIGqtGcQMLk73Kp-Q5LQ',
'artist': 'Matt Maeson',
'track': 'Put It On Me',
'album': 'The Hearse',
'release_date': None,
'release_year': 2018,
},
'params': {
'skip_download': True,
},
},
]
def __init__(self, *args, **kwargs):
@@ -1563,6 +1653,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def extract_view_count(v_info):
return int_or_none(try_get(v_info, lambda x: x['view_count'][0]))
def extract_token(v_info):
return dict_get(v_info, ('account_playback_token', 'accountPlaybackToken', 'token'))
player_response = {}
# Get video info
@@ -1622,7 +1715,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# The general idea is to take a union of itags of both DASH manifests (for example
# video with such 'manifest behavior' see https://github.com/ytdl-org/youtube-dl/issues/6093)
self.report_video_info_webpage_download(video_id)
for el in ('info', 'embedded', 'detailpage', 'vevo', ''):
for el in ('embedded', 'detailpage', 'vevo', ''):
query = {
'video_id': video_id,
'ps': 'default',
@@ -1652,7 +1745,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
view_count = extract_view_count(get_video_info)
if not video_info:
video_info = get_video_info
if 'token' in get_video_info:
get_token = extract_token(get_video_info)
if get_token:
# Different get_video_info requests may report different results, e.g.
# some may report video unavailability, but some may serve it without
# any complaint (see https://github.com/ytdl-org/youtube-dl/issues/7362,
@@ -1662,7 +1756,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# due to YouTube measures against IP ranges of hosting providers.
# Work around this by preferring the first successful video_info containing
# the token if no such video_info has been found yet.
if 'token' not in video_info:
token = extract_token(video_info)
if not token:
video_info = get_video_info
break
@@ -1671,26 +1766,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
r'(?s)<h1[^>]+id="unavailable-message"[^>]*>(.+?)</h1>',
video_webpage, 'unavailable message', default=None)
if 'token' not in video_info:
if 'reason' in video_info:
if 'The uploader has not made this video available in your country.' in video_info['reason']:
regions_allowed = self._html_search_meta(
'regionsAllowed', video_webpage, default=None)
countries = regions_allowed.split(',') if regions_allowed else None
self.raise_geo_restricted(
msg=video_info['reason'][0], countries=countries)
reason = video_info['reason'][0]
if 'Invalid parameters' in reason:
unavailable_message = extract_unavailable_message()
if unavailable_message:
reason = unavailable_message
raise ExtractorError(
'YouTube said: %s' % reason,
expected=True, video_id=video_id)
else:
raise ExtractorError(
'"token" parameter not in video info for unknown reason',
video_id=video_id)
if not video_info:
unavailable_message = extract_unavailable_message()
if not unavailable_message:
unavailable_message = 'Unable to extract video data'
raise ExtractorError(
'YouTube said: %s' % unavailable_message, expected=True, video_id=video_id)
if video_info.get('license_info'):
raise ExtractorError('This video is DRM protected.', expected=True)
@@ -2063,6 +2144,27 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
track = extract_meta('Song')
artist = extract_meta('Artist')
album = extract_meta('Album')
# Youtube Music Auto-generated description
release_date = release_year = None
if video_description:
mobj = re.search(r'(?s)Provided to YouTube by [^\n]+\n+(?P<track>[^·]+)·(?P<artist>[^\n]+)\n+(?P<album>[^\n]+)(?:.+?℗\s*(?P<release_year>\d{4})(?!\d))?(?:.+?Released on\s*:\s*(?P<release_date>\d{4}-\d{2}-\d{2}))?(.+?\nArtist\s*:\s*(?P<clean_artist>[^\n]+))?', video_description)
if mobj:
if not track:
track = mobj.group('track').strip()
if not artist:
artist = mobj.group('clean_artist') or ', '.join(a.strip() for a in mobj.group('artist').split('·'))
if not album:
album = mobj.group('album').strip()
release_year = mobj.group('release_year')
release_date = mobj.group('release_date')
if release_date:
release_date = release_date.replace('-', '')
if not release_year:
release_year = int(release_date[:4])
if release_year:
release_year = int(release_year)
m_episode = re.search(
r'<div[^>]+id="watch7-headline"[^>]*>\s*<span[^>]*>.*?>(?P<series>[^<]+)</a></b>\s*S(?P<season>\d+)\s*•\s*E(?P<episode>\d+)</span>',
@@ -2176,6 +2278,29 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if f.get('vcodec') != 'none':
f['stretched_ratio'] = ratio
if not formats:
token = extract_token(video_info)
if not token:
if 'reason' in video_info:
if 'The uploader has not made this video available in your country.' in video_info['reason']:
regions_allowed = self._html_search_meta(
'regionsAllowed', video_webpage, default=None)
countries = regions_allowed.split(',') if regions_allowed else None
self.raise_geo_restricted(
msg=video_info['reason'][0], countries=countries)
reason = video_info['reason'][0]
if 'Invalid parameters' in reason:
unavailable_message = extract_unavailable_message()
if unavailable_message:
reason = unavailable_message
raise ExtractorError(
'YouTube said: %s' % reason,
expected=True, video_id=video_id)
else:
raise ExtractorError(
'"token" parameter not in video info for unknown reason',
video_id=video_id)
self._sort_formats(formats)
self.mark_watched(video_id, video_info, player_response)
@@ -2216,6 +2341,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'episode_number': episode_number,
'track': track,
'artist': artist,
'album': album,
'release_date': release_date,
'release_year': release_year,
}
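The Music-metadata hunk above leans entirely on the fixed layout of auto-generated descriptions ('Provided to YouTube by ...', track · artist, album, ℗ year, 'Released on: ...'). Running a trimmed version of the regex from the hunk (the clean_artist fallback is omitted) against a hypothetical description shows which groups populate which fields:

import re

description = '''Provided to YouTube by SomeLabel

Voyeur Girl · Stephen

it's too much love to know my dear

℗ 2019 SomeLabel

Released on: 2019-03-13'''  # hypothetical auto-generated description

mobj = re.search(
    r'(?s)Provided to YouTube by [^\n]+\n+(?P<track>[^·]+)·(?P<artist>[^\n]+)\n+'
    r'(?P<album>[^\n]+)(?:.+?℗\s*(?P<release_year>\d{4})(?!\d))?'
    r'(?:.+?Released on\s*:\s*(?P<release_date>\d{4}-\d{2}-\d{2}))?',
    description)
print(mobj.group('track').strip())   # Voyeur Girl
print(mobj.group('artist').strip())  # Stephen
print(mobj.group('album').strip())   # it's too much love to know my dear
print(mobj.group('release_year'), mobj.group('release_date'))  # 2019 2019-03-13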

View File

@@ -1922,7 +1922,7 @@ def int_or_none(v, scale=1, default=None, get_attr=None, invscale=1):
return default
try:
return int(v) * invscale // scale
except ValueError:
except (ValueError, TypeError):
return default
@@ -1943,7 +1943,7 @@ def float_or_none(v, scale=1, invscale=1, default=None):
return default
try:
return float(v) * invscale / scale
except ValueError:
except (ValueError, TypeError):
return default
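The utils change above matters because int() and float() raise TypeError, not ValueError, for non-string, non-numeric inputs such as lists or dicts. A reduced int_or_none illustrating the difference (the real helper also supports scale, invscale and get_attr):

def int_or_none(v, default=None):
    if v is None:
        return default
    try:
        return int(v)
    except (ValueError, TypeError):
        return default

print(int_or_none('42'))    # 42
print(int_or_none('abc'))   # None (ValueError)
print(int_or_none([1, 2]))  # None (TypeError; previously uncaught)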

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2019.03.18'
__version__ = '2019.04.30'