mirror of
https://github.com/ytdl-org/youtube-dl
synced 2024-11-02 16:37:59 +09:00
Merge branch 'master' into mkvthumbnail
commit
2833a38215
6
.github/ISSUE_TEMPLATE/1_broken_site.md
vendored
@ -18,7 +18,7 @@ title: ''
 
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
-- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.11.21.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.02.10. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
 - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -26,7 +26,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->
 
 - [ ] I'm reporting a broken site support
-- [ ] I've verified that I'm running youtube-dl version **2020.11.21.1**
+- [ ] I've verified that I'm running youtube-dl version **2021.02.10**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar issues including closed ones
@ -41,7 +41,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2020.11.21.1
+[debug] youtube-dl version 2021.02.10
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
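Note on the version check in the template above: youtube-dl versions are dotted date strings with an optional trailing counter (for example `2020.11.21.1` and `2021.02.10`), so they can be compared numerically part by part. The snippet below is only a minimal sketch of that comparison; `parse_version` and `is_outdated` are hypothetical helper names and are not part of youtube-dl.

```python
def parse_version(text):
    """Turn a dotted version such as '2021.02.10' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_outdated(installed, expected):
    """Return True when the installed version sorts before the expected one."""
    return parse_version(installed) < parse_version(expected)


# The two versions named in the template change above.
print(is_outdated("2020.11.21.1", "2021.02.10"))  # True
print(is_outdated("2021.02.10", "2021.02.10"))    # False
```

Comparing tuples of integers avoids the pitfalls of plain string comparison when a release carries an extra counter such as `.1`.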
@ -19,7 +19,7 @@ labels: 'site-support-request'
 
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
-- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.11.21.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.02.10. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
 - Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->
 
 - [ ] I'm reporting a new site support request
-- [ ] I've verified that I'm running youtube-dl version **2020.11.21.1**
+- [ ] I've verified that I'm running youtube-dl version **2021.02.10**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that none of provided URLs violate any copyrights
 - [ ] I've searched the bugtracker for similar site support requests including closed ones
@ -18,13 +18,13 @@ title: ''
 
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
-- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.11.21.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.02.10. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
 - Finally, put x into all relevant boxes (like this [x])
 -->
 
 - [ ] I'm reporting a site feature request
-- [ ] I've verified that I'm running youtube-dl version **2020.11.21.1**
+- [ ] I've verified that I'm running youtube-dl version **2021.02.10**
 - [ ] I've searched the bugtracker for similar site feature requests including closed ones
 
 
6
.github/ISSUE_TEMPLATE/4_bug_report.md
vendored
@ -18,7 +18,7 @@ title: ''
 
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
-- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.11.21.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.02.10. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
 - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->
 
 - [ ] I'm reporting a broken site support issue
-- [ ] I've verified that I'm running youtube-dl version **2020.11.21.1**
+- [ ] I've verified that I'm running youtube-dl version **2021.02.10**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar bug reports including closed ones
@ -43,7 +43,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2020.11.21.1
+[debug] youtube-dl version 2021.02.10
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
4
.github/ISSUE_TEMPLATE/5_feature_request.md
vendored
@ -19,13 +19,13 @@ labels: 'request'
 
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
-- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.11.21.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.02.10. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
 - Finally, put x into all relevant boxes (like this [x])
 -->
 
 - [ ] I'm reporting a feature request
-- [ ] I've verified that I'm running youtube-dl version **2020.11.21.1**
+- [ ] I've verified that I'm running youtube-dl version **2021.02.10**
 - [ ] I've searched the bugtracker for similar feature requests including closed ones
 
 
4
.github/PULL_REQUEST_TEMPLATE.md
vendored
@ -7,8 +7,10 @@
 ---
 
 ### Before submitting a *pull request* make sure you have:
-- [ ] At least skimmed through [adding new extractor tutorial](https://github.com/ytdl-org/youtube-dl#adding-support-for-a-new-site) and [youtube-dl coding conventions](https://github.com/ytdl-org/youtube-dl#youtube-dl-coding-conventions) sections
 - [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests
+- [ ] Read [adding new extractor tutorial](https://github.com/ytdl-org/youtube-dl#adding-support-for-a-new-site)
+- [ ] Read [youtube-dl coding conventions](https://github.com/ytdl-org/youtube-dl#youtube-dl-coding-conventions) and adjusted the code to meet them
+- [ ] Covered the code with tests (note that PRs without tests will be REJECTED)
 - [ ] Checked the code with [flake8](https://pypi.python.org/pypi/flake8)
 
 ### In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check one of the following options:
74
.github/workflows/ci.yml
vendored
Normal file
@ -0,0 +1,74 @@
+name: CI
+on: [push, pull_request]
+jobs:
+  tests:
+    name: Tests
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: true
+      matrix:
+        os: [ubuntu-18.04]
+        # TODO: python 2.6
+        python-version: [2.7, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, pypy-2.7, pypy-3.6, pypy-3.7]
+        python-impl: [cpython]
+        ytdl-test-set: [core, download]
+        run-tests-ext: [sh]
+        include:
+        # python 3.2 is only available on windows via setup-python
+        - os: windows-latest
+          python-version: 3.2
+          python-impl: cpython
+          ytdl-test-set: core
+          run-tests-ext: bat
+        - os: windows-latest
+          python-version: 3.2
+          python-impl: cpython
+          ytdl-test-set: download
+          run-tests-ext: bat
+        # jython
+        - os: ubuntu-18.04
+          python-impl: jython
+          ytdl-test-set: core
+          run-tests-ext: sh
+        - os: ubuntu-18.04
+          python-impl: jython
+          ytdl-test-set: download
+          run-tests-ext: sh
+    steps:
+    - uses: actions/checkout@v2
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v2
+      if: ${{ matrix.python-impl == 'cpython' }}
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Set up Java 8
+      if: ${{ matrix.python-impl == 'jython' }}
+      uses: actions/setup-java@v1
+      with:
+        java-version: 8
+    - name: Install Jython
+      if: ${{ matrix.python-impl == 'jython' }}
+      run: |
+        wget http://search.maven.org/remotecontent?filepath=org/python/jython-installer/2.7.1/jython-installer-2.7.1.jar -O jython-installer.jar
+        java -jar jython-installer.jar -s -d "$HOME/jython"
+        echo "$HOME/jython/bin" >> $GITHUB_PATH
+    - name: Install nose
+      run: pip install nose
+    - name: Run tests
+      continue-on-error: ${{ matrix.ytdl-test-set == 'download' || matrix.python-impl == 'jython' }}
+      env:
+        YTDL_TEST_SET: ${{ matrix.ytdl-test-set }}
+      run: ./devscripts/run_tests.${{ matrix.run-tests-ext }}
+  flake8:
+    name: Linter
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v2
+    - name: Set up Python
+      uses: actions/setup-python@v2
+      with:
+        python-version: 3.9
+    - name: Install flake8
+      run: pip install flake8
+    - name: Run flake8
+      run: flake8 .
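Note on the `tests` job matrix in `.github/workflows/ci.yml`: the snippet below is a rough sketch of which jobs the matrix produces and which of them the `continue-on-error` expression allows to fail. It is a simplification under stated assumptions: each `include` entry is treated as one extra job (none of them matches a base combination), version numbers are written as strings, and `build_jobs` and `may_fail` are hypothetical names used only for illustration.

```python
from itertools import product

# Values copied from the workflow's matrix definition.
PYTHON_VERSIONS = ["2.7", "3.3", "3.4", "3.5", "3.6", "3.7", "3.8", "3.9",
                   "pypy-2.7", "pypy-3.6", "pypy-3.7"]
TEST_SETS = ["core", "download"]


def build_jobs():
    """Base matrix combinations plus the four `include` entries."""
    jobs = [
        {"os": "ubuntu-18.04", "python-version": version,
         "python-impl": "cpython", "ytdl-test-set": test_set,
         "run-tests-ext": "sh"}
        for version, test_set in product(PYTHON_VERSIONS, TEST_SETS)
    ]
    for test_set in TEST_SETS:
        # Python 3.2 entries, run on Windows.
        jobs.append({"os": "windows-latest", "python-version": "3.2",
                     "python-impl": "cpython", "ytdl-test-set": test_set,
                     "run-tests-ext": "bat"})
    for test_set in TEST_SETS:
        # Jython entries carry no python-version key in the workflow.
        jobs.append({"os": "ubuntu-18.04", "python-impl": "jython",
                     "ytdl-test-set": test_set, "run-tests-ext": "sh"})
    return jobs


def may_fail(job):
    """Mirror of the workflow's continue-on-error expression."""
    return job["ytdl-test-set"] == "download" or job["python-impl"] == "jython"


jobs = build_jobs()
blocking = [job for job in jobs if not may_fail(job)]
print(len(jobs), "jobs,", len(blocking), "blocking")  # 26 jobs, 12 blocking
```

Under these assumptions only the CPython `core` runs can fail the build, which matches the role `allow_failures` played in the removed `.travis.yml`.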
50
.travis.yml
@ -1,50 +0,0 @@
-language: python
-python:
-  - "2.6"
-  - "2.7"
-  - "3.2"
-  - "3.3"
-  - "3.4"
-  - "3.5"
-  - "3.6"
-  - "pypy"
-  - "pypy3"
-dist: trusty
-env:
-  - YTDL_TEST_SET=core
-  - YTDL_TEST_SET=download
-jobs:
-  include:
-    - python: 3.7
-      dist: xenial
-      env: YTDL_TEST_SET=core
-    - python: 3.7
-      dist: xenial
-      env: YTDL_TEST_SET=download
-    - python: 3.8
-      dist: xenial
-      env: YTDL_TEST_SET=core
-    - python: 3.8
-      dist: xenial
-      env: YTDL_TEST_SET=download
-    - python: 3.8-dev
-      dist: xenial
-      env: YTDL_TEST_SET=core
-    - python: 3.8-dev
-      dist: xenial
-      env: YTDL_TEST_SET=download
-    - env: JYTHON=true; YTDL_TEST_SET=core
-    - env: JYTHON=true; YTDL_TEST_SET=download
-    - name: flake8
-      python: 3.8
-      dist: xenial
-      install: pip install flake8
-      script: flake8 .
-  fast_finish: true
-  allow_failures:
-    - env: YTDL_TEST_SET=download
-    - env: JYTHON=true; YTDL_TEST_SET=core
-    - env: JYTHON=true; YTDL_TEST_SET=download
-before_install:
-  - if [ "$JYTHON" == "true" ]; then ./devscripts/install_jython.sh; export PATH="$HOME/jython/bin:$PATH"; fi
-script: ./devscripts/run_tests.sh
1
AUTHORS
@ -246,3 +246,4 @@ Enes Solak
 Nathan Rossi
 Thomas van der Berg
 Luca Cherubin
+Adrian Heine
485
ChangeLog
@ -1,3 +1,488 @@
+version 2021.02.10
+
+Extractors
+* [youtube:tab] Improve grid continuation extraction (#28130)
+* [ign] Fix extraction (#24771)
++ [xhamster] Extract format filesize
++ [xhamster] Extract formats from xplayer settings (#28114)
++ [youtube] Add support phone/tablet JS player (#26424)
+* [archiveorg] Fix and improve extraction (#21330, #23586, #25277, #26780,
+  #27109, #27236, #28063)
++ [cda] Detect geo restricted videos (#28106)
+* [urplay] Fix extraction (#28073, #28074)
+* [youtube] Fix release date extraction (#28094)
++ [youtube] Extract abr and vbr (#28100)
+* [youtube] Skip OTF formats (#28070)
+
+
+version 2021.02.04.1
+
+Extractors
+* [youtube] Prefer DASH formats (#28070)
+* [azmedien] Fix extraction (#28064)
+
+
+version 2021.02.04
+
+Extractors
+* [pornhub] Implement lazy playlist extraction
+* [svtplay] Fix video id extraction (#28058)
++ [pornhub] Add support for authentication (#18797, #21416, #24294)
+* [pornhub:user] Improve paging
++ [pornhub:user] Add support for URLs unavailable via /videos page (#27853)
++ [bravotv] Add support for oxygen.com (#13357, #22500)
++ [youtube] Pass embed URL to get_video_info request
+* [ccma] Improve metadata extraction (#27994)
+    + Extract age limit, alt title, categories, series and episode number
+    * Fix timestamp multiple subtitles extraction
+* [egghead] Update API domain (#28038)
+- [vidzi] Remove extractor (#12629)
+* [vidio] Improve metadata extraction
+* [youtube] Improve subtitles extraction
+* [youtube] Fix chapter extraction fallback
+* [youtube] Rewrite extractor
+    * Improve format sorting
+    * Remove unused code
+    * Fix series metadata extraction
+    * Fix trailer video extraction
+    * Improve error reporting
+    + Extract video location
++ [vvvvid] Add support for youtube embeds (#27825)
+* [googledrive] Report download page errors (#28005)
+* [vlive] Fix error message decoding for python 2 (#28004)
+* [youtube] Improve DASH formats file size extraction
+* [cda] Improve birth validation detection (#14022, #27929)
++ [awaan] Extract uploader id (#27963)
++ [medialaan] Add support DPG Media MyChannels based websites (#14871, #15597,
+  #16106, #16489)
+* [abcnews] Fix extraction (#12394, #27920)
+* [AMP] Fix upload date and timestamp extraction (#27970)
+* [tv4] Relax URL regular expression (#27964)
++ [tv2] Add support for mtvuutiset.fi (#27744)
+* [adn] Improve login warning reporting
+* [zype] Fix uplynk id extraction (#27956)
++ [adn] Add support for authentication (#17091, #27841, #27937)
+
+
+version 2021.01.24.1
+
+Core
+* Introduce --output-na-placeholder (#27896)
+
+Extractors
+* [franceculture] Make thumbnail optional (#18807)
+* [franceculture] Fix extraction (#27891, #27903)
+* [njpwworld] Fix extraction (#27890)
+* [comedycentral] Fix extraction (#27905)
+* [wat] Fix format extraction (#27901)
++ [americastestkitchen:season] Add support for seasons (#27861)
++ [trovo] Add support for trovo.live (#26125)
++ [aol] Add support for yahoo videos (#26650)
+* [yahoo] Fix single video extraction
+* [lbry] Unescape lbry URI (#27872)
+* [9gag] Fix and improve extraction (#23022)
+* [americastestkitchen] Improve metadata extraction for ATK episodes (#27860)
+* [aljazeera] Fix extraction (#20911, #27779)
++ [minds] Add support for minds.com (#17934)
+* [ard] Fix title and description extraction (#27761)
++ [spotify] Add support for Spotify Podcasts (#27443)
+
+
+version 2021.01.16
+
+Core
+* [YoutubeDL] Protect from infinite recursion due to recursively nested
+  playlists (#27833)
+* [YoutubeDL] Ignore failure to create existing directory (#27811)
+* [YoutubeDL] Raise syntax error for format selection expressions with multiple
+  + operators (#27803)
+
+Extractors
++ [animeondemand] Add support for lazy playlist extraction (#27829)
+* [youporn] Restrict fallback download URL (#27822)
+* [youporn] Improve height and tbr extraction (#20425, #23659)
+* [youporn] Fix extraction (#27822)
++ [twitter] Add support for unified cards (#27826)
++ [twitch] Add Authorization header with OAuth token for GraphQL requests
+  (#27790)
+* [mixcloud:playlist:base] Extract video id in flat playlist mode (#27787)
+* [cspan] Improve info extraction (#27791)
+* [adn] Improve info extraction
+* [adn] Fix extraction (#26963, #27732)
+* [youtube:search] Extract from all sections (#27604)
+* [youtube:search] fix viewcount and try to extract all video sections (#27604)
+* [twitch] Improve login error extraction
+* [twitch] Fix authentication (#27743)
+* [3qsdn] Improve extraction (#21058)
+* [peertube] Extract formats from streamingPlaylists (#26002, #27586, #27728)
+* [khanacademy] Fix extraction (#2887, #26803)
+* [spike] Update Paramount Network feed URL (#27715)
+
+
+version 2021.01.08
+
+Core
+* [downloader/hls] Disable decryption in tests (#27660)
++ [utils] Add a function to clean podcast URLs
+
+Extractors
+* [rai] Improve subtitles extraction (#27698, #27705)
+* [canvas] Match only supported VRT NU URLs (#27707)
++ [bibeltv] Add support for bibeltv.de (#14361)
++ [bfmtv] Add support for bfmtv.com (#16053, #26615)
++ [sbs] Add support for ondemand play and news embed URLs (#17650, #27629)
+* [twitch] Drop legacy kraken API v5 code altogether and refactor
+* [twitch:vod] Switch to GraphQL for video metadata
+* [canvas] Fix VRT NU extraction (#26957, #27053)
+* [twitch] Switch access token to GraphQL and refactor (#27646)
++ [rai] Detect ContentItem in iframe (#12652, #27673)
+* [ketnet] Fix extraction (#27662)
++ [dplay] Add suport Discovery+ domains (#27680)
+* [motherless] Improve extraction (#26495, #27450)
+* [motherless] Fix recent videos upload date extraction (#27661)
+* [nrk] Fix extraction for videos without a legalAge rating
+- [googleplus] Remove extractor (#4955, #7400)
++ [applepodcasts] Add support for podcasts.apple.com (#25918)
++ [googlepodcasts] Add support for podcasts.google.com
++ [iheart] Add support for iheart.com (#27037)
+* [acast] Clean podcast URLs
+* [stitcher] Clean podcast URLs
++ [xfileshare] Add support for aparat.cam (#27651)
++ [twitter] Add support for summary card (#25121)
+* [twitter] Try to use a Generic fallback for unknown twitter cards (#25982)
++ [stitcher] Add support for shows and show metadata extraction (#20510)
+* [stv] Improve episode id extraction (#23083)
+
+
+version 2021.01.03
+
+Extractors
+* [nrk] Improve series metadata extraction (#27473)
++ [nrk] Extract subtitles
+* [nrk] Fix age limit extraction
+* [nrk] Improve video id extraction
++ [nrk] Add support for podcasts (#27634, #27635)
+* [nrk] Generalize and delegate all item extractors to nrk
++ [nrk] Add support for mp3 formats
+* [nrktv] Switch to playback endpoint
+* [vvvvid] Fix season metadata extraction (#18130)
+* [stitcher] Fix extraction (#20811, #27606)
+* [acast] Fix extraction (#21444, #27612, #27613)
++ [arcpublishing] Add support for arcpublishing.com (#2298, #9340, #17200)
++ [sky] Add support for Sports News articles and Brighcove videos (#13054)
++ [vvvvid] Extract akamai formats
+* [vvvvid] Skip unplayable episodes (#27599)
+* [yandexvideo] Fix extraction for Python 3.4
+
+
+version 2020.12.31
+
+Core
+* [utils] Accept only supported protocols in url_or_none
+* [YoutubeDL] Allow format filtering using audio language (#16209)
+
+Extractors
++ [redditr] Extract all thumbnails (#27503)
+* [vvvvid] Improve info extraction
++ [vvvvid] Add support for playlists (#18130, #27574)
++ [yandexdisk] Extract info from webpage
+* [yandexdisk] Fix extraction (#17861, #27131)
+* [yandexvideo] Use old API call as fallback
+* [yandexvideo] Fix extraction (#25000)
+- [nbc] Remove CSNNE extractor
+* [nbc] Fix NBCSport VPlayer URL extraction (#16640)
++ [aenetworks] Add support for biography.com (#3863)
+* [uktvplay] Match new video URLs (#17909)
+* [sevenplay] Detect API errors
+* [tenplay] Fix format extraction (#26653)
+* [brightcove] Raise error for DRM protected videos (#23467, #27568)
+
+
+version 2020.12.29
+
+Extractors
+* [youtube] Improve yt initial data extraction (#27524)
+* [youtube:tab] Improve URL matching #27559)
+* [youtube:tab] Restore retry on browse requests (#27313, #27564)
+* [aparat] Fix extraction (#22285, #22611, #23348, #24354, #24591, #24904,
+  #25418, #26070, #26350, #26738, #27563)
+- [brightcove] Remove sonyliv specific code
+* [piksel] Improve format extraction
++ [zype] Add support for uplynk videos
++ [toggle] Add support for live.mewatch.sg (#27555)
++ [go] Add support for fxnow.fxnetworks.com (#13972, #22467, #23754, #26826)
+* [teachable] Improve embed detection (#26923)
+* [mitele] Fix free video extraction (#24624, #25827, #26757)
+* [telecinco] Fix extraction
+* [youtube] Update invidious.snopyta.org (#22667)
+* [amcnetworks] Improve auth only video detection (#27548)
++ [generic] Add support for VHX Embeds (#27546)
+
+
+version 2020.12.26
+
+Extractors
+* [instagram] Fix comment count extraction
++ [instagram] Add support for reel URLs (#26234, #26250)
+* [bbc] Switch to media selector v6 (#23232, #23933, #26303, #26432, #26821,
+  #27538)
+* [instagram] Improve thumbnail extraction
+* [instagram] Fix extraction when authenticated (#22880, #26377, #26981,
+  #27422)
+* [spankbang:playlist] Fix extraction (#24087)
++ [spankbang] Add support for playlist videos
+* [pornhub] Improve like and dislike count extraction (#27356)
+* [pornhub] Fix lq formats extraction (#27386, #27393)
++ [bongacams] Add support for bongacams.com (#27440)
+* [youtube:tab] Extend URL regular expression (#27501)
+* [theweatherchannel] Fix extraction (#25930, #26051)
++ [sprout] Add support for Universal Kids (#22518)
+* [theplatform] Allow passing geo bypass countries from other extractors
++ [wistia] Add support for playlists (#27533)
++ [ctv] Add support for ctv.ca (#27525)
+* [9c9media] Improve info extraction
+* [youtube] Fix automatic captions extraction (#27162, #27388)
+* [sonyliv] Fix title for movies
+* [sonyliv] Fix extraction (#25667)
+* [streetvoice] Fix extraction (#27455, #27492)
++ [facebook] Add support for watchparty pages (#27507)
+* [cbslocal] Fix video extraction
++ [brightcove] Add another method to extract policyKey
+* [mewatch] Relax URL regular expression (#27506)
+
+
+version 2020.12.22
+
+Core
+* [common] Remove unwanted query params from unsigned akamai manifest URLs
+
+Extractors
+- [tastytrade] Remove extractor (#25716)
+* [niconico] Fix playlist extraction (#27428)
+- [everyonesmixtape] Remove extractor
+- [kanalplay] Remove extractor
+* [arkena] Fix extraction
+* [nba] Rewrite extractor
+* [turner] Improve info extraction
+* [youtube] Improve xsrf token extraction (#27442)
+* [generic] Improve RSS age limit extraction
+* [generic] Fix RSS itunes thumbnail extraction (#27405)
++ [redditr] Extract duration (#27426)
+- [zaq1] Remove extractor
++ [asiancrush] Add support for retrocrush.tv
+* [asiancrush] Fix extraction
+- [noco] Remove extractor (#10864)
+* [nfl] Fix extraction (#22245)
+* [skysports] Relax URL regular expression (#27435)
++ [tv5unis] Add support for tv5unis.ca (#22399, #24890)
++ [videomore] Add support for more.tv (#27088)
++ [yandexmusic] Add support for music.yandex.com (#27425)
++ [nhk:program] Add support for audio programs and program clips
++ [nhk] Add support for NHK video programs (#27230)
+
+
+version 2020.12.14
+
+Core
+* [extractor/common] Improve JSON-LD interaction statistic extraction (#23306)
+* [downloader/hls] Delegate manifests with media initialization to ffmpeg
|
||||||
|
+ [extractor/common] Document duration meta field for playlists
|
||||||
|
|
||||||
|
Extractors
|
||||||
|
* [mdr] Bypass geo restriction
|
||||||
|
* [mdr] Improve extraction (#24346, #26873)
|
||||||
|
* [yandexmusic:album] Improve album title extraction (#27418)
|
||||||
|
* [eporner] Fix view count extraction and make optional (#23306)
|
||||||
|
+ [eporner] Extend URL regular expression
|
||||||
|
* [eporner] Fix hash extraction and extend _VALID_URL (#27396)
|
||||||
|
* [slideslive] Use m3u8 entry protocol for m3u8 formats (#27400)
|
||||||
|
* [twitcasting] Fix format extraction and improve info extraction (#24868)
|
||||||
|
* [linuxacademy] Fix authentication and extraction (#21129, #26223, #27402)
|
||||||
|
* [itv] Clean description from HTML tags (#27399)
|
||||||
|
* [vlive] Sort live formats (#27404)
|
||||||
|
* [hotstar] Fix and improve extraction
|
||||||
|
* Fix format extraction (#26690)
|
||||||
|
+ Extract thumbnail URL (#16079, #20412)
|
||||||
|
+ Add support for country specific playlist URLs (#23496)
|
||||||
|
* Select the last id in video URL (#26412)
|
||||||
|
+ [youtube] Add some invidious instances (#27373)
|
||||||
|
|
||||||
|
|
||||||
|
version 2020.12.12
|
||||||
|
|
||||||
|
Core
|
||||||
|
* [YoutubeDL] Improve thumbnail filename deducing (#26010, #27244)
|
||||||
|
|
||||||
|
Extractors
|
||||||
|
+ [ruutu] Extract more metadata
|
||||||
|
+ [ruutu] Detect non-free videos (#21154)
|
||||||
|
* [ruutu] Authenticate format URLs (#21031, #26782)
|
||||||
|
+ [ruutu] Add support for static.nelonenmedia.fi (#25412)
|
||||||
|
+ [ruutu] Extend URL regular expression (#24839)
|
||||||
|
+ [facebook] Add support for archived live video URLs (#15859)
|
||||||
|
* [wdr] Improve overall extraction
|
||||||
|
+ [wdr] Extend subtitles extraction (#22672, #22723)
|
||||||
|
+ [facebook] Add support for videos attached to Relay based story pages
|
||||||
|
(#10795)
|
||||||
|
+ [wdr:page] Add support for kinder.wdr.de (#27350)
|
||||||
|
+ [facebook] Add another regular expression for handleServerJS
|
||||||
|
* [facebook] Fix embed page extraction
|
||||||
|
+ [facebook] Add support for Relay post pages (#26935)
|
||||||
|
+ [facebook] Add support for watch videos (#22795, #27062)
|
||||||
|
+ [facebook] Add support for group posts with multiple videos (#19131)
|
||||||
|
* [itv] Fix series metadata extraction (#26897)
|
||||||
|
- [itv] Remove old extraction method (#23177)
|
||||||
|
* [facebook] Redirect mobile URLs to desktop URLs (#24831, #25624)
|
||||||
|
+ [facebook] Add support for Relay based pages (#26823)
|
||||||
|
* [facebook] Try to reduce unnecessary tahoe requests
|
||||||
|
- [facebook] Remove hardcoded Chrome User-Agent (#18974, #25411, #26958,
|
||||||
|
#27329)
|
||||||
|
- [smotri] Remove extractor (#27358)
|
||||||
|
- [beampro] Remove extractor (#17290, #22871, #23020, #23061, #26099)
|
||||||
|
|
||||||
|
|
||||||
|
version 2020.12.09
|
||||||
|
|
||||||
|
Core
|
||||||
|
* [extractor/common] Fix inline HTML5 media tags processing (#27345)
|
||||||
|
|
||||||
|
Extractors
|
||||||
|
* [youtube:tab] Improve identity token extraction (#27197)
|
||||||
|
* [youtube:tab] Make click tracking params on continuation optional
|
||||||
|
* [youtube:tab] Delegate inline playlists to tab-based playlists (#27298)
|
||||||
|
+ [tubitv] Extract release year (#27317)
|
||||||
|
* [amcnetworks] Fix free content extraction (#20354)
|
||||||
|
+ [lbry:channel] Add support for channels (#25584)
|
||||||
|
+ [lbry] Add support for short and embed URLs
|
||||||
|
* [lbry] Fix channel metadata extraction
|
||||||
|
+ [telequebec] Add support for video.telequebec.tv (#27339)
|
||||||
|
* [telequebec] Fix extraction (#25733, #26883)
|
||||||
|
+ [youtube:tab] Capture and output alerts (#27340)
|
||||||
|
* [tvplay:home] Fix extraction (#21153)
|
||||||
|
* [americastestkitchen] Fix extraction and add support
|
||||||
|
for Cook's Country and Cook's Illustrated (#17234, #27322)
|
||||||
|
+ [slideslive] Add support for yoda service videos and extract subtitles
|
||||||
|
(#27323)
|
||||||
|
|
||||||
|
|
||||||
|
version 2020.12.07
|
||||||
|
|
||||||
|
Core
|
||||||
|
* [extractor/common] Extract timestamp from Last-Modified header
|
||||||
|
+ [extractor/common] Add support for dl8-* media tags (#27283)
|
||||||
|
* [extractor/common] Fix media type extraction for HTML5 media tags
|
||||||
|
in start/end form
|
||||||
|
|
||||||
|
Extractors
|
||||||
|
* [aenetworks] Fix extraction (#23363, #23390, #26795, #26985)
|
||||||
|
* Fix Fastly format extraction
|
||||||
|
+ Add support for play and watch subdomains
|
||||||
|
+ Extract series metadata
|
||||||
|
* [youtube] Improve youtu.be extraction in non-existing playlists (#27324)
|
||||||
|
+ [generic] Extract RSS video description, timestamp and itunes metadata
|
||||||
|
(#27177)
|
||||||
|
* [nrk] Reduce the number of instalments and episodes requests
|
||||||
|
* [nrk] Improve extraction
|
||||||
|
* Improve format extraction for old akamai formats
|
||||||
|
+ Add is_live value to entry info dict
|
||||||
|
* Request instalments only when available
|
||||||
|
* Fix skole extraction
|
||||||
|
+ [peertube] Extract fps
|
||||||
|
+ [peertube] Recognize audio-only formats (#27295)
|
||||||
|
|
||||||
|
|
||||||
|
version 2020.12.05
|
||||||
|
|
||||||
|
Core
|
||||||
|
* [extractor/common] Improve Akamai HTTP format extraction
|
||||||
|
* Allow m3u8 manifest without an additional audio format
|
||||||
|
* Fix extraction for qualities starting with a number
|
||||||
|
|
||||||
|
Extractors
|
||||||
|
* [teachable:course] Improve extraction (#24507, #27286)
|
||||||
|
* [nrk] Improve error extraction
|
||||||
|
* [nrktv:series] Improve extraction (#21926)
|
||||||
|
* [nrktv:season] Improve extraction
|
||||||
|
* [nrk] Improve format extraction and geo-restriction detection (#24221)
|
||||||
|
* [pornhub] Handle HTTP errors gracefully (#26414)
|
||||||
|
* [nrktv] Relax URL regular expression (#27299, #26185)
|
||||||
|
+ [zdf] Extract webm formats (#26659)
|
||||||
|
+ [gamespot] Extract DASH and HTTP formats
|
||||||
|
+ [tver] Add support for tver.jp (#26662, #27284)
|
||||||
|
+ [pornhub] Add support for pornhub.org (#27276)
|
||||||
|
|
||||||
|
|
||||||
|
version 2020.12.02
|
||||||
|
|
||||||
|
Extractors
|
||||||
|
+ [tva] Add support for qub.ca (#27235)
|
||||||
|
+ [toggle] Detect DRM protected videos (#16479, #20805)
|
||||||
|
+ [toggle] Add support for new MeWatch URLs (#27256)
|
||||||
|
* [youtube:tab] Extract channels only from channels tab (#27266)
|
||||||
|
+ [cspan] Extract info from jwplayer data (#3672, #3734, #10638, #13030,
|
||||||
|
#18806, #23148, #24461, #26171, #26800, #27263)
|
||||||
|
* [cspan] Pass Referer header with format's video URL (#26032, #25729)
|
||||||
|
* [youtube] Improve age-gated videos extraction (#27259)
|
||||||
|
+ [mediaset] Add support for movie URLs (#27240)
|
||||||
|
* [yandexmusic] Refactor
|
||||||
|
+ [yandexmusic] Add support for artist's tracks and albums (#11887, #22284)
|
||||||
|
* [yandexmusic:track] Fix extraction (#26449, #26669, #26747, #26748, #26762)
|
||||||
|
|
||||||
|
|
||||||
|
version 2020.11.29
|
||||||
|
|
||||||
|
Core
|
||||||
|
* [YoutubeDL] Write static debug to stderr and respect quiet for dynamic debug
|
||||||
|
(#14579, #22593)
|
||||||
|
|
||||||
|
Extractors
|
||||||
|
* [drtv] Extend URL regular expression (#27243)
|
||||||
|
* [tiktok] Fix extraction (#20809, #22838, #22850, #25987, #26281, #26411,
|
||||||
|
#26639, #26776, #27237)
|
||||||
|
+ [ina] Add support for mobile URLs (#27229)
|
||||||
|
* [pornhub] Fix like and dislike count extraction (#27227, #27234)
|
||||||
|
* [youtube] Improve yt initial player response extraction (#27216)
|
||||||
|
* [videa] Fix extraction (#25650, #25973, #26301)
|
||||||
|
|
||||||
|
|
||||||
|
version 2020.11.26
|
||||||
|
|
||||||
|
Core
|
||||||
|
* [downloader/fragment] Set final file's mtime according to last fragment's
|
||||||
|
Last-Modified header (#11718, #18384, #27138)
|
||||||
|
|
||||||
|
Extractors
|
||||||
|
+ [spreaker] Add support for spreaker.com (#13480, #13877)
|
||||||
|
* [vlive] Improve extraction for geo-restricted videos
|
||||||
|
+ [vlive] Add support for post URLs (#27122, #27123)
|
||||||
|
* [viki] Fix video API request (#27184)
|
||||||
|
* [bbc] Fix BBC Three clip extraction
|
||||||
|
* [bbc] Fix BBC News videos extraction
|
||||||
|
+ [medaltv] Add support for medal.tv (#27149)
|
||||||
|
* [youtube] Improve music metadata and license extraction (#26013)
|
||||||
|
* [nrk] Fix extraction
|
||||||
|
* [cda] Fix extraction (#17803, #24458, #24518, #26381)
|
||||||
|
|
||||||
|
|
||||||
|
version 2020.11.24
|
||||||
|
|
||||||
|
Core
|
||||||
|
+ [extractor/common] Add generic support for akamai HTTP format extraction
|
||||||
|
|
||||||
|
Extractors
|
||||||
|
* [youtube:tab] Fix feeds extraction (#25695, #26452)
|
||||||
|
* [youtube:favorites] Restore extractor
|
||||||
|
* [youtube:tab] Fix some weird typo (#27157)
|
||||||
|
+ [pinterest] Add support for large collections (more than 25 pins)
|
||||||
|
+ [franceinter] Extract thumbnail (#27153)
|
||||||
|
+ [box] Add support for box.com (#5949)
|
||||||
|
+ [nytimes] Add support for cooking.nytimes.com (#27112, #27143)
|
||||||
|
* [lbry] Relax URL regular expression (#27144)
|
||||||
|
+ [rumble] Add support for embed pages (#10785)
|
||||||
|
+ [skyit] Add support for multiple Sky Italia websites (#26629)
|
||||||
|
+ [pinterest] Add support for pinterest.com (#25747)
|
||||||
|
|
||||||
|
|
||||||
version 2020.11.21.1
|
version 2020.11.21.1
|
||||||
|
|
||||||
Core
|
Core
|
||||||
|
769
README.md
@ -1,4 +1,5 @@
|
|||||||
[![Build Status](https://travis-ci.org/ytdl-org/youtube-dl.svg?branch=master)](https://travis-ci.org/ytdl-org/youtube-dl)
|
[![Build Status](https://github.com/ytdl-org/youtube-dl/workflows/CI/badge.svg)](https://github.com/ytdl-org/youtube-dl/actions?query=workflow%3ACI)
|
||||||
|
|
||||||
|
|
||||||
youtube-dl - download videos from youtube.com or other video platforms
|
youtube-dl - download videos from youtube.com or other video platforms
|
||||||
|
|
||||||
@ -51,394 +52,431 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
|
|||||||
youtube-dl [OPTIONS] URL [URL...]
|
youtube-dl [OPTIONS] URL [URL...]
|
||||||
|
|
||||||
# OPTIONS
|
# OPTIONS
|
||||||
-h, --help Print this help text and exit
|
-h, --help Print this help text and exit
|
||||||
--version Print program version and exit
|
--version Print program version and exit
|
||||||
-U, --update Update this program to latest version. Make
|
-U, --update Update this program to latest version.
|
||||||
sure that you have sufficient permissions
|
Make sure that you have sufficient
|
||||||
(run with sudo if needed)
|
permissions (run with sudo if needed)
|
||||||
-i, --ignore-errors Continue on download errors, for example to
|
-i, --ignore-errors Continue on download errors, for
|
||||||
skip unavailable videos in a playlist
|
example to skip unavailable videos in a
|
||||||
--abort-on-error Abort downloading of further videos (in the
|
playlist
|
||||||
playlist or the command line) if an error
|
--abort-on-error Abort downloading of further videos (in
|
||||||
occurs
|
the playlist or the command line) if an
|
||||||
--dump-user-agent Display the current browser identification
|
error occurs
|
||||||
--list-extractors List all supported extractors
|
--dump-user-agent Display the current browser
|
||||||
--extractor-descriptions Output descriptions of all supported
|
identification
|
||||||
extractors
|
--list-extractors List all supported extractors
|
||||||
--force-generic-extractor Force extraction to use the generic
|
--extractor-descriptions Output descriptions of all supported
|
||||||
extractor
|
extractors
|
||||||
--default-search PREFIX Use this prefix for unqualified URLs. For
|
--force-generic-extractor Force extraction to use the generic
|
||||||
example "gvsearch2:" downloads two videos
|
extractor
|
||||||
from google videos for youtube-dl "large
|
--default-search PREFIX Use this prefix for unqualified URLs.
|
||||||
apple". Use the value "auto" to let
|
For example "gvsearch2:" downloads two
|
||||||
youtube-dl guess ("auto_warning" to emit a
|
videos from google videos for
|
||||||
warning when guessing). "error" just throws
|
youtube-dl "large apple". Use the value "auto"
|
||||||
an error. The default value "fixup_error"
|
to let youtube-dl guess ("auto_warning"
|
||||||
repairs broken URLs, but emits an error if
|
to emit a warning when guessing).
|
||||||
this is not possible instead of searching.
|
"error" just throws an error. The
|
||||||
--ignore-config Do not read configuration files. When given
|
default value "fixup_error" repairs
|
||||||
in the global configuration file
|
broken URLs, but emits an error if this
|
||||||
/etc/youtube-dl.conf: Do not read the user
|
is not possible instead of searching.
|
||||||
configuration in
|
--ignore-config Do not read configuration files. When
|
||||||
~/.config/youtube-dl/config (%APPDATA%/youtube-dl/config.txt
|
given in the global configuration file
|
||||||
on Windows)
|
/etc/youtube-dl.conf: Do not read the
|
||||||
--config-location PATH Location of the configuration file; either
|
user configuration in
|
||||||
the path to the config or its containing
|
~/.config/youtube-dl/config
|
||||||
directory.
|
(%APPDATA%/youtube-dl/config.txt on
|
||||||
--flat-playlist Do not extract the videos of a playlist,
|
Windows)
|
||||||
only list them.
|
--config-location PATH Location of the configuration file;
|
||||||
--mark-watched Mark videos watched (YouTube only)
|
either the path to the config or its
|
||||||
--no-mark-watched Do not mark videos watched (YouTube only)
|
containing directory.
|
||||||
--no-color Do not emit color codes in output
|
--flat-playlist Do not extract the videos of a
|
||||||
|
playlist, only list them.
|
||||||
|
--mark-watched Mark videos watched (YouTube only)
|
||||||
|
--no-mark-watched Do not mark videos watched (YouTube
|
||||||
|
only)
|
||||||
|
--no-color Do not emit color codes in output
|
||||||
|
|
||||||
## Network Options:
|
## Network Options:
|
||||||
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy.
|
--proxy URL Use the specified HTTP/HTTPS/SOCKS
|
||||||
To enable SOCKS proxy, specify a proper
|
proxy. To enable SOCKS proxy, specify a
|
||||||
scheme. For example
|
proper scheme. For example
|
||||||
socks5://127.0.0.1:1080/. Pass in an empty
|
socks5://127.0.0.1:1080/. Pass in an
|
||||||
string (--proxy "") for direct connection
|
empty string (--proxy "") for direct
|
||||||
--socket-timeout SECONDS Time to wait before giving up, in seconds
|
connection
|
||||||
--source-address IP Client-side IP address to bind to
|
--socket-timeout SECONDS Time to wait before giving up, in
|
||||||
-4, --force-ipv4 Make all connections via IPv4
|
seconds
|
||||||
-6, --force-ipv6 Make all connections via IPv6
|
--source-address IP Client-side IP address to bind to
|
||||||
|
-4, --force-ipv4 Make all connections via IPv4
|
||||||
|
-6, --force-ipv6 Make all connections via IPv6
|
||||||
|
|
||||||
## Geo Restriction:
|
## Geo Restriction:
|
||||||
--geo-verification-proxy URL Use this proxy to verify the IP address for
|
--geo-verification-proxy URL Use this proxy to verify the IP address
|
||||||
some geo-restricted sites. The default
|
for some geo-restricted sites. The
|
||||||
proxy specified by --proxy (or none, if the
|
default proxy specified by --proxy (or
|
||||||
option is not present) is used for the
|
none, if the option is not present) is
|
||||||
actual downloading.
|
used for the actual downloading.
|
||||||
--geo-bypass Bypass geographic restriction via faking
|
--geo-bypass Bypass geographic restriction via
|
||||||
X-Forwarded-For HTTP header
|
faking X-Forwarded-For HTTP header
|
||||||
--no-geo-bypass Do not bypass geographic restriction via
|
--no-geo-bypass Do not bypass geographic restriction
|
||||||
faking X-Forwarded-For HTTP header
|
via faking X-Forwarded-For HTTP header
|
||||||
--geo-bypass-country CODE Force bypass geographic restriction with
|
--geo-bypass-country CODE Force bypass geographic restriction
|
||||||
explicitly provided two-letter ISO 3166-1 alpha-2
|
with explicitly provided two-letter ISO
|
||||||
country code
|
3166-1 alpha-2 country code
|
||||||
--geo-bypass-ip-block IP_BLOCK Force bypass geographic restriction with
|
--geo-bypass-ip-block IP_BLOCK Force bypass geographic restriction
|
||||||
explicitly provided IP block in CIDR
|
with explicitly provided IP block in
|
||||||
notation
|
CIDR notation
|
||||||
|
|
||||||
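The geo-restriction options above describe sending a faked `X-Forwarded-For` header, optionally with an address taken from a user-supplied CIDR block. A minimal sketch of that idea follows. The helper names `random_ip_from_block` and `fake_forwarded_headers` are hypothetical and are not youtube-dl's own functions; how youtube-dl actually selects the address is not shown in this diff.

```python
import ipaddress
import random


def random_ip_from_block(cidr, rng=random):
    """Pick one address from a CIDR block (hypothetical helper, not youtube-dl code)."""
    # strict=False tolerates host bits being set, e.g. "198.51.100.7/24".
    network = ipaddress.ip_network(cidr, strict=False)
    offset = rng.randrange(network.num_addresses)
    return str(network[offset])


def fake_forwarded_headers(cidr, rng=random):
    """Build the extra HTTP header that a geo bypass of this kind would send."""
    return {'X-Forwarded-For': random_ip_from_block(cidr, rng)}


if __name__ == '__main__':
    # 203.0.113.0/24 is a documentation-only range, used here as a placeholder.
    print(fake_forwarded_headers('203.0.113.0/24'))
```

Whether a site honours the header at all is up to the site, which may be why the option text above speaks of "some geo-restricted sites".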
## Video Selection:
|
## Video Selection:
|
||||||
--playlist-start NUMBER Playlist video to start at (default is 1)
|
--playlist-start NUMBER Playlist video to start at (default is
|
||||||
--playlist-end NUMBER Playlist video to end at (default is last)
|
1)
|
||||||
--playlist-items ITEM_SPEC Playlist video items to download. Specify
|
--playlist-end NUMBER Playlist video to end at (default is
|
||||||
indices of the videos in the playlist
|
last)
|
||||||
separated by commas like: "--playlist-items
|
--playlist-items ITEM_SPEC Playlist video items to download.
|
||||||
1,2,5,8" if you want to download videos
|
Specify indices of the videos in the
|
||||||
indexed 1, 2, 5, 8 in the playlist. You can
|
playlist separated by commas like:
|
||||||
specify range: "--playlist-items
|
"--playlist-items 1,2,5,8" if you want to
|
||||||
1-3,7,10-13", it will download the videos
|
download videos indexed 1, 2, 5, 8 in
|
||||||
at index 1, 2, 3, 7, 10, 11, 12 and 13.
|
the playlist. You can specify range:
|
||||||
--match-title REGEX Download only matching titles (regex or
|
"--playlist-items 1-3,7,10-13", it will
|
||||||
caseless sub-string)
|
download the videos at index 1, 2, 3,
|
||||||
--reject-title REGEX Skip download for matching titles (regex or
|
7, 10, 11, 12 and 13.
|
||||||
caseless sub-string)
|
--match-title REGEX Download only matching titles (regex or
|
||||||
--max-downloads NUMBER Abort after downloading NUMBER files
|
caseless sub-string)
|
||||||
--min-filesize SIZE Do not download any videos smaller than
|
--reject-title REGEX Skip download for matching titles
|
||||||
SIZE (e.g. 50k or 44.6m)
|
(regex or caseless sub-string)
|
||||||
--max-filesize SIZE Do not download any videos larger than SIZE
|
--max-downloads NUMBER Abort after downloading NUMBER files
|
||||||
(e.g. 50k or 44.6m)
|
--min-filesize SIZE Do not download any videos smaller than
|
||||||
--date DATE Download only videos uploaded in this date
|
SIZE (e.g. 50k or 44.6m)
|
||||||
--datebefore DATE Download only videos uploaded on or before
|
--max-filesize SIZE Do not download any videos larger than
|
||||||
this date (i.e. inclusive)
|
SIZE (e.g. 50k or 44.6m)
|
||||||
--dateafter DATE Download only videos uploaded on or after
|
--date DATE Download only videos uploaded in this
|
||||||
this date (i.e. inclusive)
|
date
|
||||||
--min-views COUNT Do not download any videos with less than
|
--datebefore DATE Download only videos uploaded on or
|
||||||
COUNT views
|
before this date (i.e. inclusive)
|
||||||
--max-views COUNT Do not download any videos with more than
|
--dateafter DATE Download only videos uploaded on or
|
||||||
COUNT views
|
after this date (i.e. inclusive)
|
||||||
--match-filter FILTER Generic video filter. Specify any key (see
|
--min-views COUNT Do not download any videos with less
|
||||||
the "OUTPUT TEMPLATE" for a list of
|
than COUNT views
|
||||||
available keys) to match if the key is
|
--max-views COUNT Do not download any videos with more
|
||||||
present, !key to check if the key is not
|
than COUNT views
|
||||||
present, key > NUMBER (like "comment_count
|
--match-filter FILTER Generic video filter. Specify any key
|
||||||
> 12", also works with >=, <, <=, !=, =) to
|
(see the "OUTPUT TEMPLATE" for a list
|
||||||
compare against a number, key = 'LITERAL'
|
of available keys) to match if the key
|
||||||
(like "uploader = 'Mike Smith'", also works
|
is present, !key to check if the key is
|
||||||
with !=) to match against a string literal
|
not present, key > NUMBER (like
|
||||||
and & to require multiple matches. Values
|
"comment_count > 12", also works with
|
||||||
which are not known are excluded unless you
|
>=, <, <=, !=, =) to compare against a
|
||||||
put a question mark (?) after the operator.
|
number, key = 'LITERAL' (like "uploader
|
||||||
For example, to only match videos that have
|
= 'Mike Smith'", also works with !=) to
|
||||||
been liked more than 100 times and disliked
|
match against a string literal and & to
|
||||||
less than 50 times (or the dislike
|
require multiple matches. Values which
|
||||||
functionality is not available at the given
|
are not known are excluded unless you
|
||||||
service), but who also have a description,
|
put a question mark (?) after the
|
||||||
use --match-filter "like_count > 100 &
|
operator. For example, to only match
|
||||||
dislike_count <? 50 & description" .
|
videos that have been liked more than
|
||||||
--no-playlist Download only the video, if the URL refers
|
100 times and disliked less than 50
|
||||||
to a video and a playlist.
|
times (or the dislike functionality is
|
||||||
--yes-playlist Download the playlist, if the URL refers to
|
not available at the given service),
|
||||||
a video and a playlist.
|
but who also have a description, use
|
||||||
--age-limit YEARS Download only videos suitable for the given
|
--match-filter "like_count > 100 &
|
||||||
age
|
dislike_count <? 50 & description" .
|
||||||
--download-archive FILE Download only videos not listed in the
|
--no-playlist Download only the video, if the URL
|
||||||
archive file. Record the IDs of all
|
refers to a video and a playlist.
|
||||||
downloaded videos in it.
|
--yes-playlist Download the playlist, if the URL
|
||||||
--include-ads Download advertisements as well
|
refers to a video and a playlist.
|
||||||
(experimental)
|
--age-limit YEARS Download only videos suitable for the
|
||||||
|
given age
|
||||||
|
--download-archive FILE Download only videos not listed in the
|
||||||
|
archive file. Record the IDs of all
|
||||||
|
downloaded videos in it.
|
||||||
|
--include-ads Download advertisements as well
|
||||||
|
(experimental)
|
||||||
|
|
||||||
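The `--playlist-items` description above says that `1-3,7,10-13` selects the videos at index 1, 2, 3, 7, 10, 11, 12 and 13. The sketch below expands such a spec into explicit indices. `parse_playlist_items` is a hypothetical name chosen for illustration, and the code is not taken from youtube-dl; it only mirrors the behaviour the help text describes.

```python
def parse_playlist_items(spec):
    """Expand an ITEM_SPEC such as "1-3,7,10-13" into a list of 1-based indices.

    Hypothetical helper: it follows the help text, not youtube-dl's source.
    """
    indices = []
    for part in spec.split(','):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas (an assumption, not documented)
        if '-' in part:
            start, end = part.split('-', 1)
            # Ranges are inclusive on both ends, as in the documented example.
            indices.extend(range(int(start), int(end) + 1))
        else:
            indices.append(int(part))
    return indices


if __name__ == '__main__':
    print(parse_playlist_items('1-3,7,10-13'))
```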
## Download Options:
|
## Download Options:
|
||||||
-r, --limit-rate RATE Maximum download rate in bytes per second
|
-r, --limit-rate RATE Maximum download rate in bytes per
|
||||||
(e.g. 50K or 4.2M)
|
second (e.g. 50K or 4.2M)
|
||||||
-R, --retries RETRIES Number of retries (default is 10), or
|
-R, --retries RETRIES Number of retries (default is 10), or
|
||||||
"infinite".
|
"infinite".
|
||||||
--fragment-retries RETRIES Number of retries for a fragment (default
|
--fragment-retries RETRIES Number of retries for a fragment
|
||||||
is 10), or "infinite" (DASH, hlsnative and
|
(default is 10), or "infinite" (DASH,
|
||||||
ISM)
|
hlsnative and ISM)
|
||||||
--skip-unavailable-fragments Skip unavailable fragments (DASH, hlsnative
|
--skip-unavailable-fragments Skip unavailable fragments (DASH,
|
||||||
and ISM)
|
hlsnative and ISM)
|
||||||
--abort-on-unavailable-fragment Abort downloading when some fragment is not
|
--abort-on-unavailable-fragment Abort downloading when some fragment is
|
||||||
available
|
not available
|
||||||
--keep-fragments Keep downloaded fragments on disk after
|
--keep-fragments Keep downloaded fragments on disk after
|
||||||
downloading is finished; fragments are
|
downloading is finished; fragments are
|
||||||
erased by default
|
erased by default
|
||||||
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K)
|
--buffer-size SIZE Size of download buffer (e.g. 1024 or
|
||||||
(default is 1024)
|
16K) (default is 1024)
|
||||||
--no-resize-buffer Do not automatically adjust the buffer
|
--no-resize-buffer Do not automatically adjust the buffer
|
||||||
size. By default, the buffer size is
|
size. By default, the buffer size is
|
||||||
automatically resized from an initial value
|
automatically resized from an initial
|
||||||
of SIZE.
|
value of SIZE.
|
||||||
--http-chunk-size SIZE Size of a chunk for chunk-based HTTP
|
--http-chunk-size SIZE Size of a chunk for chunk-based HTTP
|
||||||
downloading (e.g. 10485760 or 10M) (default
|
downloading (e.g. 10485760 or 10M)
|
||||||
is disabled). May be useful for bypassing
|
(default is disabled). May be useful
|
||||||
bandwidth throttling imposed by a webserver
|
for bypassing bandwidth throttling
|
||||||
(experimental)
|
imposed by a webserver (experimental)
|
||||||
--playlist-reverse Download playlist videos in reverse order
|
--playlist-reverse Download playlist videos in reverse
|
||||||
--playlist-random Download playlist videos in random order
|
order
|
||||||
--xattr-set-filesize Set file xattribute ytdl.filesize with
|
--playlist-random Download playlist videos in random
|
||||||
expected file size
|
order
|
||||||
--hls-prefer-native Use the native HLS downloader instead of
|
--xattr-set-filesize Set file xattribute ytdl.filesize with
|
||||||
ffmpeg
|
expected file size
|
||||||
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
|
--hls-prefer-native Use the native HLS downloader instead
|
||||||
downloader
|
of ffmpeg
|
||||||
--hls-use-mpegts Use the mpegts container for HLS videos,
|
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
|
||||||
allowing to play the video while
|
downloader
|
||||||
downloading (some players may not be able
|
--hls-use-mpegts Use the mpegts container for HLS
|
||||||
to play it)
|
videos, allowing to play the video
|
||||||
--external-downloader COMMAND Use the specified external downloader.
|
while downloading (some players may not
|
||||||
Currently supports
|
be able to play it)
|
||||||
aria2c,avconv,axel,curl,ffmpeg,httpie,wget
|
--external-downloader COMMAND Use the specified external downloader.
|
||||||
--external-downloader-args ARGS Give these arguments to the external
|
Currently supports
|
||||||
downloader
|
aria2c,avconv,axel,curl,ffmpeg,httpie,wget
|
||||||
|
--external-downloader-args ARGS Give these arguments to the external
|
||||||
|
downloader
|
||||||
|
|
||||||
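Several options above take a SIZE or RATE with a unit suffix, and the `--http-chunk-size` example equates `10M` with `10485760`, which implies multiples of 1024. A minimal sketch of such a parser, assuming binary multiples throughout and a case-insensitive suffix, is shown below. `parse_size` is a hypothetical name, not youtube-dl's API.

```python
import re

# Suffix letters in increasing order; each step multiplies by 1024 (assumption
# based on the documented "10485760 or 10M" example).
_SUFFIXES = 'KMGTPEZY'


def parse_size(text):
    """Convert strings such as "1024", "16K" or "4.2M" to a byte count."""
    match = re.match(r'^(\d+(?:\.\d+)?)([KMGTPEZY]?)$', text.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError('invalid size: %r' % text)
    number, suffix = match.groups()
    exponent = _SUFFIXES.index(suffix.upper()) + 1 if suffix else 0
    return int(round(float(number) * 1024 ** exponent))


if __name__ == '__main__':
    for example in ('1024', '16K', '10M', '50k'):
        print(example, parse_size(example))
```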
## Filesystem Options:
|
## Filesystem Options:
|
||||||
-a, --batch-file FILE File containing URLs to download ('-' for
|
-a, --batch-file FILE File containing URLs to download ('-'
|
||||||
stdin), one URL per line. Lines starting
|
for stdin), one URL per line. Lines
|
||||||
with '#', ';' or ']' are considered as
|
starting with '#', ';' or ']' are
|
||||||
comments and ignored.
|
considered as comments and ignored.
|
||||||
--id Use only video ID in file name
|
--id Use only video ID in file name
|
||||||
-o, --output TEMPLATE Output filename template, see the "OUTPUT
|
-o, --output TEMPLATE Output filename template, see the
|
||||||
TEMPLATE" for all the info
|
"OUTPUT TEMPLATE" for all the info
|
||||||
--autonumber-start NUMBER Specify the start value for %(autonumber)s
|
--output-na-placeholder PLACEHOLDER Placeholder value for unavailable meta
|
||||||
(default is 1)
|
fields in output filename template
|
||||||
--restrict-filenames Restrict filenames to only ASCII
|
(default is "NA")
|
||||||
characters, and avoid "&" and spaces in
|
--autonumber-start NUMBER Specify the start value for
|
||||||
filenames
|
%(autonumber)s (default is 1)
|
||||||
-w, --no-overwrites Do not overwrite files
|
--restrict-filenames Restrict filenames to only ASCII
|
||||||
-c, --continue Force resume of partially downloaded files.
|
characters, and avoid "&" and spaces in
|
||||||
By default, youtube-dl will resume
|
filenames
|
||||||
downloads if possible.
|
-w, --no-overwrites Do not overwrite files
|
||||||
--no-continue Do not resume partially downloaded files
|
-c, --continue Force resume of partially downloaded
|
||||||
(restart from beginning)
|
files. By default, youtube-dl will
|
||||||
--no-part Do not use .part files - write directly
|
resume downloads if possible.
|
||||||
into output file
|
--no-continue Do not resume partially downloaded
|
||||||
--no-mtime Do not use the Last-modified header to set
|
files (restart from beginning)
|
||||||
the file modification time
|
--no-part Do not use .part files - write directly
|
||||||
--write-description Write video description to a .description
|
into output file
|
||||||
file
|
--no-mtime Do not use the Last-modified header to
|
||||||
--write-info-json Write video metadata to a .info.json file
|
set the file modification time
|
||||||
--write-annotations Write video annotations to a
|
--write-description Write video description to a
|
||||||
.annotations.xml file
|
.description file
|
||||||
--load-info-json FILE JSON file containing the video information
|
--write-info-json Write video metadata to a .info.json
|
||||||
(created with the "--write-info-json"
|
file
|
||||||
option)
|
--write-annotations Write video annotations to a
|
||||||
--cookies FILE File to read cookies from and dump cookie
|
.annotations.xml file
|
||||||
jar in
|
--load-info-json FILE JSON file containing the video
|
||||||
--cache-dir DIR Location in the filesystem where youtube-dl
|
information (created with the
|
||||||
can store some downloaded information
|
"--write-info-json" option)
|
||||||
permanently. By default
|
--cookies FILE File to read cookies from and dump
|
||||||
$XDG_CACHE_HOME/youtube-dl or
|
cookie jar in
|
||||||
~/.cache/youtube-dl . At the moment, only
|
--cache-dir DIR Location in the filesystem where
|
||||||
YouTube player files (for videos with
|
youtube-dl can store some downloaded
|
||||||
obfuscated signatures) are cached, but that
|
information permanently. By default
|
||||||
may change.
|
$XDG_CACHE_HOME/youtube-dl or
|
||||||
--no-cache-dir Disable filesystem caching
|
~/.cache/youtube-dl . At the moment,
|
||||||
--rm-cache-dir Delete all filesystem cache files
|
only YouTube player files (for videos
|
||||||
|
with obfuscated signatures) are cached,
|
||||||
|
but that may change.
|
||||||
|
--no-cache-dir Disable filesystem caching
|
||||||
|
--rm-cache-dir Delete all filesystem cache files
|
||||||
|
|
||||||
## Thumbnail images:
|
## Thumbnail images:
|
||||||
--write-thumbnail Write thumbnail image to disk
|
--write-thumbnail Write thumbnail image to disk
|
||||||
--write-all-thumbnails Write all thumbnail image formats to disk
|
--write-all-thumbnails Write all thumbnail image formats to
|
||||||
--list-thumbnails Simulate and list all available thumbnail
|
disk
|
||||||
formats
|
--list-thumbnails Simulate and list all available
|
||||||
|
thumbnail formats
|
||||||
|
|
||||||
## Verbosity / Simulation Options:
|
## Verbosity / Simulation Options:
|
||||||
-q, --quiet Activate quiet mode
|
-q, --quiet Activate quiet mode
|
||||||
--no-warnings Ignore warnings
|
--no-warnings Ignore warnings
|
||||||
-s, --simulate Do not download the video and do not write
|
-s, --simulate Do not download the video and do not
|
||||||
anything to disk
|
write anything to disk
|
||||||
--skip-download Do not download the video
|
--skip-download Do not download the video
|
||||||
-g, --get-url Simulate, quiet but print URL
|
-g, --get-url Simulate, quiet but print URL
|
||||||
-e, --get-title Simulate, quiet but print title
|
-e, --get-title Simulate, quiet but print title
|
||||||
--get-id Simulate, quiet but print id
|
--get-id Simulate, quiet but print id
|
||||||
--get-thumbnail Simulate, quiet but print thumbnail URL
|
--get-thumbnail Simulate, quiet but print thumbnail URL
|
||||||
--get-description Simulate, quiet but print video description
|
--get-description Simulate, quiet but print video
|
||||||
--get-duration Simulate, quiet but print video length
|
description
|
||||||
--get-filename Simulate, quiet but print output filename
|
--get-duration Simulate, quiet but print video length
|
||||||
--get-format Simulate, quiet but print output format
|
--get-filename Simulate, quiet but print output
|
||||||
-j, --dump-json Simulate, quiet but print JSON information.
|
filename
|
||||||
See the "OUTPUT TEMPLATE" for a description
|
--get-format Simulate, quiet but print output format
|
||||||
of available keys.
|
-j, --dump-json Simulate, quiet but print JSON
|
||||||
-J, --dump-single-json Simulate, quiet but print JSON information
|
information. See the "OUTPUT TEMPLATE"
|
||||||
for each command-line argument. If the URL
|
for a description of available keys.
|
||||||
refers to a playlist, dump the whole
|
-J, --dump-single-json Simulate, quiet but print JSON
|
||||||
playlist information in a single line.
|
information for each command-line
|
||||||
--print-json Be quiet and print the video information as
|
argument. If the URL refers to a
|
||||||
JSON (video is still being downloaded).
|
playlist, dump the whole playlist
|
||||||
--newline Output progress bar as new lines
|
information in a single line.
|
||||||
--no-progress Do not print progress bar
|
--print-json Be quiet and print the video
|
||||||
--console-title Display progress in console titlebar
|
information as JSON (video is still
|
||||||
-v, --verbose Print various debugging information
|
being downloaded).
|
||||||
--dump-pages Print downloaded pages encoded using base64
|
--newline Output progress bar as new lines
|
||||||
to debug problems (very verbose)
|
--no-progress Do not print progress bar
|
||||||
--write-pages Write downloaded intermediary pages to
|
--console-title Display progress in console titlebar
|
||||||
files in the current directory to debug
|
-v, --verbose Print various debugging information
|
||||||
problems
|
--dump-pages Print downloaded pages encoded using
|
||||||
--print-traffic Display sent and read HTTP traffic
|
base64 to debug problems (very verbose)
|
||||||
-C, --call-home Contact the youtube-dl server for debugging
|
--write-pages Write downloaded intermediary pages to
|
||||||
--no-call-home Do NOT contact the youtube-dl server for
|
files in the current directory to debug
|
||||||
debugging
|
problems
|
||||||
|
--print-traffic Display sent and read HTTP traffic
|
||||||
|
-C, --call-home Contact the youtube-dl server for
|
||||||
|
debugging
|
||||||
|
--no-call-home Do NOT contact the youtube-dl server
|
||||||
|
for debugging
|
||||||
|
|
||||||
## Workarounds:
|
## Workarounds:
|
||||||
--encoding ENCODING Force the specified encoding (experimental)
|
--encoding ENCODING Force the specified encoding
|
||||||
--no-check-certificate Suppress HTTPS certificate validation
|
(experimental)
|
||||||
--prefer-insecure Use an unencrypted connection to retrieve
|
--no-check-certificate Suppress HTTPS certificate validation
|
||||||
information about the video. (Currently
|
--prefer-insecure Use an unencrypted connection to
|
||||||
supported only for YouTube)
|
retrieve information about the video.
|
||||||
--user-agent UA Specify a custom user agent
|
(Currently supported only for YouTube)
|
||||||
--referer URL Specify a custom referer, use if the video
|
--user-agent UA Specify a custom user agent
|
||||||
access is restricted to one domain
|
--referer URL Specify a custom referer, use if the
|
||||||
--add-header FIELD:VALUE Specify a custom HTTP header and its value,
|
video access is restricted to one
|
||||||
separated by a colon ':'. You can use this
|
domain
|
||||||
option multiple times
|
--add-header FIELD:VALUE Specify a custom HTTP header and its
|
||||||
--bidi-workaround Work around terminals that lack
|
value, separated by a colon ':'. You
|
||||||
bidirectional text support. Requires bidiv
|
can use this option multiple times
|
||||||
or fribidi executable in PATH
|
--bidi-workaround Work around terminals that lack
|
||||||
--sleep-interval SECONDS Number of seconds to sleep before each
|
bidirectional text support. Requires
|
||||||
download when used alone or a lower bound
|
bidiv or fribidi executable in PATH
|
||||||
of a range for randomized sleep before each
|
--sleep-interval SECONDS Number of seconds to sleep before each
|
||||||
download (minimum possible number of
|
download when used alone or a lower
|
||||||
seconds to sleep) when used along with
|
bound of a range for randomized sleep
|
||||||
--max-sleep-interval.
|
before each download (minimum possible
|
||||||
--max-sleep-interval SECONDS Upper bound of a range for randomized sleep
|
number of seconds to sleep) when used
|
||||||
before each download (maximum possible
|
along with --max-sleep-interval.
|
||||||
number of seconds to sleep). Must only be
|
--max-sleep-interval SECONDS Upper bound of a range for randomized
|
||||||
used along with --min-sleep-interval.
|
sleep before each download (maximum
|
||||||
|
possible number of seconds to sleep).
|
||||||
|
Must only be used along with --min-
|
||||||
|
sleep-interval.
|
||||||
|
|
||||||
## Video Format Options:
|
## Video Format Options:
|
||||||
-f, --format FORMAT Video format code, see the "FORMAT
|
-f, --format FORMAT Video format code, see the "FORMAT
|
||||||
SELECTION" for all the info
|
SELECTION" for all the info
|
||||||
--all-formats Download all available video formats
|
--all-formats Download all available video formats
|
||||||
--prefer-free-formats Prefer free video formats unless a specific
|
--prefer-free-formats Prefer free video formats unless a
|
||||||
one is requested
|
specific one is requested
|
||||||
-F, --list-formats List all available formats of requested
|
-F, --list-formats List all available formats of requested
|
||||||
videos
|
videos
|
||||||
--youtube-skip-dash-manifest Do not download the DASH manifests and
|
--youtube-skip-dash-manifest Do not download the DASH manifests and
|
||||||
related data on YouTube videos
|
related data on YouTube videos
|
||||||
--merge-output-format FORMAT If a merge is required (e.g.
|
--merge-output-format FORMAT If a merge is required (e.g.
|
||||||
bestvideo+bestaudio), output to given
|
bestvideo+bestaudio), output to given
|
||||||
container format. One of mkv, mp4, ogg,
|
container format. One of mkv, mp4, ogg,
|
||||||
webm, flv. Ignored if no merge is required
|
webm, flv. Ignored if no merge is
|
||||||
|
required
|
||||||
|
|
||||||
## Subtitle Options:
|
## Subtitle Options:
|
||||||
--write-sub Write subtitle file
|
--write-sub Write subtitle file
|
||||||
--write-auto-sub Write automatically generated subtitle file
|
--write-auto-sub Write automatically generated subtitle
|
||||||
(YouTube only)
|
file (YouTube only)
|
||||||
--all-subs Download all the available subtitles of the
|
--all-subs Download all the available subtitles of
|
||||||
video
|
the video
|
||||||
--list-subs List all available subtitles for the video
|
--list-subs List all available subtitles for the
|
||||||
--sub-format FORMAT Subtitle format, accepts formats
|
video
|
||||||
preference, for example: "srt" or
|
--sub-format FORMAT Subtitle format, accepts formats
|
||||||
"ass/srt/best"
|
preference, for example: "srt" or
|
||||||
--sub-lang LANGS Languages of the subtitles to download
|
"ass/srt/best"
|
||||||
(optional) separated by commas, use --list-
|
--sub-lang LANGS Languages of the subtitles to download
|
||||||
subs for available language tags
|
(optional) separated by commas, use
|
||||||
|
--list-subs for available language tags
|
||||||
|
|
||||||
## Authentication Options:
|
## Authentication Options:
|
||||||
-u, --username USERNAME Login with this account ID
|
-u, --username USERNAME Login with this account ID
|
||||||
-p, --password PASSWORD Account password. If this option is left
|
-p, --password PASSWORD Account password. If this option is
|
||||||
out, youtube-dl will ask interactively.
|
left out, youtube-dl will ask
|
||||||
-2, --twofactor TWOFACTOR Two-factor authentication code
|
interactively.
|
||||||
-n, --netrc Use .netrc authentication data
|
-2, --twofactor TWOFACTOR Two-factor authentication code
|
||||||
--video-password PASSWORD Video password (vimeo, smotri, youku)
|
-n, --netrc Use .netrc authentication data
|
||||||
|
--video-password PASSWORD Video password (vimeo, youku)
|
||||||
|
|
||||||
## Adobe Pass Options:
|
## Adobe Pass Options:
|
||||||
--ap-mso MSO Adobe Pass multiple-system operator (TV
|
--ap-mso MSO Adobe Pass multiple-system operator (TV
|
||||||
provider) identifier, use --ap-list-mso for
|
provider) identifier, use --ap-list-mso
|
||||||
a list of available MSOs
|
for a list of available MSOs
|
||||||
--ap-username USERNAME Multiple-system operator account login
|
--ap-username USERNAME Multiple-system operator account login
|
||||||
--ap-password PASSWORD Multiple-system operator account password.
|
--ap-password PASSWORD Multiple-system operator account
|
||||||
If this option is left out, youtube-dl will
|
password. If this option is left out,
|
||||||
ask interactively.
|
youtube-dl will ask interactively.
|
||||||
--ap-list-mso List all supported multiple-system
|
--ap-list-mso List all supported multiple-system
|
||||||
operators
|
operators
|
||||||
|
|
||||||
## Post-processing Options:
|
## Post-processing Options:
|
||||||
-x, --extract-audio Convert video files to audio-only files
|
-x, --extract-audio Convert video files to audio-only files
|
||||||
(requires ffmpeg or avconv and ffprobe or
|
(requires ffmpeg/avconv and
|
||||||
avprobe)
|
ffprobe/avprobe)
|
||||||
--audio-format FORMAT Specify audio format: "best", "aac",
|
--audio-format FORMAT Specify audio format: "best", "aac",
|
||||||
"flac", "mp3", "m4a", "opus", "vorbis", or
|
"flac", "mp3", "m4a", "opus", "vorbis",
|
||||||
"wav"; "best" by default; No effect without
|
or "wav"; "best" by default; No effect
|
||||||
-x
|
without -x
|
||||||
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert
|
--audio-quality QUALITY Specify ffmpeg/avconv audio quality,
|
||||||
a value between 0 (better) and 9 (worse)
|
insert a value between 0 (better) and 9
|
||||||
for VBR or a specific bitrate like 128K
|
(worse) for VBR or a specific bitrate
|
||||||
(default 5)
|
like 128K (default 5)
|
||||||
--recode-video FORMAT Encode the video to another format if
|
--recode-video FORMAT Encode the video to another format if
|
||||||
necessary (currently supported:
|
necessary (currently supported:
|
||||||
mp4|flv|ogg|webm|mkv|avi)
|
mp4|flv|ogg|webm|mkv|avi)
|
||||||
--postprocessor-args ARGS Give these arguments to the postprocessor
|
--postprocessor-args ARGS Give these arguments to the
|
||||||
-k, --keep-video Keep the video file on disk after the post-
|
postprocessor
|
||||||
processing; the video is erased by default
|
-k, --keep-video Keep the video file on disk after the
|
||||||
--no-post-overwrites Do not overwrite post-processed files; the
|
post-processing; the video is erased by
|
||||||
post-processed files are overwritten by
|
default
|
||||||
default
|
--no-post-overwrites Do not overwrite post-processed files;
|
||||||
--embed-subs Embed subtitles in the video (only for mp4,
|
the post-processed files are
|
||||||
webm and mkv videos)
|
overwritten by default
|
||||||
--embed-thumbnail Embed thumbnail in the audio as cover art
|
--embed-subs Embed subtitles in the video (only for
|
||||||
--add-metadata Write metadata to the video file
|
mp4, webm and mkv videos)
|
||||||
--metadata-from-title FORMAT Parse additional metadata like song title /
|
--embed-thumbnail Embed thumbnail in the audio as cover
|
||||||
artist from the video title. The format
|
art
|
||||||
syntax is the same as --output. Regular
|
--add-metadata Write metadata to the video file
|
||||||
expression with named capture groups may
|
--metadata-from-title FORMAT Parse additional metadata like song
|
||||||
also be used. The parsed parameters replace
|
title / artist from the video title.
|
||||||
existing values. Example: --metadata-from-
|
The format syntax is the same as
|
||||||
title "%(artist)s - %(title)s" matches a
|
--output. Regular expression with named
|
||||||
title like "Coldplay - Paradise". Example
|
capture groups may also be used. The
|
||||||
(regex): --metadata-from-title
|
parsed parameters replace existing
|
||||||
"(?P<artist>.+?) - (?P<title>.+)"
|
values. Example: --metadata-from-title
|
||||||
--xattrs Write metadata to the video file's xattrs
|
"%(artist)s - %(title)s" matches a
|
||||||
(using dublin core and xdg standards)
|
title like "Coldplay - Paradise".
|
||||||
--fixup POLICY Automatically correct known faults of the
|
Example (regex): --metadata-from-title
|
||||||
file. One of never (do nothing), warn (only
|
"(?P<artist>.+?) - (?P<title>.+)"
|
||||||
emit a warning), detect_or_warn (the
|
--xattrs Write metadata to the video file's
|
||||||
default; fix file if we can, warn
|
xattrs (using dublin core and xdg
|
||||||
otherwise)
|
standards)
|
||||||
--prefer-avconv Prefer avconv over ffmpeg for running the
|
--fixup POLICY Automatically correct known faults of
|
||||||
postprocessors
|
the file. One of never (do nothing),
|
||||||
--prefer-ffmpeg Prefer ffmpeg over avconv for running the
|
warn (only emit a warning),
|
||||||
postprocessors (default)
|
detect_or_warn (the default; fix file
|
||||||
--ffmpeg-location PATH Location of the ffmpeg/avconv binary;
|
if we can, warn otherwise)
|
||||||
either the path to the binary or its
|
--prefer-avconv Prefer avconv over ffmpeg for running
|
||||||
containing directory.
|
the postprocessors
|
||||||
--exec CMD Execute a command on the file after
|
--prefer-ffmpeg Prefer ffmpeg over avconv for running
|
||||||
downloading and post-processing, similar to
|
the postprocessors (default)
|
||||||
find's -exec syntax. Example: --exec 'adb
|
--ffmpeg-location PATH Location of the ffmpeg/avconv binary;
|
||||||
push {} /sdcard/Music/ && rm {}'
|
either the path to the binary or its
|
||||||
--convert-subs FORMAT Convert the subtitles to other format
|
containing directory.
|
||||||
(currently supported: srt|ass|vtt|lrc)
|
--exec CMD Execute a command on the file after
|
||||||
|
downloading and post-processing,
|
||||||
|
similar to find's -exec syntax.
|
||||||
|
Example: --exec 'adb push {}
|
||||||
|
/sdcard/Music/ && rm {}'
|
||||||
|
--convert-subs FORMAT Convert the subtitles to other format
|
||||||
|
(currently supported: srt|ass|vtt|lrc)
|
||||||
|
|
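The `--metadata-from-title` entry above says a title can be matched either by an output-template-style format or by a regular expression with named capture groups. The following is a minimal sketch of that idea, not youtube-dl's actual implementation; the helper names `format_to_regex` and `parse_title` are hypothetical and chosen only for this illustration.

```python
import re


def format_to_regex(fmt):
    """Turn '%(artist)s - %(title)s' into a regex with named groups (illustrative helper)."""
    regex = ""
    last = 0
    for match in re.finditer(r"%\((\w+)\)s", fmt):
        # Literal text between fields is escaped; each field becomes a named group.
        regex += re.escape(fmt[last:match.start()])
        regex += r"(?P<%s>.+)" % match.group(1)
        last = match.end()
    regex += re.escape(fmt[last:])
    return regex


def parse_title(title, fmt):
    """Return the fields parsed from `title`, or an empty dict if it does not match."""
    # Assumption: a format without %(name)s fields is treated as a ready-made regex.
    pattern = format_to_regex(fmt) if re.search(r"%\(\w+\)s", fmt) else fmt
    match = re.match(pattern, title)
    return match.groupdict() if match else {}
```

Under these assumptions, both example formats from the option text yield the same fields for the title `Coldplay - Paradise`.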
||||||
# CONFIGURATION
|
# CONFIGURATION
|
||||||
|
|
||||||
@ -582,7 +620,7 @@ Available for the media that is a track or a part of a music album:
|
|||||||
- `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
|
- `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
|
||||||
- `release_year` (numeric): Year (YYYY) when the album was released
|
- `release_year` (numeric): Year (YYYY) when the album was released
|
||||||
|
|
||||||
Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with `NA`.
|
Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with placeholder value provided with `--output-na-placeholder` (`NA` by default).
|
||||||
|
|
||||||
For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `youtube-dl test video` and id `BaW_jenozKcj`, this will result in a `youtube-dl test video-BaW_jenozKcj.mp4` file created in the current directory.
|
For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `youtube-dl test video` and id `BaW_jenozKcj`, this will result in a `youtube-dl test video-BaW_jenozKcj.mp4` file created in the current directory.
|
||||||
|
|
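The placeholder behaviour described in this hunk can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `render_template` is a hypothetical helper, and the real option handling in youtube-dl may differ in detail.

```python
import collections


def render_template(template, info, na_placeholder="NA"):
    """Fill an output template, substituting a placeholder for missing fields."""
    # Drop fields whose value is None so they are treated like absent ones.
    present = {key: value for key, value in info.items() if value is not None}
    # Any key not in `present` resolves to the placeholder instead of raising KeyError.
    fields = collections.defaultdict(lambda: na_placeholder, present)
    return template % fields


info = {"title": "youtube-dl test video", "id": "BaW_jenozKcj", "ext": "mp4"}
print(render_template("%(title)s-%(id)s.%(ext)s", info))
print(render_template("%(release_year)s-%(title)s.%(ext)s", info, na_placeholder="unknown"))
```

The first call reproduces the file name from the example above; the second shows a sequence the extractor did not provide being replaced by a custom placeholder.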
||||||
@ -677,6 +715,7 @@ Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends
|
|||||||
- `container`: Name of the container format
|
- `container`: Name of the container format
|
||||||
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
|
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
|
||||||
- `format_id`: A short description of the format
|
- `format_id`: A short description of the format
|
||||||
|
- `language`: Language code
|
||||||
|
|
||||||
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain).
|
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain).
|
||||||
|
|
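As a rough illustration of the string comparisons and the `!` negation described here, the sketch below evaluates a single operator against a field value. The function name `string_filter` and the operator table are assumptions made for this example; they are not taken from youtube-dl's source.

```python
import operator

# Operators named in the documentation: equals, starts with, ends with, contains.
STRING_OPERATORS = {
    "=": operator.eq,
    "^=": lambda attr, value: attr.startswith(value),
    "$=": lambda attr, value: attr.endswith(value),
    "*=": lambda attr, value: value in attr,
}


def string_filter(op, attr, value):
    """Apply a string comparison; a leading '!' inverts the result."""
    negated = op.startswith("!")
    compare = STRING_OPERATORS[op[1:] if negated else op]
    result = compare(attr, value)
    return not result if negated else result
```

For instance, `string_filter("!*=", "http_dash_segments", "dash")` is false because the protocol name does contain `dash`.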
||||||
@ -879,7 +918,7 @@ Either prepend `https://www.youtube.com/watch?v=` or separate the ID from the op
|
|||||||
|
|
||||||
Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
|
Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
|
||||||
|
|
||||||
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [cookies.txt](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) (for Firefox).
|
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [Get cookies.txt](https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid/) (for Chrome) or [cookies.txt](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) (for Firefox).
|
||||||
|
|
||||||
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
|
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
|
||||||
|
|
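The two requirements in the note above (a recognised first line and consistent newlines) can be checked before passing a file to `--cookies`. The sketch below is one possible way to do that; `inspect_cookies` and `normalize_newlines` are hypothetical helpers, and the sample cookie line is made up for illustration.

```python
VALID_HEADERS = ("# HTTP Cookie File", "# Netscape HTTP Cookie File")


def inspect_cookies(data):
    """Return (header_ok, newline_style) for the raw bytes of a cookies file."""
    lines = data.splitlines()
    first_line = lines[0].decode("utf-8", "replace").strip() if lines else ""
    header_ok = first_line in VALID_HEADERS
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # bare LF, not part of a CRLF pair
    if crlf and lf:
        style = "mixed"
    elif crlf:
        style = "CRLF"
    elif lf:
        style = "LF"
    else:
        style = "none"
    return header_ok, style


def normalize_newlines(data, target=b"\n"):
    """Rewrite every line ending as `target` (b"\\n" or b"\\r\\n")."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", target)


sample = b"# Netscape HTTP Cookie File\r\n.example.com\tTRUE\t/\tFALSE\t0\tname\tvalue\r\n"
print(inspect_cookies(sample))
```

Reading and writing the file in binary mode keeps Python from translating line endings behind the scenes, which is why the helpers operate on bytes.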
||||||
|
@ -1,5 +0,0 @@
|
|||||||
#!/bin/bash
|
|
||||||
|
|
||||||
wget http://central.maven.org/maven2/org/python/jython-installer/2.7.1/jython-installer-2.7.1.jar
|
|
||||||
java -jar jython-installer-2.7.1.jar -s -d "$HOME/jython"
|
|
||||||
$HOME/jython/bin/jython -m pip install nose
|
|
17
devscripts/run_tests.bat
Normal file
@ -0,0 +1,17 @@
|
|||||||
|
@echo off
|
||||||
|
|
||||||
|
rem Keep this list in sync with the `offlinetest` target in Makefile
|
||||||
|
set DOWNLOAD_TESTS="age_restriction^|download^|iqiyi_sdk_interpreter^|socks^|subtitles^|write_annotations^|youtube_lists^|youtube_signature"
|
||||||
|
|
||||||
|
if "%YTDL_TEST_SET%" == "core" (
|
||||||
|
set test_set="-I test_("%DOWNLOAD_TESTS%")\.py"
|
||||||
|
set multiprocess_args=""
|
||||||
|
) else if "%YTDL_TEST_SET%" == "download" (
|
||||||
|
set test_set="-I test_(?!"%DOWNLOAD_TESTS%").+\.py"
|
||||||
|
set multiprocess_args="--processes=4 --process-timeout=540"
|
||||||
|
) else (
|
||||||
|
echo YTDL_TEST_SET is not set or invalid
|
||||||
|
exit /b 1
|
||||||
|
)
|
||||||
|
|
||||||
|
nosetests test --verbose %test_set:"=% %multiprocess_args:"=%
|
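The two `-I` patterns built by the batch script are easier to follow when tried against a few file names. The Python sketch below mirrors them under two assumptions: that nose drops a file when the ignore pattern is found anywhere in its name (modelled here with `re.search`), and that `test_utils.py` stands in for any core test. The `selected` helper is hypothetical.

```python
import re

DOWNLOAD_TESTS = (
    "age_restriction|download|iqiyi_sdk_interpreter|socks|subtitles|"
    "write_annotations|youtube_lists|youtube_signature"
)

# "core": ignore exactly the download test files.
CORE_IGNORE = re.compile(r"test_(%s)\.py" % DOWNLOAD_TESTS)
# "download": ignore every test file that is NOT one of the download tests.
DOWNLOAD_IGNORE = re.compile(r"test_(?!%s).+\.py" % DOWNLOAD_TESTS)


def selected(filename, test_set):
    """Return True if `filename` would still run for the given test set."""
    ignore = CORE_IGNORE if test_set == "core" else DOWNLOAD_IGNORE
    return ignore.search(filename) is None


for name in ("test_utils.py", "test_download.py", "test_subtitles.py"):
    print(name, selected(name, "core"), selected(name, "download"))
```

The negative lookahead `(?!...)` is what lets a single ignore pattern express "everything except the listed tests".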
@ -1,6 +1,5 @@
|
|||||||
# Supported sites
|
# Supported sites
|
||||||
- **1tv**: Первый канал
|
- **1tv**: Первый канал
|
||||||
- **1up.com**
|
|
||||||
- **20min**
|
- **20min**
|
||||||
- **220.ro**
|
- **220.ro**
|
||||||
- **23video**
|
- **23video**
|
||||||
@ -35,6 +34,8 @@
|
|||||||
- **adobetv:video**
|
- **adobetv:video**
|
||||||
- **AdultSwim**
|
- **AdultSwim**
|
||||||
- **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault
|
- **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault
|
||||||
|
- **aenetworks:collection**
|
||||||
|
- **aenetworks:show**
|
||||||
- **afreecatv**: afreecatv.com
|
- **afreecatv**: afreecatv.com
|
||||||
- **AirMozilla**
|
- **AirMozilla**
|
||||||
- **AliExpressLive**
|
- **AliExpressLive**
|
||||||
@ -44,21 +45,25 @@
|
|||||||
- **Amara**
|
- **Amara**
|
||||||
- **AMCNetworks**
|
- **AMCNetworks**
|
||||||
- **AmericasTestKitchen**
|
- **AmericasTestKitchen**
|
||||||
|
- **AmericasTestKitchenSeason**
|
||||||
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
|
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
|
||||||
- **AnimeOnDemand**
|
- **AnimeOnDemand**
|
||||||
- **Anvato**
|
- **Anvato**
|
||||||
- **aol.com**
|
- **aol.com**: Yahoo screen and movies
|
||||||
- **APA**
|
- **APA**
|
||||||
- **Aparat**
|
- **Aparat**
|
||||||
- **AppleConnect**
|
- **AppleConnect**
|
||||||
- **AppleDaily**: 臺灣蘋果日報
|
- **AppleDaily**: 臺灣蘋果日報
|
||||||
|
- **ApplePodcasts**
|
||||||
- **appletrailers**
|
- **appletrailers**
|
||||||
- **appletrailers:section**
|
- **appletrailers:section**
|
||||||
- **archive.org**: archive.org videos
|
- **archive.org**: archive.org videos
|
||||||
|
- **ArcPublishing**
|
||||||
- **ARD**
|
- **ARD**
|
||||||
- **ARD:mediathek**
|
- **ARD:mediathek**
|
||||||
- **ARDBetaMediathek**
|
- **ARDBetaMediathek**
|
||||||
- **Arkena**
|
- **Arkena**
|
||||||
|
- **arte.sky.it**
|
||||||
- **ArteTV**
|
- **ArteTV**
|
||||||
- **ArteTVEmbed**
|
- **ArteTVEmbed**
|
||||||
- **ArteTVPlaylist**
|
- **ArteTVPlaylist**
|
||||||
@ -94,6 +99,10 @@
|
|||||||
- **BellMedia**
|
- **BellMedia**
|
||||||
- **Bet**
|
- **Bet**
|
||||||
- **bfi:player**
|
- **bfi:player**
|
||||||
|
- **bfmtv**
|
||||||
|
- **bfmtv:article**
|
||||||
|
- **bfmtv:live**
|
||||||
|
- **BibelTV**
|
||||||
- **Bigflix**
|
- **Bigflix**
|
||||||
- **Bild**: Bild.de
|
- **Bild**: Bild.de
|
||||||
- **BiliBili**
|
- **BiliBili**
|
||||||
@ -101,6 +110,7 @@
|
|||||||
- **BilibiliAudioAlbum**
|
- **BilibiliAudioAlbum**
|
||||||
- **BiliBiliPlayer**
|
- **BiliBiliPlayer**
|
||||||
- **BioBioChileTV**
|
- **BioBioChileTV**
|
||||||
|
- **Biography**
|
||||||
- **BIQLE**
|
- **BIQLE**
|
||||||
- **BitChute**
|
- **BitChute**
|
||||||
- **BitChuteChannel**
|
- **BitChuteChannel**
|
||||||
@ -109,7 +119,9 @@
|
|||||||
- **blinkx**
|
- **blinkx**
|
||||||
- **Bloomberg**
|
- **Bloomberg**
|
||||||
- **BokeCC**
|
- **BokeCC**
|
||||||
|
- **BongaCams**
|
||||||
- **BostonGlobe**
|
- **BostonGlobe**
|
||||||
|
- **Box**
|
||||||
- **Bpb**: Bundeszentrale für politische Bildung
|
- **Bpb**: Bundeszentrale für politische Bildung
|
||||||
- **BR**: Bayerischer Rundfunk
|
- **BR**: Bayerischer Rundfunk
|
||||||
- **BravoTV**
|
- **BravoTV**
|
||||||
@ -142,6 +154,7 @@
|
|||||||
- **CBS**
|
- **CBS**
|
||||||
- **CBSInteractive**
|
- **CBSInteractive**
|
||||||
- **CBSLocal**
|
- **CBSLocal**
|
||||||
|
- **CBSLocalArticle**
|
||||||
- **cbsnews**: CBS News
|
- **cbsnews**: CBS News
|
||||||
- **cbsnews:embed**
|
- **cbsnews:embed**
|
||||||
- **cbsnews:livevideo**: CBS News Live Videos
|
- **cbsnews:livevideo**: CBS News Live Videos
|
||||||
@ -157,6 +170,7 @@
|
|||||||
- **Chilloutzone**
|
- **Chilloutzone**
|
||||||
- **chirbit**
|
- **chirbit**
|
||||||
- **chirbit:profile**
|
- **chirbit:profile**
|
||||||
|
- **cielotv.it**
|
||||||
- **Cinchcast**
|
- **Cinchcast**
|
||||||
- **Cinemax**
|
- **Cinemax**
|
||||||
- **CiscoLiveSearch**
|
- **CiscoLiveSearch**
|
||||||
@ -178,8 +192,6 @@
|
|||||||
- **CNNArticle**
|
- **CNNArticle**
|
||||||
- **CNNBlogs**
|
- **CNNBlogs**
|
||||||
- **ComedyCentral**
|
- **ComedyCentral**
|
||||||
- **ComedyCentralFullEpisodes**
|
|
||||||
- **ComedyCentralShortname**
|
|
||||||
- **ComedyCentralTV**
|
- **ComedyCentralTV**
|
||||||
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
|
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
|
||||||
- **CONtv**
|
- **CONtv**
|
||||||
@ -190,9 +202,9 @@
|
|||||||
- **CrooksAndLiars**
|
- **CrooksAndLiars**
|
||||||
- **crunchyroll**
|
- **crunchyroll**
|
||||||
- **crunchyroll:playlist**
|
- **crunchyroll:playlist**
|
||||||
- **CSNNE**
|
|
||||||
- **CSpan**: C-SPAN
|
- **CSpan**: C-SPAN
|
||||||
- **CtsNews**: 華視新聞
|
- **CtsNews**: 華視新聞
|
||||||
|
- **CTV**
|
||||||
- **CTVNews**
|
- **CTVNews**
|
||||||
- **cu.ntv.co.jp**: Nippon Television Network
|
- **cu.ntv.co.jp**: Nippon Television Network
|
||||||
- **Culturebox**
|
- **Culturebox**
|
||||||
@ -263,7 +275,6 @@
|
|||||||
- **ESPNArticle**
|
- **ESPNArticle**
|
||||||
- **EsriVideo**
|
- **EsriVideo**
|
||||||
- **Europa**
|
- **Europa**
|
||||||
- **EveryonesMixtape**
|
|
||||||
- **EWETV**
|
- **EWETV**
|
||||||
- **ExpoTV**
|
- **ExpoTV**
|
||||||
- **Expressen**
|
- **Expressen**
|
||||||
@ -305,11 +316,11 @@
|
|||||||
- **FrontendMasters**
|
- **FrontendMasters**
|
||||||
- **FrontendMastersCourse**
|
- **FrontendMastersCourse**
|
||||||
- **FrontendMastersLesson**
|
- **FrontendMastersLesson**
|
||||||
|
- **FujiTVFODPlus7**
|
||||||
- **Funimation**
|
- **Funimation**
|
||||||
- **Funk**
|
- **Funk**
|
||||||
- **Fusion**
|
- **Fusion**
|
||||||
- **Fux**
|
- **Fux**
|
||||||
- **FXNetworks**
|
|
||||||
- **Gaia**
|
- **Gaia**
|
||||||
- **GameInformer**
|
- **GameInformer**
|
||||||
- **GameSpot**
|
- **GameSpot**
|
||||||
@ -328,6 +339,8 @@
|
|||||||
- **Go**
|
- **Go**
|
||||||
- **GodTube**
|
- **GodTube**
|
||||||
- **Golem**
|
- **Golem**
|
||||||
|
- **google:podcasts**
|
||||||
|
- **google:podcasts:feed**
|
||||||
- **GoogleDrive**
|
- **GoogleDrive**
|
||||||
- **Goshgay**
|
- **Goshgay**
|
||||||
- **GPUTechConf**
|
- **GPUTechConf**
|
||||||
@ -342,6 +355,7 @@
|
|||||||
- **hgtv.com:show**
|
- **hgtv.com:show**
|
||||||
- **HiDive**
|
- **HiDive**
|
||||||
- **HistoricFilms**
|
- **HistoricFilms**
|
||||||
|
- **history:player**
|
||||||
- **history:topic**: History.com Topic
|
- **history:topic**: History.com Topic
|
||||||
- **hitbox**
|
- **hitbox**
|
||||||
- **hitbox:live**
|
- **hitbox:live**
|
||||||
@ -361,6 +375,10 @@
|
|||||||
- **HungamaSong**
|
- **HungamaSong**
|
||||||
- **Hypem**
|
- **Hypem**
|
||||||
- **ign.com**
|
- **ign.com**
|
||||||
|
- **IGNArticle**
|
||||||
|
- **IGNVideo**
|
||||||
|
- **IHeartRadio**
|
||||||
|
- **iheartradio:podcast**
|
||||||
- **imdb**: Internet Movie Database trailers
|
- **imdb**: Internet Movie Database trailers
|
||||||
- **imdb:list**: Internet Movie Database lists
|
- **imdb:list**: Internet Movie Database lists
|
||||||
- **Imgur**
|
- **Imgur**
|
||||||
@ -394,14 +412,14 @@
|
|||||||
- **JWPlatform**
|
- **JWPlatform**
|
||||||
- **Kakao**
|
- **Kakao**
|
||||||
- **Kaltura**
|
- **Kaltura**
|
||||||
- **KanalPlay**: Kanal 5/9/11 Play
|
|
||||||
- **Kankan**
|
- **Kankan**
|
||||||
- **Karaoketv**
|
- **Karaoketv**
|
||||||
- **KarriereVideos**
|
- **KarriereVideos**
|
||||||
- **Katsomo**
|
- **Katsomo**
|
||||||
- **KeezMovies**
|
- **KeezMovies**
|
||||||
- **Ketnet**
|
- **Ketnet**
|
||||||
- **KhanAcademy**
|
- **khanacademy**
|
||||||
|
- **khanacademy:unit**
|
||||||
- **KickStarter**
|
- **KickStarter**
|
||||||
- **KinjaEmbed**
|
- **KinjaEmbed**
|
||||||
- **KinoPoisk**
|
- **KinoPoisk**
|
||||||
@ -418,7 +436,8 @@
|
|||||||
- **la7.it**
|
- **la7.it**
|
||||||
- **laola1tv**
|
- **laola1tv**
|
||||||
- **laola1tv:embed**
|
- **laola1tv:embed**
|
||||||
- **lbry.tv**
|
- **lbry**
|
||||||
|
- **lbry:channel**
|
||||||
- **LCI**
|
- **LCI**
|
||||||
- **Lcp**
|
- **Lcp**
|
||||||
- **LcpPlay**
|
- **LcpPlay**
|
||||||
@ -468,6 +487,7 @@
|
|||||||
- **massengeschmack.tv**
|
- **massengeschmack.tv**
|
||||||
- **MatchTV**
|
- **MatchTV**
|
||||||
- **MDR**: MDR.DE and KiKA
|
- **MDR**: MDR.DE and KiKA
|
||||||
|
- **MedalTV**
|
||||||
- **media.ccc.de**
|
- **media.ccc.de**
|
||||||
- **media.ccc.de:lists**
|
- **media.ccc.de:lists**
|
||||||
- **Medialaan**
|
- **Medialaan**
|
||||||
@ -482,9 +502,13 @@
|
|||||||
- **META**
|
- **META**
|
||||||
- **metacafe**
|
- **metacafe**
|
||||||
- **Metacritic**
|
- **Metacritic**
|
||||||
|
- **mewatch**
|
||||||
- **Mgoon**
|
- **Mgoon**
|
||||||
- **MGTV**: 芒果TV
|
- **MGTV**: 芒果TV
|
||||||
- **MiaoPai**
|
- **MiaoPai**
|
||||||
|
- **minds**
|
||||||
|
- **minds:channel**
|
||||||
|
- **minds:group**
|
||||||
- **MinistryGrid**
|
- **MinistryGrid**
|
||||||
- **Minoto**
|
- **Minoto**
|
||||||
- **miomio.tv**
|
- **miomio.tv**
|
||||||
@ -492,8 +516,6 @@
|
|||||||
- **mixcloud**
|
- **mixcloud**
|
||||||
- **mixcloud:playlist**
|
- **mixcloud:playlist**
|
||||||
- **mixcloud:user**
|
- **mixcloud:user**
|
||||||
- **Mixer:live**
|
|
||||||
- **Mixer:vod**
|
|
||||||
- **MLB**
|
- **MLB**
|
||||||
- **Mnet**
|
- **Mnet**
|
||||||
- **MNetTV**
|
- **MNetTV**
|
||||||
@ -516,6 +538,7 @@
|
|||||||
- **mtv:video**
|
- **mtv:video**
|
||||||
- **mtvjapan**
|
- **mtvjapan**
|
||||||
- **mtvservices:embedded**
|
- **mtvservices:embedded**
|
||||||
|
- **MTVUutisetArticle**
|
||||||
- **MuenchenTV**: münchen.tv
|
- **MuenchenTV**: münchen.tv
|
||||||
- **mva**: Microsoft Virtual Academy videos
|
- **mva**: Microsoft Virtual Academy videos
|
||||||
- **mva:course**: Microsoft Virtual Academy courses
|
- **mva:course**: Microsoft Virtual Academy courses
|
||||||
@ -534,6 +557,11 @@
|
|||||||
- **NationalGeographicTV**
|
- **NationalGeographicTV**
|
||||||
- **Naver**
|
- **Naver**
|
||||||
- **NBA**
|
- **NBA**
|
||||||
|
- **nba:watch**
|
||||||
|
- **nba:watch:collection**
|
||||||
|
- **NBAChannel**
|
||||||
|
- **NBAEmbed**
|
||||||
|
- **NBAWatchEmbed**
|
||||||
- **NBC**
|
- **NBC**
|
||||||
- **NBCNews**
|
- **NBCNews**
|
||||||
- **nbcolympics**
|
- **nbcolympics**
|
||||||
@ -563,8 +591,10 @@
|
|||||||
- **NextTV**: 壹電視
|
- **NextTV**: 壹電視
|
||||||
- **Nexx**
|
- **Nexx**
|
||||||
- **NexxEmbed**
|
- **NexxEmbed**
|
||||||
- **nfl.com**
|
- **nfl.com** (Currently broken)
|
||||||
|
- **nfl.com:article** (Currently broken)
|
||||||
- **NhkVod**
|
- **NhkVod**
|
||||||
|
- **NhkVodProgram**
|
||||||
- **nhl.com**
|
- **nhl.com**
|
||||||
- **nick.com**
|
- **nick.com**
|
||||||
- **nick.de**
|
- **nick.de**
|
||||||
@ -578,7 +608,6 @@
|
|||||||
- **njoy:embed**
|
- **njoy:embed**
|
||||||
- **NJPWWorld**: 新日本プロレスワールド
|
- **NJPWWorld**: 新日本プロレスワールド
|
||||||
- **NobelPrize**
|
- **NobelPrize**
|
||||||
- **Noco**
|
|
||||||
- **NonkTube**
|
- **NonkTube**
|
||||||
- **Noovo**
|
- **Noovo**
|
||||||
- **Normalboots**
|
- **Normalboots**
|
||||||
@ -596,6 +625,7 @@
|
|||||||
- **Npr**
|
- **Npr**
|
||||||
- **NRK**
|
- **NRK**
|
||||||
- **NRKPlaylist**
|
- **NRKPlaylist**
|
||||||
|
- **NRKRadioPodkast**
|
||||||
- **NRKSkole**: NRK Skole
|
- **NRKSkole**: NRK Skole
|
||||||
- **NRKTV**: NRK TV and NRK Radio
|
- **NRKTV**: NRK TV and NRK Radio
|
||||||
- **NRKTVDirekte**: NRK TV Direkte and NRK Radio Direkte
|
- **NRKTVDirekte**: NRK TV Direkte and NRK Radio Direkte
|
||||||
@ -608,6 +638,7 @@
|
|||||||
- **Nuvid**
|
- **Nuvid**
|
||||||
- **NYTimes**
|
- **NYTimes**
|
||||||
- **NYTimesArticle**
|
- **NYTimesArticle**
|
||||||
|
- **NYTimesCooking**
|
||||||
- **NZZ**
|
- **NZZ**
|
||||||
- **ocw.mit.edu**
|
- **ocw.mit.edu**
|
||||||
- **OdaTV**
|
- **OdaTV**
|
||||||
@ -646,7 +677,6 @@
|
|||||||
- **parliamentlive.tv**: UK parliament videos
|
- **parliamentlive.tv**: UK parliament videos
|
||||||
- **Patreon**
|
- **Patreon**
|
||||||
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! 
(WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
|
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! 
(WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
|
||||||
- **pcmag**
|
|
||||||
- **PearVideo**
|
- **PearVideo**
|
||||||
- **PeerTube**
|
- **PeerTube**
|
||||||
- **People**
|
- **People**
|
||||||
@ -660,10 +690,13 @@
|
|||||||
- **PicartoVod**
|
- **PicartoVod**
|
||||||
- **Piksel**
|
- **Piksel**
|
||||||
- **Pinkbike**
|
- **Pinkbike**
|
||||||
|
- **Pinterest**
|
||||||
|
- **PinterestCollection**
|
||||||
- **Pladform**
|
- **Pladform**
|
||||||
- **Platzi**
|
- **Platzi**
|
||||||
- **PlatziCourse**
|
- **PlatziCourse**
|
||||||
- **play.fm**
|
- **play.fm**
|
||||||
|
- **player.sky.it**
|
||||||
- **PlayPlusTV**
|
- **PlayPlusTV**
|
||||||
- **PlaysTV**
|
- **PlaysTV**
|
||||||
- **Playtvak**: Playtvak.cz, iDNES.cz and Lidovky.cz
|
- **Playtvak**: Playtvak.cz, iDNES.cz and Lidovky.cz
|
||||||
@ -671,7 +704,6 @@
|
|||||||
- **Playwire**
|
- **Playwire**
|
||||||
- **pluralsight**
|
- **pluralsight**
|
||||||
- **pluralsight:course**
|
- **pluralsight:course**
|
||||||
- **plus.google**: Google Plus
|
|
||||||
- **podomatic**
|
- **podomatic**
|
||||||
- **Pokemon**
|
- **Pokemon**
|
||||||
- **PolskieRadio**
|
- **PolskieRadio**
|
||||||
@ -701,6 +733,7 @@
|
|||||||
- **qqmusic:singer**: QQ音乐 - 歌手
|
- **qqmusic:singer**: QQ音乐 - 歌手
|
||||||
- **qqmusic:toplist**: QQ音乐 - 排行榜
|
- **qqmusic:toplist**: QQ音乐 - 排行榜
|
||||||
- **QuantumTV**
|
- **QuantumTV**
|
||||||
|
- **Qub**
|
||||||
- **Quickline**
|
- **Quickline**
|
||||||
- **QuicklineLive**
|
- **QuicklineLive**
|
||||||
- **R7**
|
- **R7**
|
||||||
@ -755,6 +788,7 @@
|
|||||||
- **RTVNH**
|
- **RTVNH**
|
||||||
- **RTVS**
|
- **RTVS**
|
||||||
- **RUHD**
|
- **RUHD**
|
||||||
|
- **RumbleEmbed**
|
||||||
- **rutube**: Rutube videos
|
- **rutube**: Rutube videos
|
||||||
- **rutube:channel**: Rutube channels
|
- **rutube:channel**: Rutube channels
|
||||||
- **rutube:embed**: Rutube embedded videos
|
- **rutube:embed**: Rutube embedded videos
|
||||||
@ -792,18 +826,17 @@
|
|||||||
- **Shared**: shared.sx
|
- **Shared**: shared.sx
|
||||||
- **ShowRoomLive**
|
- **ShowRoomLive**
|
||||||
- **Sina**
|
- **Sina**
|
||||||
|
- **sky.it**
|
||||||
|
- **sky:news**
|
||||||
|
- **sky:sports**
|
||||||
|
- **sky:sports:news**
|
||||||
|
- **skyacademy.it**
|
||||||
- **SkylineWebcams**
|
- **SkylineWebcams**
|
||||||
- **SkyNews**
|
|
||||||
- **skynewsarabia:article**
|
- **skynewsarabia:article**
|
||||||
- **skynewsarabia:video**
|
- **skynewsarabia:video**
|
||||||
- **SkySports**
|
|
||||||
- **Slideshare**
|
- **Slideshare**
|
||||||
- **SlidesLive**
|
- **SlidesLive**
|
||||||
- **Slutload**
|
- **Slutload**
|
||||||
- **smotri**: Smotri.com
|
|
||||||
- **smotri:broadcast**: Smotri.com broadcasts
|
|
||||||
- **smotri:community**: Smotri.com community videos
|
|
||||||
- **smotri:user**: Smotri.com user videos
|
|
||||||
- **Snotr**
|
- **Snotr**
|
||||||
- **Sohu**
|
- **Sohu**
|
||||||
- **SonyLIV**
|
- **SonyLIV**
|
||||||
@ -829,6 +862,12 @@
|
|||||||
- **Sport5**
|
- **Sport5**
|
||||||
- **SportBox**
|
- **SportBox**
|
||||||
- **SportDeutschland**
|
- **SportDeutschland**
|
||||||
|
- **spotify**
|
||||||
|
- **spotify:show**
|
||||||
|
- **Spreaker**
|
||||||
|
- **SpreakerPage**
|
||||||
|
- **SpreakerShow**
|
||||||
|
- **SpreakerShowPage**
|
||||||
- **SpringboardPlatform**
|
- **SpringboardPlatform**
|
||||||
- **Sprout**
|
- **Sprout**
|
||||||
- **sr:mediathek**: Saarländischer Rundfunk
|
- **sr:mediathek**: Saarländischer Rundfunk
|
||||||
@ -837,6 +876,7 @@
|
|||||||
- **stanfordoc**: Stanford Open ClassRoom
|
- **stanfordoc**: Stanford Open ClassRoom
|
||||||
- **Steam**
|
- **Steam**
|
||||||
- **Stitcher**
|
- **Stitcher**
|
||||||
|
- **StitcherShow**
|
||||||
- **Streamable**
|
- **Streamable**
|
||||||
- **streamcloud.eu**
|
- **streamcloud.eu**
|
||||||
- **StreamCZ**
|
- **StreamCZ**
|
||||||
@ -857,7 +897,6 @@
|
|||||||
- **Tagesschau**
|
- **Tagesschau**
|
||||||
- **tagesschau:player**
|
- **tagesschau:player**
|
||||||
- **Tass**
|
- **Tass**
|
||||||
- **TastyTrade**
|
|
||||||
- **TBS**
|
- **TBS**
|
||||||
- **TDSLifeway**
|
- **TDSLifeway**
|
||||||
- **Teachable**
|
- **Teachable**
|
||||||
@ -880,6 +919,7 @@
|
|||||||
- **TeleQuebecEmission**
|
- **TeleQuebecEmission**
|
||||||
- **TeleQuebecLive**
|
- **TeleQuebecLive**
|
||||||
- **TeleQuebecSquat**
|
- **TeleQuebecSquat**
|
||||||
|
- **TeleQuebecVideo**
|
||||||
- **TeleTask**
|
- **TeleTask**
|
||||||
- **Telewebion**
|
- **Telewebion**
|
||||||
- **TennisTV**
|
- **TennisTV**
|
||||||
@ -897,7 +937,7 @@
|
|||||||
- **ThisAV**
|
- **ThisAV**
|
||||||
- **ThisOldHouse**
|
- **ThisOldHouse**
|
||||||
- **TikTok**
|
- **TikTok**
|
||||||
- **TikTokUser**
|
- **TikTokUser** (Currently broken)
|
||||||
- **tinypic**: tinypic.com videos
|
- **tinypic**: tinypic.com videos
|
||||||
- **TMZ**
|
- **TMZ**
|
||||||
- **TMZArticle**
|
- **TMZArticle**
|
||||||
@ -905,12 +945,13 @@
|
|||||||
- **TNAFlixNetworkEmbed**
|
- **TNAFlixNetworkEmbed**
|
||||||
- **toggle**
|
- **toggle**
|
||||||
- **ToonGoggles**
|
- **ToonGoggles**
|
||||||
- **Tosh**: Tosh.0
|
|
||||||
- **tou.tv**
|
- **tou.tv**
|
||||||
- **Toypics**: Toypics video
|
- **Toypics**: Toypics video
|
||||||
- **ToypicsUser**: Toypics user profile
|
- **ToypicsUser**: Toypics user profile
|
||||||
- **TrailerAddict** (Currently broken)
|
- **TrailerAddict** (Currently broken)
|
||||||
- **Trilulilu**
|
- **Trilulilu**
|
||||||
|
- **Trovo**
|
||||||
|
- **TrovoVod**
|
||||||
- **TruNews**
|
- **TruNews**
|
||||||
- **TruTV**
|
- **TruTV**
|
||||||
- **Tube8**
|
- **Tube8**
|
||||||
@ -930,11 +971,15 @@
|
|||||||
- **TV2DKBornholmPlay**
|
- **TV2DKBornholmPlay**
|
||||||
- **TV4**: tv4.se and tv4play.se
|
- **TV4**: tv4.se and tv4play.se
|
||||||
- **TV5MondePlus**: TV5MONDE+
|
- **TV5MondePlus**: TV5MONDE+
|
||||||
|
- **tv5unis**
|
||||||
|
- **tv5unis:video**
|
||||||
|
- **tv8.it**
|
||||||
- **TVA**
|
- **TVA**
|
||||||
- **TVANouvelles**
|
- **TVANouvelles**
|
||||||
- **TVANouvellesArticle**
|
- **TVANouvellesArticle**
|
||||||
- **TVC**
|
- **TVC**
|
||||||
- **TVCArticle**
|
- **TVCArticle**
|
||||||
|
- **TVer**
|
||||||
- **tvigle**: Интернет-телевидение Tvigle.ru
|
- **tvigle**: Интернет-телевидение Tvigle.ru
|
||||||
- **tvland.com**
|
- **tvland.com**
|
||||||
- **TVN24**
|
- **TVN24**
|
||||||
@ -1001,6 +1046,8 @@
|
|||||||
- **Viddler**
|
- **Viddler**
|
||||||
- **Videa**
|
- **Videa**
|
||||||
- **video.google:search**: Google Video search
|
- **video.google:search**: Google Video search
|
||||||
|
- **video.sky.it**
|
||||||
|
- **video.sky.it:live**
|
||||||
- **VideoDetective**
|
- **VideoDetective**
|
||||||
- **videofy.me**
|
- **videofy.me**
|
||||||
- **videomore**
|
- **videomore**
|
||||||
@ -1012,7 +1059,6 @@
|
|||||||
- **vidme**
|
- **vidme**
|
||||||
- **vidme:user**
|
- **vidme:user**
|
||||||
- **vidme:user:likes**
|
- **vidme:user:likes**
|
||||||
- **Vidzi**
|
|
||||||
- **vier**: vier.be and vijf.be
|
- **vier**: vier.be and vijf.be
|
||||||
- **vier:videos**
|
- **vier:videos**
|
||||||
- **viewlift**
|
- **viewlift**
|
||||||
@ -1042,6 +1088,7 @@
|
|||||||
- **vk:wallpost**
|
- **vk:wallpost**
|
||||||
- **vlive**
|
- **vlive**
|
||||||
- **vlive:channel**
|
- **vlive:channel**
|
||||||
|
- **vlive:post**
|
||||||
- **Vodlocker**
|
- **Vodlocker**
|
||||||
- **VODPl**
|
- **VODPl**
|
||||||
- **VODPlatform**
|
- **VODPlatform**
|
||||||
@ -1056,10 +1103,12 @@
|
|||||||
- **vrv**
|
- **vrv**
|
||||||
- **vrv:series**
|
- **vrv:series**
|
||||||
- **VShare**
|
- **VShare**
|
||||||
|
- **VTM**
|
||||||
- **VTXTV**
|
- **VTXTV**
|
||||||
- **vube**: Vube.com
|
- **vube**: Vube.com
|
||||||
- **VuClip**
|
- **VuClip**
|
||||||
- **VVVVID**
|
- **VVVVID**
|
||||||
|
- **VVVVIDShow**
|
||||||
- **VyboryMos**
|
- **VyboryMos**
|
||||||
- **Vzaar**
|
- **Vzaar**
|
||||||
- **Wakanim**
|
- **Wakanim**
|
||||||
@ -1082,6 +1131,7 @@
|
|||||||
- **WeiboMobile**
|
- **WeiboMobile**
|
||||||
- **WeiqiTV**: WQTV
|
- **WeiqiTV**: WQTV
|
||||||
- **Wistia**
|
- **Wistia**
|
||||||
|
- **WistiaPlaylist**
|
||||||
- **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
|
- **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
|
||||||
- **WorldStarHipHop**
|
- **WorldStarHipHop**
|
||||||
- **WSJ**: Wall Street Journal
|
- **WSJ**: Wall Street Journal
|
||||||
@ -1089,7 +1139,7 @@
|
|||||||
- **WWE**
|
- **WWE**
|
||||||
- **XBef**
|
- **XBef**
|
||||||
- **XboxClips**
|
- **XboxClips**
|
||||||
- **XFileShare**: XFileShare based sites: ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, XVideoSharing
|
- **XFileShare**: XFileShare based sites: Aparat, ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, XVideoSharing
|
||||||
- **XHamster**
|
- **XHamster**
|
||||||
- **XHamsterEmbed**
|
- **XHamsterEmbed**
|
||||||
- **XHamsterUser**
|
- **XHamsterUser**
|
||||||
@ -1113,6 +1163,8 @@
|
|||||||
- **yahoo:japannews**: Yahoo! Japan News
|
- **yahoo:japannews**: Yahoo! Japan News
|
||||||
- **YandexDisk**
|
- **YandexDisk**
|
||||||
- **yandexmusic:album**: Яндекс.Музыка - Альбом
|
- **yandexmusic:album**: Яндекс.Музыка - Альбом
|
||||||
|
- **yandexmusic:artist:albums**: Яндекс.Музыка - Артист - Альбомы
|
||||||
|
- **yandexmusic:artist:tracks**: Яндекс.Музыка - Артист - Треки
|
||||||
- **yandexmusic:playlist**: Яндекс.Музыка - Плейлист
|
- **yandexmusic:playlist**: Яндекс.Музыка - Плейлист
|
||||||
- **yandexmusic:track**: Яндекс.Музыка - Трек
|
- **yandexmusic:track**: Яндекс.Музыка - Трек
|
||||||
- **YandexVideo**
|
- **YandexVideo**
|
||||||
@ -1130,6 +1182,7 @@
|
|||||||
- **YourPorn**
|
- **YourPorn**
|
||||||
- **YourUpload**
|
- **YourUpload**
|
||||||
- **youtube**: YouTube.com
|
- **youtube**: YouTube.com
|
||||||
|
- **youtube:favorites**: YouTube.com favourite videos, ":ytfav" for short (requires authentication)
|
||||||
- **youtube:history**: Youtube watch history, ":ythistory" for short (requires authentication)
|
- **youtube:history**: Youtube watch history, ":ythistory" for short (requires authentication)
|
||||||
- **youtube:playlist**: YouTube.com playlists
|
- **youtube:playlist**: YouTube.com playlists
|
||||||
- **youtube:recommended**: YouTube.com recommended videos, ":ytrec" for short (requires authentication)
|
- **youtube:recommended**: YouTube.com recommended videos, ":ytrec" for short (requires authentication)
|
||||||
@ -1138,9 +1191,9 @@
|
|||||||
- **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
|
- **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
|
||||||
- **youtube:tab**: YouTube.com tab
|
- **youtube:tab**: YouTube.com tab
|
||||||
- **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
|
- **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
|
||||||
|
- **YoutubeYtBe**
|
||||||
- **YoutubeYtUser**
|
- **YoutubeYtUser**
|
||||||
- **Zapiks**
|
- **Zapiks**
|
||||||
- **Zaq1**
|
|
||||||
- **Zattoo**
|
- **Zattoo**
|
||||||
- **ZattooLive**
|
- **ZattooLive**
|
||||||
- **ZDF**
|
- **ZDF**
|
||||||
|
@ -98,6 +98,55 @@ class TestInfoExtractor(unittest.TestCase):
|
|||||||
self.assertRaises(RegexNotFoundError, ie._html_search_meta, 'z', html, None, fatal=True)
|
self.assertRaises(RegexNotFoundError, ie._html_search_meta, 'z', html, None, fatal=True)
|
||||||
self.assertRaises(RegexNotFoundError, ie._html_search_meta, ('z', 'x'), html, None, fatal=True)
|
self.assertRaises(RegexNotFoundError, ie._html_search_meta, ('z', 'x'), html, None, fatal=True)
|
||||||
|
|
||||||
|
def test_search_json_ld_realworld(self):
|
||||||
|
# https://github.com/ytdl-org/youtube-dl/issues/23306
|
||||||
|
expect_dict(
|
||||||
|
self,
|
||||||
|
self.ie._search_json_ld(r'''<script type="application/ld+json">
|
||||||
|
{
|
||||||
|
"@context": "http://schema.org/",
|
||||||
|
"@type": "VideoObject",
|
||||||
|
"name": "1 On 1 With Kleio",
|
||||||
|
"url": "https://www.eporner.com/hd-porn/xN49A1cT3eB/1-On-1-With-Kleio/",
|
||||||
|
"duration": "PT0H12M23S",
|
||||||
|
"thumbnailUrl": ["https://static-eu-cdn.eporner.com/thumbs/static4/7/78/780/780814/9_360.jpg", "https://imggen.eporner.com/780814/1920/1080/9.jpg"],
|
||||||
|
"contentUrl": "https://gvideo.eporner.com/xN49A1cT3eB/xN49A1cT3eB.mp4",
|
||||||
|
"embedUrl": "https://www.eporner.com/embed/xN49A1cT3eB/1-On-1-With-Kleio/",
|
||||||
|
"image": "https://static-eu-cdn.eporner.com/thumbs/static4/7/78/780/780814/9_360.jpg",
|
||||||
|
"width": "1920",
|
||||||
|
"height": "1080",
|
||||||
|
"encodingFormat": "mp4",
|
||||||
|
"bitrate": "6617kbps",
|
||||||
|
"isFamilyFriendly": "False",
|
||||||
|
"description": "Kleio Valentien",
|
||||||
|
"uploadDate": "2015-12-05T21:24:35+01:00",
|
||||||
|
"interactionStatistic": {
|
||||||
|
"@type": "InteractionCounter",
|
||||||
|
"interactionType": { "@type": "http://schema.org/WatchAction" },
|
||||||
|
"userInteractionCount": 1120958
|
||||||
|
}, "aggregateRating": {
|
||||||
|
"@type": "AggregateRating",
|
||||||
|
"ratingValue": "88",
|
||||||
|
"ratingCount": "630",
|
||||||
|
"bestRating": "100",
|
||||||
|
"worstRating": "0"
|
||||||
|
}, "actor": [{
|
||||||
|
"@type": "Person",
|
||||||
|
"name": "Kleio Valentien",
|
||||||
|
"url": "https://www.eporner.com/pornstar/kleio-valentien/"
|
||||||
|
}]}
|
||||||
|
</script>''', None),
|
||||||
|
{
|
||||||
|
'title': '1 On 1 With Kleio',
|
||||||
|
'description': 'Kleio Valentien',
|
||||||
|
'url': 'https://gvideo.eporner.com/xN49A1cT3eB/xN49A1cT3eB.mp4',
|
||||||
|
'timestamp': 1449347075,
|
||||||
|
'duration': 743.0,
|
||||||
|
'view_count': 1120958,
|
||||||
|
'width': 1920,
|
||||||
|
'height': 1080,
|
||||||
|
})
|
||||||
|
|
||||||
def test_download_json(self):
|
def test_download_json(self):
|
||||||
uri = encode_data_uri(b'{"foo": "blah"}', 'application/json')
|
uri = encode_data_uri(b'{"foo": "blah"}', 'application/json')
|
||||||
self.assertEqual(self.ie._download_json(uri, None), {'foo': 'blah'})
|
self.assertEqual(self.ie._download_json(uri, None), {'foo': 'blah'})
|
||||||
@ -108,6 +157,18 @@ class TestInfoExtractor(unittest.TestCase):
|
|||||||
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
|
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
|
||||||
|
|
||||||
def test_parse_html5_media_entries(self):
|
def test_parse_html5_media_entries(self):
|
||||||
|
# inline video tag
|
||||||
|
expect_dict(
|
||||||
|
self,
|
||||||
|
self.ie._parse_html5_media_entries(
|
||||||
|
'https://127.0.0.1/video.html',
|
||||||
|
r'<html><video src="/vid.mp4" /></html>', None)[0],
|
||||||
|
{
|
||||||
|
'formats': [{
|
||||||
|
'url': 'https://127.0.0.1/vid.mp4',
|
||||||
|
}],
|
||||||
|
})
|
||||||
|
|
||||||
# from https://www.r18.com/
|
# from https://www.r18.com/
|
||||||
# with kpbs in label
|
# with kpbs in label
|
||||||
expect_dict(
|
expect_dict(
|
||||||
|
@ -464,6 +464,7 @@ class TestFormatSelection(unittest.TestCase):
|
|||||||
assert_syntax_error('+bestaudio')
|
assert_syntax_error('+bestaudio')
|
||||||
assert_syntax_error('bestvideo+')
|
assert_syntax_error('bestvideo+')
|
||||||
assert_syntax_error('/')
|
assert_syntax_error('/')
|
||||||
|
assert_syntax_error('bestvideo+bestvideo+bestaudio')
|
||||||
|
|
||||||
def test_format_filtering(self):
|
def test_format_filtering(self):
|
||||||
formats = [
|
formats = [
|
||||||
@ -632,13 +633,20 @@ class TestYoutubeDL(unittest.TestCase):
|
|||||||
'title2': '%PATH%',
|
'title2': '%PATH%',
|
||||||
}
|
}
|
||||||
|
|
||||||
def fname(templ):
|
def fname(templ, na_placeholder='NA'):
|
||||||
ydl = YoutubeDL({'outtmpl': templ})
|
params = {'outtmpl': templ}
|
||||||
|
if na_placeholder != 'NA':
|
||||||
|
params['outtmpl_na_placeholder'] = na_placeholder
|
||||||
|
ydl = YoutubeDL(params)
|
||||||
return ydl.prepare_filename(info)
|
return ydl.prepare_filename(info)
|
||||||
self.assertEqual(fname('%(id)s.%(ext)s'), '1234.mp4')
|
self.assertEqual(fname('%(id)s.%(ext)s'), '1234.mp4')
|
||||||
self.assertEqual(fname('%(id)s-%(width)s.%(ext)s'), '1234-NA.mp4')
|
self.assertEqual(fname('%(id)s-%(width)s.%(ext)s'), '1234-NA.mp4')
|
||||||
# Replace missing fields with 'NA'
|
NA_TEST_OUTTMPL = '%(uploader_date)s-%(width)d-%(id)s.%(ext)s'
|
||||||
self.assertEqual(fname('%(uploader_date)s-%(id)s.%(ext)s'), 'NA-1234.mp4')
|
# Replace missing fields with 'NA' by default
|
||||||
|
self.assertEqual(fname(NA_TEST_OUTTMPL), 'NA-NA-1234.mp4')
|
||||||
|
# Or by provided placeholder
|
||||||
|
self.assertEqual(fname(NA_TEST_OUTTMPL, na_placeholder='none'), 'none-none-1234.mp4')
|
||||||
|
self.assertEqual(fname(NA_TEST_OUTTMPL, na_placeholder=''), '--1234.mp4')
|
||||||
self.assertEqual(fname('%(height)d.%(ext)s'), '1080.mp4')
|
self.assertEqual(fname('%(height)d.%(ext)s'), '1080.mp4')
|
||||||
self.assertEqual(fname('%(height)6d.%(ext)s'), ' 1080.mp4')
|
self.assertEqual(fname('%(height)6d.%(ext)s'), ' 1080.mp4')
|
||||||
self.assertEqual(fname('%(height)-6d.%(ext)s'), '1080 .mp4')
|
self.assertEqual(fname('%(height)-6d.%(ext)s'), '1080 .mp4')
|
||||||
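The `fname` assertions in this hunk exercise the new `outtmpl_na_placeholder` parameter: a field missing from the info dict is rendered as the placeholder, even when the template asks for a numeric conversion such as `%(width)d`. The following is a minimal standalone sketch of that behaviour, not the code in `YoutubeDL.prepare_filename`. The helper name `render_outtmpl` and the regular expression are assumptions made for illustration, and filename sanitization is left out.

```python
import re
from collections import defaultdict

# Matches one "%(name)<flags><width><.precision><type>" field reference.
# Hypothetical helper pattern; it does not handle escaped "%%" sequences.
_FIELD_RE = re.compile(
    r'%\((?P<key>[^)]+)\)[-+ #0]*\d*(?:\.\d+)?[diouxXeEfFgGcrs]')


def render_outtmpl(template, info, na_placeholder='NA'):
    """Fill an output template, substituting a placeholder for missing fields."""

    def downgrade(match):
        # A missing field cannot be formatted with %d, so its conversion
        # is rewritten to plain %s before the placeholder is inserted.
        key = match.group('key')
        if info.get(key) is None:
            return '%({0})s'.format(key)
        return match.group(0)

    safe_template = _FIELD_RE.sub(downgrade, template)
    # Any key that is absent (or None) resolves to the placeholder.
    values = defaultdict(
        lambda: na_placeholder,
        {k: v for k, v in info.items() if v is not None})
    return safe_template % values


if __name__ == '__main__':
    info = {'id': '1234', 'ext': 'mp4', 'height': 1080}
    template = '%(uploader_date)s-%(width)d-%(id)s.%(ext)s'
    print(render_outtmpl(template, info))
    print(render_outtmpl(template, info, na_placeholder='none'))
```

Rewriting the conversion to `%s` only for missing fields keeps width and alignment flags working for fields that are present, which is what the `%(height)6d` assertions above rely on.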
|
@ -36,7 +36,7 @@ class TestAllURLsMatching(unittest.TestCase):
|
|||||||
assertPlaylist('UUBABnxM4Ar9ten8Mdjj1j0Q') # 585
|
assertPlaylist('UUBABnxM4Ar9ten8Mdjj1j0Q') # 585
|
||||||
assertPlaylist('PL63F0C78739B09958')
|
assertPlaylist('PL63F0C78739B09958')
|
||||||
assertTab('https://www.youtube.com/playlist?list=UUBABnxM4Ar9ten8Mdjj1j0Q')
|
assertTab('https://www.youtube.com/playlist?list=UUBABnxM4Ar9ten8Mdjj1j0Q')
|
||||||
assertPlaylist('https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')
|
assertTab('https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')
|
||||||
assertTab('https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC')
|
assertTab('https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC')
|
||||||
assertTab('https://www.youtube.com/watch?v=AV6J6_AeFEQ&playnext=1&list=PL4023E734DA416012') # 668
|
assertTab('https://www.youtube.com/watch?v=AV6J6_AeFEQ&playnext=1&list=PL4023E734DA416012') # 668
|
||||||
self.assertFalse('youtube:playlist' in self.matching_ies('PLtS2H6bU1M'))
|
self.assertFalse('youtube:playlist' in self.matching_ies('PLtS2H6bU1M'))
|
||||||
@ -57,13 +57,14 @@ class TestAllURLsMatching(unittest.TestCase):
|
|||||||
assertChannel('https://www.youtube.com/channel/HCtnHdj3df7iM?feature=gb_ch_rec')
|
assertChannel('https://www.youtube.com/channel/HCtnHdj3df7iM?feature=gb_ch_rec')
|
||||||
assertChannel('https://www.youtube.com/channel/HCtnHdj3df7iM/videos')
|
assertChannel('https://www.youtube.com/channel/HCtnHdj3df7iM/videos')
|
||||||
|
|
||||||
# def test_youtube_user_matching(self):
|
def test_youtube_user_matching(self):
|
||||||
# self.assertMatch('http://www.youtube.com/NASAgovVideo/videos', ['youtube:tab'])
|
self.assertMatch('http://www.youtube.com/NASAgovVideo/videos', ['youtube:tab'])
|
||||||
|
|
||||||
def test_youtube_feeds(self):
|
def test_youtube_feeds(self):
|
||||||
self.assertMatch('https://www.youtube.com/feed/watch_later', ['youtube:watchlater'])
|
self.assertMatch('https://www.youtube.com/feed/library', ['youtube:tab'])
|
||||||
self.assertMatch('https://www.youtube.com/feed/subscriptions', ['youtube:subscriptions'])
|
self.assertMatch('https://www.youtube.com/feed/history', ['youtube:tab'])
|
||||||
self.assertMatch('https://www.youtube.com/feed/recommended', ['youtube:recommended'])
|
self.assertMatch('https://www.youtube.com/feed/watch_later', ['youtube:tab'])
|
||||||
|
self.assertMatch('https://www.youtube.com/feed/subscriptions', ['youtube:tab'])
|
||||||
|
|
||||||
# def test_youtube_search_matching(self):
|
# def test_youtube_search_matching(self):
|
||||||
# self.assertMatch('http://www.youtube.com/results?search_query=making+mustard', ['youtube:search_url'])
|
# self.assertMatch('http://www.youtube.com/results?search_query=making+mustard', ['youtube:search_url'])
|
||||||
|
@ -258,16 +258,24 @@ class TestNRKSubtitles(BaseTestSubtitles):
|
|||||||
|
|
||||||
|
|
||||||
class TestRaiPlaySubtitles(BaseTestSubtitles):
|
class TestRaiPlaySubtitles(BaseTestSubtitles):
|
||||||
url = 'http://www.raiplay.it/video/2014/04/Report-del-07042014-cb27157f-9dd0-4aee-b788-b1f67643a391.html'
|
|
||||||
IE = RaiPlayIE
|
IE = RaiPlayIE
|
||||||
|
|
||||||
def test_allsubtitles(self):
|
def test_subtitles_key(self):
|
||||||
|
self.url = 'http://www.raiplay.it/video/2014/04/Report-del-07042014-cb27157f-9dd0-4aee-b788-b1f67643a391.html'
|
||||||
self.DL.params['writesubtitles'] = True
|
self.DL.params['writesubtitles'] = True
|
||||||
self.DL.params['allsubtitles'] = True
|
self.DL.params['allsubtitles'] = True
|
||||||
subtitles = self.getSubtitles()
|
subtitles = self.getSubtitles()
|
||||||
self.assertEqual(set(subtitles.keys()), set(['it']))
|
self.assertEqual(set(subtitles.keys()), set(['it']))
|
||||||
self.assertEqual(md5(subtitles['it']), 'b1d90a98755126b61e667567a1f6680a')
|
self.assertEqual(md5(subtitles['it']), 'b1d90a98755126b61e667567a1f6680a')
|
||||||
|
|
||||||
|
def test_subtitles_array_key(self):
|
||||||
|
self.url = 'https://www.raiplay.it/video/2020/12/Report---04-01-2021-2e90f1de-8eee-4de4-ac0e-78d21db5b600.html'
|
||||||
|
self.DL.params['writesubtitles'] = True
|
||||||
|
self.DL.params['allsubtitles'] = True
|
||||||
|
subtitles = self.getSubtitles()
|
||||||
|
self.assertEqual(set(subtitles.keys()), set(['it']))
|
||||||
|
self.assertEqual(md5(subtitles['it']), '4b3264186fbb103508abe5311cfcb9cd')
|
||||||
|
|
||||||
|
|
||||||
class TestVikiSubtitles(BaseTestSubtitles):
|
class TestVikiSubtitles(BaseTestSubtitles):
|
||||||
url = 'http://www.viki.com/videos/1060846v-punch-episode-18'
|
url = 'http://www.viki.com/videos/1060846v-punch-episode-18'
|
||||||
|
@ -21,6 +21,7 @@ from youtube_dl.utils import (
|
|||||||
encode_base_n,
|
encode_base_n,
|
||||||
caesar,
|
caesar,
|
||||||
clean_html,
|
clean_html,
|
||||||
|
clean_podcast_url,
|
||||||
date_from_str,
|
date_from_str,
|
||||||
DateRange,
|
DateRange,
|
||||||
detect_exe_version,
|
detect_exe_version,
|
||||||
@ -554,6 +555,11 @@ class TestUtil(unittest.TestCase):
|
|||||||
self.assertEqual(url_or_none('http$://foo.de'), None)
|
self.assertEqual(url_or_none('http$://foo.de'), None)
|
||||||
self.assertEqual(url_or_none('http://foo.de'), 'http://foo.de')
|
self.assertEqual(url_or_none('http://foo.de'), 'http://foo.de')
|
||||||
self.assertEqual(url_or_none('//foo.de'), '//foo.de')
|
self.assertEqual(url_or_none('//foo.de'), '//foo.de')
|
||||||
|
self.assertEqual(url_or_none('s3://foo.de'), None)
|
||||||
|
self.assertEqual(url_or_none('rtmpte://foo.de'), 'rtmpte://foo.de')
|
||||||
|
self.assertEqual(url_or_none('mms://foo.de'), 'mms://foo.de')
|
||||||
|
self.assertEqual(url_or_none('rtspu://foo.de'), 'rtspu://foo.de')
|
||||||
|
self.assertEqual(url_or_none('ftps://foo.de'), 'ftps://foo.de')
|
||||||
|
|
||||||
def test_parse_age_limit(self):
|
def test_parse_age_limit(self):
|
||||||
self.assertEqual(parse_age_limit(None), None)
|
self.assertEqual(parse_age_limit(None), None)
|
||||||
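The new `url_or_none` assertions above accept `rtmpte`, `mms`, `rtspu` and `ftps` URLs and reject `s3`. The sketch below shows one scheme whitelist that would satisfy those assertions. The function name `url_or_none_sketch` and the exact regular expression are assumptions made for illustration (Python 3 only); this is not necessarily the code in `youtube_dl/utils.py`.

```python
import re

# Assumed whitelist: http(s), the rtmp/rtsp families, mms and ftp(s).
# A scheme-relative URL ("//host") is accepted as well.
_URL_RE = re.compile(
    r'^(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?):)?//')


def url_or_none_sketch(url):
    """Return the stripped URL if its scheme is whitelisted, else None."""
    if not url or not isinstance(url, str):
        return None
    url = url.strip()
    return url if _URL_RE.match(url) else None
```

Anchoring the pattern at the start of the string is what rejects `http$://foo.de`: the `$` sits where the `:` is required.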
@ -1465,6 +1471,10 @@ Line 1
|
|||||||
self.assertEqual(get_elements_by_attribute('class', 'foo', html), [])
|
self.assertEqual(get_elements_by_attribute('class', 'foo', html), [])
|
||||||
self.assertEqual(get_elements_by_attribute('class', 'no-such-foo', html), [])
|
self.assertEqual(get_elements_by_attribute('class', 'no-such-foo', html), [])
|
||||||
|
|
||||||
|
def test_clean_podcast_url(self):
|
||||||
|
self.assertEqual(clean_podcast_url('https://www.podtrac.com/pts/redirect.mp3/chtbl.com/track/5899E/traffic.megaphone.fm/HSW7835899191.mp3'), 'https://traffic.megaphone.fm/HSW7835899191.mp3')
|
||||||
|
self.assertEqual(clean_podcast_url('https://play.podtrac.com/npr-344098539/edge1.pod.npr.org/anon.npr-podcasts/podcast/npr/waitwait/2020/10/20201003_waitwait_wwdtmpodcast201003-015621a5-f035-4eca-a9a1-7c118d90bc3c.mp3'), 'https://edge1.pod.npr.org/anon.npr-podcasts/podcast/npr/waitwait/2020/10/20201003_waitwait_wwdtmpodcast201003-015621a5-f035-4eca-a9a1-7c118d90bc3c.mp3')
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
unittest.main()
|
unittest.main()
|
||||||
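The `test_clean_podcast_url` cases above expect tracking prefixes (podtrac, chtbl) to be stripped from a podcast media URL. A minimal sketch of that idea follows. It covers only the prefixes that appear in the two test URLs; the name `clean_podcast_url_sketch` and the pattern are assumptions and may differ from the real `clean_podcast_url` in `youtube_dl/utils.py`.

```python
import re

# Assumed tracking prefixes, limited to those seen in the test URLs above.
_TRACKING_PREFIX_RE = re.compile(r'''(?x)
    (?:
        (?:chtbl\.com/track|play\.podtrac\.com)/[^/]+|        # host plus one ID segment
        (?:dts|www)\.podtrac\.com/(?:pts/)?redirect\.[0-9a-z]{3,4}
    )/''')


def clean_podcast_url_sketch(url):
    """Remove every known tracking prefix segment from the URL."""
    return _TRACKING_PREFIX_RE.sub('', url)
```

Because `re.sub` replaces all non-overlapping matches, chained prefixes such as `www.podtrac.com/pts/redirect.mp3/chtbl.com/track/5899E/` are removed in a single call.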
|
@ -1,275 +0,0 @@
|
|||||||
#!/usr/bin/env python
|
|
||||||
# coding: utf-8
|
|
||||||
from __future__ import unicode_literals
|
|
||||||
|
|
||||||
# Allow direct execution
|
|
||||||
import os
|
|
||||||
import sys
|
|
||||||
import unittest
|
|
||||||
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
|
||||||
|
|
||||||
from test.helper import expect_value
|
|
||||||
from youtube_dl.extractor import YoutubeIE
|
|
||||||
|
|
||||||
|
|
||||||
class TestYoutubeChapters(unittest.TestCase):
|
|
||||||
|
|
||||||
_TEST_CASES = [
|
|
||||||
(
|
|
||||||
# https://www.youtube.com/watch?v=A22oy8dFjqc
|
|
||||||
# pattern: 00:00 - <title>
|
|
||||||
'''This is the absolute ULTIMATE experience of Queen's set at LIVE AID, this is the best video mixed to the absolutely superior stereo radio broadcast. This vastly superior audio mix takes a huge dump on all of the official mixes. Best viewed in 1080p. ENJOY! ***MAKE SURE TO READ THE DESCRIPTION***<br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+36);return false;">00:36</a> - Bohemian Rhapsody<br /><a href="#" onclick="yt.www.watch.player.seekTo(02*60+42);return false;">02:42</a> - Radio Ga Ga<br /><a href="#" onclick="yt.www.watch.player.seekTo(06*60+53);return false;">06:53</a> - Ay Oh!<br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+34);return false;">07:34</a> - Hammer To Fall<br /><a href="#" onclick="yt.www.watch.player.seekTo(12*60+08);return false;">12:08</a> - Crazy Little Thing Called Love<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+03);return false;">16:03</a> - We Will Rock You<br /><a href="#" onclick="yt.www.watch.player.seekTo(17*60+18);return false;">17:18</a> - We Are The Champions<br /><a href="#" onclick="yt.www.watch.player.seekTo(21*60+12);return false;">21:12</a> - Is This The World We Created...?<br /><br />Short song analysis:<br /><br />- "Bohemian Rhapsody": Although it's a short medley version, it's one of the best performances of the ballad section, with Freddie nailing the Bb4s with the correct studio phrasing (for the first time ever!).<br /><br />- "Radio Ga Ga": Although it's missing one chorus, this is one of - if not the best - the best versions ever, Freddie nails all the Bb4s and sounds very clean! Spike Edney's Roland Jupiter 8 also really shines through on this mix, compared to the DVD releases!<br /><br />- "Audience Improv": A great improv, Freddie sounds strong and confident. You gotta love when he sustains that A4 for 4 seconds!<br /><br />- "Hammer To Fall": Despite missing a verse and a chorus, it's a strong version (possibly the best ever). 
Freddie sings the song amazingly, and even ad-libs a C#5 and a C5! Also notice how heavy Brian's guitar sounds compared to the thin DVD mixes - it roars!<br /><br />- "Crazy Little Thing Called Love": A great version, the crowd loves the song, the jam is great as well! Only downside to this is the slight feedback issues.<br /><br />- "We Will Rock You": Although cut down to the 1st verse and chorus, Freddie sounds strong. He nails the A4, and the solo from Dr. May is brilliant!<br /><br />- "We Are the Champions": Perhaps the high-light of the performance - Freddie is very daring on this version, he sustains the pre-chorus Bb4s, nails the 1st C5, belts great A4s, but most importantly: He nails the chorus Bb4s, in all 3 choruses! This is the only time he has ever done so! It has to be said though, the last one sounds a bit rough, but that's a side effect of belting high notes for the past 18 minutes, with nodules AND laryngitis!<br /><br />- "Is This The World We Created... ?": Freddie and Brian perform a beautiful version of this, and it is one of the best versions ever. It's both sad and hilarious that a couple of BBC engineers are talking over the song, one of them being completely oblivious of the fact that he is interrupting the performance, on live television... Which was being televised to almost 2 billion homes.<br /><br /><br />All rights go to their respective owners!<br />-----Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for fair use for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use''',
|
|
||||||
1477,
|
|
||||||
[{
|
|
||||||
'start_time': 36,
|
|
||||||
'end_time': 162,
|
|
||||||
'title': 'Bohemian Rhapsody',
|
|
||||||
}, {
|
|
||||||
'start_time': 162,
|
|
||||||
'end_time': 413,
|
|
||||||
'title': 'Radio Ga Ga',
|
|
||||||
}, {
|
|
||||||
'start_time': 413,
|
|
||||||
'end_time': 454,
|
|
||||||
'title': 'Ay Oh!',
|
|
||||||
}, {
|
|
||||||
'start_time': 454,
|
|
||||||
'end_time': 728,
|
|
||||||
'title': 'Hammer To Fall',
|
|
||||||
}, {
|
|
||||||
'start_time': 728,
|
|
||||||
'end_time': 963,
|
|
||||||
'title': 'Crazy Little Thing Called Love',
|
|
||||||
}, {
|
|
||||||
'start_time': 963,
|
|
||||||
'end_time': 1038,
|
|
||||||
'title': 'We Will Rock You',
|
|
||||||
}, {
|
|
||||||
'start_time': 1038,
|
|
||||||
'end_time': 1272,
|
|
||||||
'title': 'We Are The Champions',
|
|
||||||
}, {
|
|
||||||
'start_time': 1272,
|
|
||||||
'end_time': 1477,
|
|
||||||
'title': 'Is This The World We Created...?',
|
|
||||||
}]
|
|
||||||
),
|
|
||||||
(
|
|
||||||
# https://www.youtube.com/watch?v=ekYlRhALiRQ
|
|
||||||
# pattern: <num>. <title> 0:00
|
|
||||||
'1. Those Beaten Paths of Confusion <a href="#" onclick="yt.www.watch.player.seekTo(0*60+00);return false;">0:00</a><br />2. Beyond the Shadows of Emptiness & Nothingness <a href="#" onclick="yt.www.watch.player.seekTo(11*60+47);return false;">11:47</a><br />3. Poison Yourself...With Thought <a href="#" onclick="yt.www.watch.player.seekTo(26*60+30);return false;">26:30</a><br />4. The Agents of Transformation <a href="#" onclick="yt.www.watch.player.seekTo(35*60+57);return false;">35:57</a><br />5. Drowning in the Pain of Consciousness <a href="#" onclick="yt.www.watch.player.seekTo(44*60+32);return false;">44:32</a><br />6. Deny the Disease of Life <a href="#" onclick="yt.www.watch.player.seekTo(53*60+07);return false;">53:07</a><br /><br />More info/Buy: http://crepusculonegro.storenvy.com/products/257645-cn-03-arizmenda-within-the-vacuum-of-infinity<br /><br />No copyright is intended. The rights to this video are assumed by the owner and its affiliates.',
|
|
||||||
4009,
|
|
||||||
[{
|
|
||||||
'start_time': 0,
|
|
||||||
'end_time': 707,
|
|
||||||
'title': '1. Those Beaten Paths of Confusion',
|
|
||||||
}, {
|
|
||||||
'start_time': 707,
|
|
||||||
'end_time': 1590,
|
|
||||||
'title': '2. Beyond the Shadows of Emptiness & Nothingness',
|
|
||||||
}, {
|
|
||||||
'start_time': 1590,
|
|
||||||
'end_time': 2157,
|
|
||||||
'title': '3. Poison Yourself...With Thought',
|
|
||||||
}, {
|
|
||||||
'start_time': 2157,
|
|
||||||
'end_time': 2672,
|
|
||||||
'title': '4. The Agents of Transformation',
|
|
||||||
}, {
|
|
||||||
'start_time': 2672,
|
|
||||||
'end_time': 3187,
|
|
||||||
'title': '5. Drowning in the Pain of Consciousness',
|
|
||||||
}, {
|
|
||||||
'start_time': 3187,
|
|
||||||
'end_time': 4009,
|
|
||||||
'title': '6. Deny the Disease of Life',
|
|
||||||
}]
|
|
||||||
),
|
|
||||||
(
|
|
||||||
# https://www.youtube.com/watch?v=WjL4pSzog9w
|
|
||||||
# pattern: 00:00 <title>
|
|
||||||
'<a href="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" class="yt-uix-servicelink " data-target-new-window="True" data-servicelink="CDAQ6TgYACITCNf1raqT2dMCFdRjGAod_o0CBSj4HQ" data-url="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" rel="nofollow noopener" target="_blank">https://arizmenda.bandcamp.com/merch/...</a><br /><br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+00);return false;">00:00</a> Christening Unborn Deformities <br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+08);return false;">07:08</a> Taste of Purity<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+16);return false;">16:16</a> Sculpting Sins of a Universal Tongue<br /><a href="#" onclick="yt.www.watch.player.seekTo(24*60+45);return false;">24:45</a> Birth<br /><a href="#" onclick="yt.www.watch.player.seekTo(31*60+24);return false;">31:24</a> Neves<br /><a href="#" onclick="yt.www.watch.player.seekTo(37*60+55);return false;">37:55</a> Libations in Limbo',
|
|
||||||
2705,
|
|
||||||
[{
|
|
||||||
'start_time': 0,
|
|
||||||
'end_time': 428,
|
|
||||||
'title': 'Christening Unborn Deformities',
|
|
||||||
}, {
|
|
||||||
'start_time': 428,
|
|
||||||
'end_time': 976,
|
|
||||||
'title': 'Taste of Purity',
|
|
||||||
}, {
|
|
||||||
'start_time': 976,
|
|
||||||
'end_time': 1485,
|
|
||||||
'title': 'Sculpting Sins of a Universal Tongue',
|
|
||||||
}, {
|
|
||||||
'start_time': 1485,
|
|
||||||
'end_time': 1884,
|
|
||||||
'title': 'Birth',
|
|
||||||
}, {
|
|
||||||
'start_time': 1884,
|
|
||||||
'end_time': 2275,
|
|
||||||
'title': 'Neves',
|
|
||||||
}, {
|
|
||||||
'start_time': 2275,
|
|
||||||
'end_time': 2705,
|
|
||||||
'title': 'Libations in Limbo',
|
|
||||||
}]
|
|
||||||
),
|
|
||||||
(
|
|
||||||
# https://www.youtube.com/watch?v=o3r1sn-t3is
|
|
||||||
# pattern: <title> 00:00 <note>
|
|
||||||
'Download this show in MP3: <a href="http://sh.st/njZKK" class="yt-uix-servicelink " data-url="http://sh.st/njZKK" data-target-new-window="True" data-servicelink="CDAQ6TgYACITCK3j8_6o2dMCFVDCGAoduVAKKij4HQ" rel="nofollow noopener" target="_blank">http://sh.st/njZKK</a><br /><br />Setlist:<br />I-E-A-I-A-I-O <a href="#" onclick="yt.www.watch.player.seekTo(00*60+45);return false;">00:45</a><br />Suite-Pee <a href="#" onclick="yt.www.watch.player.seekTo(4*60+26);return false;">4:26</a> (Incomplete)<br />Attack <a href="#" onclick="yt.www.watch.player.seekTo(5*60+31);return false;">5:31</a> (First live performance since 2011)<br />Prison Song <a href="#" onclick="yt.www.watch.player.seekTo(8*60+42);return false;">8:42</a><br />Know <a href="#" onclick="yt.www.watch.player.seekTo(12*60+32);return false;">12:32</a> (First live performance since 2011)<br />Aerials <a href="#" onclick="yt.www.watch.player.seekTo(15*60+32);return false;">15:32</a><br />Soldier Side - Intro <a href="#" onclick="yt.www.watch.player.seekTo(19*60+13);return false;">19:13</a><br />B.Y.O.B. 
<a href="#" onclick="yt.www.watch.player.seekTo(20*60+09);return false;">20:09</a><br />Soil <a href="#" onclick="yt.www.watch.player.seekTo(24*60+32);return false;">24:32</a><br />Darts <a href="#" onclick="yt.www.watch.player.seekTo(27*60+48);return false;">27:48</a><br />Radio/Video <a href="#" onclick="yt.www.watch.player.seekTo(30*60+38);return false;">30:38</a><br />Hypnotize <a href="#" onclick="yt.www.watch.player.seekTo(35*60+05);return false;">35:05</a><br />Temper <a href="#" onclick="yt.www.watch.player.seekTo(38*60+08);return false;">38:08</a> (First live performance since 1999)<br />CUBErt <a href="#" onclick="yt.www.watch.player.seekTo(41*60+00);return false;">41:00</a><br />Needles <a href="#" onclick="yt.www.watch.player.seekTo(42*60+57);return false;">42:57</a><br />Deer Dance <a href="#" onclick="yt.www.watch.player.seekTo(46*60+27);return false;">46:27</a><br />Bounce <a href="#" onclick="yt.www.watch.player.seekTo(49*60+38);return false;">49:38</a><br />Suggestions <a href="#" onclick="yt.www.watch.player.seekTo(51*60+25);return false;">51:25</a><br />Psycho <a href="#" onclick="yt.www.watch.player.seekTo(53*60+52);return false;">53:52</a><br />Chop Suey! <a href="#" onclick="yt.www.watch.player.seekTo(58*60+13);return false;">58:13</a><br />Lonely Day <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+01*60+15);return false;">1:01:15</a><br />Question! 
<a href="#" onclick="yt.www.watch.player.seekTo(1*3600+04*60+14);return false;">1:04:14</a><br />Lost in Hollywood <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+08*60+10);return false;">1:08:10</a><br />Vicinity of Obscenity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+13*60+40);return false;">1:13:40</a>(First live performance since 2012)<br />Forest <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+16*60+17);return false;">1:16:17</a><br />Cigaro <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+20*60+02);return false;">1:20:02</a><br />Toxicity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+23*60+57);return false;">1:23:57</a>(with Chino Moreno)<br />Sugar <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+27*60+53);return false;">1:27:53</a>',
|
|
||||||
5640,
|
|
||||||
[{
|
|
||||||
'start_time': 45,
|
|
||||||
'end_time': 266,
|
|
||||||
'title': 'I-E-A-I-A-I-O',
|
|
||||||
}, {
|
|
||||||
'start_time': 266,
|
|
||||||
'end_time': 331,
|
|
||||||
'title': 'Suite-Pee (Incomplete)',
|
|
||||||
}, {
|
|
||||||
'start_time': 331,
|
|
||||||
'end_time': 522,
|
|
||||||
'title': 'Attack (First live performance since 2011)',
|
|
||||||
}, {
|
|
||||||
'start_time': 522,
|
|
||||||
'end_time': 752,
|
|
||||||
'title': 'Prison Song',
|
|
||||||
}, {
|
|
||||||
'start_time': 752,
|
|
||||||
'end_time': 932,
|
|
||||||
'title': 'Know (First live performance since 2011)',
|
|
||||||
}, {
|
|
||||||
'start_time': 932,
|
|
||||||
'end_time': 1153,
|
|
||||||
'title': 'Aerials',
|
|
||||||
}, {
|
|
||||||
'start_time': 1153,
|
|
||||||
'end_time': 1209,
|
|
||||||
'title': 'Soldier Side - Intro',
|
|
||||||
}, {
|
|
||||||
'start_time': 1209,
|
|
||||||
'end_time': 1472,
|
|
||||||
'title': 'B.Y.O.B.',
|
|
||||||
}, {
|
|
||||||
'start_time': 1472,
|
|
||||||
'end_time': 1668,
|
|
||||||
'title': 'Soil',
|
|
||||||
}, {
|
|
||||||
'start_time': 1668,
|
|
||||||
'end_time': 1838,
|
|
||||||
'title': 'Darts',
|
|
||||||
}, {
|
|
||||||
'start_time': 1838,
|
|
||||||
'end_time': 2105,
|
|
||||||
'title': 'Radio/Video',
|
|
||||||
}, {
|
|
||||||
'start_time': 2105,
|
|
||||||
'end_time': 2288,
|
|
||||||
'title': 'Hypnotize',
|
|
||||||
}, {
|
|
||||||
'start_time': 2288,
|
|
||||||
'end_time': 2460,
|
|
||||||
'title': 'Temper (First live performance since 1999)',
|
|
||||||
}, {
|
|
||||||
'start_time': 2460,
|
|
||||||
'end_time': 2577,
|
|
||||||
'title': 'CUBErt',
|
|
||||||
}, {
|
|
||||||
'start_time': 2577,
|
|
||||||
'end_time': 2787,
|
|
||||||
'title': 'Needles',
|
|
||||||
}, {
|
|
||||||
'start_time': 2787,
|
|
||||||
'end_time': 2978,
|
|
||||||
'title': 'Deer Dance',
|
|
||||||
}, {
|
|
||||||
'start_time': 2978,
|
|
||||||
'end_time': 3085,
|
|
||||||
'title': 'Bounce',
|
|
||||||
}, {
|
|
||||||
'start_time': 3085,
|
|
||||||
'end_time': 3232,
|
|
||||||
'title': 'Suggestions',
|
|
||||||
}, {
|
|
||||||
'start_time': 3232,
|
|
||||||
'end_time': 3493,
|
|
||||||
'title': 'Psycho',
|
|
||||||
}, {
|
|
||||||
'start_time': 3493,
|
|
||||||
'end_time': 3675,
|
|
||||||
'title': 'Chop Suey!',
|
|
||||||
}, {
|
|
||||||
'start_time': 3675,
|
|
||||||
'end_time': 3854,
|
|
||||||
'title': 'Lonely Day',
|
|
||||||
}, {
|
|
||||||
'start_time': 3854,
|
|
||||||
'end_time': 4090,
|
|
||||||
'title': 'Question!',
|
|
||||||
}, {
|
|
||||||
'start_time': 4090,
|
|
||||||
'end_time': 4420,
|
|
||||||
'title': 'Lost in Hollywood',
|
|
||||||
}, {
|
|
||||||
'start_time': 4420,
|
|
||||||
'end_time': 4577,
|
|
||||||
'title': 'Vicinity of Obscenity (First live performance since 2012)',
|
|
||||||
}, {
|
|
||||||
'start_time': 4577,
|
|
||||||
'end_time': 4802,
|
|
||||||
'title': 'Forest',
|
|
||||||
}, {
|
|
||||||
'start_time': 4802,
|
|
||||||
'end_time': 5037,
|
|
||||||
'title': 'Cigaro',
|
|
||||||
}, {
|
|
||||||
'start_time': 5037,
|
|
||||||
'end_time': 5273,
|
|
||||||
'title': 'Toxicity (with Chino Moreno)',
|
|
||||||
}, {
|
|
||||||
'start_time': 5273,
|
|
||||||
'end_time': 5640,
|
|
||||||
'title': 'Sugar',
|
|
||||||
}]
|
|
||||||
),
|
|
||||||
(
|
|
||||||
# https://www.youtube.com/watch?v=PkYLQbsqCE8
|
|
||||||
# pattern: <num> - <title> [<latinized title>] 0:00:00
|
|
||||||
'''Затемно (Zatemno) is an Obscure Black Metal Band from Russia.<br /><br />"Во прах (Vo prakh)'' Into The Ashes", Debut mini-album released may 6, 2016, by Death Knell Productions<br />Released on 6 panel digipak CD, limited to 100 copies only<br />And digital format on Bandcamp<br /><br />Tracklist<br /><br />1 - Во прах [Vo prakh] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+00*60+00);return false;">0:00:00</a><br />2 - Искупление [Iskupleniye] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+08*60+10);return false;">0:08:10</a><br />3 - Из серпов луны...[Iz serpov luny] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+14*60+30);return false;">0:14:30</a><br /><br />Links:<br /><a href="https://deathknellprod.bandcamp.com/album/--2" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://deathknellprod.bandcamp.com/album/--2" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://deathknellprod.bandcamp.com/a...</a><br /><a href="https://www.facebook.com/DeathKnellProd/" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://www.facebook.com/DeathKnellProd/" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://www.facebook.com/DeathKnellProd/</a><br /><br /><br />I don't have any right about this artifact, my only intention is to spread the music of the band, all rights are reserved to the Затемно (Zatemno) and his producers, Death Knell Productions.<br /><br />------------------------------------------------------------------<br /><br />Subscribe for more videos like this.<br />My link: <a href="https://web.facebook.com/AttackOfTheDragons" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://web.facebook.com/AttackOfTheDragons" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow 
noopener">https://web.facebook.com/AttackOfTheD...</a>''',
|
|
||||||
1138,
|
|
||||||
[{
|
|
||||||
'start_time': 0,
|
|
||||||
'end_time': 490,
|
|
||||||
'title': '1 - Во прах [Vo prakh]',
|
|
||||||
}, {
|
|
||||||
'start_time': 490,
|
|
||||||
'end_time': 870,
|
|
||||||
'title': '2 - Искупление [Iskupleniye]',
|
|
||||||
}, {
|
|
||||||
'start_time': 870,
|
|
||||||
'end_time': 1138,
|
|
||||||
'title': '3 - Из серпов луны...[Iz serpov luny]',
|
|
||||||
}]
|
|
||||||
),
|
|
||||||
(
|
|
||||||
# https://www.youtube.com/watch?v=xZW70zEasOk
|
|
||||||
# time point more than duration
|
|
||||||
'''● LCS Spring finals: Saturday and Sunday from <a href="#" onclick="yt.www.watch.player.seekTo(13*60+30);return false;">13:30</a> outside the venue! <br />● PAX East: Fri, Sat & Sun - more info in tomorrows video on the main channel!''',
|
|
||||||
283,
|
|
||||||
[]
|
|
||||||
),
|
|
||||||
]
|
|
||||||
|
|
||||||
def test_youtube_chapters(self):
|
|
||||||
for description, duration, expected_chapters in self._TEST_CASES:
|
|
||||||
ie = YoutubeIE()
|
|
||||||
expect_value(
|
|
||||||
self, ie._extract_chapters_from_description(description, duration),
|
|
||||||
expected_chapters, None)
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
|
||||||
unittest.main()
|
|
@ -19,55 +19,46 @@ from youtube_dl.compat import compat_str, compat_urlretrieve
|
|||||||
_TESTS = [
|
_TESTS = [
|
||||||
(
|
(
|
||||||
'https://s.ytimg.com/yts/jsbin/html5player-vflHOr_nV.js',
|
'https://s.ytimg.com/yts/jsbin/html5player-vflHOr_nV.js',
|
||||||
'js',
|
|
||||||
86,
|
86,
|
||||||
'>=<;:/.-[+*)(\'&%$#"!ZYX0VUTSRQPONMLKJIHGFEDCBA\\yxwvutsrqponmlkjihgfedcba987654321',
|
'>=<;:/.-[+*)(\'&%$#"!ZYX0VUTSRQPONMLKJIHGFEDCBA\\yxwvutsrqponmlkjihgfedcba987654321',
|
||||||
),
|
),
|
||||||
(
|
(
|
||||||
'https://s.ytimg.com/yts/jsbin/html5player-vfldJ8xgI.js',
|
'https://s.ytimg.com/yts/jsbin/html5player-vfldJ8xgI.js',
|
||||||
'js',
|
|
||||||
85,
|
85,
|
||||||
'3456789a0cdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRS[UVWXYZ!"#$%&\'()*+,-./:;<=>?@',
|
'3456789a0cdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRS[UVWXYZ!"#$%&\'()*+,-./:;<=>?@',
|
||||||
),
|
),
|
||||||
(
|
(
|
||||||
'https://s.ytimg.com/yts/jsbin/html5player-vfle-mVwz.js',
|
'https://s.ytimg.com/yts/jsbin/html5player-vfle-mVwz.js',
|
||||||
'js',
|
|
||||||
90,
|
90,
|
||||||
']\\[@?>=<;:/.-,+*)(\'&%$#"hZYXWVUTSRQPONMLKJIHGFEDCBAzyxwvutsrqponmlkjiagfedcb39876',
|
']\\[@?>=<;:/.-,+*)(\'&%$#"hZYXWVUTSRQPONMLKJIHGFEDCBAzyxwvutsrqponmlkjiagfedcb39876',
|
||||||
),
|
),
|
||||||
(
|
(
|
||||||
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vfl0Cbn9e.js',
|
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vfl0Cbn9e.js',
|
||||||
'js',
|
|
||||||
84,
|
84,
|
||||||
'O1I3456789abcde0ghijklmnopqrstuvwxyzABCDEFGHfJKLMN2PQRSTUVW@YZ!"#$%&\'()*+,-./:;<=',
|
'O1I3456789abcde0ghijklmnopqrstuvwxyzABCDEFGHfJKLMN2PQRSTUVW@YZ!"#$%&\'()*+,-./:;<=',
|
||||||
),
|
),
|
||||||
(
|
(
|
||||||
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js',
|
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js',
|
||||||
'js',
|
|
||||||
'2ACFC7A61CA478CD21425E5A57EBD73DDC78E22A.2094302436B2D377D14A3BBA23022D023B8BC25AA',
|
'2ACFC7A61CA478CD21425E5A57EBD73DDC78E22A.2094302436B2D377D14A3BBA23022D023B8BC25AA',
|
||||||
'A52CB8B320D22032ABB3A41D773D2B6342034902.A22E87CDD37DBE75A5E52412DC874AC16A7CFCA2',
|
'A52CB8B320D22032ABB3A41D773D2B6342034902.A22E87CDD37DBE75A5E52412DC874AC16A7CFCA2',
|
||||||
),
|
),
|
||||||
(
|
(
|
||||||
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflBb0OQx.js',
|
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflBb0OQx.js',
|
||||||
'js',
|
|
||||||
84,
|
84,
|
||||||
'123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQ0STUVWXYZ!"#$%&\'()*+,@./:;<=>'
|
'123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQ0STUVWXYZ!"#$%&\'()*+,@./:;<=>'
|
||||||
),
|
),
|
||||||
(
|
(
|
||||||
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vfl9FYC6l.js',
|
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vfl9FYC6l.js',
|
||||||
'js',
|
|
||||||
83,
|
83,
|
||||||
'123456789abcdefghijklmnopqr0tuvwxyzABCDETGHIJKLMNOPQRS>UVWXYZ!"#$%&\'()*+,-./:;<=F'
|
'123456789abcdefghijklmnopqr0tuvwxyzABCDETGHIJKLMNOPQRS>UVWXYZ!"#$%&\'()*+,-./:;<=F'
|
||||||
),
|
),
|
||||||
(
|
(
|
||||||
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflCGk6yw/html5player.js',
|
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflCGk6yw/html5player.js',
|
||||||
'js',
|
|
||||||
'4646B5181C6C3020DF1D9C7FCFEA.AD80ABF70C39BD369CCCAE780AFBB98FA6B6CB42766249D9488C288',
|
'4646B5181C6C3020DF1D9C7FCFEA.AD80ABF70C39BD369CCCAE780AFBB98FA6B6CB42766249D9488C288',
|
||||||
'82C8849D94266724DC6B6AF89BBFA087EACCD963.B93C07FBA084ACAEFCF7C9D1FD0203C6C1815B6B'
|
'82C8849D94266724DC6B6AF89BBFA087EACCD963.B93C07FBA084ACAEFCF7C9D1FD0203C6C1815B6B'
|
||||||
),
|
),
|
||||||
(
|
(
|
||||||
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js',
|
'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js',
|
||||||
'js',
|
|
||||||
'312AA52209E3623129A412D56A40F11CB0AF14AE.3EE09501CB14E3BCDC3B2AE808BF3F1D14E7FBF12',
|
'312AA52209E3623129A412D56A40F11CB0AF14AE.3EE09501CB14E3BCDC3B2AE808BF3F1D14E7FBF12',
|
||||||
'112AA5220913623229A412D56A40F11CB0AF14AE.3EE0950FCB14EEBCDC3B2AE808BF331D14E7FBF3',
|
'112AA5220913623229A412D56A40F11CB0AF14AE.3EE0950FCB14EEBCDC3B2AE808BF331D14E7FBF3',
|
||||||
)
|
)
|
||||||
@ -78,6 +69,10 @@ class TestPlayerInfo(unittest.TestCase):
|
|||||||
def test_youtube_extract_player_info(self):
|
def test_youtube_extract_player_info(self):
|
||||||
PLAYER_URLS = (
|
PLAYER_URLS = (
|
||||||
('https://www.youtube.com/s/player/64dddad9/player_ias.vflset/en_US/base.js', '64dddad9'),
|
('https://www.youtube.com/s/player/64dddad9/player_ias.vflset/en_US/base.js', '64dddad9'),
|
||||||
|
('https://www.youtube.com/s/player/64dddad9/player_ias.vflset/fr_FR/base.js', '64dddad9'),
|
||||||
|
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-en_US.vflset/base.js', '64dddad9'),
|
||||||
|
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-de_DE.vflset/base.js', '64dddad9'),
|
||||||
|
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-tablet-en_US.vflset/base.js', '64dddad9'),
|
||||||
# obsolete
|
# obsolete
|
||||||
('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'),
|
('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'),
|
||||||
('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'),
|
('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'),
|
||||||
@ -86,13 +81,9 @@ class TestPlayerInfo(unittest.TestCase):
|
|||||||
('https://www.youtube.com/yts/jsbin/player-en_US-vflaxXRn1/base.js', 'vflaxXRn1'),
|
('https://www.youtube.com/yts/jsbin/player-en_US-vflaxXRn1/base.js', 'vflaxXRn1'),
|
||||||
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'),
|
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'),
|
||||||
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'),
|
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'),
|
||||||
('http://s.ytimg.com/yt/swfbin/watch_as3-vflrEm9Nq.swf', 'vflrEm9Nq'),
|
|
||||||
('https://s.ytimg.com/yts/swfbin/player-vflenCdZL/watch_as3.swf', 'vflenCdZL'),
|
|
||||||
)
|
)
|
||||||
for player_url, expected_player_id in PLAYER_URLS:
|
for player_url, expected_player_id in PLAYER_URLS:
|
||||||
expected_player_type = player_url.split('.')[-1]
|
player_id = YoutubeIE._extract_player_info(player_url)
|
||||||
player_type, player_id = YoutubeIE._extract_player_info(player_url)
|
|
||||||
self.assertEqual(player_type, expected_player_type)
|
|
||||||
self.assertEqual(player_id, expected_player_id)
|
self.assertEqual(player_id, expected_player_id)
|
||||||
|
|
||||||
|
|
||||||
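After this hunk `_extract_player_info` returns only a player ID, and the two `.swf` URLs are dropped from `PLAYER_URLS`. The sketch below illustrates how an ID could be pulled from the remaining URL shapes. The helper name `extract_player_id_sketch` and both patterns are assumptions chosen to match the test data above, not a copy of the extractor's own regular expressions.

```python
import re

# Assumed patterns, tried in order:
#   1. current layout: /s/player/<id>/player...
#   2. obsolete layout: a "vfl..." token somewhere before a trailing ".js"
_PLAYER_ID_PATTERNS = (
    r'/s/player/(?P<id>[a-zA-Z0-9_-]{8,})/player',
    r'\b(?P<id>vfl[a-zA-Z0-9_-]+)\b.*?\.js$',
)


def extract_player_id_sketch(player_url):
    """Return the player ID embedded in a JavaScript player URL."""
    for pattern in _PLAYER_ID_PATTERNS:
        m = re.search(pattern, player_url)
        if m:
            return m.group('id')
    raise ValueError('Cannot identify player %r' % player_url)
```

Since the second pattern requires a `.js` suffix, a Flash player URL raises `ValueError` in this sketch, which mirrors the removal of the SWF cases from the test.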
@ -104,13 +95,13 @@ class TestSignature(unittest.TestCase):
|
|||||||
os.mkdir(self.TESTDATA_DIR)
|
os.mkdir(self.TESTDATA_DIR)
|
||||||
|
|
||||||
|
|
||||||
def make_tfunc(url, stype, sig_input, expected_sig):
|
def make_tfunc(url, sig_input, expected_sig):
|
||||||
m = re.match(r'.*-([a-zA-Z0-9_-]+)(?:/watch_as3|/html5player)?\.[a-z]+$', url)
|
m = re.match(r'.*-([a-zA-Z0-9_-]+)(?:/watch_as3|/html5player)?\.[a-z]+$', url)
|
||||||
assert m, '%r should follow URL format' % url
|
assert m, '%r should follow URL format' % url
|
||||||
test_id = m.group(1)
|
test_id = m.group(1)
|
||||||
|
|
||||||
def test_func(self):
|
def test_func(self):
|
||||||
basename = 'player-%s.%s' % (test_id, stype)
|
basename = 'player-%s.js' % test_id
|
||||||
fn = os.path.join(self.TESTDATA_DIR, basename)
|
fn = os.path.join(self.TESTDATA_DIR, basename)
|
||||||
|
|
||||||
if not os.path.exists(fn):
|
if not os.path.exists(fn):
|
||||||
@ -118,22 +109,16 @@ def make_tfunc(url, stype, sig_input, expected_sig):
|
|||||||
|
|
||||||
ydl = FakeYDL()
|
ydl = FakeYDL()
|
||||||
ie = YoutubeIE(ydl)
|
ie = YoutubeIE(ydl)
|
||||||
if stype == 'js':
|
with io.open(fn, encoding='utf-8') as testf:
|
||||||
with io.open(fn, encoding='utf-8') as testf:
|
jscode = testf.read()
|
||||||
jscode = testf.read()
|
func = ie._parse_sig_js(jscode)
|
||||||
func = ie._parse_sig_js(jscode)
|
|
||||||
else:
|
|
||||||
assert stype == 'swf'
|
|
||||||
with open(fn, 'rb') as testf:
|
|
||||||
swfcode = testf.read()
|
|
||||||
func = ie._parse_sig_swf(swfcode)
|
|
||||||
src_sig = (
|
src_sig = (
|
||||||
compat_str(string.printable[:sig_input])
|
compat_str(string.printable[:sig_input])
|
||||||
if isinstance(sig_input, int) else sig_input)
|
if isinstance(sig_input, int) else sig_input)
|
||||||
got_sig = func(src_sig)
|
got_sig = func(src_sig)
|
||||||
self.assertEqual(got_sig, expected_sig)
|
self.assertEqual(got_sig, expected_sig)
|
||||||
|
|
||||||
test_func.__name__ = str('test_signature_' + stype + '_' + test_id)
|
test_func.__name__ = str('test_signature_js_' + test_id)
|
||||||
setattr(TestSignature, test_func.__name__, test_func)
|
setattr(TestSignature, test_func.__name__, test_func)
|
||||||
|
|
||||||
|
|
||||||
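`make_tfunc` above builds one test method per player URL and attaches it to `TestSignature` with `setattr`. The following self-contained sketch shows the same pattern in isolation. `TestReverse`, the test cases, and the use of string reversal in place of the parsed signature function are all hypothetical stand-ins.

```python
import unittest


class TestReverse(unittest.TestCase):
    """Empty container; test methods are attached dynamically below."""


def make_tfunc(test_id, sig_input, expected_sig):
    def test_func(self):
        # Stand-in for func = ie._parse_sig_js(jscode); got_sig = func(src_sig)
        got_sig = sig_input[::-1]
        self.assertEqual(got_sig, expected_sig)

    # A unique name per case lets unittest discover each one separately.
    test_func.__name__ = str('test_signature_js_' + test_id)
    setattr(TestReverse, test_func.__name__, test_func)


for case in (
    ('first', 'abc', 'cba'),
    ('second', '12345', '54321'),
):
    make_tfunc(*case)
```

Generating the methods at import time means a failing case is reported under its own name instead of aborting a single loop-based test.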
|
@ -163,6 +163,7 @@ class YoutubeDL(object):
|
|||||||
simulate: Do not download the video files.
|
simulate: Do not download the video files.
|
||||||
format: Video format code. See options.py for more information.
|
format: Video format code. See options.py for more information.
|
||||||
outtmpl: Template for output names.
|
outtmpl: Template for output names.
|
||||||
|
outtmpl_na_placeholder: Placeholder for unavailable meta fields.
|
||||||
restrictfilenames: Do not allow "&" and spaces in file names
|
restrictfilenames: Do not allow "&" and spaces in file names
|
||||||
ignoreerrors: Do not stop on download errors.
|
ignoreerrors: Do not stop on download errors.
|
||||||
force_generic_extractor: Force downloader to use the generic extractor
|
force_generic_extractor: Force downloader to use the generic extractor
|
||||||
@ -338,6 +339,8 @@ class YoutubeDL(object):
|
|||||||
_pps = []
|
_pps = []
|
||||||
_download_retcode = None
|
_download_retcode = None
|
||||||
_num_downloads = None
|
_num_downloads = None
|
||||||
|
_playlist_level = 0
|
||||||
|
_playlist_urls = set()
|
||||||
_screen_file = None
|
_screen_file = None
|
||||||
|
|
||||||
def __init__(self, params=None, auto_init=True):
|
def __init__(self, params=None, auto_init=True):
|
||||||
@ -656,7 +659,7 @@ class YoutubeDL(object):
|
|||||||
template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
|
template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
|
||||||
for k, v in template_dict.items()
|
for k, v in template_dict.items()
|
||||||
if v is not None and not isinstance(v, (list, tuple, dict)))
|
if v is not None and not isinstance(v, (list, tuple, dict)))
|
||||||
template_dict = collections.defaultdict(lambda: 'NA', template_dict)
|
template_dict = collections.defaultdict(lambda: self.params.get('outtmpl_na_placeholder', 'NA'), template_dict)
|
||||||
|
|
||||||
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
|
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
|
||||||
|
|
||||||
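The hunk above replaces the hard-coded `'NA'` default with `self.params.get('outtmpl_na_placeholder', 'NA')`. The sketch below reproduces that behaviour outside the class to show its effect on `%`-style output templates. The helper name `build_template_dict` and the sample metadata are assumptions, and the filename sanitizing step of the original is omitted.

```python
import collections


def build_template_dict(info, params):
    """Drop None and container values, then default missing keys to the placeholder."""
    filtered = dict(
        (k, v) for k, v in info.items()
        if v is not None and not isinstance(v, (list, tuple, dict)))
    return collections.defaultdict(
        lambda: params.get('outtmpl_na_placeholder', 'NA'), filtered)


info = {'title': 'clip', 'ext': 'mp4', 'uploader': None}

default_name = '%(title)s-%(uploader)s.%(ext)s' % build_template_dict(info, {})
custom_name = '%(title)s-%(uploader)s.%(ext)s' % build_template_dict(
    info, {'outtmpl_na_placeholder': 'unknown'})
```

Here `default_name` keeps the previous behaviour (`NA` for the missing uploader), while `custom_name` uses the configured placeholder.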
@ -676,8 +679,8 @@ class YoutubeDL(object):
|
|||||||
|
|
||||||
# Missing numeric fields used together with integer presentation types
|
# Missing numeric fields used together with integer presentation types
|
||||||
# in format specification will break the argument substitution since
|
# in format specification will break the argument substitution since
|
||||||
# string 'NA' is returned for missing fields. We will patch output
|
# string NA placeholder is returned for missing fields. We will patch
|
||||||
# template for missing fields to meet string presentation type.
|
# output template for missing fields to meet string presentation type.
|
||||||
for numeric_field in self._NUMERIC_FIELDS:
|
for numeric_field in self._NUMERIC_FIELDS:
|
||||||
if numeric_field not in template_dict:
|
if numeric_field not in template_dict:
|
||||||
# As of [1] format syntax is:
|
# As of [1] format syntax is:
|
||||||
@ -906,115 +909,23 @@ class YoutubeDL(object):
|
|||||||
return self.process_ie_result(
|
return self.process_ie_result(
|
||||||
new_result, download=download, extra_info=extra_info)
|
new_result, download=download, extra_info=extra_info)
|
||||||
elif result_type in ('playlist', 'multi_video'):
|
elif result_type in ('playlist', 'multi_video'):
|
||||||
# We process each entry in the playlist
|
# Protect from infinite recursion due to recursively nested playlists
|
||||||
playlist = ie_result.get('title') or ie_result.get('id')
|
# (see https://github.com/ytdl-org/youtube-dl/issues/27833)
|
||||||
self.to_screen('[download] Downloading playlist: %s' % playlist)
|
webpage_url = ie_result['webpage_url']
|
||||||
|
if webpage_url in self._playlist_urls:
|
||||||
playlist_results = []
|
|
||||||
|
|
||||||
playliststart = self.params.get('playliststart', 1) - 1
|
|
||||||
playlistend = self.params.get('playlistend')
|
|
||||||
# For backwards compatibility, interpret -1 as whole list
|
|
||||||
if playlistend == -1:
|
|
||||||
playlistend = None
|
|
||||||
|
|
||||||
playlistitems_str = self.params.get('playlist_items')
|
|
||||||
playlistitems = None
|
|
||||||
if playlistitems_str is not None:
|
|
||||||
def iter_playlistitems(format):
|
|
||||||
for string_segment in format.split(','):
|
|
||||||
if '-' in string_segment:
|
|
||||||
start, end = string_segment.split('-')
|
|
||||||
for item in range(int(start), int(end) + 1):
|
|
||||||
yield int(item)
|
|
||||||
else:
|
|
||||||
yield int(string_segment)
|
|
||||||
playlistitems = orderedSet(iter_playlistitems(playlistitems_str))
|
|
||||||
|
|
||||||
ie_entries = ie_result['entries']
|
|
||||||
|
|
||||||
def make_playlistitems_entries(list_ie_entries):
|
|
||||||
num_entries = len(list_ie_entries)
|
|
||||||
return [
|
|
||||||
list_ie_entries[i - 1] for i in playlistitems
|
|
||||||
if -num_entries <= i - 1 < num_entries]
|
|
||||||
|
|
||||||
def report_download(num_entries):
|
|
||||||
self.to_screen(
|
self.to_screen(
|
||||||
'[%s] playlist %s: Downloading %d videos' %
|
'[download] Skipping already downloaded playlist: %s'
|
||||||
(ie_result['extractor'], playlist, num_entries))
|
% (ie_result.get('title') or ie_result.get('id')))
|
||||||
|
return
|
||||||
|
|
||||||
if isinstance(ie_entries, list):
|
self._playlist_level += 1
|
||||||
n_all_entries = len(ie_entries)
|
self._playlist_urls.add(webpage_url)
|
||||||
if playlistitems:
|
try:
|
||||||
entries = make_playlistitems_entries(ie_entries)
|
return self.__process_playlist(ie_result, download)
|
||||||
else:
|
finally:
|
||||||
entries = ie_entries[playliststart:playlistend]
|
self._playlist_level -= 1
|
||||||
n_entries = len(entries)
|
if not self._playlist_level:
|
||||||
self.to_screen(
|
self._playlist_urls.clear()
|
||||||
'[%s] playlist %s: Collected %d video ids (downloading %d of them)' %
|
|
||||||
(ie_result['extractor'], playlist, n_all_entries, n_entries))
|
|
||||||
elif isinstance(ie_entries, PagedList):
|
|
||||||
if playlistitems:
|
|
||||||
entries = []
|
|
||||||
for item in playlistitems:
|
|
||||||
entries.extend(ie_entries.getslice(
|
|
||||||
item - 1, item
|
|
||||||
))
|
|
||||||
else:
|
|
||||||
entries = ie_entries.getslice(
|
|
||||||
playliststart, playlistend)
|
|
||||||
n_entries = len(entries)
|
|
||||||
report_download(n_entries)
|
|
||||||
else: # iterable
|
|
||||||
if playlistitems:
|
|
||||||
entries = make_playlistitems_entries(list(itertools.islice(
|
|
||||||
ie_entries, 0, max(playlistitems))))
|
|
||||||
else:
|
|
||||||
entries = list(itertools.islice(
|
|
||||||
ie_entries, playliststart, playlistend))
|
|
||||||
n_entries = len(entries)
|
|
||||||
report_download(n_entries)
|
|
||||||
|
|
||||||
if self.params.get('playlistreverse', False):
|
|
||||||
entries = entries[::-1]
|
|
||||||
|
|
||||||
if self.params.get('playlistrandom', False):
|
|
||||||
random.shuffle(entries)
|
|
||||||
|
|
||||||
x_forwarded_for = ie_result.get('__x_forwarded_for_ip')
|
|
||||||
|
|
||||||
for i, entry in enumerate(entries, 1):
|
|
||||||
self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
|
|
||||||
# This __x_forwarded_for_ip thing is a bit ugly but requires
|
|
||||||
# minimal changes
|
|
||||||
if x_forwarded_for:
|
|
||||||
entry['__x_forwarded_for_ip'] = x_forwarded_for
|
|
||||||
extra = {
|
|
||||||
'n_entries': n_entries,
|
|
||||||
'playlist': playlist,
|
|
||||||
'playlist_id': ie_result.get('id'),
|
|
||||||
'playlist_title': ie_result.get('title'),
|
|
||||||
'playlist_uploader': ie_result.get('uploader'),
|
|
||||||
'playlist_uploader_id': ie_result.get('uploader_id'),
|
|
||||||
'playlist_index': playlistitems[i - 1] if playlistitems else i + playliststart,
|
|
||||||
'extractor': ie_result['extractor'],
|
|
||||||
'webpage_url': ie_result['webpage_url'],
|
|
||||||
'webpage_url_basename': url_basename(ie_result['webpage_url']),
|
|
||||||
'extractor_key': ie_result['extractor_key'],
|
|
||||||
}
|
|
||||||
|
|
||||||
reason = self._match_entry(entry, incomplete=True)
|
|
||||||
if reason is not None:
|
|
||||||
self.to_screen('[download] ' + reason)
|
|
||||||
continue
|
|
||||||
|
|
||||||
entry_result = self.__process_iterable_entry(entry, download, extra)
|
|
||||||
# TODO: skip failed (empty) entries?
|
|
||||||
playlist_results.append(entry_result)
|
|
||||||
ie_result['entries'] = playlist_results
|
|
||||||
self.to_screen('[download] Finished downloading playlist: %s' % playlist)
|
|
||||||
return ie_result
|
|
||||||
elif result_type == 'compat_list':
|
elif result_type == 'compat_list':
|
||||||
self.report_warning(
|
self.report_warning(
|
||||||
'Extractor %s returned a compat_list result. '
|
'Extractor %s returned a compat_list result. '
|
||||||
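Note on the hunk above: the guard records each playlist's `webpage_url` while it is being processed and skips any playlist whose URL is already in that set, so recursively nested playlists cannot loop forever. A minimal sketch of the same idea follows. `PlaylistWalker` and its `visited` list are hypothetical stand-ins for illustration, and the state is kept per instance here, whereas the diff declares `_playlist_level` and `_playlist_urls` as class attributes.

```python
class PlaylistWalker(object):
    """Hypothetical stand-in that mimics the recursion guard in the hunk."""

    def __init__(self):
        self._playlist_level = 0
        self._playlist_urls = set()
        self.visited = []  # illustration only: order in which playlists ran

    def process(self, playlist):
        url = playlist['webpage_url']
        if url in self._playlist_urls:
            # Already being processed further up the call stack: skip it.
            return
        self._playlist_level += 1
        self._playlist_urls.add(url)
        try:
            self.visited.append(url)
            for child in playlist.get('entries', []):
                self.process(child)
        finally:
            self._playlist_level -= 1
            # Forget the seen URLs once the outermost playlist is done.
            if not self._playlist_level:
                self._playlist_urls.clear()


# Two playlists that contain each other.
a = {'webpage_url': 'https://example.com/a'}
b = {'webpage_url': 'https://example.com/b', 'entries': [a]}
a['entries'] = [b]

walker = PlaylistWalker()
walker.process(a)
print(walker.visited)
```

Clearing the set only when the nesting level returns to zero means a later, unrelated top-level call can process the same playlist again.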
@ -1039,6 +950,118 @@ class YoutubeDL(object):
|
|||||||
else:
|
else:
|
||||||
raise Exception('Invalid result type: %s' % result_type)
|
raise Exception('Invalid result type: %s' % result_type)
|
||||||
|
|
||||||
|
def __process_playlist(self, ie_result, download):
|
||||||
|
# We process each entry in the playlist
|
||||||
|
playlist = ie_result.get('title') or ie_result.get('id')
|
||||||
|
|
||||||
|
self.to_screen('[download] Downloading playlist: %s' % playlist)
|
||||||
|
|
||||||
|
playlist_results = []
|
||||||
|
|
||||||
|
playliststart = self.params.get('playliststart', 1) - 1
|
||||||
|
playlistend = self.params.get('playlistend')
|
||||||
|
# For backwards compatibility, interpret -1 as whole list
|
||||||
|
if playlistend == -1:
|
||||||
|
playlistend = None
|
||||||
|
|
||||||
|
playlistitems_str = self.params.get('playlist_items')
|
||||||
|
playlistitems = None
|
||||||
|
if playlistitems_str is not None:
|
||||||
|
def iter_playlistitems(format):
|
||||||
|
for string_segment in format.split(','):
|
||||||
|
if '-' in string_segment:
|
||||||
|
start, end = string_segment.split('-')
|
||||||
|
for item in range(int(start), int(end) + 1):
|
||||||
|
yield int(item)
|
||||||
|
else:
|
||||||
|
yield int(string_segment)
|
||||||
|
playlistitems = orderedSet(iter_playlistitems(playlistitems_str))
|
||||||
|
|
||||||
|
ie_entries = ie_result['entries']
|
||||||
|
|
||||||
|
def make_playlistitems_entries(list_ie_entries):
|
||||||
|
num_entries = len(list_ie_entries)
|
||||||
|
return [
|
||||||
|
list_ie_entries[i - 1] for i in playlistitems
|
||||||
|
if -num_entries <= i - 1 < num_entries]
|
||||||
|
|
||||||
|
def report_download(num_entries):
|
||||||
|
self.to_screen(
|
||||||
|
'[%s] playlist %s: Downloading %d videos' %
|
||||||
|
(ie_result['extractor'], playlist, num_entries))
|
||||||
|
|
||||||
|
if isinstance(ie_entries, list):
|
||||||
|
n_all_entries = len(ie_entries)
|
||||||
|
if playlistitems:
|
||||||
|
entries = make_playlistitems_entries(ie_entries)
|
||||||
|
else:
|
||||||
|
entries = ie_entries[playliststart:playlistend]
|
||||||
|
n_entries = len(entries)
|
||||||
|
self.to_screen(
|
||||||
|
'[%s] playlist %s: Collected %d video ids (downloading %d of them)' %
|
||||||
|
(ie_result['extractor'], playlist, n_all_entries, n_entries))
|
||||||
|
elif isinstance(ie_entries, PagedList):
|
||||||
|
if playlistitems:
|
||||||
|
entries = []
|
||||||
|
for item in playlistitems:
|
||||||
|
entries.extend(ie_entries.getslice(
|
||||||
|
item - 1, item
|
||||||
|
))
|
||||||
|
else:
|
||||||
|
entries = ie_entries.getslice(
|
||||||
|
playliststart, playlistend)
|
||||||
|
n_entries = len(entries)
|
||||||
|
report_download(n_entries)
|
||||||
|
else: # iterable
|
||||||
|
if playlistitems:
|
||||||
|
entries = make_playlistitems_entries(list(itertools.islice(
|
||||||
|
ie_entries, 0, max(playlistitems))))
|
||||||
|
else:
|
||||||
|
entries = list(itertools.islice(
|
||||||
|
ie_entries, playliststart, playlistend))
|
||||||
|
n_entries = len(entries)
|
||||||
|
report_download(n_entries)
|
||||||
|
|
||||||
|
if self.params.get('playlistreverse', False):
|
||||||
|
entries = entries[::-1]
|
||||||
|
|
||||||
|
if self.params.get('playlistrandom', False):
|
||||||
|
random.shuffle(entries)
|
||||||
|
|
||||||
|
x_forwarded_for = ie_result.get('__x_forwarded_for_ip')
|
||||||
|
|
||||||
|
for i, entry in enumerate(entries, 1):
|
||||||
|
self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
|
||||||
|
# This __x_forwarded_for_ip thing is a bit ugly but requires
|
||||||
|
# minimal changes
|
||||||
|
if x_forwarded_for:
|
||||||
|
entry['__x_forwarded_for_ip'] = x_forwarded_for
|
||||||
|
extra = {
|
||||||
|
'n_entries': n_entries,
|
||||||
|
'playlist': playlist,
|
||||||
|
'playlist_id': ie_result.get('id'),
|
||||||
|
'playlist_title': ie_result.get('title'),
|
||||||
|
'playlist_uploader': ie_result.get('uploader'),
|
||||||
|
'playlist_uploader_id': ie_result.get('uploader_id'),
|
||||||
|
'playlist_index': playlistitems[i - 1] if playlistitems else i + playliststart,
|
||||||
|
'extractor': ie_result['extractor'],
|
||||||
|
'webpage_url': ie_result['webpage_url'],
|
||||||
|
'webpage_url_basename': url_basename(ie_result['webpage_url']),
|
||||||
|
'extractor_key': ie_result['extractor_key'],
|
||||||
|
}
|
||||||
|
|
||||||
|
reason = self._match_entry(entry, incomplete=True)
|
||||||
|
if reason is not None:
|
||||||
|
self.to_screen('[download] ' + reason)
|
||||||
|
continue
|
||||||
|
|
||||||
|
entry_result = self.__process_iterable_entry(entry, download, extra)
|
||||||
|
# TODO: skip failed (empty) entries?
|
||||||
|
playlist_results.append(entry_result)
|
||||||
|
ie_result['entries'] = playlist_results
|
||||||
|
self.to_screen('[download] Finished downloading playlist: %s' % playlist)
|
||||||
|
return ie_result
|
||||||
|
|
||||||
@__handle_extraction_exceptions
|
@__handle_extraction_exceptions
|
||||||
def __process_iterable_entry(self, entry, download, extra_info):
|
def __process_iterable_entry(self, entry, download, extra_info):
|
||||||
return self.process_ie_result(
|
return self.process_ie_result(
|
||||||
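Note on `__process_playlist` above: the `playlist_items` string is expanded into 1-based indices, de-duplicated in order, and then used to pick entries that fall inside the list. The sketch below reproduces that selection for the plain-list case only. `ordered_unique` is a hypothetical stand-in for the project's `orderedSet`, and `select_entries` is an illustrative wrapper that does not exist in the source.

```python
def iter_playlistitems(spec):
    # Expand '1-3,7' into 1, 2, 3, 7 (ranges are inclusive).
    for segment in spec.split(','):
        if '-' in segment:
            start, end = segment.split('-')
            for item in range(int(start), int(end) + 1):
                yield item
        else:
            yield int(segment)


def ordered_unique(items):
    # Hypothetical stand-in for orderedSet: keep first occurrences, in order.
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def select_entries(entries, spec):
    # Illustrative wrapper around the same bounds check as
    # make_playlistitems_entries in the hunk above.
    items = ordered_unique(iter_playlistitems(spec))
    num_entries = len(entries)
    return [
        entries[i - 1] for i in items
        if -num_entries <= i - 1 < num_entries]


selected = select_entries(['a', 'b', 'c', 'd', 'e'], '1-2,4,2,9')
print(selected)
```

In this run the repeated `2` is dropped and index `9` is ignored because it lies past the end of the five-entry list.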
@ -1083,7 +1106,7 @@ class YoutubeDL(object):
|
|||||||
'*=': lambda attr, value: value in attr,
|
'*=': lambda attr, value: value in attr,
|
||||||
}
|
}
|
||||||
str_operator_rex = re.compile(r'''(?x)
|
str_operator_rex = re.compile(r'''(?x)
|
||||||
\s*(?P<key>ext|acodec|vcodec|container|protocol|format_id)
|
\s*(?P<key>ext|acodec|vcodec|container|protocol|format_id|language)
|
||||||
\s*(?P<negation>!\s*)?(?P<op>%s)(?P<none_inclusive>\s*\?)?
|
\s*(?P<negation>!\s*)?(?P<op>%s)(?P<none_inclusive>\s*\?)?
|
||||||
\s*(?P<value>[a-zA-Z0-9._-]+)
|
\s*(?P<value>[a-zA-Z0-9._-]+)
|
||||||
\s*$
|
\s*$
|
||||||
@ -1226,6 +1249,8 @@ class YoutubeDL(object):
|
|||||||
group = _parse_format_selection(tokens, inside_group=True)
|
group = _parse_format_selection(tokens, inside_group=True)
|
||||||
current_selector = FormatSelector(GROUP, group, [])
|
current_selector = FormatSelector(GROUP, group, [])
|
||||||
elif string == '+':
|
elif string == '+':
|
||||||
|
if inside_merge:
|
||||||
|
raise syntax_error('Unexpected "+"', start)
|
||||||
video_selector = current_selector
|
video_selector = current_selector
|
||||||
audio_selector = _parse_format_selection(tokens, inside_merge=True)
|
audio_selector = _parse_format_selection(tokens, inside_merge=True)
|
||||||
if not video_selector or not audio_selector:
|
if not video_selector or not audio_selector:
|
||||||
@ -1610,7 +1635,7 @@ class YoutubeDL(object):
|
|||||||
if req_format is None:
|
if req_format is None:
|
||||||
req_format = self._default_format_spec(info_dict, download=download)
|
req_format = self._default_format_spec(info_dict, download=download)
|
||||||
if self.params.get('verbose'):
|
if self.params.get('verbose'):
|
||||||
self.to_stdout('[debug] Default format spec: %s' % req_format)
|
self._write_string('[debug] Default format spec: %s\n' % req_format)
|
||||||
|
|
||||||
format_selector = self.build_format_selector(req_format)
|
format_selector = self.build_format_selector(req_format)
|
||||||
|
|
||||||
@ -1777,6 +1802,8 @@ class YoutubeDL(object):
|
|||||||
os.makedirs(dn)
|
os.makedirs(dn)
|
||||||
return True
|
return True
|
||||||
except (OSError, IOError) as err:
|
except (OSError, IOError) as err:
|
||||||
|
if isinstance(err, OSError) and err.errno == errno.EEXIST:
|
||||||
|
return True
|
||||||
self.report_error('unable to create directory ' + error_to_compat_str(err))
|
self.report_error('unable to create directory ' + error_to_compat_str(err))
|
||||||
return False
|
return False
|
||||||
|
|
||||||
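Note on the hunk above: an `OSError` with `errno.EEXIST` is now treated as success, which covers the case where the directory appears between the existence check and the `os.makedirs` call. A minimal sketch under that assumption follows. `ensure_dir_exists` is a hypothetical name, and it omits the surrounding path handling and the `report_error` call of the real method.

```python
import errno
import os
import shutil
import tempfile


def ensure_dir_exists(path):
    # Hypothetical helper: create the directory, tolerating "already exists".
    try:
        os.makedirs(path)
        return True
    except (OSError, IOError) as err:
        if isinstance(err, OSError) and err.errno == errno.EEXIST:
            return True
        return False


base = tempfile.mkdtemp()
target = os.path.join(base, 'a', 'b')

first = ensure_dir_exists(target)   # creates the nested directories
second = ensure_dir_exists(target)  # directory exists, still reported as ok
print(first, second)

shutil.rmtree(base)
```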
@ -1871,7 +1898,7 @@ class YoutubeDL(object):
|
|||||||
for ph in self._progress_hooks:
|
for ph in self._progress_hooks:
|
||||||
fd.add_progress_hook(ph)
|
fd.add_progress_hook(ph)
|
||||||
if self.params.get('verbose'):
|
if self.params.get('verbose'):
|
||||||
self.to_stdout('[debug] Invoking downloader on %r' % info.get('url'))
|
self.to_screen('[debug] Invoking downloader on %r' % info.get('url'))
|
||||||
return fd.download(name, info)
|
return fd.download(name, info)
|
||||||
|
|
||||||
if info_dict.get('requested_formats') is not None:
|
if info_dict.get('requested_formats') is not None:
|
||||||
@ -2410,7 +2437,7 @@ class YoutubeDL(object):
|
|||||||
thumb_ext = determine_ext(t['url'], 'jpg')
|
thumb_ext = determine_ext(t['url'], 'jpg')
|
||||||
suffix = '_%s' % t['id'] if len(thumbnails) > 1 else ''
|
suffix = '_%s' % t['id'] if len(thumbnails) > 1 else ''
|
||||||
thumb_display_id = '%s ' % t['id'] if len(thumbnails) > 1 else ''
|
thumb_display_id = '%s ' % t['id'] if len(thumbnails) > 1 else ''
|
||||||
t['filename'] = thumb_filename = os.path.splitext(filename)[0] + suffix + '.' + thumb_ext
|
t['filename'] = thumb_filename = replace_extension(filename + suffix, thumb_ext, info_dict.get('ext'))
|
||||||
|
|
||||||
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(thumb_filename)):
|
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(thumb_filename)):
|
||||||
self.to_screen('[%s] %s: Thumbnail %sis already present' %
|
self.to_screen('[%s] %s: Thumbnail %sis already present' %
|
||||||
|
@ -340,6 +340,7 @@ def _real_main(argv=None):
|
|||||||
'format': opts.format,
|
'format': opts.format,
|
||||||
'listformats': opts.listformats,
|
'listformats': opts.listformats,
|
||||||
'outtmpl': outtmpl,
|
'outtmpl': outtmpl,
|
||||||
|
'outtmpl_na_placeholder': opts.outtmpl_na_placeholder,
|
||||||
'autonumber_size': opts.autonumber_size,
|
'autonumber_size': opts.autonumber_size,
|
||||||
'autonumber_start': opts.autonumber_start,
|
'autonumber_start': opts.autonumber_start,
|
||||||
'restrictfilenames': opts.restrictfilenames,
|
'restrictfilenames': opts.restrictfilenames,
|
||||||
|
@ -97,12 +97,15 @@ class FragmentFD(FileDownloader):
|
|||||||
|
|
||||||
def _download_fragment(self, ctx, frag_url, info_dict, headers=None):
|
def _download_fragment(self, ctx, frag_url, info_dict, headers=None):
|
||||||
fragment_filename = '%s-Frag%d' % (ctx['tmpfilename'], ctx['fragment_index'])
|
fragment_filename = '%s-Frag%d' % (ctx['tmpfilename'], ctx['fragment_index'])
|
||||||
success = ctx['dl'].download(fragment_filename, {
|
fragment_info_dict = {
|
||||||
'url': frag_url,
|
'url': frag_url,
|
||||||
'http_headers': headers or info_dict.get('http_headers'),
|
'http_headers': headers or info_dict.get('http_headers'),
|
||||||
})
|
}
|
||||||
|
success = ctx['dl'].download(fragment_filename, fragment_info_dict)
|
||||||
if not success:
|
if not success:
|
||||||
return False, None
|
return False, None
|
||||||
|
if fragment_info_dict.get('filetime'):
|
||||||
|
ctx['fragment_filetime'] = fragment_info_dict.get('filetime')
|
||||||
down, frag_sanitized = sanitize_open(fragment_filename, 'rb')
|
down, frag_sanitized = sanitize_open(fragment_filename, 'rb')
|
||||||
ctx['fragment_filename_sanitized'] = frag_sanitized
|
ctx['fragment_filename_sanitized'] = frag_sanitized
|
||||||
frag_content = down.read()
|
frag_content = down.read()
|
||||||
@ -258,6 +261,13 @@ class FragmentFD(FileDownloader):
|
|||||||
downloaded_bytes = ctx['complete_frags_downloaded_bytes']
|
downloaded_bytes = ctx['complete_frags_downloaded_bytes']
|
||||||
else:
|
else:
|
||||||
self.try_rename(ctx['tmpfilename'], ctx['filename'])
|
self.try_rename(ctx['tmpfilename'], ctx['filename'])
|
||||||
|
if self.params.get('updatetime', True):
|
||||||
|
filetime = ctx.get('fragment_filetime')
|
||||||
|
if filetime:
|
||||||
|
try:
|
||||||
|
os.utime(ctx['filename'], (time.time(), filetime))
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
downloaded_bytes = os.path.getsize(encodeFilename(ctx['filename']))
|
downloaded_bytes = os.path.getsize(encodeFilename(ctx['filename']))
|
||||||
|
|
||||||
self._hook_progress({
|
self._hook_progress({
|
||||||
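Note on the hunk above: when `updatetime` is enabled and a fragment reported a `filetime`, the finished file's modification time is set with `os.utime`, and any failure is silently ignored. The sketch below shows that step in isolation. `apply_filetime` is a hypothetical helper, and the timestamp is an arbitrary example value.

```python
import os
import tempfile
import time


def apply_filetime(filename, filetime, updatetime=True):
    # Hypothetical helper: set mtime to the server-reported time, if any.
    if not updatetime or not filetime:
        return False
    try:
        # First value is the access time, second the modification time.
        os.utime(filename, (time.time(), filetime))
    except Exception:
        return False
    return True


fd, path = tempfile.mkstemp()
os.close(fd)

applied = apply_filetime(path, 1477346700)
mtime = int(os.path.getmtime(path))
skipped = apply_filetime(path, None)
print(applied, mtime, skipped)

os.remove(path)
```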
|
@ -42,11 +42,13 @@ class HlsFD(FragmentFD):
|
|||||||
# no segments will definitely be appended to the end of the playlist.
|
# no segments will definitely be appended to the end of the playlist.
|
||||||
# r'#EXT-X-PLAYLIST-TYPE:EVENT', # media segments may be appended to the end of
|
# r'#EXT-X-PLAYLIST-TYPE:EVENT', # media segments may be appended to the end of
|
||||||
# # event media playlists [4]
|
# # event media playlists [4]
|
||||||
|
r'#EXT-X-MAP:', # media initialization [5]
|
||||||
|
|
||||||
# 1. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.2.4
|
# 1. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.2.4
|
||||||
# 2. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.2.2
|
# 2. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.2.2
|
||||||
# 3. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.2
|
# 3. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.2
|
||||||
# 4. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.5
|
# 4. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.5
|
||||||
|
# 5. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.2.5
|
||||||
)
|
)
|
||||||
check_results = [not re.search(feature, manifest) for feature in UNSUPPORTED_FEATURES]
|
check_results = [not re.search(feature, manifest) for feature in UNSUPPORTED_FEATURES]
|
||||||
is_aes128_enc = '#EXT-X-KEY:METHOD=AES-128' in manifest
|
is_aes128_enc = '#EXT-X-KEY:METHOD=AES-128' in manifest
|
||||||
@ -170,8 +172,12 @@ class HlsFD(FragmentFD):
|
|||||||
iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
|
iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
|
||||||
decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen(
|
decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen(
|
||||||
self._prepare_url(info_dict, info_dict.get('_decryption_key_url') or decrypt_info['URI'])).read()
|
self._prepare_url(info_dict, info_dict.get('_decryption_key_url') or decrypt_info['URI'])).read()
|
||||||
frag_content = AES.new(
|
# Don't decrypt the content in tests since the data is explicitly truncated, so its length is not a valid block
|
||||||
decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content)
|
# size (see https://github.com/ytdl-org/youtube-dl/pull/27660). Tests only care that the correct data was downloaded,
|
||||||
|
# not what it decrypts to.
|
||||||
|
if not test:
|
||||||
|
frag_content = AES.new(
|
||||||
|
decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content)
|
||||||
self._append_fragment(ctx, frag_content)
|
self._append_fragment(ctx, frag_content)
|
||||||
# We only download the first fragment during the test
|
# We only download the first fragment during the test
|
||||||
if test:
|
if test:
|
||||||
|
@ -1,14 +1,15 @@
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
import calendar
|
|
||||||
import re
|
import re
|
||||||
import time
|
|
||||||
|
|
||||||
from .amp import AMPIE
|
from .amp import AMPIE
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from .youtube import YoutubeIE
|
from ..utils import (
|
||||||
from ..compat import compat_urlparse
|
parse_duration,
|
||||||
|
parse_iso8601,
|
||||||
|
try_get,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
class AbcNewsVideoIE(AMPIE):
|
class AbcNewsVideoIE(AMPIE):
|
||||||
@ -18,8 +19,8 @@ class AbcNewsVideoIE(AMPIE):
|
|||||||
(?:
|
(?:
|
||||||
abcnews\.go\.com/
|
abcnews\.go\.com/
|
||||||
(?:
|
(?:
|
||||||
[^/]+/video/(?P<display_id>[0-9a-z-]+)-|
|
(?:[^/]+/)*video/(?P<display_id>[0-9a-z-]+)-|
|
||||||
video/embed\?.*?\bid=
|
video/(?:embed|itemfeed)\?.*?\bid=
|
||||||
)|
|
)|
|
||||||
fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/
|
fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/
|
||||||
)
|
)
|
||||||
@ -36,6 +37,8 @@ class AbcNewsVideoIE(AMPIE):
|
|||||||
'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.',
|
'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.',
|
||||||
'duration': 180,
|
'duration': 180,
|
||||||
'thumbnail': r're:^https?://.*\.jpg$',
|
'thumbnail': r're:^https?://.*\.jpg$',
|
||||||
|
'timestamp': 1380454200,
|
||||||
|
'upload_date': '20130929',
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
# m3u8 download
|
# m3u8 download
|
||||||
@ -47,6 +50,12 @@ class AbcNewsVideoIE(AMPIE):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'http://abcnews.go.com/2020/video/2020-husband-stands-teacher-jail-student-affairs-26119478',
|
'url': 'http://abcnews.go.com/2020/video/2020-husband-stands-teacher-jail-student-affairs-26119478',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'http://abcnews.go.com/video/itemfeed?id=46979033',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://abcnews.go.com/GMA/News/video/history-christmas-story-67894761',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
@ -67,28 +76,23 @@ class AbcNewsIE(InfoExtractor):
|
|||||||
_VALID_URL = r'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)'
|
_VALID_URL = r'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY',
|
# Youtube Embeds
|
||||||
|
'url': 'https://abcnews.go.com/Entertainment/peter-billingsley-child-actor-christmas-story-hollywood-power/story?id=51286501',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '10505354',
|
'id': '51286501',
|
||||||
'ext': 'flv',
|
'title': "Peter Billingsley: From child actor in 'A Christmas Story' to Hollywood power player",
|
||||||
'display_id': 'dramatic-video-rare-death-job-america',
|
'description': 'Billingsley went from a child actor to Hollywood power player.',
|
||||||
'title': 'Occupational Hazards',
|
|
||||||
'description': 'Nightline investigates the dangers that lurk at various jobs.',
|
|
||||||
'thumbnail': r're:^https?://.*\.jpg$',
|
|
||||||
'upload_date': '20100428',
|
|
||||||
'timestamp': 1272412800,
|
|
||||||
},
|
},
|
||||||
'add_ie': ['AbcNewsVideo'],
|
'playlist_count': 5,
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818',
|
'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '38897857',
|
'id': '38897857',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'display_id': 'justin-timberlake-performs-stop-feeling-eurovision-2016',
|
|
||||||
'title': 'Justin Timberlake Drops Hints For Secret Single',
|
'title': 'Justin Timberlake Drops Hints For Secret Single',
|
||||||
'description': 'Lara Spencer reports the buzziest stories of the day in "GMA" Pop News.',
|
'description': 'Lara Spencer reports the buzziest stories of the day in "GMA" Pop News.',
|
||||||
'upload_date': '20160515',
|
'upload_date': '20160505',
|
||||||
'timestamp': 1463329500,
|
'timestamp': 1462442280,
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
# m3u8 download
|
# m3u8 download
|
||||||
@ -100,49 +104,55 @@ class AbcNewsIE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343',
|
'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# inline.type == 'video'
|
||||||
|
'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
mobj = re.match(self._VALID_URL, url)
|
story_id = self._match_id(url)
|
||||||
display_id = mobj.group('display_id')
|
webpage = self._download_webpage(url, story_id)
|
||||||
video_id = mobj.group('id')
|
story = self._parse_json(self._search_regex(
|
||||||
|
r"window\['__abcnews__'\]\s*=\s*({.+?});",
|
||||||
|
webpage, 'data'), story_id)['page']['content']['story']['everscroll'][0]
|
||||||
|
article_contents = story.get('articleContents') or {}
|
||||||
|
|
||||||
webpage = self._download_webpage(url, video_id)
|
def entries():
|
||||||
video_url = self._search_regex(
|
featured_video = story.get('featuredVideo') or {}
|
||||||
r'window\.abcnvideo\.url\s*=\s*"([^"]+)"', webpage, 'video URL')
|
feed = try_get(featured_video, lambda x: x['video']['feed'])
|
||||||
full_video_url = compat_urlparse.urljoin(url, video_url)
|
if feed:
|
||||||
|
yield {
|
||||||
|
'_type': 'url',
|
||||||
|
'id': featured_video.get('id'),
|
||||||
|
'title': featured_video.get('name'),
|
||||||
|
'url': feed,
|
||||||
|
'thumbnail': featured_video.get('images'),
|
||||||
|
'description': featured_video.get('description'),
|
||||||
|
'timestamp': parse_iso8601(featured_video.get('uploadDate')),
|
||||||
|
'duration': parse_duration(featured_video.get('duration')),
|
||||||
|
'ie_key': AbcNewsVideoIE.ie_key(),
|
||||||
|
}
|
||||||
|
|
||||||
youtube_url = YoutubeIE._extract_url(webpage)
|
for inline in (article_contents.get('inlines') or []):
|
||||||
|
inline_type = inline.get('type')
|
||||||
|
if inline_type == 'iframe':
|
||||||
|
iframe_url = try_get(inline, lambda x: x['attrs']['src'])
|
||||||
|
if iframe_url:
|
||||||
|
yield self.url_result(iframe_url)
|
||||||
|
elif inline_type == 'video':
|
||||||
|
video_id = inline.get('id')
|
||||||
|
if video_id:
|
||||||
|
yield {
|
||||||
|
'_type': 'url',
|
||||||
|
'id': video_id,
|
||||||
|
'url': 'http://abcnews.go.com/video/embed?id=' + video_id,
|
||||||
|
'thumbnail': inline.get('imgSrc') or inline.get('imgDefault'),
|
||||||
|
'description': inline.get('description'),
|
||||||
|
'duration': parse_duration(inline.get('duration')),
|
||||||
|
'ie_key': AbcNewsVideoIE.ie_key(),
|
||||||
|
}
|
||||||
|
|
||||||
timestamp = None
|
return self.playlist_result(
|
||||||
date_str = self._html_search_regex(
|
entries(), story_id, article_contents.get('headline'),
|
||||||
r'<span[^>]+class="timestamp">([^<]+)</span>',
|
article_contents.get('subHead'))
|
||||||
webpage, 'timestamp', fatal=False)
|
|
||||||
if date_str:
|
|
||||||
tz_offset = 0
|
|
||||||
if date_str.endswith(' ET'): # Eastern Time
|
|
||||||
tz_offset = -5
|
|
||||||
date_str = date_str[:-3]
|
|
||||||
date_formats = ['%b. %d, %Y', '%b %d, %Y, %I:%M %p']
|
|
||||||
for date_format in date_formats:
|
|
||||||
try:
|
|
||||||
timestamp = calendar.timegm(time.strptime(date_str.strip(), date_format))
|
|
||||||
except ValueError:
|
|
||||||
continue
|
|
||||||
if timestamp is not None:
|
|
||||||
timestamp -= tz_offset * 3600
|
|
||||||
|
|
||||||
entry = {
|
|
||||||
'_type': 'url_transparent',
|
|
||||||
'ie_key': AbcNewsVideoIE.ie_key(),
|
|
||||||
'url': full_video_url,
|
|
||||||
'id': video_id,
|
|
||||||
'display_id': display_id,
|
|
||||||
'timestamp': timestamp,
|
|
||||||
}
|
|
||||||
|
|
||||||
if youtube_url:
|
|
||||||
entries = [entry, self.url_result(youtube_url, ie=YoutubeIE.ie_key())]
|
|
||||||
return self.playlist_result(entries)
|
|
||||||
|
|
||||||
return entry
|
|
||||||
|
@ -2,21 +2,48 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
import re
|
import re
|
||||||
import functools
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..compat import compat_str
|
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
clean_html,
|
clean_html,
|
||||||
float_or_none,
|
clean_podcast_url,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
try_get,
|
parse_iso8601,
|
||||||
unified_timestamp,
|
|
||||||
OnDemandPagedList,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class ACastIE(InfoExtractor):
|
class ACastBaseIE(InfoExtractor):
|
||||||
|
def _extract_episode(self, episode, show_info):
|
||||||
|
title = episode['title']
|
||||||
|
info = {
|
||||||
|
'id': episode['id'],
|
||||||
|
'display_id': episode.get('episodeUrl'),
|
||||||
|
'url': clean_podcast_url(episode['url']),
|
||||||
|
'title': title,
|
||||||
|
'description': clean_html(episode.get('description') or episode.get('summary')),
|
||||||
|
'thumbnail': episode.get('image'),
|
||||||
|
'timestamp': parse_iso8601(episode.get('publishDate')),
|
||||||
|
'duration': int_or_none(episode.get('duration')),
|
||||||
|
'filesize': int_or_none(episode.get('contentLength')),
|
||||||
|
'season_number': int_or_none(episode.get('season')),
|
||||||
|
'episode': title,
|
||||||
|
'episode_number': int_or_none(episode.get('episode')),
|
||||||
|
}
|
||||||
|
info.update(show_info)
|
||||||
|
return info
|
||||||
|
|
||||||
|
def _extract_show_info(self, show):
|
||||||
|
return {
|
||||||
|
'creator': show.get('author'),
|
||||||
|
'series': show.get('title'),
|
||||||
|
}
|
||||||
|
|
||||||
|
def _call_api(self, path, video_id, query=None):
|
||||||
|
return self._download_json(
|
||||||
|
'https://feeder.acast.com/api/v1/shows/' + path, video_id, query=query)
|
||||||
|
|
||||||
|
|
||||||
|
class ACastIE(ACastBaseIE):
|
||||||
IE_NAME = 'acast'
|
IE_NAME = 'acast'
|
||||||
_VALID_URL = r'''(?x)
|
_VALID_URL = r'''(?x)
|
||||||
https?://
|
https?://
|
||||||
@ -28,15 +55,15 @@ class ACastIE(InfoExtractor):
|
|||||||
'''
|
'''
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
|
'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
|
||||||
'md5': '16d936099ec5ca2d5869e3a813ee8dc4',
|
'md5': 'f5598f3ad1e4776fed12ec1407153e4b',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
|
'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
|
||||||
'ext': 'mp3',
|
'ext': 'mp3',
|
||||||
'title': '2. Raggarmordet - Röster ur det förflutna',
|
'title': '2. Raggarmordet - Röster ur det förflutna',
|
||||||
'description': 'md5:4f81f6d8cf2e12ee21a321d8bca32db4',
|
'description': 'md5:a992ae67f4d98f1c0141598f7bebbf67',
|
||||||
'timestamp': 1477346700,
|
'timestamp': 1477346700,
|
||||||
'upload_date': '20161024',
|
'upload_date': '20161024',
|
||||||
'duration': 2766.602563,
|
'duration': 2766,
|
||||||
'creator': 'Anton Berg & Martin Johnson',
|
'creator': 'Anton Berg & Martin Johnson',
|
||||||
'series': 'Spår',
|
'series': 'Spår',
|
||||||
'episode': '2. Raggarmordet - Röster ur det förflutna',
|
'episode': '2. Raggarmordet - Röster ur det förflutna',
|
||||||
@ -45,7 +72,7 @@ class ACastIE(InfoExtractor):
|
|||||||
'url': 'http://embed.acast.com/adambuxton/ep.12-adam-joeschristmaspodcast2015',
|
'url': 'http://embed.acast.com/adambuxton/ep.12-adam-joeschristmaspodcast2015',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://play.acast.com/s/rattegangspodden/s04e09-styckmordet-i-helenelund-del-22',
|
'url': 'https://play.acast.com/s/rattegangspodden/s04e09styckmordetihelenelund-del2-2',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://play.acast.com/s/sparpodcast/2a92b283-1a75-4ad8-8396-499c641de0d9',
|
'url': 'https://play.acast.com/s/sparpodcast/2a92b283-1a75-4ad8-8396-499c641de0d9',
|
||||||
@ -54,40 +81,14 @@ class ACastIE(InfoExtractor):
|
|||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
channel, display_id = re.match(self._VALID_URL, url).groups()
|
channel, display_id = re.match(self._VALID_URL, url).groups()
|
||||||
s = self._download_json(
|
episode = self._call_api(
|
||||||
'https://feeder.acast.com/api/v1/shows/%s/episodes/%s' % (channel, display_id),
|
'%s/episodes/%s' % (channel, display_id),
|
||||||
display_id)
|
display_id, {'showInfo': 'true'})
|
||||||
media_url = s['url']
|
return self._extract_episode(
|
||||||
if re.search(r'[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}', display_id):
|
episode, self._extract_show_info(episode.get('show') or {}))
|
||||||
episode_url = s.get('episodeUrl')
|
|
||||||
if episode_url:
|
|
||||||
display_id = episode_url
|
|
||||||
else:
|
|
||||||
channel, display_id = re.match(self._VALID_URL, s['link']).groups()
|
|
||||||
cast_data = self._download_json(
|
|
||||||
'https://play-api.acast.com/splash/%s/%s' % (channel, display_id),
|
|
||||||
display_id)['result']
|
|
||||||
e = cast_data['episode']
|
|
||||||
title = e.get('name') or s['title']
|
|
||||||
return {
|
|
||||||
'id': compat_str(e['id']),
|
|
||||||
'display_id': display_id,
|
|
||||||
'url': media_url,
|
|
||||||
'title': title,
|
|
||||||
'description': e.get('summary') or clean_html(e.get('description') or s.get('description')),
|
|
||||||
'thumbnail': e.get('image'),
|
|
||||||
'timestamp': unified_timestamp(e.get('publishingDate') or s.get('publishDate')),
|
|
||||||
'duration': float_or_none(e.get('duration') or s.get('duration')),
|
|
||||||
'filesize': int_or_none(e.get('contentLength')),
|
|
||||||
'creator': try_get(cast_data, lambda x: x['show']['author'], compat_str),
|
|
||||||
'series': try_get(cast_data, lambda x: x['show']['name'], compat_str),
|
|
||||||
'season_number': int_or_none(e.get('seasonNumber')),
|
|
||||||
'episode': title,
|
|
||||||
'episode_number': int_or_none(e.get('episodeNumber')),
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
class ACastChannelIE(InfoExtractor):
|
class ACastChannelIE(ACastBaseIE):
|
||||||
IE_NAME = 'acast:channel'
|
IE_NAME = 'acast:channel'
|
||||||
_VALID_URL = r'''(?x)
|
_VALID_URL = r'''(?x)
|
||||||
https?://
|
https?://
|
||||||
@ -102,34 +103,24 @@ class ACastChannelIE(InfoExtractor):
|
|||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '4efc5294-5385-4847-98bd-519799ce5786',
|
'id': '4efc5294-5385-4847-98bd-519799ce5786',
|
||||||
'title': 'Today in Focus',
|
'title': 'Today in Focus',
|
||||||
'description': 'md5:9ba5564de5ce897faeb12963f4537a64',
|
'description': 'md5:c09ce28c91002ce4ffce71d6504abaae',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 35,
|
'playlist_mincount': 200,
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://play.acast.com/s/ft-banking-weekly',
|
'url': 'http://play.acast.com/s/ft-banking-weekly',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
_API_BASE_URL = 'https://play.acast.com/api/'
|
|
||||||
_PAGE_SIZE = 10
|
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def suitable(cls, url):
|
def suitable(cls, url):
|
||||||
return False if ACastIE.suitable(url) else super(ACastChannelIE, cls).suitable(url)
|
return False if ACastIE.suitable(url) else super(ACastChannelIE, cls).suitable(url)
|
||||||
|
|
||||||
def _fetch_page(self, channel_slug, page):
|
|
||||||
casts = self._download_json(
|
|
||||||
self._API_BASE_URL + 'channels/%s/acasts?page=%s' % (channel_slug, page),
|
|
||||||
channel_slug, note='Download page %d of channel data' % page)
|
|
||||||
for cast in casts:
|
|
||||||
yield self.url_result(
|
|
||||||
'https://play.acast.com/s/%s/%s' % (channel_slug, cast['url']),
|
|
||||||
'ACast', cast['id'])
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
channel_slug = self._match_id(url)
|
show_slug = self._match_id(url)
|
||||||
channel_data = self._download_json(
|
show = self._call_api(show_slug, show_slug)
|
||||||
self._API_BASE_URL + 'channels/%s' % channel_slug, channel_slug)
|
show_info = self._extract_show_info(show)
|
||||||
entries = OnDemandPagedList(functools.partial(
|
entries = []
|
||||||
self._fetch_page, channel_slug), self._PAGE_SIZE)
|
for episode in (show.get('episodes') or []):
|
||||||
return self.playlist_result(entries, compat_str(
|
entries.append(self._extract_episode(episode, show_info))
|
||||||
channel_data['id']), channel_data['name'], channel_data.get('description'))
|
return self.playlist_result(
|
||||||
|
entries, show.get('id'), show.get('title'), show.get('description'))
|
||||||
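Reviewer note on the acast.py hunk above: the new `_extract_episode` maps one episode object from the feeder API onto an info dict and then merges in the show-level fields. The following is a minimal standalone sketch of that mapping, not the extractor itself. `int_or_none` and `parse_iso8601` here are simplified stand-ins for the helpers in `youtube_dl/utils.py`, the function name `extract_episode` is hypothetical, and the `publishDate` string format used in the usage note is an assumption.

```python
import calendar
import datetime


def int_or_none(value):
    # Simplified stand-in for youtube_dl.utils.int_or_none (assumption):
    # return an int when conversion works, otherwise None.
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_iso8601(date_str):
    # Simplified stand-in: only handles 'YYYY-MM-DDTHH:MM:SS...' in UTC.
    if not date_str:
        return None
    parsed = datetime.datetime.strptime(date_str[:19], '%Y-%m-%dT%H:%M:%S')
    return calendar.timegm(parsed.timetuple())


def extract_episode(episode, show_info):
    # Required keys use [] so a missing one fails loudly;
    # optional keys use .get() so a missing one becomes None.
    title = episode['title']
    info = {
        'id': episode['id'],
        'display_id': episode.get('episodeUrl'),
        'url': episode['url'],
        'title': title,
        'timestamp': parse_iso8601(episode.get('publishDate')),
        'duration': int_or_none(episode.get('duration')),
        'season_number': int_or_none(episode.get('season')),
        'episode': title,
        'episode_number': int_or_none(episode.get('episode')),
    }
    # Show-level fields (creator, series) are merged last.
    info.update(show_info)
    return info
```

With a made-up episode whose `publishDate` is `2016-10-24T22:05:00.000Z`, this yields the `timestamp` of `1477346700` that the updated test expects. Converting `duration` with an integer helper also matches the test change from `2766.602563` to `2766`.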
|
@ -10,6 +10,7 @@ import random
|
|||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..aes import aes_cbc_decrypt
|
from ..aes import aes_cbc_decrypt
|
||||||
from ..compat import (
|
from ..compat import (
|
||||||
|
compat_HTTPError,
|
||||||
compat_b64decode,
|
compat_b64decode,
|
||||||
compat_ord,
|
compat_ord,
|
||||||
)
|
)
|
||||||
@ -18,11 +19,14 @@ from ..utils import (
|
|||||||
bytes_to_long,
|
bytes_to_long,
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
float_or_none,
|
float_or_none,
|
||||||
|
int_or_none,
|
||||||
intlist_to_bytes,
|
intlist_to_bytes,
|
||||||
long_to_bytes,
|
long_to_bytes,
|
||||||
pkcs1pad,
|
pkcs1pad,
|
||||||
strip_or_none,
|
strip_or_none,
|
||||||
urljoin,
|
try_get,
|
||||||
|
unified_strdate,
|
||||||
|
urlencode_postdata,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@ -31,16 +35,30 @@ class ADNIE(InfoExtractor):
|
|||||||
_VALID_URL = r'https?://(?:www\.)?animedigitalnetwork\.fr/video/[^/]+/(?P<id>\d+)'
|
_VALID_URL = r'https?://(?:www\.)?animedigitalnetwork\.fr/video/[^/]+/(?P<id>\d+)'
|
||||||
_TEST = {
|
_TEST = {
|
||||||
'url': 'http://animedigitalnetwork.fr/video/blue-exorcist-kyoto-saga/7778-episode-1-debut-des-hostilites',
|
'url': 'http://animedigitalnetwork.fr/video/blue-exorcist-kyoto-saga/7778-episode-1-debut-des-hostilites',
|
||||||
'md5': 'e497370d847fd79d9d4c74be55575c7a',
|
'md5': '0319c99885ff5547565cacb4f3f9348d',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '7778',
|
'id': '7778',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Blue Exorcist - Kyôto Saga - Épisode 1',
|
'title': 'Blue Exorcist - Kyôto Saga - Episode 1',
|
||||||
'description': 'md5:2f7b5aa76edbc1a7a92cedcda8a528d5',
|
'description': 'md5:2f7b5aa76edbc1a7a92cedcda8a528d5',
|
||||||
|
'series': 'Blue Exorcist - Kyôto Saga',
|
||||||
|
'duration': 1467,
|
||||||
|
'release_date': '20170106',
|
||||||
|
'comment_count': int,
|
||||||
|
'average_rating': float,
|
||||||
|
'season_number': 2,
|
||||||
|
'episode': 'Début des hostilités',
|
||||||
|
'episode_number': 1,
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
_NETRC_MACHINE = 'animedigitalnetwork'
|
||||||
_BASE_URL = 'http://animedigitalnetwork.fr'
|
_BASE_URL = 'http://animedigitalnetwork.fr'
|
||||||
_RSA_KEY = (0xc35ae1e4356b65a73b551493da94b8cb443491c0aa092a357a5aee57ffc14dda85326f42d716e539a34542a0d3f363adf16c5ec222d713d5997194030ee2e4f0d1fb328c01a81cf6868c090d50de8e169c6b13d1675b9eeed1cbc51e1fffca9b38af07f37abd790924cd3bee59d0257cfda4fe5f3f0534877e21ce5821447d1b, 65537)
|
_API_BASE_URL = 'https://gw.api.animedigitalnetwork.fr/'
|
||||||
|
_PLAYER_BASE_URL = _API_BASE_URL + 'player/'
|
||||||
|
_HEADERS = {}
|
||||||
|
_LOGIN_ERR_MESSAGE = 'Unable to log in'
|
||||||
|
_RSA_KEY = (0x9B42B08905199A5CCE2026274399CA560ECB209EE9878A708B1C0812E1BB8CB5D1FB7441861147C1A1F2F3A0476DD63A9CAC20D3E983613346850AA6CB38F16DC7D720FD7D86FC6E5B3D5BBC72E14CD0BF9E869F2CEA2CCAD648F1DCE38F1FF916CEFB2D339B64AA0264372344BC775E265E8A852F88144AB0BD9AA06C1A4ABB, 65537)
|
||||||
_POS_ALIGN_MAP = {
|
_POS_ALIGN_MAP = {
|
||||||
'start': 1,
|
'start': 1,
|
||||||
'end': 3,
|
'end': 3,
|
||||||
@ -54,26 +72,24 @@ class ADNIE(InfoExtractor):
|
|||||||
def _ass_subtitles_timecode(seconds):
|
def _ass_subtitles_timecode(seconds):
|
||||||
return '%01d:%02d:%02d.%02d' % (seconds / 3600, (seconds % 3600) / 60, seconds % 60, (seconds % 1) * 100)
|
return '%01d:%02d:%02d.%02d' % (seconds / 3600, (seconds % 3600) / 60, seconds % 60, (seconds % 1) * 100)
|
||||||
|
|
||||||
def _get_subtitles(self, sub_path, video_id):
|
def _get_subtitles(self, sub_url, video_id):
|
||||||
if not sub_path:
|
if not sub_url:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
enc_subtitles = self._download_webpage(
|
enc_subtitles = self._download_webpage(
|
||||||
urljoin(self._BASE_URL, sub_path),
|
sub_url, video_id, 'Downloading subtitles location', fatal=False) or '{}'
|
||||||
video_id, 'Downloading subtitles location', fatal=False) or '{}'
|
|
||||||
subtitle_location = (self._parse_json(enc_subtitles, video_id, fatal=False) or {}).get('location')
|
subtitle_location = (self._parse_json(enc_subtitles, video_id, fatal=False) or {}).get('location')
|
||||||
if subtitle_location:
|
if subtitle_location:
|
||||||
enc_subtitles = self._download_webpage(
|
enc_subtitles = self._download_webpage(
|
||||||
urljoin(self._BASE_URL, subtitle_location),
|
subtitle_location, video_id, 'Downloading subtitles data',
|
||||||
video_id, 'Downloading subtitles data', fatal=False,
|
fatal=False, headers={'Origin': 'https://animedigitalnetwork.fr'})
|
||||||
headers={'Origin': 'https://animedigitalnetwork.fr'})
|
|
||||||
if not enc_subtitles:
|
if not enc_subtitles:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
|
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
|
||||||
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
|
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
|
||||||
bytes_to_intlist(compat_b64decode(enc_subtitles[24:])),
|
bytes_to_intlist(compat_b64decode(enc_subtitles[24:])),
|
||||||
bytes_to_intlist(binascii.unhexlify(self._K + '4b8ef13ec1872730')),
|
bytes_to_intlist(binascii.unhexlify(self._K + 'ab9f52f5baae7c72')),
|
||||||
bytes_to_intlist(compat_b64decode(enc_subtitles[:24]))
|
bytes_to_intlist(compat_b64decode(enc_subtitles[:24]))
|
||||||
))
|
))
|
||||||
subtitles_json = self._parse_json(
|
subtitles_json = self._parse_json(
|
||||||
@ -117,61 +133,100 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
|
|||||||
}])
|
}])
|
||||||
return subtitles
|
return subtitles
|
||||||
|
|
||||||
|
def _real_initialize(self):
|
||||||
|
username, password = self._get_login_info()
|
||||||
|
if not username:
|
||||||
|
return
|
||||||
|
try:
|
||||||
|
access_token = (self._download_json(
|
||||||
|
self._API_BASE_URL + 'authentication/login', None,
|
||||||
|
'Logging in', self._LOGIN_ERR_MESSAGE, fatal=False,
|
||||||
|
data=urlencode_postdata({
|
||||||
|
'password': password,
|
||||||
|
'rememberMe': False,
|
||||||
|
'source': 'Web',
|
||||||
|
'username': username,
|
||||||
|
})) or {}).get('accessToken')
|
||||||
|
if access_token:
|
||||||
|
self._HEADERS = {'authorization': 'Bearer ' + access_token}
|
||||||
|
except ExtractorError as e:
|
||||||
|
message = None
|
||||||
|
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
|
||||||
|
resp = self._parse_json(
|
||||||
|
e.cause.read().decode(), None, fatal=False) or {}
|
||||||
|
message = resp.get('message') or resp.get('code')
|
||||||
|
self.report_warning(message or self._LOGIN_ERR_MESSAGE)
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
webpage = self._download_webpage(url, video_id)
|
video_base_url = self._PLAYER_BASE_URL + 'video/%s/' % video_id
|
||||||
player_config = self._parse_json(self._search_regex(
|
player = self._download_json(
|
||||||
r'playerConfig\s*=\s*({.+});', webpage,
|
video_base_url + 'configuration', video_id,
|
||||||
'player config', default='{}'), video_id, fatal=False)
|
'Downloading player config JSON metadata',
|
||||||
if not player_config:
|
headers=self._HEADERS)['player']
|
||||||
config_url = urljoin(self._BASE_URL, self._search_regex(
|
options = player['options']
|
||||||
r'(?:id="player"|class="[^"]*adn-player-container[^"]*")[^>]+data-url="([^"]+)"',
|
|
||||||
webpage, 'config url'))
|
|
||||||
player_config = self._download_json(
|
|
||||||
config_url, video_id,
|
|
||||||
'Downloading player config JSON metadata')['player']
|
|
||||||
|
|
||||||
video_info = {}
|
user = options['user']
|
||||||
video_info_str = self._search_regex(
|
if not user.get('hasAccess'):
|
||||||
r'videoInfo\s*=\s*({.+});', webpage,
|
self.raise_login_required()
|
||||||
'video info', fatal=False)
|
|
||||||
if video_info_str:
|
|
||||||
video_info = self._parse_json(
|
|
||||||
video_info_str, video_id, fatal=False) or {}
|
|
||||||
|
|
||||||
options = player_config.get('options') or {}
|
token = self._download_json(
|
||||||
metas = options.get('metas') or {}
|
user.get('refreshTokenUrl') or (self._PLAYER_BASE_URL + 'refresh/token'),
|
||||||
links = player_config.get('links') or {}
|
video_id, 'Downloading access token', headers={
|
||||||
sub_path = player_config.get('subtitles')
|
'x-player-refresh-token': user['refreshToken']
|
||||||
error = None
|
}, data=b'')['token']
|
||||||
if not links:
|
|
||||||
links_url = player_config.get('linksurl') or options['videoUrl']
|
links_url = try_get(options, lambda x: x['video']['url']) or (video_base_url + 'link')
|
||||||
token = options['token']
|
self._K = ''.join([random.choice('0123456789abcdef') for _ in range(16)])
|
||||||
self._K = ''.join([random.choice('0123456789abcdef') for _ in range(16)])
|
message = bytes_to_intlist(json.dumps({
|
||||||
message = bytes_to_intlist(json.dumps({
|
'k': self._K,
|
||||||
'k': self._K,
|
't': token,
|
||||||
'e': 60,
|
}))
|
||||||
't': token,
|
|
||||||
}))
|
# Sometimes authentication fails for no good reason, retry with
|
||||||
|
# a different random padding
|
||||||
|
links_data = None
|
||||||
|
for _ in range(3):
|
||||||
padded_message = intlist_to_bytes(pkcs1pad(message, 128))
|
padded_message = intlist_to_bytes(pkcs1pad(message, 128))
|
||||||
n, e = self._RSA_KEY
|
n, e = self._RSA_KEY
|
||||||
encrypted_message = long_to_bytes(pow(bytes_to_long(padded_message), e, n))
|
encrypted_message = long_to_bytes(pow(bytes_to_long(padded_message), e, n))
|
||||||
authorization = base64.b64encode(encrypted_message).decode()
|
authorization = base64.b64encode(encrypted_message).decode()
|
||||||
links_data = self._download_json(
|
|
||||||
urljoin(self._BASE_URL, links_url), video_id,
|
try:
|
||||||
'Downloading links JSON metadata', headers={
|
links_data = self._download_json(
|
||||||
'Authorization': 'Bearer ' + authorization,
|
links_url, video_id, 'Downloading links JSON metadata', headers={
|
||||||
})
|
'X-Player-Token': authorization
|
||||||
links = links_data.get('links') or {}
|
}, query={
|
||||||
metas = metas or links_data.get('meta') or {}
|
'freeWithAds': 'true',
|
||||||
sub_path = sub_path or links_data.get('subtitles') or \
|
'adaptive': 'false',
|
||||||
'index.php?option=com_vodapi&task=subtitles.getJSON&format=json&id=' + video_id
|
'withMetadata': 'true',
|
||||||
sub_path += '&token=' + token
|
'source': 'Web'
|
||||||
error = links_data.get('error')
|
})
|
||||||
title = metas.get('title') or video_info['title']
|
break
|
||||||
|
except ExtractorError as e:
|
||||||
|
if not isinstance(e.cause, compat_HTTPError):
|
||||||
|
raise e
|
||||||
|
|
||||||
|
if e.cause.code == 401:
|
||||||
|
# This usually goes away with a different random pkcs1pad, so retry
|
||||||
|
continue
|
||||||
|
|
||||||
|
error = self._parse_json(e.cause.read(), video_id)
|
||||||
|
message = error.get('message')
|
||||||
|
if e.cause.code == 403 and error.get('code') == 'player-bad-geolocation-country':
|
||||||
|
self.raise_geo_restricted(msg=message)
|
||||||
|
raise ExtractorError(message)
|
||||||
|
else:
|
||||||
|
raise ExtractorError('Giving up retrying')
|
||||||
|
|
||||||
|
links = links_data.get('links') or {}
|
||||||
|
metas = links_data.get('metadata') or {}
|
||||||
|
sub_url = (links.get('subtitles') or {}).get('all')
|
||||||
|
video_info = links_data.get('video') or {}
|
||||||
|
title = metas['title']
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
for format_id, qualities in links.items():
|
for format_id, qualities in (links.get('streaming') or {}).items():
|
||||||
if not isinstance(qualities, dict):
|
if not isinstance(qualities, dict):
|
||||||
continue
|
continue
|
||||||
for quality, load_balancer_url in qualities.items():
|
for quality, load_balancer_url in qualities.items():
|
||||||
@ -189,19 +244,26 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
|
|||||||
for f in m3u8_formats:
|
for f in m3u8_formats:
|
||||||
f['language'] = 'fr'
|
f['language'] = 'fr'
|
||||||
formats.extend(m3u8_formats)
|
formats.extend(m3u8_formats)
|
||||||
if not error:
|
|
||||||
error = options.get('error')
|
|
||||||
if not formats and error:
|
|
||||||
raise ExtractorError('%s said: %s' % (self.IE_NAME, error), expected=True)
|
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
|
video = (self._download_json(
|
||||||
|
self._API_BASE_URL + 'video/%s' % video_id, video_id,
|
||||||
|
'Downloading additional video metadata', fatal=False) or {}).get('video') or {}
|
||||||
|
show = video.get('show') or {}
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
'title': title,
|
'title': title,
|
||||||
'description': strip_or_none(metas.get('summary') or video_info.get('resume')),
|
'description': strip_or_none(metas.get('summary') or video.get('summary')),
|
||||||
'thumbnail': video_info.get('image'),
|
'thumbnail': video_info.get('image') or player.get('image'),
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
'subtitles': self.extract_subtitles(sub_path, video_id),
|
'subtitles': self.extract_subtitles(sub_url, video_id),
|
||||||
'episode': metas.get('subtitle') or video_info.get('videoTitle'),
|
'episode': metas.get('subtitle') or video.get('name'),
|
||||||
'series': video_info.get('playlistTitle'),
|
'episode_number': int_or_none(video.get('shortNumber')),
|
||||||
|
'series': show.get('title'),
|
||||||
|
'season_number': int_or_none(video.get('season')),
|
||||||
|
'duration': int_or_none(video_info.get('duration') or video.get('duration')),
|
||||||
|
'release_date': unified_strdate(video.get('releaseDate')),
|
||||||
|
'average_rating': float_or_none(video.get('rating') or metas.get('rating')),
|
||||||
|
'comment_count': int_or_none(video.get('commentsCount')),
|
||||||
}
|
}
|
||||||
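Reviewer note on the adn.py hunk above: the new links request is wrapped in a `for ... else` loop that retries on HTTP 401 with fresh random padding and gives up after three attempts. Below is a minimal sketch of that control flow only. `Unauthorized` and `fetch_with_retries` are hypothetical names introduced for illustration, and `fetch` stands in for the padded, encrypted request.

```python
class Unauthorized(Exception):
    """Hypothetical stand-in for an HTTP 401 response."""


def fetch_with_retries(fetch, attempts=3):
    for _ in range(attempts):
        try:
            result = fetch()
            break  # success: skips the else clause below
        except Unauthorized:
            continue  # retry; the real code re-pads the message each time
    else:
        # Runs only when the loop finished without hitting break.
        raise RuntimeError('Giving up retrying')
    return result
```

The `else` branch belongs to the `for`, not to the `try`, so it fires only when every attempt failed. Any other exception propagates immediately, which mirrors how the hunk re-raises non-HTTP errors and reports other status codes.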
|
@ -5,20 +5,32 @@ import re
|
|||||||
|
|
||||||
from .theplatform import ThePlatformIE
|
from .theplatform import ThePlatformIE
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
extract_attributes,
|
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
|
GeoRestrictedError,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
smuggle_url,
|
|
||||||
update_url_query,
|
update_url_query,
|
||||||
)
|
urlencode_postdata,
|
||||||
from ..compat import (
|
|
||||||
compat_urlparse,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class AENetworksBaseIE(ThePlatformIE):
|
class AENetworksBaseIE(ThePlatformIE):
|
||||||
|
_BASE_URL_REGEX = r'''(?x)https?://
|
||||||
|
(?:(?:www|play|watch)\.)?
|
||||||
|
(?P<domain>
|
||||||
|
(?:history(?:vault)?|aetv|mylifetime|lifetimemovieclub)\.com|
|
||||||
|
fyi\.tv
|
||||||
|
)/'''
|
||||||
_THEPLATFORM_KEY = 'crazyjava'
|
_THEPLATFORM_KEY = 'crazyjava'
|
||||||
_THEPLATFORM_SECRET = 's3cr3t'
|
_THEPLATFORM_SECRET = 's3cr3t'
|
||||||
|
_DOMAIN_MAP = {
|
||||||
|
'history.com': ('HISTORY', 'history'),
|
||||||
|
'aetv.com': ('AETV', 'aetv'),
|
||||||
|
'mylifetime.com': ('LIFETIME', 'lifetime'),
|
||||||
|
'lifetimemovieclub.com': ('LIFETIMEMOVIECLUB', 'lmc'),
|
||||||
|
'fyi.tv': ('FYI', 'fyi'),
|
||||||
|
'historyvault.com': (None, 'historyvault'),
|
||||||
|
'biography.com': (None, 'biography'),
|
||||||
|
}
|
||||||
|
|
||||||
def _extract_aen_smil(self, smil_url, video_id, auth=None):
|
def _extract_aen_smil(self, smil_url, video_id, auth=None):
|
||||||
query = {'mbr': 'true'}
|
query = {'mbr': 'true'}
|
||||||
@ -31,7 +43,7 @@ class AENetworksBaseIE(ThePlatformIE):
|
|||||||
'assetTypes': 'high_video_s3'
|
'assetTypes': 'high_video_s3'
|
||||||
}, {
|
}, {
|
||||||
'assetTypes': 'high_video_s3',
|
'assetTypes': 'high_video_s3',
|
||||||
'switch': 'hls_ingest_fastly'
|
'switch': 'hls_high_fastly',
|
||||||
}]
|
}]
|
||||||
formats = []
|
formats = []
|
||||||
subtitles = {}
|
subtitles = {}
|
||||||
@ -44,6 +56,8 @@ class AENetworksBaseIE(ThePlatformIE):
|
|||||||
tp_formats, tp_subtitles = self._extract_theplatform_smil(
|
tp_formats, tp_subtitles = self._extract_theplatform_smil(
|
||||||
m_url, video_id, 'Downloading %s SMIL data' % (q.get('switch') or q['assetTypes']))
|
m_url, video_id, 'Downloading %s SMIL data' % (q.get('switch') or q['assetTypes']))
|
||||||
except ExtractorError as e:
|
except ExtractorError as e:
|
||||||
|
if isinstance(e, GeoRestrictedError):
|
||||||
|
raise
|
||||||
last_e = e
|
last_e = e
|
||||||
continue
|
continue
|
||||||
formats.extend(tp_formats)
|
formats.extend(tp_formats)
|
||||||
@ -57,24 +71,45 @@ class AENetworksBaseIE(ThePlatformIE):
|
|||||||
'subtitles': subtitles,
|
'subtitles': subtitles,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
def _extract_aetn_info(self, domain, filter_key, filter_value, url):
|
||||||
|
requestor_id, brand = self._DOMAIN_MAP[domain]
|
||||||
|
result = self._download_json(
|
||||||
|
'https://feeds.video.aetnd.com/api/v2/%s/videos' % brand,
|
||||||
|
filter_value, query={'filter[%s]' % filter_key: filter_value})['results'][0]
|
||||||
|
title = result['title']
|
||||||
|
video_id = result['id']
|
||||||
|
media_url = result['publicUrl']
|
||||||
|
theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
|
||||||
|
r'https?://link\.theplatform\.com/s/([^?]+)', media_url, 'theplatform_path'), video_id)
|
||||||
|
info = self._parse_theplatform_metadata(theplatform_metadata)
|
||||||
|
auth = None
|
||||||
|
if theplatform_metadata.get('AETN$isBehindWall'):
|
||||||
|
resource = self._get_mvpd_resource(
|
||||||
|
requestor_id, theplatform_metadata['title'],
|
||||||
|
theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'),
|
||||||
|
theplatform_metadata['ratings'][0]['rating'])
|
||||||
|
auth = self._extract_mvpd_auth(
|
||||||
|
url, video_id, requestor_id, resource)
|
||||||
|
info.update(self._extract_aen_smil(media_url, video_id, auth))
|
||||||
|
info.update({
|
||||||
|
'title': title,
|
||||||
|
'series': result.get('seriesName'),
|
||||||
|
'season_number': int_or_none(result.get('tvSeasonNumber')),
|
||||||
|
'episode_number': int_or_none(result.get('tvSeasonEpisodeNumber')),
|
||||||
|
})
|
||||||
|
return info
|
||||||
|
|
||||||
|
|
||||||
class AENetworksIE(AENetworksBaseIE):
|
class AENetworksIE(AENetworksBaseIE):
|
||||||
IE_NAME = 'aenetworks'
|
IE_NAME = 'aenetworks'
|
||||||
IE_DESC = 'A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault'
|
IE_DESC = 'A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault'
|
||||||
_VALID_URL = r'''(?x)
|
_VALID_URL = AENetworksBaseIE._BASE_URL_REGEX + r'''(?P<id>
|
||||||
https?://
|
shows/[^/]+/season-\d+/episode-\d+|
|
||||||
(?:www\.)?
|
(?:
|
||||||
(?P<domain>
|
(?:movie|special)s/[^/]+|
|
||||||
(?:history(?:vault)?|aetv|mylifetime|lifetimemovieclub)\.com|
|
(?:shows/[^/]+/)?videos
|
||||||
fyi\.tv
|
)/[^/?#&]+
|
||||||
)/
|
)'''
|
||||||
(?:
|
|
||||||
shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|
|
|
||||||
movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?|
|
|
||||||
specials/(?P<special_display_id>[^/]+)/(?:full-special|preview-)|
|
|
||||||
collections/[^/]+/(?P<collection_display_id>[^/]+)
|
|
||||||
)
|
|
||||||
'''
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
|
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -91,22 +126,23 @@ class AENetworksIE(AENetworksBaseIE):
|
|||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
'add_ie': ['ThePlatform'],
|
'add_ie': ['ThePlatform'],
|
||||||
}, {
|
'skip': 'This video is only available for users of participating TV providers.',
|
||||||
'url': 'http://www.history.com/shows/ancient-aliens/season-1',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '71889446852',
|
|
||||||
},
|
|
||||||
'playlist_mincount': 5,
|
|
||||||
}, {
|
|
||||||
'url': 'http://www.mylifetime.com/shows/atlanta-plastic',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'SERIES4317',
|
|
||||||
'title': 'Atlanta Plastic',
|
|
||||||
},
|
|
||||||
'playlist_mincount': 2,
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.aetv.com/shows/duck-dynasty/season-9/episode-1',
|
'url': 'http://www.aetv.com/shows/duck-dynasty/season-9/episode-1',
|
||||||
'only_matching': True
|
'info_dict': {
|
||||||
|
'id': '600587331957',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Inlawful Entry',
|
||||||
|
'description': 'md5:57c12115a2b384d883fe64ca50529e08',
|
||||||
|
'timestamp': 1452634428,
|
||||||
|
'upload_date': '20160112',
|
||||||
|
'uploader': 'AENE-NEW',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
# m3u8 download
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
|
'add_ie': ['ThePlatform'],
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.fyi.tv/shows/tiny-house-nation/season-1/episode-8',
|
'url': 'http://www.fyi.tv/shows/tiny-house-nation/season-1/episode-8',
|
||||||
'only_matching': True
|
'only_matching': True
|
||||||
@ -117,78 +153,125 @@ class AENetworksIE(AENetworksBaseIE):
|
|||||||
'url': 'http://www.mylifetime.com/movies/center-stage-on-pointe/full-movie',
|
'url': 'http://www.mylifetime.com/movies/center-stage-on-pointe/full-movie',
|
||||||
'only_matching': True
|
'only_matching': True
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.lifetimemovieclub.com/movies/a-killer-among-us',
|
'url': 'https://watch.lifetimemovieclub.com/movies/10-year-reunion/full-movie',
|
||||||
'only_matching': True
|
'only_matching': True
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.history.com/specials/sniper-into-the-kill-zone/full-special',
|
'url': 'http://www.history.com/specials/sniper-into-the-kill-zone/full-special',
|
||||||
'only_matching': True
|
'only_matching': True
|
||||||
}, {
|
|
||||||
'url': 'https://www.historyvault.com/collections/america-the-story-of-us/westward',
|
|
||||||
'only_matching': True
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.aetv.com/specials/hunting-jonbenets-killer-the-untold-story/preview-hunting-jonbenets-killer-the-untold-story',
|
'url': 'https://www.aetv.com/specials/hunting-jonbenets-killer-the-untold-story/preview-hunting-jonbenets-killer-the-untold-story',
|
||||||
'only_matching': True
|
'only_matching': True
|
||||||
|
}, {
|
||||||
|
'url': 'http://www.history.com/videos/history-of-valentines-day',
|
||||||
|
'only_matching': True
|
||||||
|
}, {
|
||||||
|
'url': 'https://play.aetv.com/shows/duck-dynasty/videos/best-of-duck-dynasty-getting-quack-in-shape',
|
||||||
|
'only_matching': True
|
||||||
}]
|
}]
|
||||||
_DOMAIN_TO_REQUESTOR_ID = {
|
|
||||||
'history.com': 'HISTORY',
|
|
||||||
'aetv.com': 'AETV',
|
|
||||||
'mylifetime.com': 'LIFETIME',
|
|
||||||
'lifetimemovieclub.com': 'LIFETIMEMOVIECLUB',
|
|
||||||
'fyi.tv': 'FYI',
|
|
||||||
}
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
domain, show_path, movie_display_id, special_display_id, collection_display_id = re.match(self._VALID_URL, url).groups()
|
domain, canonical = re.match(self._VALID_URL, url).groups()
|
||||||
display_id = show_path or movie_display_id or special_display_id or collection_display_id
|
return self._extract_aetn_info(domain, 'canonical', '/' + canonical, url)
|
||||||
webpage = self._download_webpage(url, display_id, headers=self.geo_verification_headers())
|
|
||||||
if show_path:
|
|
||||||
url_parts = show_path.split('/')
|
|
||||||
url_parts_len = len(url_parts)
|
|
||||||
if url_parts_len == 1:
|
|
||||||
entries = []
|
|
||||||
for season_url_path in re.findall(r'(?s)<li[^>]+data-href="(/shows/%s/season-\d+)"' % url_parts[0], webpage):
|
|
||||||
entries.append(self.url_result(
|
|
||||||
compat_urlparse.urljoin(url, season_url_path), 'AENetworks'))
|
|
||||||
if entries:
|
|
||||||
return self.playlist_result(
|
|
||||||
entries, self._html_search_meta('aetn:SeriesId', webpage),
|
|
||||||
self._html_search_meta('aetn:SeriesTitle', webpage))
|
|
||||||
else:
|
|
||||||
# single season
|
|
||||||
url_parts_len = 2
|
|
||||||
if url_parts_len == 2:
|
|
||||||
entries = []
|
|
||||||
for episode_item in re.findall(r'(?s)<[^>]+class="[^"]*(?:episode|program)-item[^"]*"[^>]*>', webpage):
|
|
||||||
episode_attributes = extract_attributes(episode_item)
|
|
||||||
episode_url = compat_urlparse.urljoin(
|
|
||||||
url, episode_attributes['data-canonical'])
|
|
||||||
entries.append(self.url_result(
|
|
||||||
episode_url, 'AENetworks',
|
|
||||||
episode_attributes.get('data-videoid') or episode_attributes.get('data-video-id')))
|
|
||||||
return self.playlist_result(
|
|
||||||
entries, self._html_search_meta('aetn:SeasonId', webpage))
|
|
||||||
|
|
||||||
video_id = self._html_search_meta('aetn:VideoID', webpage)
|
|
||||||
media_url = self._search_regex(
|
class AENetworksListBaseIE(AENetworksBaseIE):
|
||||||
[r"media_url\s*=\s*'(?P<url>[^']+)'",
|
def _call_api(self, resource, slug, brand, fields):
|
||||||
r'data-media-url=(?P<url>(?:https?:)?//[^\s>]+)',
|
return self._download_json(
|
||||||
r'data-media-url=(["\'])(?P<url>(?:(?!\1).)+?)\1'],
|
'https://yoga.appsvcs.aetnd.com/graphql',
|
||||||
webpage, 'video url', group='url')
|
slug, query={'brand': brand}, data=urlencode_postdata({
|
||||||
theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
|
'query': '''{
|
||||||
r'https?://link\.theplatform\.com/s/([^?]+)', media_url, 'theplatform_path'), video_id)
|
%s(slug: "%s") {
|
||||||
info = self._parse_theplatform_metadata(theplatform_metadata)
|
%s
|
||||||
auth = None
|
}
|
||||||
if theplatform_metadata.get('AETN$isBehindWall'):
|
}''' % (resource, slug, fields),
|
||||||
requestor_id = self._DOMAIN_TO_REQUESTOR_ID[domain]
|
}))['data'][resource]
|
||||||
resource = self._get_mvpd_resource(
|
|
||||||
requestor_id, theplatform_metadata['title'],
|
def _real_extract(self, url):
|
||||||
theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'),
|
domain, slug = re.match(self._VALID_URL, url).groups()
|
||||||
theplatform_metadata['ratings'][0]['rating'])
|
_, brand = self._DOMAIN_MAP[domain]
|
||||||
auth = self._extract_mvpd_auth(
|
playlist = self._call_api(self._RESOURCE, slug, brand, self._FIELDS)
|
||||||
url, video_id, requestor_id, resource)
|
base_url = 'http://watch.%s' % domain
|
||||||
info.update(self._search_json_ld(webpage, video_id, fatal=False))
|
|
||||||
info.update(self._extract_aen_smil(media_url, video_id, auth))
|
entries = []
|
||||||
return info
|
for item in (playlist.get(self._ITEMS_KEY) or []):
|
||||||
|
doc = self._get_doc(item)
|
||||||
|
canonical = doc.get('canonical')
|
||||||
|
if not canonical:
|
||||||
|
continue
|
||||||
|
entries.append(self.url_result(
|
||||||
|
base_url + canonical, AENetworksIE.ie_key(), doc.get('id')))
|
||||||
|
|
||||||
|
description = None
|
||||||
|
if self._PLAYLIST_DESCRIPTION_KEY:
|
||||||
|
description = playlist.get(self._PLAYLIST_DESCRIPTION_KEY)
|
||||||
|
|
||||||
|
return self.playlist_result(
|
||||||
|
entries, playlist.get('id'),
|
||||||
|
playlist.get(self._PLAYLIST_TITLE_KEY), description)
|
||||||
|
|
||||||
|
|
||||||
|
class AENetworksCollectionIE(AENetworksListBaseIE):
|
||||||
|
IE_NAME = 'aenetworks:collection'
|
||||||
|
_VALID_URL = AENetworksBaseIE._BASE_URL_REGEX + r'(?:[^/]+/)*(?:list|collections)/(?P<id>[^/?#&]+)/?(?:[?#&]|$)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://watch.historyvault.com/list/america-the-story-of-us',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '282',
|
||||||
|
'title': 'America The Story of Us',
|
||||||
|
},
|
||||||
|
'playlist_mincount': 12,
|
||||||
|
}, {
|
||||||
|
'url': 'https://watch.historyvault.com/shows/america-the-story-of-us-2/season-1/list/america-the-story-of-us',
|
||||||
|
'only_matching': True
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.historyvault.com/collections/mysteryquest',
|
||||||
|
'only_matching': True
|
||||||
|
}]
|
||||||
|
_RESOURCE = 'list'
|
||||||
|
_ITEMS_KEY = 'items'
|
||||||
|
_PLAYLIST_TITLE_KEY = 'display_title'
|
||||||
|
_PLAYLIST_DESCRIPTION_KEY = None
|
||||||
|
_FIELDS = '''id
|
||||||
|
display_title
|
||||||
|
items {
|
||||||
|
... on ListVideoItem {
|
||||||
|
doc {
|
||||||
|
canonical
|
||||||
|
id
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}'''
|
||||||
|
|
||||||
|
def _get_doc(self, item):
|
||||||
|
return item.get('doc') or {}
|
||||||
|
|
||||||
|
|
||||||
|
class AENetworksShowIE(AENetworksListBaseIE):
|
||||||
|
IE_NAME = 'aenetworks:show'
|
||||||
|
_VALID_URL = AENetworksBaseIE._BASE_URL_REGEX + r'shows/(?P<id>[^/?#&]+)/?(?:[?#&]|$)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'http://www.history.com/shows/ancient-aliens',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'SERIES1574',
|
||||||
|
'title': 'Ancient Aliens',
|
||||||
|
'description': 'md5:3f6d74daf2672ff3ae29ed732e37ea7f',
|
||||||
|
},
|
||||||
|
'playlist_mincount': 150,
|
||||||
|
}]
|
||||||
|
_RESOURCE = 'series'
|
||||||
|
_ITEMS_KEY = 'episodes'
|
||||||
|
_PLAYLIST_TITLE_KEY = 'title'
|
||||||
|
_PLAYLIST_DESCRIPTION_KEY = 'description'
|
||||||
|
_FIELDS = '''description
|
||||||
|
id
|
||||||
|
title
|
||||||
|
episodes {
|
||||||
|
canonical
|
||||||
|
id
|
||||||
|
}'''
|
||||||
|
|
||||||
|
def _get_doc(self, item):
|
||||||
|
return item
|
||||||
|
|
||||||
|
|
||||||
class HistoryTopicIE(AENetworksBaseIE):
|
class HistoryTopicIE(AENetworksBaseIE):
|
||||||
@@ -204,6 +287,7 @@ class HistoryTopicIE(AENetworksBaseIE):
|
|||||||
'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7',
|
'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7',
|
||||||
'timestamp': 1375819729,
|
'timestamp': 1375819729,
|
||||||
'upload_date': '20130806',
|
'upload_date': '20130806',
|
||||||
|
'uploader': 'AENE-NEW',
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
# m3u8 download
|
# m3u8 download
|
||||||
@@ -212,36 +296,47 @@ class HistoryTopicIE(AENetworksBaseIE):
|
|||||||
'add_ie': ['ThePlatform'],
|
'add_ie': ['ThePlatform'],
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def theplatform_url_result(self, theplatform_url, video_id, query):
|
def _real_extract(self, url):
|
||||||
return {
|
display_id = self._match_id(url)
|
||||||
'_type': 'url_transparent',
|
return self.url_result(
|
||||||
'id': video_id,
|
'http://www.history.com/videos/' + display_id,
|
||||||
'url': smuggle_url(
|
AENetworksIE.ie_key())
|
||||||
update_url_query(theplatform_url, query),
|
|
||||||
{
|
|
||||||
'sig': {
|
class HistoryPlayerIE(AENetworksBaseIE):
|
||||||
'key': self._THEPLATFORM_KEY,
|
IE_NAME = 'history:player'
|
||||||
'secret': self._THEPLATFORM_SECRET,
|
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|biography)\.com)/player/(?P<id>\d+)'
|
||||||
},
|
_TESTS = []
|
||||||
'force_smil_url': True
|
|
||||||
}),
|
def _real_extract(self, url):
|
||||||
'ie_key': 'ThePlatform',
|
domain, video_id = re.match(self._VALID_URL, url).groups()
|
||||||
}
|
return self._extract_aetn_info(domain, 'id', video_id, url)
|
||||||
|
|
||||||
|
|
||||||
|
class BiographyIE(AENetworksBaseIE):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?biography\.com/video/(?P<id>[^/?#&]+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://www.biography.com/video/vincent-van-gogh-full-episode-2075049808',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '30322987',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Vincent Van Gogh - Full Episode',
|
||||||
|
'description': 'A full biography about the most influential 20th century painter, Vincent Van Gogh.',
|
||||||
|
'timestamp': 1311970571,
|
||||||
|
'upload_date': '20110729',
|
||||||
|
'uploader': 'AENE-NEW',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
# m3u8 download
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
|
'add_ie': ['ThePlatform'],
|
||||||
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url)
|
display_id = self._match_id(url)
|
||||||
webpage = self._download_webpage(url, display_id)
|
webpage = self._download_webpage(url, display_id)
|
||||||
video_id = self._search_regex(
|
player_url = self._search_regex(
|
||||||
r'<phoenix-iframe[^>]+src="[^"]+\btpid=(\d+)', webpage, 'tpid')
|
r'<phoenix-iframe[^>]+src="(%s)' % HistoryPlayerIE._VALID_URL,
|
||||||
result = self._download_json(
|
webpage, 'player URL')
|
||||||
'https://feeds.video.aetnd.com/api/v2/history/videos',
|
return self.url_result(player_url, HistoryPlayerIE.ie_key())
|
||||||
video_id, query={'filter[id]': video_id})['results'][0]
|
|
||||||
title = result['title']
|
|
||||||
info = self._extract_aen_smil(result['publicUrl'], video_id)
|
|
||||||
info.update({
|
|
||||||
'title': title,
|
|
||||||
'description': result.get('description'),
|
|
||||||
'duration': int_or_none(result.get('duration')),
|
|
||||||
'timestamp': int_or_none(result.get('added'), 1000),
|
|
||||||
})
|
|
||||||
return info
|
|
||||||
|
@@ -1,13 +1,16 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import json
|
||||||
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
|
|
||||||
|
|
||||||
class AlJazeeraIE(InfoExtractor):
|
class AlJazeeraIE(InfoExtractor):
|
||||||
_VALID_URL = r'https?://(?:www\.)?aljazeera\.com/(?:programmes|video)/.*?/(?P<id>[^/]+)\.html'
|
_VALID_URL = r'https?://(?:www\.)?aljazeera\.com/(?P<type>program/[^/]+|(?:feature|video)s)/\d{4}/\d{1,2}/\d{1,2}/(?P<id>[^/?&#]+)'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html',
|
'url': 'https://www.aljazeera.com/program/episode/2014/9/19/deliverance',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '3792260579001',
|
'id': '3792260579001',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
@@ -20,14 +23,34 @@ class AlJazeeraIE(InfoExtractor):
|
|||||||
'add_ie': ['BrightcoveNew'],
|
'add_ie': ['BrightcoveNew'],
|
||||||
'skip': 'Not accessible from Travis CI server',
|
'skip': 'Not accessible from Travis CI server',
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.aljazeera.com/video/news/2017/05/sierra-leone-709-carat-diamond-auctioned-170511100111930.html',
|
'url': 'https://www.aljazeera.com/videos/2017/5/11/sierra-leone-709-carat-diamond-to-be-auctioned-off',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.aljazeera.com/features/2017/8/21/transforming-pakistans-buses-into-art',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/665003303001/default_default/index.html?videoId=%s'
|
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_default/index.html?videoId=%s'
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
program_name = self._match_id(url)
|
post_type, name = re.match(self._VALID_URL, url).groups()
|
||||||
webpage = self._download_webpage(url, program_name)
|
post_type = {
|
||||||
brightcove_id = self._search_regex(
|
'features': 'post',
|
||||||
r'RenderPagesVideo\(\'(.+?)\'', webpage, 'brightcove id')
|
'program': 'episode',
|
||||||
return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
|
'videos': 'video',
|
||||||
|
}[post_type.split('/')[0]]
|
||||||
|
video = self._download_json(
|
||||||
|
'https://www.aljazeera.com/graphql', name, query={
|
||||||
|
'operationName': 'SingleArticleQuery',
|
||||||
|
'variables': json.dumps({
|
||||||
|
'name': name,
|
||||||
|
'postType': post_type,
|
||||||
|
}),
|
||||||
|
}, headers={
|
||||||
|
'wp-site': 'aje',
|
||||||
|
})['data']['article']['video']
|
||||||
|
video_id = video['id']
|
||||||
|
account_id = video.get('accountId') or '665003303001'
|
||||||
|
player_id = video.get('playerId') or 'BkeSH5BDb'
|
||||||
|
return self.url_result(
|
||||||
|
self.BRIGHTCOVE_URL_TEMPLATE % (account_id, player_id, video_id),
|
||||||
|
'BrightcoveNew', video_id)
|
||||||
|
@@ -1,6 +1,8 @@
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import re
|
||||||
|
|
||||||
from .theplatform import ThePlatformIE
|
from .theplatform import ThePlatformIE
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
int_or_none,
|
int_or_none,
|
||||||
@@ -11,25 +13,22 @@ from ..utils import (
|
|||||||
|
|
||||||
|
|
||||||
class AMCNetworksIE(ThePlatformIE):
|
class AMCNetworksIE(ThePlatformIE):
|
||||||
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|(?:we|sundance)tv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
|
_VALID_URL = r'https?://(?:www\.)?(?P<site>amc|bbcamerica|ifc|(?:we|sundance)tv)\.com/(?P<id>(?:movies|shows(?:/[^/]+)+)/[^/?#&]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
|
'url': 'https://www.bbcamerica.com/shows/the-graham-norton-show/videos/tina-feys-adorable-airline-themed-family-dinner--51631',
|
||||||
'md5': '',
|
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 's3MX01Nl4vPH',
|
'id': '4Lq1dzOnZGt0',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Maron - Season 4 - Step 1',
|
'title': "The Graham Norton Show - Season 28 - Tina Fey's Adorable Airline-Themed Family Dinner",
|
||||||
'description': 'In denial about his current situation, Marc is reluctantly convinced by his friends to enter rehab. Starring Marc Maron and Constance Zimmer.',
|
'description': "It turns out child stewardesses are very generous with the wine! All-new episodes of 'The Graham Norton Show' premiere Fridays at 11/10c on BBC America.",
|
||||||
'age_limit': 17,
|
'upload_date': '20201120',
|
||||||
'upload_date': '20160505',
|
'timestamp': 1605904350,
|
||||||
'timestamp': 1462468831,
|
|
||||||
'uploader': 'AMCN',
|
'uploader': 'AMCN',
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
# m3u8 download
|
# m3u8 download
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
'skip': 'Requires TV provider accounts',
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.bbcamerica.com/shows/the-hunt/full-episodes/season-1/episode-01-the-hardest-challenge',
|
'url': 'http://www.bbcamerica.com/shows/the-hunt/full-episodes/season-1/episode-01-the-hardest-challenge',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@@ -55,32 +54,34 @@ class AMCNetworksIE(ThePlatformIE):
|
|||||||
'url': 'https://www.sundancetv.com/shows/riviera/full-episodes/season-1/episode-01-episode-1',
|
'url': 'https://www.sundancetv.com/shows/riviera/full-episodes/season-1/episode-01-episode-1',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
_REQUESTOR_ID_MAP = {
|
||||||
|
'amc': 'AMC',
|
||||||
|
'bbcamerica': 'BBCA',
|
||||||
|
'ifc': 'IFC',
|
||||||
|
'sundancetv': 'SUNDANCE',
|
||||||
|
'wetv': 'WETV',
|
||||||
|
}
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url)
|
site, display_id = re.match(self._VALID_URL, url).groups()
|
||||||
webpage = self._download_webpage(url, display_id)
|
requestor_id = self._REQUESTOR_ID_MAP[site]
|
||||||
|
properties = self._download_json(
|
||||||
|
'https://content-delivery-gw.svc.ds.amcn.com/api/v2/content/amcn/%s/url/%s' % (requestor_id.lower(), display_id),
|
||||||
|
display_id)['data']['properties']
|
||||||
query = {
|
query = {
|
||||||
'mbr': 'true',
|
'mbr': 'true',
|
||||||
'manifest': 'm3u',
|
'manifest': 'm3u',
|
||||||
}
|
}
|
||||||
media_url = self._search_regex(
|
tp_path = 'M_UwQC/media/' + properties['videoPid']
|
||||||
r'window\.platformLinkURL\s*=\s*[\'"]([^\'"]+)',
|
media_url = 'https://link.theplatform.com/s/' + tp_path
|
||||||
webpage, 'media url')
|
theplatform_metadata = self._download_theplatform_metadata(tp_path, display_id)
|
||||||
theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
|
|
||||||
r'link\.theplatform\.com/s/([^?]+)',
|
|
||||||
media_url, 'theplatform_path'), display_id)
|
|
||||||
info = self._parse_theplatform_metadata(theplatform_metadata)
|
info = self._parse_theplatform_metadata(theplatform_metadata)
|
||||||
video_id = theplatform_metadata['pid']
|
video_id = theplatform_metadata['pid']
|
||||||
title = theplatform_metadata['title']
|
title = theplatform_metadata['title']
|
||||||
rating = try_get(
|
rating = try_get(
|
||||||
theplatform_metadata, lambda x: x['ratings'][0]['rating'])
|
theplatform_metadata, lambda x: x['ratings'][0]['rating'])
|
||||||
auth_required = self._search_regex(
|
video_category = properties.get('videoCategory')
|
||||||
r'window\.authRequired\s*=\s*(true|false);',
|
if video_category and video_category.endswith('-Auth'):
|
||||||
webpage, 'auth required')
|
|
||||||
if auth_required == 'true':
|
|
||||||
requestor_id = self._search_regex(
|
|
||||||
r'window\.requestor_id\s*=\s*[\'"]([^\'"]+)',
|
|
||||||
webpage, 'requestor id')
|
|
||||||
resource = self._get_mvpd_resource(
|
resource = self._get_mvpd_resource(
|
||||||
requestor_id, title, video_id, rating)
|
requestor_id, title, video_id, rating)
|
||||||
query['auth'] = self._extract_mvpd_auth(
|
query['auth'] = self._extract_mvpd_auth(
|
||||||
|
@@ -1,82 +1,159 @@
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import json
|
||||||
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
clean_html,
|
clean_html,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
js_to_json,
|
|
||||||
try_get,
|
try_get,
|
||||||
unified_strdate,
|
unified_strdate,
|
||||||
|
unified_timestamp,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class AmericasTestKitchenIE(InfoExtractor):
|
class AmericasTestKitchenIE(InfoExtractor):
|
||||||
_VALID_URL = r'https?://(?:www\.)?americastestkitchen\.com/(?:episode|videos)/(?P<id>\d+)'
|
_VALID_URL = r'https?://(?:www\.)?(?:americastestkitchen|cooks(?:country|illustrated))\.com/(?P<resource_type>episode|videos)/(?P<id>\d+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://www.americastestkitchen.com/episode/582-weeknight-japanese-suppers',
|
'url': 'https://www.americastestkitchen.com/episode/582-weeknight-japanese-suppers',
|
||||||
'md5': 'b861c3e365ac38ad319cfd509c30577f',
|
'md5': 'b861c3e365ac38ad319cfd509c30577f',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '5b400b9ee338f922cb06450c',
|
'id': '5b400b9ee338f922cb06450c',
|
||||||
'title': 'Weeknight Japanese Suppers',
|
'title': 'Japanese Suppers',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'description': 'md5:3d0c1a44bb3b27607ce82652db25b4a8',
|
'description': 'md5:64e606bfee910627efc4b5f050de92b3',
|
||||||
'thumbnail': r're:^https?://',
|
'thumbnail': r're:^https?://',
|
||||||
'timestamp': 1523664000,
|
'timestamp': 1523318400,
|
||||||
'upload_date': '20180414',
|
'upload_date': '20180410',
|
||||||
'release_date': '20180414',
|
'release_date': '20180410',
|
||||||
'series': "America's Test Kitchen",
|
'series': "America's Test Kitchen",
|
||||||
'season_number': 18,
|
'season_number': 18,
|
||||||
'episode': 'Weeknight Japanese Suppers',
|
'episode': 'Japanese Suppers',
|
||||||
'episode_number': 15,
|
'episode_number': 15,
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
|
}, {
|
||||||
|
# Metadata parsing behaves differently for newer episodes (705) as opposed to older episodes (582 above)
|
||||||
|
'url': 'https://www.americastestkitchen.com/episode/705-simple-chicken-dinner',
|
||||||
|
'md5': '06451608c57651e985a498e69cec17e5',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '5fbe8c61bda2010001c6763b',
|
||||||
|
'title': 'Simple Chicken Dinner',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'description': 'md5:eb68737cc2fd4c26ca7db30139d109e7',
|
||||||
|
'thumbnail': r're:^https?://',
|
||||||
|
'timestamp': 1610755200,
|
||||||
|
'upload_date': '20210116',
|
||||||
|
'release_date': '20210116',
|
||||||
|
'series': "America's Test Kitchen",
|
||||||
|
'season_number': 21,
|
||||||
|
'episode': 'Simple Chicken Dinner',
|
||||||
|
'episode_number': 3,
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.americastestkitchen.com/videos/3420-pan-seared-salmon',
|
'url': 'https://www.americastestkitchen.com/videos/3420-pan-seared-salmon',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.cookscountry.com/episode/564-when-only-chocolate-will-do',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.cooksillustrated.com/videos/4478-beef-wellington',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
resource_type, video_id = re.match(self._VALID_URL, url).groups()
|
||||||
|
is_episode = resource_type == 'episode'
|
||||||
|
if is_episode:
|
||||||
|
resource_type = 'episodes'
|
||||||
|
|
||||||
webpage = self._download_webpage(url, video_id)
|
resource = self._download_json(
|
||||||
|
'https://www.americastestkitchen.com/api/v6/%s/%s' % (resource_type, video_id), video_id)
|
||||||
video_data = self._parse_json(
|
video = resource['video'] if is_episode else resource
|
||||||
self._search_regex(
|
episode = resource if is_episode else resource.get('episode') or {}
|
||||||
r'window\.__INITIAL_STATE__\s*=\s*({.+?})\s*;\s*</script>',
|
|
||||||
webpage, 'initial context'),
|
|
||||||
video_id, js_to_json)
|
|
||||||
|
|
||||||
ep_data = try_get(
|
|
||||||
video_data,
|
|
||||||
(lambda x: x['episodeDetail']['content']['data'],
|
|
||||||
lambda x: x['videoDetail']['content']['data']), dict)
|
|
||||||
ep_meta = ep_data.get('full_video', {})
|
|
||||||
|
|
||||||
zype_id = ep_data.get('zype_id') or ep_meta['zype_id']
|
|
||||||
|
|
||||||
title = ep_data.get('title') or ep_meta.get('title')
|
|
||||||
description = clean_html(ep_meta.get('episode_description') or ep_data.get(
|
|
||||||
'description') or ep_meta.get('description'))
|
|
||||||
thumbnail = try_get(ep_meta, lambda x: x['photo']['image_url'])
|
|
||||||
release_date = unified_strdate(ep_data.get('aired_at'))
|
|
||||||
|
|
||||||
season_number = int_or_none(ep_meta.get('season_number'))
|
|
||||||
episode = ep_meta.get('title')
|
|
||||||
episode_number = int_or_none(ep_meta.get('episode_number'))
|
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'_type': 'url_transparent',
|
'_type': 'url_transparent',
|
||||||
'url': 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % zype_id,
|
'url': 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % video['zypeId'],
|
||||||
'ie_key': 'Zype',
|
'ie_key': 'Zype',
|
||||||
'title': title,
|
'description': clean_html(video.get('description')),
|
||||||
'description': description,
|
'timestamp': unified_timestamp(video.get('publishDate')),
|
||||||
'thumbnail': thumbnail,
|
'release_date': unified_strdate(video.get('publishDate')),
|
||||||
'release_date': release_date,
|
'episode_number': int_or_none(episode.get('number')),
|
||||||
'series': "America's Test Kitchen",
|
'season_number': int_or_none(episode.get('season')),
|
||||||
'season_number': season_number,
|
'series': try_get(episode, lambda x: x['show']['title']),
|
||||||
'episode': episode,
|
'episode': episode.get('title'),
|
||||||
'episode_number': episode_number,
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
class AmericasTestKitchenSeasonIE(InfoExtractor):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?(?P<show>americastestkitchen|cookscountry)\.com/episodes/browse/season_(?P<id>\d+)'
|
||||||
|
_TESTS = [{
|
||||||
|
# ATK Season
|
||||||
|
'url': 'https://www.americastestkitchen.com/episodes/browse/season_1',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'season_1',
|
||||||
|
'title': 'Season 1',
|
||||||
|
},
|
||||||
|
'playlist_count': 13,
|
||||||
|
}, {
|
||||||
|
# Cooks Country Season
|
||||||
|
'url': 'https://www.cookscountry.com/episodes/browse/season_12',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'season_12',
|
||||||
|
'title': 'Season 12',
|
||||||
|
},
|
||||||
|
'playlist_count': 13,
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
show_name, season_number = re.match(self._VALID_URL, url).groups()
|
||||||
|
season_number = int(season_number)
|
||||||
|
|
||||||
|
slug = 'atk' if show_name == 'americastestkitchen' else 'cco'
|
||||||
|
|
||||||
|
season = 'Season %d' % season_number
|
||||||
|
|
||||||
|
season_search = self._download_json(
|
||||||
|
'https://y1fnzxui30-dsn.algolia.net/1/indexes/everest_search_%s_season_desc_production' % slug,
|
||||||
|
season, headers={
|
||||||
|
'Origin': 'https://www.%s.com' % show_name,
|
||||||
|
'X-Algolia-API-Key': '8d504d0099ed27c1b73708d22871d805',
|
||||||
|
'X-Algolia-Application-Id': 'Y1FNZXUI30',
|
||||||
|
}, query={
|
||||||
|
'facetFilters': json.dumps([
|
||||||
|
'search_season_list:' + season,
|
||||||
|
'search_document_klass:episode',
|
||||||
|
'search_show_slug:' + slug,
|
||||||
|
]),
|
||||||
|
'attributesToRetrieve': 'description,search_%s_episode_number,search_document_date,search_url,title' % slug,
|
||||||
|
'attributesToHighlight': '',
|
||||||
|
'hitsPerPage': 1000,
|
||||||
|
})
|
||||||
|
|
||||||
|
def entries():
|
||||||
|
for episode in (season_search.get('hits') or []):
|
||||||
|
search_url = episode.get('search_url')
|
||||||
|
if not search_url:
|
||||||
|
continue
|
||||||
|
yield {
|
||||||
|
'_type': 'url',
|
||||||
|
'url': 'https://www.%s.com%s' % (show_name, search_url),
|
||||||
|
'id': try_get(episode, lambda e: e['objectID'].split('_')[-1]),
|
||||||
|
'title': episode.get('title'),
|
||||||
|
'description': episode.get('description'),
|
||||||
|
'timestamp': unified_timestamp(episode.get('search_document_date')),
|
||||||
|
'season_number': season_number,
|
||||||
|
'episode_number': int_or_none(episode.get('search_%s_episode_number' % slug)),
|
||||||
|
'ie_key': AmericasTestKitchenIE.ie_key(),
|
||||||
|
}
|
||||||
|
|
||||||
|
return self.playlist_result(
|
||||||
|
entries(), 'season_%d' % season_number, season)
|
||||||
|
@@ -8,6 +8,7 @@ from ..utils import (
|
|||||||
int_or_none,
|
int_or_none,
|
||||||
mimetype2ext,
|
mimetype2ext,
|
||||||
parse_iso8601,
|
parse_iso8601,
|
||||||
|
unified_timestamp,
|
||||||
url_or_none,
|
url_or_none,
|
||||||
)
|
)
|
||||||
|
|
||||||
@@ -88,7 +89,7 @@ class AMPIE(InfoExtractor):
|
|||||||
|
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
timestamp = parse_iso8601(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
|
timestamp = unified_timestamp(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
|
@@ -116,8 +116,6 @@ class AnimeOnDemandIE(InfoExtractor):
|
|||||||
r'(?s)<div[^>]+itemprop="description"[^>]*>(.+?)</div>',
|
r'(?s)<div[^>]+itemprop="description"[^>]*>(.+?)</div>',
|
||||||
webpage, 'anime description', default=None)
|
webpage, 'anime description', default=None)
|
||||||
|
|
||||||
entries = []
|
|
||||||
|
|
||||||
def extract_info(html, video_id, num=None):
|
def extract_info(html, video_id, num=None):
|
||||||
title, description = [None] * 2
|
title, description = [None] * 2
|
||||||
formats = []
|
formats = []
|
||||||
@@ -233,7 +231,7 @@ class AnimeOnDemandIE(InfoExtractor):
|
|||||||
self._sort_formats(info['formats'])
|
self._sort_formats(info['formats'])
|
||||||
f = common_info.copy()
|
f = common_info.copy()
|
||||||
f.update(info)
|
f.update(info)
|
||||||
entries.append(f)
|
yield f
|
||||||
|
|
||||||
# Extract teaser/trailer only when full episode is not available
|
# Extract teaser/trailer only when full episode is not available
|
||||||
if not info['formats']:
|
if not info['formats']:
|
||||||
@@ -247,7 +245,7 @@ class AnimeOnDemandIE(InfoExtractor):
|
|||||||
'title': m.group('title'),
|
'title': m.group('title'),
|
||||||
'url': urljoin(url, m.group('href')),
|
'url': urljoin(url, m.group('href')),
|
||||||
})
|
})
|
||||||
entries.append(f)
|
yield f
|
||||||
|
|
||||||
def extract_episodes(html):
|
def extract_episodes(html):
|
||||||
for num, episode_html in enumerate(re.findall(
|
for num, episode_html in enumerate(re.findall(
|
||||||
@@ -275,7 +273,8 @@ class AnimeOnDemandIE(InfoExtractor):
|
|||||||
'episode_number': episode_number,
|
'episode_number': episode_number,
|
||||||
}
|
}
|
||||||
|
|
||||||
extract_entries(episode_html, video_id, common_info)
|
for e in extract_entries(episode_html, video_id, common_info):
|
||||||
|
yield e
|
||||||
|
|
||||||
def extract_film(html, video_id):
|
def extract_film(html, video_id):
|
||||||
common_info = {
|
common_info = {
|
||||||
@@ -283,11 +282,18 @@ class AnimeOnDemandIE(InfoExtractor):
|
|||||||
'title': anime_title,
|
'title': anime_title,
|
||||||
'description': anime_description,
|
'description': anime_description,
|
||||||
}
|
}
|
||||||
extract_entries(html, video_id, common_info)
|
for e in extract_entries(html, video_id, common_info):
|
||||||
|
yield e
|
||||||
|
|
||||||
extract_episodes(webpage)
|
def entries():
|
||||||
|
has_episodes = False
|
||||||
|
for e in extract_episodes(webpage):
|
||||||
|
has_episodes = True
|
||||||
|
yield e
|
||||||
|
|
||||||
if not entries:
|
if not has_episodes:
|
||||||
extract_film(webpage, anime_id)
|
for e in extract_film(webpage, anime_id):
|
||||||
|
yield e
|
||||||
|
|
||||||
return self.playlist_result(entries, anime_id, anime_title, anime_description)
|
return self.playlist_result(
|
||||||
|
entries(), anime_id, anime_title, anime_description)
|
||||||
|
@@ -116,7 +116,76 @@ class AnvatoIE(InfoExtractor):
|
|||||||
'anvato_scripps_app_ios_prod_409c41960c60b308db43c3cc1da79cab9f1c3d93': 'WPxj5GraLTkYCyj3M7RozLqIycjrXOEcDGFMIJPn',
|
'anvato_scripps_app_ios_prod_409c41960c60b308db43c3cc1da79cab9f1c3d93': 'WPxj5GraLTkYCyj3M7RozLqIycjrXOEcDGFMIJPn',
|
||||||
'EZqvRyKBJLrgpClDPDF8I7Xpdp40Vx73': '4OxGd2dEakylntVKjKF0UK9PDPYB6A9W',
|
'EZqvRyKBJLrgpClDPDF8I7Xpdp40Vx73': '4OxGd2dEakylntVKjKF0UK9PDPYB6A9W',
|
||||||
'M2v78QkpleXm9hPp9jUXI63x5vA6BogR': 'ka6K32k7ZALmpINkjJUGUo0OE42Md1BQ',
|
'M2v78QkpleXm9hPp9jUXI63x5vA6BogR': 'ka6K32k7ZALmpINkjJUGUo0OE42Md1BQ',
|
||||||
'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6_secure': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ'
|
'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6_secure': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ',
|
||||||
|
'X8POa4zPPaKVZHqmWjuEzfP31b1QM9VN': 'Dn5vOY9ooDw7VSl9qztjZI5o0g08mA0z',
|
||||||
|
'M2v78QkBMpNJlSPp9diX5F2PBmBy6Bog': 'ka6K32kyo7nDZfNkjQCGWf1lpApXMd1B',
|
||||||
|
'bvJ0dQpav07l0hG5JgfVLF2dv1vARwpP': 'BzoQW24GrJZoJfmNodiJKSPeB9B8NOxj',
|
||||||
|
'lxQMLg2XZKuEZaWgsqubBxV9INZ6bryY': 'Vm2Mx6noKds9jB71h6urazwlTG3m9x8l',
|
||||||
|
'04EnjvXeoSmkbJ9ckPs7oY0mcxv7PlyN': 'aXERQP9LMfQVlEDsgGs6eEA1SWznAQ8P',
|
||||||
|
'mQbO2ge6BFRWVPYCYpU06YvNt80XLvAX': 'E2BV1NGmasN5v7eujECVPJgwflnLPm2A',
|
||||||
|
'g43oeBzJrCml7o6fa5fRL1ErCdeD8z4K': 'RX34mZ6zVH4Nr6whbxIGLv9WSbxEKo8V',
|
||||||
|
'VQrDJoP7mtdBzkxhXbSPwGB1coeElk4x': 'j2VejQx0VFKQepAF7dI0mJLKtOVJE18z',
|
||||||
|
'WxA5NzLRjCrmq0NUgaU5pdMDuZO7RJ4w': 'lyY5ADLKaIOLEgAsGQCveEMAcqnx3rY9',
|
||||||
|
'M4lpMXB71ie0PjMCjdFzVXq0SeRVqz49': 'n2zVkOqaLIv3GbLfBjcwW51LcveWOZ2e',
|
||||||
|
'dyDZGEqN8u8nkJZcJns0oxYmtP7KbGAn': 'VXOEqQW9BtEVLajfZQSLEqxgS5B7qn2D',
|
||||||
|
'E7QNjrVY5u5mGvgu67IoDgV1CjEND8QR': 'rz8AaDmdKIkLmPNhB5ILPJnjS5PnlL8d',
|
||||||
|
'a4zrqjoKlfzg0dwHEWtP31VqcLBpjm4g': 'LY9J16gwETdGWa3hjBu5o0RzuoQDjqXQ',
|
||||||
|
'dQP5BZroMsMVLO1hbmT5r2Enu86GjxA6': '7XR3oOdbPF6x3PRFLDCq9RkgsRjAo48V',
|
||||||
|
'M4lKNBO1NFe0PjMCj1tzVXq0SeRVqzA9': 'n2zoRqGLRUv3GbLfBmTwW51LcveWOZYe',
|
||||||
|
'nAZ7MZdpGCGg1pqFEbsoJOz2C60mv143': 'dYJgdqA9aT4yojETqGi7yNgoFADxqmXP',
|
||||||
|
'3y1MERYgOuE9NzbFgwhV6Wv2F0YKvbyz': '081xpZDQgC4VadLTavhWQxrku56DAgXV',
|
||||||
|
'bmQvmEXr5HWklBMCZOcpE2Z3HBYwqGyl': 'zxXPbVNyMiMAZldhr9FkOmA0fl4aKr2v',
|
||||||
|
'wA7oDNYldfr6050Hwxi52lPZiVlB86Ap': 'ZYK16aA7ni0d3l3c34uwpxD7CbReMm8Q',
|
||||||
|
'g43MbKMWmFml7o7sJoSRkXxZiXRvJ3QK': 'RX3oBJonvs4Nr6rUWBCGn3matRGqJPXV',
|
||||||
|
'mA9VdlqpLS0raGaSDvtoqNrBTzb8XY4q': '0XN4OjBD3fnW7r7IbmtJB4AyfOmlrE2r',
|
||||||
|
'mAajOwgkGt17oGoFmEuklMP9H0GnW54d': 'lXbBLPGyzikNGeGujAuAJGjZiwLRxyXR',
|
||||||
|
'vy8vjJ9kbUwrRqRu59Cj5dWZfzYErlAb': 'K8l7gpwaGcBpnAnCLNCmPZRdin3eaQX0',
|
||||||
|
'xQMWBpR8oHEZaWaSMGUb0avOHjLVYn4Y': 'm2MrN4vEaf9jB7BFy5Srb40jTrN67AYl',
|
||||||
|
'xyKEmVO3miRr6D6UVkt7oB8jtD6aJEAv': 'g2ddDebqDfqdgKgswyUKwGjbTWwzq923',
|
||||||
|
'7Qk0wa2D9FjKapacoJF27aLvUDKkLGA0': 'b2kgBEkephJaMkMTL7s1PLe4Ua6WyP2P',
|
||||||
|
'3QLg6nqmNTJ5VvVTo7f508LPidz1xwyY': 'g2L1GgpraipmAOAUqmIbBnPxHOmw4MYa',
|
||||||
|
'3y1B7zZjXTE9NZNSzZSVNPZaTNLjo6Qz': '081b5G6wzH4VagaURmcWbN5mT4JGEe2V',
|
||||||
|
'lAqnwvkw6SG6D8DSqmUg6DRLUp0w3G4x': 'O2pbP0xPDFNJjpjIEvcdryOJtpkVM4X5',
|
||||||
|
'awA7xd1N0Hr6050Hw2c52lPZiVlB864p': 'GZYKpn4aoT0d3l3c3PiwpxD7CbReMmXQ',
|
||||||
|
'jQVqPLl9YHL1WGWtR1HDgWBGT63qRNyV': '6X03ne6vrU4oWyWUN7tQVoajikxJR3Ye',
|
||||||
|
'GQRMR8mL7uZK797t7xH3eNzPIP5dOny1': 'm2vqPWGd4U31zWzSyasDRAoMT1PKRp8o',
|
||||||
|
'zydq9RdmRhXLkNkfNoTJlMzaF0lWekQB': '3X7LnvE7vH5nkEkSqLiey793Un7dLB8e',
|
||||||
|
'VQrDzwkB2IdBzjzu9MHPbEYkSB50gR4x': 'j2VebLzoKUKQeEesmVh0gM1eIp9jKz8z',
|
||||||
|
'mAa2wMamBs17oGoFmktklMP9H0GnW54d': 'lXbgP74xZTkNGeGujVUAJGjZiwLRxy8R',
|
||||||
|
'7yjB6ZLG6sW8R6RF2xcan1KGfJ5dNoyd': 'wXQkPorvPHZ45N5t4Jf6qwg5Tp4xvw29',
|
||||||
|
'a4zPpNeWGuzg0m0iX3tPeanGSkRKWXQg': 'LY9oa3QAyHdGW9Wu3Ri5JGeEik7l1N8Q',
|
||||||
|
'k2rneA2M38k25cXDwwSknTJlxPxQLZ6M': '61lyA2aEVDzklfdwmmh31saPxQx2VRjp',
|
||||||
|
'bK9Zk4OvPnvxduLgxvi8VUeojnjA02eV': 'o5jANYjbeMb4nfBaQvcLAt1jzLzYx6ze',
|
||||||
|
'5VD6EydM3R9orHmNMGInGCJwbxbQvGRw': 'w3zjmX7g4vnxzCxElvUEOiewkokXprkZ',
|
||||||
|
'70X35QbVYVYNPUmP9YfbzI06YqYQk2R1': 'vG4Aj2BMjMjoztB7zeFOnCVPJpJ8lMOa',
|
||||||
|
'26qYwQVG9p1Bks2GgBckjfDJOXOAMgG1': 'r4ev9X0mv5zqJc0yk5IBDcQOwZw8mnwQ',
|
||||||
|
'rvVKpA56MBXWlSxMw3cobT5pdkd4Dm7q': '1J7ZkY53pZ645c93owcLZuveE7E8B3rL',
|
||||||
|
'qN1zdy1zlYL23IWZGWtDvfV6WeWQWkJo': 'qN1zdy1zlYL23IWZGWtDvfV6WeWQWkJo',
|
||||||
|
'jdKqRGF16dKsBviMDae7IGDl7oTjEbVV': 'Q09l7vhlNxPFErIOK6BVCe7KnwUW5DVV',
|
||||||
|
'3QLkogW1OUJ5VvPsrDH56DY2u7lgZWyY': 'g2LRE1V9espmAOPhE4ubj4ZdUA57yDXa',
|
||||||
|
'wyJvWbXGBSdbkEzhv0CW8meou82aqRy8': 'M2wolPvyBIpQGkbT4juedD4ruzQGdK2y',
|
||||||
|
'7QkdZrzEkFjKap6IYDU2PB0oCNZORmA0': 'b2kN1l96qhJaMkPs9dt1lpjBfwqZoA8P',
|
||||||
|
'pvA05113MHG1w3JTYxc6DVlRCjErVz4O': 'gQXeAbblBUnDJ7vujbHvbRd1cxlz3AXO',
|
||||||
|
'mA9blJDZwT0raG1cvkuoeVjLC7ZWd54q': '0XN9jRPwMHnW7rvumgfJZOD9CJgVkWYr',
|
||||||
|
'5QwRN5qKJTvGKlDTmnf7xwNZcjRmvEy9': 'R2GP6LWBJU1QlnytwGt0B9pytWwAdDYy',
|
||||||
|
'eyn5rPPbkfw2KYxH32fG1q58CbLJzM40': 'p2gyqooZnS56JWeiDgfmOy1VugOQEBXn',
|
||||||
|
'3BABn3b5RfPJGDwilbHe7l82uBoR05Am': '7OYZG7KMVhbPdKJS3xcWEN3AuDlLNmXj',
|
||||||
|
'xA5zNGXD3HrmqMlF6OS5pdMDuZO7RJ4w': 'yY5DAm6r1IOLE3BCVMFveEMAcqnx3r29',
|
||||||
|
'g43PgW3JZfml7o6fDEURL1ErCdeD8zyK': 'RX3aQn1zrS4Nr6whDgCGLv9WSbxEKo2V',
|
||||||
|
'lAqp8WbGgiG6D8LTKJcg3O72CDdre1Qx': 'O2pnm6473HNJjpKuVosd3vVeh975yrX5',
|
||||||
|
'wyJbYEDxKSdbkJ6S6RhW8meou82aqRy8': 'M2wPm7EgRSpQGlAh70CedD4ruzQGdKYy',
|
||||||
|
'M4lgW28nLCe0PVdtaXszVXq0SeRVqzA9': 'n2zmJvg4jHv3G0ETNgiwW51LcveWOZ8e',
|
||||||
|
'5Qw3OVvp9FvGKlDTmOC7xwNZcjRmvEQ9': 'R2GzDdml9F1Qlnytw9s0B9pytWwAdD8y',
|
||||||
|
'vy8a98X7zCwrRqbHrLUjYzwDiK2b70Qb': 'K8lVwzyjZiBpnAaSGeUmnAgxuGOBxmY0',
|
||||||
|
'g4eGjJLLoiqRD3Pf9oT5O03LuNbLRDQp': '6XqD59zzpfN4EwQuaGt67qNpSyRBlnYy',
|
||||||
|
'g43OPp9boIml7o6fDOIRL1ErCdeD8z4K': 'RX33alNB4s4Nr6whDPUGLv9WSbxEKoXV',
|
||||||
|
'xA2ng9OkBcGKzDbTkKsJlx7dUK8R3dA5': 'z2aPnJvzBfObkwGC3vFaPxeBhxoMqZ8K',
|
||||||
|
'xyKEgBajZuRr6DEC0Kt7XpD1cnNW9gAv': 'g2ddlEBvRsqdgKaI4jUK9PrgfMexGZ23',
|
||||||
|
'BAogww51jIMa2JnH1BcYpXM5F658RNAL': 'rYWDmm0KptlkGv4FGJFMdZmjs9RDE6XR',
|
||||||
|
'BAokpg62VtMa2JnH1mHYpXM5F658RNAL': 'rYWryDnlNslkGv4FG4HMdZmjs9RDE62R',
|
||||||
|
'a4z1Px5e2hzg0m0iMMCPeanGSkRKWXAg': 'LY9eorNQGUdGW9WuKKf5JGeEik7l1NYQ',
|
||||||
|
'kAx69R58kF9nY5YcdecJdl2pFXP53WyX': 'gXyRxELpbfPvLeLSaRil0mp6UEzbZJ8L',
|
||||||
|
'BAoY13nwViMa2J2uo2cY6BlETgmdwryL': 'rYWwKzJmNFlkGvGtNoUM9bzwIJVzB1YR',
|
||||||
}
|
}
|
||||||
|
|
||||||
_MCP_TO_ACCESS_KEY_TABLE = {
|
_MCP_TO_ACCESS_KEY_TABLE = {
|
||||||
@ -189,19 +258,17 @@ class AnvatoIE(InfoExtractor):
|
|||||||
|
|
||||||
video_data_url += '&X-Anvato-Adst-Auth=' + base64.b64encode(auth_secret).decode('ascii')
|
video_data_url += '&X-Anvato-Adst-Auth=' + base64.b64encode(auth_secret).decode('ascii')
|
||||||
anvrid = md5_text(time.time() * 1000 * random.random())[:30]
|
anvrid = md5_text(time.time() * 1000 * random.random())[:30]
|
||||||
payload = {
|
api = {
|
||||||
'api': {
|
'anvrid': anvrid,
|
||||||
'anvrid': anvrid,
|
'anvts': server_time,
|
||||||
'anvstk': md5_text('%s|%s|%d|%s' % (
|
|
||||||
access_key, anvrid, server_time,
|
|
||||||
self._ANVACK_TABLE.get(access_key, self._API_KEY))),
|
|
||||||
'anvts': server_time,
|
|
||||||
},
|
|
||||||
}
|
}
|
||||||
|
api['anvstk'] = md5_text('%s|%s|%d|%s' % (
|
||||||
|
access_key, anvrid, server_time,
|
||||||
|
self._ANVACK_TABLE.get(access_key, self._API_KEY)))
|
||||||
|
|
||||||
return self._download_json(
|
return self._download_json(
|
||||||
video_data_url, video_id, transform_source=strip_jsonp,
|
video_data_url, video_id, transform_source=strip_jsonp,
|
||||||
data=json.dumps(payload).encode('utf-8'))
|
data=json.dumps({'api': api}).encode('utf-8'))
|
||||||
|
|
||||||
def _get_anvato_videos(self, access_key, video_id):
|
def _get_anvato_videos(self, access_key, video_id):
|
||||||
video_data = self._get_video_json(access_key, video_id)
|
video_data = self._get_video_json(access_key, video_id)
|
||||||
@ -259,7 +326,7 @@ class AnvatoIE(InfoExtractor):
|
|||||||
'description': video_data.get('def_description'),
|
'description': video_data.get('def_description'),
|
||||||
'tags': video_data.get('def_tags', '').split(','),
|
'tags': video_data.get('def_tags', '').split(','),
|
||||||
'categories': video_data.get('categories'),
|
'categories': video_data.get('categories'),
|
||||||
'thumbnail': video_data.get('thumbnail'),
|
'thumbnail': video_data.get('src_image_url') or video_data.get('thumbnail'),
|
||||||
'timestamp': int_or_none(video_data.get(
|
'timestamp': int_or_none(video_data.get(
|
||||||
'ts_published') or video_data.get('ts_added')),
|
'ts_published') or video_data.get('ts_added')),
|
||||||
'uploader': video_data.get('mcp_id'),
|
'uploader': video_data.get('mcp_id'),
|
||||||
|
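Review note on the Anvato hunk above: the refactor builds the `api` dict first and adds `anvstk` afterwards, so the token is derived from the same `anvrid` and `server_time` that are sent. A minimal standalone sketch of that signing step follows. It assumes `md5_text` hashes the text form of its argument; `build_api_payload` and its `secret` parameter are hypothetical stand-ins for the extractor method and the `_ANVACK_TABLE` lookup.

```python
import hashlib
import random
import time


def md5_text(value):
    # Assumed stand-in for youtube-dl's md5_text: MD5 of the text form of the value.
    return hashlib.md5(str(value).encode('utf-8')).hexdigest()


def build_api_payload(access_key, server_time, secret, anvrid=None):
    # `secret` is hypothetical here; the diff resolves it via
    # self._ANVACK_TABLE.get(access_key, self._API_KEY).
    if anvrid is None:
        # Same shape as the diff: a 30-character pseudo-random request id.
        anvrid = md5_text(time.time() * 1000 * random.random())[:30]
    api = {
        'anvrid': anvrid,
        'anvts': server_time,
    }
    # The token signs access key, request id, timestamp and secret, pipe-separated.
    api['anvstk'] = md5_text('%s|%s|%d|%s' % (
        access_key, anvrid, server_time, secret))
    return {'api': api}
```

The returned dict has the same shape as the `{'api': api}` body that the new code serialises with `json.dumps`.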
@ -3,7 +3,7 @@ from __future__ import unicode_literals
|
|||||||
|
|
||||||
import re
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .yahoo import YahooIE
|
||||||
from ..compat import (
|
from ..compat import (
|
||||||
compat_parse_qs,
|
compat_parse_qs,
|
||||||
compat_urllib_parse_urlparse,
|
compat_urllib_parse_urlparse,
|
||||||
@ -15,9 +15,9 @@ from ..utils import (
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class AolIE(InfoExtractor):
|
class AolIE(YahooIE):
|
||||||
IE_NAME = 'aol.com'
|
IE_NAME = 'aol.com'
|
||||||
_VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>[0-9a-f]+)'
|
_VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>\d{9}|[0-9a-f]{24}|[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12})'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
# video with 5min ID
|
# video with 5min ID
|
||||||
@ -76,10 +76,16 @@ class AolIE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.aol.jp/video/playlist/5a28e936a1334d000137da0c/5a28f3151e642219fde19831/',
|
'url': 'https://www.aol.jp/video/playlist/5a28e936a1334d000137da0c/5a28f3151e642219fde19831/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# Yahoo video
|
||||||
|
'url': 'https://www.aol.com/video/play/991e6700-ac02-11ea-99ff-357400036f61/24bbc846-3e30-3c46-915e-fe8ccd7fcc46/',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
|
if '-' in video_id:
|
||||||
|
return self._extract_yahoo_video(video_id, 'us')
|
||||||
|
|
||||||
response = self._download_json(
|
response = self._download_json(
|
||||||
'https://feedapi.b2c.on.aol.com/v1.0/app/videos/aolon/%s/details' % video_id,
|
'https://feedapi.b2c.on.aol.com/v1.0/app/videos/aolon/%s/details' % video_id,
|
||||||
|
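Review note on the AOL hunk above: the tightened `_VALID_URL` accepts three ID shapes (nine digits, 24 hex characters, or a UUID), and `_real_extract` sends IDs containing a hyphen to the Yahoo code path. The sketch below reuses the regex from the diff to show which test URLs take which branch. The helper names `match_id` and `uses_yahoo_path` are hypothetical and not part of the extractor.

```python
import re

# Pattern copied from the new side of the diff.
VALID_URL = (
    r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)'
    r'(?P<id>\d{9}|[0-9a-f]{24}|[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12})')


def match_id(url):
    # Hypothetical helper: return the captured ID, or None if the URL does not match.
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


def uses_yahoo_path(video_id):
    # Mirrors the `if '-' in video_id` check: only UUID-style IDs contain hyphens.
    return '-' in video_id
```

With the test URLs from the diff, the `/video/play/<uuid>/<uuid>/` URL yields the second UUID and takes the Yahoo branch, while the `aol.jp` playlist URL yields a 24-character hex ID and stays on the AOL feed API.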
@ -3,6 +3,7 @@ from __future__ import unicode_literals
|
|||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
|
get_element_by_id,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
merge_dicts,
|
merge_dicts,
|
||||||
mimetype2ext,
|
mimetype2ext,
|
||||||
@ -39,23 +40,15 @@ class AparatIE(InfoExtractor):
|
|||||||
webpage = self._download_webpage(url, video_id, fatal=False)
|
webpage = self._download_webpage(url, video_id, fatal=False)
|
||||||
|
|
||||||
if not webpage:
|
if not webpage:
|
||||||
# Note: There is an easier-to-parse configuration at
|
|
||||||
# http://www.aparat.com/video/video/config/videohash/%video_id
|
|
||||||
# but the URL in there does not work
|
|
||||||
webpage = self._download_webpage(
|
webpage = self._download_webpage(
|
||||||
'http://www.aparat.com/video/video/embed/vt/frame/showvideo/yes/videohash/' + video_id,
|
'http://www.aparat.com/video/video/embed/vt/frame/showvideo/yes/videohash/' + video_id,
|
||||||
video_id)
|
video_id)
|
||||||
|
|
||||||
options = self._parse_json(
|
options = self._parse_json(self._search_regex(
|
||||||
self._search_regex(
|
r'options\s*=\s*({.+?})\s*;', webpage, 'options'), video_id)
|
||||||
r'options\s*=\s*JSON\.parse\(\s*(["\'])(?P<value>(?:(?!\1).)+)\1\s*\)',
|
|
||||||
webpage, 'options', group='value'),
|
|
||||||
video_id)
|
|
||||||
|
|
||||||
player = options['plugins']['sabaPlayerPlugin']
|
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
for sources in player['multiSRC']:
|
for sources in (options.get('multiSRC') or []):
|
||||||
for item in sources:
|
for item in sources:
|
||||||
if not isinstance(item, dict):
|
if not isinstance(item, dict):
|
||||||
continue
|
continue
|
||||||
@ -85,11 +78,12 @@ class AparatIE(InfoExtractor):
|
|||||||
info = self._search_json_ld(webpage, video_id, default={})
|
info = self._search_json_ld(webpage, video_id, default={})
|
||||||
|
|
||||||
if not info.get('title'):
|
if not info.get('title'):
|
||||||
info['title'] = player['title']
|
info['title'] = get_element_by_id('videoTitle', webpage) or \
|
||||||
|
self._html_search_meta(['og:title', 'twitter:title', 'DC.Title', 'title'], webpage, fatal=True)
|
||||||
|
|
||||||
return merge_dicts(info, {
|
return merge_dicts(info, {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
'thumbnail': url_or_none(options.get('poster')),
|
'thumbnail': url_or_none(options.get('poster')),
|
||||||
'duration': int_or_none(player.get('duration')),
|
'duration': int_or_none(options.get('duration')),
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
})
|
})
|
||||||
|
61
youtube_dl/extractor/applepodcasts.py
Normal file
@ -0,0 +1,61 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
from ..utils import (
|
||||||
|
clean_podcast_url,
|
||||||
|
int_or_none,
|
||||||
|
parse_iso8601,
|
||||||
|
try_get,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class ApplePodcastsIE(InfoExtractor):
|
||||||
|
_VALID_URL = r'https?://podcasts\.apple\.com/(?:[^/]+/)?podcast(?:/[^/]+){1,2}.*?\bi=(?P<id>\d+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://podcasts.apple.com/us/podcast/207-whitney-webb-returns/id1135137367?i=1000482637777',
|
||||||
|
'md5': 'df02e6acb11c10e844946a39e7222b08',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '1000482637777',
|
||||||
|
'ext': 'mp3',
|
||||||
|
'title': '207 - Whitney Webb Returns',
|
||||||
|
'description': 'md5:13a73bade02d2e43737751e3987e1399',
|
||||||
|
'upload_date': '20200705',
|
||||||
|
'timestamp': 1593921600,
|
||||||
|
'duration': 6425,
|
||||||
|
'series': 'The Tim Dillon Show',
|
||||||
|
}
|
||||||
|
}, {
|
||||||
|
'url': 'https://podcasts.apple.com/podcast/207-whitney-webb-returns/id1135137367?i=1000482637777',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://podcasts.apple.com/podcast/207-whitney-webb-returns?i=1000482637777',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://podcasts.apple.com/podcast/id1135137367?i=1000482637777',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
episode_id = self._match_id(url)
|
||||||
|
webpage = self._download_webpage(url, episode_id)
|
||||||
|
ember_data = self._parse_json(self._search_regex(
|
||||||
|
r'id="shoebox-ember-data-store"[^>]*>\s*({.+?})\s*<',
|
||||||
|
webpage, 'ember data'), episode_id)
|
||||||
|
episode = ember_data['data']['attributes']
|
||||||
|
description = episode.get('description') or {}
|
||||||
|
|
||||||
|
series = None
|
||||||
|
for inc in (ember_data.get('included') or []):
|
||||||
|
if inc.get('type') == 'media/podcast':
|
||||||
|
series = try_get(inc, lambda x: x['attributes']['name'])
|
||||||
|
|
||||||
|
return {
|
||||||
|
'id': episode_id,
|
||||||
|
'title': episode['name'],
|
||||||
|
'url': clean_podcast_url(episode['assetUrl']),
|
||||||
|
'description': description.get('standard') or description.get('short'),
|
||||||
|
'timestamp': parse_iso8601(episode.get('releaseDateTime')),
|
||||||
|
'duration': int_or_none(episode.get('durationInMilliseconds'), 1000),
|
||||||
|
'series': series,
|
||||||
|
}
|
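Review note on the new Apple Podcasts extractor above: it reads episode metadata from the JSON embedded in the `shoebox-ember-data-store` script element and converts the duration from milliseconds to seconds. The sketch below reproduces that flow with the standard library only. `SAMPLE_PAGE` is made-up illustrative markup, `parse_episode` is a hypothetical name, and the `clean_podcast_url` step from the diff is left out.

```python
import json
import re

# Made-up page fragment for illustration; real pages carry far more data.
SAMPLE_PAGE = (
    '<script type="fastboot/shoebox" id="shoebox-ember-data-store">'
    '{"data": {"attributes": {"name": "Example episode", '
    '"assetUrl": "https://example.com/episode.mp3", '
    '"durationInMilliseconds": 6425000}}, '
    '"included": [{"type": "media/podcast", '
    '"attributes": {"name": "Example show"}}]}'
    '</script>')


def parse_episode(webpage):
    # Same regex as the diff: capture the JSON object inside the data-store element.
    raw = re.search(
        r'id="shoebox-ember-data-store"[^>]*>\s*({.+?})\s*<', webpage).group(1)
    ember_data = json.loads(raw)
    episode = ember_data['data']['attributes']

    # The series name lives on the included 'media/podcast' resource, if present.
    series = None
    for inc in (ember_data.get('included') or []):
        if inc.get('type') == 'media/podcast':
            series = (inc.get('attributes') or {}).get('name')

    duration_ms = episode.get('durationInMilliseconds')
    return {
        'title': episode['name'],
        'url': episode['assetUrl'],
        # Integer division by 1000, like int_or_none(value, 1000) in the diff.
        'duration': duration_ms // 1000 if isinstance(duration_ms, int) else None,
        'series': series,
    }
```

Calling `parse_episode(SAMPLE_PAGE)` returns a dict with the episode title, media URL, duration in seconds and series name.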
@ -2,15 +2,17 @@ from __future__ import unicode_literals
|
|||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
unified_strdate,
|
|
||||||
clean_html,
|
clean_html,
|
||||||
|
extract_attributes,
|
||||||
|
unified_strdate,
|
||||||
|
unified_timestamp,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class ArchiveOrgIE(InfoExtractor):
|
class ArchiveOrgIE(InfoExtractor):
|
||||||
IE_NAME = 'archive.org'
|
IE_NAME = 'archive.org'
|
||||||
IE_DESC = 'archive.org videos'
|
IE_DESC = 'archive.org videos'
|
||||||
_VALID_URL = r'https?://(?:www\.)?archive\.org/(?:details|embed)/(?P<id>[^/?#]+)(?:[?].*)?$'
|
_VALID_URL = r'https?://(?:www\.)?archive\.org/(?:details|embed)/(?P<id>[^/?#&]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://archive.org/details/XD300-23_68HighlightsAResearchCntAugHumanIntellect',
|
'url': 'http://archive.org/details/XD300-23_68HighlightsAResearchCntAugHumanIntellect',
|
||||||
'md5': '8af1d4cf447933ed3c7f4871162602db',
|
'md5': '8af1d4cf447933ed3c7f4871162602db',
|
||||||
@ -19,8 +21,11 @@ class ArchiveOrgIE(InfoExtractor):
|
|||||||
'ext': 'ogg',
|
'ext': 'ogg',
|
||||||
'title': '1968 Demo - FJCC Conference Presentation Reel #1',
|
'title': '1968 Demo - FJCC Conference Presentation Reel #1',
|
||||||
'description': 'md5:da45c349df039f1cc8075268eb1b5c25',
|
'description': 'md5:da45c349df039f1cc8075268eb1b5c25',
|
||||||
'upload_date': '19681210',
|
'creator': 'SRI International',
|
||||||
'uploader': 'SRI International'
|
'release_date': '19681210',
|
||||||
|
'uploader': 'SRI International',
|
||||||
|
'timestamp': 1268695290,
|
||||||
|
'upload_date': '20100315',
|
||||||
}
|
}
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://archive.org/details/Cops1922',
|
'url': 'https://archive.org/details/Cops1922',
|
||||||
@ -29,22 +34,43 @@ class ArchiveOrgIE(InfoExtractor):
|
|||||||
'id': 'Cops1922',
|
'id': 'Cops1922',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Buster Keaton\'s "Cops" (1922)',
|
'title': 'Buster Keaton\'s "Cops" (1922)',
|
||||||
'description': 'md5:89e7c77bf5d965dd5c0372cfb49470f6',
|
'description': 'md5:43a603fd6c5b4b90d12a96b921212b9c',
|
||||||
|
'timestamp': 1387699629,
|
||||||
|
'upload_date': '20131222',
|
||||||
}
|
}
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://archive.org/embed/XD300-23_68HighlightsAResearchCntAugHumanIntellect',
|
'url': 'http://archive.org/embed/XD300-23_68HighlightsAResearchCntAugHumanIntellect',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://archive.org/details/MSNBCW_20131125_040000_To_Catch_a_Predator/',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
webpage = self._download_webpage(
|
webpage = self._download_webpage(
|
||||||
'http://archive.org/embed/' + video_id, video_id)
|
'http://archive.org/embed/' + video_id, video_id)
|
||||||
jwplayer_playlist = self._parse_json(self._search_regex(
|
|
||||||
r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\)",
|
playlist = None
|
||||||
webpage, 'jwplayer playlist'), video_id)
|
play8 = self._search_regex(
|
||||||
info = self._parse_jwplayer_data(
|
r'(<[^>]+\bclass=["\']js-play8-playlist[^>]+>)', webpage,
|
||||||
{'playlist': jwplayer_playlist}, video_id, base_url=url)
|
'playlist', default=None)
|
||||||
|
if play8:
|
||||||
|
attrs = extract_attributes(play8)
|
||||||
|
playlist = attrs.get('value')
|
||||||
|
if not playlist:
|
||||||
|
# Old jwplayer fallback
|
||||||
|
playlist = self._search_regex(
|
||||||
|
r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\)",
|
||||||
|
webpage, 'jwplayer playlist', default='[]')
|
||||||
|
jwplayer_playlist = self._parse_json(playlist, video_id, fatal=False)
|
||||||
|
if jwplayer_playlist:
|
||||||
|
info = self._parse_jwplayer_data(
|
||||||
|
{'playlist': jwplayer_playlist}, video_id, base_url=url)
|
||||||
|
else:
|
||||||
|
# HTML5 media fallback
|
||||||
|
info = self._parse_html5_media_entries(url, webpage, video_id)[0]
|
||||||
|
info['id'] = video_id
|
||||||
|
|
||||||
def get_optional(metadata, field):
|
def get_optional(metadata, field):
|
||||||
return metadata.get(field, [None])[0]
|
return metadata.get(field, [None])[0]
|
||||||
@ -58,8 +84,12 @@ class ArchiveOrgIE(InfoExtractor):
|
|||||||
'description': clean_html(get_optional(metadata, 'description')),
|
'description': clean_html(get_optional(metadata, 'description')),
|
||||||
})
|
})
|
||||||
if info.get('_type') != 'playlist':
|
if info.get('_type') != 'playlist':
|
||||||
|
creator = get_optional(metadata, 'creator')
|
||||||
info.update({
|
info.update({
|
||||||
'uploader': get_optional(metadata, 'creator'),
|
'creator': creator,
|
||||||
'upload_date': unified_strdate(get_optional(metadata, 'date')),
|
'release_date': unified_strdate(get_optional(metadata, 'date')),
|
||||||
|
'uploader': get_optional(metadata, 'publisher') or creator,
|
||||||
|
'timestamp': unified_timestamp(get_optional(metadata, 'publicdate')),
|
||||||
|
'language': get_optional(metadata, 'language'),
|
||||||
})
|
})
|
||||||
return info
|
return info
|
||||||
|
174
youtube_dl/extractor/arcpublishing.py
Normal file
@ -0,0 +1,174 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import re
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
from ..utils import (
|
||||||
|
extract_attributes,
|
||||||
|
int_or_none,
|
||||||
|
parse_iso8601,
|
||||||
|
try_get,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class ArcPublishingIE(InfoExtractor):
|
||||||
|
_UUID_REGEX = r'[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12}'
|
||||||
|
_VALID_URL = r'arcpublishing:(?P<org>[a-z]+):(?P<id>%s)' % _UUID_REGEX
|
||||||
|
_TESTS = [{
|
||||||
|
# https://www.adn.com/politics/2020/11/02/video-senate-candidates-campaign-in-anchorage-on-eve-of-election-day/
|
||||||
|
'url': 'arcpublishing:adn:8c99cb6e-b29c-4bc9-9173-7bf9979225ab',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://www.bostonglobe.com/video/2020/12/30/metro/footage-released-showing-officer-talking-about-striking-protesters-with-car/
|
||||||
|
'url': 'arcpublishing:bostonglobe:232b7ae6-7d73-432d-bc0a-85dbf0119ab1',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://www.actionnewsjax.com/video/live-stream/
|
||||||
|
'url': 'arcpublishing:cmg:cfb1cf1b-3ab5-4d1b-86c5-a5515d311f2a',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://elcomercio.pe/videos/deportes/deporte-total-futbol-peruano-seleccion-peruana-la-valorizacion-de-los-peruanos-en-el-exterior-tras-un-2020-atipico-nnav-vr-video-noticia/
|
||||||
|
'url': 'arcpublishing:elcomercio:27a7e1f8-2ec7-4177-874f-a4feed2885b3',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://www.clickondetroit.com/video/community/2020/05/15/events-surrounding-woodward-dream-cruise-being-canceled/
|
||||||
|
'url': 'arcpublishing:gmg:c8793fb2-8d44-4242-881e-2db31da2d9fe',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://www.wabi.tv/video/2020/12/30/trenton-company-making-equipment-pfizer-covid-vaccine/
|
||||||
|
'url': 'arcpublishing:gray:0b0ba30e-032a-4598-8810-901d70e6033e',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://www.lateja.cr/el-mundo/video-china-aprueba-con-condiciones-su-primera/dfcbfa57-527f-45ff-a69b-35fe71054143/video/
|
||||||
|
'url': 'arcpublishing:gruponacion:dfcbfa57-527f-45ff-a69b-35fe71054143',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://www.fifthdomain.com/video/2018/03/09/is-america-vulnerable-to-a-cyber-attack/
|
||||||
|
'url': 'arcpublishing:mco:aa0ca6fe-1127-46d4-b32c-be0d6fdb8055',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://www.vl.no/kultur/2020/12/09/en-melding-fra-en-lytter-endret-julelista-til-lewi-bergrud/
|
||||||
|
'url': 'arcpublishing:mentormedier:47a12084-650b-4011-bfd0-3699b6947b2d',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://www.14news.com/2020/12/30/whiskey-theft-caught-camera-henderson-liquor-store/
|
||||||
|
'url': 'arcpublishing:raycom:b89f61f8-79fa-4c09-8255-e64237119bf7',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://www.theglobeandmail.com/world/video-ethiopian-woman-who-became-symbol-of-integration-in-italy-killed-on/
|
||||||
|
'url': 'arcpublishing:tgam:411b34c1-8701-4036-9831-26964711664b',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# https://www.pilotonline.com/460f2931-8130-4719-8ea1-ffcb2d7cb685-132.html
|
||||||
|
'url': 'arcpublishing:tronc:460f2931-8130-4719-8ea1-ffcb2d7cb685',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
_POWA_DEFAULTS = [
|
||||||
|
(['cmg', 'prisa'], '%s-config-prod.api.cdn.arcpublishing.com/video'),
|
||||||
|
([
|
||||||
|
'adn', 'advancelocal', 'answers', 'bonnier', 'bostonglobe', 'demo',
|
||||||
|
'gmg', 'gruponacion', 'infobae', 'mco', 'nzme', 'pmn', 'raycom',
|
||||||
|
'spectator', 'tbt', 'tgam', 'tronc', 'wapo', 'wweek',
|
||||||
|
], 'video-api-cdn.%s.arcpublishing.com/api'),
|
||||||
|
]
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _extract_urls(webpage):
|
||||||
|
entries = []
|
||||||
|
# https://arcpublishing.atlassian.net/wiki/spaces/POWA/overview
|
||||||
|
for powa_el in re.findall(r'(<div[^>]+class="[^"]*\bpowa\b[^"]*"[^>]+data-uuid="%s"[^>]*>)' % ArcPublishingIE._UUID_REGEX, webpage):
|
||||||
|
powa = extract_attributes(powa_el) or {}
|
||||||
|
org = powa.get('data-org')
|
||||||
|
uuid = powa.get('data-uuid')
|
||||||
|
if org and uuid:
|
||||||
|
entries.append('arcpublishing:%s:%s' % (org, uuid))
|
||||||
|
return entries
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
org, uuid = re.match(self._VALID_URL, url).groups()
|
||||||
|
for orgs, tmpl in self._POWA_DEFAULTS:
|
||||||
|
if org in orgs:
|
||||||
|
base_api_tmpl = tmpl
|
||||||
|
break
|
||||||
|
else:
|
||||||
|
base_api_tmpl = '%s-prod-cdn.video-api.arcpublishing.com/api'
|
||||||
|
if org == 'wapo':
|
||||||
|
org = 'washpost'
|
||||||
|
video = self._download_json(
|
||||||
|
'https://%s/v1/ansvideos/findByUuid' % (base_api_tmpl % org),
|
||||||
|
uuid, query={'uuid': uuid})[0]
|
||||||
|
title = video['headlines']['basic']
|
||||||
|
is_live = video.get('status') == 'live'
|
||||||
|
|
||||||
|
urls = []
|
||||||
|
formats = []
|
||||||
|
for s in video.get('streams', []):
|
||||||
|
s_url = s.get('url')
|
||||||
|
if not s_url or s_url in urls:
|
||||||
|
continue
|
||||||
|
urls.append(s_url)
|
||||||
|
stream_type = s.get('stream_type')
|
||||||
|
if stream_type == 'smil':
|
||||||
|
smil_formats = self._extract_smil_formats(
|
||||||
|
s_url, uuid, fatal=False)
|
||||||
|
for f in smil_formats:
|
||||||
|
if f['url'].endswith('/cfx/st'):
|
||||||
|
f['app'] = 'cfx/st'
|
||||||
|
if not f['play_path'].startswith('mp4:'):
|
||||||
|
f['play_path'] = 'mp4:' + f['play_path']
|
||||||
|
if isinstance(f['tbr'], float):
|
||||||
|
f['vbr'] = f['tbr'] * 1000
|
||||||
|
del f['tbr']
|
||||||
|
f['format_id'] = 'rtmp-%d' % f['vbr']
|
||||||
|
formats.extend(smil_formats)
|
||||||
|
elif stream_type in ('ts', 'hls'):
|
||||||
|
m3u8_formats = self._extract_m3u8_formats(
|
||||||
|
s_url, uuid, 'mp4', 'm3u8' if is_live else 'm3u8_native',
|
||||||
|
m3u8_id='hls', fatal=False)
|
||||||
|
if all([f.get('acodec') == 'none' for f in m3u8_formats]):
|
||||||
|
continue
|
||||||
|
for f in m3u8_formats:
|
||||||
|
if f.get('acodec') == 'none':
|
||||||
|
f['preference'] = -40
|
||||||
|
elif f.get('vcodec') == 'none':
|
||||||
|
f['preference'] = -50
|
||||||
|
height = f.get('height')
|
||||||
|
if not height:
|
||||||
|
continue
|
||||||
|
vbr = self._search_regex(
|
||||||
|
r'[_x]%d[_-](\d+)' % height, f['url'], 'vbr', default=None)
|
||||||
|
if vbr:
|
||||||
|
f['vbr'] = int(vbr)
|
||||||
|
formats.extend(m3u8_formats)
|
||||||
|
else:
|
||||||
|
vbr = int_or_none(s.get('bitrate'))
|
||||||
|
formats.append({
|
||||||
|
'format_id': '%s-%d' % (stream_type, vbr) if vbr else stream_type,
|
||||||
|
'vbr': vbr,
|
||||||
|
'width': int_or_none(s.get('width')),
|
||||||
|
'height': int_or_none(s.get('height')),
|
||||||
|
'filesize': int_or_none(s.get('filesize')),
|
||||||
|
'url': s_url,
|
||||||
|
'preference': -1,
|
||||||
|
})
|
||||||
|
self._sort_formats(
|
||||||
|
formats, ('preference', 'width', 'height', 'vbr', 'filesize', 'tbr', 'ext', 'format_id'))
|
||||||
|
|
||||||
|
subtitles = {}
|
||||||
|
for subtitle in (try_get(video, lambda x: x['subtitles']['urls'], list) or []):
|
||||||
|
subtitle_url = subtitle.get('url')
|
||||||
|
if subtitle_url:
|
||||||
|
subtitles.setdefault('en', []).append({'url': subtitle_url})
|
||||||
|
|
||||||
|
return {
|
||||||
|
'id': uuid,
|
||||||
|
'title': self._live_title(title) if is_live else title,
|
||||||
|
'thumbnail': try_get(video, lambda x: x['promo_image']['url']),
|
||||||
|
'description': try_get(video, lambda x: x['subheadlines']['basic']),
|
||||||
|
'formats': formats,
|
||||||
|
'duration': int_or_none(video.get('duration'), 100),
|
||||||
|
'timestamp': parse_iso8601(video.get('created_date')),
|
||||||
|
'subtitles': subtitles,
|
||||||
|
'is_live': is_live,
|
||||||
|
}
|
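Review note on the new Arc Publishing extractor above: the API host is chosen with a `for ... else` loop over `_POWA_DEFAULTS`, falling back to a generic template when the organisation is not listed, and `wapo` is renamed to `washpost` before the template is filled in. The sketch below isolates that lookup. The table and templates are copied from the diff, while the function name `find_by_uuid_endpoint` is hypothetical.

```python
# Table copied from the diff: (organisations, host template) pairs.
POWA_DEFAULTS = [
    (['cmg', 'prisa'], '%s-config-prod.api.cdn.arcpublishing.com/video'),
    ([
        'adn', 'advancelocal', 'answers', 'bonnier', 'bostonglobe', 'demo',
        'gmg', 'gruponacion', 'infobae', 'mco', 'nzme', 'pmn', 'raycom',
        'spectator', 'tbt', 'tgam', 'tronc', 'wapo', 'wweek',
    ], 'video-api-cdn.%s.arcpublishing.com/api'),
]


def find_by_uuid_endpoint(org):
    # Pick the first template whose organisation list contains `org`.
    for orgs, tmpl in POWA_DEFAULTS:
        if org in orgs:
            break
    else:
        # The loop finished without a break: use the generic template.
        tmpl = '%s-prod-cdn.video-api.arcpublishing.com/api'
    # The Washington Post is listed as 'wapo' but its host uses 'washpost'.
    if org == 'wapo':
        org = 'washpost'
    return 'https://%s/v1/ansvideos/findByUuid' % (tmpl % org)
```

Note that the rename happens after the lookup, so `wapo` is matched against the table under its short name and only the host is built with `washpost`.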
@ -187,13 +187,13 @@ class ARDMediathekIE(ARDMediathekBaseIE):
|
|||||||
if doc.tag == 'rss':
|
if doc.tag == 'rss':
|
||||||
return GenericIE()._extract_rss(url, video_id, doc)
|
return GenericIE()._extract_rss(url, video_id, doc)
|
||||||
|
|
||||||
title = self._html_search_regex(
|
title = self._og_search_title(webpage, default=None) or self._html_search_regex(
|
||||||
[r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>',
|
[r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>',
|
||||||
r'<meta name="dcterms\.title" content="(.*?)"/>',
|
r'<meta name="dcterms\.title" content="(.*?)"/>',
|
||||||
r'<h4 class="headline">(.*?)</h4>',
|
r'<h4 class="headline">(.*?)</h4>',
|
||||||
r'<title[^>]*>(.*?)</title>'],
|
r'<title[^>]*>(.*?)</title>'],
|
||||||
webpage, 'title')
|
webpage, 'title')
|
||||||
description = self._html_search_meta(
|
description = self._og_search_description(webpage, default=None) or self._html_search_meta(
|
||||||
'dcterms.abstract', webpage, 'description', default=None)
|
'dcterms.abstract', webpage, 'description', default=None)
|
||||||
if description is None:
|
if description is None:
|
||||||
description = self._html_search_meta(
|
description = self._html_search_meta(
|
||||||
@ -249,18 +249,18 @@ class ARDMediathekIE(ARDMediathekBaseIE):
|
|||||||
|
|
||||||
|
|
||||||
class ARDIE(InfoExtractor):
|
class ARDIE(InfoExtractor):
|
||||||
_VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos(?:extern)?/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
|
_VALID_URL = r'(?P<mainurl>https?://(?:www\.)?daserste\.de/[^?#]+/videos(?:extern)?/(?P<display_id>[^/?#]+)-(?:video-?)?(?P<id>[0-9]+))\.html'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
# available till 14.02.2019
|
# available till 7.01.2022
|
||||||
'url': 'http://www.daserste.de/information/talk/maischberger/videos/das-groko-drama-zerlegen-sich-die-volksparteien-video-102.html',
|
'url': 'https://www.daserste.de/information/talk/maischberger/videos/maischberger-die-woche-video100.html',
|
||||||
'md5': '8e4ec85f31be7c7fc08a26cdbc5a1f49',
|
'md5': '867d8aa39eeaf6d76407c5ad1bb0d4c1',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'display_id': 'das-groko-drama-zerlegen-sich-die-volksparteien-video',
|
'display_id': 'maischberger-die-woche',
|
||||||
'id': '102',
|
'id': '100',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'duration': 4435.0,
|
'duration': 3687.0,
|
||||||
'title': 'Das GroKo-Drama: Zerlegen sich die Volksparteien?',
|
'title': 'maischberger. die woche vom 7. Januar 2021',
|
||||||
'upload_date': '20180214',
|
'upload_date': '20210107',
|
||||||
'thumbnail': r're:^https?://.*\.jpg$',
|
'thumbnail': r're:^https?://.*\.jpg$',
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
@ -284,20 +284,42 @@ class ARDIE(InfoExtractor):
|
|||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
for a in video_node.findall('.//asset'):
|
for a in video_node.findall('.//asset'):
|
||||||
|
file_name = xpath_text(a, './fileName', default=None)
|
||||||
|
if not file_name:
|
||||||
|
continue
|
||||||
|
format_type = a.attrib.get('type')
|
||||||
|
format_url = url_or_none(file_name)
|
||||||
|
if format_url:
|
||||||
|
ext = determine_ext(file_name)
|
||||||
|
if ext == 'm3u8':
|
||||||
|
formats.extend(self._extract_m3u8_formats(
|
||||||
|
format_url, display_id, 'mp4', entry_protocol='m3u8_native',
|
||||||
|
m3u8_id=format_type or 'hls', fatal=False))
|
||||||
|
continue
|
||||||
|
elif ext == 'f4m':
|
||||||
|
formats.extend(self._extract_f4m_formats(
|
||||||
|
update_url_query(format_url, {'hdcore': '3.7.0'}),
|
||||||
|
display_id, f4m_id=format_type or 'hds', fatal=False))
|
||||||
|
continue
|
||||||
f = {
|
f = {
|
||||||
'format_id': a.attrib['type'],
|
'format_id': format_type,
|
||||||
'width': int_or_none(a.find('./frameWidth').text),
|
'width': int_or_none(xpath_text(a, './frameWidth')),
|
||||||
'height': int_or_none(a.find('./frameHeight').text),
|
'height': int_or_none(xpath_text(a, './frameHeight')),
|
||||||
'vbr': int_or_none(a.find('./bitrateVideo').text),
|
'vbr': int_or_none(xpath_text(a, './bitrateVideo')),
|
||||||
'abr': int_or_none(a.find('./bitrateAudio').text),
|
'abr': int_or_none(xpath_text(a, './bitrateAudio')),
|
||||||
'vcodec': a.find('./codecVideo').text,
|
'vcodec': xpath_text(a, './codecVideo'),
|
||||||
'tbr': int_or_none(a.find('./totalBitrate').text),
|
'tbr': int_or_none(xpath_text(a, './totalBitrate')),
|
||||||
}
|
}
|
||||||
if a.find('./serverPrefix').text:
|
server_prefix = xpath_text(a, './serverPrefix', default=None)
|
||||||
f['url'] = a.find('./serverPrefix').text
|
if server_prefix:
|
||||||
f['playpath'] = a.find('./fileName').text
|
f.update({
|
||||||
|
'url': server_prefix,
|
||||||
|
'playpath': file_name,
|
||||||
|
})
|
||||||
else:
|
else:
|
||||||
f['url'] = a.find('./fileName').text
|
if not format_url:
|
||||||
|
continue
|
||||||
|
f['url'] = format_url
|
||||||
formats.append(f)
|
formats.append(f)
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
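Review note on the ARDIE hunk above: replacing `a.find('./x').text` with `xpath_text(a, './x')` removes an `AttributeError` whenever an `<asset>` lacks a child element, and assets without a `fileName` are now skipped. The sketch below shows the same defensive pattern with the standard library's `Element.findtext`, which returns `None` for a missing child. It is an analogy, not youtube-dl's `xpath_text`; `asset_to_format` and the local `int_or_none` are hypothetical simplifications.

```python
import xml.etree.ElementTree as ET


def int_or_none(value):
    # Simplified stand-in for youtube-dl's int_or_none.
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def asset_to_format(asset):
    # findtext() returns None when the child is missing, so nothing raises here.
    file_name = asset.findtext('./fileName')
    if not file_name:
        return None  # mirrors the new `continue` for assets without a file name
    return {
        'format_id': asset.get('type'),
        'width': int_or_none(asset.findtext('./frameWidth')),
        'height': int_or_none(asset.findtext('./frameHeight')),
        'vbr': int_or_none(asset.findtext('./bitrateVideo')),
        'url': file_name,
    }
```

An asset that only carries `fileName` and `frameWidth` still produces a format entry, with the missing fields set to `None` instead of aborting extraction.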
@ -315,17 +337,17 @@ class ARDIE(InfoExtractor):
|
|||||||
class ARDBetaMediathekIE(ARDMediathekBaseIE):
|
class ARDBetaMediathekIE(ARDMediathekBaseIE):
|
||||||
_VALID_URL = r'https://(?:(?:beta|www)\.)?ardmediathek\.de/(?P<client>[^/]+)/(?:player|live|video)/(?P<display_id>(?:[^/]+/)*)(?P<video_id>[a-zA-Z0-9]+)'
|
_VALID_URL = r'https://(?:(?:beta|www)\.)?ardmediathek\.de/(?P<client>[^/]+)/(?:player|live|video)/(?P<display_id>(?:[^/]+/)*)(?P<video_id>[a-zA-Z0-9]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://ardmediathek.de/ard/video/die-robuste-roswita/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
|
'url': 'https://www.ardmediathek.de/mdr/video/die-robuste-roswita/Y3JpZDovL21kci5kZS9iZWl0cmFnL2Ntcy84MWMxN2MzZC0wMjkxLTRmMzUtODk4ZS0wYzhlOWQxODE2NGI/',
|
||||||
'md5': 'dfdc87d2e7e09d073d5a80770a9ce88f',
|
'md5': 'a1dc75a39c61601b980648f7c9f9f71d',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'display_id': 'die-robuste-roswita',
|
'display_id': 'die-robuste-roswita',
|
||||||
'id': '70153354',
|
'id': '78566716',
|
||||||
'title': 'Die robuste Roswita',
|
'title': 'Die robuste Roswita',
|
||||||
'description': r're:^Der Mord.*trüber ist als die Ilm.',
|
'description': r're:^Der Mord.*totgeglaubte Ehefrau Roswita',
|
||||||
'duration': 5316,
|
'duration': 5316,
|
||||||
'thumbnail': 'https://img.ardmediathek.de/standard/00/70/15/33/90/-1852531467/16x9/960?mandant=ard',
|
'thumbnail': 'https://img.ardmediathek.de/standard/00/78/56/67/84/575672121/16x9/960?mandant=ard',
|
||||||
'timestamp': 1577047500,
|
'timestamp': 1596658200,
|
||||||
'upload_date': '20191222',
|
'upload_date': '20200805',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
|
@ -6,13 +6,11 @@ import re
|
|||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..compat import compat_urlparse
|
from ..compat import compat_urlparse
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
determine_ext,
|
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
float_or_none,
|
float_or_none,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
mimetype2ext,
|
|
||||||
parse_iso8601,
|
parse_iso8601,
|
||||||
strip_jsonp,
|
try_get,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@ -20,22 +18,27 @@ class ArkenaIE(InfoExtractor):
|
|||||||
_VALID_URL = r'''(?x)
|
_VALID_URL = r'''(?x)
|
||||||
https?://
|
https?://
|
||||||
(?:
|
(?:
|
||||||
video\.arkena\.com/play2/embed/player\?|
|
video\.(?:arkena|qbrick)\.com/play2/embed/player\?|
|
||||||
play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)
|
play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)
|
||||||
)
|
)
|
||||||
'''
|
'''
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411',
|
'url': 'https://video.qbrick.com/play2/embed/player?accountId=1034090&mediaId=d8ab4607-00090107-aab86310',
|
||||||
'md5': 'b96f2f71b359a8ecd05ce4e1daa72365',
|
'md5': '97f117754e5f3c020f5f26da4a44ebaf',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe',
|
'id': 'd8ab4607-00090107-aab86310',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Big Buck Bunny',
|
'title': 'EM_HT20_117_roslund_v2.mp4',
|
||||||
'description': 'Royalty free test video',
|
'timestamp': 1608285912,
|
||||||
'timestamp': 1432816365,
|
'upload_date': '20201218',
|
||||||
'upload_date': '20150528',
|
'duration': 1429.162667,
|
||||||
'is_live': False,
|
'subtitles': {
|
||||||
|
'sv': 'count:3',
|
||||||
|
},
|
||||||
},
|
},
|
||||||
|
}, {
|
||||||
|
'url': 'https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411',
|
||||||
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://play.arkena.com/config/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411/?callbackMethod=jQuery1111023664739129262213_1469227693893',
|
'url': 'https://play.arkena.com/config/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411/?callbackMethod=jQuery1111023664739129262213_1469227693893',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -72,62 +75,89 @@ class ArkenaIE(InfoExtractor):
|
|||||||
if not video_id or not account_id:
|
if not video_id or not account_id:
|
||||||
raise ExtractorError('Invalid URL', expected=True)
|
raise ExtractorError('Invalid URL', expected=True)
|
||||||
|
|
||||||
playlist = self._download_json(
|
media = self._download_json(
|
||||||
'https://play.arkena.com/config/avp/v2/player/media/%s/0/%s/?callbackMethod=_'
|
'https://video.qbrick.com/api/v1/public/accounts/%s/medias/%s' % (account_id, video_id),
|
||||||
% (video_id, account_id),
|
video_id, query={
|
||||||
video_id, transform_source=strip_jsonp)['Playlist'][0]
|
# https://video.qbrick.com/docs/api/examples/library-api.html
|
||||||
|
'fields': 'asset/resources/*/renditions/*(height,id,language,links/*(href,mimeType),type,size,videos/*(audios/*(codec,sampleRate),bitrate,codec,duration,height,width),width),created,metadata/*(title,description),tags',
|
||||||
|
})
|
||||||
|
metadata = media.get('metadata') or {}
|
||||||
|
title = metadata['title']
|
||||||
|
|
||||||
media_info = playlist['MediaInfo']
|
duration = None
|
||||||
title = media_info['Title']
|
|
||||||
media_files = playlist['MediaFiles']
|
|
||||||
|
|
||||||
is_live = False
|
|
||||||
formats = []
|
formats = []
|
||||||
for kind_case, kind_formats in media_files.items():
|
thumbnails = []
|
||||||
kind = kind_case.lower()
|
subtitles = {}
|
||||||
for f in kind_formats:
|
for resource in media['asset']['resources']:
|
||||||
f_url = f.get('Url')
|
for rendition in (resource.get('renditions') or []):
|
||||||
if not f_url:
|
rendition_type = rendition.get('type')
|
||||||
continue
|
for i, link in enumerate(rendition.get('links') or []):
|
||||||
is_live = f.get('Live') == 'true'
|
href = link.get('href')
|
||||||
exts = (mimetype2ext(f.get('Type')), determine_ext(f_url, None))
|
if not href:
|
||||||
if kind == 'm3u8' or 'm3u8' in exts:
|
continue
|
||||||
formats.extend(self._extract_m3u8_formats(
|
if rendition_type == 'image':
|
||||||
f_url, video_id, 'mp4', 'm3u8_native',
|
thumbnails.append({
|
||||||
m3u8_id=kind, fatal=False, live=is_live))
|
'filesize': int_or_none(rendition.get('size')),
|
||||||
elif kind == 'flash' or 'f4m' in exts:
|
'height': int_or_none(rendition.get('height')),
|
||||||
formats.extend(self._extract_f4m_formats(
|
'id': rendition.get('id'),
|
||||||
f_url, video_id, f4m_id=kind, fatal=False))
|
'url': href,
|
||||||
elif kind == 'dash' or 'mpd' in exts:
|
'width': int_or_none(rendition.get('width')),
|
||||||
formats.extend(self._extract_mpd_formats(
|
})
|
||||||
f_url, video_id, mpd_id=kind, fatal=False))
|
elif rendition_type == 'subtitle':
|
||||||
elif kind == 'silverlight':
|
subtitles.setdefault(rendition.get('language') or 'en', []).append({
|
||||||
# TODO: process when ism is supported (see
|
'url': href,
|
||||||
# https://github.com/ytdl-org/youtube-dl/issues/8118)
|
})
|
||||||
continue
|
elif rendition_type == 'video':
|
||||||
else:
|
f = {
|
||||||
tbr = float_or_none(f.get('Bitrate'), 1000)
|
'filesize': int_or_none(rendition.get('size')),
|
||||||
formats.append({
|
'format_id': rendition.get('id'),
|
||||||
'url': f_url,
|
'url': href,
|
||||||
'format_id': '%s-%d' % (kind, tbr) if tbr else kind,
|
}
|
||||||
'tbr': tbr,
|
video = try_get(rendition, lambda x: x['videos'][i], dict)
|
||||||
})
|
if video:
|
||||||
|
if not duration:
|
||||||
|
duration = float_or_none(video.get('duration'))
|
||||||
|
f.update({
|
||||||
|
'height': int_or_none(video.get('height')),
|
||||||
|
'tbr': int_or_none(video.get('bitrate'), 1000),
|
||||||
|
'vcodec': video.get('codec'),
|
||||||
|
'width': int_or_none(video.get('width')),
|
||||||
|
})
|
||||||
|
audio = try_get(video, lambda x: x['audios'][0], dict)
|
||||||
|
if audio:
|
||||||
|
f.update({
|
||||||
|
'acodec': audio.get('codec'),
|
||||||
|
'asr': int_or_none(audio.get('sampleRate')),
|
||||||
|
})
|
||||||
|
formats.append(f)
|
||||||
|
elif rendition_type == 'index':
|
||||||
|
mime_type = link.get('mimeType')
|
||||||
|
if mime_type == 'application/smil+xml':
|
||||||
|
formats.extend(self._extract_smil_formats(
|
||||||
|
href, video_id, fatal=False))
|
||||||
|
elif mime_type == 'application/x-mpegURL':
|
||||||
|
formats.extend(self._extract_m3u8_formats(
|
||||||
|
href, video_id, 'mp4', 'm3u8_native',
|
||||||
|
m3u8_id='hls', fatal=False))
|
||||||
|
elif mime_type == 'application/hds+xml':
|
||||||
|
formats.extend(self._extract_f4m_formats(
|
||||||
|
href, video_id, f4m_id='hds', fatal=False))
|
||||||
|
elif mime_type == 'application/dash+xml':
|
||||||
|
formats.extend(self._extract_mpd_formats(
|
||||||
|
href, video_id, mpd_id='dash', fatal=False))
|
||||||
|
elif mime_type == 'application/vnd.ms-sstr+xml':
|
||||||
|
formats.extend(self._extract_ism_formats(
|
||||||
|
href, video_id, ism_id='mss', fatal=False))
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
description = media_info.get('Description')
|
|
||||||
video_id = media_info.get('VideoId') or video_id
|
|
||||||
timestamp = parse_iso8601(media_info.get('PublishDate'))
|
|
||||||
thumbnails = [{
|
|
||||||
'url': thumbnail['Url'],
|
|
||||||
'width': int_or_none(thumbnail.get('Size')),
|
|
||||||
} for thumbnail in (media_info.get('Poster') or []) if thumbnail.get('Url')]
|
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
'title': title,
|
'title': title,
|
||||||
'description': description,
|
'description': metadata.get('description'),
|
||||||
'timestamp': timestamp,
|
'timestamp': parse_iso8601(media.get('created')),
|
||||||
'is_live': is_live,
|
|
||||||
'thumbnails': thumbnails,
|
'thumbnails': thumbnails,
|
||||||
|
'subtitles': subtitles,
|
||||||
|
'duration': duration,
|
||||||
|
'tags': media.get('tags'),
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
}
|
}
|
||||||
|
@ -1,27 +1,91 @@
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import functools
|
||||||
import re
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from .kaltura import KalturaIE
|
from .kaltura import KalturaIE
|
||||||
from ..utils import extract_attributes
|
from ..utils import (
|
||||||
|
extract_attributes,
|
||||||
|
int_or_none,
|
||||||
|
OnDemandPagedList,
|
||||||
|
parse_age_limit,
|
||||||
|
strip_or_none,
|
||||||
|
try_get,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
class AsianCrushIE(InfoExtractor):
|
class AsianCrushBaseIE(InfoExtractor):
|
||||||
_VALID_URL_BASE = r'https?://(?:www\.)?(?P<host>(?:(?:asiancrush|yuyutv|midnightpulp)\.com|cocoro\.tv))'
|
_VALID_URL_BASE = r'https?://(?:www\.)?(?P<host>(?:(?:asiancrush|yuyutv|midnightpulp)\.com|(?:cocoro|retrocrush)\.tv))'
|
||||||
_VALID_URL = r'%s/video/(?:[^/]+/)?0+(?P<id>\d+)v\b' % _VALID_URL_BASE
|
_KALTURA_KEYS = [
|
||||||
|
'video_url', 'progressive_url', 'download_url', 'thumbnail_url',
|
||||||
|
'widescreen_thumbnail_url', 'screencap_widescreen',
|
||||||
|
]
|
||||||
|
_API_SUFFIX = {'retrocrush.tv': '-ott'}
|
||||||
|
|
||||||
|
def _call_api(self, host, endpoint, video_id, query, resource):
|
||||||
|
return self._download_json(
|
||||||
|
'https://api%s.%s/%s' % (self._API_SUFFIX.get(host, ''), host, endpoint), video_id,
|
||||||
|
'Downloading %s JSON metadata' % resource, query=query,
|
||||||
|
headers=self.geo_verification_headers())['objects']
|
||||||
|
|
||||||
|
def _download_object_data(self, host, object_id, resource):
|
||||||
|
return self._call_api(
|
||||||
|
host, 'search', object_id, {'id': object_id}, resource)[0]
|
||||||
|
|
||||||
|
def _get_object_description(self, obj):
|
||||||
|
return strip_or_none(obj.get('long_description') or obj.get('short_description'))
|
||||||
|
|
||||||
|
def _parse_video_data(self, video):
|
||||||
|
title = video['name']
|
||||||
|
|
||||||
|
entry_id, partner_id = [None] * 2
|
||||||
|
for k in self._KALTURA_KEYS:
|
||||||
|
k_url = video.get(k)
|
||||||
|
if k_url:
|
||||||
|
mobj = re.search(r'/p/(\d+)/.+?/entryId/([^/]+)/', k_url)
|
||||||
|
if mobj:
|
||||||
|
partner_id, entry_id = mobj.groups()
|
||||||
|
break
|
||||||
|
|
||||||
|
meta_categories = try_get(video, lambda x: x['meta']['categories'], list) or []
|
||||||
|
categories = list(filter(None, [c.get('name') for c in meta_categories]))
|
||||||
|
|
||||||
|
show_info = video.get('show_info') or {}
|
||||||
|
|
||||||
|
return {
|
||||||
|
'_type': 'url_transparent',
|
||||||
|
'url': 'kaltura:%s:%s' % (partner_id, entry_id),
|
||||||
|
'ie_key': KalturaIE.ie_key(),
|
||||||
|
'id': entry_id,
|
||||||
|
'title': title,
|
||||||
|
'description': self._get_object_description(video),
|
||||||
|
'age_limit': parse_age_limit(video.get('mpaa_rating') or video.get('tv_rating')),
|
||||||
|
'categories': categories,
|
||||||
|
'series': show_info.get('show_name'),
|
||||||
|
'season_number': int_or_none(show_info.get('season_num')),
|
||||||
|
'season_id': show_info.get('season_id'),
|
||||||
|
'episode_number': int_or_none(show_info.get('episode_num')),
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
class AsianCrushIE(AsianCrushBaseIE):
|
||||||
|
_VALID_URL = r'%s/video/(?:[^/]+/)?0+(?P<id>\d+)v\b' % AsianCrushBaseIE._VALID_URL_BASE
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://www.asiancrush.com/video/012869v/women-who-flirt/',
|
'url': 'https://www.asiancrush.com/video/004289v/women-who-flirt',
|
||||||
'md5': 'c3b740e48d0ba002a42c0b72857beae6',
|
'md5': 'c3b740e48d0ba002a42c0b72857beae6',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '1_y4tmjm5r',
|
'id': '1_y4tmjm5r',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Women Who Flirt',
|
'title': 'Women Who Flirt',
|
||||||
'description': 'md5:7e986615808bcfb11756eb503a751487',
|
'description': 'md5:b65c7e0ae03a85585476a62a186f924c',
|
||||||
'timestamp': 1496936429,
|
'timestamp': 1496936429,
|
||||||
'upload_date': '20170608',
|
'upload_date': '20170608',
|
||||||
'uploader_id': 'craig@crifkin.com',
|
'uploader_id': 'craig@crifkin.com',
|
||||||
|
'age_limit': 13,
|
||||||
|
'categories': 'count:5',
|
||||||
|
'duration': 5812,
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/',
|
'url': 'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/',
|
||||||
@ -41,67 +105,35 @@ class AsianCrushIE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.cocoro.tv/video/the-wonderful-wizard-of-oz/008878v-the-wonderful-wizard-of-oz-ep01/',
|
'url': 'https://www.cocoro.tv/video/the-wonderful-wizard-of-oz/008878v-the-wonderful-wizard-of-oz-ep01/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.retrocrush.tv/video/true-tears/012328v-i...gave-away-my-tears',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
mobj = re.match(self._VALID_URL, url)
|
host, video_id = re.match(self._VALID_URL, url).groups()
|
||||||
host = mobj.group('host')
|
|
||||||
video_id = mobj.group('id')
|
|
||||||
|
|
||||||
webpage = self._download_webpage(url, video_id)
|
if host == 'cocoro.tv':
|
||||||
|
webpage = self._download_webpage(url, video_id)
|
||||||
entry_id, partner_id, title = [None] * 3
|
embed_vars = self._parse_json(self._search_regex(
|
||||||
|
|
||||||
vars = self._parse_json(
|
|
||||||
self._search_regex(
|
|
||||||
r'iEmbedVars\s*=\s*({.+?})', webpage, 'embed vars',
|
r'iEmbedVars\s*=\s*({.+?})', webpage, 'embed vars',
|
||||||
default='{}'), video_id, fatal=False)
|
default='{}'), video_id, fatal=False) or {}
|
||||||
if vars:
|
video_id = embed_vars.get('entry_id') or video_id
|
||||||
entry_id = vars.get('entry_id')
|
|
||||||
partner_id = vars.get('partner_id')
|
|
||||||
title = vars.get('vid_label')
|
|
||||||
|
|
||||||
if not entry_id:
|
video = self._download_object_data(host, video_id, 'video')
|
||||||
entry_id = self._search_regex(
|
return self._parse_video_data(video)
|
||||||
r'\bentry_id["\']\s*:\s*["\'](\d+)', webpage, 'entry id')
|
|
||||||
|
|
||||||
player = self._download_webpage(
|
|
||||||
'https://api.%s/embeddedVideoPlayer' % host, video_id,
|
|
||||||
query={'id': entry_id})
|
|
||||||
|
|
||||||
kaltura_id = self._search_regex(
|
|
||||||
r'entry_id["\']\s*:\s*(["\'])(?P<id>(?:(?!\1).)+)\1', player,
|
|
||||||
'kaltura id', group='id')
|
|
||||||
|
|
||||||
if not partner_id:
|
|
||||||
partner_id = self._search_regex(
|
|
||||||
r'/p(?:artner_id)?/(\d+)', player, 'partner id',
|
|
||||||
default='513551')
|
|
||||||
|
|
||||||
description = self._html_search_regex(
|
|
||||||
r'(?s)<div[^>]+\bclass=["\']description["\'][^>]*>(.+?)</div>',
|
|
||||||
webpage, 'description', fatal=False)
|
|
||||||
|
|
||||||
return {
|
|
||||||
'_type': 'url_transparent',
|
|
||||||
'url': 'kaltura:%s:%s' % (partner_id, kaltura_id),
|
|
||||||
'ie_key': KalturaIE.ie_key(),
|
|
||||||
'id': video_id,
|
|
||||||
'title': title,
|
|
||||||
'description': description,
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
class AsianCrushPlaylistIE(InfoExtractor):
|
class AsianCrushPlaylistIE(AsianCrushBaseIE):
|
||||||
_VALID_URL = r'%s/series/0+(?P<id>\d+)s\b' % AsianCrushIE._VALID_URL_BASE
|
_VALID_URL = r'%s/series/0+(?P<id>\d+)s\b' % AsianCrushBaseIE._VALID_URL_BASE
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://www.asiancrush.com/series/012481s/scholar-walks-night/',
|
'url': 'https://www.asiancrush.com/series/006447s/fruity-samurai',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '12481',
|
'id': '6447',
|
||||||
'title': 'Scholar Who Walks the Night',
|
'title': 'Fruity Samurai',
|
||||||
'description': 'md5:7addd7c5132a09fd4741152d96cce886',
|
'description': 'md5:7535174487e4a202d3872a7fc8f2f154',
|
||||||
},
|
},
|
||||||
'playlist_count': 20,
|
'playlist_count': 13,
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.yuyutv.com/series/013920s/peep-show/',
|
'url': 'https://www.yuyutv.com/series/013920s/peep-show/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -111,35 +143,58 @@ class AsianCrushPlaylistIE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.cocoro.tv/series/008549s/the-wonderful-wizard-of-oz/',
|
'url': 'https://www.cocoro.tv/series/008549s/the-wonderful-wizard-of-oz/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.retrocrush.tv/series/012355s/true-tears',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
_PAGE_SIZE = 1000000000
|
||||||
|
|
||||||
|
def _fetch_page(self, domain, parent_id, page):
|
||||||
|
videos = self._call_api(
|
||||||
|
domain, 'getreferencedobjects', parent_id, {
|
||||||
|
'max': self._PAGE_SIZE,
|
||||||
|
'object_type': 'video',
|
||||||
|
'parent_id': parent_id,
|
||||||
|
'start': page * self._PAGE_SIZE,
|
||||||
|
}, 'page %d' % (page + 1))
|
||||||
|
for video in videos:
|
||||||
|
yield self._parse_video_data(video)
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
playlist_id = self._match_id(url)
|
host, playlist_id = re.match(self._VALID_URL, url).groups()
|
||||||
|
|
||||||
webpage = self._download_webpage(url, playlist_id)
|
if host == 'cocoro.tv':
|
||||||
|
webpage = self._download_webpage(url, playlist_id)
|
||||||
|
|
||||||
entries = []
|
entries = []
|
||||||
|
|
||||||
for mobj in re.finditer(
|
for mobj in re.finditer(
|
||||||
r'<a[^>]+href=(["\'])(?P<url>%s.*?)\1[^>]*>' % AsianCrushIE._VALID_URL,
|
r'<a[^>]+href=(["\'])(?P<url>%s.*?)\1[^>]*>' % AsianCrushIE._VALID_URL,
|
||||||
webpage):
|
webpage):
|
||||||
attrs = extract_attributes(mobj.group(0))
|
attrs = extract_attributes(mobj.group(0))
|
||||||
if attrs.get('class') == 'clearfix':
|
if attrs.get('class') == 'clearfix':
|
||||||
entries.append(self.url_result(
|
entries.append(self.url_result(
|
||||||
mobj.group('url'), ie=AsianCrushIE.ie_key()))
|
mobj.group('url'), ie=AsianCrushIE.ie_key()))
|
||||||
|
|
||||||
title = self._html_search_regex(
|
title = self._html_search_regex(
|
||||||
r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage,
|
r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage,
|
||||||
'title', default=None) or self._og_search_title(
|
'title', default=None) or self._og_search_title(
|
||||||
webpage, default=None) or self._html_search_meta(
|
webpage, default=None) or self._html_search_meta(
|
||||||
'twitter:title', webpage, 'title',
|
'twitter:title', webpage, 'title',
|
||||||
default=None) or self._search_regex(
|
default=None) or self._search_regex(
|
||||||
r'<title>([^<]+)</title>', webpage, 'title', fatal=False)
|
r'<title>([^<]+)</title>', webpage, 'title', fatal=False)
|
||||||
if title:
|
if title:
|
||||||
title = re.sub(r'\s*\|\s*.+?$', '', title)
|
title = re.sub(r'\s*\|\s*.+?$', '', title)
|
||||||
|
|
||||||
description = self._og_search_description(
|
description = self._og_search_description(
|
||||||
webpage, default=None) or self._html_search_meta(
|
webpage, default=None) or self._html_search_meta(
|
||||||
'twitter:description', webpage, 'description', fatal=False)
|
'twitter:description', webpage, 'description', fatal=False)
|
||||||
|
else:
|
||||||
|
show = self._download_object_data(host, playlist_id, 'show')
|
||||||
|
title = show.get('name')
|
||||||
|
description = self._get_object_description(show)
|
||||||
|
entries = OnDemandPagedList(
|
||||||
|
functools.partial(self._fetch_page, host, playlist_id),
|
||||||
|
self._PAGE_SIZE)
|
||||||
|
|
||||||
return self.playlist_result(entries, playlist_id, title, description)
|
return self.playlist_result(entries, playlist_id, title, description)
|
||||||
|
@ -48,6 +48,7 @@ class AWAANBaseIE(InfoExtractor):
|
|||||||
'duration': int_or_none(video_data.get('duration')),
|
'duration': int_or_none(video_data.get('duration')),
|
||||||
'timestamp': parse_iso8601(video_data.get('create_time'), ' '),
|
'timestamp': parse_iso8601(video_data.get('create_time'), ' '),
|
||||||
'is_live': is_live,
|
'is_live': is_live,
|
||||||
|
'uploader_id': video_data.get('user_id'),
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
@ -107,6 +108,7 @@ class AWAANLiveIE(AWAANBaseIE):
|
|||||||
'title': 're:Dubai Al Oula [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
'title': 're:Dubai Al Oula [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
||||||
'upload_date': '20150107',
|
'upload_date': '20150107',
|
||||||
'timestamp': 1420588800,
|
'timestamp': 1420588800,
|
||||||
|
'uploader_id': '71',
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
# m3u8 download
|
# m3u8 download
|
||||||
|
@ -47,7 +47,7 @@ class AZMedienIE(InfoExtractor):
|
|||||||
'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
|
'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
|
||||||
'only_matching': True
|
'only_matching': True
|
||||||
}]
|
}]
|
||||||
_API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/cb9f2f81ed22e9b47f4ca64ea3cc5a5d13e88d1d'
|
_API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/a4016f65fe62b81dc6664dd9f4910e4ab40383be'
|
||||||
_PARTNER_ID = '1719221'
|
_PARTNER_ID = '1719221'
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
|
@ -49,22 +49,17 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
_LOGIN_URL = 'https://account.bbc.com/signin'
|
_LOGIN_URL = 'https://account.bbc.com/signin'
|
||||||
_NETRC_MACHINE = 'bbc'
|
_NETRC_MACHINE = 'bbc'
|
||||||
|
|
||||||
_MEDIASELECTOR_URLS = [
|
_MEDIA_SELECTOR_URL_TEMPL = 'https://open.live.bbc.co.uk/mediaselector/6/select/version/2.0/mediaset/%s/vpid/%s'
|
||||||
|
_MEDIA_SETS = [
|
||||||
# Provides HQ HLS streams with even better quality than pc mediaset but fails
|
# Provides HQ HLS streams with even better quality than pc mediaset but fails
|
||||||
# with geolocation in some cases even when it's not geo restricted at all (e.g.
|
# with geolocation in some cases even when it's not geo restricted at all (e.g.
|
||||||
# http://www.bbc.co.uk/programmes/b06bp7lf). Also may fail with selectionunavailable.
|
# http://www.bbc.co.uk/programmes/b06bp7lf). Also may fail with selectionunavailable.
|
||||||
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
|
'iptv-all',
|
||||||
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s',
|
'pc',
|
||||||
]
|
]
|
||||||
|
|
||||||
_MEDIASELECTION_NS = 'http://bbc.co.uk/2008/mp/mediaselection'
|
|
||||||
_EMP_PLAYLIST_NS = 'http://bbc.co.uk/2008/emp/playlist'
|
_EMP_PLAYLIST_NS = 'http://bbc.co.uk/2008/emp/playlist'
|
||||||
|
|
||||||
_NAMESPACES = (
|
|
||||||
_MEDIASELECTION_NS,
|
|
||||||
_EMP_PLAYLIST_NS,
|
|
||||||
)
|
|
||||||
|
|
||||||
_TESTS = [
|
_TESTS = [
|
||||||
{
|
{
|
||||||
'url': 'http://www.bbc.co.uk/programmes/b039g8p7',
|
'url': 'http://www.bbc.co.uk/programmes/b039g8p7',
|
||||||
@ -261,8 +256,6 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
_USP_RE = r'/([^/]+?)\.ism(?:\.hlsv2\.ism)?/[^/]+\.m3u8'
|
|
||||||
|
|
||||||
def _login(self):
|
def _login(self):
|
||||||
username, password = self._get_login_info()
|
username, password = self._get_login_info()
|
||||||
if username is None:
|
if username is None:
|
||||||
@ -307,22 +300,14 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
def _extract_items(self, playlist):
|
def _extract_items(self, playlist):
|
||||||
return playlist.findall('./{%s}item' % self._EMP_PLAYLIST_NS)
|
return playlist.findall('./{%s}item' % self._EMP_PLAYLIST_NS)
|
||||||
|
|
||||||
def _findall_ns(self, element, xpath):
|
|
||||||
elements = []
|
|
||||||
for ns in self._NAMESPACES:
|
|
||||||
elements.extend(element.findall(xpath % ns))
|
|
||||||
return elements
|
|
||||||
|
|
||||||
def _extract_medias(self, media_selection):
|
def _extract_medias(self, media_selection):
|
||||||
error = media_selection.find('./{%s}error' % self._MEDIASELECTION_NS)
|
error = media_selection.get('result')
|
||||||
if error is None:
|
if error:
|
||||||
media_selection.find('./{%s}error' % self._EMP_PLAYLIST_NS)
|
raise BBCCoUkIE.MediaSelectionError(error)
|
||||||
if error is not None:
|
return media_selection.get('media') or []
|
||||||
raise BBCCoUkIE.MediaSelectionError(error.get('id'))
|
|
||||||
return self._findall_ns(media_selection, './{%s}media')
|
|
||||||
|
|
||||||
def _extract_connections(self, media):
|
def _extract_connections(self, media):
|
||||||
return self._findall_ns(media, './{%s}connection')
|
return media.get('connection') or []
|
||||||
|
|
||||||
def _get_subtitles(self, media, programme_id):
|
def _get_subtitles(self, media, programme_id):
|
||||||
subtitles = {}
|
subtitles = {}
|
||||||
@ -334,13 +319,13 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
cc_url, programme_id, 'Downloading captions', fatal=False)
|
cc_url, programme_id, 'Downloading captions', fatal=False)
|
||||||
if not isinstance(captions, compat_etree_Element):
|
if not isinstance(captions, compat_etree_Element):
|
||||||
continue
|
continue
|
||||||
lang = captions.get('{http://www.w3.org/XML/1998/namespace}lang', 'en')
|
subtitles['en'] = [
|
||||||
subtitles[lang] = [
|
|
||||||
{
|
{
|
||||||
'url': connection.get('href'),
|
'url': connection.get('href'),
|
||||||
'ext': 'ttml',
|
'ext': 'ttml',
|
||||||
},
|
},
|
||||||
]
|
]
|
||||||
|
break
|
||||||
return subtitles
|
return subtitles
|
||||||
|
|
||||||
def _raise_extractor_error(self, media_selection_error):
|
def _raise_extractor_error(self, media_selection_error):
|
||||||
@ -350,10 +335,10 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
|
|
||||||
def _download_media_selector(self, programme_id):
|
def _download_media_selector(self, programme_id):
|
||||||
last_exception = None
|
last_exception = None
|
||||||
for mediaselector_url in self._MEDIASELECTOR_URLS:
|
for media_set in self._MEDIA_SETS:
|
||||||
try:
|
try:
|
||||||
return self._download_media_selector_url(
|
return self._download_media_selector_url(
|
||||||
mediaselector_url % programme_id, programme_id)
|
self._MEDIA_SELECTOR_URL_TEMPL % (media_set, programme_id), programme_id)
|
||||||
except BBCCoUkIE.MediaSelectionError as e:
|
except BBCCoUkIE.MediaSelectionError as e:
|
||||||
if e.id in ('notukerror', 'geolocation', 'selectionunavailable'):
|
if e.id in ('notukerror', 'geolocation', 'selectionunavailable'):
|
||||||
last_exception = e
|
last_exception = e
|
||||||
@ -362,8 +347,8 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
self._raise_extractor_error(last_exception)
|
self._raise_extractor_error(last_exception)
|
||||||
|
|
||||||
def _download_media_selector_url(self, url, programme_id=None):
|
def _download_media_selector_url(self, url, programme_id=None):
|
||||||
media_selection = self._download_xml(
|
media_selection = self._download_json(
|
||||||
url, programme_id, 'Downloading media selection XML',
|
url, programme_id, 'Downloading media selection JSON',
|
||||||
expected_status=(403, 404))
|
expected_status=(403, 404))
|
||||||
return self._process_media_selector(media_selection, programme_id)
|
return self._process_media_selector(media_selection, programme_id)
|
||||||
|
|
||||||
@ -377,7 +362,6 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
if kind in ('video', 'audio'):
|
if kind in ('video', 'audio'):
|
||||||
bitrate = int_or_none(media.get('bitrate'))
|
bitrate = int_or_none(media.get('bitrate'))
|
||||||
encoding = media.get('encoding')
|
encoding = media.get('encoding')
|
||||||
service = media.get('service')
|
|
||||||
width = int_or_none(media.get('width'))
|
width = int_or_none(media.get('width'))
|
||||||
height = int_or_none(media.get('height'))
|
height = int_or_none(media.get('height'))
|
||||||
file_size = int_or_none(media.get('media_file_size'))
|
file_size = int_or_none(media.get('media_file_size'))
|
||||||
@ -392,8 +376,6 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
supplier = connection.get('supplier')
|
supplier = connection.get('supplier')
|
||||||
transfer_format = connection.get('transferFormat')
|
transfer_format = connection.get('transferFormat')
|
||||||
format_id = supplier or conn_kind or protocol
|
format_id = supplier or conn_kind or protocol
|
||||||
if service:
|
|
||||||
format_id = '%s_%s' % (service, format_id)
|
|
||||||
# ASX playlist
|
# ASX playlist
|
||||||
if supplier == 'asx':
|
if supplier == 'asx':
|
||||||
for i, ref in enumerate(self._extract_asx_playlist(connection, programme_id)):
|
for i, ref in enumerate(self._extract_asx_playlist(connection, programme_id)):
|
||||||
@ -408,20 +390,11 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
formats.extend(self._extract_m3u8_formats(
|
formats.extend(self._extract_m3u8_formats(
|
||||||
href, programme_id, ext='mp4', entry_protocol='m3u8_native',
|
href, programme_id, ext='mp4', entry_protocol='m3u8_native',
|
||||||
m3u8_id=format_id, fatal=False))
|
m3u8_id=format_id, fatal=False))
|
||||||
if re.search(self._USP_RE, href):
|
|
||||||
usp_formats = self._extract_m3u8_formats(
|
|
||||||
re.sub(self._USP_RE, r'/\1.ism/\1.m3u8', href),
|
|
||||||
programme_id, ext='mp4', entry_protocol='m3u8_native',
|
|
||||||
m3u8_id=format_id, fatal=False)
|
|
||||||
for f in usp_formats:
|
|
||||||
if f.get('height') and f['height'] > 720:
|
|
||||||
continue
|
|
||||||
formats.append(f)
|
|
||||||
elif transfer_format == 'hds':
|
elif transfer_format == 'hds':
|
||||||
formats.extend(self._extract_f4m_formats(
|
formats.extend(self._extract_f4m_formats(
|
||||||
href, programme_id, f4m_id=format_id, fatal=False))
|
href, programme_id, f4m_id=format_id, fatal=False))
|
||||||
else:
|
else:
|
||||||
if not service and not supplier and bitrate:
|
if not supplier and bitrate:
|
||||||
format_id += '-%d' % bitrate
|
format_id += '-%d' % bitrate
|
||||||
fmt = {
|
fmt = {
|
||||||
'format_id': format_id,
|
'format_id': format_id,
|
||||||
@ -554,7 +527,7 @@ class BBCCoUkIE(InfoExtractor):
|
|||||||
webpage = self._download_webpage(url, group_id, 'Downloading video page')
|
webpage = self._download_webpage(url, group_id, 'Downloading video page')
|
||||||
|
|
||||||
error = self._search_regex(
|
error = self._search_regex(
|
||||||
r'<div\b[^>]+\bclass=["\']smp__message delta["\'][^>]*>([^<]+)<',
|
r'<div\b[^>]+\bclass=["\'](?:smp|playout)__message delta["\'][^>]*>\s*([^<]+?)\s*<',
|
||||||
webpage, 'error', default=None)
|
webpage, 'error', default=None)
|
||||||
if error:
|
if error:
|
||||||
raise ExtractorError(error, expected=True)
|
raise ExtractorError(error, expected=True)
|
||||||
@ -607,16 +580,9 @@ class BBCIE(BBCCoUkIE):
|
|||||||
IE_DESC = 'BBC'
|
IE_DESC = 'BBC'
|
||||||
_VALID_URL = r'https?://(?:www\.)?bbc\.(?:com|co\.uk)/(?:[^/]+/)+(?P<id>[^/#?]+)'
|
_VALID_URL = r'https?://(?:www\.)?bbc\.(?:com|co\.uk)/(?:[^/]+/)+(?P<id>[^/#?]+)'
|
||||||
|
|
||||||
_MEDIASELECTOR_URLS = [
|
_MEDIA_SETS = [
|
||||||
# Provides HQ HLS streams but fails with geolocation in some cases when it's
|
'mobile-tablet-main',
|
||||||
# even not geo restricted at all
|
'pc',
|
||||||
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
|
|
||||||
# Provides more formats, namely direct mp4 links, but fails on some videos with
|
|
||||||
# notukerror for non UK (?) users (e.g.
|
|
||||||
# http://www.bbc.com/travel/story/20150625-sri-lankas-spicy-secret)
|
|
||||||
'http://open.live.bbc.co.uk/mediaselector/4/mtis/stream/%s',
|
|
||||||
# Provides fewer formats, but works everywhere for everybody (hopefully)
|
|
||||||
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/journalism-pc/vpid/%s',
|
|
||||||
]
|
]
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
@ -981,7 +947,7 @@ class BBCIE(BBCCoUkIE):
|
|||||||
group_id = self._search_regex(
|
group_id = self._search_regex(
|
||||||
r'<div[^>]+\bclass=["\']video["\'][^>]+\bdata-pid=["\'](%s)' % self._ID_REGEX,
|
r'<div[^>]+\bclass=["\']video["\'][^>]+\bdata-pid=["\'](%s)' % self._ID_REGEX,
|
||||||
webpage, 'group id', default=None)
|
webpage, 'group id', default=None)
|
||||||
if playlist_id:
|
if group_id:
|
||||||
return self.url_result(
|
return self.url_result(
|
||||||
'https://www.bbc.co.uk/programmes/%s' % group_id,
|
'https://www.bbc.co.uk/programmes/%s' % group_id,
|
||||||
ie=BBCCoUkIE.ie_key())
|
ie=BBCCoUkIE.ie_key())
|
||||||
@ -1092,10 +1058,26 @@ class BBCIE(BBCCoUkIE):
|
|||||||
self._search_regex(
|
self._search_regex(
|
||||||
r'(?s)bbcthreeConfig\s*=\s*({.+?})\s*;\s*<', webpage,
|
r'(?s)bbcthreeConfig\s*=\s*({.+?})\s*;\s*<', webpage,
|
||||||
'bbcthree config', default='{}'),
|
'bbcthree config', default='{}'),
|
||||||
playlist_id, transform_source=js_to_json, fatal=False)
|
playlist_id, transform_source=js_to_json, fatal=False) or {}
|
||||||
if bbc3_config:
|
payload = bbc3_config.get('payload') or {}
|
||||||
|
if payload:
|
||||||
|
clip = payload.get('currentClip') or {}
|
||||||
|
clip_vpid = clip.get('vpid')
|
||||||
|
clip_title = clip.get('title')
|
||||||
|
if clip_vpid and clip_title:
|
||||||
|
formats, subtitles = self._download_media_selector(clip_vpid)
|
||||||
|
self._sort_formats(formats)
|
||||||
|
return {
|
||||||
|
'id': clip_vpid,
|
||||||
|
'title': clip_title,
|
||||||
|
'thumbnail': dict_get(clip, ('poster', 'imageUrl')),
|
||||||
|
'description': clip.get('description'),
|
||||||
|
'duration': parse_duration(clip.get('duration')),
|
||||||
|
'formats': formats,
|
||||||
|
'subtitles': subtitles,
|
||||||
|
}
|
||||||
bbc3_playlist = try_get(
|
bbc3_playlist = try_get(
|
||||||
bbc3_config, lambda x: x['payload']['content']['bbcMedia']['playlist'],
|
payload, lambda x: x['content']['bbcMedia']['playlist'],
|
||||||
dict)
|
dict)
|
||||||
if bbc3_playlist:
|
if bbc3_playlist:
|
||||||
playlist_title = bbc3_playlist.get('title') or playlist_title
|
playlist_title = bbc3_playlist.get('title') or playlist_title
|
||||||
@ -1118,6 +1100,39 @@ class BBCIE(BBCCoUkIE):
|
|||||||
return self.playlist_result(
|
return self.playlist_result(
|
||||||
entries, playlist_id, playlist_title, playlist_description)
|
entries, playlist_id, playlist_title, playlist_description)
|
||||||
|
|
||||||
|
initial_data = self._parse_json(self._search_regex(
|
||||||
|
r'window\.__INITIAL_DATA__\s*=\s*({.+?});', webpage,
|
||||||
|
'preload state', default='{}'), playlist_id, fatal=False)
|
||||||
|
if initial_data:
|
||||||
|
def parse_media(media):
|
||||||
|
if not media:
|
||||||
|
return
|
||||||
|
for item in (try_get(media, lambda x: x['media']['items'], list) or []):
|
||||||
|
item_id = item.get('id')
|
||||||
|
item_title = item.get('title')
|
||||||
|
if not (item_id and item_title):
|
||||||
|
continue
|
||||||
|
formats, subtitles = self._download_media_selector(item_id)
|
||||||
|
self._sort_formats(formats)
|
||||||
|
entries.append({
|
||||||
|
'id': item_id,
|
||||||
|
'title': item_title,
|
||||||
|
'thumbnail': item.get('holdingImageUrl'),
|
||||||
|
'formats': formats,
|
||||||
|
'subtitles': subtitles,
|
||||||
|
})
|
||||||
|
for resp in (initial_data.get('data') or {}).values():
|
||||||
|
name = resp.get('name')
|
||||||
|
if name == 'media-experience':
|
||||||
|
parse_media(try_get(resp, lambda x: x['data']['initialItem']['mediaItem'], dict))
|
||||||
|
elif name == 'article':
|
||||||
|
for block in (try_get(resp, lambda x: x['data']['blocks'], list) or []):
|
||||||
|
if block.get('type') != 'media':
|
||||||
|
continue
|
||||||
|
parse_media(block.get('model'))
|
||||||
|
return self.playlist_result(
|
||||||
|
entries, playlist_id, playlist_title, playlist_description)
|
||||||
|
|
||||||
def extract_all(pattern):
|
def extract_all(pattern):
|
||||||
return list(filter(None, map(
|
return list(filter(None, map(
|
||||||
lambda s: self._parse_json(s, playlist_id, fatal=False),
|
lambda s: self._parse_json(s, playlist_id, fatal=False),
|
||||||
|
@ -1,194 +0,0 @@
|
|||||||
# coding: utf-8
|
|
||||||
from __future__ import unicode_literals
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
|
||||||
from ..utils import (
|
|
||||||
ExtractorError,
|
|
||||||
clean_html,
|
|
||||||
compat_str,
|
|
||||||
float_or_none,
|
|
||||||
int_or_none,
|
|
||||||
parse_iso8601,
|
|
||||||
try_get,
|
|
||||||
urljoin,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class BeamProBaseIE(InfoExtractor):
|
|
||||||
_API_BASE = 'https://mixer.com/api/v1'
|
|
||||||
_RATINGS = {'family': 0, 'teen': 13, '18+': 18}
|
|
||||||
|
|
||||||
def _extract_channel_info(self, chan):
|
|
||||||
user_id = chan.get('userId') or try_get(chan, lambda x: x['user']['id'])
|
|
||||||
return {
|
|
||||||
'uploader': chan.get('token') or try_get(
|
|
||||||
chan, lambda x: x['user']['username'], compat_str),
|
|
||||||
'uploader_id': compat_str(user_id) if user_id else None,
|
|
||||||
'age_limit': self._RATINGS.get(chan.get('audience')),
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
class BeamProLiveIE(BeamProBaseIE):
|
|
||||||
IE_NAME = 'Mixer:live'
|
|
||||||
_VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/(?P<id>[^/?#&]+)'
|
|
||||||
_TEST = {
|
|
||||||
'url': 'http://mixer.com/niterhayven',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '261562',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'Introducing The Witcher 3 // The Grind Starts Now!',
|
|
||||||
'description': 'md5:0b161ac080f15fe05d18a07adb44a74d',
|
|
||||||
'thumbnail': r're:https://.*\.jpg$',
|
|
||||||
'timestamp': 1483477281,
|
|
||||||
'upload_date': '20170103',
|
|
||||||
'uploader': 'niterhayven',
|
|
||||||
'uploader_id': '373396',
|
|
||||||
'age_limit': 18,
|
|
||||||
'is_live': True,
|
|
||||||
'view_count': int,
|
|
||||||
},
|
|
||||||
'skip': 'niterhayven is offline',
|
|
||||||
'params': {
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
}
|
|
||||||
|
|
||||||
_MANIFEST_URL_TEMPLATE = '%s/channels/%%s/manifest.%%s' % BeamProBaseIE._API_BASE
|
|
||||||
|
|
||||||
@classmethod
|
|
||||||
def suitable(cls, url):
|
|
||||||
return False if BeamProVodIE.suitable(url) else super(BeamProLiveIE, cls).suitable(url)
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
channel_name = self._match_id(url)
|
|
||||||
|
|
||||||
chan = self._download_json(
|
|
||||||
'%s/channels/%s' % (self._API_BASE, channel_name), channel_name)
|
|
||||||
|
|
||||||
if chan.get('online') is False:
|
|
||||||
raise ExtractorError(
|
|
||||||
'{0} is offline'.format(channel_name), expected=True)
|
|
||||||
|
|
||||||
channel_id = chan['id']
|
|
||||||
|
|
||||||
def manifest_url(kind):
|
|
||||||
return self._MANIFEST_URL_TEMPLATE % (channel_id, kind)
|
|
||||||
|
|
||||||
formats = self._extract_m3u8_formats(
|
|
||||||
manifest_url('m3u8'), channel_name, ext='mp4', m3u8_id='hls',
|
|
||||||
fatal=False)
|
|
||||||
formats.extend(self._extract_smil_formats(
|
|
||||||
manifest_url('smil'), channel_name, fatal=False))
|
|
||||||
self._sort_formats(formats)
|
|
||||||
|
|
||||||
info = {
|
|
||||||
'id': compat_str(chan.get('id') or channel_name),
|
|
||||||
'title': self._live_title(chan.get('name') or channel_name),
|
|
||||||
'description': clean_html(chan.get('description')),
|
|
||||||
'thumbnail': try_get(
|
|
||||||
chan, lambda x: x['thumbnail']['url'], compat_str),
|
|
||||||
'timestamp': parse_iso8601(chan.get('updatedAt')),
|
|
||||||
'is_live': True,
|
|
||||||
'view_count': int_or_none(chan.get('viewersTotal')),
|
|
||||||
'formats': formats,
|
|
||||||
}
|
|
||||||
info.update(self._extract_channel_info(chan))
|
|
||||||
|
|
||||||
return info
|
|
||||||
|
|
||||||
|
|
||||||
class BeamProVodIE(BeamProBaseIE):
|
|
||||||
IE_NAME = 'Mixer:vod'
|
|
||||||
_VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/[^/?#&]+\?.*?\bvod=(?P<id>[^?#&]+)'
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'https://mixer.com/willow8714?vod=2259830',
|
|
||||||
'md5': 'b2431e6e8347dc92ebafb565d368b76b',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '2259830',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'willow8714\'s Channel',
|
|
||||||
'duration': 6828.15,
|
|
||||||
'thumbnail': r're:https://.*source\.png$',
|
|
||||||
'timestamp': 1494046474,
|
|
||||||
'upload_date': '20170506',
|
|
||||||
'uploader': 'willow8714',
|
|
||||||
'uploader_id': '6085379',
|
|
||||||
'age_limit': 13,
|
|
||||||
'view_count': int,
|
|
||||||
},
|
|
||||||
'params': {
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
'url': 'https://mixer.com/streamer?vod=IxFno1rqC0S_XJ1a2yGgNw',
|
|
||||||
'only_matching': True,
|
|
||||||
}, {
|
|
||||||
'url': 'https://mixer.com/streamer?vod=Rh3LY0VAqkGpEQUe2pN-ig',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _extract_format(vod, vod_type):
|
|
||||||
if not vod.get('baseUrl'):
|
|
||||||
return []
|
|
||||||
|
|
||||||
if vod_type == 'hls':
|
|
||||||
filename, protocol = 'manifest.m3u8', 'm3u8_native'
|
|
||||||
elif vod_type == 'raw':
|
|
||||||
filename, protocol = 'source.mp4', 'https'
|
|
||||||
else:
|
|
||||||
assert False
|
|
||||||
|
|
||||||
data = vod.get('data') if isinstance(vod.get('data'), dict) else {}
|
|
||||||
|
|
||||||
format_id = [vod_type]
|
|
||||||
if isinstance(data.get('Height'), compat_str):
|
|
||||||
format_id.append('%sp' % data['Height'])
|
|
||||||
|
|
||||||
return [{
|
|
||||||
'url': urljoin(vod['baseUrl'], filename),
|
|
||||||
'format_id': '-'.join(format_id),
|
|
||||||
'ext': 'mp4',
|
|
||||||
'protocol': protocol,
|
|
||||||
'width': int_or_none(data.get('Width')),
|
|
||||||
'height': int_or_none(data.get('Height')),
|
|
||||||
'fps': int_or_none(data.get('Fps')),
|
|
||||||
'tbr': int_or_none(data.get('Bitrate'), 1000),
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
vod_id = self._match_id(url)
|
|
||||||
|
|
||||||
vod_info = self._download_json(
|
|
||||||
'%s/recordings/%s' % (self._API_BASE, vod_id), vod_id)
|
|
||||||
|
|
||||||
state = vod_info.get('state')
|
|
||||||
if state != 'AVAILABLE':
|
|
||||||
raise ExtractorError(
|
|
||||||
'VOD %s is not available (state: %s)' % (vod_id, state),
|
|
||||||
expected=True)
|
|
||||||
|
|
||||||
formats = []
|
|
||||||
thumbnail_url = None
|
|
||||||
|
|
||||||
for vod in vod_info['vods']:
|
|
||||||
vod_type = vod.get('format')
|
|
||||||
if vod_type in ('hls', 'raw'):
|
|
||||||
formats.extend(self._extract_format(vod, vod_type))
|
|
||||||
elif vod_type == 'thumbnail':
|
|
||||||
thumbnail_url = urljoin(vod.get('baseUrl'), 'source.png')
|
|
||||||
|
|
||||||
self._sort_formats(formats)
|
|
||||||
|
|
||||||
info = {
|
|
||||||
'id': vod_id,
|
|
||||||
'title': vod_info.get('name') or vod_id,
|
|
||||||
'duration': float_or_none(vod_info.get('duration')),
|
|
||||||
'thumbnail': thumbnail_url,
|
|
||||||
'timestamp': parse_iso8601(vod_info.get('createdAt')),
|
|
||||||
'view_count': int_or_none(vod_info.get('viewsTotal')),
|
|
||||||
'formats': formats,
|
|
||||||
}
|
|
||||||
info.update(self._extract_channel_info(vod_info.get('channel') or {}))
|
|
||||||
|
|
||||||
return info
|
|
103
youtube_dl/extractor/bfmtv.py
Normal file
@ -0,0 +1,103 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import re
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
from ..utils import extract_attributes
|
||||||
|
|
||||||
|
|
||||||
|
class BFMTVBaseIE(InfoExtractor):
|
||||||
|
_VALID_URL_BASE = r'https?://(?:www\.)?bfmtv\.com/'
|
||||||
|
_VALID_URL_TMPL = _VALID_URL_BASE + r'(?:[^/]+/)*[^/?&#]+_%s[A-Z]-(?P<id>\d{12})\.html'
|
||||||
|
_VIDEO_BLOCK_REGEX = r'(<div[^>]+class="video_block"[^>]*>)'
|
||||||
|
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_default/index.html?videoId=%s'
|
||||||
|
|
||||||
|
def _brightcove_url_result(self, video_id, video_block):
|
||||||
|
account_id = video_block.get('accountid') or '876450612001'
|
||||||
|
player_id = video_block.get('playerid') or 'I2qBTln4u'
|
||||||
|
return self.url_result(
|
||||||
|
self.BRIGHTCOVE_URL_TEMPLATE % (account_id, player_id, video_id),
|
||||||
|
'BrightcoveNew', video_id)
|
||||||
|
|
||||||
|
|
||||||
|
class BFMTVIE(BFMTVBaseIE):
|
||||||
|
IE_NAME = 'bfmtv'
|
||||||
|
_VALID_URL = BFMTVBaseIE._VALID_URL_TMPL % 'V'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://www.bfmtv.com/politique/emmanuel-macron-l-islam-est-une-religion-qui-vit-une-crise-aujourd-hui-partout-dans-le-monde_VN-202010020146.html',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '6196747868001',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Emmanuel Macron: "L\'Islam est une religion qui vit une crise aujourd’hui, partout dans le monde"',
|
||||||
|
'description': 'Le Président s\'exprime sur la question du séparatisme depuis les Mureaux, dans les Yvelines.',
|
||||||
|
'uploader_id': '876450610001',
|
||||||
|
'upload_date': '20201002',
|
||||||
|
'timestamp': 1601629620,
|
||||||
|
},
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
bfmtv_id = self._match_id(url)
|
||||||
|
webpage = self._download_webpage(url, bfmtv_id)
|
||||||
|
video_block = extract_attributes(self._search_regex(
|
||||||
|
self._VIDEO_BLOCK_REGEX, webpage, 'video block'))
|
||||||
|
return self._brightcove_url_result(video_block['videoid'], video_block)
|
||||||
|
|
||||||
|
|
||||||
|
class BFMTVLiveIE(BFMTVIE):
|
||||||
|
IE_NAME = 'bfmtv:live'
|
||||||
|
_VALID_URL = BFMTVBaseIE._VALID_URL_BASE + '(?P<id>(?:[^/]+/)?en-direct)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://www.bfmtv.com/en-direct/',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '5615950982001',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': r're:^le direct BFMTV WEB \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
|
||||||
|
'uploader_id': '876450610001',
|
||||||
|
'upload_date': '20171018',
|
||||||
|
'timestamp': 1508329950,
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.bfmtv.com/economie/en-direct/',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
|
||||||
|
|
||||||
|
class BFMTVArticleIE(BFMTVBaseIE):
|
||||||
|
IE_NAME = 'bfmtv:article'
|
||||||
|
_VALID_URL = BFMTVBaseIE._VALID_URL_TMPL % 'A'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://www.bfmtv.com/sante/covid-19-un-responsable-de-l-institut-pasteur-se-demande-quand-la-france-va-se-reconfiner_AV-202101060198.html',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '202101060198',
|
||||||
|
'title': 'Covid-19: un responsable de l\'Institut Pasteur se demande "quand la France va se reconfiner"',
|
||||||
|
'description': 'md5:947974089c303d3ac6196670ae262843',
|
||||||
|
},
|
||||||
|
'playlist_count': 2,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.bfmtv.com/international/pour-bolsonaro-le-bresil-est-en-faillite-mais-il-ne-peut-rien-faire_AD-202101060232.html',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.bfmtv.com/sante/covid-19-oui-le-vaccin-de-pfizer-distribue-en-france-a-bien-ete-teste-sur-des-personnes-agees_AN-202101060275.html',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
bfmtv_id = self._match_id(url)
|
||||||
|
webpage = self._download_webpage(url, bfmtv_id)
|
||||||
|
|
||||||
|
entries = []
|
||||||
|
for video_block_el in re.findall(self._VIDEO_BLOCK_REGEX, webpage):
|
||||||
|
video_block = extract_attributes(video_block_el)
|
||||||
|
video_id = video_block.get('videoid')
|
||||||
|
if not video_id:
|
||||||
|
continue
|
||||||
|
entries.append(self._brightcove_url_result(video_id, video_block))
|
||||||
|
|
||||||
|
return self.playlist_result(
|
||||||
|
entries, bfmtv_id, self._og_search_title(webpage, fatal=False),
|
||||||
|
self._html_search_meta(['og:description', 'description'], webpage))
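The way `_VALID_URL_TMPL` is specialised with `% 'V'` for video pages and `% 'A'` for article pages can be checked in isolation. The following is a minimal sketch, not part of the commit: the two regex strings are copied from `BFMTVBaseIE` above, while `match_id` is a hypothetical stand-in for the extractor's `_match_id` helper, and the code assumes Python 3.

```python
import re

# Copied from BFMTVBaseIE in the diff above.
VALID_URL_BASE = r'https?://(?:www\.)?bfmtv\.com/'
VALID_URL_TMPL = VALID_URL_BASE + r'(?:[^/]+/)*[^/?&#]+_%s[A-Z]-(?P<id>\d{12})\.html'

# 'V' selects video pages (BFMTVIE), 'A' selects article pages (BFMTVArticleIE).
VIDEO_URL_RE = re.compile(VALID_URL_TMPL % 'V')
ARTICLE_URL_RE = re.compile(VALID_URL_TMPL % 'A')


def match_id(pattern, url):
    """Hypothetical helper: return the 12-digit id, or None if the URL does not match."""
    m = pattern.match(url)
    return m.group('id') if m else None


# URLs taken from the _TESTS entries above.
video_url = ('https://www.bfmtv.com/politique/emmanuel-macron-l-islam-est-une-religion-'
             'qui-vit-une-crise-aujourd-hui-partout-dans-le-monde_VN-202010020146.html')
article_url = ('https://www.bfmtv.com/sante/covid-19-un-responsable-de-l-institut-pasteur-'
               'se-demande-quand-la-france-va-se-reconfiner_AV-202101060198.html')

print(match_id(VIDEO_URL_RE, video_url))
print(match_id(ARTICLE_URL_RE, article_url))
```

Because the letter substituted into the template must directly follow the underscore, an `_AV-` article URL is not picked up by the video pattern, which is presumably why the two extractors can coexist without a `suitable()` override.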
|
30
youtube_dl/extractor/bibeltv.py
Normal file
@ -0,0 +1,30 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
|
||||||
|
|
||||||
|
class BibelTVIE(InfoExtractor):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?bibeltv\.de/mediathek/videos/(?:crn/)?(?P<id>\d+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://www.bibeltv.de/mediathek/videos/329703-sprachkurs-in-malaiisch',
|
||||||
|
'md5': '252f908192d611de038b8504b08bf97f',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'ref:329703',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Sprachkurs in Malaiisch',
|
||||||
|
'description': 'md5:3e9f197d29ee164714e67351cf737dfe',
|
||||||
|
'timestamp': 1608316701,
|
||||||
|
'uploader_id': '5840105145001',
|
||||||
|
'upload_date': '20201218',
|
||||||
|
}
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.bibeltv.de/mediathek/videos/crn/326374',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/5840105145001/default_default/index.html?videoId=ref:%s'
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
crn_id = self._match_id(url)
|
||||||
|
return self.url_result(
|
||||||
|
self.BRIGHTCOVE_URL_TEMPLATE % crn_id, 'BrightcoveNew')
|
@ -90,13 +90,19 @@ class BleacherReportCMSIE(AMPIE):
|
|||||||
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36}|\d{5})'
|
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36}|\d{5})'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1&library=video-cms',
|
'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1&library=video-cms',
|
||||||
'md5': '2e4b0a997f9228ffa31fada5c53d1ed1',
|
'md5': '670b2d73f48549da032861130488c681',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
|
'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
|
||||||
'ext': 'flv',
|
'ext': 'mp4',
|
||||||
'title': 'Cena vs. Rollins Would Expose the Heavyweight Division',
|
'title': 'Cena vs. Rollins Would Expose the Heavyweight Division',
|
||||||
'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e',
|
'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e',
|
||||||
|
'upload_date': '20150723',
|
||||||
|
'timestamp': 1437679032,
|
||||||
|
|
||||||
},
|
},
|
||||||
|
'expected_warnings': [
|
||||||
|
'Unable to download f4m manifest'
|
||||||
|
]
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
|
60
youtube_dl/extractor/bongacams.py
Normal file
@ -0,0 +1,60 @@
|
|||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import re
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
from ..compat import compat_str
|
||||||
|
from ..utils import (
|
||||||
|
int_or_none,
|
||||||
|
try_get,
|
||||||
|
urlencode_postdata,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class BongaCamsIE(InfoExtractor):
|
||||||
|
_VALID_URL = r'https?://(?P<host>(?:[^/]+\.)?bongacams\d*\.com)/(?P<id>[^/?&#]+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://de.bongacams.com/azumi-8',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://cn.bongacams.com/azumi-8',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
mobj = re.match(self._VALID_URL, url)
|
||||||
|
host = mobj.group('host')
|
||||||
|
channel_id = mobj.group('id')
|
||||||
|
|
||||||
|
amf = self._download_json(
|
||||||
|
'https://%s/tools/amf.php' % host, channel_id,
|
||||||
|
data=urlencode_postdata((
|
||||||
|
('method', 'getRoomData'),
|
||||||
|
('args[]', channel_id),
|
||||||
|
('args[]', 'false'),
|
||||||
|
)), headers={'X-Requested-With': 'XMLHttpRequest'})
|
||||||
|
|
||||||
|
server_url = amf['localData']['videoServerUrl']
|
||||||
|
|
||||||
|
uploader_id = try_get(
|
||||||
|
amf, lambda x: x['performerData']['username'], compat_str) or channel_id
|
||||||
|
uploader = try_get(
|
||||||
|
amf, lambda x: x['performerData']['displayName'], compat_str)
|
||||||
|
like_count = int_or_none(try_get(
|
||||||
|
amf, lambda x: x['performerData']['loversCount']))
|
||||||
|
|
||||||
|
formats = self._extract_m3u8_formats(
|
||||||
|
'%s/hls/stream_%s/playlist.m3u8' % (server_url, uploader_id),
|
||||||
|
channel_id, 'mp4', m3u8_id='hls', live=True)
|
||||||
|
self._sort_formats(formats)
|
||||||
|
|
||||||
|
return {
|
||||||
|
'id': channel_id,
|
||||||
|
'title': self._live_title(uploader or uploader_id),
|
||||||
|
'uploader': uploader,
|
||||||
|
'uploader_id': uploader_id,
|
||||||
|
'like_count': like_count,
|
||||||
|
'age_limit': 18,
|
||||||
|
'is_live': True,
|
||||||
|
'formats': formats,
|
||||||
|
}
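The POST body for `amf.php` is built from a tuple of pairs rather than a dict because the `args[]` key is repeated. The following is a rough sketch of that encoding, not part of the commit: it uses the standard library's `urllib.parse.urlencode` as a stand-in for youtube-dl's `urlencode_postdata`, and `build_room_data_body` is a hypothetical name introduced only for illustration. It assumes Python 3.

```python
from urllib.parse import parse_qs, urlencode


def build_room_data_body(channel_id):
    """Hypothetical helper: form-encode the getRoomData request body.

    A sequence of pairs keeps both 'args[]' entries and their order;
    a dict would collapse the repeated key into a single entry.
    """
    pairs = (
        ('method', 'getRoomData'),
        ('args[]', channel_id),
        ('args[]', 'false'),
    )
    # POST data must be bytes; the encoded form is plain ASCII.
    return urlencode(pairs).encode('ascii')


body = build_room_data_body('azumi-8')
print(body)
print(parse_qs(body.decode('ascii')))
```

Decoding the body with `parse_qs` shows both `args[]` values surviving in order, which is the property the tuple-of-pairs form is meant to preserve.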
|
@ -12,7 +12,7 @@ from ..utils import (
|
|||||||
|
|
||||||
|
|
||||||
class BravoTVIE(AdobePassIE):
|
class BravoTVIE(AdobePassIE):
|
||||||
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
|
_VALID_URL = r'https?://(?:www\.)?(?P<req_id>bravotv|oxygen)\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is',
|
'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is',
|
||||||
'md5': 'e34684cfea2a96cd2ee1ef3a60909de9',
|
'md5': 'e34684cfea2a96cd2ee1ef3a60909de9',
|
||||||
@ -28,10 +28,13 @@ class BravoTVIE(AdobePassIE):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
|
'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.oxygen.com/in-ice-cold-blood/season-2/episode-16/videos/handling-the-horwitz-house-after-the-murder-season-2',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url)
|
site, display_id = re.match(self._VALID_URL, url).groups()
|
||||||
webpage = self._download_webpage(url, display_id)
|
webpage = self._download_webpage(url, display_id)
|
||||||
settings = self._parse_json(self._search_regex(
|
settings = self._parse_json(self._search_regex(
|
||||||
r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'),
|
r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'),
|
||||||
@ -53,11 +56,14 @@ class BravoTVIE(AdobePassIE):
|
|||||||
tp_path = release_pid = tve['release_pid']
|
tp_path = release_pid = tve['release_pid']
|
||||||
if tve.get('entitlement') == 'auth':
|
if tve.get('entitlement') == 'auth':
|
||||||
adobe_pass = settings.get('tve_adobe_auth', {})
|
adobe_pass = settings.get('tve_adobe_auth', {})
|
||||||
|
if site == 'bravotv':
|
||||||
|
site = 'bravo'
|
||||||
resource = self._get_mvpd_resource(
|
resource = self._get_mvpd_resource(
|
||||||
adobe_pass.get('adobePassResourceId', 'bravo'),
|
adobe_pass.get('adobePassResourceId') or site,
|
||||||
tve['title'], release_pid, tve.get('rating'))
|
tve['title'], release_pid, tve.get('rating'))
|
||||||
query['auth'] = self._extract_mvpd_auth(
|
query['auth'] = self._extract_mvpd_auth(
|
||||||
url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource)
|
url, release_pid,
|
||||||
|
adobe_pass.get('adobePassRequestorId') or site, resource)
|
||||||
else:
|
else:
|
||||||
shared_playlist = settings['ls_playlist']
|
shared_playlist = settings['ls_playlist']
|
||||||
account_pid = shared_playlist['account_pid']
|
account_pid = shared_playlist['account_pid']
|
||||||
|
@ -28,6 +28,7 @@ from ..utils import (
|
|||||||
parse_iso8601,
|
parse_iso8601,
|
||||||
smuggle_url,
|
smuggle_url,
|
||||||
str_or_none,
|
str_or_none,
|
||||||
|
try_get,
|
||||||
unescapeHTML,
|
unescapeHTML,
|
||||||
unsmuggle_url,
|
unsmuggle_url,
|
||||||
UnsupportedError,
|
UnsupportedError,
|
||||||
@ -470,13 +471,18 @@ class BrightcoveNewIE(AdobePassIE):
|
|||||||
def _parse_brightcove_metadata(self, json_data, video_id, headers={}):
|
def _parse_brightcove_metadata(self, json_data, video_id, headers={}):
|
||||||
title = json_data['name'].strip()
|
title = json_data['name'].strip()
|
||||||
|
|
||||||
|
num_drm_sources = 0
|
||||||
formats = []
|
formats = []
|
||||||
for source in json_data.get('sources', []):
|
sources = json_data.get('sources') or []
|
||||||
|
for source in sources:
|
||||||
container = source.get('container')
|
container = source.get('container')
|
||||||
ext = mimetype2ext(source.get('type'))
|
ext = mimetype2ext(source.get('type'))
|
||||||
src = source.get('src')
|
src = source.get('src')
|
||||||
# https://support.brightcove.com/playback-api-video-fields-reference#key_systems_object
|
# https://support.brightcove.com/playback-api-video-fields-reference#key_systems_object
|
||||||
if ext == 'ism' or container == 'WVM' or source.get('key_systems'):
|
if container == 'WVM' or source.get('key_systems'):
|
||||||
|
num_drm_sources += 1
|
||||||
|
continue
|
||||||
|
elif ext == 'ism':
|
||||||
continue
|
continue
|
||||||
elif ext == 'm3u8' or container == 'M2TS':
|
elif ext == 'm3u8' or container == 'M2TS':
|
||||||
if not src:
|
if not src:
|
||||||
@ -533,20 +539,15 @@ class BrightcoveNewIE(AdobePassIE):
|
|||||||
'format_id': build_format_id('rtmp'),
|
'format_id': build_format_id('rtmp'),
|
||||||
})
|
})
|
||||||
formats.append(f)
|
formats.append(f)
|
||||||
if not formats:
|
|
||||||
# for sonyliv.com DRM protected videos
|
|
||||||
s3_source_url = json_data.get('custom_fields', {}).get('s3sourceurl')
|
|
||||||
if s3_source_url:
|
|
||||||
formats.append({
|
|
||||||
'url': s3_source_url,
|
|
||||||
'format_id': 'source',
|
|
||||||
})
|
|
||||||
|
|
||||||
errors = json_data.get('errors')
|
if not formats:
|
||||||
if not formats and errors:
|
errors = json_data.get('errors')
|
||||||
error = errors[0]
|
if errors:
|
||||||
raise ExtractorError(
|
error = errors[0]
|
||||||
error.get('message') or error.get('error_subcode') or error['error_code'], expected=True)
|
raise ExtractorError(
|
||||||
|
error.get('message') or error.get('error_subcode') or error['error_code'], expected=True)
|
||||||
|
if sources and num_drm_sources == len(sources):
|
||||||
|
raise ExtractorError('This video is DRM protected.', expected=True)
|
||||||
|
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
@ -600,24 +601,27 @@ class BrightcoveNewIE(AdobePassIE):
|
|||||||
store_pk = lambda x: self._downloader.cache.store('brightcove', policy_key_id, x)
|
store_pk = lambda x: self._downloader.cache.store('brightcove', policy_key_id, x)
|
||||||
|
|
||||||
def extract_policy_key():
|
def extract_policy_key():
|
||||||
webpage = self._download_webpage(
|
base_url = 'http://players.brightcove.net/%s/%s_%s/' % (account_id, player_id, embed)
|
||||||
'http://players.brightcove.net/%s/%s_%s/index.min.js'
|
config = self._download_json(
|
||||||
% (account_id, player_id, embed), video_id)
|
base_url + 'config.json', video_id, fatal=False) or {}
|
||||||
|
policy_key = try_get(
|
||||||
policy_key = None
|
config, lambda x: x['video_cloud']['policy_key'])
|
||||||
|
|
||||||
catalog = self._search_regex(
|
|
||||||
r'catalog\(({.+?})\);', webpage, 'catalog', default=None)
|
|
||||||
if catalog:
|
|
||||||
catalog = self._parse_json(
|
|
||||||
js_to_json(catalog), video_id, fatal=False)
|
|
||||||
if catalog:
|
|
||||||
policy_key = catalog.get('policyKey')
|
|
||||||
|
|
||||||
if not policy_key:
|
if not policy_key:
|
||||||
policy_key = self._search_regex(
|
webpage = self._download_webpage(
|
||||||
r'policyKey\s*:\s*(["\'])(?P<pk>.+?)\1',
|
base_url + 'index.min.js', video_id)
|
||||||
webpage, 'policy key', group='pk')
|
|
||||||
|
catalog = self._search_regex(
|
||||||
|
r'catalog\(({.+?})\);', webpage, 'catalog', default=None)
|
||||||
|
if catalog:
|
||||||
|
catalog = self._parse_json(
|
||||||
|
js_to_json(catalog), video_id, fatal=False)
|
||||||
|
if catalog:
|
||||||
|
policy_key = catalog.get('policyKey')
|
||||||
|
|
||||||
|
if not policy_key:
|
||||||
|
policy_key = self._search_regex(
|
||||||
|
r'policyKey\s*:\s*(["\'])(?P<pk>.+?)\1',
|
||||||
|
webpage, 'policy key', group='pk')
|
||||||
|
|
||||||
store_pk(policy_key)
|
store_pk(policy_key)
|
||||||
return policy_key
|
return policy_key
|
||||||
|
@ -8,18 +8,20 @@ from .gigya import GigyaBaseIE
|
|||||||
from ..compat import compat_HTTPError
|
from ..compat import compat_HTTPError
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
strip_or_none,
|
clean_html,
|
||||||
|
extract_attributes,
|
||||||
float_or_none,
|
float_or_none,
|
||||||
|
get_element_by_class,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
merge_dicts,
|
merge_dicts,
|
||||||
parse_iso8601,
|
|
||||||
str_or_none,
|
str_or_none,
|
||||||
|
strip_or_none,
|
||||||
url_or_none,
|
url_or_none,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class CanvasIE(InfoExtractor):
|
class CanvasIE(InfoExtractor):
|
||||||
_VALID_URL = r'https?://mediazone\.vrt\.be/api/v1/(?P<site_id>canvas|een|ketnet|vrt(?:video|nieuws)|sporza)/assets/(?P<id>[^/?#&]+)'
|
_VALID_URL = r'https?://mediazone\.vrt\.be/api/v1/(?P<site_id>canvas|een|ketnet|vrt(?:video|nieuws)|sporza|dako)/assets/(?P<id>[^/?#&]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://mediazone.vrt.be/api/v1/ketnet/assets/md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
|
'url': 'https://mediazone.vrt.be/api/v1/ketnet/assets/md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
|
||||||
'md5': '68993eda72ef62386a15ea2cf3c93107',
|
'md5': '68993eda72ef62386a15ea2cf3c93107',
|
||||||
@ -37,6 +39,7 @@ class CanvasIE(InfoExtractor):
|
|||||||
'url': 'https://mediazone.vrt.be/api/v1/canvas/assets/mz-ast-5e5f90b6-2d72-4c40-82c2-e134f884e93e',
|
'url': 'https://mediazone.vrt.be/api/v1/canvas/assets/mz-ast-5e5f90b6-2d72-4c40-82c2-e134f884e93e',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
_GEO_BYPASS = False
|
||||||
_HLS_ENTRY_PROTOCOLS_MAP = {
|
_HLS_ENTRY_PROTOCOLS_MAP = {
|
||||||
'HLS': 'm3u8_native',
|
'HLS': 'm3u8_native',
|
||||||
'HLS_AES': 'm3u8',
|
'HLS_AES': 'm3u8',
|
||||||
@ -47,29 +50,34 @@ class CanvasIE(InfoExtractor):
|
|||||||
mobj = re.match(self._VALID_URL, url)
|
mobj = re.match(self._VALID_URL, url)
|
||||||
site_id, video_id = mobj.group('site_id'), mobj.group('id')
|
site_id, video_id = mobj.group('site_id'), mobj.group('id')
|
||||||
|
|
||||||
# Old API endpoint, serves more formats but may fail for some videos
|
data = None
|
||||||
data = self._download_json(
|
if site_id != 'vrtvideo':
|
||||||
'https://mediazone.vrt.be/api/v1/%s/assets/%s'
|
# Old API endpoint, serves more formats but may fail for some videos
|
||||||
% (site_id, video_id), video_id, 'Downloading asset JSON',
|
data = self._download_json(
|
||||||
'Unable to download asset JSON', fatal=False)
|
'https://mediazone.vrt.be/api/v1/%s/assets/%s'
|
||||||
|
% (site_id, video_id), video_id, 'Downloading asset JSON',
|
||||||
|
'Unable to download asset JSON', fatal=False)
|
||||||
|
|
||||||
# New API endpoint
|
# New API endpoint
|
||||||
if not data:
|
if not data:
|
||||||
|
headers = self.geo_verification_headers()
|
||||||
|
headers.update({'Content-Type': 'application/json'})
|
||||||
token = self._download_json(
|
token = self._download_json(
|
||||||
'%s/tokens' % self._REST_API_BASE, video_id,
|
'%s/tokens' % self._REST_API_BASE, video_id,
|
||||||
'Downloading token', data=b'',
|
'Downloading token', data=b'', headers=headers)['vrtPlayerToken']
|
||||||
headers={'Content-Type': 'application/json'})['vrtPlayerToken']
|
|
||||||
data = self._download_json(
|
data = self._download_json(
|
||||||
'%s/videos/%s' % (self._REST_API_BASE, video_id),
|
'%s/videos/%s' % (self._REST_API_BASE, video_id),
|
||||||
video_id, 'Downloading video JSON', fatal=False, query={
|
video_id, 'Downloading video JSON', query={
|
||||||
'vrtPlayerToken': token,
|
'vrtPlayerToken': token,
|
||||||
'client': '%s@PROD' % site_id,
|
'client': '%s@PROD' % site_id,
|
||||||
}, expected_status=400)
|
}, expected_status=400)
|
||||||
message = data.get('message')
|
if not data.get('title'):
|
||||||
if message and not data.get('title'):
|
code = data.get('code')
|
||||||
if data.get('code') == 'AUTHENTICATION_REQUIRED':
|
if code == 'AUTHENTICATION_REQUIRED':
|
||||||
self.raise_login_required(message)
|
self.raise_login_required()
|
||||||
raise ExtractorError(message, expected=True)
|
elif code == 'INVALID_LOCATION':
|
||||||
|
self.raise_geo_restricted(countries=['BE'])
|
||||||
|
raise ExtractorError(data.get('message') or code, expected=True)
|
||||||
|
|
||||||
title = data['title']
|
title = data['title']
|
||||||
description = data.get('description')
|
description = data.get('description')
|
||||||
@ -205,20 +213,24 @@ class CanvasEenIE(InfoExtractor):
|
|||||||
|
|
||||||
class VrtNUIE(GigyaBaseIE):
|
class VrtNUIE(GigyaBaseIE):
|
||||||
IE_DESC = 'VrtNU.be'
|
IE_DESC = 'VrtNU.be'
|
||||||
_VALID_URL = r'https?://(?:www\.)?vrt\.be/(?P<site_id>vrtnu)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
_VALID_URL = r'https?://(?:www\.)?vrt\.be/vrtnu/a-z/(?:[^/]+/){2}(?P<id>[^/?#&]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
# Available via old API endpoint
|
# Available via old API endpoint
|
||||||
'url': 'https://www.vrt.be/vrtnu/a-z/postbus-x/1/postbus-x-s1a1/',
|
'url': 'https://www.vrt.be/vrtnu/a-z/postbus-x/1989/postbus-x-s1989a1/',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'pbs-pub-2e2d8c27-df26-45c9-9dc6-90c78153044d$vid-90c932b1-e21d-4fb8-99b1-db7b49cf74de',
|
'id': 'pbs-pub-e8713dac-899e-41de-9313-81269f4c04ac$vid-90c932b1-e21d-4fb8-99b1-db7b49cf74de',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'De zwarte weduwe',
|
'title': 'Postbus X - Aflevering 1 (Seizoen 1989)',
|
||||||
'description': 'md5:db1227b0f318c849ba5eab1fef895ee4',
|
'description': 'md5:b704f669eb9262da4c55b33d7c6ed4b7',
|
||||||
'duration': 1457.04,
|
'duration': 1457.04,
|
||||||
'thumbnail': r're:^https?://.*\.jpg$',
|
'thumbnail': r're:^https?://.*\.jpg$',
|
||||||
'season': 'Season 1',
|
'series': 'Postbus X',
|
||||||
'season_number': 1,
|
'season': 'Seizoen 1989',
|
||||||
|
'season_number': 1989,
|
||||||
|
'episode': 'De zwarte weduwe',
|
||||||
'episode_number': 1,
|
'episode_number': 1,
|
||||||
|
'timestamp': 1595822400,
|
||||||
|
'upload_date': '20200727',
|
||||||
},
|
},
|
||||||
'skip': 'This video is only available for registered users',
|
'skip': 'This video is only available for registered users',
|
||||||
'params': {
|
'params': {
|
||||||
@ -300,69 +312,73 @@ class VrtNUIE(GigyaBaseIE):
|
|||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url)
|
display_id = self._match_id(url)
|
||||||
|
|
||||||
webpage, urlh = self._download_webpage_handle(url, display_id)
|
webpage = self._download_webpage(url, display_id)
|
||||||
|
|
||||||
|
attrs = extract_attributes(self._search_regex(
|
||||||
|
r'(<nui-media[^>]+>)', webpage, 'media element'))
|
||||||
|
video_id = attrs['videoid']
|
||||||
|
publication_id = attrs.get('publicationid')
|
||||||
|
if publication_id:
|
||||||
|
video_id = publication_id + '$' + video_id
|
||||||
|
|
||||||
|
page = (self._parse_json(self._search_regex(
|
||||||
|
r'digitalData\s*=\s*({.+?});', webpage, 'digital data',
|
||||||
|
default='{}'), video_id, fatal=False) or {}).get('page') or {}
|
||||||
|
|
||||||
info = self._search_json_ld(webpage, display_id, default={})
|
info = self._search_json_ld(webpage, display_id, default={})
|
||||||
|
|
||||||
# title is optional here since it may be extracted by extractor
|
|
||||||
# that is delegated from here
|
|
||||||
title = strip_or_none(self._html_search_regex(
|
|
||||||
r'(?ms)<h1 class="content__heading">(.+?)</h1>',
|
|
||||||
webpage, 'title', default=None))
|
|
||||||
|
|
||||||
description = self._html_search_regex(
|
|
||||||
r'(?ms)<div class="content__description">(.+?)</div>',
|
|
||||||
webpage, 'description', default=None)
|
|
||||||
|
|
||||||
season = self._html_search_regex(
|
|
||||||
[r'''(?xms)<div\ class="tabs__tab\ tabs__tab--active">\s*
|
|
||||||
<span>seizoen\ (.+?)</span>\s*
|
|
||||||
</div>''',
|
|
||||||
r'<option value="seizoen (\d{1,3})" data-href="[^"]+?" selected>'],
|
|
||||||
webpage, 'season', default=None)
|
|
||||||
|
|
||||||
season_number = int_or_none(season)
|
|
||||||
|
|
||||||
episode_number = int_or_none(self._html_search_regex(
|
|
||||||
r'''(?xms)<div\ class="content__episode">\s*
|
|
||||||
<abbr\ title="aflevering">afl</abbr>\s*<span>(\d+)</span>
|
|
||||||
</div>''',
|
|
||||||
webpage, 'episode_number', default=None))
|
|
||||||
|
|
||||||
release_date = parse_iso8601(self._html_search_regex(
|
|
||||||
r'(?ms)<div class="content__broadcastdate">\s*<time\ datetime="(.+?)"',
|
|
||||||
webpage, 'release_date', default=None))
|
|
||||||
|
|
||||||
# If there's a ? or a # in the URL, remove them and everything after
|
|
||||||
clean_url = urlh.geturl().split('?')[0].split('#')[0].strip('/')
|
|
||||||
securevideo_url = clean_url + '.mssecurevideo.json'
|
|
||||||
|
|
||||||
try:
|
|
||||||
video = self._download_json(securevideo_url, display_id)
|
|
||||||
except ExtractorError as e:
|
|
||||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
|
|
||||||
self.raise_login_required()
|
|
||||||
raise
|
|
||||||
|
|
||||||
# We are dealing with a '../<show>.relevant' URL
|
|
||||||
redirect_url = video.get('url')
|
|
||||||
if redirect_url:
|
|
||||||
return self.url_result(self._proto_relative_url(redirect_url, 'https:'))
|
|
||||||
|
|
||||||
# There is only one entry, but with an unknown key, so just get
|
|
||||||
# the first one
|
|
||||||
video_id = list(video.values())[0].get('videoid')
|
|
||||||
|
|
||||||
return merge_dicts(info, {
|
return merge_dicts(info, {
|
||||||
'_type': 'url_transparent',
|
'_type': 'url_transparent',
|
||||||
'url': 'https://mediazone.vrt.be/api/v1/vrtvideo/assets/%s' % video_id,
|
'url': 'https://mediazone.vrt.be/api/v1/vrtvideo/assets/%s' % video_id,
|
||||||
'ie_key': CanvasIE.ie_key(),
|
'ie_key': CanvasIE.ie_key(),
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
'display_id': display_id,
|
'display_id': display_id,
|
||||||
|
'season_number': int_or_none(page.get('episode_season')),
|
||||||
|
})
|
||||||
|
|
||||||
|
|
||||||
|
class DagelijkseKostIE(InfoExtractor):
|
||||||
|
IE_DESC = 'dagelijksekost.een.be'
|
||||||
|
_VALID_URL = r'https?://dagelijksekost\.een\.be/gerechten/(?P<id>[^/?#&]+)'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'https://dagelijksekost.een.be/gerechten/hachis-parmentier-met-witloof',
|
||||||
|
'md5': '30bfffc323009a3e5f689bef6efa2365',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'md-ast-27a4d1ff-7d7b-425e-b84f-a4d227f592fa',
|
||||||
|
'display_id': 'hachis-parmentier-met-witloof',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Hachis parmentier met witloof',
|
||||||
|
'description': 'md5:9960478392d87f63567b5b117688cdc5',
|
||||||
|
'thumbnail': r're:^https?://.*\.jpg$',
|
||||||
|
'duration': 283.02,
|
||||||
|
},
|
||||||
|
'expected_warnings': ['is not a supported codec'],
|
||||||
|
}
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
display_id = self._match_id(url)
|
||||||
|
webpage = self._download_webpage(url, display_id)
|
||||||
|
|
||||||
|
title = strip_or_none(get_element_by_class(
|
||||||
|
'dish-metadata__title', webpage
|
||||||
|
) or self._html_search_meta(
|
||||||
|
'twitter:title', webpage))
|
||||||
|
|
||||||
|
description = clean_html(get_element_by_class(
|
||||||
|
'dish-description', webpage)
|
||||||
|
) or self._html_search_meta(
|
||||||
|
('description', 'twitter:description', 'og:description'),
|
||||||
|
webpage)
|
||||||
|
|
||||||
|
video_id = self._html_search_regex(
|
||||||
|
r'data-url=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video id',
|
||||||
|
group='id')
|
||||||
|
|
||||||
|
return {
|
||||||
|
'_type': 'url_transparent',
|
||||||
|
'url': 'https://mediazone.vrt.be/api/v1/dako/assets/%s' % video_id,
|
||||||
|
'ie_key': CanvasIE.ie_key(),
|
||||||
|
'id': video_id,
|
||||||
|
'display_id': display_id,
|
||||||
'title': title,
|
'title': title,
|
||||||
'description': description,
|
'description': description,
|
||||||
'season': season,
|
}
|
||||||
'season_number': season_number,
|
|
||||||
'episode_number': episode_number,
|
|
||||||
'release_date': release_date,
|
|
||||||
})
|
|
||||||
|
@ -11,7 +11,47 @@ from ..utils import (
|
|||||||
|
|
||||||
|
|
||||||
class CBSLocalIE(AnvatoIE):
|
class CBSLocalIE(AnvatoIE):
|
||||||
_VALID_URL = r'https?://[a-z]+\.cbslocal\.com/(?:\d+/\d+/\d+|video)/(?P<id>[0-9a-z-]+)'
|
_VALID_URL_BASE = r'https?://[a-z]+\.cbslocal\.com/'
|
||||||
|
_VALID_URL = _VALID_URL_BASE + r'video/(?P<id>\d+)'
|
||||||
|
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'http://newyork.cbslocal.com/video/3580809-a-very-blue-anniversary/',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '3580809',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'A Very Blue Anniversary',
|
||||||
|
'description': 'CBS2’s Cindy Hsu has more.',
|
||||||
|
'thumbnail': 're:^https?://.*',
|
||||||
|
'timestamp': int,
|
||||||
|
'upload_date': r're:^\d{8}$',
|
||||||
|
'uploader': 'CBS',
|
||||||
|
'subtitles': {
|
||||||
|
'en': 'mincount:5',
|
||||||
|
},
|
||||||
|
'categories': [
|
||||||
|
'Stations\\Spoken Word\\WCBSTV',
|
||||||
|
'Syndication\\AOL',
|
||||||
|
'Syndication\\MSN',
|
||||||
|
'Syndication\\NDN',
|
||||||
|
'Syndication\\Yahoo',
|
||||||
|
'Content\\News',
|
||||||
|
'Content\\News\\Local News',
|
||||||
|
],
|
||||||
|
'tags': ['CBS 2 News Weekends', 'Cindy Hsu', 'Blue Man Group'],
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
mcp_id = self._match_id(url)
|
||||||
|
return self.url_result(
|
||||||
|
'anvato:anvato_cbslocal_app_web_prod_547f3e49241ef0e5d30c79b2efbca5d92c698f67:' + mcp_id, 'Anvato', mcp_id)
|
||||||
|
|
||||||
|
|
||||||
|
class CBSLocalArticleIE(AnvatoIE):
|
||||||
|
_VALID_URL = CBSLocalIE._VALID_URL_BASE + r'\d+/\d+/\d+/(?P<id>[0-9a-z-]+)'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
# Anvato backend
|
# Anvato backend
|
||||||
@ -52,31 +92,6 @@ class CBSLocalIE(AnvatoIE):
|
|||||||
# m3u8 download
|
# m3u8 download
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
}, {
|
|
||||||
'url': 'http://newyork.cbslocal.com/video/3580809-a-very-blue-anniversary/',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '3580809',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'A Very Blue Anniversary',
|
|
||||||
'description': 'CBS2’s Cindy Hsu has more.',
|
|
||||||
'thumbnail': 're:^https?://.*',
|
|
||||||
'timestamp': int,
|
|
||||||
'upload_date': r're:^\d{8}$',
|
|
||||||
'uploader': 'CBS',
|
|
||||||
'subtitles': {
|
|
||||||
'en': 'mincount:5',
|
|
||||||
},
|
|
||||||
'categories': [
|
|
||||||
'Stations\\Spoken Word\\WCBSTV',
|
|
||||||
'Syndication\\AOL',
|
|
||||||
'Syndication\\MSN',
|
|
||||||
'Syndication\\NDN',
|
|
||||||
'Syndication\\Yahoo',
|
|
||||||
'Content\\News',
|
|
||||||
'Content\\News\\Local News',
|
|
||||||
],
|
|
||||||
'tags': ['CBS 2 News Weekends', 'Cindy Hsu', 'Blue Man Group'],
|
|
||||||
},
|
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
|
@ -1,15 +1,18 @@
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import calendar
|
||||||
|
import datetime
|
||||||
import re
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
clean_html,
|
clean_html,
|
||||||
|
extract_timezone,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
parse_duration,
|
parse_duration,
|
||||||
parse_iso8601,
|
|
||||||
parse_resolution,
|
parse_resolution,
|
||||||
|
try_get,
|
||||||
url_or_none,
|
url_or_none,
|
||||||
)
|
)
|
||||||
|
|
||||||
@ -24,8 +27,9 @@ class CCMAIE(InfoExtractor):
|
|||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'L\'espot de La Marató de TV3',
|
'title': 'L\'espot de La Marató de TV3',
|
||||||
'description': 'md5:f12987f320e2f6e988e9908e4fe97765',
|
'description': 'md5:f12987f320e2f6e988e9908e4fe97765',
|
||||||
'timestamp': 1470918540,
|
'timestamp': 1478608140,
|
||||||
'upload_date': '20160811',
|
'upload_date': '20161108',
|
||||||
|
'age_limit': 0,
|
||||||
}
|
}
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.ccma.cat/catradio/alacarta/programa/el-consell-de-savis-analitza-el-derbi/audio/943685/',
|
'url': 'http://www.ccma.cat/catradio/alacarta/programa/el-consell-de-savis-analitza-el-derbi/audio/943685/',
|
||||||
@ -35,8 +39,24 @@ class CCMAIE(InfoExtractor):
|
|||||||
'ext': 'mp3',
|
'ext': 'mp3',
|
||||||
'title': 'El Consell de Savis analitza el derbi',
|
'title': 'El Consell de Savis analitza el derbi',
|
||||||
'description': 'md5:e2a3648145f3241cb9c6b4b624033e53',
|
'description': 'md5:e2a3648145f3241cb9c6b4b624033e53',
|
||||||
'upload_date': '20171205',
|
'upload_date': '20170512',
|
||||||
'timestamp': 1512507300,
|
'timestamp': 1494622500,
|
||||||
|
'vcodec': 'none',
|
||||||
|
'categories': ['Esports'],
|
||||||
|
}
|
||||||
|
}, {
|
||||||
|
'url': 'http://www.ccma.cat/tv3/alacarta/crims/crims-josep-tallada-lespereu-me-capitol-1/video/6031387/',
|
||||||
|
'md5': 'b43c3d3486f430f3032b5b160d80cbc3',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '6031387',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Crims - Josep Talleda, l\'"Espereu-me" (capítol 1)',
|
||||||
|
'description': 'md5:7cbdafb640da9d0d2c0f62bad1e74e60',
|
||||||
|
'timestamp': 1582577700,
|
||||||
|
'upload_date': '20200224',
|
||||||
|
'subtitles': 'mincount:4',
|
||||||
|
'age_limit': 16,
|
||||||
|
'series': 'Crims',
|
||||||
}
|
}
|
||||||
}]
|
}]
|
||||||
|
|
||||||
@ -72,17 +92,28 @@ class CCMAIE(InfoExtractor):
|
|||||||
|
|
||||||
informacio = media['informacio']
|
informacio = media['informacio']
|
||||||
title = informacio['titol']
|
title = informacio['titol']
|
||||||
durada = informacio.get('durada', {})
|
durada = informacio.get('durada') or {}
|
||||||
duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
|
duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
|
||||||
timestamp = parse_iso8601(informacio.get('data_emissio', {}).get('utc'))
|
tematica = try_get(informacio, lambda x: x['tematica']['text'])
|
||||||
|
|
||||||
|
timestamp = None
|
||||||
|
data_utc = try_get(informacio, lambda x: x['data_emissio']['utc'])
|
||||||
|
try:
|
||||||
|
timezone, data_utc = extract_timezone(data_utc)
|
||||||
|
timestamp = calendar.timegm((datetime.datetime.strptime(
|
||||||
|
data_utc, '%Y-%d-%mT%H:%M:%S') - timezone).timetuple())
|
||||||
|
except TypeError:
|
||||||
|
pass
|
||||||
|
|
||||||
subtitles = {}
|
subtitles = {}
|
||||||
subtitols = media.get('subtitols', {})
|
subtitols = media.get('subtitols') or []
|
||||||
if subtitols:
|
if isinstance(subtitols, dict):
|
||||||
sub_url = subtitols.get('url')
|
subtitols = [subtitols]
|
||||||
|
for st in subtitols:
|
||||||
|
sub_url = st.get('url')
|
||||||
if sub_url:
|
if sub_url:
|
||||||
subtitles.setdefault(
|
subtitles.setdefault(
|
||||||
subtitols.get('iso') or subtitols.get('text') or 'ca', []).append({
|
st.get('iso') or st.get('text') or 'ca', []).append({
|
||||||
'url': sub_url,
|
'url': sub_url,
|
||||||
})
|
})
|
||||||
|
|
||||||
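The hunk above switches the CCMA timestamp parsing to a `strptime` layout with the day before the month (`%Y-%d-%m`). A minimal standalone sketch of that idea follows; the helper name `parse_swapped_utc` and the sample input string are hypothetical, and timezone suffixes (which the diff handles through `extract_timezone`) are deliberately left out.

```python
import calendar
import datetime


def parse_swapped_utc(data_utc):
    """Parse 'YYYY-DD-MMTHH:MM:SS' (day before month) into UTC epoch seconds.

    Hypothetical helper: returns None when the value is missing or does not
    fit the layout. Timezone suffixes are NOT handled in this sketch.
    """
    if not data_utc:
        return None
    try:
        parsed = datetime.datetime.strptime(data_utc, '%Y-%d-%mT%H:%M:%S')
    except ValueError:
        return None
    # timegm treats the time tuple as UTC, unlike time.mktime.
    return calendar.timegm(parsed.timetuple())


# Illustrative input only: read as 8 November 2016, not 11 August 2016.
print(parse_swapped_utc('2016-08-11T12:29:00'))
```

Reading the sample as 8 November rather than 11 August is consistent with the `upload_date` change from `20160811` to `20161108` in the updated test above.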
@ -97,6 +128,16 @@ class CCMAIE(InfoExtractor):
|
|||||||
'height': int_or_none(imatges.get('alcada')),
|
'height': int_or_none(imatges.get('alcada')),
|
||||||
}]
|
}]
|
||||||
|
|
||||||
|
age_limit = None
|
||||||
|
codi_etic = try_get(informacio, lambda x: x['codi_etic']['id'])
|
||||||
|
if codi_etic:
|
||||||
|
codi_etic_s = codi_etic.split('_')
|
||||||
|
if len(codi_etic_s) == 2:
|
||||||
|
if codi_etic_s[1] == 'TP':
|
||||||
|
age_limit = 0
|
||||||
|
else:
|
||||||
|
age_limit = int_or_none(codi_etic_s[1])
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': media_id,
|
'id': media_id,
|
||||||
'title': title,
|
'title': title,
|
||||||
@ -106,4 +147,9 @@ class CCMAIE(InfoExtractor):
|
|||||||
'thumbnails': thumbnails,
|
'thumbnails': thumbnails,
|
||||||
'subtitles': subtitles,
|
'subtitles': subtitles,
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
|
'age_limit': age_limit,
|
||||||
|
'alt_title': informacio.get('titol_complet'),
|
||||||
|
'episode_number': int_or_none(informacio.get('capitol')),
|
||||||
|
'categories': [tematica] if tematica else None,
|
||||||
|
'series': informacio.get('programa'),
|
||||||
}
|
}
|
||||||
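The `age_limit` logic added in this file can be read as a small pure function. The sketch below mirrors the split-on-underscore rule from the diff; the function name and the sample codes (`CE_TP`, `CE_16`) are assumptions for illustration, not values confirmed by the source.

```python
def age_limit_from_code(codi_etic):
    """Map a '<prefix>_<rating>' code to an age limit, or None if unknown.

    Hypothetical helper: 'TP' as the rating maps to 0, as in the diff;
    any other rating is used only when it is a plain integer.
    """
    if not codi_etic:
        return None
    parts = codi_etic.split('_')
    if len(parts) != 2:
        return None
    if parts[1] == 'TP':
        return 0
    try:
        return int(parts[1])
    except ValueError:
        return None


for code in ('CE_TP', 'CE_16', 'CE_X', None):
    print(code, age_limit_from_code(code))
```

Returning `None` for anything unexpected keeps the field optional, which matches how the extractor treats missing metadata elsewhere in this diff.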
|
@ -5,10 +5,16 @@ import codecs
|
|||||||
import re
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
|
from ..compat import (
|
||||||
|
compat_chr,
|
||||||
|
compat_ord,
|
||||||
|
compat_urllib_parse_unquote,
|
||||||
|
)
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
float_or_none,
|
float_or_none,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
|
merge_dicts,
|
||||||
multipart_encode,
|
multipart_encode,
|
||||||
parse_duration,
|
parse_duration,
|
||||||
random_birthday,
|
random_birthday,
|
||||||
@ -89,8 +95,11 @@ class CDAIE(InfoExtractor):
|
|||||||
if 'Ten film jest dostępny dla użytkowników premium' in webpage:
|
if 'Ten film jest dostępny dla użytkowników premium' in webpage:
|
||||||
raise ExtractorError('This video is only available for premium users.', expected=True)
|
raise ExtractorError('This video is only available for premium users.', expected=True)
|
||||||
|
|
||||||
|
if re.search(r'niedostępn[ey] w(?:&nbsp;|\s+)Twoim kraju\s*<', webpage):
|
||||||
|
self.raise_geo_restricted()
|
||||||
|
|
||||||
need_confirm_age = False
|
need_confirm_age = False
|
||||||
if self._html_search_regex(r'(<form[^>]+action="/a/validatebirth")',
|
if self._html_search_regex(r'(<form[^>]+action="[^"]*/a/validatebirth[^"]*")',
|
||||||
webpage, 'birthday validate form', default=None):
|
webpage, 'birthday validate form', default=None):
|
||||||
webpage = self._download_age_confirm_page(
|
webpage = self._download_age_confirm_page(
|
||||||
url, video_id, note='Confirming age')
|
url, video_id, note='Confirming age')
|
||||||
@ -107,8 +116,9 @@ class CDAIE(InfoExtractor):
|
|||||||
r'Odsłony:(?:\s|&nbsp;)*([0-9]+)', webpage,
|
r'Odsłony:(?:\s|&nbsp;)*([0-9]+)', webpage,
|
||||||
'view_count', default=None)
|
'view_count', default=None)
|
||||||
average_rating = self._search_regex(
|
average_rating = self._search_regex(
|
||||||
r'<(?:span|meta)[^>]+itemprop=(["\'])ratingValue\1[^>]*>(?P<rating_value>[0-9.]+)',
|
(r'<(?:span|meta)[^>]+itemprop=(["\'])ratingValue\1[^>]*>(?P<rating_value>[0-9.]+)',
|
||||||
webpage, 'rating', fatal=False, group='rating_value')
|
r'<span[^>]+\bclass=["\']rating["\'][^>]*>(?P<rating_value>[0-9.]+)'), webpage, 'rating', fatal=False,
|
||||||
|
group='rating_value')
|
||||||
|
|
||||||
info_dict = {
|
info_dict = {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
@ -123,6 +133,24 @@ class CDAIE(InfoExtractor):
|
|||||||
'age_limit': 18 if need_confirm_age else 0,
|
'age_limit': 18 if need_confirm_age else 0,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
# Source: https://www.cda.pl/js/player.js?t=1606154898
|
||||||
|
def decrypt_file(a):
|
||||||
|
for p in ('_XDDD', '_CDA', '_ADC', '_CXD', '_QWE', '_Q5', '_IKSDE'):
|
||||||
|
a = a.replace(p, '')
|
||||||
|
a = compat_urllib_parse_unquote(a)
|
||||||
|
b = []
|
||||||
|
for c in a:
|
||||||
|
f = compat_ord(c)
|
||||||
|
b.append(compat_chr(33 + (f + 14) % 94) if 33 <= f and 126 >= f else compat_chr(f))
|
||||||
|
a = ''.join(b)
|
||||||
|
a = a.replace('.cda.mp4', '')
|
||||||
|
for p in ('.2cda.pl', '.3cda.pl'):
|
||||||
|
a = a.replace(p, '.cda.pl')
|
||||||
|
if '/upstream' in a:
|
||||||
|
a = a.replace('/upstream', '.mp4/upstream')
|
||||||
|
return 'https://' + a
|
||||||
|
return 'https://' + a + '.mp4'
|
||||||
|
|
||||||
def extract_format(page, version):
|
def extract_format(page, version):
|
||||||
json_str = self._html_search_regex(
|
json_str = self._html_search_regex(
|
||||||
r'player_data=(\\?["\'])(?P<player_data>.+?)\1', page,
|
r'player_data=(\\?["\'])(?P<player_data>.+?)\1', page,
|
||||||
@ -141,6 +169,8 @@ class CDAIE(InfoExtractor):
|
|||||||
video['file'] = codecs.decode(video['file'], 'rot_13')
|
video['file'] = codecs.decode(video['file'], 'rot_13')
|
||||||
if video['file'].endswith('adc.mp4'):
|
if video['file'].endswith('adc.mp4'):
|
||||||
video['file'] = video['file'].replace('adc.mp4', '.mp4')
|
video['file'] = video['file'].replace('adc.mp4', '.mp4')
|
||||||
|
elif not video['file'].startswith('http'):
|
||||||
|
video['file'] = decrypt_file(video['file'])
|
||||||
f = {
|
f = {
|
||||||
'url': video['file'],
|
'url': video['file'],
|
||||||
}
|
}
|
||||||
@ -179,4 +209,6 @@ class CDAIE(InfoExtractor):
|
|||||||
|
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
return info_dict
|
info = self._search_json_ld(webpage, video_id, default={})
|
||||||
|
|
||||||
|
return merge_dicts(info_dict, info)
|
||||||
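The character shift inside `decrypt_file` above (`33 + (f + 14) % 94` over printable ASCII) is the same arithmetic as ROT47, which is its own inverse. The following Python 3 sketch restates the function with the standard library instead of the `compat_*` helpers; the host and path in the usage lines are made up for illustration.

```python
from urllib.parse import unquote

# Filler markers removed before decoding, copied from the diff.
MARKERS = ('_XDDD', '_CDA', '_ADC', '_CXD', '_QWE', '_Q5', '_IKSDE')


def rot47(text):
    """Rotate printable ASCII (33..126) by 47 places; leave the rest alone."""
    out = []
    for ch in text:
        code = ord(ch)
        if 33 <= code <= 126:
            out.append(chr(33 + (code + 14) % 94))
        else:
            out.append(ch)
    return ''.join(out)


def decrypt_file(a):
    """Standalone restatement of the diff's decrypt_file (a sketch only)."""
    for marker in MARKERS:
        a = a.replace(marker, '')
    a = rot47(unquote(a))
    a = a.replace('.cda.mp4', '')
    for host in ('.2cda.pl', '.3cda.pl'):
        a = a.replace(host, '.cda.pl')
    if '/upstream' in a:
        return 'https://' + a.replace('/upstream', '.mp4/upstream')
    return 'https://' + a + '.mp4'


# Hypothetical round trip: encode a made-up path with rot47, then decode it.
plain = 'vid.2cda.pl/abc/file'
encoded = rot47(plain)
print(decrypt_file(encoded))
```

Because the rotation is an involution, `rot47` doubles as the encoder in the usage lines, which makes the sketch easy to check without real player data.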
|
@ -96,7 +96,10 @@ class CNNIE(TurnerBaseIE):
|
|||||||
config['data_src'] % path, page_title, {
|
config['data_src'] % path, page_title, {
|
||||||
'default': {
|
'default': {
|
||||||
'media_src': config['media_src'],
|
'media_src': config['media_src'],
|
||||||
}
|
},
|
||||||
|
'f4m': {
|
||||||
|
'host': 'cnn-vh.akamaihd.net',
|
||||||
|
},
|
||||||
})
|
})
|
||||||
|
|
||||||
|
|
||||||
|
@ -1,142 +1,51 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
from .mtv import MTVServicesInfoExtractor
|
from .mtv import MTVServicesInfoExtractor
|
||||||
from .common import InfoExtractor
|
|
||||||
|
|
||||||
|
|
||||||
class ComedyCentralIE(MTVServicesInfoExtractor):
|
class ComedyCentralIE(MTVServicesInfoExtractor):
|
||||||
_VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
|
_VALID_URL = r'https?://(?:www\.)?cc\.com/(?:episodes|video(?:-clips)?)/(?P<id>[0-9a-z]{6})'
|
||||||
(video-clips|episodes|cc-studios|video-collections|shows(?=/[^/]+/(?!full-episodes)))
|
|
||||||
/(?P<title>.*)'''
|
|
||||||
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
|
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.cc.com/video-clips/kllhuv/stand-up-greg-fitzsimmons--uncensored---too-good-of-a-mother',
|
'url': 'http://www.cc.com/video-clips/5ke9v2/the-daily-show-with-trevor-noah-doc-rivers-and-steve-ballmer---the-nba-player-strike',
|
||||||
'md5': 'c4f48e9eda1b16dd10add0744344b6d8',
|
'md5': 'b8acb347177c680ff18a292aa2166f80',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'cef0cbb3-e776-4bc9-b62e-8016deccb354',
|
'id': '89ccc86e-1b02-4f83-b0c9-1d9592ecd025',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'CC:Stand-Up|August 18, 2013|1|0101|Uncensored - Too Good of a Mother',
|
'title': 'The Daily Show with Trevor Noah|August 28, 2020|25|25149|Doc Rivers and Steve Ballmer - The NBA Player Strike',
|
||||||
'description': 'After a certain point, breastfeeding becomes c**kblocking.',
|
'description': 'md5:5334307c433892b85f4f5e5ac9ef7498',
|
||||||
'timestamp': 1376798400,
|
'timestamp': 1598670000,
|
||||||
'upload_date': '20130818',
|
'upload_date': '20200829',
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/interviews/6yx39d/exclusive-rand-paul-extended-interview',
|
'url': 'http://www.cc.com/episodes/pnzzci/drawn-together--american-idol--parody-clip-show-season-3-ep-314',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
|
||||||
|
|
||||||
|
|
||||||
class ComedyCentralFullEpisodesIE(MTVServicesInfoExtractor):
|
|
||||||
_VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
|
|
||||||
(?:full-episodes|shows(?=/[^/]+/full-episodes))
|
|
||||||
/(?P<id>[^?]+)'''
|
|
||||||
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
|
|
||||||
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'http://www.cc.com/full-episodes/pv391a/the-daily-show-with-trevor-noah-november-28--2016---ryan-speedo-green-season-22-ep-22028',
|
|
||||||
'info_dict': {
|
|
||||||
'description': 'Donald Trump is accused of exploiting his president-elect status for personal gain, Cuban leader Fidel Castro dies, and Ryan Speedo Green discusses "Sing for Your Life."',
|
|
||||||
'title': 'November 28, 2016 - Ryan Speedo Green',
|
|
||||||
},
|
|
||||||
'playlist_count': 4,
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
|
'url': 'https://www.cc.com/video/k3sdvm/the-daily-show-with-jon-stewart-exclusive-the-fourth-estate',
|
||||||
'only_matching': True,
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
playlist_id = self._match_id(url)
|
|
||||||
webpage = self._download_webpage(url, playlist_id)
|
|
||||||
mgid = self._extract_triforce_mgid(webpage, data_zone='t2_lc_promo1')
|
|
||||||
videos_info = self._get_videos_info(mgid)
|
|
||||||
return videos_info
|
|
||||||
|
|
||||||
|
|
||||||
class ToshIE(MTVServicesInfoExtractor):
|
|
||||||
IE_DESC = 'Tosh.0'
|
|
||||||
_VALID_URL = r'^https?://tosh\.cc\.com/video-(?:clips|collections)/[^/]+/(?P<videotitle>[^/?#]+)'
|
|
||||||
_FEED_URL = 'http://tosh.cc.com/feeds/mrss'
|
|
||||||
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'http://tosh.cc.com/video-clips/68g93d/twitter-users-share-summer-plans',
|
|
||||||
'info_dict': {
|
|
||||||
'description': 'Tosh asked fans to share their summer plans.',
|
|
||||||
'title': 'Twitter Users Share Summer Plans',
|
|
||||||
},
|
|
||||||
'playlist': [{
|
|
||||||
'md5': 'f269e88114c1805bb6d7653fecea9e06',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '90498ec2-ed00-11e0-aca6-0026b9414f30',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'Tosh.0|June 9, 2077|2|211|Twitter Users Share Summer Plans',
|
|
||||||
'description': 'Tosh asked fans to share their summer plans.',
|
|
||||||
'thumbnail': r're:^https?://.*\.jpg',
|
|
||||||
# It's really reported to be published on year 2077
|
|
||||||
'upload_date': '20770610',
|
|
||||||
'timestamp': 3390510600,
|
|
||||||
'subtitles': {
|
|
||||||
'en': 'mincount:3',
|
|
||||||
},
|
|
||||||
},
|
|
||||||
}]
|
|
||||||
}, {
|
|
||||||
'url': 'http://tosh.cc.com/video-collections/x2iz7k/just-plain-foul/m5q4fp',
|
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
|
|
||||||
class ComedyCentralTVIE(MTVServicesInfoExtractor):
|
class ComedyCentralTVIE(MTVServicesInfoExtractor):
|
||||||
_VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/(?:staffeln|shows)/(?P<id>[^/?#&]+)'
|
_VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/folgen/(?P<id>[0-9a-z]{6})'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.comedycentral.tv/staffeln/7436-the-mindy-project-staffel-4',
|
'url': 'https://www.comedycentral.tv/folgen/pxdpec/josh-investigates-klimawandel-staffel-1-ep-1',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'local_playlist-f99b626bdfe13568579a',
|
'id': '15907dc3-ec3c-11e8-a442-0e40cf2fc285',
|
||||||
'ext': 'flv',
|
'ext': 'mp4',
|
||||||
'title': 'Episode_the-mindy-project_shows_season-4_episode-3_full-episode_part1',
|
'title': 'Josh Investigates',
|
||||||
|
'description': 'Steht uns das Ende der Welt bevor?',
|
||||||
},
|
},
|
||||||
'params': {
|
|
||||||
# rtmp download
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
'url': 'http://www.comedycentral.tv/shows/1074-workaholics',
|
|
||||||
'only_matching': True,
|
|
||||||
}, {
|
|
||||||
'url': 'http://www.comedycentral.tv/shows/1727-the-mindy-project/bonus',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
}]
|
||||||
|
_FEED_URL = 'http://feeds.mtvnservices.com/od/feed/intl-mrss-player-feed'
|
||||||
|
_GEO_COUNTRIES = ['DE']
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _get_feed_query(self, uri):
|
||||||
video_id = self._match_id(url)
|
return {
|
||||||
|
'accountOverride': 'intl.mtvi.com',
|
||||||
webpage = self._download_webpage(url, video_id)
|
'arcEp': 'web.cc.tv',
|
||||||
|
'ep': 'b9032c3a',
|
||||||
mrss_url = self._search_regex(
|
'imageEp': 'web.cc.tv',
|
||||||
r'data-mrss=(["\'])(?P<url>(?:(?!\1).)+)\1',
|
'mgid': uri,
|
||||||
webpage, 'mrss url', group='url')
|
|
||||||
|
|
||||||
return self._get_videos_info_from_url(mrss_url, video_id)
|
|
||||||
|
|
||||||
|
|
||||||
class ComedyCentralShortnameIE(InfoExtractor):
|
|
||||||
_VALID_URL = r'^:(?P<id>tds|thedailyshow|theopposition)$'
|
|
||||||
_TESTS = [{
|
|
||||||
'url': ':tds',
|
|
||||||
'only_matching': True,
|
|
||||||
}, {
|
|
||||||
'url': ':thedailyshow',
|
|
||||||
'only_matching': True,
|
|
||||||
}, {
|
|
||||||
'url': ':theopposition',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
video_id = self._match_id(url)
|
|
||||||
shortcut_map = {
|
|
||||||
'tds': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
|
|
||||||
'thedailyshow': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
|
|
||||||
'theopposition': 'http://www.cc.com/shows/the-opposition-with-jordan-klepper/full-episodes',
|
|
||||||
}
|
}
|
||||||
return self.url_result(shortcut_map[video_id])
|
|
||||||
|
@ -336,8 +336,8 @@ class InfoExtractor(object):
|
|||||||
object, each element of which is a valid dictionary by this specification.
|
object, each element of which is a valid dictionary by this specification.
|
||||||
|
|
||||||
Additionally, playlists can have "id", "title", "description", "uploader",
|
Additionally, playlists can have "id", "title", "description", "uploader",
|
||||||
"uploader_id", "uploader_url" attributes with the same semantics as videos
|
"uploader_id", "uploader_url", "duration" attributes with the same semantics
|
||||||
(see above).
|
as videos (see above).
|
||||||
|
|
||||||
|
|
||||||
_type "multi_video" indicates that there are multiple videos that
|
_type "multi_video" indicates that there are multiple videos that
|
||||||
@ -1237,8 +1237,16 @@ class InfoExtractor(object):
|
|||||||
'ViewAction': 'view',
|
'ViewAction': 'view',
|
||||||
}
|
}
|
||||||
|
|
||||||
|
def extract_interaction_type(e):
|
||||||
|
interaction_type = e.get('interactionType')
|
||||||
|
if isinstance(interaction_type, dict):
|
||||||
|
interaction_type = interaction_type.get('@type')
|
||||||
|
return str_or_none(interaction_type)
|
||||||
|
|
||||||
def extract_interaction_statistic(e):
|
def extract_interaction_statistic(e):
|
||||||
interaction_statistic = e.get('interactionStatistic')
|
interaction_statistic = e.get('interactionStatistic')
|
||||||
|
if isinstance(interaction_statistic, dict):
|
||||||
|
interaction_statistic = [interaction_statistic]
|
||||||
if not isinstance(interaction_statistic, list):
|
if not isinstance(interaction_statistic, list):
|
||||||
return
|
return
|
||||||
for is_e in interaction_statistic:
|
for is_e in interaction_statistic:
|
||||||
@ -1246,8 +1254,8 @@ class InfoExtractor(object):
|
|||||||
continue
|
continue
|
||||||
if is_e.get('@type') != 'InteractionCounter':
|
if is_e.get('@type') != 'InteractionCounter':
|
||||||
continue
|
continue
|
||||||
interaction_type = is_e.get('interactionType')
|
interaction_type = extract_interaction_type(is_e)
|
||||||
if not isinstance(interaction_type, compat_str):
|
if not interaction_type:
|
||||||
continue
|
continue
|
||||||
# For interaction count some sites provide string instead of
|
# For interaction count some sites provide string instead of
|
||||||
# an integer (as per spec) with non digit characters (e.g. ",")
|
# an integer (as per spec) with non digit characters (e.g. ",")
|
||||||
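Reviewer note: the hunk above lets `interactionType` be either a plain string or a nested object carrying `@type`, and lets `interactionStatistic` be a single object instead of a list. The following is a minimal standalone sketch of that normalisation, not the extractor code itself. `str_or_none` here is a local stand-in for the helper of the same name in `youtube_dl.utils`, and `collect_interaction_types` is a hypothetical wrapper added only for illustration.

```python
def str_or_none(value):
    # Local stand-in (assumption) for youtube_dl.utils.str_or_none.
    return None if value is None else str(value)


def extract_interaction_type(e):
    # interactionType may be a string or a dict such as {'@type': 'WatchAction'}.
    interaction_type = e.get('interactionType')
    if isinstance(interaction_type, dict):
        interaction_type = interaction_type.get('@type')
    return str_or_none(interaction_type)


def collect_interaction_types(e):
    # Hypothetical helper: returns the interaction types found in a JSON-LD entry.
    interaction_statistic = e.get('interactionStatistic')
    if isinstance(interaction_statistic, dict):
        # A single counter object is treated like a one-element list.
        interaction_statistic = [interaction_statistic]
    if not isinstance(interaction_statistic, list):
        return []
    found = []
    for is_e in interaction_statistic:
        if not isinstance(is_e, dict):
            continue
        if is_e.get('@type') != 'InteractionCounter':
            continue
        interaction_type = extract_interaction_type(is_e)
        if not interaction_type:
            continue
        found.append(interaction_type)
    return found
```

With this shape, a site that emits a bare counter object and a site that emits a list of counters go through the same loop.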
@ -2056,7 +2064,7 @@ class InfoExtractor(object):
|
|||||||
})
|
})
|
||||||
return entries
|
return entries
|
||||||
|
|
||||||
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}, data=None, headers={}, query={}):
|
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, data=None, headers={}, query={}):
|
||||||
res = self._download_xml_handle(
|
res = self._download_xml_handle(
|
||||||
mpd_url, video_id,
|
mpd_url, video_id,
|
||||||
note=note or 'Downloading MPD manifest',
|
note=note or 'Downloading MPD manifest',
|
||||||
@ -2070,10 +2078,9 @@ class InfoExtractor(object):
|
|||||||
mpd_base_url = base_url(urlh.geturl())
|
mpd_base_url = base_url(urlh.geturl())
|
||||||
|
|
||||||
return self._parse_mpd_formats(
|
return self._parse_mpd_formats(
|
||||||
mpd_doc, mpd_id=mpd_id, mpd_base_url=mpd_base_url,
|
mpd_doc, mpd_id, mpd_base_url, mpd_url)
|
||||||
formats_dict=formats_dict, mpd_url=mpd_url)
|
|
||||||
|
|
||||||
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', formats_dict={}, mpd_url=None):
|
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', mpd_url=None):
|
||||||
"""
|
"""
|
||||||
Parse formats from MPD manifest.
|
Parse formats from MPD manifest.
|
||||||
References:
|
References:
|
||||||
@ -2351,15 +2358,7 @@ class InfoExtractor(object):
|
|||||||
else:
|
else:
|
||||||
# Assuming direct URL to unfragmented media.
|
# Assuming direct URL to unfragmented media.
|
||||||
f['url'] = base_url
|
f['url'] = base_url
|
||||||
|
formats.append(f)
|
||||||
# According to [1, 5.3.5.2, Table 7, page 35] @id of Representation
|
|
||||||
# is not necessarily unique within a Period thus formats with
|
|
||||||
# the same `format_id` are quite possible. There are numerous examples
|
|
||||||
# of such manifests (see https://github.com/ytdl-org/youtube-dl/issues/15111,
|
|
||||||
# https://github.com/ytdl-org/youtube-dl/issues/13919)
|
|
||||||
full_info = formats_dict.get(representation_id, {}).copy()
|
|
||||||
full_info.update(f)
|
|
||||||
formats.append(full_info)
|
|
||||||
else:
|
else:
|
||||||
self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
|
self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
|
||||||
return formats
|
return formats
|
||||||
@ -2513,16 +2512,18 @@ class InfoExtractor(object):
|
|||||||
# amp-video and amp-audio are very similar to their HTML5 counterparts
|
# amp-video and amp-audio are very similar to their HTML5 counterparts
|
||||||
# so we will include them right here (see
|
# so we will include them right here (see
|
||||||
# https://www.ampproject.org/docs/reference/components/amp-video)
|
# https://www.ampproject.org/docs/reference/components/amp-video)
|
||||||
media_tags = [(media_tag, media_type, '')
|
# For dl8-* tags see https://delight-vr.com/documentation/dl8-video/
|
||||||
for media_tag, media_type
|
_MEDIA_TAG_NAME_RE = r'(?:(?:amp|dl8(?:-live)?)-)?(video|audio)'
|
||||||
in re.findall(r'(?s)(<(?:amp-)?(video|audio)[^>]*/>)', webpage)]
|
media_tags = [(media_tag, media_tag_name, media_type, '')
|
||||||
|
for media_tag, media_tag_name, media_type
|
||||||
|
in re.findall(r'(?s)(<(%s)[^>]*/>)' % _MEDIA_TAG_NAME_RE, webpage)]
|
||||||
media_tags.extend(re.findall(
|
media_tags.extend(re.findall(
|
||||||
# We only allow video|audio followed by a whitespace or '>'.
|
# We only allow video|audio followed by a whitespace or '>'.
|
||||||
# Allowing more characters may end up in significant slow down (see
|
# Allowing more characters may end up in significant slow down (see
|
||||||
# https://github.com/ytdl-org/youtube-dl/issues/11979, example URL:
|
# https://github.com/ytdl-org/youtube-dl/issues/11979, example URL:
|
||||||
# http://www.porntrex.com/maps/videositemap.xml).
|
# http://www.porntrex.com/maps/videositemap.xml).
|
||||||
r'(?s)(<(?P<tag>(?:amp-)?(?:video|audio))(?:\s+[^>]*)?>)(.*?)</(?P=tag)>', webpage))
|
r'(?s)(<(?P<tag>%s)(?:\s+[^>]*)?>)(.*?)</(?P=tag)>' % _MEDIA_TAG_NAME_RE, webpage))
|
||||||
for media_tag, media_type, media_content in media_tags:
|
for media_tag, _, media_type, media_content in media_tags:
|
||||||
media_info = {
|
media_info = {
|
||||||
'formats': [],
|
'formats': [],
|
||||||
'subtitles': {},
|
'subtitles': {},
|
||||||
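Reviewer note: the new `_MEDIA_TAG_NAME_RE` adds a capture group for the bare media type, so every tuple gains a tag-name element alongside the type. The sketch below reuses the two patterns from the hunk on a made-up HTML snippet to show the tuple shape. `find_media_tags` is a hypothetical wrapper name, and the sample markup is an assumption.

```python
import re

_MEDIA_TAG_NAME_RE = r'(?:(?:amp|dl8(?:-live)?)-)?(video|audio)'


def find_media_tags(webpage):
    # Self-closing tags: (whole tag, tag name, media type, empty content).
    media_tags = [(media_tag, media_tag_name, media_type, '')
                  for media_tag, media_tag_name, media_type
                  in re.findall(r'(?s)(<(%s)[^>]*/>)' % _MEDIA_TAG_NAME_RE, webpage)]
    # Paired tags: (opening tag, tag name, media type, inner content).
    media_tags.extend(re.findall(
        r'(?s)(<(?P<tag>%s)(?:\s+[^>]*)?>)(.*?)</(?P=tag)>' % _MEDIA_TAG_NAME_RE,
        webpage))
    return media_tags


sample = '<dl8-video src="a.mp4"/><amp-audio controls><source src="x.mp3"></amp-audio>'
tags = find_media_tags(sample)
```

The second element (the full tag name) exists only so the closing tag can be back-referenced, which is why the loop in the hunk discards it with `_`.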
@ -2595,6 +2596,13 @@ class InfoExtractor(object):
|
|||||||
return entries
|
return entries
|
||||||
|
|
||||||
def _extract_akamai_formats(self, manifest_url, video_id, hosts={}):
|
def _extract_akamai_formats(self, manifest_url, video_id, hosts={}):
|
||||||
|
signed = 'hdnea=' in manifest_url
|
||||||
|
if not signed:
|
||||||
|
# https://learn.akamai.com/en-us/webhelp/media-services-on-demand/stream-packaging-user-guide/GUID-BE6C0F73-1E06-483B-B0EA-57984B91B7F9.html
|
||||||
|
manifest_url = re.sub(
|
||||||
|
r'(?:b=[\d,-]+|(?:__a__|attributes)=off|__b__=\d+)&?',
|
||||||
|
'', manifest_url).strip('?')
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
|
|
||||||
hdcore_sign = 'hdcore=3.7.0'
|
hdcore_sign = 'hdcore=3.7.0'
|
||||||
@ -2614,33 +2622,32 @@ class InfoExtractor(object):
|
|||||||
hls_host = hosts.get('hls')
|
hls_host = hosts.get('hls')
|
||||||
if hls_host:
|
if hls_host:
|
||||||
m3u8_url = re.sub(r'(https?://)[^/]+', r'\1' + hls_host, m3u8_url)
|
m3u8_url = re.sub(r'(https?://)[^/]+', r'\1' + hls_host, m3u8_url)
|
||||||
formats.extend(self._extract_m3u8_formats(
|
m3u8_formats = self._extract_m3u8_formats(
|
||||||
m3u8_url, video_id, 'mp4', 'm3u8_native',
|
m3u8_url, video_id, 'mp4', 'm3u8_native',
|
||||||
m3u8_id='hls', fatal=False))
|
m3u8_id='hls', fatal=False)
|
||||||
|
formats.extend(m3u8_formats)
|
||||||
|
|
||||||
http_host = hosts.get('http')
|
http_host = hosts.get('http')
|
||||||
if http_host and 'hdnea=' not in manifest_url:
|
if http_host and m3u8_formats and not signed:
|
||||||
REPL_REGEX = r'https://[^/]+/i/([^,]+),([^/]+),([^/]+).csmil/.+'
|
REPL_REGEX = r'https?://[^/]+/i/([^,]+),([^/]+),([^/]+)\.csmil/.+'
|
||||||
qualities = re.match(REPL_REGEX, m3u8_url).group(2).split(',')
|
qualities = re.match(REPL_REGEX, m3u8_url).group(2).split(',')
|
||||||
qualities_length = len(qualities)
|
qualities_length = len(qualities)
|
||||||
if len(formats) in (qualities_length + 1, qualities_length * 2 + 1):
|
if len(m3u8_formats) in (qualities_length, qualities_length + 1):
|
||||||
i = 0
|
i = 0
|
||||||
http_formats = []
|
for f in m3u8_formats:
|
||||||
for f in formats:
|
if f['vcodec'] != 'none':
|
||||||
if f['protocol'] == 'm3u8_native' and f['vcodec'] != 'none':
|
|
||||||
for protocol in ('http', 'https'):
|
for protocol in ('http', 'https'):
|
||||||
http_f = f.copy()
|
http_f = f.copy()
|
||||||
del http_f['manifest_url']
|
del http_f['manifest_url']
|
||||||
http_url = re.sub(
|
http_url = re.sub(
|
||||||
REPL_REGEX, protocol + r'://%s/\1%s\3' % (http_host, qualities[i]), f['url'])
|
REPL_REGEX, protocol + r'://%s/\g<1>%s\3' % (http_host, qualities[i]), f['url'])
|
||||||
http_f.update({
|
http_f.update({
|
||||||
'format_id': http_f['format_id'].replace('hls-', protocol + '-'),
|
'format_id': http_f['format_id'].replace('hls-', protocol + '-'),
|
||||||
'url': http_url,
|
'url': http_url,
|
||||||
'protocol': protocol,
|
'protocol': protocol,
|
||||||
})
|
})
|
||||||
http_formats.append(http_f)
|
formats.append(http_f)
|
||||||
i += 1
|
i += 1
|
||||||
formats.extend(http_formats)
|
|
||||||
|
|
||||||
return formats
|
return formats
|
||||||
|
|
||||||
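Reviewer note: two regex details in the Akamai hunk are easy to miss. Unsigned manifest URLs (no `hdnea=`) have their bitrate filter parameters stripped before use, and the HTTP rewrite template now uses `\g<1>` so that a numeric quality placed directly after the group reference is not read as part of the group number. The sketch below isolates both steps. The function names and example hosts are assumptions, and it builds only `https` URLs where the hunk loops over both protocols.

```python
import re

REPL_REGEX = r'https?://[^/]+/i/([^,]+),([^/]+),([^/]+)\.csmil/.+'


def strip_akamai_filters(manifest_url):
    # Signed URLs are left untouched, as in the hunk.
    if 'hdnea=' in manifest_url:
        return manifest_url
    return re.sub(
        r'(?:b=[\d,-]+|(?:__a__|attributes)=off|__b__=\d+)&?',
        '', manifest_url).strip('?')


def build_http_urls(m3u8_url, http_host):
    # Group 2 of REPL_REGEX holds the comma-separated quality list.
    qualities = re.match(REPL_REGEX, m3u8_url).group(2).split(',')
    return [
        # \g<1> keeps the group reference unambiguous before the digits of q.
        re.sub(REPL_REGEX, r'https://%s/\g<1>%s\3' % (http_host, q), m3u8_url)
        for q in qualities]


m3u8 = 'https://hls.example.com/i/path/video_,300,600,.mp4.csmil/master.m3u8'
urls = build_http_urls(m3u8, 'dl.example.com')
```

Writing `\1%s` instead would let a quality such as `300` merge with the backreference digits, which is the ambiguity the `\g<1>` form avoids.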
|
@ -8,9 +8,14 @@ from ..utils import (
|
|||||||
ExtractorError,
|
ExtractorError,
|
||||||
extract_attributes,
|
extract_attributes,
|
||||||
find_xpath_attr,
|
find_xpath_attr,
|
||||||
|
get_element_by_attribute,
|
||||||
get_element_by_class,
|
get_element_by_class,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
|
js_to_json,
|
||||||
|
merge_dicts,
|
||||||
|
parse_iso8601,
|
||||||
smuggle_url,
|
smuggle_url,
|
||||||
|
str_to_int,
|
||||||
unescapeHTML,
|
unescapeHTML,
|
||||||
)
|
)
|
||||||
from .senateisvp import SenateISVPIE
|
from .senateisvp import SenateISVPIE
|
||||||
@ -98,6 +103,48 @@ class CSpanIE(InfoExtractor):
|
|||||||
bc_attr['data-bcid'])
|
bc_attr['data-bcid'])
|
||||||
return self.url_result(smuggle_url(bc_url, {'source_url': url}))
|
return self.url_result(smuggle_url(bc_url, {'source_url': url}))
|
||||||
|
|
||||||
|
def add_referer(formats):
|
||||||
|
for f in formats:
|
||||||
|
f.setdefault('http_headers', {})['Referer'] = url
|
||||||
|
|
||||||
|
# As of 01.12.2020 this path looks to cover all cases making the rest
|
||||||
|
# of the code unnecessary
|
||||||
|
jwsetup = self._parse_json(
|
||||||
|
self._search_regex(
|
||||||
|
r'(?s)jwsetup\s*=\s*({.+?})\s*;', webpage, 'jwsetup',
|
||||||
|
default='{}'),
|
||||||
|
video_id, transform_source=js_to_json, fatal=False)
|
||||||
|
if jwsetup:
|
||||||
|
info = self._parse_jwplayer_data(
|
||||||
|
jwsetup, video_id, require_title=False, m3u8_id='hls',
|
||||||
|
base_url=url)
|
||||||
|
add_referer(info['formats'])
|
||||||
|
for subtitles in info['subtitles'].values():
|
||||||
|
for subtitle in subtitles:
|
||||||
|
ext = determine_ext(subtitle['url'])
|
||||||
|
if ext == 'php':
|
||||||
|
ext = 'vtt'
|
||||||
|
subtitle['ext'] = ext
|
||||||
|
ld_info = self._search_json_ld(webpage, video_id, default={})
|
||||||
|
title = get_element_by_class('video-page-title', webpage) or \
|
||||||
|
self._og_search_title(webpage)
|
||||||
|
description = get_element_by_attribute('itemprop', 'description', webpage) or \
|
||||||
|
self._html_search_meta(['og:description', 'description'], webpage)
|
||||||
|
return merge_dicts(info, ld_info, {
|
||||||
|
'title': title,
|
||||||
|
'thumbnail': get_element_by_attribute('itemprop', 'thumbnailUrl', webpage),
|
||||||
|
'description': description,
|
||||||
|
'timestamp': parse_iso8601(get_element_by_attribute('itemprop', 'uploadDate', webpage)),
|
||||||
|
'location': get_element_by_attribute('itemprop', 'contentLocation', webpage),
|
||||||
|
'duration': int_or_none(self._search_regex(
|
||||||
|
r'jwsetup\.seclength\s*=\s*(\d+);',
|
||||||
|
webpage, 'duration', fatal=False)),
|
||||||
|
'view_count': str_to_int(self._search_regex(
|
||||||
|
r"<span[^>]+class='views'[^>]*>([\d,]+)\s+Views</span>",
|
||||||
|
webpage, 'views', fatal=False)),
|
||||||
|
})
|
||||||
|
|
||||||
|
# Obsolete
|
||||||
# We first look for clipid, because clipprog always appears before
|
# We first look for clipid, because clipprog always appears before
|
||||||
patterns = [r'id=\'clip(%s)\'\s*value=\'([0-9]+)\'' % t for t in ('id', 'prog')]
|
patterns = [r'id=\'clip(%s)\'\s*value=\'([0-9]+)\'' % t for t in ('id', 'prog')]
|
||||||
results = list(filter(None, (re.search(p, webpage) for p in patterns)))
|
results = list(filter(None, (re.search(p, webpage) for p in patterns)))
|
||||||
@ -165,6 +212,7 @@ class CSpanIE(InfoExtractor):
|
|||||||
formats = self._extract_m3u8_formats(
|
formats = self._extract_m3u8_formats(
|
||||||
path, video_id, 'mp4', entry_protocol='m3u8_native',
|
path, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||||
m3u8_id='hls') if determine_ext(path) == 'm3u8' else [{'url': path, }]
|
m3u8_id='hls') if determine_ext(path) == 'm3u8' else [{'url': path, }]
|
||||||
|
add_referer(formats)
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
entries.append({
|
entries.append({
|
||||||
'id': '%s_%d' % (video_id, partnum + 1),
|
'id': '%s_%d' % (video_id, partnum + 1),
|
||||||
|
52
youtube_dl/extractor/ctv.py
Normal file
@ -0,0 +1,52 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
|
||||||
|
|
||||||
|
class CTVIE(InfoExtractor):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?ctv\.ca/(?P<id>(?:show|movie)s/[^/]+/[^/?#&]+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://www.ctv.ca/shows/your-morning/wednesday-december-23-2020-s5e88',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '2102249',
|
||||||
|
'ext': 'flv',
|
||||||
|
'title': 'Wednesday, December 23, 2020',
|
||||||
|
'thumbnail': r're:^https?://.*\.jpg$',
|
||||||
|
'description': 'Your Morning delivers original perspectives and unique insights into the headlines of the day.',
|
||||||
|
'timestamp': 1608732000,
|
||||||
|
'upload_date': '20201223',
|
||||||
|
'series': 'Your Morning',
|
||||||
|
'season': '2020-2021',
|
||||||
|
'season_number': 5,
|
||||||
|
'episode_number': 88,
|
||||||
|
'tags': ['Your Morning'],
|
||||||
|
'categories': ['Talk Show'],
|
||||||
|
'duration': 7467.126,
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.ctv.ca/movies/adam-sandlers-eight-crazy-nights/adam-sandlers-eight-crazy-nights',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
display_id = self._match_id(url)
|
||||||
|
content = self._download_json(
|
||||||
|
'https://www.ctv.ca/space-graphql/graphql', display_id, query={
|
||||||
|
'query': '''{
|
||||||
|
resolvedPath(path: "/%s") {
|
||||||
|
lastSegment {
|
||||||
|
content {
|
||||||
|
... on AxisContent {
|
||||||
|
axisId
|
||||||
|
videoPlayerDestCode
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}''' % display_id,
|
||||||
|
})['data']['resolvedPath']['lastSegment']['content']
|
||||||
|
video_id = content['axisId']
|
||||||
|
return self.url_result(
|
||||||
|
'9c9media:%s:%s' % (content['videoPlayerDestCode'], video_id),
|
||||||
|
'NineCNineMedia', video_id)
|
@ -17,7 +17,12 @@ from ..utils import (
|
|||||||
class DPlayIE(InfoExtractor):
|
class DPlayIE(InfoExtractor):
|
||||||
_VALID_URL = r'''(?x)https?://
|
_VALID_URL = r'''(?x)https?://
|
||||||
(?P<domain>
|
(?P<domain>
|
||||||
(?:www\.)?(?P<host>dplay\.(?P<country>dk|fi|jp|se|no))|
|
(?:www\.)?(?P<host>d
|
||||||
|
(?:
|
||||||
|
play\.(?P<country>dk|fi|jp|se|no)|
|
||||||
|
iscoveryplus\.(?P<plus_country>dk|es|fi|it|se|no)
|
||||||
|
)
|
||||||
|
)|
|
||||||
(?P<subdomain_country>es|it)\.dplay\.com
|
(?P<subdomain_country>es|it)\.dplay\.com
|
||||||
)/[^/]+/(?P<id>[^/]+/[^/?#]+)'''
|
)/[^/]+/(?P<id>[^/]+/[^/?#]+)'''
|
||||||
|
|
||||||
@ -126,6 +131,24 @@ class DPlayIE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.dplay.jp/video/gold-rush/24086',
|
'url': 'https://www.dplay.jp/video/gold-rush/24086',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.discoveryplus.se/videos/nugammalt-77-handelser-som-format-sverige/nugammalt-77-handelser-som-format-sverige-101',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.discoveryplus.dk/videoer/ted-bundy-mind-of-a-monster/ted-bundy-mind-of-a-monster',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.discoveryplus.no/videoer/i-kongens-klr/sesong-1-episode-7',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.discoveryplus.it/videos/biografie-imbarazzanti/luigi-di-maio-la-psicosi-di-stanislawskij',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.discoveryplus.es/videos/la-fiebre-del-oro/temporada-8-episodio-1',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.discoveryplus.fi/videot/shifting-gears-with-aaron-kaufman/episode-16',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _get_disco_api_info(self, url, display_id, disco_host, realm, country):
|
def _get_disco_api_info(self, url, display_id, disco_host, realm, country):
|
||||||
@ -241,7 +264,7 @@ class DPlayIE(InfoExtractor):
|
|||||||
mobj = re.match(self._VALID_URL, url)
|
mobj = re.match(self._VALID_URL, url)
|
||||||
display_id = mobj.group('id')
|
display_id = mobj.group('id')
|
||||||
domain = mobj.group('domain').lstrip('www.')
|
domain = mobj.group('domain').lstrip('www.')
|
||||||
country = mobj.group('country') or mobj.group('subdomain_country')
|
country = mobj.group('country') or mobj.group('subdomain_country') or mobj.group('plus_country')
|
||||||
host = 'disco-api.' + domain if domain.startswith('dplay.') else 'eu2-prod.disco-api.com'
|
host = 'disco-api.' + domain if domain[0] == 'd' else 'eu2-prod.disco-api.com'
|
||||||
return self._get_disco_api_info(
|
return self._get_disco_api_info(
|
||||||
url, display_id, host, 'dplay' + country, country)
|
url, display_id, host, 'dplay' + country, country)
|
||||||
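Reviewer note: the extended `_VALID_URL` shares the leading `d` between `dplay.*` and `discoveryplus.*`, and the country now comes from whichever of three named groups matched. The sketch below copies the pattern from the hunk and shows only the group selection. `pick_country` is a hypothetical helper, and the `es.dplay.com` URL is a made-up example.

```python
import re

_VALID_URL = r'''(?x)https?://
    (?P<domain>
        (?:www\.)?(?P<host>d
            (?:
                play\.(?P<country>dk|fi|jp|se|no)|
                iscoveryplus\.(?P<plus_country>dk|es|fi|it|se|no)
            )
        )|
        (?P<subdomain_country>es|it)\.dplay\.com
    )/[^/]+/(?P<id>[^/]+/[^/?#]+)'''


def pick_country(url):
    mobj = re.match(_VALID_URL, url)
    if mobj is None:
        return None
    # Only one of the three groups is set for any given match.
    return (mobj.group('country')
            or mobj.group('subdomain_country')
            or mobj.group('plus_country'))
```

For example, `pick_country('https://www.dplay.jp/video/gold-rush/24086')` takes the `country` group, while a `discoveryplus.se` URL falls through to `plus_country`.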
|
@ -29,7 +29,7 @@ class DRTVIE(InfoExtractor):
|
|||||||
https?://
|
https?://
|
||||||
(?:
|
(?:
|
||||||
(?:www\.)?dr\.dk/(?:tv/se|nyheder|radio(?:/ondemand)?)/(?:[^/]+/)*|
|
(?:www\.)?dr\.dk/(?:tv/se|nyheder|radio(?:/ondemand)?)/(?:[^/]+/)*|
|
||||||
(?:www\.)?(?:dr\.dk|dr-massive\.com)/drtv/(?:se|episode)/
|
(?:www\.)?(?:dr\.dk|dr-massive\.com)/drtv/(?:se|episode|program)/
|
||||||
)
|
)
|
||||||
(?P<id>[\da-z_-]+)
|
(?P<id>[\da-z_-]+)
|
||||||
'''
|
'''
|
||||||
@ -111,6 +111,9 @@ class DRTVIE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://dr-massive.com/drtv/se/bonderoeven_71769',
|
'url': 'https://dr-massive.com/drtv/se/bonderoeven_71769',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.dr.dk/drtv/program/jagten_220924',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
|
@ -12,7 +12,14 @@ from ..utils import (
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class EggheadCourseIE(InfoExtractor):
|
class EggheadBaseIE(InfoExtractor):
|
||||||
|
def _call_api(self, path, video_id, resource, fatal=True):
|
||||||
|
return self._download_json(
|
||||||
|
'https://app.egghead.io/api/v1/' + path,
|
||||||
|
video_id, 'Downloading %s JSON' % resource, fatal=fatal)
|
||||||
|
|
||||||
|
|
||||||
|
class EggheadCourseIE(EggheadBaseIE):
|
||||||
IE_DESC = 'egghead.io course'
|
IE_DESC = 'egghead.io course'
|
||||||
IE_NAME = 'egghead:course'
|
IE_NAME = 'egghead:course'
|
||||||
_VALID_URL = r'https://egghead\.io/courses/(?P<id>[^/?#&]+)'
|
_VALID_URL = r'https://egghead\.io/courses/(?P<id>[^/?#&]+)'
|
||||||
@ -28,10 +35,9 @@ class EggheadCourseIE(InfoExtractor):
|
|||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
playlist_id = self._match_id(url)
|
playlist_id = self._match_id(url)
|
||||||
|
series_path = 'series/' + playlist_id
|
||||||
lessons = self._download_json(
|
lessons = self._call_api(
|
||||||
'https://egghead.io/api/v1/series/%s/lessons' % playlist_id,
|
series_path + '/lessons', playlist_id, 'course lessons')
|
||||||
playlist_id, 'Downloading course lessons JSON')
|
|
||||||
|
|
||||||
entries = []
|
entries = []
|
||||||
for lesson in lessons:
|
for lesson in lessons:
|
||||||
@ -44,9 +50,8 @@ class EggheadCourseIE(InfoExtractor):
|
|||||||
entries.append(self.url_result(
|
entries.append(self.url_result(
|
||||||
lesson_url, ie=EggheadLessonIE.ie_key(), video_id=lesson_id))
|
lesson_url, ie=EggheadLessonIE.ie_key(), video_id=lesson_id))
|
||||||
|
|
||||||
course = self._download_json(
|
course = self._call_api(
|
||||||
'https://egghead.io/api/v1/series/%s' % playlist_id,
|
series_path, playlist_id, 'course', False) or {}
|
||||||
playlist_id, 'Downloading course JSON', fatal=False) or {}
|
|
||||||
|
|
||||||
playlist_id = course.get('id')
|
playlist_id = course.get('id')
|
||||||
if playlist_id:
|
if playlist_id:
|
||||||
@ -57,7 +62,7 @@ class EggheadCourseIE(InfoExtractor):
|
|||||||
course.get('description'))
|
course.get('description'))
|
||||||
|
|
||||||
|
|
||||||
class EggheadLessonIE(InfoExtractor):
|
class EggheadLessonIE(EggheadBaseIE):
|
||||||
IE_DESC = 'egghead.io lesson'
|
IE_DESC = 'egghead.io lesson'
|
||||||
IE_NAME = 'egghead:lesson'
|
IE_NAME = 'egghead:lesson'
|
||||||
_VALID_URL = r'https://egghead\.io/(?:api/v1/)?lessons/(?P<id>[^/?#&]+)'
|
_VALID_URL = r'https://egghead\.io/(?:api/v1/)?lessons/(?P<id>[^/?#&]+)'
|
||||||
@ -74,7 +79,7 @@ class EggheadLessonIE(InfoExtractor):
|
|||||||
'upload_date': '20161209',
|
'upload_date': '20161209',
|
||||||
'duration': 304,
|
'duration': 304,
|
||||||
'view_count': 0,
|
'view_count': 0,
|
||||||
'tags': ['javascript', 'free'],
|
'tags': 'count:2',
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
@ -88,8 +93,8 @@ class EggheadLessonIE(InfoExtractor):
|
|||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url)
|
display_id = self._match_id(url)
|
||||||
|
|
||||||
lesson = self._download_json(
|
lesson = self._call_api(
|
||||||
'https://egghead.io/api/v1/lessons/%s' % display_id, display_id)
|
'lessons/' + display_id, display_id, 'lesson')
|
||||||
|
|
||||||
lesson_id = compat_str(lesson['id'])
|
lesson_id = compat_str(lesson['id'])
|
||||||
title = lesson['title']
|
title = lesson['title']
|
||||||
|
@ -16,7 +16,7 @@ from ..utils import (
|
|||||||
|
|
||||||
|
|
||||||
class EpornerIE(InfoExtractor):
|
class EpornerIE(InfoExtractor):
|
||||||
_VALID_URL = r'https?://(?:www\.)?eporner\.com/(?:hd-porn|embed)/(?P<id>\w+)(?:/(?P<display_id>[\w-]+))?'
|
_VALID_URL = r'https?://(?:www\.)?eporner\.com/(?:(?:hd-porn|embed)/|video-)(?P<id>\w+)(?:/(?P<display_id>[\w-]+))?'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.eporner.com/hd-porn/95008/Infamous-Tiffany-Teen-Strip-Tease-Video/',
|
'url': 'http://www.eporner.com/hd-porn/95008/Infamous-Tiffany-Teen-Strip-Tease-Video/',
|
||||||
'md5': '39d486f046212d8e1b911c52ab4691f8',
|
'md5': '39d486f046212d8e1b911c52ab4691f8',
|
||||||
@ -43,7 +43,10 @@ class EpornerIE(InfoExtractor):
|
|||||||
'url': 'http://www.eporner.com/hd-porn/3YRUtzMcWn0',
|
'url': 'http://www.eporner.com/hd-porn/3YRUtzMcWn0',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.eporner.com/hd-porn/3YRUtzMcWn0',
|
'url': 'http://www.eporner.com/embed/3YRUtzMcWn0',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.eporner.com/video-FJsA19J3Y3H/one-of-the-greats/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
@ -57,7 +60,7 @@ class EpornerIE(InfoExtractor):
|
|||||||
video_id = self._match_id(urlh.geturl())
|
video_id = self._match_id(urlh.geturl())
|
||||||
|
|
||||||
hash = self._search_regex(
|
hash = self._search_regex(
|
||||||
r'hash\s*:\s*["\']([\da-f]{32})', webpage, 'hash')
|
r'hash\s*[:=]\s*["\']([\da-f]{32})', webpage, 'hash')
|
||||||
|
|
||||||
title = self._og_search_title(webpage, default=None) or self._html_search_regex(
|
title = self._og_search_title(webpage, default=None) or self._html_search_regex(
|
||||||
r'<title>(.+?) - EPORNER', webpage, 'title')
|
r'<title>(.+?) - EPORNER', webpage, 'title')
|
||||||
@ -115,8 +118,8 @@ class EpornerIE(InfoExtractor):
|
|||||||
duration = parse_duration(self._html_search_meta(
|
duration = parse_duration(self._html_search_meta(
|
||||||
'duration', webpage, default=None))
|
'duration', webpage, default=None))
|
||||||
view_count = str_to_int(self._search_regex(
|
view_count = str_to_int(self._search_regex(
|
||||||
r'id="cinemaviews">\s*([0-9,]+)\s*<small>views',
|
r'id=["\']cinemaviews1["\'][^>]*>\s*([0-9,]+)',
|
||||||
webpage, 'view count', fatal=False))
|
webpage, 'view count', default=None))
|
||||||
|
|
||||||
return merge_dicts(json_ld, {
|
return merge_dicts(json_ld, {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
|
@ -1,77 +0,0 @@
|
|||||||
from __future__ import unicode_literals
|
|
||||||
|
|
||||||
import re
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
|
||||||
from ..utils import (
|
|
||||||
ExtractorError,
|
|
||||||
sanitized_Request,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class EveryonesMixtapeIE(InfoExtractor):
|
|
||||||
_VALID_URL = r'https?://(?:www\.)?everyonesmixtape\.com/#/mix/(?P<id>[0-9a-zA-Z]+)(?:/(?P<songnr>[0-9]))?$'
|
|
||||||
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'http://everyonesmixtape.com/#/mix/m7m0jJAbMQi/5',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '5bfseWNmlds',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': "Passion Pit - \"Sleepyhead\" (Official Music Video)",
|
|
||||||
'uploader': 'FKR.TV',
|
|
||||||
'uploader_id': 'frenchkissrecords',
|
|
||||||
'description': "Music video for \"Sleepyhead\" from Passion Pit's debut EP Chunk Of Change.\nBuy on iTunes: https://itunes.apple.com/us/album/chunk-of-change-ep/id300087641\n\nDirected by The Wilderness.\n\nhttp://www.passionpitmusic.com\nhttp://www.frenchkissrecords.com",
|
|
||||||
'upload_date': '20081015'
|
|
||||||
},
|
|
||||||
'params': {
|
|
||||||
'skip_download': True, # This is simply YouTube
|
|
||||||
}
|
|
||||||
}, {
|
|
||||||
'url': 'http://everyonesmixtape.com/#/mix/m7m0jJAbMQi',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'm7m0jJAbMQi',
|
|
||||||
'title': 'Driving',
|
|
||||||
},
|
|
||||||
'playlist_count': 24
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
mobj = re.match(self._VALID_URL, url)
|
|
||||||
playlist_id = mobj.group('id')
|
|
||||||
|
|
||||||
pllist_url = 'http://everyonesmixtape.com/mixtape.php?a=getMixes&u=-1&linked=%s&explore=' % playlist_id
|
|
||||||
pllist_req = sanitized_Request(pllist_url)
|
|
||||||
pllist_req.add_header('X-Requested-With', 'XMLHttpRequest')
|
|
||||||
|
|
||||||
playlist_list = self._download_json(
|
|
||||||
pllist_req, playlist_id, note='Downloading playlist metadata')
|
|
||||||
try:
|
|
||||||
playlist_no = next(playlist['id']
|
|
||||||
for playlist in playlist_list
|
|
||||||
if playlist['code'] == playlist_id)
|
|
||||||
except StopIteration:
|
|
||||||
raise ExtractorError('Playlist id not found')
|
|
||||||
|
|
||||||
pl_url = 'http://everyonesmixtape.com/mixtape.php?a=getMix&id=%s&userId=null&code=' % playlist_no
|
|
||||||
pl_req = sanitized_Request(pl_url)
|
|
||||||
pl_req.add_header('X-Requested-With', 'XMLHttpRequest')
|
|
||||||
playlist = self._download_json(
|
|
||||||
pl_req, playlist_id, note='Downloading playlist info')
|
|
||||||
|
|
||||||
entries = [{
|
|
||||||
'_type': 'url',
|
|
||||||
'url': t['url'],
|
|
||||||
'title': t['title'],
|
|
||||||
} for t in playlist['tracks']]
|
|
||||||
|
|
||||||
if mobj.group('songnr'):
|
|
||||||
songnr = int(mobj.group('songnr')) - 1
|
|
||||||
return entries[songnr]
|
|
||||||
|
|
||||||
playlist_title = playlist['mixData']['name']
|
|
||||||
return {
|
|
||||||
'_type': 'playlist',
|
|
||||||
'id': playlist_id,
|
|
||||||
'title': playlist_title,
|
|
||||||
'entries': entries,
|
|
||||||
}
|
|
@ -30,7 +30,11 @@ from .adobetv import (
|
|||||||
from .adultswim import AdultSwimIE
|
from .adultswim import AdultSwimIE
|
||||||
from .aenetworks import (
|
from .aenetworks import (
|
||||||
AENetworksIE,
|
AENetworksIE,
|
||||||
|
AENetworksCollectionIE,
|
||||||
|
AENetworksShowIE,
|
||||||
HistoryTopicIE,
|
HistoryTopicIE,
|
||||||
|
HistoryPlayerIE,
|
||||||
|
BiographyIE,
|
||||||
)
|
)
|
||||||
from .afreecatv import AfreecaTVIE
|
from .afreecatv import AfreecaTVIE
|
||||||
from .airmozilla import AirMozillaIE
|
from .airmozilla import AirMozillaIE
|
||||||
@ -38,7 +42,10 @@ from .aljazeera import AlJazeeraIE
|
|||||||
from .alphaporno import AlphaPornoIE
|
from .alphaporno import AlphaPornoIE
|
||||||
from .amara import AmaraIE
|
from .amara import AmaraIE
|
||||||
from .amcnetworks import AMCNetworksIE
|
from .amcnetworks import AMCNetworksIE
|
||||||
from .americastestkitchen import AmericasTestKitchenIE
|
from .americastestkitchen import (
|
||||||
|
AmericasTestKitchenIE,
|
||||||
|
AmericasTestKitchenSeasonIE,
|
||||||
|
)
|
||||||
from .animeondemand import AnimeOnDemandIE
|
from .animeondemand import AnimeOnDemandIE
|
||||||
from .anvato import AnvatoIE
|
from .anvato import AnvatoIE
|
||||||
from .aol import AolIE
|
from .aol import AolIE
|
||||||
@ -51,7 +58,9 @@ from .appletrailers import (
|
|||||||
AppleTrailersIE,
|
AppleTrailersIE,
|
||||||
AppleTrailersSectionIE,
|
AppleTrailersSectionIE,
|
||||||
)
|
)
|
||||||
|
from .applepodcasts import ApplePodcastsIE
|
||||||
from .archiveorg import ArchiveOrgIE
|
from .archiveorg import ArchiveOrgIE
|
||||||
|
from .arcpublishing import ArcPublishingIE
|
||||||
from .arkena import ArkenaIE
|
from .arkena import ArkenaIE
|
||||||
from .ard import (
|
from .ard import (
|
||||||
ARDBetaMediathekIE,
|
ARDBetaMediathekIE,
|
||||||
@ -89,16 +98,18 @@ from .bbc import (
|
|||||||
BBCCoUkPlaylistIE,
|
BBCCoUkPlaylistIE,
|
||||||
BBCIE,
|
BBCIE,
|
||||||
)
|
)
|
||||||
from .beampro import (
|
|
||||||
BeamProLiveIE,
|
|
||||||
BeamProVodIE,
|
|
||||||
)
|
|
||||||
from .beeg import BeegIE
|
from .beeg import BeegIE
|
||||||
from .behindkink import BehindKinkIE
|
from .behindkink import BehindKinkIE
|
||||||
from .bellmedia import BellMediaIE
|
from .bellmedia import BellMediaIE
|
||||||
from .beatport import BeatportIE
|
from .beatport import BeatportIE
|
||||||
from .bet import BetIE
|
from .bet import BetIE
|
||||||
from .bfi import BFIPlayerIE
|
from .bfi import BFIPlayerIE
|
||||||
|
from .bfmtv import (
|
||||||
|
BFMTVIE,
|
||||||
|
BFMTVLiveIE,
|
||||||
|
BFMTVArticleIE,
|
||||||
|
)
|
||||||
|
from .bibeltv import BibelTVIE
|
||||||
from .bigflix import BigflixIE
|
from .bigflix import BigflixIE
|
||||||
from .bild import BildIE
|
from .bild import BildIE
|
||||||
from .bilibili import (
|
from .bilibili import (
|
||||||
@ -121,6 +132,7 @@ from .bleacherreport import (
|
|||||||
from .blinkx import BlinkxIE
|
from .blinkx import BlinkxIE
|
||||||
from .bloomberg import BloombergIE
|
from .bloomberg import BloombergIE
|
||||||
from .bokecc import BokeCCIE
|
from .bokecc import BokeCCIE
|
||||||
|
from .bongacams import BongaCamsIE
|
||||||
from .bostonglobe import BostonGlobeIE
|
from .bostonglobe import BostonGlobeIE
|
||||||
from .box import BoxIE
|
from .box import BoxIE
|
||||||
from .bpb import BpbIE
|
from .bpb import BpbIE
|
||||||
@ -151,6 +163,7 @@ from .canvas import (
|
|||||||
CanvasIE,
|
CanvasIE,
|
||||||
CanvasEenIE,
|
CanvasEenIE,
|
||||||
VrtNUIE,
|
VrtNUIE,
|
||||||
|
DagelijkseKostIE,
|
||||||
)
|
)
|
||||||
from .carambatv import (
|
from .carambatv import (
|
||||||
CarambaTVIE,
|
CarambaTVIE,
|
||||||
@ -165,7 +178,10 @@ from .cbc import (
|
|||||||
CBCOlympicsIE,
|
CBCOlympicsIE,
|
||||||
)
|
)
|
||||||
from .cbs import CBSIE
|
from .cbs import CBSIE
|
||||||
from .cbslocal import CBSLocalIE
|
from .cbslocal import (
|
||||||
|
CBSLocalIE,
|
||||||
|
CBSLocalArticleIE,
|
||||||
|
)
|
||||||
from .cbsinteractive import CBSInteractiveIE
|
from .cbsinteractive import CBSInteractiveIE
|
||||||
from .cbsnews import (
|
from .cbsnews import (
|
||||||
CBSNewsEmbedIE,
|
CBSNewsEmbedIE,
|
||||||
@ -220,11 +236,8 @@ from .cnn import (
|
|||||||
)
|
)
|
||||||
from .coub import CoubIE
|
from .coub import CoubIE
|
||||||
from .comedycentral import (
|
from .comedycentral import (
|
||||||
ComedyCentralFullEpisodesIE,
|
|
||||||
ComedyCentralIE,
|
ComedyCentralIE,
|
||||||
ComedyCentralShortnameIE,
|
|
||||||
ComedyCentralTVIE,
|
ComedyCentralTVIE,
|
||||||
ToshIE,
|
|
||||||
)
|
)
|
||||||
from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
|
from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
|
||||||
from .commonprotocols import (
|
from .commonprotocols import (
|
||||||
@ -243,6 +256,7 @@ from .crunchyroll import (
|
|||||||
)
|
)
|
||||||
from .cspan import CSpanIE
|
from .cspan import CSpanIE
|
||||||
from .ctsnews import CtsNewsIE
|
from .ctsnews import CtsNewsIE
|
||||||
|
from .ctv import CTVIE
|
||||||
from .ctvnews import CTVNewsIE
|
from .ctvnews import CTVNewsIE
|
||||||
from .cultureunplugged import CultureUnpluggedIE
|
from .cultureunplugged import CultureUnpluggedIE
|
||||||
from .curiositystream import (
|
from .curiositystream import (
|
||||||
@ -329,7 +343,6 @@ from .espn import (
|
|||||||
)
|
)
|
||||||
from .esri import EsriVideoIE
|
from .esri import EsriVideoIE
|
||||||
from .europa import EuropaIE
|
from .europa import EuropaIE
|
||||||
from .everyonesmixtape import EveryonesMixtapeIE
|
|
||||||
from .expotv import ExpoTVIE
|
from .expotv import ExpoTVIE
|
||||||
from .expressen import ExpressenIE
|
from .expressen import ExpressenIE
|
||||||
from .extremetube import ExtremeTubeIE
|
from .extremetube import ExtremeTubeIE
|
||||||
@ -393,10 +406,10 @@ from .frontendmasters import (
|
|||||||
FrontendMastersLessonIE,
|
FrontendMastersLessonIE,
|
||||||
FrontendMastersCourseIE
|
FrontendMastersCourseIE
|
||||||
)
|
)
|
||||||
|
from .fujitv import FujiTVFODPlus7IE
|
||||||
from .funimation import FunimationIE
|
from .funimation import FunimationIE
|
||||||
from .funk import FunkIE
|
from .funk import FunkIE
|
||||||
from .fusion import FusionIE
|
from .fusion import FusionIE
|
||||||
from .fxnetworks import FXNetworksIE
|
|
||||||
from .gaia import GaiaIE
|
from .gaia import GaiaIE
|
||||||
from .gameinformer import GameInformerIE
|
from .gameinformer import GameInformerIE
|
||||||
from .gamespot import GameSpotIE
|
from .gamespot import GameSpotIE
|
||||||
@ -417,7 +430,10 @@ from .go import GoIE
|
|||||||
from .godtube import GodTubeIE
|
from .godtube import GodTubeIE
|
||||||
from .golem import GolemIE
|
from .golem import GolemIE
|
||||||
from .googledrive import GoogleDriveIE
|
from .googledrive import GoogleDriveIE
|
||||||
from .googleplus import GooglePlusIE
|
from .googlepodcasts import (
|
||||||
|
GooglePodcastsIE,
|
||||||
|
GooglePodcastsFeedIE,
|
||||||
|
)
|
||||||
from .googlesearch import GoogleSearchIE
|
from .googlesearch import GoogleSearchIE
|
||||||
from .goshgay import GoshgayIE
|
from .goshgay import GoshgayIE
|
||||||
from .gputechconf import GPUTechConfIE
|
from .gputechconf import GPUTechConfIE
|
||||||
@ -455,8 +471,12 @@ from .hungama import (
|
|||||||
from .hypem import HypemIE
|
from .hypem import HypemIE
|
||||||
from .ign import (
|
from .ign import (
|
||||||
IGNIE,
|
IGNIE,
|
||||||
OneUPIE,
|
IGNVideoIE,
|
||||||
PCMagIE,
|
IGNArticleIE,
|
||||||
|
)
|
||||||
|
from .iheart import (
|
||||||
|
IHeartRadioIE,
|
||||||
|
IHeartRadioPodcastIE,
|
||||||
)
|
)
|
||||||
from .imdb import (
|
from .imdb import (
|
||||||
ImdbIE,
|
ImdbIE,
|
||||||
@ -502,13 +522,15 @@ from .joj import JojIE
|
|||||||
from .jwplatform import JWPlatformIE
|
from .jwplatform import JWPlatformIE
|
||||||
from .kakao import KakaoIE
|
from .kakao import KakaoIE
|
||||||
from .kaltura import KalturaIE
|
from .kaltura import KalturaIE
|
||||||
from .kanalplay import KanalPlayIE
|
|
||||||
from .kankan import KankanIE
|
from .kankan import KankanIE
|
||||||
from .karaoketv import KaraoketvIE
|
from .karaoketv import KaraoketvIE
|
||||||
from .karrierevideos import KarriereVideosIE
|
from .karrierevideos import KarriereVideosIE
|
||||||
from .keezmovies import KeezMoviesIE
|
from .keezmovies import KeezMoviesIE
|
||||||
from .ketnet import KetnetIE
|
from .ketnet import KetnetIE
|
||||||
from .khanacademy import KhanAcademyIE
|
from .khanacademy import (
|
||||||
|
KhanAcademyIE,
|
||||||
|
KhanAcademyUnitIE,
|
||||||
|
)
|
||||||
from .kickstarter import KickStarterIE
|
from .kickstarter import KickStarterIE
|
||||||
from .kinja import KinjaEmbedIE
|
from .kinja import KinjaEmbedIE
|
||||||
from .kinopoisk import KinoPoiskIE
|
from .kinopoisk import KinoPoiskIE
|
||||||
@ -531,7 +553,10 @@ from .laola1tv import (
|
|||||||
EHFTVIE,
|
EHFTVIE,
|
||||||
ITTFIE,
|
ITTFIE,
|
||||||
)
|
)
|
||||||
from .lbry import LBRYIE
|
from .lbry import (
|
||||||
|
LBRYIE,
|
||||||
|
LBRYChannelIE,
|
||||||
|
)
|
||||||
from .lci import LCIIE
|
from .lci import LCIIE
|
||||||
from .lcp import (
|
from .lcp import (
|
||||||
LcpPlayIE,
|
LcpPlayIE,
|
||||||
@ -606,6 +631,7 @@ from .markiza import (
|
|||||||
from .massengeschmacktv import MassengeschmackTVIE
|
from .massengeschmacktv import MassengeschmackTVIE
|
||||||
from .matchtv import MatchTVIE
|
from .matchtv import MatchTVIE
|
||||||
from .mdr import MDRIE
|
from .mdr import MDRIE
|
||||||
|
from .medaltv import MedalTVIE
|
||||||
from .mediaset import MediasetIE
|
from .mediaset import MediasetIE
|
||||||
from .mediasite import (
|
from .mediasite import (
|
||||||
MediasiteIE,
|
MediasiteIE,
|
||||||
@ -626,6 +652,11 @@ from .microsoftvirtualacademy import (
|
|||||||
MicrosoftVirtualAcademyIE,
|
MicrosoftVirtualAcademyIE,
|
||||||
MicrosoftVirtualAcademyCourseIE,
|
MicrosoftVirtualAcademyCourseIE,
|
||||||
)
|
)
|
||||||
|
from .minds import (
|
||||||
|
MindsIE,
|
||||||
|
MindsChannelIE,
|
||||||
|
MindsGroupIE,
|
||||||
|
)
|
||||||
from .ministrygrid import MinistryGridIE
|
from .ministrygrid import MinistryGridIE
|
||||||
from .minoto import MinotoIE
|
from .minoto import MinotoIE
|
||||||
from .miomio import MioMioIE
|
from .miomio import MioMioIE
|
||||||
@ -676,9 +707,15 @@ from .nationalgeographic import (
|
|||||||
NationalGeographicTVIE,
|
NationalGeographicTVIE,
|
||||||
)
|
)
|
||||||
from .naver import NaverIE
|
from .naver import NaverIE
|
||||||
from .nba import NBAIE
|
from .nba import (
|
||||||
|
NBAWatchEmbedIE,
|
||||||
|
NBAWatchIE,
|
||||||
|
NBAWatchCollectionIE,
|
||||||
|
NBAEmbedIE,
|
||||||
|
NBAIE,
|
||||||
|
NBAChannelIE,
|
||||||
|
)
|
||||||
from .nbc import (
|
from .nbc import (
|
||||||
CSNNEIE,
|
|
||||||
NBCIE,
|
NBCIE,
|
||||||
NBCNewsIE,
|
NBCNewsIE,
|
||||||
NBCOlympicsIE,
|
NBCOlympicsIE,
|
||||||
@ -721,8 +758,14 @@ from .nexx import (
|
|||||||
NexxIE,
|
NexxIE,
|
||||||
NexxEmbedIE,
|
NexxEmbedIE,
|
||||||
)
|
)
|
||||||
from .nfl import NFLIE
|
from .nfl import (
|
||||||
from .nhk import NhkVodIE
|
NFLIE,
|
||||||
|
NFLArticleIE,
|
||||||
|
)
|
||||||
|
from .nhk import (
|
||||||
|
NhkVodIE,
|
||||||
|
NhkVodProgramIE,
|
||||||
|
)
|
||||||
from .nhl import NHLIE
|
from .nhl import NHLIE
|
||||||
from .nick import (
|
from .nick import (
|
||||||
NickIE,
|
NickIE,
|
||||||
@ -738,7 +781,6 @@ from .ninenow import NineNowIE
|
|||||||
from .nintendo import NintendoIE
|
from .nintendo import NintendoIE
|
||||||
from .njpwworld import NJPWWorldIE
|
from .njpwworld import NJPWWorldIE
|
||||||
from .nobelprize import NobelPrizeIE
|
from .nobelprize import NobelPrizeIE
|
||||||
from .noco import NocoIE
|
|
||||||
from .nonktube import NonkTubeIE
|
from .nonktube import NonkTubeIE
|
||||||
from .noovo import NoovoIE
|
from .noovo import NoovoIE
|
||||||
from .normalboots import NormalbootsIE
|
from .normalboots import NormalbootsIE
|
||||||
@ -771,6 +813,7 @@ from .nrk import (
|
|||||||
NRKSkoleIE,
|
NRKSkoleIE,
|
||||||
NRKTVIE,
|
NRKTVIE,
|
||||||
NRKTVDirekteIE,
|
NRKTVDirekteIE,
|
||||||
|
NRKRadioPodkastIE,
|
||||||
NRKTVEpisodeIE,
|
NRKTVEpisodeIE,
|
||||||
NRKTVEpisodesIE,
|
NRKTVEpisodesIE,
|
||||||
NRKTVSeasonIE,
|
NRKTVSeasonIE,
|
||||||
@ -1034,16 +1077,11 @@ from .skynewsarabia import (
|
|||||||
from .sky import (
|
from .sky import (
|
||||||
SkyNewsIE,
|
SkyNewsIE,
|
||||||
SkySportsIE,
|
SkySportsIE,
|
||||||
|
SkySportsNewsIE,
|
||||||
)
|
)
|
||||||
from .slideshare import SlideshareIE
|
from .slideshare import SlideshareIE
|
||||||
from .slideslive import SlidesLiveIE
|
from .slideslive import SlidesLiveIE
|
||||||
from .slutload import SlutloadIE
|
from .slutload import SlutloadIE
|
||||||
from .smotri import (
|
|
||||||
SmotriIE,
|
|
||||||
SmotriCommunityIE,
|
|
||||||
SmotriUserIE,
|
|
||||||
SmotriBroadcastIE,
|
|
||||||
)
|
|
||||||
from .snotr import SnotrIE
|
from .snotr import SnotrIE
|
||||||
from .sohu import SohuIE
|
from .sohu import SohuIE
|
||||||
from .sonyliv import SonyLIVIE
|
from .sonyliv import SonyLIVIE
|
||||||
@ -1077,10 +1115,23 @@ from .spike import (
|
|||||||
BellatorIE,
|
BellatorIE,
|
||||||
ParamountNetworkIE,
|
ParamountNetworkIE,
|
||||||
)
|
)
|
||||||
from .stitcher import StitcherIE
|
from .stitcher import (
|
||||||
|
StitcherIE,
|
||||||
|
StitcherShowIE,
|
||||||
|
)
|
||||||
from .sport5 import Sport5IE
|
from .sport5 import Sport5IE
|
||||||
from .sportbox import SportBoxIE
|
from .sportbox import SportBoxIE
|
||||||
from .sportdeutschland import SportDeutschlandIE
|
from .sportdeutschland import SportDeutschlandIE
|
||||||
|
from .spotify import (
|
||||||
|
SpotifyIE,
|
||||||
|
SpotifyShowIE,
|
||||||
|
)
|
||||||
|
from .spreaker import (
|
||||||
|
SpreakerIE,
|
||||||
|
SpreakerPageIE,
|
||||||
|
SpreakerShowIE,
|
||||||
|
SpreakerShowPageIE,
|
||||||
|
)
|
||||||
from .springboardplatform import SpringboardPlatformIE
|
from .springboardplatform import SpringboardPlatformIE
|
||||||
from .sprout import SproutIE
|
from .sprout import SproutIE
|
||||||
from .srgssr import (
|
from .srgssr import (
|
||||||
@ -1115,7 +1166,6 @@ from .tagesschau import (
|
|||||||
TagesschauIE,
|
TagesschauIE,
|
||||||
)
|
)
|
||||||
from .tass import TassIE
|
from .tass import TassIE
|
||||||
from .tastytrade import TastyTradeIE
|
|
||||||
from .tbs import TBSIE
|
from .tbs import TBSIE
|
||||||
from .tdslifeway import TDSLifewayIE
|
from .tdslifeway import TDSLifewayIE
|
||||||
from .teachable import (
|
from .teachable import (
|
||||||
@ -1142,6 +1192,7 @@ from .telequebec import (
|
|||||||
TeleQuebecSquatIE,
|
TeleQuebecSquatIE,
|
||||||
TeleQuebecEmissionIE,
|
TeleQuebecEmissionIE,
|
||||||
TeleQuebecLiveIE,
|
TeleQuebecLiveIE,
|
||||||
|
TeleQuebecVideoIE,
|
||||||
)
|
)
|
||||||
from .teletask import TeleTaskIE
|
from .teletask import TeleTaskIE
|
||||||
from .telewebion import TelewebionIE
|
from .telewebion import TelewebionIE
|
||||||
@ -1178,13 +1229,20 @@ from .tnaflix import (
|
|||||||
EMPFlixIE,
|
EMPFlixIE,
|
||||||
MovieFapIE,
|
MovieFapIE,
|
||||||
)
|
)
|
||||||
from .toggle import ToggleIE
|
from .toggle import (
|
||||||
|
ToggleIE,
|
||||||
|
MeWatchIE,
|
||||||
|
)
|
||||||
from .tonline import TOnlineIE
|
from .tonline import TOnlineIE
|
||||||
from .toongoggles import ToonGogglesIE
|
from .toongoggles import ToonGogglesIE
|
||||||
from .toutv import TouTvIE
|
from .toutv import TouTvIE
|
||||||
from .toypics import ToypicsUserIE, ToypicsIE
|
from .toypics import ToypicsUserIE, ToypicsIE
|
||||||
from .traileraddict import TrailerAddictIE
|
from .traileraddict import TrailerAddictIE
|
||||||
from .trilulilu import TriluliluIE
|
from .trilulilu import TriluliluIE
|
||||||
|
from .trovo import (
|
||||||
|
TrovoIE,
|
||||||
|
TrovoVodIE,
|
||||||
|
)
|
||||||
from .trunews import TruNewsIE
|
from .trunews import TruNewsIE
|
||||||
from .trutv import TruTVIE
|
from .trutv import TruTVIE
|
||||||
from .tube8 import Tube8IE
|
from .tube8 import Tube8IE
|
||||||
@ -1203,6 +1261,7 @@ from .tv2 import (
|
|||||||
TV2IE,
|
TV2IE,
|
||||||
TV2ArticleIE,
|
TV2ArticleIE,
|
||||||
KatsomoIE,
|
KatsomoIE,
|
||||||
|
MTVUutisetArticleIE,
|
||||||
)
|
)
|
||||||
from .tv2dk import (
|
from .tv2dk import (
|
||||||
TV2DKIE,
|
TV2DKIE,
|
||||||
@ -1211,7 +1270,14 @@ from .tv2dk import (
|
|||||||
from .tv2hu import TV2HuIE
|
from .tv2hu import TV2HuIE
|
||||||
from .tv4 import TV4IE
|
from .tv4 import TV4IE
|
||||||
from .tv5mondeplus import TV5MondePlusIE
|
from .tv5mondeplus import TV5MondePlusIE
|
||||||
from .tva import TVAIE
|
from .tv5unis import (
|
||||||
|
TV5UnisVideoIE,
|
||||||
|
TV5UnisIE,
|
||||||
|
)
|
||||||
|
from .tva import (
|
||||||
|
TVAIE,
|
||||||
|
QubIE,
|
||||||
|
)
|
||||||
from .tvanouvelles import (
|
from .tvanouvelles import (
|
||||||
TVANouvellesIE,
|
TVANouvellesIE,
|
||||||
TVANouvellesArticleIE,
|
TVANouvellesArticleIE,
|
||||||
@ -1220,6 +1286,7 @@ from .tvc import (
|
|||||||
TVCIE,
|
TVCIE,
|
||||||
TVCArticleIE,
|
TVCArticleIE,
|
||||||
)
|
)
|
||||||
|
from .tver import TVerIE
|
||||||
from .tvigle import TvigleIE
|
from .tvigle import TvigleIE
|
||||||
from .tvland import TVLandIE
|
from .tvland import TVLandIE
|
||||||
from .tvn24 import TVN24IE
|
from .tvn24 import TVN24IE
|
||||||
@ -1333,7 +1400,6 @@ from .vidme import (
|
|||||||
VidmeUserIE,
|
VidmeUserIE,
|
||||||
VidmeUserLikesIE,
|
VidmeUserLikesIE,
|
||||||
)
|
)
|
||||||
from .vidzi import VidziIE
|
|
||||||
from .vier import VierIE, VierVideosIE
|
from .vier import VierIE, VierVideosIE
|
||||||
from .viewlift import (
|
from .viewlift import (
|
||||||
ViewLiftIE,
|
ViewLiftIE,
|
||||||
@ -1374,6 +1440,7 @@ from .vk import (
|
|||||||
)
|
)
|
||||||
from .vlive import (
|
from .vlive import (
|
||||||
VLiveIE,
|
VLiveIE,
|
||||||
|
VLivePostIE,
|
||||||
VLiveChannelIE,
|
VLiveChannelIE,
|
||||||
)
|
)
|
||||||
from .vodlocker import VodlockerIE
|
from .vodlocker import VodlockerIE
|
||||||
@ -1392,10 +1459,14 @@ from .vrv import (
|
|||||||
VRVSeriesIE,
|
VRVSeriesIE,
|
||||||
)
|
)
|
||||||
from .vshare import VShareIE
|
from .vshare import VShareIE
|
||||||
|
from .vtm import VTMIE
|
||||||
from .medialaan import MedialaanIE
|
from .medialaan import MedialaanIE
|
||||||
from .vube import VubeIE
|
from .vube import VubeIE
|
||||||
from .vuclip import VuClipIE
|
from .vuclip import VuClipIE
|
||||||
from .vvvvid import VVVVIDIE
|
from .vvvvid import (
|
||||||
|
VVVVIDIE,
|
||||||
|
VVVVIDShowIE,
|
||||||
|
)
|
||||||
from .vyborymos import VyboryMosIE
|
from .vyborymos import VyboryMosIE
|
||||||
from .vzaar import VzaarIE
|
from .vzaar import VzaarIE
|
||||||
from .wakanim import WakanimIE
|
from .wakanim import WakanimIE
|
||||||
@ -1426,7 +1497,10 @@ from .weibo import (
|
|||||||
WeiboMobileIE
|
WeiboMobileIE
|
||||||
)
|
)
|
||||||
from .weiqitv import WeiqiTVIE
|
from .weiqitv import WeiqiTVIE
|
||||||
from .wistia import WistiaIE
|
from .wistia import (
|
||||||
|
WistiaIE,
|
||||||
|
WistiaPlaylistIE,
|
||||||
|
)
|
||||||
from .worldstarhiphop import WorldStarHipHopIE
|
from .worldstarhiphop import WorldStarHipHopIE
|
||||||
from .wsj import (
|
from .wsj import (
|
||||||
WSJIE,
|
WSJIE,
|
||||||
@ -1470,6 +1544,8 @@ from .yandexmusic import (
|
|||||||
YandexMusicTrackIE,
|
YandexMusicTrackIE,
|
||||||
YandexMusicAlbumIE,
|
YandexMusicAlbumIE,
|
||||||
YandexMusicPlaylistIE,
|
YandexMusicPlaylistIE,
|
||||||
|
YandexMusicArtistTracksIE,
|
||||||
|
YandexMusicArtistAlbumsIE,
|
||||||
)
|
)
|
||||||
from .yandexvideo import YandexVideoIE
|
from .yandexvideo import YandexVideoIE
|
||||||
from .yapfiles import YapFilesIE
|
from .yapfiles import YapFilesIE
|
||||||
@ -1491,6 +1567,7 @@ from .yourporn import YourPornIE
|
|||||||
from .yourupload import YourUploadIE
|
from .yourupload import YourUploadIE
|
||||||
from .youtube import (
|
from .youtube import (
|
||||||
YoutubeIE,
|
YoutubeIE,
|
||||||
|
YoutubeFavouritesIE,
|
||||||
YoutubeHistoryIE,
|
YoutubeHistoryIE,
|
||||||
YoutubeTabIE,
|
YoutubeTabIE,
|
||||||
YoutubePlaylistIE,
|
YoutubePlaylistIE,
|
||||||
@ -1501,11 +1578,11 @@ from .youtube import (
|
|||||||
YoutubeSubscriptionsIE,
|
YoutubeSubscriptionsIE,
|
||||||
YoutubeTruncatedIDIE,
|
YoutubeTruncatedIDIE,
|
||||||
YoutubeTruncatedURLIE,
|
YoutubeTruncatedURLIE,
|
||||||
|
YoutubeYtBeIE,
|
||||||
YoutubeYtUserIE,
|
YoutubeYtUserIE,
|
||||||
YoutubeWatchLaterIE,
|
YoutubeWatchLaterIE,
|
||||||
)
|
)
|
||||||
from .zapiks import ZapiksIE
|
from .zapiks import ZapiksIE
|
||||||
from .zaq1 import Zaq1IE
|
|
||||||
from .zattoo import (
|
from .zattoo import (
|
||||||
BBVTVIE,
|
BBVTVIE,
|
||||||
EinsUndEinsTVIE,
|
EinsUndEinsTVIE,
|
||||||
|
@ -1,6 +1,7 @@
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import json
|
||||||
import re
|
import re
|
||||||
import socket
|
import socket
|
||||||
|
|
||||||
@ -8,6 +9,7 @@ from .common import InfoExtractor
|
|||||||
from ..compat import (
|
from ..compat import (
|
||||||
compat_etree_fromstring,
|
compat_etree_fromstring,
|
||||||
compat_http_client,
|
compat_http_client,
|
||||||
|
compat_str,
|
||||||
compat_urllib_error,
|
compat_urllib_error,
|
||||||
compat_urllib_parse_unquote,
|
compat_urllib_parse_unquote,
|
||||||
compat_urllib_parse_unquote_plus,
|
compat_urllib_parse_unquote_plus,
|
||||||
@ -16,14 +18,17 @@ from ..utils import (
|
|||||||
clean_html,
|
clean_html,
|
||||||
error_to_compat_str,
|
error_to_compat_str,
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
|
float_or_none,
|
||||||
get_element_by_id,
|
get_element_by_id,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
js_to_json,
|
js_to_json,
|
||||||
limit_length,
|
limit_length,
|
||||||
parse_count,
|
parse_count,
|
||||||
|
qualities,
|
||||||
sanitized_Request,
|
sanitized_Request,
|
||||||
try_get,
|
try_get,
|
||||||
urlencode_postdata,
|
urlencode_postdata,
|
||||||
|
urljoin,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@ -39,11 +44,13 @@ class FacebookIE(InfoExtractor):
|
|||||||
photo\.php|
|
photo\.php|
|
||||||
video\.php|
|
video\.php|
|
||||||
video/embed|
|
video/embed|
|
||||||
story\.php
|
story\.php|
|
||||||
|
watch(?:/live)?/?
|
||||||
)\?(?:.*?)(?:v|video_id|story_fbid)=|
|
)\?(?:.*?)(?:v|video_id|story_fbid)=|
|
||||||
[^/]+/videos/(?:[^/]+/)?|
|
[^/]+/videos/(?:[^/]+/)?|
|
||||||
[^/]+/posts/|
|
[^/]+/posts/|
|
||||||
groups/[^/]+/permalink/
|
groups/[^/]+/permalink/|
|
||||||
|
watchparty/
|
||||||
)|
|
)|
|
||||||
facebook:
|
facebook:
|
||||||
)
|
)
|
||||||
@ -54,8 +61,6 @@ class FacebookIE(InfoExtractor):
|
|||||||
_NETRC_MACHINE = 'facebook'
|
_NETRC_MACHINE = 'facebook'
|
||||||
IE_NAME = 'facebook'
|
IE_NAME = 'facebook'
|
||||||
|
|
||||||
_CHROME_USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36'
|
|
||||||
|
|
||||||
_VIDEO_PAGE_TEMPLATE = 'https://www.facebook.com/video/video.php?v=%s'
|
_VIDEO_PAGE_TEMPLATE = 'https://www.facebook.com/video/video.php?v=%s'
|
||||||
_VIDEO_PAGE_TAHOE_TEMPLATE = 'https://www.facebook.com/video/tahoe/async/%s/?chain=true&isvideo=true&payloadtype=primary'
|
_VIDEO_PAGE_TAHOE_TEMPLATE = 'https://www.facebook.com/video/tahoe/async/%s/?chain=true&isvideo=true&payloadtype=primary'
|
||||||
|
|
||||||
@ -72,6 +77,7 @@ class FacebookIE(InfoExtractor):
|
|||||||
},
|
},
|
||||||
'skip': 'Requires logging in',
|
'skip': 'Requires logging in',
|
||||||
}, {
|
}, {
|
||||||
|
# data.video
|
||||||
'url': 'https://www.facebook.com/video.php?v=274175099429670',
|
'url': 'https://www.facebook.com/video.php?v=274175099429670',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '274175099429670',
|
'id': '274175099429670',
|
||||||
@ -133,6 +139,7 @@ class FacebookIE(InfoExtractor):
|
|||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
# have 1080P, but only up to 720p in swf params
|
# have 1080P, but only up to 720p in swf params
|
||||||
|
# data.video.story.attachments[].media
|
||||||
'url': 'https://www.facebook.com/cnn/videos/10155529876156509/',
|
'url': 'https://www.facebook.com/cnn/videos/10155529876156509/',
|
||||||
'md5': '9571fae53d4165bbbadb17a94651dcdc',
|
'md5': '9571fae53d4165bbbadb17a94651dcdc',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -147,6 +154,7 @@ class FacebookIE(InfoExtractor):
|
|||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
# bigPipe.onPageletArrive ... onPageletArrive pagelet_group_mall
|
# bigPipe.onPageletArrive ... onPageletArrive pagelet_group_mall
|
||||||
|
# data.node.comet_sections.content.story.attachments[].style_type_renderer.attachment.media
|
||||||
'url': 'https://www.facebook.com/yaroslav.korpan/videos/1417995061575415/',
|
'url': 'https://www.facebook.com/yaroslav.korpan/videos/1417995061575415/',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '1417995061575415',
|
'id': '1417995061575415',
|
||||||
@ -174,6 +182,7 @@ class FacebookIE(InfoExtractor):
|
|||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
|
# data.node.comet_sections.content.story.attachments[].style_type_renderer.attachment.media
|
||||||
'url': 'https://www.facebook.com/groups/1024490957622648/permalink/1396382447100162/',
|
'url': 'https://www.facebook.com/groups/1024490957622648/permalink/1396382447100162/',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '1396382447100162',
|
'id': '1396382447100162',
|
||||||
@ -193,18 +202,23 @@ class FacebookIE(InfoExtractor):
|
|||||||
'url': 'https://www.facebook.com/amogood/videos/1618742068337349/?fref=nf',
|
'url': 'https://www.facebook.com/amogood/videos/1618742068337349/?fref=nf',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
|
# data.mediaset.currMedia.edges
|
||||||
'url': 'https://www.facebook.com/ChristyClarkForBC/videos/vb.22819070941/10153870694020942/?type=2&theater',
|
'url': 'https://www.facebook.com/ChristyClarkForBC/videos/vb.22819070941/10153870694020942/?type=2&theater',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
|
# data.video.story.attachments[].media
|
||||||
'url': 'facebook:544765982287235',
|
'url': 'facebook:544765982287235',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
|
# data.node.comet_sections.content.story.attachments[].style_type_renderer.attachment.media
|
||||||
'url': 'https://www.facebook.com/groups/164828000315060/permalink/764967300301124/',
|
'url': 'https://www.facebook.com/groups/164828000315060/permalink/764967300301124/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
|
# data.video.creation_story.attachments[].media
|
||||||
'url': 'https://zh-hk.facebook.com/peoplespower/videos/1135894589806027/',
|
'url': 'https://zh-hk.facebook.com/peoplespower/videos/1135894589806027/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
|
# data.video
|
||||||
'url': 'https://www.facebookcorewwwi.onion/video.php?v=274175099429670',
|
'url': 'https://www.facebookcorewwwi.onion/video.php?v=274175099429670',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
@ -212,6 +226,7 @@ class FacebookIE(InfoExtractor):
|
|||||||
'url': 'https://www.facebook.com/onlycleverentertainment/videos/1947995502095005/',
|
'url': 'https://www.facebook.com/onlycleverentertainment/videos/1947995502095005/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
|
# data.video
|
||||||
'url': 'https://www.facebook.com/WatchESLOne/videos/359649331226507/',
|
'url': 'https://www.facebook.com/WatchESLOne/videos/359649331226507/',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '359649331226507',
|
'id': '359649331226507',
|
||||||
@ -222,7 +237,64 @@ class FacebookIE(InfoExtractor):
|
|||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
|
}, {
|
||||||
|
# data.node.comet_sections.content.story.attachments[].style_type_renderer.attachment.all_subattachments.nodes[].media
|
||||||
|
'url': 'https://www.facebook.com/100033620354545/videos/106560053808006/',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '106560053808006',
|
||||||
|
},
|
||||||
|
'playlist_count': 2,
|
||||||
|
}, {
|
||||||
|
# data.video.story.attachments[].media
|
||||||
|
'url': 'https://www.facebook.com/watch/?v=647537299265662',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# data.node.comet_sections.content.story.attachments[].style_type_renderer.attachment.all_subattachments.nodes[].media
|
||||||
|
'url': 'https://www.facebook.com/PankajShahLondon/posts/10157667649866271',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '10157667649866271',
|
||||||
|
},
|
||||||
|
'playlist_count': 3,
|
||||||
|
}, {
|
||||||
|
# data.nodes[].comet_sections.content.story.attachments[].style_type_renderer.attachment.media
|
||||||
|
'url': 'https://m.facebook.com/Alliance.Police.Department/posts/4048563708499330',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '117576630041613',
|
||||||
|
'ext': 'mp4',
|
||||||
|
# TODO: title can be extracted from video page
|
||||||
|
'title': 'Facebook video #117576630041613',
|
||||||
|
'uploader_id': '189393014416438',
|
||||||
|
'upload_date': '20201123',
|
||||||
|
'timestamp': 1606162592,
|
||||||
|
},
|
||||||
|
'skip': 'Requires logging in',
|
||||||
|
}, {
|
||||||
|
# node.comet_sections.content.story.attached_story.attachments.style_type_renderer.attachment.media
|
||||||
|
'url': 'https://www.facebook.com/groups/ateistiskselskab/permalink/10154930137678856/',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '211567722618337',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Facebook video #211567722618337',
|
||||||
|
'uploader_id': '127875227654254',
|
||||||
|
'upload_date': '20161122',
|
||||||
|
'timestamp': 1479793574,
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
# data.video.creation_story.attachments[].media
|
||||||
|
'url': 'https://www.facebook.com/watch/live/?v=1823658634322275',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.facebook.com/watchparty/211641140192478',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '211641140192478',
|
||||||
|
},
|
||||||
|
'playlist_count': 1,
|
||||||
|
'skip': 'Requires logging in',
|
||||||
}]
|
}]
|
||||||
|
_SUPPORTED_PAGLETS_REGEX = r'(?:pagelet_group_mall|permalink_video_pagelet|hyperfeed_story_id_[0-9a-f]+)'
|
||||||
|
_api_config = {
|
||||||
|
'graphURI': '/api/graphql/'
|
||||||
|
}
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def _extract_urls(webpage):
|
def _extract_urls(webpage):
|
||||||
@ -305,23 +377,24 @@ class FacebookIE(InfoExtractor):
|
|||||||
def _real_initialize(self):
|
def _real_initialize(self):
|
||||||
self._login()
|
self._login()
|
||||||
|
|
||||||
def _extract_from_url(self, url, video_id, fatal_if_no_video=True):
|
def _extract_from_url(self, url, video_id):
|
||||||
req = sanitized_Request(url)
|
webpage = self._download_webpage(
|
||||||
req.add_header('User-Agent', self._CHROME_USER_AGENT)
|
url.replace('://m.facebook.com/', '://www.facebook.com/'), video_id)
|
||||||
webpage = self._download_webpage(req, video_id)
|
|
||||||
|
|
||||||
video_data = None
|
video_data = None
|
||||||
|
|
||||||
def extract_video_data(instances):
|
def extract_video_data(instances):
|
||||||
|
video_data = []
|
||||||
for item in instances:
|
for item in instances:
|
||||||
if item[1][0] == 'VideoConfig':
|
if try_get(item, lambda x: x[1][0]) == 'VideoConfig':
|
||||||
video_item = item[2][0]
|
video_item = item[2][0]
|
||||||
if video_item.get('video_id'):
|
if video_item.get('video_id'):
|
||||||
return video_item['videoData']
|
video_data.append(video_item['videoData'])
|
||||||
|
return video_data
|
||||||
|
|
||||||
server_js_data = self._parse_json(self._search_regex(
|
server_js_data = self._parse_json(self._search_regex(
|
||||||
r'handleServerJS\(({.+})(?:\);|,")', webpage,
|
[r'handleServerJS\(({.+})(?:\);|,")', r'\bs\.handle\(({.+?})\);'],
|
||||||
'server js data', default='{}'), video_id, fatal=False)
|
webpage, 'server js data', default='{}'), video_id, fatal=False)
|
||||||
|
|
||||||
if server_js_data:
|
if server_js_data:
|
||||||
video_data = extract_video_data(server_js_data.get('instances', []))
|
video_data = extract_video_data(server_js_data.get('instances', []))
|
||||||
@ -331,17 +404,118 @@ class FacebookIE(InfoExtractor):
|
|||||||
return extract_video_data(try_get(
|
return extract_video_data(try_get(
|
||||||
js_data, lambda x: x['jsmods']['instances'], list) or [])
|
js_data, lambda x: x['jsmods']['instances'], list) or [])
|
||||||
|
|
||||||
|
def extract_dash_manifest(video, formats):
|
||||||
|
dash_manifest = video.get('dash_manifest')
|
||||||
|
if dash_manifest:
|
||||||
|
formats.extend(self._parse_mpd_formats(
|
||||||
|
compat_etree_fromstring(compat_urllib_parse_unquote_plus(dash_manifest))))
|
||||||
|
|
||||||
|
def process_formats(formats):
|
||||||
|
# Downloads with browser's User-Agent are rate limited. Working around
|
||||||
|
# with non-browser User-Agent.
|
||||||
|
for f in formats:
|
||||||
|
f.setdefault('http_headers', {})['User-Agent'] = 'facebookexternalhit/1.1'
|
||||||
|
|
||||||
|
self._sort_formats(formats)
|
||||||
|
|
||||||
|
def extract_relay_data(_filter):
|
||||||
|
return self._parse_json(self._search_regex(
|
||||||
|
r'handleWithCustomApplyEach\([^,]+,\s*({.*?%s.*?})\);' % _filter,
|
||||||
|
webpage, 'replay data', default='{}'), video_id, fatal=False) or {}
|
||||||
|
|
||||||
|
def extract_relay_prefetched_data(_filter):
|
||||||
|
replay_data = extract_relay_data(_filter)
|
||||||
|
for require in (replay_data.get('require') or []):
|
||||||
|
if require[0] == 'RelayPrefetchedStreamCache':
|
||||||
|
return try_get(require, lambda x: x[3][1]['__bbox']['result']['data'], dict) or {}
|
||||||
|
|
||||||
if not video_data:
|
if not video_data:
|
||||||
server_js_data = self._parse_json(
|
server_js_data = self._parse_json(self._search_regex([
|
||||||
self._search_regex(
|
r'bigPipe\.onPageletArrive\(({.+?})\)\s*;\s*}\s*\)\s*,\s*["\']onPageletArrive\s+' + self._SUPPORTED_PAGLETS_REGEX,
|
||||||
r'bigPipe\.onPageletArrive\(({.+?})\)\s*;\s*}\s*\)\s*,\s*["\']onPageletArrive\s+(?:pagelet_group_mall|permalink_video_pagelet|hyperfeed_story_id_\d+)',
|
r'bigPipe\.onPageletArrive\(({.*?id\s*:\s*"%s".*?})\);' % self._SUPPORTED_PAGLETS_REGEX
|
||||||
webpage, 'js data', default='{}'),
|
], webpage, 'js data', default='{}'), video_id, js_to_json, False)
|
||||||
video_id, transform_source=js_to_json, fatal=False)
|
|
||||||
video_data = extract_from_jsmods_instances(server_js_data)
|
video_data = extract_from_jsmods_instances(server_js_data)
|
||||||
|
|
||||||
if not video_data:
|
if not video_data:
|
||||||
if not fatal_if_no_video:
|
data = extract_relay_prefetched_data(
|
||||||
return webpage, False
|
r'"(?:dash_manifest|playable_url(?:_quality_hd)?)"\s*:\s*"[^"]+"')
|
||||||
|
if data:
|
||||||
|
entries = []
|
||||||
|
|
||||||
|
def parse_graphql_video(video):
|
||||||
|
formats = []
|
||||||
|
q = qualities(['sd', 'hd'])
|
||||||
|
for (suffix, format_id) in [('', 'sd'), ('_quality_hd', 'hd')]:
|
||||||
|
playable_url = video.get('playable_url' + suffix)
|
||||||
|
if not playable_url:
|
||||||
|
continue
|
||||||
|
formats.append({
|
||||||
|
'format_id': format_id,
|
||||||
|
'quality': q(format_id),
|
||||||
|
'url': playable_url,
|
||||||
|
})
|
||||||
|
extract_dash_manifest(video, formats)
|
||||||
|
process_formats(formats)
|
||||||
|
v_id = video.get('videoId') or video.get('id') or video_id
|
||||||
|
info = {
|
||||||
|
'id': v_id,
|
||||||
|
'formats': formats,
|
||||||
|
'thumbnail': try_get(video, lambda x: x['thumbnailImage']['uri']),
|
||||||
|
'uploader_id': try_get(video, lambda x: x['owner']['id']),
|
||||||
|
'timestamp': int_or_none(video.get('publish_time')),
|
||||||
|
'duration': float_or_none(video.get('playable_duration_in_ms'), 1000),
|
||||||
|
}
|
||||||
|
description = try_get(video, lambda x: x['savable_description']['text'])
|
||||||
|
title = video.get('name')
|
||||||
|
if title:
|
||||||
|
info.update({
|
||||||
|
'title': title,
|
||||||
|
'description': description,
|
||||||
|
})
|
||||||
|
else:
|
||||||
|
info['title'] = description or 'Facebook video #%s' % v_id
|
||||||
|
entries.append(info)
|
||||||
|
|
||||||
|
def parse_attachment(attachment, key='media'):
|
||||||
|
media = attachment.get(key) or {}
|
||||||
|
if media.get('__typename') == 'Video':
|
||||||
|
return parse_graphql_video(media)
|
||||||
|
|
||||||
|
nodes = data.get('nodes') or []
|
||||||
|
node = data.get('node') or {}
|
||||||
|
if not nodes and node:
|
||||||
|
nodes.append(node)
|
||||||
|
for node in nodes:
|
||||||
|
story = try_get(node, lambda x: x['comet_sections']['content']['story'], dict) or {}
|
||||||
|
attachments = try_get(story, [
|
||||||
|
lambda x: x['attached_story']['attachments'],
|
||||||
|
lambda x: x['attachments']
|
||||||
|
], list) or []
|
||||||
|
for attachment in attachments:
|
||||||
|
attachment = try_get(attachment, lambda x: x['style_type_renderer']['attachment'], dict)
|
||||||
|
ns = try_get(attachment, lambda x: x['all_subattachments']['nodes'], list) or []
|
||||||
|
for n in ns:
|
||||||
|
parse_attachment(n)
|
||||||
|
parse_attachment(attachment)
|
||||||
|
|
||||||
|
edges = try_get(data, lambda x: x['mediaset']['currMedia']['edges'], list) or []
|
||||||
|
for edge in edges:
|
||||||
|
parse_attachment(edge, key='node')
|
||||||
|
|
||||||
|
video = data.get('video') or {}
|
||||||
|
if video:
|
||||||
|
attachments = try_get(video, [
|
||||||
|
lambda x: x['story']['attachments'],
|
||||||
|
lambda x: x['creation_story']['attachments']
|
||||||
|
], list) or []
|
||||||
|
for attachment in attachments:
|
||||||
|
parse_attachment(attachment)
|
||||||
|
if not entries:
|
||||||
|
parse_graphql_video(video)
|
||||||
|
|
||||||
|
return self.playlist_result(entries, video_id)
|
||||||
|
|
||||||
|
if not video_data:
|
||||||
m_msg = re.search(r'class="[^"]*uiInterstitialContent[^"]*"><div>(.*?)</div>', webpage)
|
m_msg = re.search(r'class="[^"]*uiInterstitialContent[^"]*"><div>(.*?)</div>', webpage)
|
||||||
if m_msg is not None:
|
if m_msg is not None:
|
||||||
raise ExtractorError(
|
raise ExtractorError(
|
||||||
@ -350,6 +524,43 @@ class FacebookIE(InfoExtractor):
|
|||||||
elif '>You must log in to continue' in webpage:
|
elif '>You must log in to continue' in webpage:
|
||||||
self.raise_login_required()
|
self.raise_login_required()
|
||||||
|
|
||||||
|
if not video_data and '/watchparty/' in url:
|
||||||
|
post_data = {
|
||||||
|
'doc_id': 3731964053542869,
|
||||||
|
'variables': json.dumps({
|
||||||
|
'livingRoomID': video_id,
|
||||||
|
}),
|
||||||
|
}
|
||||||
|
|
||||||
|
prefetched_data = extract_relay_prefetched_data(r'"login_data"\s*:\s*{')
|
||||||
|
if prefetched_data:
|
||||||
|
lsd = try_get(prefetched_data, lambda x: x['login_data']['lsd'], dict)
|
||||||
|
if lsd:
|
||||||
|
post_data[lsd['name']] = lsd['value']
|
||||||
|
|
||||||
|
relay_data = extract_relay_data(r'\[\s*"RelayAPIConfigDefaults"\s*,')
|
||||||
|
for define in (relay_data.get('define') or []):
|
||||||
|
if define[0] == 'RelayAPIConfigDefaults':
|
||||||
|
self._api_config = define[2]
|
||||||
|
|
||||||
|
living_room = self._download_json(
|
||||||
|
urljoin(url, self._api_config['graphURI']), video_id,
|
||||||
|
data=urlencode_postdata(post_data))['data']['living_room']
|
||||||
|
|
||||||
|
entries = []
|
||||||
|
for edge in (try_get(living_room, lambda x: x['recap']['watched_content']['edges']) or []):
|
||||||
|
video = try_get(edge, lambda x: x['node']['video']) or {}
|
||||||
|
v_id = video.get('id')
|
||||||
|
if not v_id:
|
||||||
|
continue
|
||||||
|
v_id = compat_str(v_id)
|
||||||
|
entries.append(self.url_result(
|
||||||
|
self._VIDEO_PAGE_TEMPLATE % v_id,
|
||||||
|
self.ie_key(), v_id, video.get('name')))
|
||||||
|
|
||||||
|
return self.playlist_result(entries, video_id)
|
||||||
|
|
||||||
|
if not video_data:
|
||||||
# Video info not in first request, do a secondary request using
|
# Video info not in first request, do a secondary request using
|
||||||
# tahoe player specific URL
|
# tahoe player specific URL
|
||||||
tahoe_data = self._download_webpage(
|
tahoe_data = self._download_webpage(
|
||||||
@ -379,8 +590,19 @@ class FacebookIE(InfoExtractor):
|
|||||||
if not video_data:
|
if not video_data:
|
||||||
raise ExtractorError('Cannot parse data')
|
raise ExtractorError('Cannot parse data')
|
||||||
|
|
||||||
subtitles = {}
|
if len(video_data) > 1:
|
||||||
|
entries = []
|
||||||
|
for v in video_data:
|
||||||
|
video_url = v[0].get('video_url')
|
||||||
|
if not video_url:
|
||||||
|
continue
|
||||||
|
entries.append(self.url_result(urljoin(
|
||||||
|
url, video_url), self.ie_key(), v[0].get('video_id')))
|
||||||
|
return self.playlist_result(entries, video_id)
|
||||||
|
video_data = video_data[0]
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
|
subtitles = {}
|
||||||
for f in video_data:
|
for f in video_data:
|
||||||
format_id = f['stream_type']
|
format_id = f['stream_type']
|
||||||
if f and isinstance(f, dict):
|
if f and isinstance(f, dict):
|
||||||
@ -399,22 +621,14 @@ class FacebookIE(InfoExtractor):
|
|||||||
'url': src,
|
'url': src,
|
||||||
'preference': preference,
|
'preference': preference,
|
||||||
})
|
})
|
||||||
dash_manifest = f[0].get('dash_manifest')
|
extract_dash_manifest(f[0], formats)
|
||||||
if dash_manifest:
|
|
||||||
formats.extend(self._parse_mpd_formats(
|
|
||||||
compat_etree_fromstring(compat_urllib_parse_unquote_plus(dash_manifest))))
|
|
||||||
subtitles_src = f[0].get('subtitles_src')
|
subtitles_src = f[0].get('subtitles_src')
|
||||||
if subtitles_src:
|
if subtitles_src:
|
||||||
subtitles.setdefault('en', []).append({'url': subtitles_src})
|
subtitles.setdefault('en', []).append({'url': subtitles_src})
|
||||||
if not formats:
|
if not formats:
|
||||||
raise ExtractorError('Cannot find video formats')
|
raise ExtractorError('Cannot find video formats')
|
||||||
|
|
||||||
# Downloads with browser's User-Agent are rate limited. Working around
|
process_formats(formats)
|
||||||
# with non-browser User-Agent.
|
|
||||||
for f in formats:
|
|
||||||
f.setdefault('http_headers', {})['User-Agent'] = 'facebookexternalhit/1.1'
|
|
||||||
|
|
||||||
self._sort_formats(formats)
|
|
||||||
|
|
||||||
video_title = self._html_search_regex(
|
video_title = self._html_search_regex(
|
||||||
r'<h2\s+[^>]*class="uiHeaderTitle"[^>]*>([^<]*)</h2>', webpage,
|
r'<h2\s+[^>]*class="uiHeaderTitle"[^>]*>([^<]*)</h2>', webpage,
|
||||||
@ -454,35 +668,13 @@ class FacebookIE(InfoExtractor):
|
|||||||
'subtitles': subtitles,
|
'subtitles': subtitles,
|
||||||
}
|
}
|
||||||
|
|
||||||
return webpage, info_dict
|
return info_dict
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
|
|
||||||
real_url = self._VIDEO_PAGE_TEMPLATE % video_id if url.startswith('facebook:') else url
|
real_url = self._VIDEO_PAGE_TEMPLATE % video_id if url.startswith('facebook:') else url
|
||||||
webpage, info_dict = self._extract_from_url(real_url, video_id, fatal_if_no_video=False)
|
return self._extract_from_url(real_url, video_id)
|
||||||
|
|
||||||
if info_dict:
|
|
||||||
return info_dict
|
|
||||||
|
|
||||||
if '/posts/' in url:
|
|
||||||
video_id_json = self._search_regex(
|
|
||||||
r'(["\'])video_ids\1\s*:\s*(?P<ids>\[.+?\])', webpage, 'video ids', group='ids',
|
|
||||||
default='')
|
|
||||||
if video_id_json:
|
|
||||||
entries = [
|
|
||||||
self.url_result('facebook:%s' % vid, FacebookIE.ie_key())
|
|
||||||
for vid in self._parse_json(video_id_json, video_id)]
|
|
||||||
return self.playlist_result(entries, video_id)
|
|
||||||
|
|
||||||
# Single Video?
|
|
||||||
video_id = self._search_regex(r'video_id:\s*"([0-9]+)"', webpage, 'single video id')
|
|
||||||
return self.url_result('facebook:%s' % video_id, FacebookIE.ie_key())
|
|
||||||
else:
|
|
||||||
_, info_dict = self._extract_from_url(
|
|
||||||
self._VIDEO_PAGE_TEMPLATE % video_id,
|
|
||||||
video_id, fatal_if_no_video=True)
|
|
||||||
return info_dict
|
|
||||||
|
|
||||||
|
|
||||||
class FacebookPluginsVideoIE(InfoExtractor):
|
class FacebookPluginsVideoIE(InfoExtractor):
|
||||||
|
@ -11,7 +11,7 @@ from ..utils import (
|
|||||||
|
|
||||||
class FranceCultureIE(InfoExtractor):
|
class FranceCultureIE(InfoExtractor):
|
||||||
_VALID_URL = r'https?://(?:www\.)?franceculture\.fr/emissions/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
_VALID_URL = r'https?://(?:www\.)?franceculture\.fr/emissions/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
||||||
_TEST = {
|
_TESTS = [{
|
||||||
'url': 'http://www.franceculture.fr/emissions/carnet-nomade/rendez-vous-au-pays-des-geeks',
|
'url': 'http://www.franceculture.fr/emissions/carnet-nomade/rendez-vous-au-pays-des-geeks',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'rendez-vous-au-pays-des-geeks',
|
'id': 'rendez-vous-au-pays-des-geeks',
|
||||||
@ -20,10 +20,14 @@ class FranceCultureIE(InfoExtractor):
|
|||||||
'title': 'Rendez-vous au pays des geeks',
|
'title': 'Rendez-vous au pays des geeks',
|
||||||
'thumbnail': r're:^https?://.*\.jpg$',
|
'thumbnail': r're:^https?://.*\.jpg$',
|
||||||
'upload_date': '20140301',
|
'upload_date': '20140301',
|
||||||
'timestamp': 1393642916,
|
'timestamp': 1393700400,
|
||||||
'vcodec': 'none',
|
'vcodec': 'none',
|
||||||
}
|
}
|
||||||
}
|
}, {
|
||||||
|
# no thumbnail
|
||||||
|
'url': 'https://www.franceculture.fr/emissions/la-recherche-montre-en-main/la-recherche-montre-en-main-du-mercredi-10-octobre-2018',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url)
|
display_id = self._match_id(url)
|
||||||
@ -36,19 +40,19 @@ class FranceCultureIE(InfoExtractor):
|
|||||||
</h1>|
|
</h1>|
|
||||||
<div[^>]+class="[^"]*?(?:title-zone-diffusion|heading-zone-(?:wrapper|player-button))[^"]*?"[^>]*>
|
<div[^>]+class="[^"]*?(?:title-zone-diffusion|heading-zone-(?:wrapper|player-button))[^"]*?"[^>]*>
|
||||||
).*?
|
).*?
|
||||||
(<button[^>]+data-asset-source="[^"]+"[^>]+>)
|
(<button[^>]+data-(?:url|asset-source)="[^"]+"[^>]+>)
|
||||||
''',
|
''',
|
||||||
webpage, 'video data'))
|
webpage, 'video data'))
|
||||||
|
|
||||||
video_url = video_data['data-asset-source']
|
video_url = video_data.get('data-url') or video_data['data-asset-source']
|
||||||
title = video_data.get('data-asset-title') or self._og_search_title(webpage)
|
title = video_data.get('data-asset-title') or video_data.get('data-diffusion-title') or self._og_search_title(webpage)
|
||||||
|
|
||||||
description = self._html_search_regex(
|
description = self._html_search_regex(
|
||||||
r'(?s)<div[^>]+class="intro"[^>]*>.*?<h2>(.+?)</h2>',
|
r'(?s)<div[^>]+class="intro"[^>]*>.*?<h2>(.+?)</h2>',
|
||||||
webpage, 'description', default=None)
|
webpage, 'description', default=None)
|
||||||
thumbnail = self._search_regex(
|
thumbnail = self._search_regex(
|
||||||
r'(?s)<figure[^>]+itemtype="https://schema.org/ImageObject"[^>]*>.*?<img[^>]+(?:data-dejavu-)?src="([^"]+)"',
|
r'(?s)<figure[^>]+itemtype="https://schema.org/ImageObject"[^>]*>.*?<img[^>]+(?:data-dejavu-)?src="([^"]+)"',
|
||||||
webpage, 'thumbnail', fatal=False)
|
webpage, 'thumbnail', default=None)
|
||||||
uploader = self._html_search_regex(
|
uploader = self._html_search_regex(
|
||||||
r'(?s)<span class="author">(.*?)</span>',
|
r'(?s)<span class="author">(.*?)</span>',
|
||||||
webpage, 'uploader', default=None)
|
webpage, 'uploader', default=None)
|
||||||
@ -64,6 +68,6 @@ class FranceCultureIE(InfoExtractor):
|
|||||||
'ext': ext,
|
'ext': ext,
|
||||||
'vcodec': 'none' if ext == 'mp3' else None,
|
'vcodec': 'none' if ext == 'mp3' else None,
|
||||||
'uploader': uploader,
|
'uploader': uploader,
|
||||||
'timestamp': int_or_none(video_data.get('data-asset-created-date')),
|
'timestamp': int_or_none(video_data.get('data-start-time')) or int_or_none(video_data.get('data-asset-created-date')),
|
||||||
'duration': int_or_none(video_data.get('data-duration')),
|
'duration': int_or_none(video_data.get('data-duration')),
|
||||||
}
|
}
|
||||||
|
35
youtube_dl/extractor/fujitv.py
Normal file
@ -0,0 +1,35 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor


class FujiTVFODPlus7IE(InfoExtractor):
    _VALID_URL = r'https?://i\.fod\.fujitv\.co\.jp/plus7/web/[0-9a-z]{4}/(?P<id>[0-9a-z]+)'
    _BASE_URL = 'http://i.fod.fujitv.co.jp/'
    _BITRATE_MAP = {
        300: (320, 180),
        800: (640, 360),
        1200: (1280, 720),
        2000: (1280, 720),
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)
        formats = self._extract_m3u8_formats(
            self._BASE_URL + 'abr/pc_html5/%s.m3u8' % video_id, video_id)
        for f in formats:
            wh = self._BITRATE_MAP.get(f.get('tbr'))
            if wh:
                f.update({
                    'width': wh[0],
                    'height': wh[1],
                })
        self._sort_formats(formats)

        return {
            'id': video_id,
            'title': video_id,
            'formats': formats,
            'thumbnail': self._BASE_URL + 'pc/image/wbtn/wbtn_%s.jpg' % video_id,
        }
@ -1,77 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals

from .adobepass import AdobePassIE
from ..utils import (
    extract_attributes,
    int_or_none,
    parse_age_limit,
    smuggle_url,
    update_url_query,
)


class FXNetworksIE(AdobePassIE):
    _VALID_URL = r'https?://(?:www\.)?(?:fxnetworks|simpsonsworld)\.com/video/(?P<id>\d+)'
    _TESTS = [{
        'url': 'http://www.fxnetworks.com/video/1032565827847',
        'md5': '8d99b97b4aa7a202f55b6ed47ea7e703',
        'info_dict': {
            'id': 'dRzwHC_MMqIv',
            'ext': 'mp4',
            'title': 'First Look: Better Things - Season 2',
            'description': 'Because real life is like a fart. Watch this FIRST LOOK to see what inspired the new season of Better Things.',
            'age_limit': 14,
            'uploader': 'NEWA-FNG-FX',
            'upload_date': '20170825',
            'timestamp': 1503686274,
            'episode_number': 0,
            'season_number': 2,
            'series': 'Better Things',
        },
        'add_ie': ['ThePlatform'],
    }, {
        'url': 'http://www.simpsonsworld.com/video/716094019682',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        if 'The content you are trying to access is not available in your region.' in webpage:
            self.raise_geo_restricted()
        video_data = extract_attributes(self._search_regex(
            r'(<a.+?rel="https?://link\.theplatform\.com/s/.+?</a>)', webpage, 'video data'))
        player_type = self._search_regex(r'playerType\s*=\s*[\'"]([^\'"]+)', webpage, 'player type', default=None)
        release_url = video_data['rel']
        title = video_data['data-title']
        rating = video_data.get('data-rating')
        query = {
            'mbr': 'true',
        }
        if player_type == 'movies':
            query.update({
                'manifest': 'm3u',
            })
        else:
            query.update({
                'switch': 'http',
            })
        if video_data.get('data-req-auth') == '1':
            resource = self._get_mvpd_resource(
                video_data['data-channel'], title,
                video_data.get('data-guid'), rating)
            query['auth'] = self._extract_mvpd_auth(url, video_id, 'fx', resource)

        return {
            '_type': 'url_transparent',
            'id': video_id,
            'title': title,
            'url': smuggle_url(update_url_query(release_url, query), {'force_smil_url': True}),
            'series': video_data.get('data-show-title'),
            'episode_number': int_or_none(video_data.get('data-episode')),
            'season_number': int_or_none(video_data.get('data-season')),
            'thumbnail': video_data.get('data-large-thumb'),
            'age_limit': parse_age_limit(rating),
            'ie_key': 'ThePlatform',
        }
@ -1,16 +1,7 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
import re
|
|
||||||
|
|
||||||
from .once import OnceIE
|
from .once import OnceIE
|
||||||
from ..compat import (
|
from ..compat import compat_urllib_parse_unquote
|
||||||
compat_urllib_parse_unquote,
|
|
||||||
)
|
|
||||||
from ..utils import (
|
|
||||||
unescapeHTML,
|
|
||||||
url_basename,
|
|
||||||
dict_get,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class GameSpotIE(OnceIE):
|
class GameSpotIE(OnceIE):
|
||||||
@ -24,17 +15,16 @@ class GameSpotIE(OnceIE):
|
|||||||
'title': 'Arma 3 - Community Guide: SITREP I',
|
'title': 'Arma 3 - Community Guide: SITREP I',
|
||||||
'description': 'Check out this video where some of the basics of Arma 3 is explained.',
|
'description': 'Check out this video where some of the basics of Arma 3 is explained.',
|
||||||
},
|
},
|
||||||
|
'skip': 'manifest URL give HTTP Error 404: Not Found',
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.gamespot.com/videos/the-witcher-3-wild-hunt-xbox-one-now-playing/2300-6424837/',
|
'url': 'http://www.gamespot.com/videos/the-witcher-3-wild-hunt-xbox-one-now-playing/2300-6424837/',
|
||||||
|
'md5': '173ea87ad762cf5d3bf6163dceb255a6',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'gs-2300-6424837',
|
'id': 'gs-2300-6424837',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Now Playing - The Witcher 3: Wild Hunt',
|
'title': 'Now Playing - The Witcher 3: Wild Hunt',
|
||||||
'description': 'Join us as we take a look at the early hours of The Witcher 3: Wild Hunt and more.',
|
'description': 'Join us as we take a look at the early hours of The Witcher 3: Wild Hunt and more.',
|
||||||
},
|
},
|
||||||
'params': {
|
|
||||||
'skip_download': True, # m3u8 downloads
|
|
||||||
},
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.gamespot.com/videos/embed/6439218/',
|
'url': 'https://www.gamespot.com/videos/embed/6439218/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -49,90 +39,40 @@ class GameSpotIE(OnceIE):
|
|||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
page_id = self._match_id(url)
|
page_id = self._match_id(url)
|
||||||
webpage = self._download_webpage(url, page_id)
|
webpage = self._download_webpage(url, page_id)
|
||||||
data_video_json = self._search_regex(
|
data_video = self._parse_json(self._html_search_regex(
|
||||||
r'data-video=["\'](.*?)["\']', webpage, 'data video')
|
r'data-video=(["\'])({.*?})\1', webpage,
|
||||||
data_video = self._parse_json(unescapeHTML(data_video_json), page_id)
|
'video data', group=2), page_id)
|
||||||
|
title = compat_urllib_parse_unquote(data_video['title'])
|
||||||
streams = data_video['videoStreams']
|
streams = data_video['videoStreams']
|
||||||
|
|
||||||
manifest_url = None
|
|
||||||
formats = []
|
formats = []
|
||||||
f4m_url = streams.get('f4m_stream')
|
|
||||||
if f4m_url:
|
m3u8_url = streams.get('adaptive_stream')
|
||||||
manifest_url = f4m_url
|
|
||||||
formats.extend(self._extract_f4m_formats(
|
|
||||||
f4m_url + '?hdcore=3.7.0', page_id, f4m_id='hds', fatal=False))
|
|
||||||
m3u8_url = dict_get(streams, ('m3u8_stream', 'adaptive_stream'))
|
|
||||||
if m3u8_url:
|
if m3u8_url:
|
||||||
manifest_url = m3u8_url
|
|
||||||
m3u8_formats = self._extract_m3u8_formats(
|
m3u8_formats = self._extract_m3u8_formats(
|
||||||
m3u8_url, page_id, 'mp4', 'm3u8_native',
|
m3u8_url, page_id, 'mp4', 'm3u8_native',
|
||||||
m3u8_id='hls', fatal=False)
|
m3u8_id='hls', fatal=False)
|
||||||
formats.extend(m3u8_formats)
|
for f in m3u8_formats:
|
||||||
progressive_url = dict_get(
|
formats.append(f)
|
||||||
streams, ('progressive_hd', 'progressive_high', 'progressive_low', 'other_lr'))
|
http_f = f.copy()
|
||||||
if progressive_url and manifest_url:
|
del http_f['manifest_url']
|
||||||
qualities_basename = self._search_regex(
|
http_f.update({
|
||||||
r'/([^/]+)\.csmil/',
|
'format_id': f['format_id'].replace('hls-', 'http-'),
|
||||||
manifest_url, 'qualities basename', default=None)
|
'protocol': 'http',
|
||||||
if qualities_basename:
|
'url': f['url'].replace('.m3u8', '.mp4'),
|
||||||
QUALITIES_RE = r'((,\d+)+,?)'
|
})
|
||||||
qualities = self._search_regex(
|
formats.append(http_f)
|
||||||
QUALITIES_RE, qualities_basename,
|
|
||||||
'qualities', default=None)
|
|
||||||
if qualities:
|
|
||||||
qualities = list(map(lambda q: int(q), qualities.strip(',').split(',')))
|
|
||||||
qualities.sort()
|
|
||||||
http_template = re.sub(QUALITIES_RE, r'%d', qualities_basename)
|
|
||||||
http_url_basename = url_basename(progressive_url)
|
|
||||||
if m3u8_formats:
|
|
||||||
self._sort_formats(m3u8_formats)
|
|
||||||
m3u8_formats = list(filter(
|
|
||||||
lambda f: f.get('vcodec') != 'none', m3u8_formats))
|
|
||||||
if len(qualities) == len(m3u8_formats):
|
|
||||||
for q, m3u8_format in zip(qualities, m3u8_formats):
|
|
||||||
f = m3u8_format.copy()
|
|
||||||
f.update({
|
|
||||||
'url': progressive_url.replace(
|
|
||||||
http_url_basename, http_template % q),
|
|
||||||
'format_id': f['format_id'].replace('hls', 'http'),
|
|
||||||
'protocol': 'http',
|
|
||||||
})
|
|
||||||
formats.append(f)
|
|
||||||
else:
|
|
||||||
for q in qualities:
|
|
||||||
formats.append({
|
|
||||||
'url': progressive_url.replace(
|
|
||||||
http_url_basename, http_template % q),
|
|
||||||
'ext': 'mp4',
|
|
||||||
'format_id': 'http-%d' % q,
|
|
||||||
'tbr': q,
|
|
||||||
})
|
|
||||||
|
|
||||||
onceux_json = self._search_regex(
|
mpd_url = streams.get('adaptive_dash')
|
||||||
r'data-onceux-options=["\'](.*?)["\']', webpage, 'data video', default=None)
|
if mpd_url:
|
||||||
if onceux_json:
|
formats.extend(self._extract_mpd_formats(
|
||||||
onceux_url = self._parse_json(unescapeHTML(onceux_json), page_id).get('metadataUri')
|
mpd_url, page_id, mpd_id='dash', fatal=False))
|
||||||
if onceux_url:
|
|
||||||
formats.extend(self._extract_once_formats(re.sub(
|
|
||||||
r'https?://[^/]+', 'http://once.unicornmedia.com', onceux_url),
|
|
||||||
http_formats_preference=-1))
|
|
||||||
|
|
||||||
if not formats:
|
|
||||||
for quality in ['sd', 'hd']:
|
|
||||||
# It's actually a link to a flv file
|
|
||||||
flv_url = streams.get('f4m_{0}'.format(quality))
|
|
||||||
if flv_url is not None:
|
|
||||||
formats.append({
|
|
||||||
'url': flv_url,
|
|
||||||
'ext': 'flv',
|
|
||||||
'format_id': quality,
|
|
||||||
})
|
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': data_video['guid'],
|
'id': data_video.get('guid') or page_id,
|
||||||
'display_id': page_id,
|
'display_id': page_id,
|
||||||
'title': compat_urllib_parse_unquote(data_video['title']),
|
'title': title,
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
'description': self._html_search_meta('description', webpage),
|
'description': self._html_search_meta('description', webpage),
|
||||||
'thumbnail': self._og_search_thumbnail(webpage),
|
'thumbnail': self._og_search_thumbnail(webpage),
|
||||||
|
@ -20,19 +20,24 @@ from ..utils import (
|
|||||||
ExtractorError,
|
ExtractorError,
|
||||||
float_or_none,
|
float_or_none,
|
||||||
HEADRequest,
|
HEADRequest,
|
||||||
|
int_or_none,
|
||||||
is_html,
|
is_html,
|
||||||
js_to_json,
|
js_to_json,
|
||||||
KNOWN_EXTENSIONS,
|
KNOWN_EXTENSIONS,
|
||||||
merge_dicts,
|
merge_dicts,
|
||||||
mimetype2ext,
|
mimetype2ext,
|
||||||
orderedSet,
|
orderedSet,
|
||||||
|
parse_duration,
|
||||||
sanitized_Request,
|
sanitized_Request,
|
||||||
smuggle_url,
|
smuggle_url,
|
||||||
unescapeHTML,
|
unescapeHTML,
|
||||||
unified_strdate,
|
unified_timestamp,
|
||||||
unsmuggle_url,
|
unsmuggle_url,
|
||||||
UnsupportedError,
|
UnsupportedError,
|
||||||
|
url_or_none,
|
||||||
|
xpath_attr,
|
||||||
xpath_text,
|
xpath_text,
|
||||||
|
xpath_with_ns,
|
||||||
)
|
)
|
||||||
from .commonprotocols import RtmpIE
|
from .commonprotocols import RtmpIE
|
||||||
from .brightcove import (
|
from .brightcove import (
|
||||||
@ -48,7 +53,6 @@ from .ooyala import OoyalaIE
|
|||||||
from .rutv import RUTVIE
|
from .rutv import RUTVIE
|
||||||
from .tvc import TVCIE
|
from .tvc import TVCIE
|
||||||
from .sportbox import SportBoxIE
|
from .sportbox import SportBoxIE
|
||||||
from .smotri import SmotriIE
|
|
||||||
from .myvi import MyviIE
|
from .myvi import MyviIE
|
||||||
from .condenast import CondeNastIE
|
from .condenast import CondeNastIE
|
||||||
from .udn import UDNEmbedIE
|
from .udn import UDNEmbedIE
|
||||||
@ -63,7 +67,10 @@ from .tube8 import Tube8IE
|
|||||||
from .mofosex import MofosexEmbedIE
|
from .mofosex import MofosexEmbedIE
|
||||||
from .spankwire import SpankwireIE
|
from .spankwire import SpankwireIE
|
||||||
from .youporn import YouPornIE
|
from .youporn import YouPornIE
|
||||||
from .vimeo import VimeoIE
|
from .vimeo import (
|
||||||
|
VimeoIE,
|
||||||
|
VHXEmbedIE,
|
||||||
|
)
|
||||||
from .dailymotion import DailymotionIE
|
from .dailymotion import DailymotionIE
|
||||||
from .dailymail import DailyMailIE
|
from .dailymail import DailyMailIE
|
||||||
from .onionstudios import OnionStudiosIE
|
from .onionstudios import OnionStudiosIE
|
||||||
@ -120,6 +127,8 @@ from .expressen import ExpressenIE
|
|||||||
from .zype import ZypeIE
|
from .zype import ZypeIE
|
||||||
from .odnoklassniki import OdnoklassnikiIE
|
from .odnoklassniki import OdnoklassnikiIE
|
||||||
from .kinja import KinjaEmbedIE
|
from .kinja import KinjaEmbedIE
|
||||||
|
from .arcpublishing import ArcPublishingIE
|
||||||
|
from .medialaan import MedialaanIE
|
||||||
|
|
||||||
|
|
||||||
class GenericIE(InfoExtractor):
|
class GenericIE(InfoExtractor):
|
||||||
@ -198,11 +207,48 @@ class GenericIE(InfoExtractor):
|
|||||||
{
|
{
|
||||||
'url': 'http://podcastfeeds.nbcnews.com/audio/podcast/MSNBC-MADDOW-NETCAST-M4V.xml',
|
'url': 'http://podcastfeeds.nbcnews.com/audio/podcast/MSNBC-MADDOW-NETCAST-M4V.xml',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'pdv_maddow_netcast_m4v-02-27-2015-201624',
|
'id': 'http://podcastfeeds.nbcnews.com/nbcnews/video/podcast/MSNBC-MADDOW-NETCAST-M4V.xml',
|
||||||
'ext': 'm4v',
|
'title': 'MSNBC Rachel Maddow (video)',
|
||||||
'upload_date': '20150228',
|
'description': 're:.*her unique approach to storytelling.*',
|
||||||
'title': 'pdv_maddow_netcast_m4v-02-27-2015-201624',
|
},
|
||||||
}
|
'playlist': [{
|
||||||
|
'info_dict': {
|
||||||
|
'ext': 'mov',
|
||||||
|
'id': 'pdv_maddow_netcast_mov-12-04-2020-224335',
|
||||||
|
'title': 're:MSNBC Rachel Maddow',
|
||||||
|
'description': 're:.*her unique approach to storytelling.*',
|
||||||
|
'timestamp': int,
|
||||||
|
'upload_date': compat_str,
|
||||||
|
'duration': float,
|
||||||
|
},
|
||||||
|
}],
|
||||||
|
},
|
||||||
|
# RSS feed with item with description and thumbnails
|
||||||
|
{
|
||||||
|
'url': 'https://anchor.fm/s/dd00e14/podcast/rss',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'https://anchor.fm/s/dd00e14/podcast/rss',
|
||||||
|
'title': 're:.*100% Hydrogen.*',
|
||||||
|
'description': 're:.*In this episode.*',
|
||||||
|
},
|
||||||
|
'playlist': [{
|
||||||
|
'info_dict': {
|
||||||
|
'ext': 'm4a',
|
||||||
|
'id': 'c1c879525ce2cb640b344507e682c36d',
|
||||||
|
'title': 're:Hydrogen!',
|
||||||
|
'description': 're:.*In this episode we are going.*',
|
||||||
|
'timestamp': 1567977776,
|
||||||
|
'upload_date': '20190908',
|
||||||
|
'duration': 459,
|
||||||
|
'thumbnail': r're:^https?://.*\.jpg$',
|
||||||
|
'episode_number': 1,
|
||||||
|
'season_number': 1,
|
||||||
|
'age_limit': 0,
|
||||||
|
},
|
||||||
|
}],
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
},
|
},
|
||||||
# RSS feed with enclosures and unsupported link URLs
|
# RSS feed with enclosures and unsupported link URLs
|
||||||
{
|
{
|
||||||
@ -1983,22 +2029,6 @@ class GenericIE(InfoExtractor):
|
|||||||
},
|
},
|
||||||
'add_ie': [SpringboardPlatformIE.ie_key()],
|
'add_ie': [SpringboardPlatformIE.ie_key()],
|
||||||
},
|
},
|
||||||
{
|
|
||||||
'url': 'https://www.youtube.com/shared?ci=1nEzmT-M4fU',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'uPDB5I9wfp8',
|
|
||||||
'ext': 'webm',
|
|
||||||
'title': 'Pocoyo: 90 minutos de episódios completos Português para crianças - PARTE 3',
|
|
||||||
'description': 'md5:d9e4d9346a2dfff4c7dc4c8cec0f546d',
|
|
||||||
'upload_date': '20160219',
|
|
||||||
'uploader': 'Pocoyo - Português (BR)',
|
|
||||||
'uploader_id': 'PocoyoBrazil',
|
|
||||||
},
|
|
||||||
'add_ie': [YoutubeIE.ie_key()],
|
|
||||||
'params': {
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
{
|
||||||
'url': 'https://www.yapfiles.ru/show/1872528/690b05d3054d2dbe1e69523aa21bb3b1.mp4.html',
|
'url': 'https://www.yapfiles.ru/show/1872528/690b05d3054d2dbe1e69523aa21bb3b1.mp4.html',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -2103,23 +2133,23 @@ class GenericIE(InfoExtractor):
|
|||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
{
|
# {
|
||||||
# Zype embed
|
# # Zype embed
|
||||||
'url': 'https://www.cookscountry.com/episode/554-smoky-barbecue-favorites',
|
# 'url': 'https://www.cookscountry.com/episode/554-smoky-barbecue-favorites',
|
||||||
'info_dict': {
|
# 'info_dict': {
|
||||||
'id': '5b400b834b32992a310622b9',
|
# 'id': '5b400b834b32992a310622b9',
|
||||||
'ext': 'mp4',
|
# 'ext': 'mp4',
|
||||||
'title': 'Smoky Barbecue Favorites',
|
# 'title': 'Smoky Barbecue Favorites',
|
||||||
'thumbnail': r're:^https?://.*\.jpe?g',
|
# 'thumbnail': r're:^https?://.*\.jpe?g',
|
||||||
'description': 'md5:5ff01e76316bd8d46508af26dc86023b',
|
# 'description': 'md5:5ff01e76316bd8d46508af26dc86023b',
|
||||||
'upload_date': '20170909',
|
# 'upload_date': '20170909',
|
||||||
'timestamp': 1504915200,
|
# 'timestamp': 1504915200,
|
||||||
},
|
# },
|
||||||
'add_ie': [ZypeIE.ie_key()],
|
# 'add_ie': [ZypeIE.ie_key()],
|
||||||
'params': {
|
# 'params': {
|
||||||
'skip_download': True,
|
# 'skip_download': True,
|
||||||
},
|
# },
|
||||||
},
|
# },
|
||||||
{
|
{
|
||||||
# videojs embed
|
# videojs embed
|
||||||
'url': 'https://video.sibnet.ru/shell.php?videoid=3422904',
|
'url': 'https://video.sibnet.ru/shell.php?videoid=3422904',
|
||||||
@ -2168,7 +2198,46 @@ class GenericIE(InfoExtractor):
|
|||||||
# 'params': {
|
# 'params': {
|
||||||
# 'force_generic_extractor': True,
|
# 'force_generic_extractor': True,
|
||||||
# },
|
# },
|
||||||
# }
|
# },
|
||||||
|
{
|
||||||
|
# VHX Embed
|
||||||
|
'url': 'https://demo.vhx.tv/category-c/videos/file-example-mp4-480-1-5mg-copy',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '858208',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Untitled',
|
||||||
|
'uploader_id': 'user80538407',
|
||||||
|
'uploader': 'OTT Videos',
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
# ArcPublishing PoWa video player
|
||||||
|
'url': 'https://www.adn.com/politics/2020/11/02/video-senate-candidates-campaign-in-anchorage-on-eve-of-election-day/',
|
||||||
|
'md5': 'b03b2fac8680e1e5a7cc81a5c27e71b3',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '8c99cb6e-b29c-4bc9-9173-7bf9979225ab',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Senate candidates wave to voters on Anchorage streets',
|
||||||
|
'description': 'md5:91f51a6511f090617353dc720318b20e',
|
||||||
|
'timestamp': 1604378735,
|
||||||
|
'upload_date': '20201103',
|
||||||
|
'duration': 1581,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
# MyChannels SDK embed
|
||||||
|
# https://www.24kitchen.nl/populair/deskundige-dit-waarom-sommigen-gevoelig-zijn-voor-voedselallergieen
|
||||||
|
'url': 'https://www.demorgen.be/nieuws/burgemeester-rotterdam-richt-zich-in-videoboodschap-tot-relschoppers-voelt-het-goed~b0bcfd741/',
|
||||||
|
'md5': '90c0699c37006ef18e198c032d81739c',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '194165',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Burgemeester Aboutaleb spreekt relschoppers toe',
|
||||||
|
'timestamp': 1611740340,
|
||||||
|
'upload_date': '20210127',
|
||||||
|
'duration': 159,
|
||||||
|
},
|
||||||
|
},
|
||||||
]
|
]
|
||||||
|
|
||||||
def report_following_redirect(self, new_url):
|
def report_following_redirect(self, new_url):
|
||||||
@ -2180,6 +2249,10 @@ class GenericIE(InfoExtractor):
|
|||||||
playlist_desc_el = doc.find('./channel/description')
|
playlist_desc_el = doc.find('./channel/description')
|
||||||
playlist_desc = None if playlist_desc_el is None else playlist_desc_el.text
|
playlist_desc = None if playlist_desc_el is None else playlist_desc_el.text
|
||||||
|
|
||||||
|
NS_MAP = {
|
||||||
|
'itunes': 'http://www.itunes.com/dtds/podcast-1.0.dtd',
|
||||||
|
}
|
||||||
|
|
||||||
entries = []
|
entries = []
|
||||||
for it in doc.findall('./channel/item'):
|
for it in doc.findall('./channel/item'):
|
||||||
next_url = None
|
next_url = None
|
||||||
@ -2195,10 +2268,33 @@ class GenericIE(InfoExtractor):
|
|||||||
if not next_url:
|
if not next_url:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
|
def itunes(key):
|
||||||
|
return xpath_text(
|
||||||
|
it, xpath_with_ns('./itunes:%s' % key, NS_MAP),
|
||||||
|
default=None)
|
||||||
|
|
||||||
|
duration = itunes('duration')
|
||||||
|
explicit = (itunes('explicit') or '').lower()
|
||||||
|
if explicit in ('true', 'yes'):
|
||||||
|
age_limit = 18
|
||||||
|
elif explicit in ('false', 'no'):
|
||||||
|
age_limit = 0
|
||||||
|
else:
|
||||||
|
age_limit = None
|
||||||
|
|
||||||
entries.append({
|
entries.append({
|
||||||
'_type': 'url_transparent',
|
'_type': 'url_transparent',
|
||||||
'url': next_url,
|
'url': next_url,
|
||||||
'title': it.find('title').text,
|
'title': it.find('title').text,
|
||||||
|
'description': xpath_text(it, 'description', default=None),
|
||||||
|
'timestamp': unified_timestamp(
|
||||||
|
xpath_text(it, 'pubDate', default=None)),
|
||||||
|
'duration': int_or_none(duration) or parse_duration(duration),
|
||||||
|
'thumbnail': url_or_none(xpath_attr(it, xpath_with_ns('./itunes:image', NS_MAP), 'href')),
|
||||||
|
'episode': itunes('title'),
|
||||||
|
'episode_number': int_or_none(itunes('episode')),
|
||||||
|
'season_number': int_or_none(itunes('season')),
|
||||||
|
'age_limit': age_limit,
|
||||||
})
|
})
|
||||||
|
|
||||||
return {
|
return {
|
||||||
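Review note on the RSS hunk above: the new `itunes()` helper reads namespaced tags and maps `itunes:explicit` to an age limit. The following is a minimal standalone sketch of that mapping, assuming only the standard library. `explicit_to_age_limit`, `item_age_limit` and `SAMPLE_ITEM` are hypothetical names used for illustration; they are not part of youtube-dl, and `findtext` stands in for the `xpath_text`/`xpath_with_ns` helpers used in the hunk.

```python
import xml.etree.ElementTree as ET

# Namespace map copied from the hunk above.
NS_MAP = {'itunes': 'http://www.itunes.com/dtds/podcast-1.0.dtd'}


def explicit_to_age_limit(value):
    """Map an <itunes:explicit> value to an age limit, mirroring the hunk."""
    explicit = (value or '').lower()
    if explicit in ('true', 'yes'):
        return 18
    if explicit in ('false', 'no'):
        return 0
    # Unknown or missing values are left undecided.
    return None


def item_age_limit(item):
    """Read itunes:explicit from an RSS <item> element (hypothetical helper)."""
    # findtext returns None when the element is absent.
    return explicit_to_age_limit(
        item.findtext('itunes:explicit', namespaces=NS_MAP))


# Made-up sample item, not taken from any real feed.
SAMPLE_ITEM = ET.fromstring(
    '<item xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">'
    '<title>Episode 1</title>'
    '<itunes:explicit>Yes</itunes:explicit>'
    '</item>')
```

Lowercasing before the comparison is what lets `Yes`, `YES` and `yes` all resolve to the same limit, while any other value falls through to `None` instead of being guessed.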
@ -2318,7 +2414,7 @@ class GenericIE(InfoExtractor):
|
|||||||
info_dict = {
|
info_dict = {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
'title': self._generic_title(url),
|
'title': self._generic_title(url),
|
||||||
'upload_date': unified_strdate(head_response.headers.get('Last-Modified'))
|
'timestamp': unified_timestamp(head_response.headers.get('Last-Modified'))
|
||||||
}
|
}
|
||||||
|
|
||||||
# Check for direct link to a video
|
# Check for direct link to a video
|
||||||
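Review note on the hunk above: switching from `upload_date` to `timestamp` keeps the time of day from the `Last-Modified` header instead of only the date. A rough sketch of that conversion using only the standard library follows. `last_modified_to_timestamp` is a hypothetical stand-in for illustration; youtube-dl's own `unified_timestamp` accepts many more date formats and is not reproduced here.

```python
from email.utils import mktime_tz, parsedate_tz


def last_modified_to_timestamp(value):
    """Convert an HTTP Last-Modified header value to a Unix timestamp.

    Returns None when the header is missing or cannot be parsed.
    """
    if not value:
        return None
    parsed = parsedate_tz(value)
    if parsed is None:
        return None
    # mktime_tz honours the time zone offset carried in the parsed tuple.
    return mktime_tz(parsed)
```

Returning `None` for a missing or malformed header matches the optional nature of the field: the extractor can still produce a result without it.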
@ -2381,6 +2477,9 @@ class GenericIE(InfoExtractor):
|
|||||||
webpage = self._webpage_read_content(
|
webpage = self._webpage_read_content(
|
||||||
full_response, url, video_id, prefix=first_bytes)
|
full_response, url, video_id, prefix=first_bytes)
|
||||||
|
|
||||||
|
if '<title>DPG Media Privacy Gate</title>' in webpage:
|
||||||
|
webpage = self._download_webpage(url, video_id)
|
||||||
|
|
||||||
self.report_extraction(video_id)
|
self.report_extraction(video_id)
|
||||||
|
|
||||||
# Is it an RSS feed, a SMIL file, an XSPF playlist or a MPD manifest?
|
# Is it an RSS feed, a SMIL file, an XSPF playlist or a MPD manifest?
|
||||||
@ -2424,7 +2523,9 @@ class GenericIE(InfoExtractor):
|
|||||||
# Sometimes embedded video player is hidden behind percent encoding
|
# Sometimes embedded video player is hidden behind percent encoding
|
||||||
# (e.g. https://github.com/ytdl-org/youtube-dl/issues/2448)
|
# (e.g. https://github.com/ytdl-org/youtube-dl/issues/2448)
|
||||||
# Unescaping the whole page allows to handle those cases in a generic way
|
# Unescaping the whole page allows to handle those cases in a generic way
|
||||||
webpage = compat_urllib_parse_unquote(webpage)
|
# FIXME: unescaping the whole page may break URLs, commenting out for now.
|
||||||
|
# There probably should be a second run of generic extractor on unescaped webpage.
|
||||||
|
# webpage = compat_urllib_parse_unquote(webpage)
|
||||||
|
|
||||||
# Unescape squarespace embeds to be detected by generic extractor,
|
# Unescape squarespace embeds to be detected by generic extractor,
|
||||||
# see https://github.com/ytdl-org/youtube-dl/issues/21294
|
# see https://github.com/ytdl-org/youtube-dl/issues/21294
|
||||||
@ -2506,6 +2607,15 @@ class GenericIE(InfoExtractor):
|
|||||||
if tp_urls:
|
if tp_urls:
|
||||||
return self.playlist_from_matches(tp_urls, video_id, video_title, ie='ThePlatform')
|
return self.playlist_from_matches(tp_urls, video_id, video_title, ie='ThePlatform')
|
||||||
|
|
||||||
|
arc_urls = ArcPublishingIE._extract_urls(webpage)
|
||||||
|
if arc_urls:
|
||||||
|
return self.playlist_from_matches(arc_urls, video_id, video_title, ie=ArcPublishingIE.ie_key())
|
||||||
|
|
||||||
|
mychannels_urls = MedialaanIE._extract_urls(webpage)
|
||||||
|
if mychannels_urls:
|
||||||
|
return self.playlist_from_matches(
|
||||||
|
mychannels_urls, video_id, video_title, ie=MedialaanIE.ie_key())
|
||||||
|
|
||||||
# Look for embedded rtl.nl player
|
# Look for embedded rtl.nl player
|
||||||
matches = re.findall(
|
matches = re.findall(
|
||||||
r'<iframe[^>]+?src="((?:https?:)?//(?:(?:www|static)\.)?rtl\.nl/(?:system/videoplayer/[^"]+(?:video_)?)?embed[^"]+)"',
|
r'<iframe[^>]+?src="((?:https?:)?//(?:(?:www|static)\.)?rtl\.nl/(?:system/videoplayer/[^"]+(?:video_)?)?embed[^"]+)"',
|
||||||
@ -2517,6 +2627,10 @@ class GenericIE(InfoExtractor):
|
|||||||
if vimeo_urls:
|
if vimeo_urls:
|
||||||
return self.playlist_from_matches(vimeo_urls, video_id, video_title, ie=VimeoIE.ie_key())
|
return self.playlist_from_matches(vimeo_urls, video_id, video_title, ie=VimeoIE.ie_key())
|
||||||
|
|
||||||
|
vhx_url = VHXEmbedIE._extract_url(webpage)
|
||||||
|
if vhx_url:
|
||||||
|
return self.url_result(vhx_url, VHXEmbedIE.ie_key())
|
||||||
|
|
||||||
vid_me_embed_url = self._search_regex(
|
vid_me_embed_url = self._search_regex(
|
||||||
r'src=[\'"](https?://vid\.me/[^\'"]+)[\'"]',
|
r'src=[\'"](https?://vid\.me/[^\'"]+)[\'"]',
|
||||||
webpage, 'vid.me embed', default=None)
|
webpage, 'vid.me embed', default=None)
|
||||||
@ -2772,11 +2886,6 @@ class GenericIE(InfoExtractor):
|
|||||||
if mobj is not None:
|
if mobj is not None:
|
||||||
return self.url_result(mobj.group('url'))
|
return self.url_result(mobj.group('url'))
|
||||||
|
|
||||||
# Look for embedded smotri.com player
|
|
||||||
smotri_url = SmotriIE._extract_url(webpage)
|
|
||||||
if smotri_url:
|
|
||||||
return self.url_result(smotri_url, 'Smotri')
|
|
||||||
|
|
||||||
# Look for embedded Myvi.ru player
|
# Look for embedded Myvi.ru player
|
||||||
myvi_url = MyviIE._extract_url(webpage)
|
myvi_url = MyviIE._extract_url(webpage)
|
||||||
if myvi_url:
|
if myvi_url:
|
||||||
|
@ -38,13 +38,17 @@ class GoIE(AdobePassIE):
|
|||||||
'disneynow': {
|
'disneynow': {
|
||||||
'brand': '011',
|
'brand': '011',
|
||||||
'resource_id': 'Disney',
|
'resource_id': 'Disney',
|
||||||
}
|
},
|
||||||
|
'fxnow.fxnetworks': {
|
||||||
|
'brand': '025',
|
||||||
|
'requestor_id': 'dtci',
|
||||||
|
},
|
||||||
}
|
}
|
||||||
_VALID_URL = r'''(?x)
|
_VALID_URL = r'''(?x)
|
||||||
https?://
|
https?://
|
||||||
(?:
|
(?:
|
||||||
(?:(?P<sub_domain>%s)\.)?go|
|
(?:(?P<sub_domain>%s)\.)?go|
|
||||||
(?P<sub_domain_2>abc|freeform|disneynow)
|
(?P<sub_domain_2>abc|freeform|disneynow|fxnow\.fxnetworks)
|
||||||
)\.com/
|
)\.com/
|
||||||
(?:
|
(?:
|
||||||
(?:[^/]+/)*(?P<id>[Vv][Dd][Kk][Aa]\w+)|
|
(?:[^/]+/)*(?P<id>[Vv][Dd][Kk][Aa]\w+)|
|
||||||
@ -99,6 +103,19 @@ class GoIE(AdobePassIE):
|
|||||||
# m3u8 download
|
# m3u8 download
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
|
}, {
|
||||||
|
'url': 'https://fxnow.fxnetworks.com/shows/better-things/video/vdka12782841',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'VDKA12782841',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'First Look: Better Things - Season 2',
|
||||||
|
'description': 'md5:fa73584a95761c605d9d54904e35b407',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'geo_bypass_ip_block': '3.244.239.0/24',
|
||||||
|
# m3u8 download
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://abc.go.com/shows/the-catch/episode-guide/season-01/10-the-wedding',
|
'url': 'http://abc.go.com/shows/the-catch/episode-guide/season-01/10-the-wedding',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
@ -7,6 +7,7 @@ from ..compat import compat_parse_qs
|
|||||||
from ..utils import (
|
from ..utils import (
|
||||||
determine_ext,
|
determine_ext,
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
|
get_element_by_class,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
lowercase_escape,
|
lowercase_escape,
|
||||||
try_get,
|
try_get,
|
||||||
@ -237,7 +238,7 @@ class GoogleDriveIE(InfoExtractor):
|
|||||||
if confirmation_webpage:
|
if confirmation_webpage:
|
||||||
confirm = self._search_regex(
|
confirm = self._search_regex(
|
||||||
r'confirm=([^&"\']+)', confirmation_webpage,
|
r'confirm=([^&"\']+)', confirmation_webpage,
|
||||||
'confirmation code', fatal=False)
|
'confirmation code', default=None)
|
||||||
if confirm:
|
if confirm:
|
||||||
confirmed_source_url = update_url_query(source_url, {
|
confirmed_source_url = update_url_query(source_url, {
|
||||||
'confirm': confirm,
|
'confirm': confirm,
|
||||||
@ -245,6 +246,11 @@ class GoogleDriveIE(InfoExtractor):
|
|||||||
urlh = request_source_file(confirmed_source_url, 'confirmed source')
|
urlh = request_source_file(confirmed_source_url, 'confirmed source')
|
||||||
if urlh and urlh.headers.get('Content-Disposition'):
|
if urlh and urlh.headers.get('Content-Disposition'):
|
||||||
add_source_format(urlh)
|
add_source_format(urlh)
|
||||||
|
else:
|
||||||
|
self.report_warning(
|
||||||
|
get_element_by_class('uc-error-subcaption', confirmation_webpage)
|
||||||
|
or get_element_by_class('uc-error-caption', confirmation_webpage)
|
||||||
|
or 'unable to extract confirmation code')
|
||||||
|
|
||||||
if not formats and reason:
|
if not formats and reason:
|
||||||
raise ExtractorError(reason, expected=True)
|
raise ExtractorError(reason, expected=True)
|
||||||
|
@ -1,73 +0,0 @@
|
|||||||
# coding: utf-8
|
|
||||||
from __future__ import unicode_literals
|
|
||||||
|
|
||||||
import re
|
|
||||||
import codecs
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
|
||||||
from ..utils import unified_strdate
|
|
||||||
|
|
||||||
|
|
||||||
class GooglePlusIE(InfoExtractor):
|
|
||||||
IE_DESC = 'Google Plus'
|
|
||||||
_VALID_URL = r'https?://plus\.google\.com/(?:[^/]+/)*?posts/(?P<id>\w+)'
|
|
||||||
IE_NAME = 'plus.google'
|
|
||||||
_TEST = {
|
|
||||||
'url': 'https://plus.google.com/u/0/108897254135232129896/posts/ZButuJc6CtH',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'ZButuJc6CtH',
|
|
||||||
'ext': 'flv',
|
|
||||||
'title': '嘆きの天使 降臨',
|
|
||||||
'upload_date': '20120613',
|
|
||||||
'uploader': '井上ヨシマサ',
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
video_id = self._match_id(url)
|
|
||||||
|
|
||||||
# Step 1, Retrieve post webpage to extract further information
|
|
||||||
webpage = self._download_webpage(url, video_id, 'Downloading entry webpage')
|
|
||||||
|
|
||||||
title = self._og_search_description(webpage).splitlines()[0]
|
|
||||||
upload_date = unified_strdate(self._html_search_regex(
|
|
||||||
r'''(?x)<a.+?class="o-U-s\s[^"]+"\s+style="display:\s*none"\s*>
|
|
||||||
([0-9]{4}-[0-9]{2}-[0-9]{2})</a>''',
|
|
||||||
webpage, 'upload date', fatal=False, flags=re.VERBOSE))
|
|
||||||
uploader = self._html_search_regex(
|
|
||||||
r'rel="author".*?>(.*?)</a>', webpage, 'uploader', fatal=False)
|
|
||||||
|
|
||||||
# Step 2, Simulate clicking the image box to launch video
|
|
||||||
DOMAIN = 'https://plus.google.com/'
|
|
||||||
video_page = self._search_regex(
|
|
||||||
r'<a href="((?:%s)?photos/.*?)"' % re.escape(DOMAIN),
|
|
||||||
webpage, 'video page URL')
|
|
||||||
if not video_page.startswith(DOMAIN):
|
|
||||||
video_page = DOMAIN + video_page
|
|
||||||
|
|
||||||
webpage = self._download_webpage(video_page, video_id, 'Downloading video page')
|
|
||||||
|
|
||||||
def unicode_escape(s):
|
|
||||||
decoder = codecs.getdecoder('unicode_escape')
|
|
||||||
return re.sub(
|
|
||||||
r'\\u[0-9a-fA-F]{4,}',
|
|
||||||
lambda m: decoder(m.group(0))[0],
|
|
||||||
s)
|
|
||||||
|
|
||||||
# Extract video links all sizes
|
|
||||||
formats = [{
|
|
||||||
'url': unicode_escape(video_url),
|
|
||||||
'ext': 'flv',
|
|
||||||
'width': int(width),
|
|
||||||
'height': int(height),
|
|
||||||
} for width, height, video_url in re.findall(
|
|
||||||
r'\d+,(\d+),(\d+),"(https?://[^.]+\.googleusercontent\.com.*?)"', webpage)]
|
|
||||||
self._sort_formats(formats)
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': video_id,
|
|
||||||
'title': title,
|
|
||||||
'uploader': uploader,
|
|
||||||
'upload_date': upload_date,
|
|
||||||
'formats': formats,
|
|
||||||
}
|
|
88
youtube_dl/extractor/googlepodcasts.py
Normal file
@ -0,0 +1,88 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import json
|
||||||
|
import re
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
from ..utils import (
|
||||||
|
clean_podcast_url,
|
||||||
|
int_or_none,
|
||||||
|
try_get,
|
||||||
|
urlencode_postdata,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class GooglePodcastsBaseIE(InfoExtractor):
|
||||||
|
_VALID_URL_BASE = r'https?://podcasts\.google\.com/feed/'
|
||||||
|
|
||||||
|
def _batch_execute(self, func_id, video_id, params):
|
||||||
|
return json.loads(self._download_json(
|
||||||
|
'https://podcasts.google.com/_/PodcastsUi/data/batchexecute',
|
||||||
|
video_id, data=urlencode_postdata({
|
||||||
|
'f.req': json.dumps([[[func_id, json.dumps(params), None, '1']]]),
|
||||||
|
}), transform_source=lambda x: self._search_regex(r'(?s)(\[.+\])', x, 'data'))[0][2])
|
||||||
|
|
||||||
|
def _extract_episode(self, episode):
|
||||||
|
return {
|
||||||
|
'id': episode[4][3],
|
||||||
|
'title': episode[8],
|
||||||
|
'url': clean_podcast_url(episode[13]),
|
||||||
|
'thumbnail': episode[2],
|
||||||
|
'description': episode[9],
|
||||||
|
'creator': try_get(episode, lambda x: x[14]),
|
||||||
|
'timestamp': int_or_none(episode[11]),
|
||||||
|
'duration': int_or_none(episode[12]),
|
||||||
|
'series': episode[1],
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
class GooglePodcastsIE(GooglePodcastsBaseIE):
|
||||||
|
IE_NAME = 'google:podcasts'
|
||||||
|
_VALID_URL = GooglePodcastsBaseIE._VALID_URL_BASE + r'(?P<feed_url>[^/]+)/episode/(?P<id>[^/?&#]+)'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5ucHIub3JnLzM0NDA5ODUzOS9wb2RjYXN0LnhtbA/episode/MzBlNWRlN2UtOWE4Yy00ODcwLTk2M2MtM2JlMmUyNmViOTRh',
|
||||||
|
'md5': 'fa56b2ee8bd0703e27e42d4b104c4766',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '30e5de7e-9a8c-4870-963c-3be2e26eb94a',
|
||||||
|
'ext': 'mp3',
|
||||||
|
'title': 'WWDTM New Year 2021',
|
||||||
|
'description': 'We say goodbye to 2020 with Christine Baranksi, Doug Jones, Jonna Mendez, and Kellee Edwards.',
|
||||||
|
'upload_date': '20210102',
|
||||||
|
'timestamp': 1609606800,
|
||||||
|
'duration': 2901,
|
||||||
|
'series': "Wait Wait... Don't Tell Me!",
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
b64_feed_url, b64_guid = re.match(self._VALID_URL, url).groups()
|
||||||
|
episode = self._batch_execute(
|
||||||
|
'oNjqVe', b64_guid, [b64_feed_url, b64_guid])[1]
|
||||||
|
return self._extract_episode(episode)
|
||||||
|
|
||||||
|
|
||||||
|
class GooglePodcastsFeedIE(GooglePodcastsBaseIE):
|
||||||
|
IE_NAME = 'google:podcasts:feed'
|
||||||
|
_VALID_URL = GooglePodcastsBaseIE._VALID_URL_BASE + r'(?P<id>[^/?&#]+)/?(?:[?#&]|$)'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5ucHIub3JnLzM0NDA5ODUzOS9wb2RjYXN0LnhtbA',
|
||||||
|
'info_dict': {
|
||||||
|
'title': "Wait Wait... Don't Tell Me!",
|
||||||
|
'description': "NPR's weekly current events quiz. Have a laugh and test your news knowledge while figuring out what's real and what we've made up.",
|
||||||
|
},
|
||||||
|
'playlist_mincount': 20,
|
||||||
|
}
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
b64_feed_url = self._match_id(url)
|
||||||
|
data = self._batch_execute('ncqJEe', b64_feed_url, [b64_feed_url])
|
||||||
|
|
||||||
|
entries = []
|
||||||
|
for episode in (try_get(data, lambda x: x[1][0]) or []):
|
||||||
|
entries.append(self._extract_episode(episode))
|
||||||
|
|
||||||
|
feed = try_get(data, lambda x: x[3]) or []
|
||||||
|
return self.playlist_result(
|
||||||
|
entries, playlist_title=try_get(feed, lambda x: x[0]),
|
||||||
|
playlist_description=try_get(feed, lambda x: x[2]))
|
@ -3,6 +3,7 @@ from __future__ import unicode_literals
|
|||||||
|
|
||||||
import hashlib
|
import hashlib
|
||||||
import hmac
|
import hmac
|
||||||
|
import json
|
||||||
import re
|
import re
|
||||||
import time
|
import time
|
||||||
import uuid
|
import uuid
|
||||||
@ -25,43 +26,50 @@ from ..utils import (
|
|||||||
class HotStarBaseIE(InfoExtractor):
|
class HotStarBaseIE(InfoExtractor):
|
||||||
_AKAMAI_ENCRYPTION_KEY = b'\x05\xfc\x1a\x01\xca\xc9\x4b\xc4\x12\xfc\x53\x12\x07\x75\xf9\xee'
|
_AKAMAI_ENCRYPTION_KEY = b'\x05\xfc\x1a\x01\xca\xc9\x4b\xc4\x12\xfc\x53\x12\x07\x75\xf9\xee'
|
||||||
|
|
||||||
def _call_api_impl(self, path, video_id, query):
|
def _call_api_impl(self, path, video_id, headers, query, data=None):
|
||||||
st = int(time.time())
|
st = int(time.time())
|
||||||
exp = st + 6000
|
exp = st + 6000
|
||||||
auth = 'st=%d~exp=%d~acl=/*' % (st, exp)
|
auth = 'st=%d~exp=%d~acl=/*' % (st, exp)
|
||||||
auth += '~hmac=' + hmac.new(self._AKAMAI_ENCRYPTION_KEY, auth.encode(), hashlib.sha256).hexdigest()
|
auth += '~hmac=' + hmac.new(self._AKAMAI_ENCRYPTION_KEY, auth.encode(), hashlib.sha256).hexdigest()
|
||||||
response = self._download_json(
|
h = {'hotstarauth': auth}
|
||||||
'https://api.hotstar.com/' + path, video_id, headers={
|
h.update(headers)
|
||||||
'hotstarauth': auth,
|
return self._download_json(
|
||||||
'x-country-code': 'IN',
|
'https://api.hotstar.com/' + path,
|
||||||
'x-platform-code': 'JIO',
|
video_id, headers=h, query=query, data=data)
|
||||||
}, query=query)
|
|
||||||
|
def _call_api(self, path, video_id, query_name='contentId'):
|
||||||
|
response = self._call_api_impl(path, video_id, {
|
||||||
|
'x-country-code': 'IN',
|
||||||
|
'x-platform-code': 'JIO',
|
||||||
|
}, {
|
||||||
|
query_name: video_id,
|
||||||
|
'tas': 10000,
|
||||||
|
})
|
||||||
if response['statusCode'] != 'OK':
|
if response['statusCode'] != 'OK':
|
||||||
raise ExtractorError(
|
raise ExtractorError(
|
||||||
response['body']['message'], expected=True)
|
response['body']['message'], expected=True)
|
||||||
return response['body']['results']
|
return response['body']['results']
|
||||||
|
|
||||||
def _call_api(self, path, video_id, query_name='contentId'):
|
def _call_api_v2(self, path, video_id, headers, query=None, data=None):
|
||||||
return self._call_api_impl(path, video_id, {
|
h = {'X-Request-Id': compat_str(uuid.uuid4())}
|
||||||
query_name: video_id,
|
h.update(headers)
|
||||||
'tas': 10000,
|
try:
|
||||||
})
|
return self._call_api_impl(
|
||||||
|
path, video_id, h, query, data)
|
||||||
def _call_api_v2(self, path, video_id):
|
except ExtractorError as e:
|
||||||
return self._call_api_impl(
|
if isinstance(e.cause, compat_HTTPError):
|
||||||
'%s/in/contents/%s' % (path, video_id), video_id, {
|
if e.cause.code == 402:
|
||||||
'desiredConfig': 'encryption:plain;ladder:phone,tv;package:hls,dash',
|
self.raise_login_required()
|
||||||
'client': 'mweb',
|
message = self._parse_json(e.cause.read().decode(), video_id)['message']
|
||||||
'clientVersion': '6.18.0',
|
if message in ('Content not available in region', 'Country is not supported'):
|
||||||
'deviceId': compat_str(uuid.uuid4()),
|
raise self.raise_geo_restricted(message)
|
||||||
'osName': 'Windows',
|
raise ExtractorError(message)
|
||||||
'osVersion': '10',
|
raise e
|
||||||
})
|
|
||||||
|
|
||||||
|
|
||||||
class HotStarIE(HotStarBaseIE):
|
class HotStarIE(HotStarBaseIE):
|
||||||
IE_NAME = 'hotstar'
|
IE_NAME = 'hotstar'
|
||||||
_VALID_URL = r'https?://(?:www\.)?hotstar\.com/(?:.+?[/-])?(?P<id>\d{10})'
|
_VALID_URL = r'https?://(?:www\.)?hotstar\.com/(?:.+[/-])?(?P<id>\d{10})'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
# contentData
|
# contentData
|
||||||
'url': 'https://www.hotstar.com/can-you-not-spread-rumours/1000076273',
|
'url': 'https://www.hotstar.com/can-you-not-spread-rumours/1000076273',
|
||||||
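Review note on the `_call_api_impl` hunk above: the `hotstarauth` header is a time-limited token signed with HMAC-SHA256. The sketch below reproduces only the token construction so it can be checked in isolation. `build_hotstarauth` and `DEMO_KEY` are hypothetical names; the placeholder key is not the extractor's `_AKAMAI_ENCRYPTION_KEY`, and the `now` parameter is added here purely to make the output reproducible.

```python
import hashlib
import hmac
import time

# Placeholder key for illustration only.
DEMO_KEY = b'demo-key'


def build_hotstarauth(key, now=None, lifetime=6000):
    """Build an 'st=...~exp=...~acl=/*~hmac=...' token as in the hunk above."""
    st = int(time.time()) if now is None else int(now)
    exp = st + lifetime
    auth = 'st=%d~exp=%d~acl=/*' % (st, exp)
    # The signature covers the unsigned prefix and is appended as hex.
    auth += '~hmac=' + hmac.new(key, auth.encode(), hashlib.sha256).hexdigest()
    return auth


token = build_hotstarauth(DEMO_KEY, now=1000)
```

After the refactoring in the hunk, this token is merged with caller-supplied headers via `h.update(headers)`, so a caller that passes its own `hotstarauth` entry would override the generated one.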
@ -92,8 +100,13 @@ class HotStarIE(HotStarBaseIE):
|
|||||||
# only available via api v2
|
# only available via api v2
|
||||||
'url': 'https://www.hotstar.com/tv/ek-bhram-sarvagun-sampanna/s-2116/janhvi-targets-suman/1000234847',
|
'url': 'https://www.hotstar.com/tv/ek-bhram-sarvagun-sampanna/s-2116/janhvi-targets-suman/1000234847',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.hotstar.com/in/tv/start-music/1260005217/cooks-vs-comalis/1100039717',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
_GEO_BYPASS = False
|
_GEO_BYPASS = False
|
||||||
|
_DEVICE_ID = None
|
||||||
|
_USER_TOKEN = None
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
@ -121,7 +134,30 @@ class HotStarIE(HotStarBaseIE):
|
|||||||
headers = {'Referer': url}
|
headers = {'Referer': url}
|
||||||
formats = []
|
formats = []
|
||||||
geo_restricted = False
|
geo_restricted = False
|
||||||
playback_sets = self._call_api_v2('h/v2/play', video_id)['playBackSets']
|
|
||||||
|
if not self._USER_TOKEN:
|
||||||
|
self._DEVICE_ID = compat_str(uuid.uuid4())
|
||||||
|
self._USER_TOKEN = self._call_api_v2('um/v3/users', video_id, {
|
||||||
|
'X-HS-Platform': 'PCTV',
|
||||||
|
'Content-Type': 'application/json',
|
||||||
|
}, data=json.dumps({
|
||||||
|
'device_ids': [{
|
||||||
|
'id': self._DEVICE_ID,
|
||||||
|
'type': 'device_id',
|
||||||
|
}],
|
||||||
|
}).encode())['user_identity']
|
||||||
|
|
||||||
|
playback_sets = self._call_api_v2(
|
||||||
|
'play/v2/playback/content/' + video_id, video_id, {
|
||||||
|
'X-HS-Platform': 'web',
|
||||||
|
'X-HS-AppVersion': '6.99.1',
|
||||||
|
'X-HS-UserToken': self._USER_TOKEN,
|
||||||
|
}, query={
|
||||||
|
'device-id': self._DEVICE_ID,
|
||||||
|
'desired-config': 'encryption:plain',
|
||||||
|
'os-name': 'Windows',
|
||||||
|
'os-version': '10',
|
||||||
|
})['data']['playBackSets']
|
||||||
for playback_set in playback_sets:
|
for playback_set in playback_sets:
|
||||||
if not isinstance(playback_set, dict):
|
if not isinstance(playback_set, dict):
|
||||||
continue
|
continue
|
||||||
@ -163,19 +199,22 @@ class HotStarIE(HotStarBaseIE):
|
|||||||
for f in formats:
|
for f in formats:
|
||||||
f.setdefault('http_headers', {}).update(headers)
|
f.setdefault('http_headers', {}).update(headers)
|
||||||
|
|
||||||
|
image = try_get(video_data, lambda x: x['image']['h'], compat_str)
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
'title': title,
|
'title': title,
|
||||||
|
'thumbnail': 'https://img1.hotstarext.com/image/upload/' + image if image else None,
|
||||||
'description': video_data.get('description'),
|
'description': video_data.get('description'),
|
||||||
'duration': int_or_none(video_data.get('duration')),
|
'duration': int_or_none(video_data.get('duration')),
|
||||||
'timestamp': int_or_none(video_data.get('broadcastDate') or video_data.get('startDate')),
|
'timestamp': int_or_none(video_data.get('broadcastDate') or video_data.get('startDate')),
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
'channel': video_data.get('channelName'),
|
'channel': video_data.get('channelName'),
|
||||||
'channel_id': video_data.get('channelId'),
|
'channel_id': str_or_none(video_data.get('channelId')),
|
||||||
'series': video_data.get('showName'),
|
'series': video_data.get('showName'),
|
||||||
'season': video_data.get('seasonName'),
|
'season': video_data.get('seasonName'),
|
||||||
'season_number': int_or_none(video_data.get('seasonNo')),
|
'season_number': int_or_none(video_data.get('seasonNo')),
|
||||||
'season_id': video_data.get('seasonId'),
|
'season_id': str_or_none(video_data.get('seasonId')),
|
||||||
'episode': title,
|
'episode': title,
|
||||||
'episode_number': int_or_none(video_data.get('episodeNo')),
|
'episode_number': int_or_none(video_data.get('episodeNo')),
|
||||||
}
|
}
|
||||||
@ -183,7 +222,7 @@ class HotStarIE(HotStarBaseIE):
|
|||||||
|
|
||||||
class HotStarPlaylistIE(HotStarBaseIE):
|
class HotStarPlaylistIE(HotStarBaseIE):
|
||||||
IE_NAME = 'hotstar:playlist'
|
IE_NAME = 'hotstar:playlist'
|
||||||
_VALID_URL = r'https?://(?:www\.)?hotstar\.com/tv/[^/]+/s-\w+/list/[^/]+/t-(?P<id>\w+)'
|
_VALID_URL = r'https?://(?:www\.)?hotstar\.com/(?:[a-z]{2}/)?tv/[^/]+/s-\w+/list/[^/]+/t-(?P<id>\w+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://www.hotstar.com/tv/savdhaan-india/s-26/list/popular-clips/t-3_2_26',
|
'url': 'https://www.hotstar.com/tv/savdhaan-india/s-26/list/popular-clips/t-3_2_26',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -193,6 +232,9 @@ class HotStarPlaylistIE(HotStarBaseIE):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.hotstar.com/tv/savdhaan-india/s-26/list/extras/t-2480',
|
'url': 'https://www.hotstar.com/tv/savdhaan-india/s-26/list/extras/t-2480',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.hotstar.com/us/tv/masterchef-india/s-830/list/episodes/t-1_2_830',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
|
@ -3,230 +3,255 @@ from __future__ import unicode_literals
|
|||||||
import re
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
|
from ..compat import (
|
||||||
|
compat_parse_qs,
|
||||||
|
compat_urllib_parse_urlparse,
|
||||||
|
)
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
|
HEADRequest,
|
||||||
|
determine_ext,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
parse_iso8601,
|
parse_iso8601,
|
||||||
|
strip_or_none,
|
||||||
|
try_get,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class IGNIE(InfoExtractor):
|
class IGNBaseIE(InfoExtractor):
|
||||||
|
def _call_api(self, slug):
|
||||||
|
return self._download_json(
|
||||||
|
'http://apis.ign.com/{0}/v3/{0}s/slug/{1}'.format(self._PAGE_TYPE, slug), slug)
|
||||||
|
|
||||||
|
|
||||||
|
class IGNIE(IGNBaseIE):
|
||||||
"""
|
"""
|
||||||
Extractor for some of the IGN sites, like www.ign.com, es.ign.com de.ign.com.
|
Extractor for some of the IGN sites, like www.ign.com, es.ign.com de.ign.com.
|
||||||
Some videos of it.ign.com are also supported
|
Some videos of it.ign.com are also supported
|
||||||
"""
|
"""
|
||||||
|
|
||||||
_VALID_URL = r'https?://.+?\.ign\.com/(?:[^/]+/)?(?P<type>videos|show_videos|articles|feature|(?:[^/]+/\d+/video))(/.+)?/(?P<name_or_id>.+)'
|
_VALID_URL = r'https?://(?:.+?\.ign|www\.pcmag)\.com/videos/(?:\d{4}/\d{2}/\d{2}/)?(?P<id>[^/?&#]+)'
|
||||||
IE_NAME = 'ign.com'
|
IE_NAME = 'ign.com'
|
||||||
|
_PAGE_TYPE = 'video'
|
||||||
|
|
||||||
_API_URL_TEMPLATE = 'http://apis.ign.com/video/v3/videos/%s'
|
_TESTS = [{
|
||||||
_EMBED_RE = r'<iframe[^>]+?["\']((?:https?:)?//.+?\.ign\.com.+?/embed.+?)["\']'
|
'url': 'http://www.ign.com/videos/2013/06/05/the-last-of-us-review',
|
||||||
|
'md5': 'd2e1586d9987d40fad7867bf96a018ea',
|
||||||
_TESTS = [
|
'info_dict': {
|
||||||
{
|
'id': '8f862beef863986b2785559b9e1aa599',
|
||||||
'url': 'http://www.ign.com/videos/2013/06/05/the-last-of-us-review',
|
'ext': 'mp4',
|
||||||
'md5': 'febda82c4bafecd2d44b6e1a18a595f8',
|
'title': 'The Last of Us Review',
|
||||||
'info_dict': {
|
'description': 'md5:c8946d4260a4d43a00d5ae8ed998870c',
|
||||||
'id': '8f862beef863986b2785559b9e1aa599',
|
'timestamp': 1370440800,
|
||||||
'ext': 'mp4',
|
'upload_date': '20130605',
|
||||||
'title': 'The Last of Us Review',
|
'tags': 'count:9',
|
||||||
'description': 'md5:c8946d4260a4d43a00d5ae8ed998870c',
|
}
|
||||||
'timestamp': 1370440800,
|
}, {
|
||||||
'upload_date': '20130605',
|
'url': 'http://www.pcmag.com/videos/2015/01/06/010615-whats-new-now-is-gogo-snooping-on-your-data',
|
||||||
'uploader_id': 'cberidon@ign.com',
|
'md5': 'f1581a6fe8c5121be5b807684aeac3f6',
|
||||||
}
|
'info_dict': {
|
||||||
},
|
'id': 'ee10d774b508c9b8ec07e763b9125b91',
|
||||||
{
|
'ext': 'mp4',
|
||||||
'url': 'http://me.ign.com/en/feature/15775/100-little-things-in-gta-5-that-will-blow-your-mind',
|
'title': 'What\'s New Now: Is GoGo Snooping on Your Data?',
|
||||||
'info_dict': {
|
'description': 'md5:817a20299de610bd56f13175386da6fa',
|
||||||
'id': '100-little-things-in-gta-5-that-will-blow-your-mind',
|
'timestamp': 1420571160,
|
||||||
},
|
'upload_date': '20150106',
|
||||||
'playlist': [
|
'tags': 'count:4',
|
||||||
{
|
}
|
||||||
'info_dict': {
|
}, {
|
||||||
'id': '5ebbd138523268b93c9141af17bec937',
|
'url': 'https://www.ign.com/videos/is-a-resident-evil-4-remake-on-the-way-ign-daily-fix',
|
||||||
'ext': 'mp4',
|
'only_matching': True,
|
||||||
'title': 'GTA 5 Video Review',
|
}]
|
||||||
'description': 'Rockstar drops the mic on this generation of games. Watch our review of the masterly Grand Theft Auto V.',
|
|
||||||
'timestamp': 1379339880,
|
|
||||||
'upload_date': '20130916',
|
|
||||||
'uploader_id': 'danieljkrupa@gmail.com',
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
'info_dict': {
|
|
||||||
'id': '638672ee848ae4ff108df2a296418ee2',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': '26 Twisted Moments from GTA 5 in Slow Motion',
|
|
||||||
'description': 'The twisted beauty of GTA 5 in stunning slow motion.',
|
|
||||||
'timestamp': 1386878820,
|
|
||||||
'upload_date': '20131212',
|
|
||||||
'uploader_id': 'togilvie@ign.com',
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
'params': {
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
'url': 'http://www.ign.com/articles/2014/08/15/rewind-theater-wild-trailer-gamescom-2014?watch',
|
|
||||||
'md5': '618fedb9c901fd086f6f093564ef8558',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '078fdd005f6d3c02f63d795faa1b984f',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'Rewind Theater - Wild Trailer Gamescom 2014',
|
|
||||||
'description': 'Brian and Jared explore Michel Ancel\'s captivating new preview.',
|
|
||||||
'timestamp': 1408047180,
|
|
||||||
'upload_date': '20140814',
|
|
||||||
'uploader_id': 'jamesduggan1990@gmail.com',
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
'url': 'http://me.ign.com/en/videos/112203/video/how-hitman-aims-to-be-different-than-every-other-s',
|
|
||||||
'only_matching': True,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
'url': 'http://me.ign.com/ar/angry-birds-2/106533/video/lrd-ldyy-lwl-lfylm-angry-birds',
|
|
||||||
'only_matching': True,
|
|
||||||
},
|
|
||||||
{
|
|
||||||
# videoId pattern
|
|
||||||
'url': 'http://www.ign.com/articles/2017/06/08/new-ducktales-short-donalds-birthday-doesnt-go-as-planned',
|
|
||||||
'only_matching': True,
|
|
||||||
},
|
|
||||||
]
|
|
||||||
|
|
||||||
def _find_video_id(self, webpage):
|
|
||||||
res_id = [
|
|
||||||
r'"video_id"\s*:\s*"(.*?)"',
|
|
||||||
r'class="hero-poster[^"]*?"[^>]*id="(.+?)"',
|
|
||||||
r'data-video-id="(.+?)"',
|
|
||||||
r'<object id="vid_(.+?)"',
|
|
||||||
r'<meta name="og:image" content=".*/(.+?)-(.+?)/.+.jpg"',
|
|
||||||
r'videoId"\s*:\s*"(.+?)"',
|
|
||||||
r'videoId["\']\s*:\s*["\']([^"\']+?)["\']',
|
|
||||||
]
|
|
||||||
return self._search_regex(res_id, webpage, 'video id', default=None)
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
mobj = re.match(self._VALID_URL, url)
|
display_id = self._match_id(url)
|
||||||
name_or_id = mobj.group('name_or_id')
|
video = self._call_api(display_id)
|
||||||
page_type = mobj.group('type')
|
video_id = video['videoId']
|
||||||
webpage = self._download_webpage(url, name_or_id)
|
metadata = video['metadata']
|
||||||
if page_type != 'video':
|
title = metadata.get('longTitle') or metadata.get('title') or metadata['name']
|
||||||
multiple_urls = re.findall(
|
|
||||||
r'<param name="flashvars"[^>]*value="[^"]*?url=(https?://www\.ign\.com/videos/.*?)["&]',
|
|
||||||
webpage)
|
|
||||||
if multiple_urls:
|
|
||||||
entries = [self.url_result(u, ie='IGN') for u in multiple_urls]
|
|
||||||
return {
|
|
||||||
'_type': 'playlist',
|
|
||||||
'id': name_or_id,
|
|
||||||
'entries': entries,
|
|
||||||
}
|
|
||||||
|
|
||||||
video_id = self._find_video_id(webpage)
|
|
||||||
if not video_id:
|
|
||||||
return self.url_result(self._search_regex(
|
|
||||||
self._EMBED_RE, webpage, 'embed url'))
|
|
||||||
return self._get_video_info(video_id)
|
|
||||||
|
|
||||||
def _get_video_info(self, video_id):
|
|
||||||
api_data = self._download_json(
|
|
||||||
self._API_URL_TEMPLATE % video_id, video_id)
|
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
m3u8_url = api_data['refs'].get('m3uUrl')
|
refs = video.get('refs') or {}
|
||||||
|
|
||||||
|
m3u8_url = refs.get('m3uUrl')
|
||||||
if m3u8_url:
|
if m3u8_url:
|
||||||
formats.extend(self._extract_m3u8_formats(
|
formats.extend(self._extract_m3u8_formats(
|
||||||
m3u8_url, video_id, 'mp4', 'm3u8_native',
|
m3u8_url, video_id, 'mp4', 'm3u8_native',
|
||||||
m3u8_id='hls', fatal=False))
|
m3u8_id='hls', fatal=False))
|
||||||
f4m_url = api_data['refs'].get('f4mUrl')
|
|
||||||
|
f4m_url = refs.get('f4mUrl')
|
||||||
if f4m_url:
|
if f4m_url:
|
||||||
formats.extend(self._extract_f4m_formats(
|
formats.extend(self._extract_f4m_formats(
|
||||||
f4m_url, video_id, f4m_id='hds', fatal=False))
|
f4m_url, video_id, f4m_id='hds', fatal=False))
|
||||||
for asset in api_data['assets']:
|
|
||||||
|
for asset in (video.get('assets') or []):
|
||||||
|
asset_url = asset.get('url')
|
||||||
|
if not asset_url:
|
||||||
|
continue
|
||||||
formats.append({
|
formats.append({
|
||||||
'url': asset['url'],
|
'url': asset_url,
|
||||||
'tbr': asset.get('actual_bitrate_kbps'),
|
'tbr': int_or_none(asset.get('bitrate'), 1000),
|
||||||
'fps': asset.get('frame_rate'),
|
'fps': int_or_none(asset.get('frame_rate')),
|
||||||
'height': int_or_none(asset.get('height')),
|
'height': int_or_none(asset.get('height')),
|
||||||
'width': int_or_none(asset.get('width')),
|
'width': int_or_none(asset.get('width')),
|
||||||
})
|
})
|
||||||
|
|
||||||
|
mezzanine_url = try_get(video, lambda x: x['system']['mezzanineUrl'])
|
||||||
|
if mezzanine_url:
|
||||||
|
formats.append({
|
||||||
|
'ext': determine_ext(mezzanine_url, 'mp4'),
|
||||||
|
'format_id': 'mezzanine',
|
||||||
|
'preference': 1,
|
||||||
|
'url': mezzanine_url,
|
||||||
|
})
|
||||||
|
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
thumbnails = [{
|
thumbnails = []
|
||||||
'url': thumbnail['url']
|
for thumbnail in (video.get('thumbnails') or []):
|
||||||
} for thumbnail in api_data.get('thumbnails', [])]
|
thumbnail_url = thumbnail.get('url')
|
||||||
|
if not thumbnail_url:
|
||||||
|
continue
|
||||||
|
thumbnails.append({
|
||||||
|
'url': thumbnail_url,
|
||||||
|
})
|
||||||
|
|
||||||
metadata = api_data['metadata']
|
tags = []
|
||||||
|
for tag in (video.get('tags') or []):
|
||||||
|
display_name = tag.get('displayName')
|
||||||
|
if not display_name:
|
||||||
|
continue
|
||||||
|
tags.append(display_name)
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': api_data.get('videoId') or video_id,
|
'id': video_id,
|
||||||
'title': metadata.get('longTitle') or metadata.get('name') or metadata.get['title'],
|
'title': title,
|
||||||
'description': metadata.get('description'),
|
'description': strip_or_none(metadata.get('description')),
|
||||||
'timestamp': parse_iso8601(metadata.get('publishDate')),
|
'timestamp': parse_iso8601(metadata.get('publishDate')),
|
||||||
'duration': int_or_none(metadata.get('duration')),
|
'duration': int_or_none(metadata.get('duration')),
|
||||||
'display_id': metadata.get('slug') or video_id,
|
'display_id': display_id,
|
||||||
'uploader_id': metadata.get('creator'),
|
|
||||||
'thumbnails': thumbnails,
|
'thumbnails': thumbnails,
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
|
'tags': tags,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
class OneUPIE(IGNIE):
|
class IGNVideoIE(InfoExtractor):
|
||||||
_VALID_URL = r'https?://gamevideos\.1up\.com/(?P<type>video)/id/(?P<name_or_id>.+)\.html'
|
_VALID_URL = r'https?://.+?\.ign\.com/(?:[a-z]{2}/)?[^/]+/(?P<id>\d+)/(?:video|trailer)/'
|
||||||
IE_NAME = '1up.com'
|
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://gamevideos.1up.com/video/id/34976.html',
|
'url': 'http://me.ign.com/en/videos/112203/video/how-hitman-aims-to-be-different-than-every-other-s',
|
||||||
'md5': 'c9cc69e07acb675c31a16719f909e347',
|
'md5': 'dd9aca7ed2657c4e118d8b261e5e9de1',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '34976',
|
'id': 'e9be7ea899a9bbfc0674accc22a36cc8',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Sniper Elite V2 - Trailer',
|
'title': 'How Hitman Aims to Be Different Than Every Other Stealth Game - NYCC 2015',
|
||||||
'description': 'md5:bf0516c5ee32a3217aa703e9b1bc7826',
|
'description': 'Taking out assassination targets in Hitman has never been more stylish.',
|
||||||
'timestamp': 1313099220,
|
'timestamp': 1444665600,
|
||||||
'upload_date': '20110811',
|
'upload_date': '20151012',
|
||||||
'uploader_id': 'IGN',
|
|
||||||
}
|
}
|
||||||
|
}, {
|
||||||
|
'url': 'http://me.ign.com/ar/angry-birds-2/106533/video/lrd-ldyy-lwl-lfylm-angry-birds',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# Youtube embed
|
||||||
|
'url': 'https://me.ign.com/ar/ratchet-clank-rift-apart/144327/trailer/embed',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# Twitter embed
|
||||||
|
'url': 'http://adria.ign.com/sherlock-season-4/9687/trailer/embed',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# Vimeo embed
|
||||||
|
'url': 'https://kr.ign.com/bic-2018/3307/trailer/embed',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
mobj = re.match(self._VALID_URL, url)
|
video_id = self._match_id(url)
|
||||||
result = super(OneUPIE, self)._real_extract(url)
|
req = HEADRequest(url.rsplit('/', 1)[0] + '/embed')
|
||||||
result['id'] = mobj.group('name_or_id')
|
url = self._request_webpage(req, video_id).geturl()
|
||||||
return result
|
ign_url = compat_parse_qs(
|
||||||
|
compat_urllib_parse_urlparse(url).query).get('url', [None])[0]
|
||||||
|
if ign_url:
|
||||||
|
return self.url_result(ign_url, IGNIE.ie_key())
|
||||||
|
return self.url_result(url)
|
||||||
|
|
||||||
|
|
||||||
class PCMagIE(IGNIE):
|
class IGNArticleIE(IGNBaseIE):
|
||||||
_VALID_URL = r'https?://(?:www\.)?pcmag\.com/(?P<type>videos|article2)(/.+)?/(?P<name_or_id>.+)'
|
_VALID_URL = r'https?://.+?\.ign\.com/(?:articles(?:/\d{4}/\d{2}/\d{2})?|(?:[a-z]{2}/)?feature/\d+)/(?P<id>[^/?&#]+)'
|
||||||
IE_NAME = 'pcmag'
|
_PAGE_TYPE = 'article'
|
||||||
|
|
||||||
_EMBED_RE = r'iframe\.setAttribute\("src",\s*__util.objToUrlString\("http://widgets\.ign\.com/video/embed/content\.html?[^"]*url=([^"]+)["&]'
|
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.pcmag.com/videos/2015/01/06/010615-whats-new-now-is-gogo-snooping-on-your-data',
|
'url': 'http://me.ign.com/en/feature/15775/100-little-things-in-gta-5-that-will-blow-your-mind',
|
||||||
'md5': '212d6154fd0361a2781075f1febbe9ad',
|
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'ee10d774b508c9b8ec07e763b9125b91',
|
'id': '524497489e4e8ff5848ece34',
|
||||||
'ext': 'mp4',
|
'title': '100 Little Things in GTA 5 That Will Blow Your Mind',
|
||||||
'title': '010615_What\'s New Now: Is GoGo Snooping on Your Data?',
|
},
|
||||||
'description': 'md5:a7071ae64d2f68cc821c729d4ded6bb3',
|
'playlist': [
|
||||||
'timestamp': 1420571160,
|
{
|
||||||
'upload_date': '20150106',
|
'info_dict': {
|
||||||
'uploader_id': 'cozzipix@gmail.com',
|
'id': '5ebbd138523268b93c9141af17bec937',
|
||||||
}
|
'ext': 'mp4',
|
||||||
|
'title': 'GTA 5 Video Review',
|
||||||
|
'description': 'Rockstar drops the mic on this generation of games. Watch our review of the masterly Grand Theft Auto V.',
|
||||||
|
'timestamp': 1379339880,
|
||||||
|
'upload_date': '20130916',
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
'info_dict': {
|
||||||
|
'id': '638672ee848ae4ff108df2a296418ee2',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': '26 Twisted Moments from GTA 5 in Slow Motion',
|
||||||
|
'description': 'The twisted beauty of GTA 5 in stunning slow motion.',
|
||||||
|
'timestamp': 1386878820,
|
||||||
|
'upload_date': '20131212',
|
||||||
|
},
|
||||||
|
},
|
||||||
|
],
|
||||||
|
'params': {
|
||||||
|
'playlist_items': '2-3',
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.pcmag.com/article2/0,2817,2470156,00.asp',
|
'url': 'http://www.ign.com/articles/2014/08/15/rewind-theater-wild-trailer-gamescom-2014?watch',
|
||||||
'md5': '94130c1ca07ba0adb6088350681f16c1',
|
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '042e560ba94823d43afcb12ddf7142ca',
|
'id': '53ee806780a81ec46e0790f8',
|
||||||
'ext': 'mp4',
|
'title': 'Rewind Theater - Wild Trailer Gamescom 2014',
|
||||||
'title': 'HTC\'s Weird New Re Camera - What\'s New Now',
|
},
|
||||||
'description': 'md5:53433c45df96d2ea5d0fda18be2ca908',
|
'playlist_count': 2,
|
||||||
'timestamp': 1412953920,
|
}, {
|
||||||
'upload_date': '20141010',
|
# videoId pattern
|
||||||
'uploader_id': 'chris_snyder@pcmag.com',
|
'url': 'http://www.ign.com/articles/2017/06/08/new-ducktales-short-donalds-birthday-doesnt-go-as-planned',
|
||||||
}
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# Youtube embed
|
||||||
|
'url': 'https://www.ign.com/articles/2021-mvp-named-in-puppy-bowl-xvii',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# IMDB embed
|
||||||
|
'url': 'https://www.ign.com/articles/2014/08/07/sons-of-anarchy-final-season-trailer',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# Facebook embed
|
||||||
|
'url': 'https://www.ign.com/articles/2017/09/20/marvels-the-punisher-watch-the-new-trailer-for-the-netflix-series',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# Brightcove embed
|
||||||
|
'url': 'https://www.ign.com/articles/2016/01/16/supergirl-goes-flying-with-martian-manhunter-in-new-clip',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
display_id = self._match_id(url)
|
||||||
|
article = self._call_api(display_id)
|
||||||
|
|
||||||
|
def entries():
|
||||||
|
media_url = try_get(article, lambda x: x['mediaRelations'][0]['media']['metadata']['url'])
|
||||||
|
if media_url:
|
||||||
|
yield self.url_result(media_url, IGNIE.ie_key())
|
||||||
|
for content in (article.get('content') or []):
|
||||||
|
for video_url in re.findall(r'(?:\[(?:ignvideo\s+url|youtube\s+clip_id)|<iframe[^>]+src)="([^"]+)"', content):
|
||||||
|
yield self.url_result(video_url)
|
||||||
|
|
||||||
|
return self.playlist_result(
|
||||||
|
entries(), article.get('articleId'),
|
||||||
|
strip_or_none(try_get(article, lambda x: x['metadata']['headline'])))
|
||||||
|
97
youtube_dl/extractor/iheart.py
Normal file
@ -0,0 +1,97 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
from ..utils import (
|
||||||
|
clean_html,
|
||||||
|
clean_podcast_url,
|
||||||
|
int_or_none,
|
||||||
|
str_or_none,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class IHeartRadioBaseIE(InfoExtractor):
|
||||||
|
def _call_api(self, path, video_id, fatal=True, query=None):
|
||||||
|
return self._download_json(
|
||||||
|
'https://api.iheart.com/api/v3/podcast/' + path,
|
||||||
|
video_id, fatal=fatal, query=query)
|
||||||
|
|
||||||
|
def _extract_episode(self, episode):
|
||||||
|
return {
|
||||||
|
'thumbnail': episode.get('imageUrl'),
|
||||||
|
'description': clean_html(episode.get('description')),
|
||||||
|
'timestamp': int_or_none(episode.get('startDate'), 1000),
|
||||||
|
'duration': int_or_none(episode.get('duration')),
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
class IHeartRadioIE(IHeartRadioBaseIE):
|
||||||
|
IENAME = 'iheartradio'
|
||||||
|
_VALID_URL = r'(?:https?://(?:www\.)?iheart\.com/podcast/[^/]+/episode/(?P<display_id>[^/?&#]+)-|iheartradio:)(?P<id>\d+)'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'https://www.iheart.com/podcast/105-behind-the-bastards-29236323/episode/part-one-alexander-lukashenko-the-dictator-70346499/?embed=true',
|
||||||
|
'md5': 'c8609c92c8688dcb69d8541042b8abca',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '70346499',
|
||||||
|
'ext': 'mp3',
|
||||||
|
'title': 'Part One: Alexander Lukashenko: The Dictator of Belarus',
|
||||||
|
'description': 'md5:96cc7297b3a5a9ebae28643801c96fae',
|
||||||
|
'timestamp': 1597741200,
|
||||||
|
'upload_date': '20200818',
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
episode_id = self._match_id(url)
|
||||||
|
episode = self._call_api(
|
||||||
|
'episodes/' + episode_id, episode_id)['episode']
|
||||||
|
info = self._extract_episode(episode)
|
||||||
|
info.update({
|
||||||
|
'id': episode_id,
|
||||||
|
'title': episode['title'],
|
||||||
|
'url': clean_podcast_url(episode['mediaUrl']),
|
||||||
|
})
|
||||||
|
return info
|
||||||
|
|
||||||
|
|
||||||
|
class IHeartRadioPodcastIE(IHeartRadioBaseIE):
|
||||||
|
IE_NAME = 'iheartradio:podcast'
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?iheart(?:podcastnetwork)?\.com/podcast/[^/?&#]+-(?P<id>\d+)/?(?:[?#&]|$)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://www.iheart.com/podcast/1119-it-could-happen-here-30717896/',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '30717896',
|
||||||
|
'title': 'It Could Happen Here',
|
||||||
|
'description': 'md5:5842117412a967eb0b01f8088eb663e2',
|
||||||
|
},
|
||||||
|
'playlist_mincount': 11,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.iheartpodcastnetwork.com/podcast/105-stuff-you-should-know-26940277',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
podcast_id = self._match_id(url)
|
||||||
|
path = 'podcasts/' + podcast_id
|
||||||
|
episodes = self._call_api(
|
||||||
|
path + '/episodes', podcast_id, query={'limit': 1000000000})['data']
|
||||||
|
|
||||||
|
entries = []
|
||||||
|
for episode in episodes:
|
||||||
|
episode_id = str_or_none(episode.get('id'))
|
||||||
|
if not episode_id:
|
||||||
|
continue
|
||||||
|
info = self._extract_episode(episode)
|
||||||
|
info.update({
|
||||||
|
'_type': 'url',
|
||||||
|
'id': episode_id,
|
||||||
|
'title': episode.get('title'),
|
||||||
|
'url': 'iheartradio:' + episode_id,
|
||||||
|
'ie_key': IHeartRadioIE.ie_key(),
|
||||||
|
})
|
||||||
|
entries.append(info)
|
||||||
|
|
||||||
|
podcast = self._call_api(path, podcast_id, False) or {}
|
||||||
|
|
||||||
|
return self.playlist_result(
|
||||||
|
entries, podcast_id, podcast.get('title'), podcast.get('description'))
|
@ -12,7 +12,7 @@ from ..utils import (
|
|||||||
|
|
||||||
|
|
||||||
class InaIE(InfoExtractor):
|
class InaIE(InfoExtractor):
|
||||||
_VALID_URL = r'https?://(?:www\.)?ina\.fr/(?:video|audio)/(?P<id>[A-Z0-9_]+)'
|
_VALID_URL = r'https?://(?:(?:www|m)\.)?ina\.fr/(?:video|audio)/(?P<id>[A-Z0-9_]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.ina.fr/video/I12055569/francois-hollande-je-crois-que-c-est-clair-video.html',
|
'url': 'http://www.ina.fr/video/I12055569/francois-hollande-je-crois-que-c-est-clair-video.html',
|
||||||
'md5': 'a667021bf2b41f8dc6049479d9bb38a3',
|
'md5': 'a667021bf2b41f8dc6049479d9bb38a3',
|
||||||
@ -31,6 +31,9 @@ class InaIE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.ina.fr/video/P16173408-video.html',
|
'url': 'https://www.ina.fr/video/P16173408-video.html',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'http://m.ina.fr/video/I12055569',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
|
@ -22,7 +22,7 @@ from ..utils import (
|
|||||||
|
|
||||||
|
|
||||||
class InstagramIE(InfoExtractor):
|
class InstagramIE(InfoExtractor):
|
||||||
_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com/(?:p|tv)/(?P<id>[^/?#&]+))'
|
_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com/(?:p|tv|reel)/(?P<id>[^/?#&]+))'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://instagram.com/p/aye83DjauH/?foo=bar#abc',
|
'url': 'https://instagram.com/p/aye83DjauH/?foo=bar#abc',
|
||||||
'md5': '0d2da106a9d2631273e192b372806516',
|
'md5': '0d2da106a9d2631273e192b372806516',
|
||||||
@ -35,7 +35,7 @@ class InstagramIE(InfoExtractor):
|
|||||||
'timestamp': 1371748545,
|
'timestamp': 1371748545,
|
||||||
'upload_date': '20130620',
|
'upload_date': '20130620',
|
||||||
'uploader_id': 'naomipq',
|
'uploader_id': 'naomipq',
|
||||||
'uploader': 'Naomi Leonor Phan-Quang',
|
'uploader': 'B E A U T Y F O R A S H E S',
|
||||||
'like_count': int,
|
'like_count': int,
|
||||||
'comment_count': int,
|
'comment_count': int,
|
||||||
'comments': list,
|
'comments': list,
|
||||||
@ -95,6 +95,9 @@ class InstagramIE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.instagram.com/tv/aye83DjauH/',
|
'url': 'https://www.instagram.com/tv/aye83DjauH/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.instagram.com/reel/CDUMkliABpa/',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
@ -122,9 +125,9 @@ class InstagramIE(InfoExtractor):
|
|||||||
|
|
||||||
webpage = self._download_webpage(url, video_id)
|
webpage = self._download_webpage(url, video_id)
|
||||||
|
|
||||||
(video_url, description, thumbnail, timestamp, uploader,
|
(media, video_url, description, thumbnail, timestamp, uploader,
|
||||||
uploader_id, like_count, comment_count, comments, height,
|
uploader_id, like_count, comment_count, comments, height,
|
||||||
width) = [None] * 11
|
width) = [None] * 12
|
||||||
|
|
||||||
shared_data = self._parse_json(
|
shared_data = self._parse_json(
|
||||||
self._search_regex(
|
self._search_regex(
|
||||||
@ -137,59 +140,77 @@ class InstagramIE(InfoExtractor):
|
|||||||
(lambda x: x['entry_data']['PostPage'][0]['graphql']['shortcode_media'],
|
(lambda x: x['entry_data']['PostPage'][0]['graphql']['shortcode_media'],
|
||||||
lambda x: x['entry_data']['PostPage'][0]['media']),
|
lambda x: x['entry_data']['PostPage'][0]['media']),
|
||||||
dict)
|
dict)
|
||||||
if media:
|
# _sharedData.entry_data.PostPage is empty when authenticated (see
|
||||||
video_url = media.get('video_url')
|
# https://github.com/ytdl-org/youtube-dl/pull/22880)
|
||||||
height = int_or_none(media.get('dimensions', {}).get('height'))
|
if not media:
|
||||||
width = int_or_none(media.get('dimensions', {}).get('width'))
|
additional_data = self._parse_json(
|
||||||
description = try_get(
|
self._search_regex(
|
||||||
media, lambda x: x['edge_media_to_caption']['edges'][0]['node']['text'],
|
r'window\.__additionalDataLoaded\s*\(\s*[^,]+,\s*({.+?})\s*\)\s*;',
|
||||||
compat_str) or media.get('caption')
|
webpage, 'additional data', default='{}'),
|
||||||
thumbnail = media.get('display_src')
|
video_id, fatal=False)
|
||||||
timestamp = int_or_none(media.get('taken_at_timestamp') or media.get('date'))
|
if additional_data:
|
||||||
uploader = media.get('owner', {}).get('full_name')
|
media = try_get(
|
||||||
uploader_id = media.get('owner', {}).get('username')
|
additional_data, lambda x: x['graphql']['shortcode_media'],
|
||||||
|
dict)
|
||||||
|
if media:
|
||||||
|
video_url = media.get('video_url')
|
||||||
|
height = int_or_none(media.get('dimensions', {}).get('height'))
|
||||||
|
width = int_or_none(media.get('dimensions', {}).get('width'))
|
||||||
|
description = try_get(
|
||||||
|
media, lambda x: x['edge_media_to_caption']['edges'][0]['node']['text'],
|
||||||
|
compat_str) or media.get('caption')
|
||||||
|
thumbnail = media.get('display_src') or media.get('display_url')
|
||||||
|
timestamp = int_or_none(media.get('taken_at_timestamp') or media.get('date'))
|
||||||
|
uploader = media.get('owner', {}).get('full_name')
|
||||||
|
uploader_id = media.get('owner', {}).get('username')
|
||||||
|
|
||||||
def get_count(key, kind):
|
def get_count(keys, kind):
|
||||||
return int_or_none(try_get(
|
if not isinstance(keys, (list, tuple)):
|
||||||
|
keys = [keys]
|
||||||
|
for key in keys:
|
||||||
|
count = int_or_none(try_get(
|
||||||
media, (lambda x: x['edge_media_%s' % key]['count'],
|
media, (lambda x: x['edge_media_%s' % key]['count'],
|
||||||
lambda x: x['%ss' % kind]['count'])))
|
lambda x: x['%ss' % kind]['count'])))
|
||||||
like_count = get_count('preview_like', 'like')
|
if count is not None:
|
||||||
comment_count = get_count('to_comment', 'comment')
|
return count
|
||||||
|
like_count = get_count('preview_like', 'like')
|
||||||
|
comment_count = get_count(
|
||||||
|
('preview_comment', 'to_comment', 'to_parent_comment'), 'comment')
|
||||||
|
|
||||||
comments = [{
|
comments = [{
|
||||||
'author': comment.get('user', {}).get('username'),
|
'author': comment.get('user', {}).get('username'),
|
||||||
'author_id': comment.get('user', {}).get('id'),
|
'author_id': comment.get('user', {}).get('id'),
|
||||||
'id': comment.get('id'),
|
'id': comment.get('id'),
|
||||||
'text': comment.get('text'),
|
'text': comment.get('text'),
|
||||||
'timestamp': int_or_none(comment.get('created_at')),
|
'timestamp': int_or_none(comment.get('created_at')),
|
||||||
} for comment in media.get(
|
} for comment in media.get(
|
||||||
'comments', {}).get('nodes', []) if comment.get('text')]
|
'comments', {}).get('nodes', []) if comment.get('text')]
|
||||||
if not video_url:
|
if not video_url:
|
||||||
edges = try_get(
|
edges = try_get(
|
||||||
media, lambda x: x['edge_sidecar_to_children']['edges'],
|
media, lambda x: x['edge_sidecar_to_children']['edges'],
|
||||||
list) or []
|
list) or []
|
||||||
if edges:
|
if edges:
|
||||||
entries = []
|
entries = []
|
||||||
for edge_num, edge in enumerate(edges, start=1):
|
for edge_num, edge in enumerate(edges, start=1):
|
||||||
node = try_get(edge, lambda x: x['node'], dict)
|
node = try_get(edge, lambda x: x['node'], dict)
|
||||||
if not node:
|
if not node:
|
||||||
continue
|
continue
|
||||||
node_video_url = url_or_none(node.get('video_url'))
|
node_video_url = url_or_none(node.get('video_url'))
|
||||||
if not node_video_url:
|
if not node_video_url:
|
||||||
continue
|
continue
|
||||||
entries.append({
|
entries.append({
|
||||||
'id': node.get('shortcode') or node['id'],
|
'id': node.get('shortcode') or node['id'],
|
||||||
'title': 'Video %d' % edge_num,
|
'title': 'Video %d' % edge_num,
|
||||||
'url': node_video_url,
|
'url': node_video_url,
|
||||||
'thumbnail': node.get('display_url'),
|
'thumbnail': node.get('display_url'),
|
||||||
'width': int_or_none(try_get(node, lambda x: x['dimensions']['width'])),
|
'width': int_or_none(try_get(node, lambda x: x['dimensions']['width'])),
|
||||||
'height': int_or_none(try_get(node, lambda x: x['dimensions']['height'])),
|
'height': int_or_none(try_get(node, lambda x: x['dimensions']['height'])),
|
||||||
'view_count': int_or_none(node.get('video_view_count')),
|
'view_count': int_or_none(node.get('video_view_count')),
|
||||||
})
|
})
|
||||||
return self.playlist_result(
|
return self.playlist_result(
|
||||||
entries, video_id,
|
entries, video_id,
|
||||||
'Post by %s' % uploader_id if uploader_id else None,
|
'Post by %s' % uploader_id if uploader_id else None,
|
||||||
description)
|
description)
|
||||||
|
|
||||||
if not video_url:
|
if not video_url:
|
||||||
video_url = self._og_search_video_url(webpage, secure=False)
|
video_url = self._og_search_video_url(webpage, secure=False)
|
||||||
|
@ -1,29 +1,21 @@
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
import uuid
|
|
||||||
import xml.etree.ElementTree as etree
|
|
||||||
import json
|
import json
|
||||||
import re
|
import re
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from .brightcove import BrightcoveNewIE
|
from .brightcove import BrightcoveNewIE
|
||||||
from ..compat import (
|
|
||||||
compat_str,
|
|
||||||
compat_etree_register_namespace,
|
|
||||||
)
|
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
|
clean_html,
|
||||||
determine_ext,
|
determine_ext,
|
||||||
ExtractorError,
|
|
||||||
extract_attributes,
|
extract_attributes,
|
||||||
int_or_none,
|
get_element_by_class,
|
||||||
|
JSON_LD_RE,
|
||||||
merge_dicts,
|
merge_dicts,
|
||||||
parse_duration,
|
parse_duration,
|
||||||
smuggle_url,
|
smuggle_url,
|
||||||
url_or_none,
|
url_or_none,
|
||||||
xpath_with_ns,
|
|
||||||
xpath_element,
|
|
||||||
xpath_text,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@ -31,14 +23,18 @@ class ITVIE(InfoExtractor):
|
|||||||
_VALID_URL = r'https?://(?:www\.)?itv\.com/hub/[^/]+/(?P<id>[0-9a-zA-Z]+)'
|
_VALID_URL = r'https?://(?:www\.)?itv\.com/hub/[^/]+/(?P<id>[0-9a-zA-Z]+)'
|
||||||
_GEO_COUNTRIES = ['GB']
|
_GEO_COUNTRIES = ['GB']
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.itv.com/hub/mr-bean-animated-series/2a2936a0053',
|
'url': 'https://www.itv.com/hub/liar/2a4547a0012',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '2a2936a0053',
|
'id': '2a4547a0012',
|
||||||
'ext': 'flv',
|
'ext': 'mp4',
|
||||||
'title': 'Home Movie',
|
'title': 'Liar - Series 2 - Episode 6',
|
||||||
|
'description': 'md5:d0f91536569dec79ea184f0a44cca089',
|
||||||
|
'series': 'Liar',
|
||||||
|
'season_number': 2,
|
||||||
|
'episode_number': 6,
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
# rtmp download
|
# m3u8 download
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
@ -61,220 +57,97 @@ class ITVIE(InfoExtractor):
|
|||||||
params = extract_attributes(self._search_regex(
|
params = extract_attributes(self._search_regex(
|
||||||
r'(?s)(<[^>]+id="video"[^>]*>)', webpage, 'params'))
|
r'(?s)(<[^>]+id="video"[^>]*>)', webpage, 'params'))
|
||||||
|
|
||||||
ns_map = {
|
ios_playlist_url = params.get('data-video-playlist') or params['data-video-id']
|
||||||
'soapenv': 'http://schemas.xmlsoap.org/soap/envelope/',
|
hmac = params['data-video-hmac']
|
||||||
'tem': 'http://tempuri.org/',
|
|
||||||
'itv': 'http://schemas.datacontract.org/2004/07/Itv.BB.Mercury.Common.Types',
|
|
||||||
'com': 'http://schemas.itv.com/2009/05/Common',
|
|
||||||
}
|
|
||||||
for ns, full_ns in ns_map.items():
|
|
||||||
compat_etree_register_namespace(ns, full_ns)
|
|
||||||
|
|
||||||
def _add_ns(name):
|
|
||||||
return xpath_with_ns(name, ns_map)
|
|
||||||
|
|
||||||
def _add_sub_element(element, name):
|
|
||||||
return etree.SubElement(element, _add_ns(name))
|
|
||||||
|
|
||||||
production_id = (
|
|
||||||
params.get('data-video-autoplay-id')
|
|
||||||
or '%s#001' % (
|
|
||||||
params.get('data-video-episode-id')
|
|
||||||
or video_id.replace('a', '/')))
|
|
||||||
|
|
||||||
req_env = etree.Element(_add_ns('soapenv:Envelope'))
|
|
||||||
_add_sub_element(req_env, 'soapenv:Header')
|
|
||||||
body = _add_sub_element(req_env, 'soapenv:Body')
|
|
||||||
get_playlist = _add_sub_element(body, ('tem:GetPlaylist'))
|
|
||||||
request = _add_sub_element(get_playlist, 'tem:request')
|
|
||||||
_add_sub_element(request, 'itv:ProductionId').text = production_id
|
|
||||||
_add_sub_element(request, 'itv:RequestGuid').text = compat_str(uuid.uuid4()).upper()
|
|
||||||
vodcrid = _add_sub_element(request, 'itv:Vodcrid')
|
|
||||||
_add_sub_element(vodcrid, 'com:Id')
|
|
||||||
_add_sub_element(request, 'itv:Partition')
|
|
||||||
user_info = _add_sub_element(get_playlist, 'tem:userInfo')
|
|
||||||
_add_sub_element(user_info, 'itv:Broadcaster').text = 'Itv'
|
|
||||||
_add_sub_element(user_info, 'itv:DM')
|
|
||||||
_add_sub_element(user_info, 'itv:RevenueScienceValue')
|
|
||||||
_add_sub_element(user_info, 'itv:SessionId')
|
|
||||||
_add_sub_element(user_info, 'itv:SsoToken')
|
|
||||||
_add_sub_element(user_info, 'itv:UserToken')
|
|
||||||
site_info = _add_sub_element(get_playlist, 'tem:siteInfo')
|
|
||||||
_add_sub_element(site_info, 'itv:AdvertisingRestriction').text = 'None'
|
|
||||||
_add_sub_element(site_info, 'itv:AdvertisingSite').text = 'ITV'
|
|
||||||
_add_sub_element(site_info, 'itv:AdvertisingType').text = 'Any'
|
|
||||||
_add_sub_element(site_info, 'itv:Area').text = 'ITVPLAYER.VIDEO'
|
|
||||||
_add_sub_element(site_info, 'itv:Category')
|
|
||||||
_add_sub_element(site_info, 'itv:Platform').text = 'DotCom'
|
|
||||||
_add_sub_element(site_info, 'itv:Site').text = 'ItvCom'
|
|
||||||
device_info = _add_sub_element(get_playlist, 'tem:deviceInfo')
|
|
||||||
_add_sub_element(device_info, 'itv:ScreenSize').text = 'Big'
|
|
||||||
player_info = _add_sub_element(get_playlist, 'tem:playerInfo')
|
|
||||||
_add_sub_element(player_info, 'itv:Version').text = '2'
|
|
||||||
|
|
||||||
headers = self.geo_verification_headers()
|
headers = self.geo_verification_headers()
|
||||||
headers.update({
|
headers.update({
|
||||||
'Content-Type': 'text/xml; charset=utf-8',
|
'Accept': 'application/vnd.itv.vod.playlist.v2+json',
|
||||||
'SOAPAction': 'http://tempuri.org/PlaylistService/GetPlaylist',
|
'Content-Type': 'application/json',
|
||||||
|
'hmac': hmac.upper(),
|
||||||
})
|
})
|
||||||
|
ios_playlist = self._download_json(
|
||||||
|
ios_playlist_url, video_id, data=json.dumps({
|
||||||
|
'user': {
|
||||||
|
'itvUserId': '',
|
||||||
|
'entitlements': [],
|
||||||
|
'token': ''
|
||||||
|
},
|
||||||
|
'device': {
|
||||||
|
'manufacturer': 'Safari',
|
||||||
|
'model': '5',
|
||||||
|
'os': {
|
||||||
|
'name': 'Windows NT',
|
||||||
|
'version': '6.1',
|
||||||
|
'type': 'desktop'
|
||||||
|
}
|
||||||
|
},
|
||||||
|
'client': {
|
||||||
|
'version': '4.1',
|
||||||
|
'id': 'browser'
|
||||||
|
},
|
||||||
|
'variantAvailability': {
|
||||||
|
'featureset': {
|
||||||
|
'min': ['hls', 'aes', 'outband-webvtt'],
|
||||||
|
'max': ['hls', 'aes', 'outband-webvtt']
|
||||||
|
},
|
||||||
|
'platformTag': 'dotcom'
|
||||||
|
}
|
||||||
|
}).encode(), headers=headers)
|
||||||
|
video_data = ios_playlist['Playlist']['Video']
|
||||||
|
ios_base_url = video_data.get('Base')
|
||||||
|
|
||||||
info = self._search_json_ld(webpage, video_id, default={})
|
|
||||||
formats = []
|
formats = []
|
||||||
subtitles = {}
|
for media_file in (video_data.get('MediaFiles') or []):
|
||||||
|
href = media_file.get('Href')
|
||||||
def extract_subtitle(sub_url):
|
if not href:
|
||||||
ext = determine_ext(sub_url, 'ttml')
|
continue
|
||||||
subtitles.setdefault('en', []).append({
|
if ios_base_url:
|
||||||
'url': sub_url,
|
href = ios_base_url + href
|
||||||
'ext': 'ttml' if ext == 'xml' else ext,
|
ext = determine_ext(href)
|
||||||
})
|
if ext == 'm3u8':
|
||||||
|
formats.extend(self._extract_m3u8_formats(
|
||||||
resp_env = self._download_xml(
|
href, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||||
params['data-playlist-url'], video_id,
|
m3u8_id='hls', fatal=False))
|
||||||
headers=headers, data=etree.tostring(req_env), fatal=False)
|
|
||||||
if resp_env:
|
|
||||||
playlist = xpath_element(resp_env, './/Playlist')
|
|
||||||
if playlist is None:
|
|
||||||
fault_code = xpath_text(resp_env, './/faultcode')
|
|
||||||
fault_string = xpath_text(resp_env, './/faultstring')
|
|
||||||
if fault_code == 'InvalidGeoRegion':
|
|
||||||
self.raise_geo_restricted(
|
|
||||||
msg=fault_string, countries=self._GEO_COUNTRIES)
|
|
||||||
elif fault_code not in (
|
|
||||||
'InvalidEntity', 'InvalidVodcrid', 'ContentUnavailable'):
|
|
||||||
raise ExtractorError(
|
|
||||||
'%s said: %s' % (self.IE_NAME, fault_string), expected=True)
|
|
||||||
info.update({
|
|
||||||
'title': self._og_search_title(webpage),
|
|
||||||
'episode_title': params.get('data-video-episode'),
|
|
||||||
'series': params.get('data-video-title'),
|
|
||||||
})
|
|
||||||
else:
|
else:
|
||||||
title = xpath_text(playlist, 'EpisodeTitle', default=None)
|
formats.append({
|
||||||
info.update({
|
'url': href,
|
||||||
'title': title,
|
|
||||||
'episode_title': title,
|
|
||||||
'episode_number': int_or_none(xpath_text(playlist, 'EpisodeNumber')),
|
|
||||||
'series': xpath_text(playlist, 'ProgrammeTitle'),
|
|
||||||
'duration': parse_duration(xpath_text(playlist, 'Duration')),
|
|
||||||
})
|
})
|
||||||
video_element = xpath_element(playlist, 'VideoEntries/Video', fatal=True)
|
|
||||||
media_files = xpath_element(video_element, 'MediaFiles', fatal=True)
|
|
||||||
rtmp_url = media_files.attrib['base']
|
|
||||||
|
|
||||||
for media_file in media_files.findall('MediaFile'):
|
|
||||||
play_path = xpath_text(media_file, 'URL')
|
|
||||||
if not play_path:
|
|
||||||
continue
|
|
||||||
tbr = int_or_none(media_file.get('bitrate'), 1000)
|
|
||||||
f = {
|
|
||||||
'format_id': 'rtmp' + ('-%d' % tbr if tbr else ''),
|
|
||||||
'play_path': play_path,
|
|
||||||
# Providing this swfVfy allows to avoid truncated downloads
|
|
||||||
'player_url': 'http://www.itv.com/mercury/Mercury_VideoPlayer.swf',
|
|
||||||
'page_url': url,
|
|
||||||
'tbr': tbr,
|
|
||||||
'ext': 'flv',
|
|
||||||
}
|
|
||||||
app = self._search_regex(
|
|
||||||
'rtmpe?://[^/]+/(.+)$', rtmp_url, 'app', default=None)
|
|
||||||
if app:
|
|
||||||
f.update({
|
|
||||||
'url': rtmp_url.split('?', 1)[0],
|
|
||||||
'app': app,
|
|
||||||
})
|
|
||||||
else:
|
|
||||||
f['url'] = rtmp_url
|
|
||||||
formats.append(f)
|
|
||||||
|
|
||||||
for caption_url in video_element.findall('ClosedCaptioningURIs/URL'):
|
|
||||||
if caption_url.text:
|
|
||||||
extract_subtitle(caption_url.text)
|
|
||||||
|
|
||||||
ios_playlist_url = params.get('data-video-playlist') or params.get('data-video-id')
|
|
||||||
hmac = params.get('data-video-hmac')
|
|
||||||
if ios_playlist_url and hmac and re.match(r'https?://', ios_playlist_url):
|
|
||||||
headers = self.geo_verification_headers()
|
|
||||||
headers.update({
|
|
||||||
'Accept': 'application/vnd.itv.vod.playlist.v2+json',
|
|
||||||
'Content-Type': 'application/json',
|
|
||||||
'hmac': hmac.upper(),
|
|
||||||
})
|
|
||||||
ios_playlist = self._download_json(
|
|
||||||
ios_playlist_url, video_id, data=json.dumps({
|
|
||||||
'user': {
|
|
||||||
'itvUserId': '',
|
|
||||||
'entitlements': [],
|
|
||||||
'token': ''
|
|
||||||
},
|
|
||||||
'device': {
|
|
||||||
'manufacturer': 'Safari',
|
|
||||||
'model': '5',
|
|
||||||
'os': {
|
|
||||||
'name': 'Windows NT',
|
|
||||||
'version': '6.1',
|
|
||||||
'type': 'desktop'
|
|
||||||
}
|
|
||||||
},
|
|
||||||
'client': {
|
|
||||||
'version': '4.1',
|
|
||||||
'id': 'browser'
|
|
||||||
},
|
|
||||||
'variantAvailability': {
|
|
||||||
'featureset': {
|
|
||||||
'min': ['hls', 'aes', 'outband-webvtt'],
|
|
||||||
'max': ['hls', 'aes', 'outband-webvtt']
|
|
||||||
},
|
|
||||||
'platformTag': 'dotcom'
|
|
||||||
}
|
|
||||||
}).encode(), headers=headers, fatal=False)
|
|
||||||
if ios_playlist:
|
|
||||||
video_data = ios_playlist.get('Playlist', {}).get('Video', {})
|
|
||||||
ios_base_url = video_data.get('Base')
|
|
||||||
for media_file in video_data.get('MediaFiles', []):
|
|
||||||
href = media_file.get('Href')
|
|
||||||
if not href:
|
|
||||||
continue
|
|
||||||
if ios_base_url:
|
|
||||||
href = ios_base_url + href
|
|
||||||
ext = determine_ext(href)
|
|
||||||
if ext == 'm3u8':
|
|
||||||
formats.extend(self._extract_m3u8_formats(
|
|
||||||
href, video_id, 'mp4', entry_protocol='m3u8_native',
|
|
||||||
m3u8_id='hls', fatal=False))
|
|
||||||
else:
|
|
||||||
formats.append({
|
|
||||||
'url': href,
|
|
||||||
})
|
|
||||||
subs = video_data.get('Subtitles')
|
|
||||||
if isinstance(subs, list):
|
|
||||||
for sub in subs:
|
|
||||||
if not isinstance(sub, dict):
|
|
||||||
continue
|
|
||||||
href = url_or_none(sub.get('Href'))
|
|
||||||
if href:
|
|
||||||
extract_subtitle(href)
|
|
||||||
if not info.get('duration'):
|
|
||||||
info['duration'] = parse_duration(video_data.get('Duration'))
|
|
||||||
|
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
info.update({
|
subtitles = {}
|
||||||
|
subs = video_data.get('Subtitles') or []
|
||||||
|
for sub in subs:
|
||||||
|
if not isinstance(sub, dict):
|
||||||
|
continue
|
||||||
|
href = url_or_none(sub.get('Href'))
|
||||||
|
if not href:
|
||||||
|
continue
|
||||||
|
subtitles.setdefault('en', []).append({
|
||||||
|
'url': href,
|
||||||
|
'ext': determine_ext(href, 'vtt'),
|
||||||
|
})
|
||||||
|
|
||||||
|
info = self._search_json_ld(webpage, video_id, default={})
|
||||||
|
if not info:
|
||||||
|
json_ld = self._parse_json(self._search_regex(
|
||||||
|
JSON_LD_RE, webpage, 'JSON-LD', '{}',
|
||||||
|
group='json_ld'), video_id, fatal=False)
|
||||||
|
if json_ld and json_ld.get('@type') == 'BreadcrumbList':
|
||||||
|
for ile in (json_ld.get('itemListElement:') or []):
|
||||||
|
item = ile.get('item:') or {}
|
||||||
|
if item.get('@type') == 'TVEpisode':
|
||||||
|
item['@context'] = 'http://schema.org'
|
||||||
|
info = self._json_ld(item, video_id, fatal=False) or {}
|
||||||
|
break
|
||||||
|
|
||||||
|
return merge_dicts({
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
|
'title': self._html_search_meta(['og:title', 'twitter:title'], webpage),
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
'subtitles': subtitles,
|
'subtitles': subtitles,
|
||||||
})
|
'duration': parse_duration(video_data.get('Duration')),
|
||||||
|
'description': clean_html(get_element_by_class('episode-info__synopsis', webpage)),
|
||||||
webpage_info = self._search_json_ld(webpage, video_id, default={})
|
}, info)
|
||||||
if not webpage_info.get('title'):
|
|
||||||
webpage_info['title'] = self._html_search_regex(
|
|
||||||
r'(?s)<h\d+[^>]+\bclass=["\'][^>]*episode-title["\'][^>]*>([^<]+)<',
|
|
||||||
webpage, 'title', default=None) or self._og_search_title(
|
|
||||||
webpage, default=None) or self._html_search_meta(
|
|
||||||
'twitter:title', webpage, 'title',
|
|
||||||
default=None) or webpage_info['episode']
|
|
||||||
|
|
||||||
return merge_dicts(info, webpage_info)
|
|
||||||
|
|
||||||
|
|
||||||
class ITVBTCCIE(InfoExtractor):
|
class ITVBTCCIE(InfoExtractor):
|
||||||
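Note on the ITV hunk: the new return statement passes the page-derived fields first and the JSON-LD `info` second to `merge_dicts`. The sketch below illustrates that precedence under the assumption that earlier dictionaries win and `None` values are skipped. `merge_dicts_sketch` and the sample values are hypothetical stand-ins, not the project's implementation.

```python
def merge_dicts_sketch(*dicts):
    """Simplified, hypothetical stand-in for a merge_dicts-style helper.

    Assumption: the first non-None value seen for a key is kept.
    """
    merged = {}
    for d in dicts:
        for key, value in d.items():
            if value is None:
                continue  # a missing value must not mask a later fallback
            if key not in merged:
                merged[key] = value
    return merged


# Made-up sample data: fields scraped from the page vs. fields from JSON-LD.
page_fields = {'id': 'abc', 'title': None, 'duration': 1468.0}
json_ld_info = {'title': 'Episode 1', 'duration': 1500.0, 'series': 'Some Series'}

result = merge_dicts_sketch(page_fields, json_ld_info)
```

With this ordering the page-level `duration` is kept, while `title` and `series` fall back to the JSON-LD values.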
|
@ -3,10 +3,13 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..compat import compat_str
|
from ..compat import compat_HTTPError
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
|
ExtractorError,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
|
str_or_none,
|
||||||
strip_or_none,
|
strip_or_none,
|
||||||
|
try_get,
|
||||||
unified_timestamp,
|
unified_timestamp,
|
||||||
update_url_query,
|
update_url_query,
|
||||||
)
|
)
|
||||||
@ -23,7 +26,7 @@ class KakaoIE(InfoExtractor):
|
|||||||
'id': '301965083',
|
'id': '301965083',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': '乃木坂46 バナナマン 「3期生紹介コーナーが始動!顔高低差GPも!」 『乃木坂工事中』',
|
'title': '乃木坂46 バナナマン 「3期生紹介コーナーが始動!顔高低差GPも!」 『乃木坂工事中』',
|
||||||
'uploader_id': 2671005,
|
'uploader_id': '2671005',
|
||||||
'uploader': '그랑그랑이',
|
'uploader': '그랑그랑이',
|
||||||
'timestamp': 1488160199,
|
'timestamp': 1488160199,
|
||||||
'upload_date': '20170227',
|
'upload_date': '20170227',
|
||||||
@ -36,11 +39,15 @@ class KakaoIE(InfoExtractor):
|
|||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'description': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)\r\n\r\n[쇼! 음악중심] 20160611, 507회',
|
'description': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)\r\n\r\n[쇼! 음악중심] 20160611, 507회',
|
||||||
'title': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)',
|
'title': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)',
|
||||||
'uploader_id': 2653210,
|
'uploader_id': '2653210',
|
||||||
'uploader': '쇼! 음악중심',
|
'uploader': '쇼! 음악중심',
|
||||||
'timestamp': 1485684628,
|
'timestamp': 1485684628,
|
||||||
'upload_date': '20170129',
|
'upload_date': '20170129',
|
||||||
}
|
}
|
||||||
|
}, {
|
||||||
|
# geo restricted
|
||||||
|
'url': 'https://tv.kakao.com/channel/3643855/cliplink/412069491',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
@ -68,8 +75,7 @@ class KakaoIE(InfoExtractor):
|
|||||||
'fields': ','.join([
|
'fields': ','.join([
|
||||||
'-*', 'tid', 'clipLink', 'displayTitle', 'clip', 'title',
|
'-*', 'tid', 'clipLink', 'displayTitle', 'clip', 'title',
|
||||||
'description', 'channelId', 'createTime', 'duration', 'playCount',
|
'description', 'channelId', 'createTime', 'duration', 'playCount',
|
||||||
'likeCount', 'commentCount', 'tagList', 'channel', 'name',
|
'likeCount', 'commentCount', 'tagList', 'channel', 'name', 'thumbnailUrl',
|
||||||
'clipChapterThumbnailList', 'thumbnailUrl', 'timeInSec', 'isDefault',
|
|
||||||
'videoOutputList', 'width', 'height', 'kbps', 'profile', 'label'])
|
'videoOutputList', 'width', 'height', 'kbps', 'profile', 'label'])
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -82,24 +88,28 @@ class KakaoIE(InfoExtractor):
|
|||||||
|
|
||||||
title = clip.get('title') or clip_link.get('displayTitle')
|
title = clip.get('title') or clip_link.get('displayTitle')
|
||||||
|
|
||||||
query['tid'] = impress.get('tid', '')
|
query.update({
|
||||||
|
'fields': '-*,code,message,url',
|
||||||
|
'tid': impress.get('tid') or '',
|
||||||
|
})
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
for fmt in clip.get('videoOutputList', []):
|
for fmt in (clip.get('videoOutputList') or []):
|
||||||
try:
|
try:
|
||||||
profile_name = fmt['profile']
|
profile_name = fmt['profile']
|
||||||
if profile_name == 'AUDIO':
|
if profile_name == 'AUDIO':
|
||||||
continue
|
continue
|
||||||
query.update({
|
query['profile'] = profile_name
|
||||||
'profile': profile_name,
|
try:
|
||||||
'fields': '-*,url',
|
fmt_url_json = self._download_json(
|
||||||
})
|
api_base + 'raw/videolocation', display_id,
|
||||||
fmt_url_json = self._download_json(
|
'Downloading video URL for profile %s' % profile_name,
|
||||||
api_base + 'raw/videolocation', display_id,
|
query=query, headers=player_header)
|
||||||
'Downloading video URL for profile %s' % profile_name,
|
except ExtractorError as e:
|
||||||
query=query, headers=player_header, fatal=False)
|
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
|
||||||
|
resp = self._parse_json(e.cause.read().decode(), video_id)
|
||||||
if fmt_url_json is None:
|
if resp.get('code') == 'GeoBlocked':
|
||||||
|
self.raise_geo_restricted()
|
||||||
continue
|
continue
|
||||||
|
|
||||||
fmt_url = fmt_url_json['url']
|
fmt_url = fmt_url_json['url']
|
||||||
@ -116,27 +126,13 @@ class KakaoIE(InfoExtractor):
|
|||||||
pass
|
pass
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
thumbs = []
|
|
||||||
for thumb in clip.get('clipChapterThumbnailList', []):
|
|
||||||
thumbs.append({
|
|
||||||
'url': thumb.get('thumbnailUrl'),
|
|
||||||
'id': compat_str(thumb.get('timeInSec')),
|
|
||||||
'preference': -1 if thumb.get('isDefault') else 0
|
|
||||||
})
|
|
||||||
top_thumbnail = clip.get('thumbnailUrl')
|
|
||||||
if top_thumbnail:
|
|
||||||
thumbs.append({
|
|
||||||
'url': top_thumbnail,
|
|
||||||
'preference': 10,
|
|
||||||
})
|
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': display_id,
|
'id': display_id,
|
||||||
'title': title,
|
'title': title,
|
||||||
'description': strip_or_none(clip.get('description')),
|
'description': strip_or_none(clip.get('description')),
|
||||||
'uploader': clip_link.get('channel', {}).get('name'),
|
'uploader': try_get(clip_link, lambda x: x['channel']['name']),
|
||||||
'uploader_id': clip_link.get('channelId'),
|
'uploader_id': str_or_none(clip_link.get('channelId')),
|
||||||
'thumbnails': thumbs,
|
'thumbnail': clip.get('thumbnailUrl'),
|
||||||
'timestamp': unified_timestamp(clip_link.get('createTime')),
|
'timestamp': unified_timestamp(clip_link.get('createTime')),
|
||||||
'duration': int_or_none(clip.get('duration')),
|
'duration': int_or_none(clip.get('duration')),
|
||||||
'view_count': int_or_none(clip.get('playCount')),
|
'view_count': int_or_none(clip.get('playCount')),
|
||||||
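Note on the Kakao hunk: the change replaces the chained `clip_link.get('channel', {}).get('name')` with a `try_get` lookup and wraps `channelId` in `str_or_none`. A minimal sketch of why that is more robust follows. `try_get_sketch` and `str_or_none_sketch` are hypothetical simplified stand-ins written for illustration, and the sample dictionary is made up.

```python
def try_get_sketch(src, getter, expected_type=None):
    """Hypothetical stand-in: run getter(src), return None on lookup errors."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


def str_or_none_sketch(value):
    """Hypothetical stand-in: stringify a value but keep None as None."""
    return None if value is None else str(value)


# 'channel' is present but null; a chained .get(...).get(...) would raise here.
clip_link = {'channelId': 2671005, 'channel': None}

uploader = try_get_sketch(clip_link, lambda x: x['channel']['name'])
uploader_id = str_or_none_sketch(clip_link.get('channelId'))
```

The numeric channel id becomes a string, which matches the updated test expectations (`'uploader_id': '2671005'`).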
|
@ -1,97 +0,0 @@
|
|||||||
# coding: utf-8
|
|
||||||
from __future__ import unicode_literals
|
|
||||||
|
|
||||||
import re
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
|
||||||
from ..utils import (
|
|
||||||
ExtractorError,
|
|
||||||
float_or_none,
|
|
||||||
srt_subtitles_timecode,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class KanalPlayIE(InfoExtractor):
|
|
||||||
IE_DESC = 'Kanal 5/9/11 Play'
|
|
||||||
_VALID_URL = r'https?://(?:www\.)?kanal(?P<channel_id>5|9|11)play\.se/(?:#!/)?(?:play/)?program/\d+/video/(?P<id>\d+)'
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'http://www.kanal5play.se/#!/play/program/3060212363/video/3270012277',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '3270012277',
|
|
||||||
'ext': 'flv',
|
|
||||||
'title': 'Saknar både dusch och avlopp',
|
|
||||||
'description': 'md5:6023a95832a06059832ae93bc3c7efb7',
|
|
||||||
'duration': 2636.36,
|
|
||||||
},
|
|
||||||
'params': {
|
|
||||||
# rtmp download
|
|
||||||
'skip_download': True,
|
|
||||||
}
|
|
||||||
}, {
|
|
||||||
'url': 'http://www.kanal9play.se/#!/play/program/335032/video/246042',
|
|
||||||
'only_matching': True,
|
|
||||||
}, {
|
|
||||||
'url': 'http://www.kanal11play.se/#!/play/program/232835958/video/367135199',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _fix_subtitles(self, subs):
|
|
||||||
return '\r\n\r\n'.join(
|
|
||||||
'%s\r\n%s --> %s\r\n%s'
|
|
||||||
% (
|
|
||||||
num,
|
|
||||||
srt_subtitles_timecode(item['startMillis'] / 1000.0),
|
|
||||||
srt_subtitles_timecode(item['endMillis'] / 1000.0),
|
|
||||||
item['text'],
|
|
||||||
) for num, item in enumerate(subs, 1))
|
|
||||||
|
|
||||||
def _get_subtitles(self, channel_id, video_id):
|
|
||||||
subs = self._download_json(
|
|
||||||
'http://www.kanal%splay.se/api/subtitles/%s' % (channel_id, video_id),
|
|
||||||
video_id, 'Downloading subtitles JSON', fatal=False)
|
|
||||||
return {'sv': [{'ext': 'srt', 'data': self._fix_subtitles(subs)}]} if subs else {}
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
mobj = re.match(self._VALID_URL, url)
|
|
||||||
video_id = mobj.group('id')
|
|
||||||
channel_id = mobj.group('channel_id')
|
|
||||||
|
|
||||||
video = self._download_json(
|
|
||||||
'http://www.kanal%splay.se/api/getVideo?format=FLASH&videoId=%s' % (channel_id, video_id),
|
|
||||||
video_id)
|
|
||||||
|
|
||||||
reasons_for_no_streams = video.get('reasonsForNoStreams')
|
|
||||||
if reasons_for_no_streams:
|
|
||||||
raise ExtractorError(
|
|
||||||
'%s returned error: %s' % (self.IE_NAME, '\n'.join(reasons_for_no_streams)),
|
|
||||||
expected=True)
|
|
||||||
|
|
||||||
title = video['title']
|
|
||||||
description = video.get('description')
|
|
||||||
duration = float_or_none(video.get('length'), 1000)
|
|
||||||
thumbnail = video.get('posterUrl')
|
|
||||||
|
|
||||||
stream_base_url = video['streamBaseUrl']
|
|
||||||
|
|
||||||
formats = [{
|
|
||||||
'url': stream_base_url,
|
|
||||||
'play_path': stream['source'],
|
|
||||||
'ext': 'flv',
|
|
||||||
'tbr': float_or_none(stream.get('bitrate'), 1000),
|
|
||||||
'rtmp_real_time': True,
|
|
||||||
} for stream in video['streams']]
|
|
||||||
self._sort_formats(formats)
|
|
||||||
|
|
||||||
subtitles = {}
|
|
||||||
if video.get('hasSubtitle'):
|
|
||||||
subtitles = self.extract_subtitles(channel_id, video_id)
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': video_id,
|
|
||||||
'title': title,
|
|
||||||
'description': description,
|
|
||||||
'thumbnail': thumbnail,
|
|
||||||
'duration': duration,
|
|
||||||
'formats': formats,
|
|
||||||
'subtitles': subtitles,
|
|
||||||
}
|
|
@ -2,92 +2,71 @@ from __future__ import unicode_literals
|
|||||||
|
|
||||||
from .canvas import CanvasIE
|
from .canvas import CanvasIE
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
|
from ..compat import compat_urllib_parse_unquote
|
||||||
|
from ..utils import (
|
||||||
|
int_or_none,
|
||||||
|
parse_iso8601,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
class KetnetIE(InfoExtractor):
|
class KetnetIE(InfoExtractor):
|
||||||
_VALID_URL = r'https?://(?:www\.)?ketnet\.be/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
_VALID_URL = r'https?://(?:www\.)?ketnet\.be/(?P<id>(?:[^/]+/)*[^/?#&]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://www.ketnet.be/kijken/zomerse-filmpjes',
|
'url': 'https://www.ketnet.be/kijken/n/nachtwacht/3/nachtwacht-s3a1-de-greystook',
|
||||||
'md5': '6bdeb65998930251bbd1c510750edba9',
|
'md5': '37b2b7bb9b3dcaa05b67058dc3a714a9',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'zomerse-filmpjes',
|
'id': 'pbs-pub-aef8b526-115e-4006-aa24-e59ff6c6ef6f$vid-ddb815bf-c8e7-467b-8879-6bad7a32cebd',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Gluur mee op de filmset en op Pennenzakkenrock',
|
'title': 'Nachtwacht - Reeks 3: Aflevering 1',
|
||||||
'description': 'Gluur mee met Ghost Rockers op de filmset',
|
'description': 'De Nachtwacht krijgt te maken met een parasiet',
|
||||||
'thumbnail': r're:^https?://.*\.jpg$',
|
'thumbnail': r're:^https?://.*\.jpg$',
|
||||||
}
|
'duration': 1468.02,
|
||||||
}, {
|
'timestamp': 1609225200,
|
||||||
# mzid in playerConfig instead of sources
|
'upload_date': '20201229',
|
||||||
'url': 'https://www.ketnet.be/kijken/nachtwacht/de-greystook',
|
'series': 'Nachtwacht',
|
||||||
'md5': '90139b746a0a9bd7bb631283f6e2a64e',
|
'season': 'Reeks 3',
|
||||||
'info_dict': {
|
'episode': 'De Greystook',
|
||||||
'id': 'md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
|
'episode_number': 1,
|
||||||
'display_id': 'md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
|
|
||||||
'ext': 'flv',
|
|
||||||
'title': 'Nachtwacht: De Greystook',
|
|
||||||
'description': 'md5:1db3f5dc4c7109c821261e7512975be7',
|
|
||||||
'thumbnail': r're:^https?://.*\.jpg$',
|
|
||||||
'duration': 1468.03,
|
|
||||||
},
|
},
|
||||||
'expected_warnings': ['is not a supported codec', 'Unknown MIME type'],
|
'expected_warnings': ['is not a supported codec', 'Unknown MIME type'],
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.ketnet.be/kijken/karrewiet/uitzending-8-september-2016',
|
'url': 'https://www.ketnet.be/themas/karrewiet/jaaroverzicht-20200/karrewiet-het-jaar-van-black-mamba',
|
||||||
'only_matching': True,
|
|
||||||
}, {
|
|
||||||
'url': 'https://www.ketnet.be/achter-de-schermen/sien-repeteert-voor-stars-for-life',
|
|
||||||
'only_matching': True,
|
|
||||||
}, {
|
|
||||||
# mzsource, geo restricted to Belgium
|
|
||||||
'url': 'https://www.ketnet.be/kijken/nachtwacht/de-bermadoe',
|
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
display_id = self._match_id(url)
|
||||||
|
|
||||||
webpage = self._download_webpage(url, video_id)
|
video = self._download_json(
|
||||||
|
'https://senior-bff.ketnet.be/graphql', display_id, query={
|
||||||
|
'query': '''{
|
||||||
|
video(id: "content/ketnet/nl/%s.model.json") {
|
||||||
|
description
|
||||||
|
episodeNr
|
||||||
|
imageUrl
|
||||||
|
mediaReference
|
||||||
|
programTitle
|
||||||
|
publicationDate
|
||||||
|
seasonTitle
|
||||||
|
subtitleVideodetail
|
||||||
|
titleVideodetail
|
||||||
|
}
|
||||||
|
}''' % display_id,
|
||||||
|
})['data']['video']
|
||||||
|
|
||||||
config = self._parse_json(
|
mz_id = compat_urllib_parse_unquote(video['mediaReference'])
|
||||||
self._search_regex(
|
|
||||||
r'(?s)playerConfig\s*=\s*({.+?})\s*;', webpage,
|
|
||||||
'player config'),
|
|
||||||
video_id)
|
|
||||||
|
|
||||||
mzid = config.get('mzid')
|
|
||||||
if mzid:
|
|
||||||
return self.url_result(
|
|
||||||
'https://mediazone.vrt.be/api/v1/ketnet/assets/%s' % mzid,
|
|
||||||
CanvasIE.ie_key(), video_id=mzid)
|
|
||||||
|
|
||||||
title = config['title']
|
|
||||||
|
|
||||||
formats = []
|
|
||||||
for source_key in ('', 'mz'):
|
|
||||||
source = config.get('%ssource' % source_key)
|
|
||||||
if not isinstance(source, dict):
|
|
||||||
continue
|
|
||||||
for format_id, format_url in source.items():
|
|
||||||
if format_id == 'hls':
|
|
||||||
formats.extend(self._extract_m3u8_formats(
|
|
||||||
format_url, video_id, 'mp4',
|
|
||||||
entry_protocol='m3u8_native', m3u8_id=format_id,
|
|
||||||
fatal=False))
|
|
||||||
elif format_id == 'hds':
|
|
||||||
formats.extend(self._extract_f4m_formats(
|
|
||||||
format_url, video_id, f4m_id=format_id, fatal=False))
|
|
||||||
else:
|
|
||||||
formats.append({
|
|
||||||
'url': format_url,
|
|
||||||
'format_id': format_id,
|
|
||||||
})
|
|
||||||
self._sort_formats(formats)
|
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'_type': 'url_transparent',
|
||||||
'title': title,
|
'id': mz_id,
|
||||||
'description': config.get('description'),
|
'title': video['titleVideodetail'],
|
||||||
'thumbnail': config.get('image'),
|
'url': 'https://mediazone.vrt.be/api/v1/ketnet/assets/' + mz_id,
|
||||||
'series': config.get('program'),
|
'thumbnail': video.get('imageUrl'),
|
||||||
'episode': config.get('episode'),
|
'description': video.get('description'),
|
||||||
'formats': formats,
|
'timestamp': parse_iso8601(video.get('publicationDate')),
|
||||||
|
'series': video.get('programTitle'),
|
||||||
|
'season': video.get('seasonTitle'),
|
||||||
|
'episode': video.get('subtitleVideodetail'),
|
||||||
|
'episode_number': int_or_none(video.get('episodeNr')),
|
||||||
|
'ie_key': CanvasIE.ie_key(),
|
||||||
}
|
}
|
||||||
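Note on the Ketnet hunk: the new extractor interpolates the URL path into a GraphQL query and percent-decodes `mediaReference` before building the asset URL. The sketch below shows both steps with the standard library. It assumes `mediaReference` arrives percent-encoded; the helper names and the sample reference are hypothetical, and the query is shortened to two fields.

```python
from urllib.parse import unquote


def build_video_query(display_id):
    """Interpolate the page path into a (shortened) GraphQL query string."""
    return '''{
  video(id: "content/ketnet/nl/%s.model.json") {
    mediaReference
    titleVideodetail
  }
}''' % display_id


def asset_url(media_reference):
    """Percent-decode the reference and append it to the asset endpoint."""
    return 'https://mediazone.vrt.be/api/v1/ketnet/assets/' + unquote(media_reference)


query = build_video_query('kijken/n/nachtwacht/3/nachtwacht-s3a1-de-greystook')
# Hypothetical sample value: '%24' decodes to '$'.
url = asset_url('pbs-pub-aef8%24vid-ddb8')
```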
|
@ -1,82 +1,107 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
import re
|
import json
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
unified_strdate,
|
int_or_none,
|
||||||
|
parse_iso8601,
|
||||||
|
try_get,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class KhanAcademyIE(InfoExtractor):
|
class KhanAcademyBaseIE(InfoExtractor):
|
||||||
_VALID_URL = r'^https?://(?:(?:www|api)\.)?khanacademy\.org/(?P<key>[^/]+)/(?:[^/]+/){,2}(?P<id>[^?#/]+)(?:$|[?#])'
|
_VALID_URL_TEMPL = r'https?://(?:www\.)?khanacademy\.org/(?P<id>(?:[^/]+/){%s}%s[^?#/&]+)'
|
||||||
IE_NAME = 'KhanAcademy'
|
|
||||||
|
|
||||||
_TESTS = [{
|
def _parse_video(self, video):
|
||||||
'url': 'http://www.khanacademy.org/video/one-time-pad',
|
return {
|
||||||
'md5': '7b391cce85e758fb94f763ddc1bbb979',
|
'_type': 'url_transparent',
|
||||||
|
'url': video['youtubeId'],
|
||||||
|
'id': video.get('slug'),
|
||||||
|
'title': video.get('title'),
|
||||||
|
'thumbnail': video.get('imageUrl') or video.get('thumbnailUrl'),
|
||||||
|
'duration': int_or_none(video.get('duration')),
|
||||||
|
'description': video.get('description'),
|
||||||
|
'ie_key': 'Youtube',
|
||||||
|
}
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
display_id = self._match_id(url)
|
||||||
|
component_props = self._parse_json(self._download_json(
|
||||||
|
'https://www.khanacademy.org/api/internal/graphql',
|
||||||
|
display_id, query={
|
||||||
|
'hash': 1604303425,
|
||||||
|
'variables': json.dumps({
|
||||||
|
'path': display_id,
|
||||||
|
'queryParams': '',
|
||||||
|
}),
|
||||||
|
})['data']['contentJson'], display_id)['componentProps']
|
||||||
|
return self._parse_component_props(component_props)
|
||||||
|
|
||||||
|
|
||||||
|
class KhanAcademyIE(KhanAcademyBaseIE):
|
||||||
|
IE_NAME = 'khanacademy'
|
||||||
|
_VALID_URL = KhanAcademyBaseIE._VALID_URL_TEMPL % ('4', 'v/')
|
||||||
|
_TEST = {
|
||||||
|
'url': 'https://www.khanacademy.org/computing/computer-science/cryptography/crypt/v/one-time-pad',
|
||||||
|
'md5': '9c84b7b06f9ebb80d22a5c8dedefb9a0',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'one-time-pad',
|
'id': 'FlIG3TvQCBQ',
|
||||||
'ext': 'webm',
|
'ext': 'mp4',
|
||||||
'title': 'The one-time pad',
|
'title': 'The one-time pad',
|
||||||
'description': 'The perfect cipher',
|
'description': 'The perfect cipher',
|
||||||
'duration': 176,
|
'duration': 176,
|
||||||
'uploader': 'Brit Cruise',
|
'uploader': 'Brit Cruise',
|
||||||
'uploader_id': 'khanacademy',
|
'uploader_id': 'khanacademy',
|
||||||
'upload_date': '20120411',
|
'upload_date': '20120411',
|
||||||
|
'timestamp': 1334170113,
|
||||||
|
'license': 'cc-by-nc-sa',
|
||||||
},
|
},
|
||||||
'add_ie': ['Youtube'],
|
'add_ie': ['Youtube'],
|
||||||
}, {
|
}
|
||||||
'url': 'https://www.khanacademy.org/math/applied-math/cryptography',
|
|
||||||
|
def _parse_component_props(self, component_props):
|
||||||
|
video = component_props['tutorialPageData']['contentModel']
|
||||||
|
info = self._parse_video(video)
|
||||||
|
author_names = video.get('authorNames')
|
||||||
|
info.update({
|
||||||
|
'uploader': ', '.join(author_names) if author_names else None,
|
||||||
|
'timestamp': parse_iso8601(video.get('dateAdded')),
|
||||||
|
'license': video.get('kaUserLicense'),
|
||||||
|
})
|
||||||
|
return info
|
||||||
|
|
||||||
|
|
||||||
|
class KhanAcademyUnitIE(KhanAcademyBaseIE):
|
||||||
|
IE_NAME = 'khanacademy:unit'
|
||||||
|
_VALID_URL = (KhanAcademyBaseIE._VALID_URL_TEMPL % ('2', '')) + '/?(?:[?#&]|$)'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'https://www.khanacademy.org/computing/computer-science/cryptography',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'cryptography',
|
'id': 'cryptography',
|
||||||
'title': 'Journey into cryptography',
|
'title': 'Cryptography',
|
||||||
'description': 'How have humans protected their secret messages through history? What has changed today?',
|
'description': 'How have humans protected their secret messages through history? What has changed today?',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 3,
|
'playlist_mincount': 31,
|
||||||
}]
|
}
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _parse_component_props(self, component_props):
|
||||||
m = re.match(self._VALID_URL, url)
|
curation = component_props['curation']
|
||||||
video_id = m.group('id')
|
|
||||||
|
|
||||||
if m.group('key') == 'video':
|
entries = []
|
||||||
data = self._download_json(
|
tutorials = try_get(curation, lambda x: x['tabs'][0]['modules'][0]['tutorials'], list) or []
|
||||||
'http://api.khanacademy.org/api/v1/videos/' + video_id,
|
for tutorial_number, tutorial in enumerate(tutorials, 1):
|
||||||
video_id, 'Downloading video info')
|
chapter_info = {
|
||||||
|
'chapter': tutorial.get('title'),
|
||||||
upload_date = unified_strdate(data['date_added'])
|
'chapter_number': tutorial_number,
|
||||||
uploader = ', '.join(data['author_names'])
|
'chapter_id': tutorial.get('id'),
|
||||||
return {
|
|
||||||
'_type': 'url_transparent',
|
|
||||||
'url': data['url'],
|
|
||||||
'id': video_id,
|
|
||||||
'title': data['title'],
|
|
||||||
'thumbnail': data['image_url'],
|
|
||||||
'duration': data['duration'],
|
|
||||||
'description': data['description'],
|
|
||||||
'uploader': uploader,
|
|
||||||
'upload_date': upload_date,
|
|
||||||
}
|
}
|
||||||
else:
|
for content_item in (tutorial.get('contentItems') or []):
|
||||||
# topic
|
if content_item.get('kind') == 'Video':
|
||||||
data = self._download_json(
|
info = self._parse_video(content_item)
|
||||||
'http://api.khanacademy.org/api/v1/topic/' + video_id,
|
info.update(chapter_info)
|
||||||
video_id, 'Downloading topic info')
|
entries.append(info)
|
||||||
|
|
||||||
entries = [
|
return self.playlist_result(
|
||||||
{
|
entries, curation.get('unit'), curation.get('title'),
|
||||||
'_type': 'url',
|
curation.get('description'))
|
||||||
'url': c['url'],
|
|
||||||
'id': c['id'],
|
|
||||||
'title': c['title'],
|
|
||||||
}
|
|
||||||
for c in data['children'] if c['kind'] in ('Video', 'Topic')]
|
|
||||||
|
|
||||||
return {
|
|
||||||
'_type': 'playlist',
|
|
||||||
'id': video_id,
|
|
||||||
'title': data['title'],
|
|
||||||
'description': data['description'],
|
|
||||||
'entries': entries,
|
|
||||||
}
|
|
||||||
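Note on the Khan Academy hunk: both new extractors derive `_VALID_URL` from one template, filling in the number of leading path segments and an optional `v/` marker. The sketch below formats the template exactly as the diff does and checks it against the two test URLs from the diff. The `classify` helper is a hypothetical addition for illustration only.

```python
import re

VALID_URL_TEMPL = r'https?://(?:www\.)?khanacademy\.org/(?P<id>(?:[^/]+/){%s}%s[^?#/&]+)'

# Four leading segments followed by 'v/' for single videos.
VIDEO_URL_RE = VALID_URL_TEMPL % ('4', 'v/')
# Two leading segments and an end-of-path check for units.
UNIT_URL_RE = (VALID_URL_TEMPL % ('2', '')) + '/?(?:[?#&]|$)'


def classify(url):
    """Return (kind, id) for the first pattern matching at the start of url."""
    for kind, pattern in (('video', VIDEO_URL_RE), ('unit', UNIT_URL_RE)):
        mobj = re.match(pattern, url)
        if mobj:
            return kind, mobj.group('id')
    return None


video = classify(
    'https://www.khanacademy.org/computing/computer-science/cryptography/crypt/v/one-time-pad')
unit = classify('https://www.khanacademy.org/computing/computer-science/cryptography')
```

The trailing `/?(?:[?#&]|$)` on the unit pattern keeps it from also matching the longer video URL.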
|
@ -1,22 +1,86 @@
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import functools
|
||||||
import json
|
import json
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..compat import compat_str
|
from ..compat import (
|
||||||
|
compat_str,
|
||||||
|
compat_urllib_parse_unquote,
|
||||||
|
)
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
determine_ext,
|
determine_ext,
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
mimetype2ext,
|
mimetype2ext,
|
||||||
|
OnDemandPagedList,
|
||||||
try_get,
|
try_get,
|
||||||
|
urljoin,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class LBRYIE(InfoExtractor):
|
class LBRYBaseIE(InfoExtractor):
|
||||||
IE_NAME = 'lbry.tv'
|
_BASE_URL_REGEX = r'https?://(?:www\.)?(?:lbry\.tv|odysee\.com)/'
|
||||||
_VALID_URL = r'https?://(?:www\.)?(?:lbry\.tv|odysee\.com)/(?P<id>@[^:]+:[0-9a-z]+/[^:]+:[0-9a-z])'
|
_CLAIM_ID_REGEX = r'[0-9a-f]{1,40}'
|
||||||
|
_OPT_CLAIM_ID = '[^:/?#&]+(?::%s)?' % _CLAIM_ID_REGEX
|
||||||
|
_SUPPORTED_STREAM_TYPES = ['video', 'audio']
|
||||||
|
|
||||||
|
def _call_api_proxy(self, method, display_id, params, resource):
|
||||||
|
return self._download_json(
|
||||||
|
'https://api.lbry.tv/api/v1/proxy',
|
||||||
|
display_id, 'Downloading %s JSON metadata' % resource,
|
||||||
|
headers={'Content-Type': 'application/json-rpc'},
|
||||||
|
data=json.dumps({
|
||||||
|
'method': method,
|
||||||
|
'params': params,
|
||||||
|
}).encode())['result']
|
||||||
|
|
||||||
|
def _resolve_url(self, url, display_id, resource):
|
||||||
|
return self._call_api_proxy(
|
||||||
|
'resolve', display_id, {'urls': url}, resource)[url]
|
||||||
|
|
||||||
|
def _permanent_url(self, url, claim_name, claim_id):
|
||||||
|
return urljoin(url, '/%s:%s' % (claim_name, claim_id))
|
||||||
|
|
||||||
|
def _parse_stream(self, stream, url):
|
||||||
|
stream_value = stream.get('value') or {}
|
||||||
|
stream_type = stream_value.get('stream_type')
|
||||||
|
source = stream_value.get('source') or {}
|
||||||
|
media = stream_value.get(stream_type) or {}
|
||||||
|
signing_channel = stream.get('signing_channel') or {}
|
||||||
|
channel_name = signing_channel.get('name')
|
||||||
|
channel_claim_id = signing_channel.get('claim_id')
|
||||||
|
channel_url = None
|
||||||
|
if channel_name and channel_claim_id:
|
||||||
|
channel_url = self._permanent_url(url, channel_name, channel_claim_id)
|
||||||
|
|
||||||
|
info = {
|
||||||
|
'thumbnail': try_get(stream_value, lambda x: x['thumbnail']['url'], compat_str),
|
||||||
|
'description': stream_value.get('description'),
|
||||||
|
'license': stream_value.get('license'),
|
||||||
|
'timestamp': int_or_none(stream.get('timestamp')),
|
||||||
|
'tags': stream_value.get('tags'),
|
||||||
|
'duration': int_or_none(media.get('duration')),
|
||||||
|
'channel': try_get(signing_channel, lambda x: x['value']['title']),
|
||||||
|
'channel_id': channel_claim_id,
|
||||||
|
'channel_url': channel_url,
|
||||||
|
'ext': determine_ext(source.get('name')) or mimetype2ext(source.get('media_type')),
|
||||||
|
'filesize': int_or_none(source.get('size')),
|
||||||
|
}
|
||||||
|
if stream_type == 'audio':
|
||||||
|
info['vcodec'] = 'none'
|
||||||
|
else:
|
||||||
|
info.update({
|
||||||
|
'width': int_or_none(media.get('width')),
|
||||||
|
'height': int_or_none(media.get('height')),
|
||||||
|
})
|
||||||
|
return info
|
||||||
|
|
||||||
|
|
||||||
|
class LBRYIE(LBRYBaseIE):
|
||||||
|
IE_NAME = 'lbry'
|
||||||
|
_VALID_URL = LBRYBaseIE._BASE_URL_REGEX + r'(?P<id>\$/[^/]+/[^/]+/{1}|@{0}/{0}|(?!@){0})'.format(LBRYBaseIE._OPT_CLAIM_ID, LBRYBaseIE._CLAIM_ID_REGEX)
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
# Video
|
# Video
|
||||||
'url': 'https://lbry.tv/@Mantega:1/First-day-LBRY:1',
|
'url': 'https://lbry.tv/@Mantega:1/First-day-LBRY:1',
|
||||||
@ -28,6 +92,8 @@ class LBRYIE(InfoExtractor):
|
|||||||
'description': 'md5:f6cb5c704b332d37f5119313c2c98f51',
|
'description': 'md5:f6cb5c704b332d37f5119313c2c98f51',
|
||||||
'timestamp': 1595694354,
|
'timestamp': 1595694354,
|
||||||
'upload_date': '20200725',
|
'upload_date': '20200725',
|
||||||
|
'width': 1280,
|
||||||
|
'height': 720,
|
||||||
}
|
}
|
||||||
}, {
|
}, {
|
||||||
# Audio
|
# Audio
|
||||||
@ -40,6 +106,12 @@ class LBRYIE(InfoExtractor):
|
|||||||
'description': 'md5:661ac4f1db09f31728931d7b88807a61',
|
'description': 'md5:661ac4f1db09f31728931d7b88807a61',
|
||||||
'timestamp': 1591312601,
|
'timestamp': 1591312601,
|
||||||
'upload_date': '20200604',
|
'upload_date': '20200604',
|
||||||
|
'tags': list,
|
||||||
|
'duration': 2570,
|
||||||
|
'channel': 'The LBRY Foundation',
|
||||||
|
'channel_id': '0ed629d2b9c601300cacf7eabe9da0be79010212',
|
||||||
|
'channel_url': 'https://lbry.tv/@LBRYFoundation:0ed629d2b9c601300cacf7eabe9da0be79010212',
|
||||||
|
'vcodec': 'none',
|
||||||
}
|
}
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://odysee.com/@BrodieRobertson:5/apple-is-tracking-everything-you-do-on:e',
|
'url': 'https://odysee.com/@BrodieRobertson:5/apple-is-tracking-everything-you-do-on:e',
|
||||||
@ -47,45 +119,103 @@ class LBRYIE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': "https://odysee.com/@ScammerRevolts:b0/I-SYSKEY'D-THE-SAME-SCAMMERS-3-TIMES!:b",
|
'url': "https://odysee.com/@ScammerRevolts:b0/I-SYSKEY'D-THE-SAME-SCAMMERS-3-TIMES!:b",
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://lbry.tv/Episode-1:e7d93d772bd87e2b62d5ab993c1c3ced86ebb396',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://lbry.tv/$/embed/Episode-1/e7d93d772bd87e2b62d5ab993c1c3ced86ebb396',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://lbry.tv/Episode-1:e7',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://lbry.tv/@LBRYFoundation/Episode-1',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://lbry.tv/$/download/Episode-1/e7d93d772bd87e2b62d5ab993c1c3ced86ebb396',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://lbry.tv/@lacajadepandora:a/TRUMP-EST%C3%81-BIEN-PUESTO-con-Pilar-Baselga,-Carlos-Senra,-Luis-Palacios-(720p_30fps_H264-192kbit_AAC):1',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _call_api_proxy(self, method, display_id, params):
|
def _real_extract(self, url):
|
||||||
return self._download_json(
|
display_id = self._match_id(url)
|
||||||
'https://api.lbry.tv/api/v1/proxy', display_id,
|
if display_id.startswith('$/'):
|
||||||
headers={'Content-Type': 'application/json-rpc'},
|
display_id = display_id.split('/', 2)[-1].replace('/', ':')
|
||||||
data=json.dumps({
|
else:
|
||||||
'method': method,
|
display_id = display_id.replace(':', '#')
|
||||||
'params': params,
|
display_id = compat_urllib_parse_unquote(display_id)
|
||||||
}).encode())['result']
|
uri = 'lbry://' + display_id
|
||||||
|
result = self._resolve_url(uri, display_id, 'stream')
|
||||||
|
result_value = result['value']
|
||||||
|
if result_value.get('stream_type') not in self._SUPPORTED_STREAM_TYPES:
|
||||||
|
raise ExtractorError('Unsupported URL', expected=True)
|
||||||
|
claim_id = result['claim_id']
|
||||||
|
title = result_value['title']
|
||||||
|
streaming_url = self._call_api_proxy(
|
||||||
|
'get', claim_id, {'uri': uri}, 'streaming url')['streaming_url']
|
||||||
|
info = self._parse_stream(result, url)
|
||||||
|
info.update({
|
||||||
|
'id': claim_id,
|
||||||
|
'title': title,
|
||||||
|
'url': streaming_url,
|
||||||
|
})
|
||||||
|
return info
|
||||||
|
|
||||||
|
|
||||||
|
class LBRYChannelIE(LBRYBaseIE):
|
||||||
|
IE_NAME = 'lbry:channel'
|
||||||
|
_VALID_URL = LBRYBaseIE._BASE_URL_REGEX + r'(?P<id>@%s)/?(?:[?#&]|$)' % LBRYBaseIE._OPT_CLAIM_ID
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://lbry.tv/@LBRYFoundation:0',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '0ed629d2b9c601300cacf7eabe9da0be79010212',
|
||||||
|
'title': 'The LBRY Foundation',
|
||||||
|
'description': 'Channel for the LBRY Foundation. Follow for updates and news.',
|
||||||
|
},
|
||||||
|
'playlist_count': 29,
|
||||||
|
}, {
|
||||||
|
'url': 'https://lbry.tv/@LBRYFoundation',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
_PAGE_SIZE = 50
|
||||||
|
|
||||||
|
def _fetch_page(self, claim_id, url, page):
|
||||||
|
page += 1
|
||||||
|
result = self._call_api_proxy(
|
||||||
|
'claim_search', claim_id, {
|
||||||
|
'channel_ids': [claim_id],
|
||||||
|
'claim_type': 'stream',
|
||||||
|
'no_totals': True,
|
||||||
|
'page': page,
|
||||||
|
'page_size': self._PAGE_SIZE,
|
||||||
|
'stream_types': self._SUPPORTED_STREAM_TYPES,
|
||||||
|
}, 'page %d' % page)
|
||||||
|
for item in (result.get('items') or []):
|
||||||
|
stream_claim_name = item.get('name')
|
||||||
|
stream_claim_id = item.get('claim_id')
|
||||||
|
if not (stream_claim_name and stream_claim_id):
|
||||||
|
continue
|
||||||
|
|
||||||
|
info = self._parse_stream(item, url)
|
||||||
|
info.update({
|
||||||
|
'_type': 'url',
|
||||||
|
'id': stream_claim_id,
|
||||||
|
'title': try_get(item, lambda x: x['value']['title']),
|
||||||
|
'url': self._permanent_url(url, stream_claim_name, stream_claim_id),
|
||||||
|
})
|
||||||
|
yield info
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id = self._match_id(url).replace(':', '#')
|
display_id = self._match_id(url).replace(':', '#')
|
||||||
uri = 'lbry://' + display_id
|
result = self._resolve_url(
|
||||||
result = self._call_api_proxy(
|
'lbry://' + display_id, display_id, 'channel')
|
||||||
'resolve', display_id, {'urls': [uri]})[uri]
|
claim_id = result['claim_id']
|
||||||
result_value = result['value']
|
entries = OnDemandPagedList(
|
||||||
if result_value.get('stream_type') not in ('video', 'audio'):
|
functools.partial(self._fetch_page, claim_id, url),
|
||||||
raise ExtractorError('Unsupported URL', expected=True)
|
self._PAGE_SIZE)
|
||||||
streaming_url = self._call_api_proxy(
|
result_value = result.get('value') or {}
|
||||||
'get', display_id, {'uri': uri})['streaming_url']
|
return self.playlist_result(
|
||||||
source = result_value.get('source') or {}
|
entries, claim_id, result_value.get('title'),
|
||||||
media = result_value.get('video') or result_value.get('audio') or {}
|
result_value.get('description'))
|
||||||
signing_channel = result_value.get('signing_channel') or {}
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': result['claim_id'],
|
|
||||||
'title': result_value['title'],
|
|
||||||
'thumbnail': try_get(result_value, lambda x: x['thumbnail']['url'], compat_str),
|
|
||||||
'description': result_value.get('description'),
|
|
||||||
'license': result_value.get('license'),
|
|
||||||
'timestamp': int_or_none(result.get('timestamp')),
|
|
||||||
'tags': result_value.get('tags'),
|
|
||||||
'width': int_or_none(media.get('width')),
|
|
||||||
'height': int_or_none(media.get('height')),
|
|
||||||
'duration': int_or_none(media.get('duration')),
|
|
||||||
'channel': signing_channel.get('name'),
|
|
||||||
'channel_id': signing_channel.get('claim_id'),
|
|
||||||
'ext': determine_ext(source.get('name')) or mimetype2ext(source.get('media_type')),
|
|
||||||
'filesize': int_or_none(source.get('size')),
|
|
||||||
'url': streaming_url,
|
|
||||||
}
|
|
||||||
|
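Reading aid for the LBRY hunk above: a minimal standalone sketch of how the new `LBRYIE._real_extract` appears to turn the matched URL path into a `lbry://` URI. The function name `lbry_uri_from_path` is hypothetical, and Python 3's `urllib.parse.unquote` stands in for `compat_urllib_parse_unquote`; this is an illustration under those assumptions, not the extractor itself.

```python
from urllib.parse import unquote


def lbry_uri_from_path(display_id):
    """Hypothetical helper mirroring the path handling shown in the diff."""
    if display_id.startswith('$/'):
        # '$/embed/<name>/<claim_id>' -> '<name>:<claim_id>'
        display_id = display_id.split('/', 2)[-1].replace('/', ':')
    else:
        # '@channel:1/video:1' -> '@channel#1/video#1'
        display_id = display_id.replace(':', '#')
    # Percent-encoded characters in the path are decoded before resolving.
    return 'lbry://' + unquote(display_id)


if __name__ == '__main__':
    print(lbry_uri_from_path('@Mantega:1/First-day-LBRY:1'))
    print(lbry_uri_from_path('$/embed/Episode-1/e7d93d772bd87e2b62d5ab993c1c3ced86ebb396'))
```

The two branches explain why the `$/embed/...` and `$/download/...` test URLs are accepted alongside the usual `@channel:id/name:id` form.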
@ -8,11 +8,15 @@ from .common import InfoExtractor
|
|||||||
from ..compat import (
|
from ..compat import (
|
||||||
compat_b64decode,
|
compat_b64decode,
|
||||||
compat_HTTPError,
|
compat_HTTPError,
|
||||||
|
compat_str,
|
||||||
)
|
)
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
|
clean_html,
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
orderedSet,
|
js_to_json,
|
||||||
unescapeHTML,
|
parse_duration,
|
||||||
|
try_get,
|
||||||
|
unified_timestamp,
|
||||||
urlencode_postdata,
|
urlencode_postdata,
|
||||||
urljoin,
|
urljoin,
|
||||||
)
|
)
|
||||||
@ -28,11 +32,15 @@ class LinuxAcademyIE(InfoExtractor):
|
|||||||
)
|
)
|
||||||
'''
|
'''
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://linuxacademy.com/cp/courses/lesson/course/1498/lesson/2/module/154',
|
'url': 'https://linuxacademy.com/cp/courses/lesson/course/7971/lesson/2/module/675',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '1498-2',
|
'id': '7971-2',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': "Introduction to the Practitioner's Brief",
|
'title': 'What Is Data Science',
|
||||||
|
'description': 'md5:c574a3c20607144fb36cb65bdde76c99',
|
||||||
|
'timestamp': 1607387907,
|
||||||
|
'upload_date': '20201208',
|
||||||
|
'duration': 304,
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
@ -46,7 +54,8 @@ class LinuxAcademyIE(InfoExtractor):
|
|||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '154',
|
'id': '154',
|
||||||
'title': 'AWS Certified Cloud Practitioner',
|
'title': 'AWS Certified Cloud Practitioner',
|
||||||
'description': 'md5:039db7e60e4aac9cf43630e0a75fa834',
|
'description': 'md5:a68a299ca9bb98d41cca5abc4d4ce22c',
|
||||||
|
'duration': 28835,
|
||||||
},
|
},
|
||||||
'playlist_count': 41,
|
'playlist_count': 41,
|
||||||
'skip': 'Requires Linux Academy account credentials',
|
'skip': 'Requires Linux Academy account credentials',
|
||||||
@ -74,6 +83,7 @@ class LinuxAcademyIE(InfoExtractor):
|
|||||||
self._AUTHORIZE_URL, None, 'Downloading authorize page', query={
|
self._AUTHORIZE_URL, None, 'Downloading authorize page', query={
|
||||||
'client_id': self._CLIENT_ID,
|
'client_id': self._CLIENT_ID,
|
||||||
'response_type': 'token id_token',
|
'response_type': 'token id_token',
|
||||||
|
'response_mode': 'web_message',
|
||||||
'redirect_uri': self._ORIGIN_URL,
|
'redirect_uri': self._ORIGIN_URL,
|
||||||
'scope': 'openid email user_impersonation profile',
|
'scope': 'openid email user_impersonation profile',
|
||||||
'audience': self._ORIGIN_URL,
|
'audience': self._ORIGIN_URL,
|
||||||
@ -129,7 +139,13 @@ class LinuxAcademyIE(InfoExtractor):
|
|||||||
|
|
||||||
access_token = self._search_regex(
|
access_token = self._search_regex(
|
||||||
r'access_token=([^=&]+)', urlh.geturl(),
|
r'access_token=([^=&]+)', urlh.geturl(),
|
||||||
'access token')
|
'access token', default=None)
|
||||||
|
if not access_token:
|
||||||
|
access_token = self._parse_json(
|
||||||
|
self._search_regex(
|
||||||
|
r'authorizationResponse\s*=\s*({.+?})\s*;', callback_page,
|
||||||
|
'authorization response'), None,
|
||||||
|
transform_source=js_to_json)['response']['access_token']
|
||||||
|
|
||||||
self._download_webpage(
|
self._download_webpage(
|
||||||
'https://linuxacademy.com/cp/login/tokenValidateLogin/token/%s'
|
'https://linuxacademy.com/cp/login/tokenValidateLogin/token/%s'
|
||||||
@ -144,30 +160,84 @@ class LinuxAcademyIE(InfoExtractor):
|
|||||||
|
|
||||||
# course path
|
# course path
|
||||||
if course_id:
|
if course_id:
|
||||||
entries = [
|
module = self._parse_json(
|
||||||
self.url_result(
|
self._search_regex(
|
||||||
urljoin(url, lesson_url), ie=LinuxAcademyIE.ie_key())
|
r'window\.module\s*=\s*({.+?})\s*;', webpage, 'module'),
|
||||||
for lesson_url in orderedSet(re.findall(
|
item_id)
|
||||||
r'<a[^>]+\bhref=["\'](/cp/courses/lesson/course/\d+/lesson/\d+/module/\d+)',
|
entries = []
|
||||||
webpage))]
|
chapter_number = None
|
||||||
title = unescapeHTML(self._html_search_regex(
|
chapter = None
|
||||||
(r'class=["\']course-title["\'][^>]*>(?P<value>[^<]+)',
|
chapter_id = None
|
||||||
r'var\s+title\s*=\s*(["\'])(?P<value>(?:(?!\1).)+)\1'),
|
for item in module['items']:
|
||||||
webpage, 'title', default=None, group='value'))
|
if not isinstance(item, dict):
|
||||||
description = unescapeHTML(self._html_search_regex(
|
continue
|
||||||
r'var\s+description\s*=\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
|
|
||||||
webpage, 'description', default=None, group='value'))
|
def type_field(key):
|
||||||
return self.playlist_result(entries, course_id, title, description)
|
return (try_get(item, lambda x: x['type'][key], compat_str) or '').lower()
|
||||||
|
type_fields = (type_field('name'), type_field('slug'))
|
||||||
|
# Move to next module section
|
||||||
|
if 'section' in type_fields:
|
||||||
|
chapter = item.get('course_name')
|
||||||
|
chapter_id = item.get('course_module')
|
||||||
|
chapter_number = 1 if not chapter_number else chapter_number + 1
|
||||||
|
continue
|
||||||
|
# Skip non-lessons
|
||||||
|
if 'lesson' not in type_fields:
|
||||||
|
continue
|
||||||
|
lesson_url = urljoin(url, item.get('url'))
|
||||||
|
if not lesson_url:
|
||||||
|
continue
|
||||||
|
title = item.get('title') or item.get('lesson_name')
|
||||||
|
description = item.get('md_desc') or clean_html(item.get('description')) or clean_html(item.get('text'))
|
||||||
|
entries.append({
|
||||||
|
'_type': 'url_transparent',
|
||||||
|
'url': lesson_url,
|
||||||
|
'ie_key': LinuxAcademyIE.ie_key(),
|
||||||
|
'title': title,
|
||||||
|
'description': description,
|
||||||
|
'timestamp': unified_timestamp(item.get('date')) or unified_timestamp(item.get('created_on')),
|
||||||
|
'duration': parse_duration(item.get('duration')),
|
||||||
|
'chapter': chapter,
|
||||||
|
'chapter_id': chapter_id,
|
||||||
|
'chapter_number': chapter_number,
|
||||||
|
})
|
||||||
|
return {
|
||||||
|
'_type': 'playlist',
|
||||||
|
'entries': entries,
|
||||||
|
'id': course_id,
|
||||||
|
'title': module.get('title'),
|
||||||
|
'description': module.get('md_desc') or clean_html(module.get('desc')),
|
||||||
|
'duration': parse_duration(module.get('duration')),
|
||||||
|
}
|
||||||
|
|
||||||
# single video path
|
# single video path
|
||||||
info = self._extract_jwplayer_data(
|
m3u8_url = self._parse_json(
|
||||||
webpage, item_id, require_title=False, m3u8_id='hls',)
|
self._search_regex(
|
||||||
title = self._search_regex(
|
r'player\.playlist\s*=\s*(\[.+?\])\s*;', webpage, 'playlist'),
|
||||||
(r'>Lecture\s*:\s*(?P<value>[^<]+)',
|
item_id)[0]['file']
|
||||||
r'lessonName\s*=\s*(["\'])(?P<value>(?:(?!\1).)+)\1'), webpage,
|
formats = self._extract_m3u8_formats(
|
||||||
'title', group='value')
|
m3u8_url, item_id, 'mp4', entry_protocol='m3u8_native',
|
||||||
info.update({
|
m3u8_id='hls')
|
||||||
|
self._sort_formats(formats)
|
||||||
|
info = {
|
||||||
'id': item_id,
|
'id': item_id,
|
||||||
'title': title,
|
'formats': formats,
|
||||||
})
|
}
|
||||||
|
lesson = self._parse_json(
|
||||||
|
self._search_regex(
|
||||||
|
(r'window\.lesson\s*=\s*({.+?})\s*;',
|
||||||
|
r'player\.lesson\s*=\s*({.+?})\s*;'),
|
||||||
|
webpage, 'lesson', default='{}'), item_id, fatal=False)
|
||||||
|
if lesson:
|
||||||
|
info.update({
|
||||||
|
'title': lesson.get('lesson_name'),
|
||||||
|
'description': lesson.get('md_desc') or clean_html(lesson.get('desc')),
|
||||||
|
'timestamp': unified_timestamp(lesson.get('date')) or unified_timestamp(lesson.get('created_on')),
|
||||||
|
'duration': parse_duration(lesson.get('duration')),
|
||||||
|
})
|
||||||
|
if not info.get('title'):
|
||||||
|
info['title'] = self._search_regex(
|
||||||
|
(r'>Lecture\s*:\s*(?P<value>[^<]+)',
|
||||||
|
r'lessonName\s*=\s*(["\'])(?P<value>(?:(?!\1).)+)\1'), webpage,
|
||||||
|
'title', group='value')
|
||||||
return info
|
return info
|
||||||
|
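Reading aid for the Linux Academy hunk above: a simplified sketch of the chapter bookkeeping in the new course path, where "section" items advance the chapter and "lesson" items become playlist entries. The helper names `type_fields` and `lessons_with_chapters` and the sample data are hypothetical; the real code reads the items from the page's `window.module` JSON and uses youtube-dl's own `try_get` and `urljoin`.

```python
from urllib.parse import urljoin


def type_fields(item):
    """Lower-cased type name and slug of a module item ('' when missing)."""
    type_info = item.get('type')
    if not isinstance(type_info, dict):
        return ('', '')
    return tuple(
        value.lower() if isinstance(value, str) else ''
        for value in (type_info.get('name'), type_info.get('slug')))


def lessons_with_chapters(page_url, items):
    """Collect lesson entries, tagging each with the section it follows."""
    entries = []
    chapter = chapter_id = chapter_number = None
    for item in items:
        if not isinstance(item, dict):
            continue
        fields = type_fields(item)
        if 'section' in fields:
            # A section item starts the next chapter and is not an entry.
            chapter = item.get('course_name')
            chapter_id = item.get('course_module')
            chapter_number = (chapter_number or 0) + 1
            continue
        # Skip non-lessons and lessons without a URL.
        if 'lesson' not in fields or not item.get('url'):
            continue
        entries.append({
            'url': urljoin(page_url, item['url']),
            'title': item.get('title') or item.get('lesson_name'),
            'chapter': chapter,
            'chapter_id': chapter_id,
            'chapter_number': chapter_number,
        })
    return entries
```

Lessons that appear before any section keep `chapter_number` as `None`, which matches the initial values in the diff.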
@ -2,12 +2,16 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..compat import compat_urlparse
|
from ..compat import (
|
||||||
|
compat_str,
|
||||||
|
compat_urlparse,
|
||||||
|
)
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
determine_ext,
|
determine_ext,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
parse_duration,
|
parse_duration,
|
||||||
parse_iso8601,
|
parse_iso8601,
|
||||||
|
url_or_none,
|
||||||
xpath_text,
|
xpath_text,
|
||||||
)
|
)
|
||||||
|
|
||||||
@ -16,6 +20,8 @@ class MDRIE(InfoExtractor):
|
|||||||
IE_DESC = 'MDR.DE and KiKA'
|
IE_DESC = 'MDR.DE and KiKA'
|
||||||
_VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z-]+-?(?P<id>\d+)(?:_.+?)?\.html'
|
_VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z-]+-?(?P<id>\d+)(?:_.+?)?\.html'
|
||||||
|
|
||||||
|
_GEO_COUNTRIES = ['DE']
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
# MDR regularly deletes its videos
|
# MDR regularly deletes its videos
|
||||||
'url': 'http://www.mdr.de/fakt/video189002.html',
|
'url': 'http://www.mdr.de/fakt/video189002.html',
|
||||||
@ -66,6 +72,22 @@ class MDRIE(InfoExtractor):
|
|||||||
'duration': 3239,
|
'duration': 3239,
|
||||||
'uploader': 'MITTELDEUTSCHER RUNDFUNK',
|
'uploader': 'MITTELDEUTSCHER RUNDFUNK',
|
||||||
},
|
},
|
||||||
|
}, {
|
||||||
|
# empty bitrateVideo and bitrateAudio
|
||||||
|
'url': 'https://www.kika.de/filme/sendung128372_zc-572e3f45_zs-1d9fb70e.html',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '128372',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Der kleine Wichtel kehrt zurück',
|
||||||
|
'description': 'md5:f77fafdff90f7aa1e9dca14f662c052a',
|
||||||
|
'duration': 4876,
|
||||||
|
'timestamp': 1607823300,
|
||||||
|
'upload_date': '20201213',
|
||||||
|
'uploader': 'ZDF',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.kika.de/baumhaus/sendungen/video19636_zc-fea7f8a0_zs-4bf89c60.html',
|
'url': 'http://www.kika.de/baumhaus/sendungen/video19636_zc-fea7f8a0_zs-4bf89c60.html',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -91,10 +113,13 @@ class MDRIE(InfoExtractor):
|
|||||||
|
|
||||||
title = xpath_text(doc, ['./title', './broadcast/broadcastName'], 'title', fatal=True)
|
title = xpath_text(doc, ['./title', './broadcast/broadcastName'], 'title', fatal=True)
|
||||||
|
|
||||||
|
type_ = xpath_text(doc, './type', default=None)
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
processed_urls = []
|
processed_urls = []
|
||||||
for asset in doc.findall('./assets/asset'):
|
for asset in doc.findall('./assets/asset'):
|
||||||
for source in (
|
for source in (
|
||||||
|
'download',
|
||||||
'progressiveDownload',
|
'progressiveDownload',
|
||||||
'dynamicHttpStreamingRedirector',
|
'dynamicHttpStreamingRedirector',
|
||||||
'adaptiveHttpStreamingRedirector'):
|
'adaptiveHttpStreamingRedirector'):
|
||||||
@ -102,63 +127,49 @@ class MDRIE(InfoExtractor):
|
|||||||
if url_el is None:
|
if url_el is None:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
video_url = url_el.text
|
video_url = url_or_none(url_el.text)
|
||||||
if video_url in processed_urls:
|
if not video_url or video_url in processed_urls:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
processed_urls.append(video_url)
|
processed_urls.append(video_url)
|
||||||
|
|
||||||
vbr = int_or_none(xpath_text(asset, './bitrateVideo', 'vbr'), 1000)
|
ext = determine_ext(video_url)
|
||||||
abr = int_or_none(xpath_text(asset, './bitrateAudio', 'abr'), 1000)
|
|
||||||
|
|
||||||
ext = determine_ext(url_el.text)
|
|
||||||
if ext == 'm3u8':
|
if ext == 'm3u8':
|
||||||
url_formats = self._extract_m3u8_formats(
|
formats.extend(self._extract_m3u8_formats(
|
||||||
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
|
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||||
preference=0, m3u8_id='HLS', fatal=False)
|
preference=0, m3u8_id='HLS', fatal=False))
|
||||||
elif ext == 'f4m':
|
elif ext == 'f4m':
|
||||||
url_formats = self._extract_f4m_formats(
|
formats.extend(self._extract_f4m_formats(
|
||||||
video_url + '?hdcore=3.7.0&plugin=aasp-3.7.0.39.44', video_id,
|
video_url + '?hdcore=3.7.0&plugin=aasp-3.7.0.39.44', video_id,
|
||||||
preference=0, f4m_id='HDS', fatal=False)
|
preference=0, f4m_id='HDS', fatal=False))
|
||||||
else:
|
else:
|
||||||
media_type = xpath_text(asset, './mediaType', 'media type', default='MP4')
|
media_type = xpath_text(asset, './mediaType', 'media type', default='MP4')
|
||||||
vbr = int_or_none(xpath_text(asset, './bitrateVideo', 'vbr'), 1000)
|
vbr = int_or_none(xpath_text(asset, './bitrateVideo', 'vbr'), 1000)
|
||||||
abr = int_or_none(xpath_text(asset, './bitrateAudio', 'abr'), 1000)
|
abr = int_or_none(xpath_text(asset, './bitrateAudio', 'abr'), 1000)
|
||||||
filesize = int_or_none(xpath_text(asset, './fileSize', 'file size'))
|
filesize = int_or_none(xpath_text(asset, './fileSize', 'file size'))
|
||||||
|
|
||||||
|
format_id = [media_type]
|
||||||
|
if vbr or abr:
|
||||||
|
format_id.append(compat_str(vbr or abr))
|
||||||
|
|
||||||
f = {
|
f = {
|
||||||
'url': video_url,
|
'url': video_url,
|
||||||
'format_id': '%s-%d' % (media_type, vbr or abr),
|
'format_id': '-'.join(format_id),
|
||||||
'filesize': filesize,
|
'filesize': filesize,
|
||||||
'abr': abr,
|
'abr': abr,
|
||||||
'preference': 1,
|
'vbr': vbr,
|
||||||
}
|
}
|
||||||
|
|
||||||
if vbr:
|
if vbr:
|
||||||
width = int_or_none(xpath_text(asset, './frameWidth', 'width'))
|
|
||||||
height = int_or_none(xpath_text(asset, './frameHeight', 'height'))
|
|
||||||
f.update({
|
f.update({
|
||||||
'vbr': vbr,
|
'width': int_or_none(xpath_text(asset, './frameWidth', 'width')),
|
||||||
'width': width,
|
'height': int_or_none(xpath_text(asset, './frameHeight', 'height')),
|
||||||
'height': height,
|
|
||||||
})
|
})
|
||||||
|
|
||||||
url_formats = [f]
|
if type_ == 'audio':
|
||||||
|
f['vcodec'] = 'none'
|
||||||
|
|
||||||
if not url_formats:
|
formats.append(f)
|
||||||
continue
|
|
||||||
|
|
||||||
if not vbr:
|
|
||||||
for f in url_formats:
|
|
||||||
abr = f.get('tbr') or abr
|
|
||||||
if 'tbr' in f:
|
|
||||||
del f['tbr']
|
|
||||||
f.update({
|
|
||||||
'abr': abr,
|
|
||||||
'vcodec': 'none',
|
|
||||||
})
|
|
||||||
|
|
||||||
formats.extend(url_formats)
|
|
||||||
|
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
|
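Reading aid for the MDR hunk above: a small sketch of the new `format_id` construction, which seems intended to cope with assets whose `bitrateVideo` and `bitrateAudio` are both empty (the case the added KiKA test names). The function name `build_format_id` is hypothetical, and `str` stands in for `compat_str`.

```python
def build_format_id(media_type, vbr=None, abr=None):
    """Join the media type with a bitrate only when one is known."""
    parts = [media_type]
    if vbr or abr:
        parts.append(str(vbr or abr))
    return '-'.join(parts)


if __name__ == '__main__':
    print(build_format_id('MP4', vbr=1800, abr=128))
    print(build_format_id('MP4'))
    # The replaced expression, '%s-%d' % (media_type, vbr or abr),
    # raises TypeError when both bitrates are None.
```

Building the id from a list of parts means a missing bitrate shortens the id instead of failing the whole extraction.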
131
youtube_dl/extractor/medaltv.py
Normal file
@ -0,0 +1,131 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import re
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
from ..compat import compat_str
|
||||||
|
from ..utils import (
|
||||||
|
ExtractorError,
|
||||||
|
float_or_none,
|
||||||
|
int_or_none,
|
||||||
|
str_or_none,
|
||||||
|
try_get,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class MedalTVIE(InfoExtractor):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?medal\.tv/clips/(?P<id>[0-9]+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://medal.tv/clips/34934644/3Is9zyGMoBMr',
|
||||||
|
'md5': '7b07b064331b1cf9e8e5c52a06ae68fa',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '34934644',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Quad Cold',
|
||||||
|
'description': 'Medal,https://medal.tv/desktop/',
|
||||||
|
'uploader': 'MowgliSB',
|
||||||
|
'timestamp': 1603165266,
|
||||||
|
'upload_date': '20201020',
|
||||||
|
'uploader_id': 10619174,
|
||||||
|
}
|
||||||
|
}, {
|
||||||
|
'url': 'https://medal.tv/clips/36787208',
|
||||||
|
'md5': 'b6dc76b78195fff0b4f8bf4a33ec2148',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '36787208',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'u tk me i tk u bigger',
|
||||||
|
'description': 'Medal,https://medal.tv/desktop/',
|
||||||
|
'uploader': 'Mimicc',
|
||||||
|
'timestamp': 1605580939,
|
||||||
|
'upload_date': '20201117',
|
||||||
|
'uploader_id': 5156321,
|
||||||
|
}
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
video_id = self._match_id(url)
|
||||||
|
webpage = self._download_webpage(url, video_id)
|
||||||
|
|
||||||
|
hydration_data = self._parse_json(self._search_regex(
|
||||||
|
r'<script[^>]*>\s*(?:var\s*)?hydrationData\s*=\s*({.+?})\s*</script>',
|
||||||
|
webpage, 'hydration data', default='{}'), video_id)
|
||||||
|
|
||||||
|
clip = try_get(
|
||||||
|
hydration_data, lambda x: x['clips'][video_id], dict) or {}
|
||||||
|
if not clip:
|
||||||
|
raise ExtractorError(
|
||||||
|
'Could not find video information.', video_id=video_id)
|
||||||
|
|
||||||
|
title = clip['contentTitle']
|
||||||
|
|
||||||
|
source_width = int_or_none(clip.get('sourceWidth'))
|
||||||
|
source_height = int_or_none(clip.get('sourceHeight'))
|
||||||
|
|
||||||
|
aspect_ratio = source_width / source_height if source_width and source_height else 16 / 9
|
||||||
|
|
||||||
|
def add_item(container, item_url, height, id_key='format_id', item_id=None):
|
||||||
|
item_id = item_id or '%dp' % height
|
||||||
|
if item_id not in item_url:
|
||||||
|
return
|
||||||
|
width = int(round(aspect_ratio * height))
|
||||||
|
container.append({
|
||||||
|
'url': item_url,
|
||||||
|
id_key: item_id,
|
||||||
|
'width': width,
|
||||||
|
'height': height
|
||||||
|
})
|
||||||
|
|
||||||
|
formats = []
|
||||||
|
thumbnails = []
|
||||||
|
for k, v in clip.items():
|
||||||
|
if not (v and isinstance(v, compat_str)):
|
||||||
|
continue
|
||||||
|
mobj = re.match(r'(contentUrl|thumbnail)(?:(\d+)p)?$', k)
|
||||||
|
if not mobj:
|
||||||
|
continue
|
||||||
|
prefix = mobj.group(1)
|
||||||
|
height = int_or_none(mobj.group(2))
|
||||||
|
if prefix == 'contentUrl':
|
||||||
|
add_item(
|
||||||
|
formats, v, height or source_height,
|
||||||
|
item_id=None if height else 'source')
|
||||||
|
elif prefix == 'thumbnail':
|
||||||
|
add_item(thumbnails, v, height, 'id')
|
||||||
|
|
||||||
|
error = clip.get('error')
|
||||||
|
if not formats and error:
|
||||||
|
if error == 404:
|
||||||
|
raise ExtractorError(
|
||||||
|
'That clip does not exist.',
|
||||||
|
expected=True, video_id=video_id)
|
||||||
|
else:
|
||||||
|
raise ExtractorError(
|
||||||
|
'An unknown error occurred ({0}).'.format(error),
|
||||||
|
video_id=video_id)
|
||||||
|
|
||||||
|
self._sort_formats(formats)
|
||||||
|
|
||||||
|
# Necessary because the id of the author is not known in advance.
|
||||||
|
# Won't raise an issue if no profile can be found as this is optional.
|
||||||
|
author = try_get(
|
||||||
|
hydration_data, lambda x: list(x['profiles'].values())[0], dict) or {}
|
||||||
|
author_id = str_or_none(author.get('id'))
|
||||||
|
author_url = 'https://medal.tv/users/{0}'.format(author_id) if author_id else None
|
||||||
|
|
||||||
|
return {
|
||||||
|
'id': video_id,
|
||||||
|
'title': title,
|
||||||
|
'formats': formats,
|
||||||
|
'thumbnails': thumbnails,
|
||||||
|
'description': clip.get('contentDescription'),
|
||||||
|
'uploader': author.get('displayName'),
|
||||||
|
'timestamp': float_or_none(clip.get('created'), 1000),
|
||||||
|
'uploader_id': author_id,
|
||||||
|
'uploader_url': author_url,
|
||||||
|
'duration': int_or_none(clip.get('videoLengthSeconds')),
|
||||||
|
'view_count': int_or_none(clip.get('views')),
|
||||||
|
'like_count': int_or_none(clip.get('likes')),
|
||||||
|
'comment_count': int_or_none(clip.get('comments')),
|
||||||
|
}
|
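Reading aid for the new medaltv.py above: a sketch of how `add_item` derives a width for each rendition from the clip's source aspect ratio, falling back to 16:9 when the source size is unknown. The function name `scaled_width` is hypothetical, and the sketch assumes Python 3, where `/` is true division.

```python
def scaled_width(height, source_width=None, source_height=None):
    """Width for a rendition of the given height, keeping the aspect ratio."""
    if source_width and source_height:
        aspect_ratio = source_width / source_height
    else:
        aspect_ratio = 16 / 9  # fallback when the source size is unknown
    return int(round(aspect_ratio * height))


if __name__ == '__main__':
    print(scaled_width(720))
    print(scaled_width(360, source_width=1920, source_height=1080))
```

Under Python 2, `/` between two integers truncates unless `from __future__ import division` is in effect, so the same expression would give a different ratio there.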
@ -2,268 +2,113 @@ from __future__ import unicode_literals
|
|||||||
|
|
||||||
import re
|
import re
|
||||||
|
|
||||||
from .gigya import GigyaBaseIE
|
from .common import InfoExtractor
|
||||||
|
|
||||||
from ..compat import compat_str
|
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
|
extract_attributes,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
parse_duration,
|
mimetype2ext,
|
||||||
try_get,
|
parse_iso8601,
|
||||||
unified_timestamp,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class MedialaanIE(GigyaBaseIE):
|
class MedialaanIE(InfoExtractor):
|
||||||
_VALID_URL = r'''(?x)
|
_VALID_URL = r'''(?x)
|
||||||
https?://
|
https?://
|
||||||
(?:www\.|nieuws\.)?
|
|
||||||
(?:
|
(?:
|
||||||
(?P<site_id>vtm|q2|vtmkzoom)\.be/
|
(?:embed\.)?mychannels.video/embed/|
|
||||||
(?:
|
embed\.mychannels\.video/(?:s(?:dk|cript)/)?production/|
|
||||||
video(?:/[^/]+/id/|/?\?.*?\baid=)|
|
(?:www\.)?(?:
|
||||||
(?:[^/]+/)*
|
(?:
|
||||||
)
|
7sur7|
|
||||||
|
demorgen|
|
||||||
|
hln|
|
||||||
|
joe|
|
||||||
|
qmusic
|
||||||
|
)\.be|
|
||||||
|
(?:
|
||||||
|
[abe]d|
|
||||||
|
bndestem|
|
||||||
|
destentor|
|
||||||
|
gelderlander|
|
||||||
|
pzc|
|
||||||
|
tubantia|
|
||||||
|
volkskrant
|
||||||
|
)\.nl
|
||||||
|
)/video/(?:[^/]+/)*[^/?&#]+~p
|
||||||
)
|
)
|
||||||
(?P<id>[^/?#&]+)
|
(?P<id>\d+)
|
||||||
'''
|
'''
|
||||||
_NETRC_MACHINE = 'medialaan'
|
|
||||||
_APIKEY = '3_HZ0FtkMW_gOyKlqQzW5_0FHRC7Nd5XpXJZcDdXY4pk5eES2ZWmejRW5egwVm4ug-'
|
|
||||||
_SITE_TO_APP_ID = {
|
|
||||||
'vtm': 'vtm_watch',
|
|
||||||
'q2': 'q2',
|
|
||||||
'vtmkzoom': 'vtmkzoom',
|
|
||||||
}
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
# vod
|
'url': 'https://www.bndestem.nl/video/de-terugkeer-van-ally-de-aap-en-wie-vertrekt-er-nog-bij-nac~p193993',
|
||||||
'url': 'http://vtm.be/video/volledige-afleveringen/id/vtm_20170219_VM0678361_vtmwatch',
|
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'vtm_20170219_VM0678361_vtmwatch',
|
'id': '193993',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Allemaal Chris afl. 6',
|
'title': 'De terugkeer van Ally de Aap en wie vertrekt er nog bij NAC?',
|
||||||
'description': 'md5:4be86427521e7b07e0adb0c9c554ddb2',
|
'timestamp': 1611663540,
|
||||||
'timestamp': 1487533280,
|
'upload_date': '20210126',
|
||||||
'upload_date': '20170219',
|
'duration': 238,
|
||||||
'duration': 2562,
|
|
||||||
'series': 'Allemaal Chris',
|
|
||||||
'season': 'Allemaal Chris',
|
|
||||||
'season_number': 1,
|
|
||||||
'season_id': '256936078124527',
|
|
||||||
'episode': 'Allemaal Chris afl. 6',
|
|
||||||
'episode_number': 6,
|
|
||||||
'episode_id': '256936078591527',
|
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
'skip': 'Requires account credentials',
|
|
||||||
}, {
|
}, {
|
||||||
# clip
|
'url': 'https://www.gelderlander.nl/video/kanalen/degelderlander~c320/series/snel-nieuws~s984/noodbevel-in-doetinchem-politie-stuurt-mensen-centrum-uit~p194093',
|
||||||
'url': 'http://vtm.be/video?aid=168332',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '168332',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': '"Veronique liegt!"',
|
|
||||||
'description': 'md5:1385e2b743923afe54ba4adc38476155',
|
|
||||||
'timestamp': 1489002029,
|
|
||||||
'upload_date': '20170308',
|
|
||||||
'duration': 96,
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
# vod
|
|
||||||
'url': 'http://vtm.be/video/volledige-afleveringen/id/257107153551000',
|
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
# vod
|
'url': 'https://embed.mychannels.video/sdk/production/193993?options=TFTFF_default',
|
||||||
'url': 'http://vtm.be/video?aid=163157',
|
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
# vod
|
'url': 'https://embed.mychannels.video/script/production/193993',
|
||||||
'url': 'http://www.q2.be/video/volledige-afleveringen/id/2be_20170301_VM0684442_q2',
|
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
# clip
|
'url': 'https://embed.mychannels.video/production/193993',
|
||||||
'url': 'http://vtmkzoom.be/k3-dansstudio/een-nieuw-seizoen-van-k3-dansstudio',
|
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
# http/s redirect
|
'url': 'https://mychannels.video/embed/193993',
|
||||||
'url': 'https://vtmkzoom.be/video?aid=45724',
|
'only_matching': True,
|
||||||
'info_dict': {
|
|
||||||
'id': '257136373657000',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'K3 Dansstudio Ushuaia afl.6',
|
|
||||||
},
|
|
||||||
'params': {
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
'skip': 'Requires account credentials',
|
|
||||||
}, {
|
}, {
|
||||||
# nieuws.vtm.be
|
'url': 'https://embed.mychannels.video/embed/193993',
|
||||||
'url': 'https://nieuws.vtm.be/stadion/stadion/genk-nog-moeilijk-programma',
|
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_initialize(self):
|
@staticmethod
|
||||||
self._logged_in = False
|
def _extract_urls(webpage):
|
||||||
|
entries = []
|
||||||
def _login(self):
|
for element in re.findall(r'(<div[^>]+data-mychannels-type="video"[^>]*>)', webpage):
|
||||||
username, password = self._get_login_info()
|
mychannels_id = extract_attributes(element).get('data-mychannels-id')
|
||||||
if username is None:
|
if mychannels_id:
|
||||||
self.raise_login_required()
|
entries.append('https://mychannels.video/embed/' + mychannels_id)
|
||||||
|
return entries
|
||||||
auth_data = {
|
|
||||||
'APIKey': self._APIKEY,
|
|
||||||
'sdk': 'js_6.1',
|
|
||||||
'format': 'json',
|
|
||||||
'loginID': username,
|
|
||||||
'password': password,
|
|
||||||
}
|
|
||||||
|
|
||||||
auth_info = self._gigya_login(auth_data)
|
|
||||||
|
|
||||||
self._uid = auth_info['UID']
|
|
||||||
self._uid_signature = auth_info['UIDSignature']
|
|
||||||
self._signature_timestamp = auth_info['signatureTimestamp']
|
|
||||||
|
|
||||||
self._logged_in = True
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
mobj = re.match(self._VALID_URL, url)
|
production_id = self._match_id(url)
|
||||||
video_id, site_id = mobj.group('id', 'site_id')
|
production = self._download_json(
|
||||||
|
'https://embed.mychannels.video/sdk/production/' + production_id,
|
||||||
|
production_id, query={'options': 'UUUU_default'})['productions'][0]
|
||||||
|
title = production['title']
|
||||||
|
|
||||||
webpage = self._download_webpage(url, video_id)
|
formats = []
|
||||||
|
for source in (production.get('sources') or []):
|
||||||
config = self._parse_json(
|
src = source.get('src')
|
||||||
self._search_regex(
|
if not src:
|
||||||
r'videoJSConfig\s*=\s*JSON\.parse\(\'({.+?})\'\);',
|
continue
|
||||||
webpage, 'config', default='{}'), video_id,
|
ext = mimetype2ext(source.get('type'))
|
||||||
transform_source=lambda s: s.replace(
|
if ext == 'm3u8':
|
||||||
'\\\\', '\\').replace(r'\"', '"').replace(r"\'", "'"))
|
formats.extend(self._extract_m3u8_formats(
|
||||||
|
src, production_id, 'mp4', 'm3u8_native',
|
||||||
vod_id = config.get('vodId') or self._search_regex(
|
m3u8_id='hls', fatal=False))
|
||||||
(r'\\"vodId\\"\s*:\s*\\"(.+?)\\"',
|
|
||||||
r'"vodId"\s*:\s*"(.+?)"',
|
|
||||||
r'<[^>]+id=["\']vod-(\d+)'),
|
|
||||||
webpage, 'video_id', default=None)
|
|
||||||
|
|
||||||
# clip, no authentication required
|
|
||||||
if not vod_id:
|
|
||||||
player = self._parse_json(
|
|
||||||
self._search_regex(
|
|
||||||
r'vmmaplayer\(({.+?})\);', webpage, 'vmma player',
|
|
||||||
default=''),
|
|
||||||
video_id, transform_source=lambda s: '[%s]' % s, fatal=False)
|
|
||||||
if player:
|
|
||||||
video = player[-1]
|
|
||||||
if video['videoUrl'] in ('http', 'https'):
|
|
||||||
return self.url_result(video['url'], MedialaanIE.ie_key())
|
|
||||||
info = {
|
|
||||||
'id': video_id,
|
|
||||||
'url': video['videoUrl'],
|
|
||||||
'title': video['title'],
|
|
||||||
'thumbnail': video.get('imageUrl'),
|
|
||||||
'timestamp': int_or_none(video.get('createdDate')),
|
|
||||||
'duration': int_or_none(video.get('duration')),
|
|
||||||
}
|
|
||||||
else:
|
else:
|
||||||
info = self._parse_html5_media_entries(
|
formats.append({
|
||||||
url, webpage, video_id, m3u8_id='hls')[0]
|
'ext': ext,
|
||||||
info.update({
|
'url': src,
|
||||||
'id': video_id,
|
|
||||||
'title': self._html_search_meta('description', webpage),
|
|
||||||
'duration': parse_duration(self._html_search_meta('duration', webpage)),
|
|
||||||
})
|
})
|
||||||
# vod, authentication required
|
self._sort_formats(formats)
|
||||||
else:
|
|
||||||
if not self._logged_in:
|
|
||||||
self._login()
|
|
||||||
|
|
||||||
settings = self._parse_json(
|
return {
|
||||||
self._search_regex(
|
'id': production_id,
|
||||||
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
|
'title': title,
|
||||||
webpage, 'drupal settings', default='{}'),
|
'formats': formats,
|
||||||
video_id)
|
'thumbnail': production.get('posterUrl'),
|
||||||
|
'timestamp': parse_iso8601(production.get('publicationDate'), ' '),
|
||||||
def get(container, item):
|
'duration': int_or_none(production.get('duration')) or None,
|
||||||
return try_get(
|
}
|
||||||
settings, lambda x: x[container][item],
|
|
||||||
compat_str) or self._search_regex(
|
|
||||||
r'"%s"\s*:\s*"([^"]+)' % item, webpage, item,
|
|
||||||
default=None)
|
|
||||||
|
|
||||||
app_id = get('vod', 'app_id') or self._SITE_TO_APP_ID.get(site_id, 'vtm_watch')
|
|
||||||
sso = get('vod', 'gigyaDatabase') or 'vtm-sso'
|
|
||||||
|
|
||||||
data = self._download_json(
|
|
||||||
'http://vod.medialaan.io/api/1.0/item/%s/video' % vod_id,
|
|
||||||
video_id, query={
|
|
||||||
'app_id': app_id,
|
|
||||||
'user_network': sso,
|
|
||||||
'UID': self._uid,
|
|
||||||
'UIDSignature': self._uid_signature,
|
|
||||||
'signatureTimestamp': self._signature_timestamp,
|
|
||||||
})
|
|
||||||
|
|
||||||
formats = self._extract_m3u8_formats(
|
|
||||||
data['response']['uri'], video_id, entry_protocol='m3u8_native',
|
|
||||||
ext='mp4', m3u8_id='hls')
|
|
||||||
|
|
||||||
self._sort_formats(formats)
|
|
||||||
|
|
||||||
info = {
|
|
||||||
'id': vod_id,
|
|
||||||
'formats': formats,
|
|
||||||
}
|
|
||||||
|
|
||||||
api_key = get('vod', 'apiKey')
|
|
||||||
channel = get('medialaanGigya', 'channel')
|
|
||||||
|
|
||||||
if api_key:
|
|
||||||
videos = self._download_json(
|
|
||||||
'http://vod.medialaan.io/vod/v2/videos', video_id, fatal=False,
|
|
||||||
query={
|
|
||||||
'channels': channel,
|
|
||||||
'ids': vod_id,
|
|
||||||
'limit': 1,
|
|
||||||
'apikey': api_key,
|
|
||||||
})
|
|
||||||
if videos:
|
|
||||||
video = try_get(
|
|
||||||
videos, lambda x: x['response']['videos'][0], dict)
|
|
||||||
if video:
|
|
||||||
def get(container, item, expected_type=None):
|
|
||||||
return try_get(
|
|
||||||
video, lambda x: x[container][item], expected_type)
|
|
||||||
|
|
||||||
def get_string(container, item):
|
|
||||||
return get(container, item, compat_str)
|
|
||||||
|
|
||||||
info.update({
|
|
||||||
'series': get_string('program', 'title'),
|
|
||||||
'season': get_string('season', 'title'),
|
|
||||||
'season_number': int_or_none(get('season', 'number')),
|
|
||||||
'season_id': get_string('season', 'id'),
|
|
||||||
'episode': get_string('episode', 'title'),
|
|
||||||
'episode_number': int_or_none(get('episode', 'number')),
|
|
||||||
'episode_id': get_string('episode', 'id'),
|
|
||||||
'duration': int_or_none(
|
|
||||||
video.get('duration')) or int_or_none(
|
|
||||||
video.get('durationMillis'), scale=1000),
|
|
||||||
'title': get_string('episode', 'title'),
|
|
||||||
'description': get_string('episode', 'text'),
|
|
||||||
'timestamp': unified_timestamp(get_string(
|
|
||||||
'publication', 'begin')),
|
|
||||||
})
|
|
||||||
|
|
||||||
if not info.get('title'):
|
|
||||||
info['title'] = try_get(
|
|
||||||
config, lambda x: x['videoConfig']['title'],
|
|
||||||
compat_str) or self._html_search_regex(
|
|
||||||
r'\\"title\\"\s*:\s*\\"(.+?)\\"', webpage, 'title',
|
|
||||||
default=None) or self._og_search_title(webpage)
|
|
||||||
|
|
||||||
if not info.get('description'):
|
|
||||||
info['description'] = self._html_search_regex(
|
|
||||||
r'<div[^>]+class="field-item\s+even">\s*<p>(.+?)</p>',
|
|
||||||
webpage, 'description', default=None)
|
|
||||||
|
|
||||||
return info
|
|
||||||
|
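The rewritten MedialaanIE pattern above now resolves both article URLs and mychannels embed URLs to one numeric production ID. A minimal standalone sketch of that matching follows; the `production_id` helper is hypothetical and not part of the extractor, while the pattern and sample URLs are copied from the diff.

```python
import re

# Pattern copied from the new MedialaanIE._VALID_URL shown in the diff.
VALID_URL = r'''(?x)
    https?://
    (?:
        (?:embed\.)?mychannels.video/embed/|
        embed\.mychannels\.video/(?:s(?:dk|cript)/)?production/|
        (?:www\.)?(?:
            (?:
                7sur7|
                demorgen|
                hln|
                joe|
                qmusic
            )\.be|
            (?:
                [abe]d|
                bndestem|
                destentor|
                gelderlander|
                pzc|
                tubantia|
                volkskrant
            )\.nl
        )/video/(?:[^/]+/)*[^/?&#]+~p
    )
    (?P<id>\d+)
'''


def production_id(url):
    """Hypothetical helper: return the numeric ID, or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


article = 'https://www.bndestem.nl/video/de-terugkeer-van-ally-de-aap-en-wie-vertrekt-er-nog-bij-nac~p193993'
embed = 'https://embed.mychannels.video/sdk/production/193993?options=TFTFF_default'
old_style = 'http://vtm.be/video?aid=168332'  # URL form from the removed tests

print(production_id(article), production_id(embed), production_id(old_style))
```

The old vtm.be style URLs from the removed tests are no longer matched by this pattern, which is consistent with the Gigya login path being dropped.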
@ -23,7 +23,7 @@ class MediasetIE(ThePlatformBaseIE):
|
|||||||
https?://
|
https?://
|
||||||
(?:(?:www|static3)\.)?mediasetplay\.mediaset\.it/
|
(?:(?:www|static3)\.)?mediasetplay\.mediaset\.it/
|
||||||
(?:
|
(?:
|
||||||
(?:video|on-demand)/(?:[^/]+/)+[^/]+_|
|
(?:video|on-demand|movie)/(?:[^/]+/)+[^/]+_|
|
||||||
player/index\.html\?.*?\bprogramGuid=
|
player/index\.html\?.*?\bprogramGuid=
|
||||||
)
|
)
|
||||||
)(?P<id>[0-9A-Z]{16,})
|
)(?P<id>[0-9A-Z]{16,})
|
||||||
@ -88,6 +88,9 @@ class MediasetIE(ThePlatformBaseIE):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.mediasetplay.mediaset.it/video/grandefratellovip/benedetta-una-doccia-gelata_F309344401044C135',
|
'url': 'https://www.mediasetplay.mediaset.it/video/grandefratellovip/benedetta-una-doccia-gelata_F309344401044C135',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.mediasetplay.mediaset.it/movie/herculeslaleggendahainizio/hercules-la-leggenda-ha-inizio_F305927501000102',
|
||||||
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
|
196
youtube_dl/extractor/minds.py
Normal file
@ -0,0 +1,196 @@
|
|||||||
|
# coding: utf-8
|
||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
from ..compat import compat_str
|
||||||
|
from ..utils import (
|
||||||
|
clean_html,
|
||||||
|
int_or_none,
|
||||||
|
str_or_none,
|
||||||
|
strip_or_none,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class MindsBaseIE(InfoExtractor):
|
||||||
|
_VALID_URL_BASE = r'https?://(?:www\.)?minds\.com/'
|
||||||
|
|
||||||
|
def _call_api(self, path, video_id, resource, query=None):
|
||||||
|
api_url = 'https://www.minds.com/api/' + path
|
||||||
|
token = self._get_cookies(api_url).get('XSRF-TOKEN')
|
||||||
|
return self._download_json(
|
||||||
|
api_url, video_id, 'Downloading %s JSON metadata' % resource, headers={
|
||||||
|
'Referer': 'https://www.minds.com/',
|
||||||
|
'X-XSRF-TOKEN': token.value if token else '',
|
||||||
|
}, query=query)
|
||||||
|
|
||||||
|
|
||||||
|
class MindsIE(MindsBaseIE):
|
||||||
|
IE_NAME = 'minds'
|
||||||
|
_VALID_URL = MindsBaseIE._VALID_URL_BASE + r'(?:media|newsfeed|archive/view)/(?P<id>[0-9]+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://www.minds.com/media/100000000000086822',
|
||||||
|
'md5': '215a658184a419764852239d4970b045',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '100000000000086822',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Minds intro sequence',
|
||||||
|
'thumbnail': r're:https?://.+\.png',
|
||||||
|
'uploader_id': 'ottman',
|
||||||
|
'upload_date': '20130524',
|
||||||
|
'timestamp': 1369404826,
|
||||||
|
'uploader': 'Bill Ottman',
|
||||||
|
'view_count': int,
|
||||||
|
'like_count': int,
|
||||||
|
'dislike_count': int,
|
||||||
|
'tags': ['animation'],
|
||||||
|
'comment_count': int,
|
||||||
|
'license': 'attribution-cc',
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
# entity.type == 'activity' and empty title
|
||||||
|
'url': 'https://www.minds.com/newsfeed/798025111988506624',
|
||||||
|
'md5': 'b2733a74af78d7fd3f541c4cbbaa5950',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '798022190320226304',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': '798022190320226304',
|
||||||
|
'uploader': 'ColinFlaherty',
|
||||||
|
'upload_date': '20180111',
|
||||||
|
'timestamp': 1515639316,
|
||||||
|
'uploader_id': 'ColinFlaherty',
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.minds.com/archive/view/715172106794442752',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# youtube perma_url
|
||||||
|
'url': 'https://www.minds.com/newsfeed/1197131838022602752',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
entity_id = self._match_id(url)
|
||||||
|
entity = self._call_api(
|
||||||
|
'v1/entities/entity/' + entity_id, entity_id, 'entity')['entity']
|
||||||
|
if entity.get('type') == 'activity':
|
||||||
|
if entity.get('custom_type') == 'video':
|
||||||
|
video_id = entity['entity_guid']
|
||||||
|
else:
|
||||||
|
return self.url_result(entity['perma_url'])
|
||||||
|
else:
|
||||||
|
assert(entity['subtype'] == 'video')
|
||||||
|
video_id = entity_id
|
||||||
|
# 1080p and webm formats available only on the sources array
|
||||||
|
video = self._call_api(
|
||||||
|
'v2/media/video/' + video_id, video_id, 'video')
|
||||||
|
|
||||||
|
formats = []
|
||||||
|
for source in (video.get('sources') or []):
|
||||||
|
src = source.get('src')
|
||||||
|
if not src:
|
||||||
|
continue
|
||||||
|
formats.append({
|
||||||
|
'format_id': source.get('label'),
|
||||||
|
'height': int_or_none(source.get('size')),
|
||||||
|
'url': src,
|
||||||
|
})
|
||||||
|
self._sort_formats(formats)
|
||||||
|
|
||||||
|
entity = video.get('entity') or entity
|
||||||
|
owner = entity.get('ownerObj') or {}
|
||||||
|
uploader_id = owner.get('username')
|
||||||
|
|
||||||
|
tags = entity.get('tags')
|
||||||
|
if tags and isinstance(tags, compat_str):
|
||||||
|
tags = [tags]
|
||||||
|
|
||||||
|
thumbnail = None
|
||||||
|
poster = video.get('poster') or entity.get('thumbnail_src')
|
||||||
|
if poster:
|
||||||
|
urlh = self._request_webpage(poster, video_id, fatal=False)
|
||||||
|
if urlh:
|
||||||
|
thumbnail = urlh.geturl()
|
||||||
|
|
||||||
|
return {
|
||||||
|
'id': video_id,
|
||||||
|
'title': entity.get('title') or video_id,
|
||||||
|
'formats': formats,
|
||||||
|
'description': clean_html(entity.get('description')) or None,
|
||||||
|
'license': str_or_none(entity.get('license')),
|
||||||
|
'timestamp': int_or_none(entity.get('time_created')),
|
||||||
|
'uploader': strip_or_none(owner.get('name')),
|
||||||
|
'uploader_id': uploader_id,
|
||||||
|
'uploader_url': 'https://www.minds.com/' + uploader_id if uploader_id else None,
|
||||||
|
'view_count': int_or_none(entity.get('play:count')),
|
||||||
|
'like_count': int_or_none(entity.get('thumbs:up:count')),
|
||||||
|
'dislike_count': int_or_none(entity.get('thumbs:down:count')),
|
||||||
|
'tags': tags,
|
||||||
|
'comment_count': int_or_none(entity.get('comments:count')),
|
||||||
|
'thumbnail': thumbnail,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
class MindsFeedBaseIE(MindsBaseIE):
|
||||||
|
_PAGE_SIZE = 150
|
||||||
|
|
||||||
|
def _entries(self, feed_id):
|
||||||
|
query = {'limit': self._PAGE_SIZE, 'sync': 1}
|
||||||
|
i = 1
|
||||||
|
while True:
|
||||||
|
data = self._call_api(
|
||||||
|
'v2/feeds/container/%s/videos' % feed_id,
|
||||||
|
feed_id, 'page %s' % i, query)
|
||||||
|
entities = data.get('entities') or []
|
||||||
|
for entity in entities:
|
||||||
|
guid = entity.get('guid')
|
||||||
|
if not guid:
|
||||||
|
continue
|
||||||
|
yield self.url_result(
|
||||||
|
'https://www.minds.com/newsfeed/' + guid,
|
||||||
|
MindsIE.ie_key(), guid)
|
||||||
|
query['from_timestamp'] = data['load-next']
|
||||||
|
if not (query['from_timestamp'] and len(entities) == self._PAGE_SIZE):
|
||||||
|
break
|
||||||
|
i += 1
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
feed_id = self._match_id(url)
|
||||||
|
feed = self._call_api(
|
||||||
|
'v1/%s/%s' % (self._FEED_PATH, feed_id),
|
||||||
|
feed_id, self._FEED_TYPE)[self._FEED_TYPE]
|
||||||
|
|
||||||
|
return self.playlist_result(
|
||||||
|
self._entries(feed['guid']), feed_id,
|
||||||
|
strip_or_none(feed.get('name')),
|
||||||
|
feed.get('briefdescription'))
|
||||||
|
|
||||||
|
|
||||||
|
class MindsChannelIE(MindsFeedBaseIE):
|
||||||
|
_FEED_TYPE = 'channel'
|
||||||
|
IE_NAME = 'minds:' + _FEED_TYPE
|
||||||
|
_VALID_URL = MindsBaseIE._VALID_URL_BASE + r'(?!(?:newsfeed|media|api|archive|groups)/)(?P<id>[^/?&#]+)'
|
||||||
|
_FEED_PATH = 'channel'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'https://www.minds.com/ottman',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'ottman',
|
||||||
|
'title': 'Bill Ottman',
|
||||||
|
'description': 'Co-creator & CEO @minds',
|
||||||
|
},
|
||||||
|
'playlist_mincount': 54,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
class MindsGroupIE(MindsFeedBaseIE):
|
||||||
|
_FEED_TYPE = 'group'
|
||||||
|
IE_NAME = 'minds:' + _FEED_TYPE
|
||||||
|
_VALID_URL = MindsBaseIE._VALID_URL_BASE + r'groups/profile/(?P<id>[0-9]+)'
|
||||||
|
_FEED_PATH = 'groups/group'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'https://www.minds.com/groups/profile/785582576369672204/feed/videos',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '785582576369672204',
|
||||||
|
'title': 'Cooking Videos',
|
||||||
|
},
|
||||||
|
'playlist_mincount': 1,
|
||||||
|
}
|
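The paging loop in MindsFeedBaseIE._entries above can be illustrated in isolation. This is a rough sketch under stated assumptions: `fake_call_api` and `ITEMS` are hypothetical stand-ins for the Minds API, and treating `load-next` as a list offset is an assumption made only for this demo. The loop itself mirrors the diff.

```python
# Hypothetical data: seven entities, one of them without a guid.
ITEMS = [{'guid': str(n)} for n in range(1, 8)]
ITEMS[4] = {}


def fake_call_api(query):
    """Hypothetical stand-in for _call_api; uses 'load-next' as a plain offset."""
    start = int(query.get('from_timestamp') or 0)
    end = start + query['limit']
    return {
        'entities': ITEMS[start:end],
        'load-next': str(end) if end < len(ITEMS) else '',
    }


def iter_guids(call_api, page_size):
    """Same control flow as _entries: stop on an empty cursor or a short page."""
    query = {'limit': page_size, 'sync': 1}
    while True:
        data = call_api(query)
        entities = data.get('entities') or []
        for entity in entities:
            guid = entity.get('guid')
            if not guid:
                continue  # entities without a guid are skipped
            yield guid
        query['from_timestamp'] = data['load-next']
        if not (query['from_timestamp'] and len(entities) == page_size):
            break


guids = list(iter_guids(fake_call_api, 3))
print(guids)
```

The extractor uses a page size of 150; the value 3 here only keeps the demo short.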
@ -1,15 +1,14 @@
|
|||||||
# coding: utf-8
|
# coding: utf-8
|
||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .telecinco import TelecincoIE
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
int_or_none,
|
int_or_none,
|
||||||
parse_iso8601,
|
parse_iso8601,
|
||||||
smuggle_url,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class MiTeleIE(InfoExtractor):
|
class MiTeleIE(TelecincoIE):
|
||||||
IE_DESC = 'mitele.es'
|
IE_DESC = 'mitele.es'
|
||||||
_VALID_URL = r'https?://(?:www\.)?mitele\.es/(?:[^/]+/)+(?P<id>[^/]+)/player'
|
_VALID_URL = r'https?://(?:www\.)?mitele\.es/(?:[^/]+/)+(?P<id>[^/]+)/player'
|
||||||
|
|
||||||
@ -31,7 +30,6 @@ class MiTeleIE(InfoExtractor):
|
|||||||
'timestamp': 1471209401,
|
'timestamp': 1471209401,
|
||||||
'upload_date': '20160814',
|
'upload_date': '20160814',
|
||||||
},
|
},
|
||||||
'add_ie': ['Ooyala'],
|
|
||||||
}, {
|
}, {
|
||||||
# no explicit title
|
# no explicit title
|
||||||
'url': 'http://www.mitele.es/programas-tv/cuarto-milenio/57b0de3dc915da14058b4876/player',
|
'url': 'http://www.mitele.es/programas-tv/cuarto-milenio/57b0de3dc915da14058b4876/player',
|
||||||
@ -54,7 +52,6 @@ class MiTeleIE(InfoExtractor):
|
|||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
},
|
},
|
||||||
'add_ie': ['Ooyala'],
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.mitele.es/series-online/la-que-se-avecina/57aac5c1c915da951a8b45ed/player',
|
'url': 'http://www.mitele.es/series-online/la-que-se-avecina/57aac5c1c915da951a8b45ed/player',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -70,16 +67,11 @@ class MiTeleIE(InfoExtractor):
|
|||||||
r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=\s*({.+})',
|
r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=\s*({.+})',
|
||||||
webpage, 'Pre Player'), display_id)['prePlayer']
|
webpage, 'Pre Player'), display_id)['prePlayer']
|
||||||
title = pre_player['title']
|
title = pre_player['title']
|
||||||
video = pre_player['video']
|
video_info = self._parse_content(pre_player['video'], url)
|
||||||
video_id = video['dataMediaId']
|
|
||||||
content = pre_player.get('content') or {}
|
content = pre_player.get('content') or {}
|
||||||
info = content.get('info') or {}
|
info = content.get('info') or {}
|
||||||
|
|
||||||
return {
|
video_info.update({
|
||||||
'_type': 'url_transparent',
|
|
||||||
# for some reason only HLS is supported
|
|
||||||
'url': smuggle_url('ooyala:' + video_id, {'supportedformats': 'm3u8,dash'}),
|
|
||||||
'id': video_id,
|
|
||||||
'title': title,
|
'title': title,
|
||||||
'description': info.get('synopsis'),
|
'description': info.get('synopsis'),
|
||||||
'series': content.get('title'),
|
'series': content.get('title'),
|
||||||
@ -87,7 +79,7 @@ class MiTeleIE(InfoExtractor):
|
|||||||
'episode': content.get('subtitle'),
|
'episode': content.get('subtitle'),
|
||||||
'episode_number': int_or_none(info.get('episode_number')),
|
'episode_number': int_or_none(info.get('episode_number')),
|
||||||
'duration': int_or_none(info.get('duration')),
|
'duration': int_or_none(info.get('duration')),
|
||||||
'thumbnail': video.get('dataPoster'),
|
|
||||||
'age_limit': int_or_none(info.get('rating')),
|
'age_limit': int_or_none(info.get('rating')),
|
||||||
'timestamp': parse_iso8601(pre_player.get('publishedTime')),
|
'timestamp': parse_iso8601(pre_player.get('publishedTime')),
|
||||||
}
|
})
|
||||||
|
return video_info
|
||||||
|
@ -251,8 +251,11 @@ class MixcloudPlaylistBaseIE(MixcloudBaseIE):
|
|||||||
cloudcast_url = cloudcast.get('url')
|
cloudcast_url = cloudcast.get('url')
|
||||||
if not cloudcast_url:
|
if not cloudcast_url:
|
||||||
continue
|
continue
|
||||||
|
slug = try_get(cloudcast, lambda x: x['slug'], compat_str)
|
||||||
|
owner_username = try_get(cloudcast, lambda x: x['owner']['username'], compat_str)
|
||||||
|
video_id = '%s_%s' % (owner_username, slug) if slug and owner_username else None
|
||||||
entries.append(self.url_result(
|
entries.append(self.url_result(
|
||||||
cloudcast_url, MixcloudIE.ie_key(), cloudcast.get('slug')))
|
cloudcast_url, MixcloudIE.ie_key(), video_id))
|
||||||
|
|
||||||
page_info = items['pageInfo']
|
page_info = items['pageInfo']
|
||||||
has_next_page = page_info['hasNextPage']
|
has_next_page = page_info['hasNextPage']
|
||||||
@ -321,7 +324,8 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
|
|||||||
_DESCRIPTION_KEY = 'biog'
|
_DESCRIPTION_KEY = 'biog'
|
||||||
_ROOT_TYPE = 'user'
|
_ROOT_TYPE = 'user'
|
||||||
_NODE_TEMPLATE = '''slug
|
_NODE_TEMPLATE = '''slug
|
||||||
url'''
|
url
|
||||||
|
owner { username }'''
|
||||||
|
|
||||||
def _get_playlist_title(self, title, slug):
|
def _get_playlist_title(self, title, slug):
|
||||||
return '%s (%s)' % (title, slug)
|
return '%s (%s)' % (title, slug)
|
||||||
@ -345,6 +349,7 @@ class MixcloudPlaylistIE(MixcloudPlaylistBaseIE):
|
|||||||
_NODE_TEMPLATE = '''cloudcast {
|
_NODE_TEMPLATE = '''cloudcast {
|
||||||
slug
|
slug
|
||||||
url
|
url
|
||||||
|
owner { username }
|
||||||
}'''
|
}'''
|
||||||
|
|
||||||
def _get_cloudcast(self, node):
|
def _get_cloudcast(self, node):
|
||||||
|
@ -61,6 +61,23 @@ class MotherlessIE(InfoExtractor):
|
|||||||
# no keywords
|
# no keywords
|
||||||
'url': 'http://motherless.com/8B4BBC1',
|
'url': 'http://motherless.com/8B4BBC1',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
# see https://motherless.com/videos/recent for recent videos with
|
||||||
|
# uploaded date in "ago" format
|
||||||
|
'url': 'https://motherless.com/3C3E2CF',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '3C3E2CF',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'a/ Hot Teens',
|
||||||
|
'categories': list,
|
||||||
|
'upload_date': '20210104',
|
||||||
|
'uploader_id': 'yonbiw',
|
||||||
|
'thumbnail': r're:https?://.*\.jpg',
|
||||||
|
'age_limit': 18,
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
@ -85,20 +102,28 @@ class MotherlessIE(InfoExtractor):
|
|||||||
or 'http://cdn4.videos.motherlessmedia.com/videos/%s.mp4?fs=opencloud' % video_id)
|
or 'http://cdn4.videos.motherlessmedia.com/videos/%s.mp4?fs=opencloud' % video_id)
|
||||||
age_limit = self._rta_search(webpage)
|
age_limit = self._rta_search(webpage)
|
||||||
view_count = str_to_int(self._html_search_regex(
|
view_count = str_to_int(self._html_search_regex(
|
||||||
(r'>(\d+)\s+Views<', r'<strong>Views</strong>\s+([^<]+)<'),
|
(r'>([\d,.]+)\s+Views<', r'<strong>Views</strong>\s+([^<]+)<'),
|
||||||
webpage, 'view count', fatal=False))
|
webpage, 'view count', fatal=False))
|
||||||
like_count = str_to_int(self._html_search_regex(
|
like_count = str_to_int(self._html_search_regex(
|
||||||
(r'>(\d+)\s+Favorites<', r'<strong>Favorited</strong>\s+([^<]+)<'),
|
(r'>([\d,.]+)\s+Favorites<',
|
||||||
|
r'<strong>Favorited</strong>\s+([^<]+)<'),
|
||||||
webpage, 'like count', fatal=False))
|
webpage, 'like count', fatal=False))
|
||||||
|
|
||||||
upload_date = self._html_search_regex(
|
upload_date = unified_strdate(self._search_regex(
|
||||||
(r'class=["\']count[^>]+>(\d+\s+[a-zA-Z]{3}\s+\d{4})<',
|
r'class=["\']count[^>]+>(\d+\s+[a-zA-Z]{3}\s+\d{4})<', webpage,
|
||||||
r'<strong>Uploaded</strong>\s+([^<]+)<'), webpage, 'upload date')
|
'upload date', default=None))
|
||||||
if 'Ago' in upload_date:
|
if not upload_date:
|
||||||
days = int(re.search(r'([0-9]+)', upload_date).group(1))
|
uploaded_ago = self._search_regex(
|
||||||
upload_date = (datetime.datetime.now() - datetime.timedelta(days=days)).strftime('%Y%m%d')
|
r'>\s*(\d+[hd])\s+[aA]go\b', webpage, 'uploaded ago',
|
||||||
else:
|
default=None)
|
||||||
upload_date = unified_strdate(upload_date)
|
if uploaded_ago:
|
||||||
|
delta = int(uploaded_ago[:-1])
|
||||||
|
_AGO_UNITS = {
|
||||||
|
'h': 'hours',
|
||||||
|
'd': 'days',
|
||||||
|
}
|
||||||
|
kwargs = {_AGO_UNITS.get(uploaded_ago[-1]): delta}
|
||||||
|
upload_date = (datetime.datetime.utcnow() - datetime.timedelta(**kwargs)).strftime('%Y%m%d')
|
||||||
|
|
||||||
comment_count = webpage.count('class="media-comment-contents"')
|
comment_count = webpage.count('class="media-comment-contents"')
|
||||||
uploader_id = self._html_search_regex(
|
uploader_id = self._html_search_regex(
|
||||||
|
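The new Motherless fallback above converts a relative "2d Ago" or "12h ago" label into an upload date. A small sketch of that conversion, assuming the same regex and unit table as the diff; the `upload_date_from_ago` helper and its `now` parameter are hypothetical, added so the result does not depend on the current time.

```python
import datetime
import re

# Unit table as in the diff: suffix letter -> timedelta keyword.
_AGO_UNITS = {
    'h': 'hours',
    'd': 'days',
}


def upload_date_from_ago(webpage, now):
    """Hypothetical helper: return YYYYMMDD derived from an 'N[hd] ago' label, or None."""
    mobj = re.search(r'>\s*(\d+[hd])\s+[aA]go\b', webpage)
    if not mobj:
        return None
    uploaded_ago = mobj.group(1)
    delta = int(uploaded_ago[:-1])
    kwargs = {_AGO_UNITS[uploaded_ago[-1]]: delta}
    return (now - datetime.timedelta(**kwargs)).strftime('%Y%m%d')


# Fixed reference time for a reproducible demo (the extractor uses utcnow()).
now = datetime.datetime(2021, 1, 6, 10, 0)
print(upload_date_from_ago('<span>2d Ago</span>', now))
print(upload_date_from_ago('<span>12h ago</span>', now))
```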
@ -253,6 +253,10 @@ class MTVServicesInfoExtractor(InfoExtractor):
|
|||||||
|
|
||||||
return try_get(feed, lambda x: x['result']['data']['id'], compat_str)
|
return try_get(feed, lambda x: x['result']['data']['id'], compat_str)
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _extract_child_with_type(parent, t):
|
||||||
|
return next(c for c in parent['children'] if c.get('type') == t)
|
||||||
|
|
||||||
def _extract_mgid(self, webpage):
|
def _extract_mgid(self, webpage):
|
||||||
try:
|
try:
|
||||||
# the url can be http://media.mtvnservices.com/fb/{mgid}.swf
|
# the url can be http://media.mtvnservices.com/fb/{mgid}.swf
|
||||||
@ -278,6 +282,13 @@ class MTVServicesInfoExtractor(InfoExtractor):
|
|||||||
if not mgid:
|
if not mgid:
|
||||||
mgid = self._extract_triforce_mgid(webpage)
|
mgid = self._extract_triforce_mgid(webpage)
|
||||||
|
|
||||||
|
if not mgid:
|
||||||
|
data = self._parse_json(self._search_regex(
|
||||||
|
r'__DATA__\s*=\s*({.+?});', webpage, 'data'), None)
|
||||||
|
main_container = self._extract_child_with_type(data, 'MainContainer')
|
||||||
|
video_player = self._extract_child_with_type(main_container, 'VideoPlayer')
|
||||||
|
mgid = video_player['props']['media']['video']['config']['uri']
|
||||||
|
|
||||||
return mgid
|
return mgid
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
@ -349,18 +360,6 @@ class MTVIE(MTVServicesInfoExtractor):
|
|||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def extract_child_with_type(parent, t):
|
|
||||||
children = parent['children']
|
|
||||||
return next(c for c in children if c.get('type') == t)
|
|
||||||
|
|
||||||
def _extract_mgid(self, webpage):
|
|
||||||
data = self._parse_json(self._search_regex(
|
|
||||||
r'__DATA__\s*=\s*({.+?});', webpage, 'data'), None)
|
|
||||||
main_container = self.extract_child_with_type(data, 'MainContainer')
|
|
||||||
video_player = self.extract_child_with_type(main_container, 'VideoPlayer')
|
|
||||||
return video_player['props']['media']['video']['config']['uri']
|
|
||||||
|
|
||||||
|
|
||||||
class MTVJapanIE(MTVServicesInfoExtractor):
|
class MTVJapanIE(MTVServicesInfoExtractor):
|
||||||
IE_NAME = 'mtvjapan'
|
IE_NAME = 'mtvjapan'
|
||||||
|
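The `_extract_child_with_type` helper moved into MTVServicesInfoExtractor above walks a nested `children` tree by type. A minimal sketch of how the `__DATA__` fallback uses it; the sample `data` dict and the mgid value are invented for illustration and do not come from a real page.

```python
def extract_child_with_type(parent, t):
    """Return the first child of the given type; raises StopIteration if none exists."""
    return next(c for c in parent['children'] if c.get('type') == t)


# Hypothetical page data shaped like the keys the diff accesses.
data = {
    'children': [
        {'type': 'Header'},
        {
            'type': 'MainContainer',
            'children': [
                {'type': 'Footer'},
                {
                    'type': 'VideoPlayer',
                    'props': {'media': {'video': {'config': {
                        'uri': 'mgid:example:video:123',  # made-up value
                    }}}},
                },
            ],
        },
    ],
}

main_container = extract_child_with_type(data, 'MainContainer')
video_player = extract_child_with_type(main_container, 'VideoPlayer')
mgid = video_player['props']['media']['video']['config']['uri']
print(mgid)
```

Because `next()` is called without a default, a page that lacks either container makes the lookup raise rather than return None.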
@ -5,33 +5,137 @@ import re
|
|||||||
|
|
||||||
from .turner import TurnerBaseIE
|
from .turner import TurnerBaseIE
|
||||||
from ..compat import (
|
from ..compat import (
|
||||||
compat_urllib_parse_urlencode,
|
compat_parse_qs,
|
||||||
compat_urlparse,
|
compat_str,
|
||||||
|
compat_urllib_parse_unquote,
|
||||||
|
compat_urllib_parse_urlparse,
|
||||||
)
|
)
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
|
int_or_none,
|
||||||
|
merge_dicts,
|
||||||
OnDemandPagedList,
|
OnDemandPagedList,
|
||||||
remove_start,
|
parse_duration,
|
||||||
|
parse_iso8601,
|
||||||
|
try_get,
|
||||||
|
update_url_query,
|
||||||
|
urljoin,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
class NBAIE(TurnerBaseIE):
|
class NBACVPBaseIE(TurnerBaseIE):
|
||||||
_VALID_URL = r'https?://(?:watch\.|www\.)?nba\.com/(?P<path>(?:[^/]+/)+(?P<id>[^?]*?))/?(?:/index\.html)?(?:\?.*)?$'
|
def _extract_nba_cvp_info(self, path, video_id, fatal=False):
|
||||||
|
return self._extract_cvp_info(
|
||||||
|
'http://secure.nba.com/%s' % path, video_id, {
|
||||||
|
'default': {
|
||||||
|
'media_src': 'http://nba.cdn.turner.com/nba/big',
|
||||||
|
},
|
||||||
|
'm3u8': {
|
||||||
|
'media_src': 'http://nbavod-f.akamaihd.net',
|
||||||
|
},
|
||||||
|
}, fatal=fatal)
|
||||||
|
|
||||||
|
|
||||||
|
class NBAWatchBaseIE(NBACVPBaseIE):
|
||||||
|
_VALID_URL_BASE = r'https?://(?:(?:www\.)?nba\.com(?:/watch)?|watch\.nba\.com)/'
|
||||||
|
|
||||||
|
def _extract_video(self, filter_key, filter_value):
|
||||||
|
video = self._download_json(
|
||||||
|
'https://neulionscnbav2-a.akamaihd.net/solr/nbad_program/usersearch',
|
||||||
|
filter_value, query={
|
||||||
|
'fl': 'description,image,name,pid,releaseDate,runtime,tags,seoName',
|
||||||
|
'q': filter_key + ':' + filter_value,
|
||||||
|
'wt': 'json',
|
||||||
|
})['response']['docs'][0]
|
||||||
|
|
||||||
|
video_id = str(video['pid'])
|
||||||
|
title = video['name']
|
||||||
|
|
||||||
|
formats = []
|
||||||
|
m3u8_url = (self._download_json(
|
||||||
|
'https://watch.nba.com/service/publishpoint', video_id, query={
|
||||||
|
'type': 'video',
|
||||||
|
'format': 'json',
|
||||||
|
'id': video_id,
|
||||||
|
}, headers={
|
||||||
|
'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_0_1 like Mac OS X) AppleWebKit/604.1.38 (KHTML, like Gecko) Version/11.0 Mobile/15A402 Safari/604.1',
|
||||||
|
}, fatal=False) or {}).get('path')
|
||||||
|
if m3u8_url:
|
||||||
|
m3u8_formats = self._extract_m3u8_formats(
|
||||||
|
re.sub(r'_(?:pc|iphone)\.', '.', m3u8_url), video_id, 'mp4',
|
||||||
|
'm3u8_native', m3u8_id='hls', fatal=False)
|
||||||
|
formats.extend(m3u8_formats)
|
||||||
|
for f in m3u8_formats:
|
||||||
|
http_f = f.copy()
|
||||||
|
http_f.update({
|
||||||
|
'format_id': http_f['format_id'].replace('hls-', 'http-'),
|
||||||
|
'protocol': 'http',
|
||||||
|
'url': http_f['url'].replace('.m3u8', ''),
|
||||||
|
})
|
||||||
|
formats.append(http_f)
|
||||||
|
|
||||||
|
info = {
|
||||||
|
'id': video_id,
|
||||||
|
'title': title,
|
||||||
|
'thumbnail': urljoin('https://nbadsdmt.akamaized.net/media/nba/nba/thumbs/', video.get('image')),
|
||||||
|
'description': video.get('description'),
|
||||||
|
'duration': int_or_none(video.get('runtime')),
|
||||||
|
'timestamp': parse_iso8601(video.get('releaseDate')),
|
||||||
|
'tags': video.get('tags'),
|
||||||
|
}
|
||||||
|
|
||||||
|
seo_name = video.get('seoName')
|
||||||
|
if seo_name and re.search(r'\d{4}/\d{2}/\d{2}/', seo_name):
|
||||||
|
base_path = ''
|
||||||
|
if seo_name.startswith('teams/'):
|
||||||
|
base_path += seo_name.split('/')[1] + '/'
|
||||||
|
base_path += 'video/'
|
||||||
|
cvp_info = self._extract_nba_cvp_info(
|
||||||
|
base_path + seo_name + '.xml', video_id, False)
|
||||||
|
if cvp_info:
|
||||||
|
formats.extend(cvp_info['formats'])
|
||||||
|
info = merge_dicts(info, cvp_info)
|
||||||
|
|
||||||
|
self._sort_formats(formats)
|
||||||
|
info['formats'] = formats
|
||||||
|
return info
|
||||||
|
|
||||||
|
|
||||||
|
class NBAWatchEmbedIE(NBAWatchBaseIE):
|
||||||
|
IE_NAME = 'nba:watch:embed'
|
||||||
|
_VALID_URL = NBAWatchBaseIE._VALID_URL_BASE + r'embed\?.*?\bid=(?P<id>\d+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'http://watch.nba.com/embed?id=659395',
|
||||||
|
'md5': 'b7e3f9946595f4ca0a13903ce5edd120',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '659395',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Mix clip: More than 7 points of Joe Ingles, Luc Mbah a Moute, Blake Griffin and 6 more in Utah Jazz vs. the Clippers, 4/15/2017',
|
||||||
|
'description': 'Mix clip: More than 7 points of Joe Ingles, Luc Mbah a Moute, Blake Griffin and 6 more in Utah Jazz vs. the Clippers, 4/15/2017',
|
||||||
|
'timestamp': 1492228800,
|
||||||
|
'upload_date': '20170415',
|
||||||
|
},
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
video_id = self._match_id(url)
|
||||||
|
return self._extract_video('pid', video_id)
|
||||||
|
|
||||||
|
|
||||||
|
class NBAWatchIE(NBAWatchBaseIE):
|
||||||
|
IE_NAME = 'nba:watch'
|
||||||
|
_VALID_URL = NBAWatchBaseIE._VALID_URL_BASE + r'(?:nba/)?video/(?P<id>.+?(?=/index\.html)|(?:[^/]+/)*[^/?#&]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.nba.com/video/games/nets/2012/12/04/0021200253-okc-bkn-recap.nba/index.html',
|
'url': 'http://www.nba.com/video/games/nets/2012/12/04/0021200253-okc-bkn-recap.nba/index.html',
|
||||||
'md5': '9e7729d3010a9c71506fd1248f74e4f4',
|
'md5': '9d902940d2a127af3f7f9d2f3dc79c96',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '0021200253-okc-bkn-recap',
|
'id': '70946',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Thunder vs. Nets',
|
'title': 'Thunder vs. Nets',
|
||||||
'description': 'Kevin Durant scores 32 points and dishes out six assists as the Thunder beat the Nets in Brooklyn.',
|
'description': 'Kevin Durant scores 32 points and dishes out six assists as the Thunder beat the Nets in Brooklyn.',
|
||||||
'duration': 181,
|
'duration': 181,
|
||||||
'timestamp': 1354638466,
|
'timestamp': 1354597200,
|
||||||
'upload_date': '20121204',
|
'upload_date': '20121204',
|
||||||
},
|
},
|
||||||
'params': {
|
|
||||||
# m3u8 download
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.nba.com/video/games/hornets/2014/12/05/0021400276-nyk-cha-play5.nba/',
|
'url': 'http://www.nba.com/video/games/hornets/2014/12/05/0021400276-nyk-cha-play5.nba/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -39,116 +143,286 @@ class NBAIE(TurnerBaseIE):
|
|||||||
'url': 'http://watch.nba.com/video/channels/playoffs/2015/05/20/0041400301-cle-atl-recap.nba',
|
'url': 'http://watch.nba.com/video/channels/playoffs/2015/05/20/0041400301-cle-atl-recap.nba',
|
||||||
'md5': 'b2b39b81cf28615ae0c3360a3f9668c4',
|
'md5': 'b2b39b81cf28615ae0c3360a3f9668c4',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'channels/playoffs/2015/05/20/0041400301-cle-atl-recap.nba',
|
'id': '330865',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Hawks vs. Cavaliers Game 1',
|
'title': 'Hawks vs. Cavaliers Game 1',
|
||||||
'description': 'md5:8094c3498d35a9bd6b1a8c396a071b4d',
|
'description': 'md5:8094c3498d35a9bd6b1a8c396a071b4d',
|
||||||
'duration': 228,
|
'duration': 228,
|
||||||
'timestamp': 1432134543,
|
'timestamp': 1432094400,
|
||||||
'upload_date': '20150520',
|
'upload_date': '20150521',
|
||||||
},
|
|
||||||
'expected_warnings': ['Unable to download f4m manifest'],
|
|
||||||
}, {
|
|
||||||
'url': 'http://www.nba.com/clippers/news/doc-rivers-were-not-trading-blake',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'teams/clippers/2016/02/17/1455672027478-Doc_Feb16_720.mov-297324',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'Practice: Doc Rivers - 2/16/16',
|
|
||||||
'description': 'Head Coach Doc Rivers addresses the media following practice.',
|
|
||||||
'upload_date': '20160216',
|
|
||||||
'timestamp': 1455672000,
|
|
||||||
},
|
|
||||||
'params': {
|
|
||||||
# m3u8 download
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
'expected_warnings': ['Unable to download f4m manifest'],
|
|
||||||
}, {
|
|
||||||
'url': 'http://www.nba.com/timberwolves/wiggins-shootaround#',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'timberwolves',
|
|
||||||
'title': 'Shootaround Access - Dec. 12 | Andrew Wiggins',
|
|
||||||
},
|
|
||||||
'playlist_count': 30,
|
|
||||||
'params': {
|
|
||||||
# Download the whole playlist takes too long time
|
|
||||||
'playlist_items': '1-30',
|
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.nba.com/timberwolves/wiggins-shootaround#',
|
'url': 'http://watch.nba.com/nba/video/channels/nba_tv/2015/06/11/YT_go_big_go_home_Game4_061115',
|
||||||
'info_dict': {
|
'only_matching': True,
|
||||||
'id': 'teams/timberwolves/2014/12/12/Wigginsmp4-3462601',
|
}, {
|
||||||
'ext': 'mp4',
|
# only CVP mp4 format available
|
||||||
'title': 'Shootaround Access - Dec. 12 | Andrew Wiggins',
|
'url': 'https://watch.nba.com/video/teams/cavaliers/2012/10/15/sloan121015mov-2249106',
|
||||||
'description': 'Wolves rookie Andrew Wiggins addresses the media after Friday\'s shootaround.',
|
'only_matching': True,
|
||||||
'upload_date': '20141212',
|
}, {
|
||||||
'timestamp': 1418418600,
|
'url': 'https://watch.nba.com/video/top-100-dunks-from-the-2019-20-season?plsrc=nba&collection=2019-20-season-highlights',
|
||||||
},
|
'only_matching': True,
|
||||||
'params': {
|
|
||||||
'noplaylist': True,
|
|
||||||
# m3u8 download
|
|
||||||
'skip_download': True,
|
|
||||||
},
|
|
||||||
'expected_warnings': ['Unable to download f4m manifest'],
|
|
||||||
}]
|
}]
|
||||||
|
|
||||||
_PAGE_SIZE = 30
|
def _real_extract(self, url):
|
||||||
|
display_id = self._match_id(url)
|
||||||
|
collection_id = compat_parse_qs(compat_urllib_parse_urlparse(url).query).get('collection', [None])[0]
|
||||||
|
if collection_id:
|
||||||
|
if self._downloader.params.get('noplaylist'):
|
||||||
|
self.to_screen('Downloading just video %s because of --no-playlist' % display_id)
|
||||||
|
else:
|
||||||
|
self.to_screen('Downloading playlist %s - add --no-playlist to just download video' % collection_id)
|
||||||
|
return self.url_result(
|
||||||
|
'https://www.nba.com/watch/list/collection/' + collection_id,
|
||||||
|
NBAWatchCollectionIE.ie_key(), collection_id)
|
||||||
|
return self._extract_video('seoName', display_id)
|
||||||
|
|
||||||
def _fetch_page(self, team, video_id, page):
|
|
||||||
search_url = 'http://searchapp2.nba.com/nba-search/query.jsp?' + compat_urllib_parse_urlencode({
|
|
||||||
'type': 'teamvideo',
|
|
||||||
'start': page * self._PAGE_SIZE + 1,
|
|
||||||
'npp': (page + 1) * self._PAGE_SIZE + 1,
|
|
||||||
'sort': 'recent',
|
|
||||||
'output': 'json',
|
|
||||||
'site': team,
|
|
||||||
})
|
|
||||||
results = self._download_json(
|
|
||||||
search_url, video_id, note='Download page %d of playlist data' % page)['results'][0]
|
|
||||||
for item in results:
|
|
||||||
yield self.url_result(compat_urlparse.urljoin('http://www.nba.com/', item['url']))
|
|
||||||
|
|
||||||
def _extract_playlist(self, orig_path, video_id, webpage):
|
class NBAWatchCollectionIE(NBAWatchBaseIE):
|
||||||
team = orig_path.split('/')[0]
|
IE_NAME = 'nba:watch:collection'
|
||||||
|
_VALID_URL = NBAWatchBaseIE._VALID_URL_BASE + r'list/collection/(?P<id>[^/?#&]+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://watch.nba.com/list/collection/season-preview-2020',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'season-preview-2020',
|
||||||
|
},
|
||||||
|
'playlist_mincount': 43,
|
||||||
|
}]
|
||||||
|
_PAGE_SIZE = 100
|
||||||
|
|
||||||
if self._downloader.params.get('noplaylist'):
|
def _fetch_page(self, collection_id, page):
|
||||||
self.to_screen('Downloading just video because of --no-playlist')
|
page += 1
|
||||||
video_path = self._search_regex(
|
videos = self._download_json(
|
||||||
r'nbaVideoCore\.firstVideo\s*=\s*\'([^\']+)\';', webpage, 'video path')
|
'https://content-api-prod.nba.com/public/1/endeavor/video-list/collection/' + collection_id,
|
||||||
video_url = 'http://www.nba.com/%s/video/%s' % (team, video_path)
|
collection_id, 'Downloading page %d JSON metadata' % page, query={
|
||||||
return self.url_result(video_url)
|
'count': self._PAGE_SIZE,
|
||||||
|
'page': page,
|
||||||
self.to_screen('Downloading playlist - add --no-playlist to just download video')
|
})['results']['videos']
|
||||||
playlist_title = self._og_search_title(webpage, fatal=False)
|
for video in videos:
|
||||||
entries = OnDemandPagedList(
|
program = video.get('program') or {}
|
||||||
functools.partial(self._fetch_page, team, video_id),
|
seo_name = program.get('seoName') or program.get('slug')
|
||||||
self._PAGE_SIZE)
|
if not seo_name:
|
||||||
|
continue
|
||||||
return self.playlist_result(entries, team, playlist_title)
|
yield {
|
||||||
|
'_type': 'url',
|
||||||
|
'id': program.get('id'),
|
||||||
|
'title': program.get('title') or video.get('title'),
|
||||||
|
'url': 'https://www.nba.com/watch/video/' + seo_name,
|
||||||
|
'thumbnail': video.get('image'),
|
||||||
|
'description': program.get('description') or video.get('description'),
|
||||||
|
'duration': parse_duration(program.get('runtimeHours')),
|
||||||
|
'timestamp': parse_iso8601(video.get('releaseDate')),
|
||||||
|
}
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
path, video_id = re.match(self._VALID_URL, url).groups()
|
collection_id = self._match_id(url)
|
||||||
orig_path = path
|
entries = OnDemandPagedList(
|
||||||
if path.startswith('nba/'):
|
functools.partial(self._fetch_page, collection_id),
|
||||||
path = path[3:]
|
self._PAGE_SIZE)
|
||||||
|
return self.playlist_result(entries, collection_id)
|
||||||
|
|
||||||
if 'video/' not in path:
|
|
||||||
webpage = self._download_webpage(url, video_id)
|
|
||||||
path = remove_start(self._search_regex(r'data-videoid="([^"]+)"', webpage, 'video id'), '/')
|
|
||||||
|
|
||||||
if path == '{{id}}':
|
class NBABaseIE(NBACVPBaseIE):
|
||||||
return self._extract_playlist(orig_path, video_id, webpage)
|
_VALID_URL_BASE = r'''(?x)
|
||||||
|
https?://(?:www\.)?nba\.com/
|
||||||
|
(?P<team>
|
||||||
|
blazers|
|
||||||
|
bucks|
|
||||||
|
bulls|
|
||||||
|
cavaliers|
|
||||||
|
celtics|
|
||||||
|
clippers|
|
||||||
|
grizzlies|
|
||||||
|
hawks|
|
||||||
|
heat|
|
||||||
|
hornets|
|
||||||
|
jazz|
|
||||||
|
kings|
|
||||||
|
knicks|
|
||||||
|
lakers|
|
||||||
|
magic|
|
||||||
|
mavericks|
|
||||||
|
nets|
|
||||||
|
nuggets|
|
||||||
|
pacers|
|
||||||
|
pelicans|
|
||||||
|
pistons|
|
||||||
|
raptors|
|
||||||
|
rockets|
|
||||||
|
sixers|
|
||||||
|
spurs|
|
||||||
|
suns|
|
||||||
|
thunder|
|
||||||
|
timberwolves|
|
||||||
|
warriors|
|
||||||
|
wizards
|
||||||
|
)
|
||||||
|
(?:/play\#)?/'''
|
||||||
|
_CHANNEL_PATH_REGEX = r'video/channel|series'
|
||||||
|
|
||||||
# See prepareContentId() of pkgCvp.js
|
def _embed_url_result(self, team, content_id):
|
||||||
if path.startswith('video/teams'):
|
return self.url_result(update_url_query(
|
||||||
path = 'video/channels/proxy/' + path[6:]
|
'https://secure.nba.com/assets/amp/include/video/iframe.html', {
|
||||||
|
'contentId': content_id,
|
||||||
|
'team': team,
|
||||||
|
}), NBAEmbedIE.ie_key())
|
||||||
|
|
||||||
return self._extract_cvp_info(
|
def _call_api(self, team, content_id, query, resource):
|
||||||
'http://www.nba.com/%s.xml' % path, video_id, {
|
return self._download_json(
|
||||||
'default': {
|
'https://api.nba.net/2/%s/video,imported_video,wsc/' % team,
|
||||||
'media_src': 'http://nba.cdn.turner.com/nba/big',
|
content_id, 'Download %s JSON metadata' % resource,
|
||||||
},
|
query=query, headers={
|
||||||
'm3u8': {
|
'accessToken': 'internal|bb88df6b4c2244e78822812cecf1ee1b',
|
||||||
'media_src': 'http://nbavod-f.akamaihd.net',
|
})['response']['result']
|
||||||
},
|
|
||||||
|
def _extract_video(self, video, team, extract_all=True):
|
||||||
|
video_id = compat_str(video['nid'])
|
||||||
|
team = video['brand']
|
||||||
|
|
||||||
|
info = {
|
||||||
|
'id': video_id,
|
||||||
|
'title': video.get('title') or video.get('headline') or video['shortHeadline'],
|
||||||
|
'description': video.get('description'),
|
||||||
|
'timestamp': parse_iso8601(video.get('published')),
|
||||||
|
}
|
||||||
|
|
||||||
|
subtitles = {}
|
||||||
|
captions = try_get(video, lambda x: x['videoCaptions']['sidecars'], dict) or {}
|
||||||
|
for caption_url in captions.values():
|
||||||
|
subtitles.setdefault('en', []).append({'url': caption_url})
|
||||||
|
|
||||||
|
formats = []
|
||||||
|
mp4_url = video.get('mp4')
|
||||||
|
if mp4_url:
|
||||||
|
formats.append({
|
||||||
|
'url': mp4_url,
|
||||||
})
|
})
|
||||||
|
|
||||||
|
if extract_all:
|
||||||
|
source_url = video.get('videoSource')
|
||||||
|
if source_url and not source_url.startswith('s3://') and self._is_valid_url(source_url, video_id, 'source'):
|
||||||
|
formats.append({
|
||||||
|
'format_id': 'source',
|
||||||
|
'url': source_url,
|
||||||
|
'preference': 1,
|
||||||
|
})
|
||||||
|
|
||||||
|
m3u8_url = video.get('m3u8')
|
||||||
|
if m3u8_url:
|
||||||
|
if '.akamaihd.net/i/' in m3u8_url:
|
||||||
|
formats.extend(self._extract_akamai_formats(
|
||||||
|
m3u8_url, video_id, {'http': 'pmd.cdn.turner.com'}))
|
||||||
|
else:
|
||||||
|
formats.extend(self._extract_m3u8_formats(
|
||||||
|
m3u8_url, video_id, 'mp4',
|
||||||
|
'm3u8_native', m3u8_id='hls', fatal=False))
|
||||||
|
|
||||||
|
content_xml = video.get('contentXml')
|
||||||
|
if team and content_xml:
|
||||||
|
cvp_info = self._extract_nba_cvp_info(
|
||||||
|
team + content_xml, video_id, fatal=False)
|
||||||
|
if cvp_info:
|
||||||
|
formats.extend(cvp_info['formats'])
|
||||||
|
subtitles = self._merge_subtitles(subtitles, cvp_info['subtitles'])
|
||||||
|
info = merge_dicts(info, cvp_info)
|
||||||
|
|
||||||
|
self._sort_formats(formats)
|
||||||
|
else:
|
||||||
|
info.update(self._embed_url_result(team, video['videoId']))
|
||||||
|
|
||||||
|
info.update({
|
||||||
|
'formats': formats,
|
||||||
|
'subtitles': subtitles,
|
||||||
|
})
|
||||||
|
|
||||||
|
return info
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
team, display_id = re.match(self._VALID_URL, url).groups()
|
||||||
|
if '/play#/' in url:
|
||||||
|
display_id = compat_urllib_parse_unquote(display_id)
|
||||||
|
else:
|
||||||
|
webpage = self._download_webpage(url, display_id)
|
||||||
|
display_id = self._search_regex(
|
||||||
|
self._CONTENT_ID_REGEX + r'\s*:\s*"([^"]+)"', webpage, 'video id')
|
||||||
|
return self._extract_url_results(team, display_id)
|
||||||
|
|
||||||
|
|
||||||
|
class NBAEmbedIE(NBABaseIE):
|
||||||
|
IE_NAME = 'nba:embed'
|
||||||
|
_VALID_URL = r'https?://secure\.nba\.com/assets/amp/include/video/(?:topI|i)frame\.html\?.*?\bcontentId=(?P<id>[^?#&]+)'
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://secure.nba.com/assets/amp/include/video/topIframe.html?contentId=teams/bulls/2020/12/04/3478774/1607105587854-20201204_SCHEDULE_RELEASE_FINAL_DRUPAL-3478774&team=bulls&adFree=false&profile=71&videoPlayerName=TAMPCVP&baseUrl=&videoAdsection=nba.com_mobile_web_teamsites_chicagobulls&Env=',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://secure.nba.com/assets/amp/include/video/iframe.html?contentId=2016/10/29/0021600027boschaplay7&adFree=false&profile=71&team=&videoPlayerName=LAMPCVP',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
qs = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
|
||||||
|
content_id = qs['contentId'][0]
|
||||||
|
team = qs.get('team', [None])[0]
|
||||||
|
if not team:
|
||||||
|
return self.url_result(
|
||||||
|
'https://watch.nba.com/video/' + content_id, NBAWatchIE.ie_key())
|
||||||
|
video = self._call_api(team, content_id, {'videoid': content_id}, 'video')[0]
|
||||||
|
return self._extract_video(video, team)
|
||||||
|
|
||||||
|
|
||||||
|
class NBAIE(NBABaseIE):
|
||||||
|
IE_NAME = 'nba'
|
||||||
|
_VALID_URL = NBABaseIE._VALID_URL_BASE + '(?!%s)video/(?P<id>(?:[^/]+/)*[^/?#&]+)' % NBABaseIE._CHANNEL_PATH_REGEX
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://www.nba.com/bulls/video/teams/bulls/2020/12/04/3478774/1607105587854-20201204schedulereleasefinaldrupal-3478774',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '45039',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'AND WE BACK.',
|
||||||
|
'description': 'Part 1 of our 2020-21 schedule is here! Watch our games on NBC Sports Chicago.',
|
||||||
|
'duration': 94,
|
||||||
|
'timestamp': 1607112000,
|
||||||
|
'upload_date': '20201218',
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.nba.com/bucks/play#/video/teams%2Fbucks%2F2020%2F12%2F17%2F64860%2F1608252863446-Op_Dream_16x9-64860',
|
||||||
|
'only_matching': True,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.nba.com/bucks/play#/video/wsc%2Fteams%2F2787C911AA1ACD154B5377F7577CCC7134B2A4B0',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
_CONTENT_ID_REGEX = r'videoID'
|
||||||
|
|
||||||
|
def _extract_url_results(self, team, content_id):
|
||||||
|
return self._embed_url_result(team, content_id)
|
||||||
|
|
||||||
|
|
||||||
|
class NBAChannelIE(NBABaseIE):
|
||||||
|
IE_NAME = 'nba:channel'
|
||||||
|
_VALID_URL = NBABaseIE._VALID_URL_BASE + '(?:%s)/(?P<id>[^/?#&]+)' % NBABaseIE._CHANNEL_PATH_REGEX
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'https://www.nba.com/blazers/video/channel/summer_league',
|
||||||
|
'info_dict': {
|
||||||
|
'title': 'Summer League',
|
||||||
|
},
|
||||||
|
'playlist_mincount': 138,
|
||||||
|
}, {
|
||||||
|
'url': 'https://www.nba.com/bucks/play#/series/On%20This%20Date',
|
||||||
|
'only_matching': True,
|
||||||
|
}]
|
||||||
|
_CONTENT_ID_REGEX = r'videoSubCategory'
|
||||||
|
_PAGE_SIZE = 100
|
||||||
|
|
||||||
|
def _fetch_page(self, team, channel, page):
|
||||||
|
results = self._call_api(team, channel, {
|
||||||
|
'channels': channel,
|
||||||
|
'count': self._PAGE_SIZE,
|
||||||
|
'offset': page * self._PAGE_SIZE,
|
||||||
|
}, 'page %d' % (page + 1))
|
||||||
|
for video in results:
|
||||||
|
yield self._extract_video(video, team, False)
|
||||||
|
|
||||||
|
def _extract_url_results(self, team, content_id):
|
||||||
|
entries = OnDemandPagedList(
|
||||||
|
functools.partial(self._fetch_page, team, content_id),
|
||||||
|
self._PAGE_SIZE)
|
||||||
|
return self.playlist_result(entries, playlist_title=content_id)
|
||||||
|
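Note on `NBAWatchBaseIE._extract_video`: the loop that turns each HLS format into a progressive HTTP format can be sketched in isolation as below. This is a minimal sketch, not the extractor itself. The function name `derive_http_formats` and the sample data are hypothetical. The assumption that stripping `.m3u8` from a variant URL yields a downloadable file comes from the diff and is not verified here.

```python
def derive_http_formats(hls_formats):
    """Return progressive-HTTP variants of HLS format dicts.

    Assumption (taken from the diff, not verified against the CDN):
    dropping the '.m3u8' suffix from a variant URL yields a file that
    can be fetched directly over HTTP.
    """
    http_formats = []
    for f in hls_formats:
        # Shallow copy so the original HLS entry is left untouched.
        http_f = f.copy()
        http_f.update({
            'format_id': http_f['format_id'].replace('hls-', 'http-'),
            'protocol': 'http',
            'url': http_f['url'].replace('.m3u8', ''),
        })
        http_formats.append(http_f)
    return http_formats


# Hypothetical sample data, for illustration only.
sample = [{
    'format_id': 'hls-1600',
    'protocol': 'm3u8_native',
    'url': 'https://example.com/video_1600.mp4.m3u8',
    'height': 720,
}]
derived = derive_http_formats(sample)
```

Copying each dict before updating it keeps the HLS entries in the list unchanged, so both the `hls-*` and `http-*` variants can be offered side by side, as the extractor does with `formats.extend(...)` followed by `formats.append(http_f)`.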
Some files were not shown because too many files have changed in this diff