Bart Broere
ad64f3751e
Improve regex
...
Co-authored-by: Roy <git@rvsit.nl>
2024-03-14 13:34:33 +01:00
Bart Broere
d4250c8703
Merge branch 'ytdl-org:master' into fix-npo-support
2024-03-12 20:46:16 +01:00
Zizheng Guo
a96a45b2cd
[Vimeo] Improve config extraction ( #32742 )
...
* update for more robust json parsing
2024-03-12 11:44:13 +00:00
Bart Broere
58d7a00e3f
Resolve some of the pull request feedback
2024-03-11 14:14:38 +01:00
Bart Broere
4398f6832f
Fix zapp extractor
2024-03-11 13:40:23 +01:00
Bart Broere
1ca4e686a3
Add an MD5
2024-03-10 17:04:00 +01:00
Bart Broere
28624cfe09
Work work
2024-03-10 16:57:31 +01:00
Bart Broere
c08f29f45b
Update unit tests
2024-03-10 16:27:40 +01:00
hatsomatt
820fae3b3a
[Videa] Fix extraction
...
* update API URL
* from https://github.com/yt-dlp/yt-dlp/pull/8003
* thanks to the authors!
Closes yt-dlp/7427
Authored by: hatsomatt, aky-01
2024-03-08 13:14:52 +00:00
dirkf
aef24d97e9
[Videa] Align with yt-dlp
2024-03-08 13:14:52 +00:00
dirkf
f7b30e3f73
[XFileShare] Update extractor for 2024
...
* simplify aa_decode()
* review and update supported sites and tests
* in above, include FileMoon.sx, and remove separate module
* incorporate changes from yt-dlp
* allow for decoding multiple scripts (eg, FileMoon)
* use new JWPlayer extraction
2024-03-08 13:03:42 +00:00
dirkf
f66372403f
[InfoExtractor] Rework and improve JWPlayer extraction
...
* use traverse_obj() and _search_json()
* support playlist `.load({**video1},{**video2}, ...)`
* support transform_source=... for _extract_jwplayer_data()
2024-03-08 13:03:42 +00:00
dirkf
7216fa2ac4
[InfoExtractor] Add _search_json()
...
* uses the error diagnostic to truncate the JSON string
* may be confused by non-C-Pythons
2024-03-08 13:03:42 +00:00
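The truncation idea noted for _search_json() above can be sketched roughly as follows. This is an illustration only, not the extractor's code: loads_prefix is a hypothetical helper, and it assumes CPython's json module reports trailing text as "Extra data" together with the offset where the valid value ends, which is the behaviour the commit warns may differ on other interpreters.

```python
import json


def loads_prefix(text):
    """Parse the JSON value at the start of text, ignoring trailing junk.

    Hypothetical helper; relies on CPython's "Extra data" diagnostic.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        if err.msg != 'Extra data':
            # Some other syntax error: truncation would not help
            raise
        # err.pos is the offset at which the parser found the extra data,
        # so everything before it is a complete JSON value
        return json.loads(text[:err.pos])
```

For instance, a value followed by JavaScript, such as '{"a": 1}; var x = 2;', would be cut back to the object literal before the second parse.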
dirkf
acc383b9e3
[utils] Let int_or_none() accept a base, like int()
2024-03-08 13:03:42 +00:00
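As a rough sketch of what accepting a base "like int()" could look like: the function below is a simplified stand-in with a hypothetical name and a reduced parameter list, not the signature of the real utility.

```python
def int_or_none_sketch(value, default=None, base=None):
    """Convert value to int, returning default if that is not possible.

    Simplified, hypothetical stand-in: only default and base are modelled.
    """
    if value is None:
        return default
    try:
        # int() only accepts a base together with a string-like value
        return int(value, base) if base is not None else int(value)
    except (TypeError, ValueError):
        return default
```

With this sketch, a hexadecimal string such as 'ff' converts when base=16 is passed and falls back to the default otherwise.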
Bart Broere
0ab79c37ae
Reusable code for two NTR sites
2024-03-07 16:23:09 +01:00
Bart Broere
0cbcd1aec6
Make diff better
2024-03-06 12:55:51 +01:00
Bart Broere
159f825edd
Add scaffolding for last few extractors and change order so the PR diff looks nice
2024-03-06 12:53:37 +01:00
Bart Broere
681b39032a
Fix flake8 and better error reporting
2024-03-06 12:32:34 +01:00
Bart Broere
4b24e5f00d
Re-add SchoolTV
2024-03-06 12:22:27 +01:00
Bart Broere
3b3d73cbe6
Use program-detail endpoint and remove a test
2024-03-06 11:52:08 +01:00
Bart Broere
d426a92a60
Encoding suggestion from PR
2024-03-05 14:11:49 +01:00
Bart Broere
d36d50fe5c
Re-add Zapp
2024-03-05 14:04:03 +01:00
Bart Broere
eb6e396bfb
First version of a VPRO regex
2024-03-05 13:55:59 +01:00
Bart Broere
28ba01f1cc
Add Ongehoord Nederland and test URL for BNNVARA
2024-03-05 13:43:56 +01:00
Bart Broere
4fc423845e
Fix lint
2024-03-05 12:49:22 +01:00
Hubert Hirtz
f0812d7848
[utils] Handle user:pass in URLs ( #28801 )
...
* Handle user:pass in URLs
Fixes "nonnumeric port" errors when youtube-dl is given URLs with
usernames and passwords such as:
http://username:password@example.com/myvideo.mp4
Refs:
- https://en.wikipedia.org/wiki/Basic_access_authentication
- https://tools.ietf.org/html/rfc1738#section-3.1
- https://docs.python.org/3.8/library/urllib.parse.html#urllib.parse.urlsplit
Fixes #18276 (point 4)
Fixes #20258
Fixes #26211 (see comment)
* Align code with yt-dlp
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2024-03-04 01:27:55 +00:00
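The URL form in the commit above can be taken apart with urllib.parse.urlsplit, which the commit's references point to. The sketch below is an illustration under that assumption; split_credentials is a hypothetical helper and not the function added by the commit.

```python
from urllib.parse import urlsplit


def split_credentials(url):
    """Return (url_without_userinfo, username, password).

    Hypothetical helper built on urlsplit(); not youtube-dl's own code.
    """
    parts = urlsplit(url)
    host = parts.hostname or ''
    if ':' in host:
        # hostname strips the brackets from IPv6 literals; restore them
        host = '[%s]' % host
    if parts.port is not None:
        host = '%s:%d' % (host, parts.port)
    clean_url = parts._replace(netloc=host).geturl()
    return clean_url, parts.username, parts.password
```

Treating everything after the first ':' in the network location as the port is what would produce a "nonnumeric port" error for such URLs; reading the hostname and port attributes avoids that.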
Bart Broere
34b5b20107
Refactor into reusable method
2024-03-03 17:47:15 +01:00
Bart Broere
8b1a7d9a7c
Use provided util
2024-03-01 16:23:19 +01:00
Bart Broere
f9e59b0c49
Add the possibility to add 'hls' later
2024-03-01 15:28:14 +01:00
Bart Broere
fb7b7179ff
Speculate about other ways of getting productId
2024-03-01 15:08:10 +01:00
Bart Broere
0dc7d954cb
Comply with coding conventions a bit more
2024-03-01 15:05:30 +01:00
Bart Broere
21eb4513e0
Convert the description into code
2024-03-01 14:12:51 +01:00
Bart Broere
29724e7b07
Delete all broken extractors
...
Re-implementing these is quicker for the cases where that's even still possible
2024-03-01 13:24:48 +01:00
Bart Broere
577368116b
Fix token URL
2024-03-01 13:15:52 +01:00
Bart Broere
da3d1f4321
Add notes on new npo.nl site
2024-03-01 10:36:03 +01:00
Bart Broere
f76d58c71f
Skip a test
2024-02-26 13:18:36 +01:00
Bart Broere
c409a8c54b
Merge branch 'ytdl-org:master' into fix-npo-support
2024-02-25 09:42:26 +01:00
Aaron Tan
40bd5c1815
[caffeine.tv] Add new extractor ( #32514 )
...
* Add CaffeineTVIE info extractor to support site caffeine.tv
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2024-02-22 12:54:07 +00:00
dirkf
70f230f9cf
[GBNews] Add new extractor for GB News TV channel ( #29432 )
...
* Add extractor for GB News TV channel
* Support more GBNews URL formats
Allow alphanumeric and _ in place of `shows`, which redirect to site's preferred URL
* Update for 2024
2024-02-22 12:44:00 +00:00
dirkf
48ddab1f3a
[downloader/external] Fix WgetFD proxy (rev 2)
...
From PR (defunct source), closes #29343 .
Matches https://github.com/yt-dlp/yt-dlp/pull/3152
Thx former user kikuyan.
2024-02-21 16:29:08 +00:00
dirkf
7687389f08
[Vbox7] Improve extraction, adding features from yt-dlp PR #9100
...
* changes from https://github.com/yt-dlp/yt-dlp/pull/9100 (thx seproDev):
- attempt HLS extraction
- re-enable XFF
- test `view_count`, `duration` extraction
* improve commenting, error checks
2024-02-19 00:53:22 +00:00
dirkf
4416f82c80
[Vbox7IE] Sanitise ld+json containing unexpected characters
...
* based on PR #29680
* added hack to force invoking `transform_source`
* fixes #26218
2024-02-02 12:36:05 +00:00
dirkf
bdda6b81df
[Vbox7IE] Improve extraction
...
* DASH extraction no longer fails with new range support
* but always find combined formats if available
* suppress ineffective XFF geo-bypass (causes time-outs)
* adapted from https://github.com/ytdl-org/youtube-dl/pull/29680
* thx former GH user kikuyan
2024-02-02 12:36:05 +00:00
dirkf
1fd8f802b8
[InfoExtractor] Correctly resolve BaseURL in DASH manifest
...
Specs:
* ISO/IEC 23009-1:2012 section 5.6
* RFC 3986 section 5.
2024-02-02 12:36:05 +00:00
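Reference resolution as in RFC 3986 section 5 is what urllib.parse.urljoin implements, so nested BaseURL values can be sketched as a chain of joins. The helper name and the example URLs used with it are assumptions for illustration; the extractor's actual code is not shown in this log.

```python
from urllib.parse import urljoin


def resolve_base_urls(manifest_url, *base_urls):
    """Resolve BaseURL values from the outermost to the innermost level.

    Hypothetical helper: each BaseURL is resolved against the result of
    the previous level, starting from the manifest's own URL.
    """
    resolved = manifest_url
    for base_url in base_urls:
        resolved = urljoin(resolved, base_url)
    return resolved
```

A relative BaseURL extends the current base, while an absolute one replaces it, so a manifest can move its media to another host at any level.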
dirkf
4eaeb9b2c6
[InfoExtractor] Support byte range for DASH
...
* adapted from https://github.com/ytdl-org/youtube-dl/pull/30279
* thx former GH user kikuyan
2024-02-02 12:36:05 +00:00
dirkf
bec9180e89
[downloader/dash] Support range in fragment (format f'{start}-{end}')
...
* adapted from https://github.com/ytdl-org/youtube-dl/pull/30279
* thx former GH user kikuyan
2024-02-02 12:36:05 +00:00
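One plausible use of a range string in the '{start}-{end}' format is to turn it into an HTTP Range request header. The sketch below assumes exactly that; range_to_header is a hypothetical name, and how the downloader really consumes the value is not stated in the commit.

```python
def range_to_header(fragment_range):
    """Build a Range header from a 'start-end' byte range string.

    Hypothetical helper; an open-ended range such as '500-' is allowed.
    """
    start, sep, end = fragment_range.partition('-')
    if not sep or not start.isdigit() or (end and not end.isdigit()):
        raise ValueError('invalid byte range: %r' % (fragment_range,))
    return {'Range': 'bytes=%s-%s' % (start, end)}
```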
dirkf
c58b655a9e
[InfoExtractor] Support DASH subtitle extraction (yt-dlp back-port)
2024-02-02 12:36:05 +00:00
dirkf
dc512e3a8a
[YouTube] Fix like_count extraction using likeButtonViewModel
...
* also fix various tests
* TODO: check against yt-dlp tests
2024-01-22 11:10:34 +00:00
dirkf
f8b0135850
[YouTube] Rework n-sig processing, realigning with yt-dlp
...
* apply n-sig before chunked fragments, fixes #32692
2024-01-22 11:10:34 +00:00
dirkf
640d39f03a
[InfoExtractor] Support some warning and ._downloader shortcut methods from yt-dlp
2024-01-22 11:10:34 +00:00
dirkf
6651871416
[compat] Rework compat for method parameter of compat_urllib_request.Request constructor
...
* fixes #32573
* does not break `utils.HEADrequest` (eg)
2024-01-22 11:10:34 +00:00
mk-pmb
be008e657d
[core] Fix format string injection for metadata JSON filename message.
2023-12-06 02:45:41 +00:00
Robotix
b1bbc1e502
[Epidemic Sound] Add new extractor ( #32628 )
...
* Add simple extractor
* Support separate tracks
* Use index as id instead of slug
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-12-06 01:17:57 +00:00
dirkf
55a442adae
[Imgur] Overhaul extractor module ( #32612 )
...
Revise extractors for new API and page formats
2023-12-05 20:02:30 +00:00
mimvahedi
c62936a5f2
[telewebion] Fix extraction ( #32634 )
...
* [telewebion] fix extraction
Resolves https://github.com/ytdl-org/youtube-dl/issues/5135#issuecomment-932952119
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-12-02 15:25:09 +00:00
dirkf
427472351c
[utils] Make restricted filenames ignore characters in Unicode categories Mark, Other
...
Resolves #32629
2023-11-29 22:08:01 +00:00
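Unicode general categories can be read with unicodedata.category(), where Mark categories start with 'M' and Other categories start with 'C'. The following is a minimal sketch of dropping such characters; the helper is hypothetical and covers only this one step, not the rest of the restricted-filename logic.

```python
import unicodedata


def strip_mark_and_other(text):
    """Drop characters in the Mark (M*) and Other (C*) categories.

    Hypothetical helper showing only the category test.
    """
    return ''.join(
        ch for ch in text
        if unicodedata.category(ch)[0] not in ('M', 'C'))
```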
ReenigneArcher
b7fca0fab3
[Youtube] Update consent cookie handling to match site
...
Apologies for force push!
[skip ci]
2023-11-29 21:43:02 +00:00
dirkf
00ef748cc0
[downloader] Fix baa6c5e: show ETA of http download as ETA instead of total d/l time
2023-09-24 22:07:47 +01:00
dirkf
66ab0814c4
[utils] Revert bbd3e7e, updating docstring, test instead
2023-09-03 23:15:19 +01:00
dirkf
bbd3e7e999
[utils] Properly handle list values in update_url()
...
An actual list value in a query update could have been treated
as a list of values because of the key:list parse_qs format.
2023-09-03 01:18:22 +01:00
dirkf
31f50c8194
[S4C] Add thumbnail extraction, extract series as playlist
...
Based on https://github.com/yt-dlp/yt-dlp/pull/7776 : thx ifan-t, bashonly
2023-08-31 23:16:50 +01:00
dirkf
86e3cf5e58
[S4C] Add extractor for Sianel Pedwar Cymru
...
* from https://github.com/yt-dlp/yt-dlp/pull/7730 , thx ifan-t, bashonly
2023-08-04 22:54:12 +01:00
dirkf
2efc8de4d2
[utils] Advertise optional supported Content-Encodings
2023-08-01 01:05:09 +01:00
dirkf
e4178b5af3
[utils] Add and use filter_dict() from yt-dlp
2023-08-01 01:05:09 +01:00
dirkf
2d2a4bc832
[utils] Revise isinstance() tests (especially for str/unicode/bytes) to complete Linter fix
2023-08-01 01:05:09 +01:00
dirkf
7d965e6b65
[utils] Avoid comparing type(var), etc, to pass new Linter rules
2023-08-01 01:05:09 +01:00
dirkf
abef53466d
[utils] Rework URL path munging for ., .. components
...
* move processing to YoutubeDLHandler
* also process `Location` header for redirect
* use tests from https://github.com/yt-dlp/yt-dlp/pull/7662
2023-07-29 14:27:26 +01:00
dirkf
e7926ae9f4
[utils] Rework decoding of Content-Encodings
...
* support nested encodings
* support optional `br` encoding, if brotli package is installed
* support optional 'compress' encoding, if ncompress package is installed
* response `Content-Encoding` has only unprocessed encodings, or removed
* response `Content-Length` is decoded length (usable for filesize metadata)
* use zlib for both deflate and gzip decompression
* some elements taken from yt-dlp: thx especially coletdjnz
2023-07-29 14:27:26 +01:00
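The nested-encoding handling listed above can be sketched as follows, using zlib for both gzip and deflate as the commit describes. decode_body is a hypothetical helper: it covers only gzip and deflate, and it returns whatever encodings it could not process, which is an assumption about how the "only unprocessed encodings" point might be modelled.

```python
import zlib


def decode_body(data, content_encoding):
    """Undo nested Content-Encodings, returning (data, remaining_header).

    Hypothetical helper; br and compress are deliberately left out.
    """
    encodings = [e.strip().lower()
                 for e in content_encoding.split(',') if e.strip()]
    # The header lists encodings in the order applied, so undo the last first
    while encodings:
        encoding = encodings[-1]
        if encoding in ('gzip', 'x-gzip'):
            # 16 + MAX_WBITS makes zlib expect a gzip header and trailer
            data = zlib.decompress(data, zlib.MAX_WBITS | 16)
        elif encoding == 'deflate':
            try:
                data = zlib.decompress(data)
            except zlib.error:
                # Some servers send a raw deflate stream without zlib header
                data = zlib.decompress(data, -zlib.MAX_WBITS)
        else:
            # Unsupported encoding: stop and report what is left
            break
        encodings.pop()
    return data, ', '.join(encodings)
```

Stopping at the first unsupported encoding matters because any encoding listed before it was applied earlier and cannot be undone until the later one has been removed.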
dirkf
b870181229
[build] Extend use of devscripts/utils
2023-07-25 13:19:43 +01:00
dirkf
a25e9f3c84
[compat] Use compat_open()
2023-07-25 13:19:43 +01:00
dirkf
2b7dd3b2a2
[utils] Fix update_Request() with empty data (not None)
2023-07-25 13:19:43 +01:00
dirkf
7bce2ad441
[build] Fix various Jython CI and test issues
2023-07-25 13:19:43 +01:00
dirkf
1fa8b86f0b
[utils] Remove stray undocumented Host header in redirect (fix 46fde7c)
2023-07-20 05:29:59 +01:00
dirkf
b2ba24bb02
[InfoExtractor] Add _match_valid_url() class method and refactor
...
* API compatible with yt-dlp
* also support Sequence of patterns in _VALID_URL
* one place to compile _VALID_URL
* TODO: remove existing extractor shims
2023-07-19 22:14:50 +01:00
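A class method that accepts either a single pattern or a sequence of patterns, and compiles them in one place, might look roughly like this. The class, its patterns and the cache attribute name are all hypothetical; this is a sketch of the idea and not the InfoExtractor implementation.

```python
import re


class ExtractorSketch(object):
    # Hypothetical patterns: _VALID_URL may be a string or a sequence
    _VALID_URL = (
        r'https?://example\.com/v/(?P<id>\d+)',
        r'https?://example\.com/embed/(?P<id>\d+)',
    )

    @classmethod
    def _match_valid_url(cls, url):
        # Compile once per class; checking cls.__dict__ keeps a subclass
        # from reusing patterns compiled for its parent
        if '_compiled_patterns' not in cls.__dict__:
            patterns = cls._VALID_URL
            if isinstance(patterns, str):
                patterns = (patterns,)
            cls._compiled_patterns = tuple(re.compile(p) for p in patterns)
        for pattern in cls._compiled_patterns:
            match = pattern.match(url)
            if match:
                return match
        return None
```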
dirkf
a190b55964
[utils] Fix broken Py 3.11+ compat in traverse_obj()
...
* inspect.getargspec is missing despite doc claiming backward compat
* replace with emulation of `Signature.bind()`
2023-07-19 22:14:50 +01:00
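On Python 3, the check that Signature.bind() performs can be used directly, as in the sketch below. The commit speaks of an emulation, presumably because older interpreters lack inspect.signature; accepts_args is a hypothetical helper that shows only the behaviour being emulated.

```python
import inspect


def accepts_args(func, *args, **kwargs):
    """Return True if func could be called with the given arguments.

    Hypothetical helper: bind() raises TypeError on a mismatch and never
    actually calls func.
    """
    try:
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError:
        return False
    return True
```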
dirkf
b2741f2654
[InfoExtractor] Add search methods for Next/Nuxt.js from yt-dlp
...
* add _search_nextjs_data(), from https://github.com/yt-dlp/yt-dlp/pull/1386
thanks selfisekai
* add _search_nuxt_data(), from https://github.com/yt-dlp/yt-dlp/pull/1921 ,
thanks Lesmiscore, pukkandan
* add tests for the above
* also fix HTML5 type recognition and tests, from 222a230871, thanks Lesmiscore
* update extractors in PR using above, fix tests.
2023-07-19 22:14:50 +01:00
dirkf
8465222041
[Clipchamp] Add new extractor back-ported from yt-dlp
2023-07-19 22:14:50 +01:00
dirkf
4339910df3
[DLF] Add site extractors back-ported from yt-dlp
...
* from https://github.com/yt-dlp/yt-dlp/pull/6697 , thanks nick-cd
2023-07-19 22:14:50 +01:00
dirkf
eaaf4c6736
[Whyp] Add extractor back-ported from yt-dlp
...
* from https://github.com/yt-dlp/yt-dlp/pull/6803 , thanks CoryTibbettsDev
2023-07-19 22:14:50 +01:00
dirkf
4566e6e53e
[GlobalPlayer] Add site extractors back-ported from yt-dlp
...
* from https://github.com/yt-dlp/yt-dlp/pull/6903 , thanks garret1317
2023-07-19 22:14:50 +01:00
dirkf
1e8ccdd2eb
[InfoExtractor] Support groups in _search_regex(), etc
2023-07-19 22:14:50 +01:00
dirkf
cb9366eda5
[utils] Minor updates (merge_dicts, T)
...
A couple of mods to ease yt-dlp back-ports:
* add kwargs to merge_dicts:
`unblank=True` (disallow empty string), `rev=False` (reverse the merge list)
* add `T(x)` shortcut for `{x}`, unsupported in Py2.6
2023-07-19 22:14:50 +01:00
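Since the set literal `{x}` is a syntax error on Python 2.6, a one-element set has to be built with a call instead. A shortcut of the kind described could be as small as the following sketch; the body is an assumption about how such a helper might be written.

```python
def T(x):
    """Return the one-element set {x} without using set-literal syntax."""
    return set((x,))
```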
dirkf
d9d07a9581
[utils] Improve js_to_json, align with yt-dlp
...
* support variable substitution, from https://github.com/yt-dlp/yt-dlp/pull/#521 etc,
thanks ChillingPepper, Grub4k, pukkandan
* improve escape handling, from https://github.com/yt-dlp/yt-dlp/pull/#521
thanks Grub4k
* support template strings from https://github.com/yt-dlp/yt-dlp/pull/6623
thanks Grub4k
* add limited `!` evaluation (eg, !!0 -> false, see tests)
2023-07-19 22:14:50 +01:00
dirkf
825a40744b
[utils] Align traverse_obj() with yt-dlp
...
Thanks Grub4k for these:
* traverse `Iterable`s, from https://github.com/yt-dlp/yt-dlp/pull/6902 , etc
* traverse `set` key for transformations/filters, `re.Match` group names, from 776995bc10, etc
* traverse `re.Match`es, from https://github.com/yt-dlp/yt-dlp/pull/5174
* always return list when branching, from https://github.com/yt-dlp/yt-dlp/pull/5170
2023-07-19 22:14:50 +01:00
dirkf
47214e46d8
[compat] Fix old Pythons broken loading of valueless cookie attributes
...
Cookie string parsing in Py 2.6.9, probably earlier, requires `=`.
Also 3.2, though the CPython code appears to be OK: 3.1 was also wrong.
2023-07-18 10:50:46 +01:00
dirkf
1d8d5a93f7
[test] Fixes for old Pythons
2023-07-18 10:50:46 +01:00
dirkf
1634b1d61e
[doc] Warn against setting cookies with --add-header
2023-07-18 10:50:46 +01:00
bashonly
21438a4194
[downloader/external] Fix cookie support
2023-07-18 10:50:46 +01:00
Simon Sawicki
8334ec961b
[core] Process header cookies on loading
2023-07-18 10:50:46 +01:00
bashonly
3801d36416
[utils] YoutubeDLCookieJar: Add get_cookie_header and get_cookies_for_url methods
2023-07-18 10:50:46 +01:00
dirkf
b383be9887
[core] Remove Cookie header on redirect to prevent leaks
...
Adapted from yt-dlp/yt-dlp-ghsa-v8mc-9377-rwjj/pull/1/commits/101caac
Thx coletdjnz
2023-07-18 10:50:46 +01:00
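The leak being prevented is a Cookie header, set explicitly for the original request, being replayed to whatever host the redirect points at. A minimal sketch of the filtering step follows; headers_for_redirect is a hypothetical helper and the redirect handler's real logic is not shown in this log.

```python
def headers_for_redirect(headers):
    """Copy request headers for a redirected request, dropping Cookie.

    Hypothetical helper; header names are compared case-insensitively.
    """
    return dict(
        (name, value) for name, value in headers.items()
        if name.lower() != 'cookie')
```

Cookies that legitimately apply to the redirect target can still be supplied by a cookie jar, which matches them against the new URL.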
dirkf
46fde7caee
[core] Update redirect handling from yt-dlp
...
* Thx coletdjnz: https://github.com/yt-dlp/yt-dlp/pull/7094
* add test that redirected `POST` loses its `Content-Type`
2023-07-18 10:50:46 +01:00
dirkf
648dc5304c
[compat] Add Request and HTTPClient compat for redirect
...
* support `method` parameter of `Request.__init__` (Py 2 and old Py 3)
* support `getcode` method of compat_http_client.HTTPResponse (Py 2)
2023-07-18 10:50:46 +01:00
dirkf
d5ef405c5d
[core] Align error reporting methods with yt-dlp
2023-07-18 10:50:46 +01:00
dirkf
f47fdb9564
[utils] Add {expected_type} and Iterable support to traverse_obj()
2023-07-18 10:50:46 +01:00
dirkf
b6dff4073d
[core] Revert version display from b8a86dc
2023-07-18 10:50:46 +01:00
dirkf
f24bc9272e
[Misc] Fixes for 2.6 compatibility
2023-07-05 22:58:54 +01:00
dirkf
2500300c2a
[workflows/ci.yml] Restore test support for Py 3.2
2023-07-05 22:51:15 +01:00
dirkf
fa7f0effbe
[YouTube] Avoid crash in author extraction
2023-06-22 23:14:21 +01:00
pukkandan
9112e668a5
[YouTube] Improve nsig function name extraction
...
Fixes player b7910ca8, using `,` vs `;`
See https://github.com/ytdl-org/youtube-dl/issues/32292#issuecomment-1602231170
Co-authored-by: dirkf
2023-06-22 16:46:53 +01:00