Commit Graph

586 Commits

Author SHA1 Message Date
Sergey M․ c790e93ab5
[extractor/common] Clarify url and manifest_url meta fields 2019-03-05 00:41:53 +07:00
Sergey M․ 39c780fdec
[extractor/common] Return MPD manifest as format's url meta field (#20242)
For symmetry with other segmented media
2019-03-05 00:40:57 +07:00
Ales Jirasek 22f5f5c6fc
[malltv] Add extractor (closes #18058) 2019-02-08 00:43:26 +07:00
Sergey M․ 458fd30f56
[extractor/common] Extract season in _json_ld 2019-01-27 04:36:58 +07:00
Remita Amine 6945b9e78f [extractor/common] improve jwplayer relative url handling (closes #18892) 2019-01-20 13:31:52 +01:00
Remita Amine 379306ef55 [extractor/common] fix typo 2019-01-19 21:35:02 +01:00
Remita Amine 2bfc1d9d68 [extractor/common] improve HLS video only format detection (closes #18923) 2019-01-19 21:25:15 +01:00
Sergey M․ 440863ade1
[extractor/common] Use episode name as title in _json_ld 2019-01-08 10:02:49 +07:00
Sergey M․ 391256dc0e
[extractor/common] Add support for movies in _json_ld 2019-01-08 10:02:00 +07:00
Xiao Di Guan 95e42d7336 [extractor/common] Ensure response handle is not prematurely closed before it can be read if it matches expected_status (resolves #17195, closes #17846, resolves #17447) 2018-11-03 01:18:20 +07:00
Sergey M․ bebef10909
[extractor/common] Add validation for JSON-LD URLs 2018-10-29 00:21:45 +07:00
Sergey M․ 0e7b8d3eac
[extractor/common] Fix typos 2018-09-15 01:53:01 +07:00
Sergey M․ 6f1f59f39c
[extractor/common] Introduce channel meta fields 2018-09-15 01:23:36 +07:00
Remita Amine bd21ead2a2 [extractor/common] add support for DASH and MSS formats extraction in SMIL manifests 2018-07-18 18:34:04 +01:00
Sergey M․ 0685d9727b
[utils] Share JSON-LD regex 2018-07-09 23:43:05 +07:00
Sergey M․ eca1f0d115
[extractor/common] Properly escape % in MPD templates (closes #16867) 2018-07-01 02:11:36 +07:00
Sergey M․ 5e8e2fa51f
[extractor/common] Use source URL as Referer for HTML5 entries (closes #16849) 2018-06-29 01:25:05 +07:00
Sergey M․ d391b7e23d
[extractor/common] Introduce expected_status for convenient acceptance of failed HTTP requests
Useful when some non-success (non-2xx) HTTP status codes should be considered normal. Previously this required manually catching the corresponding exceptions and reading the response.
2018-06-18 04:54:08 +07:00
Sergey M․ 504f20dd30
Remove experimental mark for some options 2018-05-19 23:53:24 +07:00
Sergey M․ 5f95927a62
Improve geo bypass mechanism
* Introduce geo bypass context
* Add ability to bypass based on IP blocks in CIDR notation
* Introduce --geo-bypass-ip-block
2018-05-02 07:20:59 +07:00
Sergey M․ e7e4a6e0f9
[extractor/common] Extract interaction statistic 2018-04-28 02:48:03 +07:00
Sergey M․ 0fe7783ece
[extractor/common] Add _download_json_handle 2018-04-28 01:59:15 +07:00
aeph6Ee0 66b686727b [extractor/common] Relax JSON-LD context check (closes #16006) 2018-04-08 03:09:42 +07:00
Sergey M․ 6780154e6b
[extractor/common] Improve thumbnail extraction for HTML5 entries 2018-03-19 23:43:53 +07:00
Sergey M․ 47a5cb7734
Generalize XML manifest processing code and improve XSPF parsing (closes #15794) 2018-03-18 02:52:17 +07:00
Sergey M․ e0d198c18d
[extractor/common] Add _download_xml_handle 2018-03-18 02:52:01 +07:00
Ricardo Constantino 96b8b9abae
[extractor/generic] Support relative URIs in _parse_xspf
<location> can have relative URIs, not just absolute.
2018-03-18 02:48:44 +07:00
Sergey M․ f4b7427279
[extractor/common] Improve jwplayer subtitles extraction (closes #15695) 2018-02-25 00:59:29 +07:00
Sergey M․ 240f26229d
[extractor/common] Respect secure schemes in _extract_wowza_formats 2018-02-05 23:56:00 +07:00
Sergey M․ 00c97e3e7a
[downloader/http] Add ability to pass downloader options via info dict 2018-02-04 07:16:22 +07:00
Sergey M․ 3931b84597
[extractor/common] Improve _json_ld for articles 2018-01-27 23:24:38 +07:00
Sergey M․ 0d9c48de4f
[extractor/common] Improve DASH formats extraction for jwplayer (#9242, #15187) 2018-01-21 17:42:48 +07:00
Ondřej Caletka 126f225bcf
[extractor/common] Add container meta field for formats extracted in _parse_mpd_formats 2017-12-31 04:04:09 +07:00
felix 2501d41ef4
[common] use AACL as the default fourcc when AudioTag is 255 2017-12-30 07:22:07 +07:00
Sergey M․ 9d6ac71c27
[extractor/common] Fix extraction of DASH formats with the same representation id (closes #15111) 2017-12-29 23:14:56 +07:00
Sergey M․ 2132edaa03
[extractor/common] Move X-Forwarded-For setup code into _request_webpage 2017-12-23 21:17:53 +07:00
Sergey M․ c10c93238e
[extractor/common] Introduce uploader, uploader_id and uploader_url meta fields for playlists (#11427, #15018) 2017-12-19 03:51:03 +07:00
Sergey M․ 78593e294c
Add references for #14844 2017-12-02 21:22:43 +07:00
Sergey M․ 603fc4e0ea
[extractor/common] Add durations for DASH fragments with bare SegmentURLs 2017-12-02 21:21:01 +07:00
Petr Novak 41bf647e89
[extractor/common] Add support for DASH manifests with SegmentLists with bare SegmentURLs 2017-12-02 21:16:36 +07:00
Sergey M․ f610dbb05f
[extractor/common] Use final URL when dumping request (closes #14769) 2017-11-18 19:04:56 +07:00
Remita Amine ea2295842f [common] skip Apple FairPlay m3u8 manifests(closes #14741) 2017-11-14 17:41:30 +01:00
Sergey M․ 187ee66c94
[extractor/common] Add protocol for f4m formats 2017-11-04 22:11:39 +07:00
Sergey M․ 48107c198b
[f4m] Prefer baseURL for relative URLs (closes #14660) 2017-11-04 22:10:55 +07:00
Sergey M․ 044eeb1455
[extractor/common] Respect URL query in _extract_wowza_formats (closes #14645) 2017-11-01 23:39:26 +07:00
Sergey M․ 9211e3319e
[extractor/common] Prefix format id for audio only HLS formats 2017-10-29 07:05:55 +07:00
Remita Amine 50d808f5c9 [common] add support for jwplayer youtube embeds 2017-10-12 16:12:47 +00:00
M.K c110944fa2 [extractor/common] Fix typo in _parse_mpd_formats 2017-10-04 03:50:27 +07:00
Yen Chi Hsuan 4ed2d7b7d1 Fix flake8 issues after #14225 2017-09-17 13:53:04 +08:00
Yen Chi Hsuan a88d461dff Merge pull request #14225 from Tithen-Firion/openload-phantomjs-method
Openload phantomjs method
2017-09-16 02:28:28 +08:00
Sergey M․ 1ed4549942
[extractor/common] Extract format id from label attribute of source tag for HTML5 videos (#14034) 2017-08-27 03:27:05 +07:00
Sergey M․ dd121cc1ca
[extractor/common] Extract height from res attribute of source tag for HTML5 videos (closes #14034) 2017-08-27 03:12:56 +07:00
Sergey M․ e01c3d2ef7
[extractor/common] Introduce _parse_xml 2017-08-23 00:32:41 +07:00
Sergey M․ b359e977b9
[extractor/common] Make HLS and DASH extraction non fatal in _parse_html5_media_entries (closes #13970) 2017-08-20 14:16:58 +07:00
Sergey M․ 4850478543
[extractor/common] Add support for float durations in _parse_mpd_formats (closes #13919) 2017-08-15 23:58:00 +07:00
Sergey M․ 868f79db41
[extractor/common] Fix _media_formats 2017-08-12 19:24:26 +07:00
Sergey M․ ac8491fcca
[extractor/common] Make _family_friendly_search optional 2017-08-12 17:11:35 +07:00
Sergey M․ 82889d4ae5
[extractor/common] Respect source's type attribute for HTML5 media (closes #13892) 2017-08-12 16:48:11 +07:00
Sergey M․ 1141e9104b
Use relative paths for DASH fragments (closes #12990)
10x reduced JSON size
refs #13810
2017-08-05 07:40:29 +07:00
Sergey M․ 749ca5eced
[extractor/common] Fix playlist_from_matches 2017-07-16 04:33:14 +07:00
Sergey M․ 4328ddf82b
[extractor/common] Add support for AMP tags in _parse_html5_media_entries 2017-07-09 16:29:52 +07:00
Sergey M․ c69701c6ab
[extractor/common] Improve _json_ld 2017-06-30 22:19:06 +07:00
Sergey M․ 96a2daa1ee
[extractor/common] Improve jwplayer subtitles extraction 2017-06-15 23:40:39 +07:00
Yen Chi Hsuan 6a9cb29509
[extractor/common] Fix json dumping with --geo-bypass
The line "[debug] Using fake IP %s (%s) as X-Forwarded-For." was printed
to stdout even with -j/-J, which breaks the resultant JSON.
2017-06-15 13:04:36 +08:00
Sergey M․ 0a268c6e11
[extractor/common] Improve jwplayer formats extraction (closes #13379) 2017-06-14 22:02:15 +07:00
Sergey M․ 1afd0b0da7
[extractor/common] Return unicode string from _match_id 2017-06-09 00:40:03 +07:00
Sergey M․ f2e2f0c777
[extractor/common] Fix rtmp and rtsp formats' URLs in _extract_wowza_formats 2017-05-17 22:20:25 +07:00
Sergey M․ 6f76679804
[extractor/common] Add support for schemeless URLs in _extract_wowza_formats (closes #13088, closes #13092) 2017-05-16 22:11:34 +07:00
Sergey M․ 76d5a36391
[extractor/common] Respect Width and Height attributes in ISM manifests 2017-05-14 06:11:45 +07:00
Remita Amine ff6f9a6704 [extractor/common] fix typo in _extract_akamai_formats 2017-05-04 16:07:08 +01:00
Tithen-Firion c89267d31a Merge branch 'master' into openload-phantomjs-method 2017-05-04 11:00:06 +02:00
remitamine 55949fede6 [common] introduce chapters field 2017-05-02 20:41:48 +01:00
Sergey M․ 33a81c2c6f
[extractor/common] Extract view count from JSON-LD 2017-04-30 21:45:59 +07:00
Sergey M․ c89b49f743
[extractor/common] Add manifest_url for explicit group rendition formats 2017-04-28 03:00:14 +07:00
Sergey M․ ff99fe529e
Don't list master m3u8 playlists in format list (closes #12832) 2017-04-27 21:53:17 +07:00
Sergey M․ ac9c69ace7
[extractor/common] Improve jwplayer regex 2017-04-25 23:46:05 +07:00
Tithen-Firion 40e41780f1 [phantomjs] add cookie support 2017-04-25 15:12:54 +02:00
Sergey M․ 3019cb0c99
[extractor/common] Rephrase comment 2017-04-23 11:52:07 +07:00
Sergey M․ ddd258f922
[test_InfoExtractor] Add m3u8 parsing test for NAME attribute in EXT-X-STREAM-INF tag 2017-04-23 11:49:57 +07:00
Sergey M․ 9c99bef704
[extractor/common] Use float for scaled tbr 2017-04-23 11:33:49 +07:00
Sergey M․ cb2520802d
[extractor/common] Improve m3u8 extraction (closes #12211)
* Extract m3u8 parsing to separate method
* Improve rendition groups extraction
* Build stream name according to stream GROUP-ID
* Ignore reference to AUDIO group without URI when stream has no CODECS
+ Add test coverage for parsing m3u8 from #11507, #11995, #12211 and twitch vod
2017-04-22 07:01:00 +07:00
Sergey M․ bae1404893
[extractor/common] Add support for video of WebPage context in _json_ld (closes #12778) 2017-04-18 22:21:38 +07:00
Remita Amine bf1b87cd91 [common] Relax JWPlayer regex and remove duplicate urls(#12768) 2017-04-17 08:48:24 +01:00
Remita Amine 40fcba5edb improve coding style 2017-04-12 20:38:43 +01:00
Sergey M․ fd47550885
[extractor/common] Add coding cookie 2017-04-02 04:42:10 +07:00
Sergey M․ 4457823dda
[extractor/common] Move censorship checks to a separate method and add check for just another ISP 2017-04-02 03:57:44 +07:00
Random User 4f06c1c9fc Merge branch 'master' of github.com-rndusr:rg3/youtube-dl into fix/str-item-assignment 2017-03-25 21:36:59 +01:00
Random User c73e330e7a _find_jwplayer_data() returns dict or None
This simplifies code for callers of `_find_jwplayer_data()` which no longer have
to run `_parse_json()` on the return value.

It also makes sure that `_find_jwplayer_data()` returns either a `dict` or
`None` and nothing else.
2017-03-25 19:38:30 +01:00
John Hawkinson 46b18f2349 [BostonGlobe] New. Nonstandard version of Brightcove.
Has a "data-brightcove-video-id" instead of a "data-video-id," otherwise
pretty much just Brightcove. Except the Globe isn't all Brightcove
videos, so fallback to Generic, too.

Also, abstract playlist_from_matches() from generic.py to common.py, and use
it here.

History of these changes can be found in
51170427d4b1143572a498dedaee61863a5b2c5b.
2017-03-19 20:40:31 +08:00
Sergey M․ b51dc9db0e
[extractor/common] Extract SMIL formats from jwplayer 2017-03-16 03:30:53 +07:00
Sergey M․ 1a2192cb90
[extractor/common] Pass arguments to _parse_jwplayer_formats and PEP8 2017-03-05 23:29:17 +07:00
Sergey M․ 0236cd0dfd
[extractor/common] Improve height extraction and extract bitrate 2017-03-05 23:25:03 +07:00
Sergey M․ ed0cf9b383
[extractor/common] Move jwplayer formats extraction in separate method 2017-03-05 23:22:27 +07:00
Yen Chi Hsuan eeb0a95684
[extractor/common] Add 'preference' to _parse_html5_media_entries
Some websites, like NJPWorld, put different qualities on different
player pages.
2017-02-25 18:40:05 +08:00
Sergey M․ eea0716cae
[extractor/common] Print origin country for fake IP 2017-02-21 23:14:33 +07:00
Sergey M․ 336a76551b
[extractor/common] Do not quit _initialize_geo_bypass on empty countries 2017-02-21 23:09:41 +07:00
Sergey M․ dc0a869e5e
[extractor/common] Fix typo 2017-02-21 23:05:31 +07:00
Sergey M․ e39b5d4ab8
[extractor/common] Allow calling _initialize_geo_bypass from extractors (#11970) 2017-02-21 23:00:43 +07:00
Sergey M․ 3ccdde8cb7
[extractor/common] Emphasize geo bypass APIs are experimental 2017-02-20 23:21:15 +07:00
Sergey M․ 4248dad92b Improve geo bypass mechanism
* Rename options so that their prefix matches --geo-verification-proxy
* Introduce _GEO_COUNTRIES for extractors
* Implement faking IP right away for sites with known geo restriction
2017-02-19 05:10:08 +08:00