Michal Čihař
b3ee552e4b
[utils] Handle single-line comments in js_to_json
2017-02-03 03:04:33 +07:00
Sergey M․
15846398ca
[utils] Improve parse_duration
2017-01-26 23:23:08 +07:00
Sergey M․
cb655f34fb
[utils] Add more date formats
2017-01-12 22:39:45 +07:00
Remita Amine
7fe1592073
[common] fix dash codec information for mixed videos and fragment url construction ( #11490 )
2016-12-20 12:35:03 +01:00
Sergey M․
b0c65c677f
[utils] Improve urljoin
2016-12-17 18:49:55 +07:00
Sergey M․
e34c33614d
[utils] Add convenience urljoin
2016-12-13 02:23:49 +07:00
Yen Chi Hsuan
582be35847
Update coding style after pycodestyle 2.1.0
...
In pycodestyle 2.1.0, E305 was introduced, which requires two blank
lines after top level declarations, too.
See https://github.com/PyCQA/pycodestyle/issues/400
See also #10689; thanks @stepshal for first mentioning this issue and
initial patches
2016-11-17 19:45:42 +08:00
Sergey M․
02dc0a36b7
[utils] Introduce base_url
2016-11-02 02:30:18 +07:00
Sergey M․
c6eed6b8c0
[utils] Lower priority for rare date formats and add tests
2016-09-29 23:52:29 +07:00
Sergey M․
3e4185c396
[utils] Use native french month names
2016-09-14 23:59:38 +07:00
Sergey M․
f6717dec8a
[utils] Improve month_by_name and add tests
2016-09-14 23:59:38 +07:00
Sergey M․
6562d34a8c
[utils] Improve mimetype2ext
2016-09-02 22:57:48 +07:00
Yen Chi Hsuan
70852b47ca
[utils] Recognize units with full names in parse_filesize
...
Reference: https://en.wikipedia.org/wiki/Template:Quantities_of_bytes
2016-08-20 00:17:26 +08:00
Yen Chi Hsuan
e4659b4547
[utils] Correct octal/hexadecimal number detection in js_to_json
2016-08-19 20:37:17 +08:00
Sergey M․
13585d7682
[utils] Recognize lowercase units in parse_filesize
2016-08-18 23:32:00 +07:00
Remita Amine
5f2c2b7936
[test_utils] add test for option with not str value
2016-08-13 09:54:12 +01:00
Sergey M․
a8795327ca
[utils] Add support for TV Parental Guidelines ratings in parse_age_limit
2016-08-07 20:45:18 +07:00
Yen Chi Hsuan
7dc2a74e0a
[utils] Fix unified_timestamp for formats parsed by parsedate_tz()
2016-08-05 11:41:55 +08:00
Yen Chi Hsuan
0b68de3cc1
Merge pull request #8876 from remitamine/html5_media
...
[extractor/common] add helper method to extract html5 media entries
2016-07-10 23:40:45 +08:00
Yen Chi Hsuan
84c237fb8a
[utils] Add get_element_by_class
...
For #9950
2016-07-06 20:02:52 +08:00
Remita Amine
dfaa86b75e
[test_utils] add test for smuggling a smuggled url
2016-07-04 21:36:32 +01:00
remitamine
4f3c5e0627
[utils] add helper function for parsing codecs
2016-06-26 14:03:58 +01:00
Yen Chi Hsuan
1143535d76
[utils] Add urshift()
...
Used in IqiyiIE and LeIE
2016-06-26 15:16:49 +08:00
Sergey M․
46f59e89ea
[utils] Add unified_timestamp
2016-06-25 23:19:18 +07:00
Yen Chi Hsuan
47212f7bcb
[utils] Don't transform numbers not starting with a zero
...
Fix test_Viidea and maybe others
2016-06-16 11:00:54 +08:00
Yen Chi Hsuan
55b2f099c0
[utils] Decode HTML5 entities
...
Used in test_Vporn_1. Also related to #9270
2016-06-10 15:11:55 +08:00
bzc6p
b96f007eeb
Added sanitization support for Hungarian letters Ő and Ű
2016-06-02 11:39:32 +02:00
Sergey M․
46bc9b7d7c
[utils] Allow None in remove_{start,end}
2016-05-19 04:31:30 +06:00
Sergey M․
364cf465dd
[test_utils] PEP 8
2016-05-14 20:46:33 +06:00
Sergey M․
89ac4a19e6
[utils] Process non-base 10 integers in js_to_json
2016-05-14 20:39:58 +06:00
felix
bd1e484448
[utils] js_to_json: various improvements
...
now JS object literals like { /* " */ 0: ",]\xaa<\/p>", } will be correctly converted to JSON.
2016-05-14 20:12:39 +06:00
Yen Chi Hsuan
778a1ccca7
[utils] Add Œ and œ found in French to ACCENT_CHARS
...
Fixes #9463
2016-05-12 19:48:48 +08:00
Yen Chi Hsuan
dab0daeeb0
[utils,compat] Move struct_pack and struct_unpack to compat.py
2016-05-10 14:51:38 +08:00
Adam Thalhammer
31c4448f6e
Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents (fixes #9347)
2016-05-02 13:25:12 +10:00
Adam Thalhammer
79a2e94e79
Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents (fixes #9347)
2016-05-02 13:21:39 +10:00
Sergey M
b6c0d4f431
Merge pull request #9110 from remitamine/parse_duration
...
[utils] improve parse_duration to handle more formats
2016-04-21 22:53:16 +07:00
remitamine
acaff49575
[utils] improve parse_duration to handle more formats
2016-04-21 16:34:54 +01:00
Jaime Marquínez Ferrándiz
eb9c3edd5e
[test/utils] Add test for date_from_str
2016-04-09 22:40:05 +02:00
Yen Chi Hsuan
81f36eba88
[test/test_utils] Update for escape_url change (again)
2016-03-23 23:23:26 +08:00
Yen Chi Hsuan
2d60465e44
[test/test_utils] Update for escape_url change
2016-03-23 23:20:28 +08:00
Jaime Marquínez Ferrándiz
782b1b5bd1
[utils] lookup_unit_table: Match word boundary instead of end of string
2016-03-19 11:44:49 +01:00
Sergey M․
c5229f3926
[utils] PEP 8
2016-03-16 21:50:04 +06:00
remitamine
83548824c2
Merge pull request #8092 from bpfoley/twitter-thumbnail
...
[utils] Add extract_attributes for extracting html tag attributes
2016-03-16 13:16:27 +01:00
Sergey M․
fb47597b09
[bbc] Generalize unit table lookup and add parse_count
2016-03-13 16:27:20 +06:00
remitamine
3201a67f61
[test/test_utils] add more tests for update_url_query
2016-03-03 19:18:57 +01:00
remitamine
fb640d0a3d
[test/test_utils] add tests for update_url_query
2016-03-03 18:40:05 +01:00
Brian Foley
8bb56eeeea
[utils] Add extract_attributes for extracting html tag attributes
...
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
2016-03-03 10:11:37 +00:00
Yen Chi Hsuan
5eb6bdced4
[utils] Multiple changes to base_n()
...
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
2016-02-27 03:22:52 +08:00
Sergey M․
f160785c5c
[utils] Remove AM/PM from unified_strdate patterns
2016-02-25 00:52:49 +06:00
Yen Chi Hsuan
5bc880b988
[utils] Add OHDave's RSA encryption function
2016-02-20 19:54:58 +08:00
Sergey M․
8411229bd5
[utils] Allow dot in strip_jsonp
2016-02-07 19:47:09 +06:00
Sergey M․
86296ad2cd
[utils] Add ability to control skipping false values in dict_get
2016-02-07 08:13:04 +06:00
Sergey M․
cbecc9b903
[utils] Add dict_get convenience method
2016-02-07 06:12:53 +06:00
Sergey M․
6b77d52b1f
[test_utils] Add tests for encode_compat_str
2015-12-20 07:07:14 +06:00
Yen Chi Hsuan
db2fe38b55
[utils] Support alternative timestamp format in TTML
...
Fixes #7608
2015-12-19 19:29:51 +08:00
Yen Chi Hsuan
d631d5f9f2
[utils] Fix TTML conversion
...
Tolerate invalid timestamps (closes #7909 )
2015-12-19 18:21:42 +08:00
Sergey M․
31b2051e21
[utils] Add remove_quotes
2015-12-14 21:30:58 +06:00
Sergey M․
9cb9a5df77
[utils] Check ext with trailing slash against the list of known extensions
2015-11-22 17:27:13 +06:00
Sergey M․
5035536e3f
[test_utils] Add tests for determine_ext
2015-11-22 06:33:52 +06:00
Sergey M․
7aefc49c40
[utils] Skip invalid/non HTML entities ( Closes #7518 )
2015-11-16 20:20:16 +06:00
Jaime Marquínez Ferrándiz
6a75040278
[utils] unified_strdate: Return None if the date format can't be recognized ( fixes #7340 )
...
This issue was introduced with ae12bc3ebb, it returned 'None'.
2015-11-02 14:08:38 +01:00
Sergey M
30eecc6a04
Merge pull request #7296 from jaimeMF/xml_attrib_unicode
...
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (…
2015-10-31 18:15:21 +00:00
Sergey M․
578c074575
[utils] Support list of xpath in xpath_element
2015-10-31 22:39:44 +06:00
Sergey M․
52c3a6e49d
[utils] Improve parse_iso8601
2015-10-28 21:40:22 +06:00
Jaime Marquínez Ferrándiz
36e6f62cd0
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x ( #7178 )
...
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
2015-10-25 20:13:16 +01:00
Sergey M․
d01949dc89
[utils:js_to_json] Fix bad escape in double quoted strings
2015-10-20 23:09:51 +06:00
Sergey M․
f71264490c
[test_utils] Add tests for cli option converters
2015-09-05 03:07:19 +06:00
Sergey M․
87f70ab39d
[test_utils] Add more tests for xpath
2015-09-05 00:36:16 +06:00
Sergey M․
ee114368ad
[utils] Make value optional for find_xpath_attr
...
This allows selecting particular attributes by name without specifying the value, similar to the XPath syntax `[@attrib]`
2015-08-01 20:22:13 +06:00
Yen Chi Hsuan
9c29bc69f7
[utils] Improve parse_duration
...
Now dots are parsed. For example '87 Min.'
2015-07-22 23:15:22 +08:00
Yen Chi Hsuan
1b0427e6c4
[utils] Support TTML without default namespace
...
In a strict sense such TTML is invalid, but Yahoo uses it.
2015-05-19 00:45:01 +08:00
Yen Chi Hsuan
7dff03636a
[utils] Support 'dur' field in TTML
2015-05-12 12:47:37 +08:00
Yen Chi Hsuan
d39e0f05db
[utils] Remove sanitize_url_path_consecutive_slashes()
...
This function is used only in SohuIE, which is updated to use a new
extraction logic.
2015-05-09 17:37:39 +08:00
Yen Chi Hsuan
0fe2ff78e6
[NBC] Enhance embedURL extraction ( closes #2549 )
2015-05-04 21:55:04 +08:00
Sergey M․
b3ed15b760
[utils] Add replace_extension
2015-05-02 23:23:06 +06:00
Sergey M․
a4bcaad773
[test_utils] Add tests for prepend_extension
2015-05-02 23:10:48 +06:00
Yen Chi Hsuan
bf6427d2fb
[ffmpeg] Add dfxp (TTML) subtitles support ( #3432 , #5146 )
2015-04-25 23:18:27 +08:00
Yen Chi Hsuan
0a1603634b
[utils] Remove url_infer_protocol
2015-04-08 21:39:34 +08:00
Yen Chi Hsuan
418c5cc3fc
[udn] Add new extractor
2015-04-08 17:26:51 +08:00
Sergey M․
8cf70de428
[test_utils] Add test for unified_strdate
2015-04-04 19:11:01 +06:00
Sergey M․
ba9e68f402
[utils] Drop trailing comma before closing brace
2015-04-04 17:48:55 +06:00
Naglis Jonaitis
91757b0f37
[utils] Escape all HTML entities written in hexadecimal form
2015-03-26 17:15:27 +02:00
Jaime Marquínez Ferrándiz
5379a2d40d
[test/utils] Test xpath_text
2015-03-21 14:12:43 +01:00
Sergey M․
92a4793b3c
[utils] Place sanitize url function near other sanitizing functions
2015-03-17 21:34:22 +06:00
Sergey M․
dc03a42537
Merge branch 'sohu_fix' of https://github.com/yan12125/youtube-dl into yan12125-sohu_fix
2015-03-17 21:18:36 +06:00
Sergey M․
2ebfeacabc
[utils] Keep dot and dotdot unmodified ( Closes #5171 )
2015-03-10 00:50:11 +06:00
Sergey M․
f18ef2d144
[utils] Disallow trailing dot in sanitize_path for a path part
2015-03-08 22:08:48 +06:00
Sergey M․
a2aaf4dbc6
[utils] Add sanitize_path
2015-03-08 20:55:22 +06:00
Yen Chi Hsuan
55969016e9
[utils] Add a function to sanitize consecutive slashes in URLs
2015-03-06 12:43:49 +08:00
Philipp Hagemeister
a7440261c5
[utils] Strip leading dots
...
Fixes #2865 , closes #5087
2015-03-02 19:07:19 +01:00
Philipp Hagemeister
3e675fabe0
[airmozilla] Be more tolerant when nonessential items are missing ( #5030 )
2015-02-26 01:25:00 +01:00
Philipp Hagemeister
5a42414b9c
[utils] Prevent hyphen at beginning of filename ( Fixes #5035 )
2015-02-24 11:38:01 +01:00
Philipp Hagemeister
d305dd73a3
[utils] Fix js_to_json
...
Previously, the runtime could be atrocious for longer inputs.
2015-02-18 23:59:51 +01:00
Philipp Hagemeister
347de4931c
[YoutubeDL] Add generic video filtering ( Fixes #4916 )
...
This functionality is intended to eventually encompass the current format filtering.
2015-02-10 03:32:24 +01:00
Philipp Hagemeister
9bb8e0a3f9
[wsj] Add new extractor ( Fixes #4854 )
2015-02-03 10:58:28 +01:00
Philipp Hagemeister
8f4b58d70e
[ntvde] Add new extractor ( Fixes #4850 )
2015-02-02 21:48:54 +01:00
Philipp Hagemeister
cfb56d1af3
Add --list-thumbnails
2015-01-25 02:43:19 +01:00
Philipp Hagemeister
61ca9a80b3
[generic] Add support for BOMs ( Fixes #4753 )
2015-01-23 01:21:30 +01:00
Naglis Jonaitis
a69801e2c6
[utils] Add additional format to unified_strdate
2015-01-14 00:16:34 +02:00
Sergey M․
a5fb718c50
[test_utils] Add more tests for parse_duration
2015-01-12 21:39:58 +06:00