More string operators for --match-filter

Max Teegen 2021-06-13 16:25:19 +02:00
parent c2350cac24
commit 3b74d490e0
2 changed files with 102 additions and 23 deletions
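
The behaviour of the four new operators can be illustrated with a minimal sketch. It mirrors the operator table added in the diff below, but `STRING_OPERATORS` and `apply_string_operator` are hypothetical names used only for illustration, and the type check is a simplified stand-in for the check the commit adds to `_match_one`.

```python
import operator
import re

# Hypothetical stand-in for the entries this commit adds to the operator table.
# In each callable, `left` is the video's field value and `right` is the
# literal given in the filter.
STRING_OPERATORS = {
    '~=': operator.contains,                                   # right in left
    '!~=': lambda left, right: not operator.contains(left, right),
    '*=': lambda left, right: bool(re.search(right, left)),    # regex search
    '!*=': lambda left, right: not re.search(right, left),
}


def apply_string_operator(op, actual_value, comparison_value):
    """Apply one string operator; reject non-string comparison values."""
    if not isinstance(comparison_value, str):
        # Simplified version of the error raised for numeric values.
        raise ValueError('Operator %s only supports string values!' % op)
    return STRING_OPERATORS[op](actual_value, comparison_value)


if __name__ == '__main__':
    title = 'Sponsored review of a camera'
    print(apply_string_operator('~=', title, 'review'))
    print(apply_string_operator('!*=', title, '[Ss]ponsored'))
```

Containment with `~=` is case-sensitive in this sketch, which is why the README example uses the regular expression `[Ss]ponsored` to cover both spellings.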

115
README.md

@@ -128,6 +128,7 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
CIDR notation CIDR notation
## Video Selection: ## Video Selection:
<<<<<<< HEAD
--playlist-start NUMBER Playlist video to start at (default is --playlist-start NUMBER Playlist video to start at (default is
1) 1)
--playlist-end NUMBER Playlist video to end at (default is --playlist-end NUMBER Playlist video to end at (default is
@@ -160,27 +161,33 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
than COUNT views than COUNT views
--max-views COUNT Do not download any videos with more --max-views COUNT Do not download any videos with more
than COUNT views than COUNT views
--match-filter FILTER Generic video filter. Specify any key --match-filter FILTER Generic video filter. Specify any key (see
(see the "OUTPUT TEMPLATE" for a list the "OUTPUT TEMPLATE" for a list of
of available keys) to match if the key available keys) to match if the key is
is present, !key to check if the key is present, !key to check if the key is not
not present, key > NUMBER (like present, key > NUMBER (like "comment_count
"comment_count > 12", also works with > 12", also works with >=, <, <=, !=, =) to
>=, <, <=, !=, =) to compare against a compare against a number, key = 'LITERAL'
number, key = 'LITERAL' (like "uploader (like "uploader = 'Mike Smith'", also works
= 'Mike Smith'", also works with !=) to with !=) to match against a string literal
match against a string literal and & to and & to require multiple matches. Values
require multiple matches. Values which which are not known are excluded unless you
are not known are excluded unless you put a question mark (?) after the operator.
put a question mark (?) after the For example, to only match videos that have
operator. For example, to only match been liked more than 100 times and disliked
videos that have been liked more than less than 50 times (or the dislike
100 times and disliked less than 50 functionality is not available at the given
times (or the dislike functionality is service), but who also have a description,
not available at the given service), use --match-filter "like_count > 100 &
but who also have a description, use
--match-filter "like_count > 100 &
dislike_count <? 50 & description" . dislike_count <? 50 & description" .
For matching strings, the operators ~= and
!~= check for string containment and
exclusion. The operators *= and !*= search
for a regular expression.
For example, to only match videos which
have neither 'sponsored' nor 'Sponsored' in
the title, use --match-filter "title !*=
'[Ss]ponsored'".
--no-playlist Download only the video, if the URL --no-playlist Download only the video, if the URL
refers to a video and a playlist. refers to a video and a playlist.
--yes-playlist Download the playlist, if the URL --yes-playlist Download the playlist, if the URL
@@ -192,6 +199,74 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
downloaded videos in it. downloaded videos in it.
--include-ads Download advertisements as well --include-ads Download advertisements as well
(experimental) (experimental)
=======
--playlist-start NUMBER Playlist video to start at (default is 1)
--playlist-end NUMBER Playlist video to end at (default is last)
--playlist-items ITEM_SPEC Playlist video items to download. Specify
indices of the videos in the playlist
separated by commas like: "--playlist-items
1,2,5,8" if you want to download videos
indexed 1, 2, 5, 8 in the playlist. You can
specify range: "--playlist-items
1-3,7,10-13", it will download the videos
at index 1, 2, 3, 7, 10, 11, 12 and 13.
--match-title REGEX Download only matching titles (regex or
caseless sub-string)
--reject-title REGEX Skip download for matching titles (regex or
caseless sub-string)
--max-downloads NUMBER Abort after downloading NUMBER files
--min-filesize SIZE Do not download any videos smaller than
SIZE (e.g. 50k or 44.6m)
--max-filesize SIZE Do not download any videos larger than SIZE
(e.g. 50k or 44.6m)
--date DATE Download only videos uploaded in this date
--datebefore DATE Download only videos uploaded on or before
this date (i.e. inclusive)
--dateafter DATE Download only videos uploaded on or after
this date (i.e. inclusive)
--min-views COUNT Do not download any videos with less than
COUNT views
--max-views COUNT Do not download any videos with more than
COUNT views
--match-filter FILTER Generic video filter. Specify any key (see
the "OUTPUT TEMPLATE" for a list of
available keys) to match if the key is
present, !key to check if the key is not
present, key > NUMBER (like "comment_count
> 12", also works with >=, <, <=, !=, =) to
compare against a number, key = 'LITERAL'
(like "uploader = 'Mike Smith'", also works
with !=) to match against a string literal
and & to require multiple matches. Values
which are not known are excluded unless you
put a question mark (?) after the operator.
For example, to only match videos that have
been liked more than 100 times and disliked
less than 50 times (or the dislike
functionality is not available at the given
service), but who also have a description,
use --match-filter "like_count > 100 &
dislike_count <? 50 & description" .
For matching strings, the operators ~= and
!~= check for string containment and
exclusion. The operators *= and !*= search
for a regular expression.
For example, to only match videos which
have neither 'sponsored' nor 'Sponsored' in
the title, use --match-filter "title !*=
'[Ss]ponsored'"
--no-playlist Download only the video, if the URL refers
to a video and a playlist.
--yes-playlist Download the playlist, if the URL refers to
a video and a playlist.
--age-limit YEARS Download only videos suitable for the given
age
--download-archive FILE Download only videos not listed in the
archive file. Record the IDs of all
downloaded videos in it.
--include-ads Download advertisements as well
(experimental)
>>>>>>> dd954e809 (More string operators for --match-filter)
## Download Options: ## Download Options:
-r, --limit-rate RATE Maximum download rate in bytes per -r, --limit-rate RATE Maximum download rate in bytes per


@@ -4369,6 +4369,10 @@ def _match_one(filter_part, dct):
'>=': operator.ge, '>=': operator.ge,
'=': operator.eq, '=': operator.eq,
'!=': operator.ne, '!=': operator.ne,
'~=': operator.contains,
'!~=': lambda left, right: not operator.contains(left, right),
'*=': lambda left, right: bool(re.search(right, left)),
'!*=': lambda left, right: not bool(re.search(right, left)),
} }
operator_rex = re.compile(r'''(?x)\s* operator_rex = re.compile(r'''(?x)\s*
(?P<key>[a-z_]+) (?P<key>[a-z_]+)
@@ -4392,14 +4396,14 @@ def _match_one(filter_part, dct):
# https://github.com/ytdl-org/youtube-dl/issues/11082). # https://github.com/ytdl-org/youtube-dl/issues/11082).
or actual_value is not None and m.group('intval') is not None or actual_value is not None and m.group('intval') is not None
and isinstance(actual_value, compat_str)): and isinstance(actual_value, compat_str)):
if m.group('op') not in ('=', '!='):
raise ValueError(
'Operator %s does not support string values!' % m.group('op'))
comparison_value = m.group('quotedstrval') or m.group('strval') or m.group('intval') comparison_value = m.group('quotedstrval') or m.group('strval') or m.group('intval')
quote = m.group('quote') quote = m.group('quote')
if quote is not None: if quote is not None:
comparison_value = comparison_value.replace(r'\%s' % quote, quote) comparison_value = comparison_value.replace(r'\%s' % quote, quote)
else: else:
if m.group('op') in ('~=', '!~=', '*=', '!*='):
raise ValueError(
'Operator %s only supports string values!' % m.group('op'))
try: try:
comparison_value = int(m.group('intval')) comparison_value = int(m.group('intval'))
except ValueError: except ValueError: