Compare commits

...

15 Commits

Author SHA1 Message Date
Sergey M․
8e953dcbb1
release 2021.01.03 2021-01-03 09:12:47 +07:00
Sergey M․
f4afb9a6a8
[ChangeLog] Actualize
[ci skip]
2021-01-03 09:10:40 +07:00
Sergey M․
d5b8cf093c
[nrk] Fix age limit extraction 2021-01-03 08:58:05 +07:00
Sergey M․
5c6e84c0ff
[nrk] Improve series metadata extraction (closes #27473) 2021-01-03 08:31:26 +07:00
Sergey M․
1aaee908b9
[nrk] PEP 8 2021-01-03 07:57:12 +07:00
Sergey M․
b2d9fd9c9f
[nrk] Improve episode and season number extraction 2021-01-03 07:53:05 +07:00
Sergey M․
bc2f83b95e
[nrktv] Fix tests 2021-01-03 07:31:03 +07:00
Sergey M․
85de33b04e
[nrk] Improve series metadata extraction 2021-01-03 07:30:45 +07:00
Sergey M․
7dfd966848
[nrk] Extract subtitles 2021-01-03 07:02:13 +07:00
Sergey M․
a25d03d7cb
[nrk] Fix age limit extraction 2021-01-03 06:51:21 +07:00
Sergey M․
cabfd4b1f0
[nrk] Inline _extract_from_playback 2021-01-03 06:45:17 +07:00
Sergey M․
7b643d4cd0
[nrk] Improve video id extraction 2021-01-03 06:44:28 +07:00
Sergey M․
1f1d01d498
[nrk] Add more shortcut tests 2021-01-03 06:41:37 +07:00
Sergey M․
21a42e2588
[nrk] Improve extraction (closes #27634, closes #27635)
+ Add support for mp3 formats
* Generalize and delegate all item extractors to nrk, beware ie key breakages
+ Add support for podcasts
+ Generalize nrk shortcut form to support all kind of ids
2021-01-03 06:29:52 +07:00
Sergey M․
2df93a0c4a
[nrktv] Switch to playback endpoint
mediaelement endpoint is no longer in use.
2021-01-03 05:01:45 +07:00
10 changed files with 276 additions and 224 deletions

View File

@ -18,7 +18,7 @@ title: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.31. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.01.03. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -26,7 +26,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2020.12.31**
- [ ] I've verified that I'm running youtube-dl version **2021.01.03**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
@ -41,7 +41,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.12.31
[debug] youtube-dl version 2021.01.03
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@ -19,7 +19,7 @@ labels: 'site-support-request'
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.31. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.01.03. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2020.12.31**
- [ ] I've verified that I'm running youtube-dl version **2021.01.03**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones

View File

@ -18,13 +18,13 @@ title: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.31. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.01.03. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2020.12.31**
- [ ] I've verified that I'm running youtube-dl version **2021.01.03**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones

View File

@ -18,7 +18,7 @@ title: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.31. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.01.03. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2020.12.31**
- [ ] I've verified that I'm running youtube-dl version **2021.01.03**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
@ -43,7 +43,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.12.31
[debug] youtube-dl version 2021.01.03
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@ -19,13 +19,13 @@ labels: 'request'
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.12.31. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2021.01.03. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2020.12.31**
- [ ] I've verified that I'm running youtube-dl version **2021.01.03**
- [ ] I've searched the bugtracker for similar feature requests including closed ones

View File

@ -1,3 +1,24 @@
version 2021.01.03
Extractors
* [nrk] Improve series metadata extraction (#27473)
+ [nrk] Extract subtitles
* [nrk] Fix age limit extraction
* [nrk] Improve video id extraction
+ [nrk] Add support for podcasts (#27634, #27635)
* [nrk] Generalize and delegate all item extractors to nrk
+ [nrk] Add support for mp3 formats
* [nrktv] Switch to playback endpoint
* [vvvvid] Fix season metadata extraction (#18130)
* [stitcher] Fix extraction (#20811, #27606)
* [acast] Fix extraction (#21444, #27612, #27613)
+ [arcpublishing] Add support for arcpublishing.com (#2298, #9340, #17200)
+ [sky] Add support for Sports News articles and Brighcove videos (#13054)
+ [vvvvid] Extract akamai formats
* [vvvvid] Skip unplayable episodes (#27599)
* [yandexvideo] Fix extraction for Python 3.4
version 2020.12.31
Core

View File

@ -57,6 +57,7 @@
- **appletrailers**
- **appletrailers:section**
- **archive.org**: archive.org videos
- **ArcPublishing**
- **ARD**
- **ARD:mediathek**
- **ARDBetaMediathek**
@ -610,6 +611,7 @@
- **Npr**
- **NRK**
- **NRKPlaylist**
- **NRKRadioPodkast**
- **NRKSkole**: NRK Skole
- **NRKTV**: NRK TV and NRK Radio
- **NRKTVDirekte**: NRK TV Direkte and NRK Radio Direkte
@ -813,12 +815,13 @@
- **ShowRoomLive**
- **Sina**
- **sky.it**
- **sky:news**
- **sky:sports**
- **sky:sports:news**
- **skyacademy.it**
- **SkylineWebcams**
- **SkyNews**
- **skynewsarabia:article**
- **skynewsarabia:video**
- **SkySports**
- **Slideshare**
- **SlidesLive**
- **Slutload**

View File

@ -790,6 +790,7 @@ from .nrk import (
NRKSkoleIE,
NRKTVIE,
NRKTVDirekteIE,
NRKRadioPodkastIE,
NRKTVEpisodeIE,
NRKTVEpisodesIE,
NRKTVSeasonIE,

View File

@ -6,16 +6,13 @@ import random
import re
from .common import InfoExtractor
from ..compat import (
compat_str,
compat_urllib_parse_unquote,
)
from ..compat import compat_str
from ..utils import (
determine_ext,
ExtractorError,
int_or_none,
parse_age_limit,
parse_duration,
str_or_none,
try_get,
urljoin,
url_or_none,
@ -63,7 +60,8 @@ class NRKBaseIE(InfoExtractor):
return self._download_json(
urljoin('http://psapi.nrk.no/', path),
video_id, note or 'Downloading %s JSON' % item,
fatal=fatal, query=query)
fatal=fatal, query=query,
headers={'Accept-Encoding': 'gzip, deflate, br'})
class NRKIE(NRKBaseIE):
@ -116,9 +114,39 @@ class NRKIE(NRKBaseIE):
}, {
'url': 'https://www.nrk.no/video/humor/kommentatorboksen-reiser-til-sjos_d1fda11f-a4ad-437a-a374-0398bc84e999',
'only_matching': True,
}, {
# podcast
'url': 'nrk:l_96f4f1b0-de54-4e6a-b4f1-b0de54fe6af8',
'only_matching': True,
}, {
'url': 'nrk:podcast/l_96f4f1b0-de54-4e6a-b4f1-b0de54fe6af8',
'only_matching': True,
}, {
# clip
'url': 'nrk:150533',
'only_matching': True,
}, {
'url': 'nrk:clip/150533',
'only_matching': True,
}, {
# program
'url': 'nrk:MDDP12000117',
'only_matching': True,
}, {
'url': 'nrk:program/ENRK10100318',
'only_matching': True,
}, {
# direkte
'url': 'nrk:nrk1',
'only_matching': True,
}, {
'url': 'nrk:channel/nrk1',
'only_matching': True,
}]
def _extract_from_playback(self, video_id):
def _real_extract(self, url):
video_id = self._match_id(url).split('/')[-1]
path_templ = 'playback/%s/' + video_id
def call_playback_api(item, query=None):
@ -126,6 +154,8 @@ class NRKIE(NRKBaseIE):
# known values for preferredCdn: akamai, iponly, minicdn and telenor
manifest = call_playback_api('manifest', {'preferredCdn': 'akamai'})
video_id = try_get(manifest, lambda x: x['id'], compat_str) or video_id
if manifest.get('playability') == 'nonPlayable':
self._raise_error(manifest['nonPlayable'])
@ -140,8 +170,15 @@ class NRKIE(NRKBaseIE):
format_url = url_or_none(asset.get('url'))
if not format_url:
continue
if asset.get('format') == 'HLS' or determine_ext(format_url) == 'm3u8':
asset_format = (asset.get('format') or '').lower()
if asset_format == 'hls' or determine_ext(format_url) == 'm3u8':
formats.extend(self._extract_nrk_formats(format_url, video_id))
elif asset_format == 'mp3':
formats.append({
'url': format_url,
'format_id': asset_format,
'vcodec': 'none',
})
self._sort_formats(formats)
data = call_playback_api('metadata')
@ -168,31 +205,94 @@ class NRKIE(NRKBaseIE):
'height': int_or_none(image.get('pixelHeight')),
})
return {
subtitles = {}
for sub in try_get(playable, lambda x: x['subtitles'], list) or []:
if not isinstance(sub, dict):
continue
sub_url = url_or_none(sub.get('webVtt'))
if not sub_url:
continue
sub_key = str_or_none(sub.get('language')) or 'nb'
sub_type = str_or_none(sub.get('type'))
if sub_type:
sub_key += '-%s' % sub_type
subtitles.setdefault(sub_key, []).append({
'url': sub_url,
})
legal_age = try_get(
data, lambda x: x['legalAge']['body']['rating']['code'], compat_str)
# https://en.wikipedia.org/wiki/Norwegian_Media_Authority
if legal_age == 'A':
age_limit = 0
elif legal_age.isdigit():
age_limit = int_or_none(legal_age)
else:
age_limit = None
is_series = try_get(data, lambda x: x['_links']['series']['name']) == 'series'
info = {
'id': video_id,
'title': title,
'alt_title': alt_title,
'description': description,
'duration': duration,
'thumbnails': thumbnails,
'age_limit': age_limit,
'formats': formats,
'subtitles': subtitles,
}
def _real_extract(self, url):
video_id = self._match_id(url)
return self._extract_from_playback(video_id)
if is_series:
series = season_id = season_number = episode = episode_number = None
programs = self._call_api(
'programs/%s' % video_id, video_id, 'programs', fatal=False)
if programs and isinstance(programs, dict):
series = str_or_none(programs.get('seriesTitle'))
season_id = str_or_none(programs.get('seasonId'))
season_number = int_or_none(programs.get('seasonNumber'))
episode = str_or_none(programs.get('episodeTitle'))
episode_number = int_or_none(programs.get('episodeNumber'))
if not series:
series = title
if alt_title:
title += ' - %s' % alt_title
if not season_number:
season_number = int_or_none(self._search_regex(
r'Sesong\s+(\d+)', description or '', 'season number',
default=None))
if not episode:
episode = alt_title if is_series else None
if not episode_number:
episode_number = int_or_none(self._search_regex(
r'^(\d+)\.', episode or '', 'episode number',
default=None))
if not episode_number:
episode_number = int_or_none(self._search_regex(
r'\((\d+)\s*:\s*\d+\)', description or '',
'episode number', default=None))
info.update({
'title': title,
'series': series,
'season_id': season_id,
'season_number': season_number,
'episode': episode,
'episode_number': episode_number,
})
return info
class NRKTVIE(NRKBaseIE):
class NRKTVIE(InfoExtractor):
IE_DESC = 'NRK TV and NRK Radio'
_EPISODE_RE = r'(?P<id>[a-zA-Z]{4}\d{8})'
_VALID_URL = r'https?://(?:tv|radio)\.nrk(?:super)?\.no/(?:[^/]+/)*%s' % _EPISODE_RE
_API_HOSTS = ('psapi-ne.nrk.no', 'psapi-we.nrk.no')
_TESTS = [{
'url': 'https://tv.nrk.no/program/MDDP12000117',
'md5': 'c4a5960f1b00b40d47db65c1064e0ab1',
'info_dict': {
'id': 'MDDP12000117AA',
'id': 'MDDP12000117',
'ext': 'mp4',
'title': 'Alarm Trolltunga',
'description': 'md5:46923a6e6510eefcce23d5ef2a58f2ce',
@ -203,24 +303,27 @@ class NRKTVIE(NRKBaseIE):
'url': 'https://tv.nrk.no/serie/20-spoersmaal-tv/MUHH48000314/23-05-2014',
'md5': '8d40dab61cea8ab0114e090b029a0565',
'info_dict': {
'id': 'MUHH48000314AA',
'id': 'MUHH48000314',
'ext': 'mp4',
'title': '20 spørsmål 23.05.2014',
'title': '20 spørsmål - 23. mai 2014',
'alt_title': '23. mai 2014',
'description': 'md5:bdea103bc35494c143c6a9acdd84887a',
'duration': 1741,
'series': '20 spørsmål',
'episode': '23.05.2014',
'episode': '23. mai 2014',
'age_limit': 0,
},
}, {
'url': 'https://tv.nrk.no/program/mdfp15000514',
'info_dict': {
'id': 'MDFP15000514CA',
'id': 'MDFP15000514',
'ext': 'mp4',
'title': 'Grunnlovsjubiléet - Stor ståhei for ingenting 24.05.2014',
'title': 'Kunnskapskanalen - Grunnlovsjubiléet - Stor ståhei for ingenting',
'description': 'md5:89290c5ccde1b3a24bb8050ab67fe1db',
'duration': 4605.08,
'series': 'Kunnskapskanalen',
'episode': '24.05.2014',
'episode': 'Grunnlovsjubiléet - Stor ståhei for ingenting',
'age_limit': 0,
},
'params': {
'skip_download': True,
@ -229,10 +332,11 @@ class NRKTVIE(NRKBaseIE):
# single playlist video
'url': 'https://tv.nrk.no/serie/tour-de-ski/MSPO40010515/06-01-2015#del=2',
'info_dict': {
'id': 'MSPO40010515AH',
'id': 'MSPO40010515',
'ext': 'mp4',
'title': 'Sprint fri teknikk, kvinner og menn 06.01.2015',
'description': 'md5:c03aba1e917561eface5214020551b7a',
'age_limit': 0,
},
'params': {
'skip_download': True,
@ -242,24 +346,27 @@ class NRKTVIE(NRKBaseIE):
}, {
'url': 'https://tv.nrk.no/serie/tour-de-ski/MSPO40010515/06-01-2015',
'info_dict': {
'id': 'MSPO40010515AH',
'id': 'MSPO40010515',
'ext': 'mp4',
'title': 'Sprint fri teknikk, kvinner og menn 06.01.2015',
'description': 'md5:c03aba1e917561eface5214020551b7a',
'age_limit': 0,
},
'expected_warnings': ['Failed to download m3u8 information'],
'skip': 'Ikke tilgjengelig utenfor Norge',
}, {
'url': 'https://tv.nrk.no/serie/anno/KMTE50001317/sesong-3/episode-13',
'info_dict': {
'id': 'KMTE50001317AA',
'id': 'KMTE50001317',
'ext': 'mp4',
'title': 'Anno 13:30',
'title': 'Anno - 13. episode',
'description': 'md5:11d9613661a8dbe6f9bef54e3a4cbbfa',
'duration': 2340,
'series': 'Anno',
'episode': '13:30',
'episode': '13. episode',
'season_number': 3,
'episode_number': 13,
'age_limit': 0,
},
'params': {
'skip_download': True,
@ -267,13 +374,14 @@ class NRKTVIE(NRKBaseIE):
}, {
'url': 'https://tv.nrk.no/serie/nytt-paa-nytt/MUHH46000317/27-01-2017',
'info_dict': {
'id': 'MUHH46000317AA',
'id': 'MUHH46000317',
'ext': 'mp4',
'title': 'Nytt på Nytt 27.01.2017',
'description': 'md5:5358d6388fba0ea6f0b6d11c48b9eb4b',
'duration': 1796,
'series': 'Nytt på nytt',
'episode': '27.01.2017',
'age_limit': 0,
},
'params': {
'skip_download': True,
@ -290,180 +398,26 @@ class NRKTVIE(NRKBaseIE):
'only_matching': True,
}]
_api_host = None
def _extract_from_mediaelement(self, video_id):
api_hosts = (self._api_host, ) if self._api_host else self._API_HOSTS
for api_host in api_hosts:
data = self._download_json(
'http://%s/mediaelement/%s' % (api_host, video_id),
video_id, 'Downloading mediaelement JSON',
fatal=api_host == api_hosts[-1])
if not data:
continue
self._api_host = api_host
break
title = data.get('fullTitle') or data.get('mainTitle') or data['title']
video_id = data.get('id') or video_id
urls = []
entries = []
conviva = data.get('convivaStatistics') or {}
live = (data.get('mediaElementType') == 'Live'
or data.get('isLive') is True or conviva.get('isLive'))
def make_title(t):
return self._live_title(t) if live else t
media_assets = data.get('mediaAssets')
if media_assets and isinstance(media_assets, list):
def video_id_and_title(idx):
return ((video_id, title) if len(media_assets) == 1
else ('%s-%d' % (video_id, idx), '%s (Part %d)' % (title, idx)))
for num, asset in enumerate(media_assets, 1):
asset_url = asset.get('url')
if not asset_url or asset_url in urls:
continue
urls.append(asset_url)
formats = self._extract_nrk_formats(asset_url, video_id)
if not formats:
continue
self._sort_formats(formats)
entry_id, entry_title = video_id_and_title(num)
duration = parse_duration(asset.get('duration'))
subtitles = {}
for subtitle in ('webVtt', 'timedText'):
subtitle_url = asset.get('%sSubtitlesUrl' % subtitle)
if subtitle_url:
subtitles.setdefault('no', []).append({
'url': compat_urllib_parse_unquote(subtitle_url)
})
entries.append({
'id': asset.get('carrierId') or entry_id,
'title': make_title(entry_title),
'duration': duration,
'subtitles': subtitles,
'formats': formats,
'is_live': live,
})
if not entries:
media_url = data.get('mediaUrl')
if media_url and media_url not in urls:
formats = self._extract_nrk_formats(media_url, video_id)
if formats:
self._sort_formats(formats)
duration = parse_duration(data.get('duration'))
entries = [{
'id': video_id,
'title': make_title(title),
'duration': duration,
'formats': formats,
'is_live': live,
}]
if not entries:
self._raise_error(data)
series = conviva.get('seriesName') or data.get('seriesTitle')
episode = conviva.get('episodeName') or data.get('episodeNumberOrDate')
season_number = None
episode_number = None
if data.get('mediaElementType') == 'Episode':
_season_episode = data.get('scoresStatistics', {}).get('springStreamStream') or \
data.get('relativeOriginUrl', '')
EPISODENUM_RE = [
r'/s(?P<season>\d{,2})e(?P<episode>\d{,2})\.',
r'/sesong-(?P<season>\d{,2})/episode-(?P<episode>\d{,2})',
]
season_number = int_or_none(self._search_regex(
EPISODENUM_RE, _season_episode, 'season number',
default=None, group='season'))
episode_number = int_or_none(self._search_regex(
EPISODENUM_RE, _season_episode, 'episode number',
default=None, group='episode'))
thumbnails = None
images = data.get('images')
if images and isinstance(images, dict):
web_images = images.get('webImages')
if isinstance(web_images, list):
thumbnails = [{
'url': image['imageUrl'],
'width': int_or_none(image.get('width')),
'height': int_or_none(image.get('height')),
} for image in web_images if image.get('imageUrl')]
description = data.get('description')
category = data.get('mediaAnalytics', {}).get('category')
common_info = {
'description': description,
'series': series,
'episode': episode,
'season_number': season_number,
'episode_number': episode_number,
'categories': [category] if category else None,
'age_limit': parse_age_limit(data.get('legalAge')),
'thumbnails': thumbnails,
}
vcodec = 'none' if data.get('mediaType') == 'Audio' else None
for entry in entries:
entry.update(common_info)
for f in entry['formats']:
f['vcodec'] = vcodec
points = data.get('shortIndexPoints')
if isinstance(points, list):
chapters = []
for next_num, point in enumerate(points, start=1):
if not isinstance(point, dict):
continue
start_time = parse_duration(point.get('startPoint'))
if start_time is None:
continue
end_time = parse_duration(
data.get('duration')
if next_num == len(points)
else points[next_num].get('startPoint'))
if end_time is None:
continue
chapters.append({
'start_time': start_time,
'end_time': end_time,
'title': point.get('title'),
})
if chapters and len(entries) == 1:
entries[0]['chapters'] = chapters
return self.playlist_result(entries, video_id, title, description)
def _real_extract(self, url):
video_id = self._match_id(url)
return self._extract_from_mediaelement(video_id)
return self.url_result(
'nrk:%s' % video_id, ie=NRKIE.ie_key(), video_id=video_id)
class NRKTVEpisodeIE(InfoExtractor):
_VALID_URL = r'https?://tv\.nrk\.no/serie/(?P<id>[^/]+/sesong/\d+/episode/\d+)'
_VALID_URL = r'https?://tv\.nrk\.no/serie/(?P<id>[^/]+/sesong/(?P<season_number>\d+)/episode/(?P<episode_number>\d+))'
_TESTS = [{
'url': 'https://tv.nrk.no/serie/hellums-kro/sesong/1/episode/2',
'info_dict': {
'id': 'MUHH36005220BA',
'id': 'MUHH36005220',
'ext': 'mp4',
'title': 'Kro, krig og kjærlighet 2:6',
'description': 'md5:b32a7dc0b1ed27c8064f58b97bda4350',
'duration': 1563,
'title': 'Hellums kro - 2. Kro, krig og kjærlighet',
'description': 'md5:ad92ddffc04cea8ce14b415deef81787',
'duration': 1563.92,
'series': 'Hellums kro',
'season_number': 1,
'episode_number': 2,
'episode': '2:6',
'episode': '2. Kro, krig og kjærlighet',
'age_limit': 6,
},
'params': {
@ -472,15 +426,16 @@ class NRKTVEpisodeIE(InfoExtractor):
}, {
'url': 'https://tv.nrk.no/serie/backstage/sesong/1/episode/8',
'info_dict': {
'id': 'MSUI14000816AA',
'id': 'MSUI14000816',
'ext': 'mp4',
'title': 'Backstage 8:30',
'title': 'Backstage - 8. episode',
'description': 'md5:de6ca5d5a2d56849e4021f2bf2850df4',
'duration': 1320,
'series': 'Backstage',
'season_number': 1,
'episode_number': 8,
'episode': '8:30',
'episode': '8. episode',
'age_limit': 0,
},
'params': {
'skip_download': True,
@ -489,7 +444,7 @@ class NRKTVEpisodeIE(InfoExtractor):
}]
def _real_extract(self, url):
display_id = self._match_id(url)
display_id, season_number, episode_number = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, display_id)
@ -501,10 +456,12 @@ class NRKTVEpisodeIE(InfoExtractor):
assert re.match(NRKTVIE._EPISODE_RE, nrk_id)
info.update({
'_type': 'url_transparent',
'_type': 'url',
'id': nrk_id,
'url': 'nrk:%s' % nrk_id,
'ie_key': NRKIE.ie_key(),
'season_number': int(season_number),
'episode_number': int(episode_number),
})
return info
@ -518,8 +475,6 @@ class NRKTVSerieBaseIE(NRKBaseIE):
nrk_id = episode.get('prfId') or episode.get('episodeId')
if not nrk_id or not isinstance(nrk_id, compat_str):
continue
if not re.match(NRKTVIE._EPISODE_RE, nrk_id):
continue
entries.append(self.url_result(
'nrk:%s' % nrk_id, ie=NRKIE.ie_key(), video_id=nrk_id))
return entries
@ -531,6 +486,10 @@ class NRKTVSerieBaseIE(NRKBaseIE):
if embedded.get(asset_key):
return asset_key
@staticmethod
def _catalog_name(serie_kind):
return 'podcast' if serie_kind in ('podcast', 'podkast') else 'series'
def _entries(self, data, display_id):
for page_num in itertools.count(1):
embedded = data.get('_embedded') or data
@ -564,7 +523,16 @@ class NRKTVSerieBaseIE(NRKBaseIE):
class NRKTVSeasonIE(NRKTVSerieBaseIE):
_VALID_URL = r'https?://(?P<domain>tv|radio)\.nrk\.no/serie/(?P<serie>[^/]+)/(?:sesong/)?(?P<id>\d+)'
_VALID_URL = r'''(?x)
https?://
(?P<domain>tv|radio)\.nrk\.no/
(?P<serie_kind>serie|pod[ck]ast)/
(?P<serie>[^/]+)/
(?:
(?:sesong/)?(?P<id>\d+)|
sesong/(?P<id_2>[^/?#&]+)
)
'''
_TESTS = [{
'url': 'https://tv.nrk.no/serie/backstage/sesong/1',
'info_dict': {
@ -600,19 +568,34 @@ class NRKTVSeasonIE(NRKTVSerieBaseIE):
# 180 entries, single page
'url': 'https://tv.nrk.no/serie/spangas/sesong/1',
'only_matching': True,
}, {
'url': 'https://radio.nrk.no/podkast/hele_historien/sesong/diagnose-kverulant',
'info_dict': {
'id': 'hele_historien/diagnose-kverulant',
'title': 'Diagnose kverulant',
},
'playlist_mincount': 3,
}, {
'url': 'https://radio.nrk.no/podkast/loerdagsraadet/sesong/202101',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return (False if NRKTVIE.suitable(url) or NRKTVEpisodeIE.suitable(url)
return (False if NRKTVIE.suitable(url) or NRKTVEpisodeIE.suitable(url) or NRKRadioPodkastIE.suitable(url)
else super(NRKTVSeasonIE, cls).suitable(url))
def _real_extract(self, url):
domain, serie, season_id = re.match(self._VALID_URL, url).groups()
mobj = re.match(self._VALID_URL, url)
domain = mobj.group('domain')
serie_kind = mobj.group('serie_kind')
serie = mobj.group('serie')
season_id = mobj.group('id') or mobj.group('id_2')
display_id = '%s/%s' % (serie, season_id)
data = self._call_api(
'%s/catalog/series/%s/seasons/%s' % (domain, serie, season_id),
'%s/catalog/%s/%s/seasons/%s'
% (domain, self._catalog_name(serie_kind), serie, season_id),
display_id, 'season', query={'pageSize': 50})
title = try_get(data, lambda x: x['titles']['title'], compat_str) or display_id
@ -622,7 +605,7 @@ class NRKTVSeasonIE(NRKTVSerieBaseIE):
class NRKTVSeriesIE(NRKTVSerieBaseIE):
_VALID_URL = r'https?://(?P<domain>(?:tv|radio)\.nrk|(?:tv\.)?nrksuper)\.no/serie/(?P<id>[^/]+)'
_VALID_URL = r'https?://(?P<domain>(?:tv|radio)\.nrk|(?:tv\.)?nrksuper)\.no/(?P<serie_kind>serie|pod[ck]ast)/(?P<id>[^/]+)'
_TESTS = [{
# new layout, instalments
'url': 'https://tv.nrk.no/serie/groenn-glede',
@ -682,23 +665,33 @@ class NRKTVSeriesIE(NRKTVSerieBaseIE):
}, {
'url': 'https://nrksuper.no/serie/labyrint',
'only_matching': True,
}, {
'url': 'https://radio.nrk.no/podkast/ulrikkes_univers',
'info_dict': {
'id': 'ulrikkes_univers',
},
'playlist_mincount': 10,
}, {
'url': 'https://radio.nrk.no/podkast/ulrikkes_univers/nrkno-poddkast-26588-134079-05042018030000',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return (
False if any(ie.suitable(url)
for ie in (NRKTVIE, NRKTVEpisodeIE, NRKTVSeasonIE))
for ie in (NRKTVIE, NRKTVEpisodeIE, NRKRadioPodkastIE, NRKTVSeasonIE))
else super(NRKTVSeriesIE, cls).suitable(url))
def _real_extract(self, url):
site, series_id = re.match(self._VALID_URL, url).groups()
site, serie_kind, series_id = re.match(self._VALID_URL, url).groups()
is_radio = site == 'radio.nrk'
domain = 'radio' if is_radio else 'tv'
size_prefix = 'p' if is_radio else 'embeddedInstalmentsP'
series = self._call_api(
'%s/catalog/series/%s' % (domain, series_id),
'%s/catalog/%s/%s'
% (domain, self._catalog_name(serie_kind), series_id),
series_id, 'serie', query={size_prefix + 'ageSize': 50})
titles = try_get(series, [
lambda x: x['titles'],
@ -713,12 +706,14 @@ class NRKTVSeriesIE(NRKTVSerieBaseIE):
embedded_seasons = embedded.get('seasons') or []
if len(linked_seasons) > len(embedded_seasons):
for season in linked_seasons:
season_name = season.get('name')
if season_name and isinstance(season_name, compat_str):
season_url = urljoin(url, season.get('href'))
if not season_url:
season_name = season.get('name')
if season_name and isinstance(season_name, compat_str):
season_url = 'https://%s.nrk.no/serie/%s/sesong/%s' % (domain, series_id, season_name)
if season_url:
entries.append(self.url_result(
'https://%s.nrk.no/serie/%s/sesong/%s'
% (domain, series_id, season_name),
ie=NRKTVSeasonIE.ie_key(),
season_url, ie=NRKTVSeasonIE.ie_key(),
video_title=season.get('title')))
else:
for season in embedded_seasons:
@ -743,6 +738,38 @@ class NRKTVDirekteIE(NRKTVIE):
}]
class NRKRadioPodkastIE(InfoExtractor):
_VALID_URL = r'https?://radio\.nrk\.no/pod[ck]ast/(?:[^/]+/)+(?P<id>l_[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
_TESTS = [{
'url': 'https://radio.nrk.no/podkast/ulrikkes_univers/l_96f4f1b0-de54-4e6a-b4f1-b0de54fe6af8',
'md5': '8d40dab61cea8ab0114e090b029a0565',
'info_dict': {
'id': 'MUHH48000314AA',
'ext': 'mp4',
'title': '20 spørsmål 23.05.2014',
'description': 'md5:bdea103bc35494c143c6a9acdd84887a',
'duration': 1741,
'series': '20 spørsmål',
'episode': '23.05.2014',
},
}, {
'url': 'https://radio.nrk.no/podcast/ulrikkes_univers/l_96f4f1b0-de54-4e6a-b4f1-b0de54fe6af8',
'only_matching': True,
}, {
'url': 'https://radio.nrk.no/podkast/ulrikkes_univers/sesong/1/l_96f4f1b0-de54-4e6a-b4f1-b0de54fe6af8',
'only_matching': True,
}, {
'url': 'https://radio.nrk.no/podkast/hele_historien/sesong/bortfoert-i-bergen/l_774d1a2c-7aa7-4965-8d1a-2c7aa7d9652c',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
return self.url_result(
'nrk:%s' % video_id, ie=NRKIE.ie_key(), video_id=video_id)
class NRKPlaylistBaseIE(InfoExtractor):
def _extract_description(self, webpage):
pass

View File

@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2020.12.31'
__version__ = '2021.01.03'