Video: AV1 and ARM

AV1 is no longer the slow codec it was at release. Real-time encoding and decoding are now practical with open-source software implementations: rav1e for encoding and dav1d for decoding. We’ve also seen in previous talks that SVT-AV1 provides real-time encoding and that WebRTC now has a real-time implementation of the AV1 codec.

In this talk, rav1e contributor Vibhoothi explains more about these projects and how the ARM chipset helps speed up encoding. The rav1e project started in 2018 with the intention of being a fast, cross-platform AV1 encoder with a small binary, which Vibhoothi says is exactly what we have in 2021. Dav1d is the complementary decoder project. AV1 decoding is found in many places now: in Android Q, in Microsoft’s media extension, in VLC on Linux and macOS thanks to dav1d, in all major browsers, and on NVIDIA and AMD GPUs plus Intel Tiger Lake CPUs. Netflix even uses dav1d to stream AV1 to some mobile devices. Overall, then, we see that AV1 has ‘arrived’ in the sense that it’s in common and increasing use.

The ARM CPU architecture underpins nearly all smartphones and most tablets, so ARM is found in a vast number of devices. It’s only relatively recently that ARM has made it into mainstream servers. One big milestone has been the release of Neoverse, ARM’s chip design for infrastructure. AWS now offers ARM instances delivering 40% higher performance at 20% lower cost. These have been snapped up by Netflix but also by a plethora of non-media companies. Recently, Apple made waves with the M1, its ARM-based desktop chip whose benchmarks are far in excess of the previous x86 offering, which shows that the future for ARM-based implementations of the rav1e encoder and dav1d decoder is bright.

Vibhoothi outlines how dav1d works better on ARM than x86, with improved threading support, hand-written assembly optimisations and support for 10-bit assembly. rav1e, meanwhile, has wide support in VLC, GStreamer, FFmpeg, libavif and others.

The talk finishes with a range of benchmarks showing how better-than-real-time encoding and decoding is possible and how the number of threads relates to the throughput. Vibhoothi’s final thoughts focus on what’s still missing in the ARM implementations.

Watch now!
Speaker

Vibhoothi
Developer, VideoLAN
Research Assistant, Trinity College Dublin
Codec Development, rav1e, Mozilla

Video: Bit-Rate Evaluation of Compressed HDR using SL-HDR1

HDR video can look vastly better than standard dynamic range (SDR), but much of our broadcast infrastructure was built for SDR delivery. SL-HDR1 lets you deliver HDR over SDR transmission chains by breaking the HDR signal down into an SDR video plus enhancement metadata which describes how to reconstruct the original HDR signal. Now that it’s part of the ATSC 3.0 suite of standards, people are asking whether you get better compression using SL-HDR1 or compressing the HDR directly.

HDR works by changing the interpretation of the video samples. As human sight has a non-linear response to luminance, we can take the same 256 or 1024 possible luminance values (for 8-bit and 10-bit video respectively) and map them to brightness so that only a few values are spent where the eye isn’t very sensitive and many are spent where we see well. Humans perceive more detail at lower luminosity, so HDR devotes far more of the luminance values to describing that region and relatively few to high brightness, where specular highlights tend to be. HDR therefore not only increases the dynamic range but actually provides more detail in the low-light areas than SDR.
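To make this concrete, here is a minimal Python sketch of one such non-linear mapping, the SMPTE ST 2084 ‘PQ’ curve used in HDR10. The talk doesn’t single out a particular transfer function, so this is purely illustrative; the constants come from the ST 2084 specification.

```python
# Sketch of the SMPTE ST 2084 (PQ) inverse EOTF: maps linear luminance
# in nits to a normalised 0-1 signal value. Constants from ST 2084.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_encode(nits: float) -> float:
    """Linear luminance (0-10000 nits) -> PQ signal value (0-1)."""
    y = max(nits, 0.0) / 10000.0
    return ((C1 + C2 * y ** M1) / (1 + C3 * y ** M1)) ** M2

# Roughly half the signal range is spent below 100 nits, which is why
# HDR keeps so much detail in the darker regions.
print(pq_encode(100))    # ≈0.51
print(pq_encode(10000))  # 1.0
```

Scaling the 0–1 result by 1023 gives the 10-bit code value, showing how around half the available codes describe luminance below a typical SDR peak of 100 nits.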

Ciro Noronha from Cobalt has been examining the encoding question. Video encoders are agnostic to dynamic range: since HDR and SDR differ only in the meaning assigned to the luminance values, the encoder sees no difference. Yet a number of papers have claimed that sending SL-HDR1 can result in bitrate savings over encoding HDR directly. SL-HDR1 is defined in ETSI TS 103 433-1 and included in ATSC A/341; the metadata is carried using SMPTE ST 2108-1 or within the video stream as SEI messages. Ciro set out to do some tests to see if the claimed savings hold, with technology consultant Matt Goldman giving his perspective on HDR and the findings.

Ciro tested three types of 1080p BT.2020 10-bit content with the AVC and HEVC encoders set to 4:2:0, 10-bit, with a 100-frame GOP. Quality was rated using PSNR as well as two special variants of PSNR which look at deviation within the CIE colour space. The findings show that AVC encode chains benefit more from SL-HDR1 than HEVC, and it’s clear that the benefit is content-dependent. Work remains to connect these results with verified subjective tests. With LCEVC and VVC, MPEG has seen that subjective assessments can show up to 10% better results than objective metrics, and PSNR is known not to correlate well with perceived visual improvements.

Watch now!
Speakers

Ciro Noronha
Executive Vice President of Engineering, Cobalt Digital
President, RIST Forum
Matthew Goldman
Technology Consultant

Video: MPEG-5 Essential Video Coding (EVC) Standard

Learning from the patent missteps of HEVC, MPEG have released MPEG-5 EVC, which brings bitrate savings, faster encoding and clearer licensing terms, including a royalty-free profile. The hope is that, with more control over exposure to patent risk, companies large and small will adopt EVC as they improve and launch streaming services now and in the future.

At Mile High Video 2020, Kiho Choi introduced MPEG-5 Essential Video Coding. Naturally, the motivation to produce a new codec was partly the continued need to reduce video bitrates. With estimates of video’s share of internet traffic, both now and in the future, hovering between 75% and 90%, any reduction in bitrate has a wide benefit, best exemplified by Netflix and Facebook’s decision to reduce the bitrate at the top of their ABR ladders during the pandemic, which impacted the quality available to viewers. The unspoken point of this talk is that if the top rung used EVC, viewers wouldn’t have noticed a drop in quality.

The most important point about EVC, in contrast to VVC, the MPEG/ISO co-defined standard from last year, is that it gives businesses a lot of control over their exposure to patent royalties. It’s no secret that much HEVC adoption has been hampered by the risk that large users could be approached for licensing fees. Whilst it has made its way into Apple devices, which is no small success, big players like ESPN won’t have anything to do with it. EVC tackles this problem in two ways. One is the baseline profile, which provides bitrate savings over its predecessors but uses a combination of technologies that are either old enough to no longer be eligible for royalty payments or have been validated as free to use. Companies should therefore be able to use this profile without any reasonable concern over legal exposure. Moreover, the main profile, which does use patented technologies, allows each individual tool to be switched off, meaning anyone encoding EVC has control, assuming the vendor makes this possible, over which technologies they are using and hence their exposure to risk. Kiho points out that this business-requirements-first approach is new and in contrast to many codecs.

Kiho highlights a number of the individual tools within both the baseline and main profiles which provide the bitrate savings, before showing us the results of the objective and subjective testing. Within the EVC documents, the testing methodology is spelt out to allow EVC to be compared against its predecessors AVC and HEVC. The baseline profile shows an improvement of 38% for 1080p60 material and 35% for UHD material compared to AVC doing the same tasks, yet encoding is quicker (less compute needed) and decoding takes approximately the same time. The main profile, being more efficient, is compared against HEVC, which is itself around 50% more efficient than AVC. Against HEVC, Kiho says, the EVC main profile produces an improvement of around 30% encoding gain for UHD footage and 25% for 1080p60 footage. Encoding takes close to 5x longer and decoding around 1.5x longer than HEVC.
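As a back-of-envelope check, the quoted gains can be chained to put the main profile in AVC terms. This assumes the percentage savings compound multiplicatively, which real content won’t do exactly, so treat the figures as rough.

```python
# Back-of-envelope only: chain the quoted coding gains multiplicatively
# to express a codec's bitrate as a fraction of the AVC bitrate.
def chain_savings(*savings):
    """Each saving is a fractional bitrate reduction over the previous codec."""
    bitrate = 1.0
    for s in savings:
        bitrate *= 1 - s
    return bitrate

hevc = chain_savings(0.50)              # HEVC: ~50% of the AVC bitrate
evc_uhd = chain_savings(0.50, 0.30)     # EVC main, UHD: ~0.35 of AVC
evc_1080p60 = chain_savings(0.50, 0.25) # EVC main, 1080p60: ~0.375 of AVC
print(evc_uhd, evc_1080p60)
```

In other words, the quoted figures imply EVC main profile needs roughly a third of the AVC bitrate for UHD, at the cost of the longer encode times noted above.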

Kiho finishes by summarising subjective testing of SDR and HDR videos, which shows that, in contrast to the objective savings calculated by computers, perceived quality in practice is higher and enables a greater bitrate reduction, a phenomenon seen in other codec comparisons such as LCEVC. SDR results show a 50% encoding gain for 4K and 30% for 1080p60 against AVC. Against HEVC, the main profile delivers 50% coding gain for 4K content and 40% for 1080p60. For HDR, the main profile provides an approximately 35% encoding gain for both 1080p60 and 4K.

Watch now!
Speakers

Kiho Choi
Senior Engineer & Technical Lead for Multimedia Standards at Samsung Electronics
Lead Editor of MPEG-5 Part 1 Essential Video Coding

Video: Cloud Encoding – Overview & Best Practices

There are so many ways to work in the cloud. You can use a monolithic solution that does everything for you, which by its nature is almost guaranteed to under-deliver on features in one way or another for any non-trivial workflow. Or you can pick best-of-breed functional elements and plumb them together yourself. With the former, you get a fast time to market and built-in simplicity along with some known limitations. With the latter, you may get exactly what you need, to the standard you wanted, but there’s a lot of work to implement and test the system.

Tom Kuppinen from Bitmovin joins Christopher Olekas from SSIMWAVE, host of this Kitchener-Waterloo Video Tech talk on cloud encoding. After the initial introduction to ‘middle-aged’ startup Bitmovin, Tom talks about what ‘agility in the cloud’ means, including being cloud-agnostic. This is the as-yet-unmentioned elephant in the room for broadcasters, who are so used to having extreme redundancy. Whether it’s the BBC’s “no closer than 70m” requirement for separation of circuits or the standard deployment methodology for SMPTE ST 2110 systems with two totally independent networks, putting everything into one cloud provider really isn’t in the same ballpark. AWS has availability zones, of course, which are one of a number of great ways of reducing the blast radius of problems. But surely there’s no better way of reducing the impact of an AWS problem than having part of your infrastructure in another cloud provider.

Bitmovin have implementations in Azure, Google Cloud and AWS along with other cloud providers. In this author’s opinion, it’s a sign of the maturity of the market that this is being thought about, but few companies truly use multiple cloud providers in an agnostic way; this will surely change over the next five years. For reliable and repeatable deployments, API control is your best bet. For detailed monitoring, you will need APIs. For connecting together solutions from different vendors, you’ll need APIs. It’s no surprise that Bitmovin say they program ‘API first’; it’s a really important element of any medium-to-large deployment.


When it comes to the encoding itself, per-title encoding helps reduce bitrates and storage. Tom explains how it analyses each video and chooses the best combination of parameters for that title. In the Q&A, Tom confirms they are working on implementing per-scene encoding, which promises still more savings.
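The idea can be sketched as follows. This is purely illustrative and not Bitmovin’s actual algorithm: the resolutions, bitrates and quality scores are made up, standing in for whatever metric the per-title analysis produces.

```python
# Illustrative per-title ladder selection (not Bitmovin's algorithm):
# for each rendition, pick the cheapest test encode of *this* title
# that still meets the quality target, rather than a fixed ladder.
# (bitrate_kbps, quality_score) pairs from hypothetical test encodes:
test_encodes = {
    1080: [(2000, 88), (3500, 93), (5000, 95)],
    720:  [(1000, 85), (1800, 91), (2500, 93)],
}

def per_title_ladder(test_encodes, target_quality=90):
    ladder = {}
    for height, points in test_encodes.items():
        ok = [(br, q) for br, q in sorted(points) if q >= target_quality]
        # fall back to the best available point if the target is unreachable
        ladder[height] = ok[0][0] if ok else max(points)[0]
    return ladder

print(per_title_ladder(test_encodes))  # {1080: 3500, 720: 1800}
```

Easy content would hit the target at lower bitrates and hard content at higher ones, which is where the bitrate and storage savings come from.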

Part and parcel of a best-of-breed encoding solution is using best-of-breed codecs. Bitmovin were early with AV1 and also support VP9 and HEVC. They can distribute the encoding so that it runs in parallel across as many cores as needed; their initial AV1 offering was spread over more than 200 cores.

Tom talks about how the cloud-based codecs integrate into workflows and reveals that HDR conversion, instance pre-warming, advanced subtitling support and AV1 improvements are on the roadmap, before leading into the Q&A. Questions include whether it’s difficult to deploy on multiple clouds, which HDR standards are likely to become the favourites, what the pain points of live streaming are, and how to handle metadata.

Watch now!
Speakers

Tom Kuppinen
Senior Sales Engineer,
Bitmovin
Moderator: Christopher Olekas
Senior Software Engineer,
SSIMWAVE Inc.