Video: The ROI of Deploying Multiple Codecs

Adding a new codec to your streaming service is a big decision. It seems inevitable that H.264 will be around for a long time and that new codecs won’t replace it, but will simply take their share of the market. In the short term, this means your streaming service may need to deliver both H.264 and your new codec, which adds complexity and increases CDN storage requirements. What are the steps to justifying a move to a new codec and what’s the state of play today?

In this Streaming Media panel, Jan Ozer is joined by Facebook’s Colleen Henry, Amnon Cohen-Tidhar from Cloudinary and Anush Moorthy from Netflix to talk about their experiences with, and approach to, new codecs. Anush starts by outlining the need to consider decoder support as a major step in rolling out a new codec. The topic of decoder support came up several times during the panel when discussing the merits of hardware versus software decoding. Colleen points out that running VP9 and VVC in software is possible, but some members of the panel see a benefit in deploying hardware; on devices like smart TVs, hardware decoding is a must. When it comes to supporting third-party devices, we hear that logging is vitally important since, when you can’t get your hands on a device to test with, it’s all you have to help improve the experience. It’s best, in Facebook’s view, to work closely with vendors to get the most out of their implementations. Amnon adds that his company is working hard to push forward improved reporting from browsers so they can better indicate their decoding capabilities.


Colleen talks about the importance of codec switching to enhance performance at the bottom end of the ABR ladder, using a codec like AV1 on the lowest rungs with H.264 at the higher end. This is a good compromise between the computation AV1 needs and delivering the best quality at very low bitrates. But Anush points out that storage requirements increase when you start using two codecs, particularly in the CDN, so this needs to be weighed as part of onboarding a new codec. Dropping AV1 at higher bitrates is an acknowledgement that the computational cost of encoding also has to be considered.
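
As a rough illustration of the idea, here is a minimal sketch of such a mixed-codec ladder; the rungs and the cut-over bitrate are hypothetical, not a configuration any panellist describes:

```python
# A minimal sketch of the mixed-codec ladder idea: AV1 on the
# low-bitrate rungs where its efficiency matters most, H.264 higher
# up where decoder support and encode cost dominate. The rungs and
# cut-over point are hypothetical, not a real service's configuration.

CUTOVER_KBPS = 1000   # hypothetical point above which we stay on H.264
RUNGS_KBPS = [235, 375, 560, 750, 1050, 1750, 3000, 4500]

ladder = [
    {"bitrate_kbps": rung,
     "codec": "av1" if rung < CUTOVER_KBPS else "h264"}
    for rung in RUNGS_KBPS
]

for rung in ladder:
    print(f"{rung['bitrate_kbps']:>5} kbps -> {rung['codec']}")
```

In practice, a player would also need to check decoder support before selecting the AV1 rungs, which is exactly the device-capability question the panel keeps returning to.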

The panel briefly discusses newer codecs such as MPEG VVC and MPEG LCEVC. Colleen sees promise in VVC inasmuch as it can be decoded in software today. She also speaks well of LCEVC, suggesting we call it an ‘enhancement codec’ because of the way it works. To find out more about these, check out this SMPTE talk. Both can be deployed as software decoders, which offers a way to get started while hardware support establishes itself in the ecosystem.

Colleen discusses the importance of understanding your assets. If you have live video, your approach is very different to on-demand. If you are lucky enough to have an asset that is getting millions upon millions of views, you’ll want to compress every bit out of that, but for live, there’s a limit to what you can achieve. Also, you need to understand how your long-tail archive is going to be accessed to decide how much effort your business wants to put into compressing the assets further.

The video comes to a close by discussing the Alliance for Open Media’s approach to AV1 encoders and decoders, covering the hard work of optimising the libaom reference encoder and the other implementations which are ready for production. Colleen points out the benefit of WebAssembly, which allows a full decoder to be pushed into the browser, and the discussion ends with codec support for HDR delivery technologies such as HDR10+.

Watch now!
Speakers

Colleen Henry
Cobra Commander of Facebook Video Special Forces.
Anush Moorthy
Manager, Video & Image Encoding,
Netflix
Amnon Cohen-Tidhar
Senior Director of Video Architecture,
Cloudinary
Moderator: Jan Ozer
Principal, Streaming Learning Center
Contributing Editor, Streaming Media

Video: The early days of Netflix Streaming, and perspective

David Ronca has had a long history in the industry and is best known for his time at Netflix, where he was pivotal in the inception and implementation of many technologies. Because Netflix was one of the first companies streaming video on the internet, and at a global scale, they are responsible for many innovations that the industry as a whole benefits from today and are the recipients of 7 technical Emmys. David is often pictured holding an Emmy awarded to Netflix for their role in the standardisation and promotion of Japanese subtitles, one of the less-talked-about innovations in contrast to VMAF, per-title encoding and per-shot encoding.

In this video, talking to John Porterfield, David discusses the early days at Netflix when it was pivoting from mailing DVDs to streaming. He talks about the move from Windows-based applications to cross-platform technologies, at the time Microsoft Silverlight, which was a big change of direction for Netflix and for him. The first Silverlight implementation within Netflix was also the first adaptive bitrate (ABR) version of Netflix, which is where David found his next calling within the company, writing code to synchronise the segments after DRM.

The PS3, David recalls, was the world’s most powerful Blu-ray player, and part of the Blu-ray spec is a Java implementation. David recounts the six months he spent in a team of three working to implement a full adaptive bitrate streaming application within Blu-ray’s Java environment. This was done to get around some contractual issues and worked by extending the features built into Blu-ray for downloading new trailers to show in place of those on disc. This YouTube review from 2009 shows a slick interface slowed down by the speed of the internet connection.

David also talks about his close work with, and respect for, Netflix colleague Anne Aaron, who has been featured previously on The Broadcast Knowledge. He goes on to talk about the inception of VMAF, a metric for computationally determining the quality of video, developed by Netflix because they didn’t feel any of the existing metrics, such as PSNR and MS-SSIM, captured the human opinion of video well enough. It’s widely understood that PSNR has its place but can give very different results to subjective evaluations. And, indeed, VMAF is not perfect either, as David mentions. However, using VMAF well and understanding its limits results in a much more accurate description of quality than many other metrics and, unlike competing metrics such as SSIMWAVE’s SSIMPLUS, it is open source and royalty-free.

David concludes his talk with John saying that high-quality, well-delivered streaming is now everywhere. The struggles of the early years have resulted in a lot of well-learned lessons for the industry at large. This commoditisation is welcome and shows a maturity in the industry that raises the question of where the puck is going next. For David, environmental sustainability is one of the key goals. Both environmentally and financially, he says streaming providers will now want to maximise the output-per-watt of their data centres. Data centre power is currently 3% of all global power consumption and is forecast to reach up to 20%. Looking to newer codecs is one way to achieve a reduction in power consumption. David spoke about AV1 the last time he talked with John; it delivers lower bitrates but with high computation requirements. At hyperscale, using dedicated ASIC chips to do the encoding is one way to drive down power consumption. An alternative route is the new MPEG codec LCEVC, which delivers better-than-AVC performance in software at much-reduced power consumption. With the prevalence of video, both for entertainment and beyond (body cams, for example), moving to more power-efficient codecs and codec implementations seems the obvious, and moral, move.

Watch now!
Speakers

David Ronca
Director, Video Encoding,
Facebook
John Porterfield
Freelance Video Webcast Producer and Tech Evangelist
JP’sChalkTalks YouTube Channel

Video: Netflix – Delivering better video encodes for legacy devices

With over 139 million paying customers, Netflix is very much in the bandwidth optimisation game. It keeps their costs down, it keeps customers’ costs down for those on metered tariffs and a lower bitrate keeps the service more responsive.

As we’ve seen on The Broadcast Knowledge over the years, Netflix has tried hard to find new ways to encode video with per-title encoding, VMAF and, more recently, per-shot encoding, as well as moving to more efficient codecs such as AV1.

Mariana Afonso from Netflix discusses what you do with devices that can’t decode the latest codecs, whether because they are too old or because they can’t get certification. Techniques such as per-title encoding work well here because they are wholly managed in the encoder, whereas with a codec such as AV1 the decoder has to support it too, meaning it’s not as widely applicable an optimisation.

As per-title encoding was developed within Netflix before the VMAF metric was finished, it still uses PSNR, explains Mariana. This means there is still an opportunity to bring down bitrates by switching to VMAF. Because VMAF more accurately captures how the video looks, it can guide optimisation algorithms better and shows gains in tests.
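
To give a feel for how per-title optimisation works, here is a minimal sketch: encode each title at several candidate bitrates, score each encode with a quality metric (PSNR or VMAF), and keep only the encodes on the upper convex hull of the rate-quality curve. The candidate points below are hypothetical, not Netflix data.

```python
# Sketch of per-title ladder selection: encode at several candidate
# bitrates, score each encode, then keep only the candidates on the
# upper convex hull of the rate-quality curve.

def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_convex_hull(encodes):
    """Return the encodes on the upper convex hull of (bitrate, quality)."""
    hull = []
    for p in sorted(encodes):
        # Drop the previous point while it falls below the line from
        # the point before it to the new candidate p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

# Hypothetical candidate encodes for one title: (bitrate_kbps, quality)
candidates = [(235, 35.0), (375, 48.0), (560, 58.0), (750, 61.0),
              (1050, 70.0), (1750, 80.0), (2350, 82.0), (3000, 88.0)]
print(upper_convex_hull(candidates))
# Only the efficient points survive; (750, 61.0) and (2350, 82.0) are
# dominated by their neighbours and get dropped.
```

Per-shot encoding takes the same idea further, applying it to each shot rather than the whole title.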

Better than per-title is per-chunk. The per-chunk work modulates the average target bitrate from chunk to chunk, which avoids over-allocating bits to low-complexity scenes and delivers more consistent quality, with gains of 6 to 16%.
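
As a sketch of the idea, assuming a complexity score per chunk (for instance from a fast first-pass encode), the per-chunk targets might be modulated like this; the numbers are illustrative, not Netflix’s algorithm:

```python
# Sketch of per-chunk bitrate modulation: scale each chunk's target
# bitrate by its relative complexity while keeping the overall average
# at the ladder's target.

def per_chunk_targets(avg_target_kbps, complexities):
    """Allocate a target bitrate per chunk, proportional to complexity,
    normalised so the mean stays at avg_target_kbps."""
    mean_complexity = sum(complexities) / len(complexities)
    return [avg_target_kbps * c / mean_complexity for c in complexities]

# Hypothetical complexity scores for six chunks
complexities = [0.6, 1.4, 1.0, 0.5, 1.8, 0.7]
targets = per_chunk_targets(3000, complexities)
print([round(t) for t in targets])
# Low-complexity chunks get fewer bits; the average stays at 3000 kbps.
```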

Watch now!
Speaker

Mariana Afonso
Research Scientist, Video Algorithms,
Netflix

Video: Measuring Video Quality with VMAF – Why You Should Care

VMAF, from Netflix, has become a popular tool for evaluating video quality since its launch as an Open Source project in 2017. Coming out of research from the University of Southern California and The University of Texas at Austin, it’s seen as one of the leading ways to automate video assessment.

Netflix’s Christos Bampis gives us a brief overview of VMAF’s origins and its aims. VMAF came about because other metrics, such as MS-SSIM and, in particular, PSNR, aren’t close enough indicators of quality. Indeed, Christos shows that when it comes to animated content (i.e. anime and cartoons) subjective scores can be very high, yet the PSNR score can be the same as that of a live-action video clip which humans rate a lot lower, subjectively. Moreover, in less extreme examples, Christos explains, PSNR is often 5% or so away from the actual subjective score in either direction.
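
For context, PSNR reduces to a single pixel-wise calculation, which is exactly why it can disagree with human opinion; a minimal sketch, where frames are simply flat lists of 8-bit samples:

```python
import math

def psnr(ref, dist, peak=255.0):
    """PSNR in dB between two equal-length lists of 8-bit pixel values.
    Purely pixel-wise: it has no notion of contrast masking, motion or
    content type, which is what VMAF's features add."""
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

print(round(psnr([100, 120, 130, 140], [101, 118, 133, 139]), 1))  # 42.4
```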

To a first approximation, VMAF is a method of extracting the spatial and temporal information from a video frame in a way which emphasises the things humans are attuned to, such as contrast masking. Christos shows an example of a picture where artefacts in the trees are much harder to see than similar artefacts on a colour gradient such as a sky or still water. These extraction methods take account of situations like this, and their outputs are fed into a trained model which matches them to the scores humans would have given. The idea is that, trained on many examples, the model can correctly predict a human’s score from the data extracted from a picture. Christos shows examples of how well VMAF outperforms PSNR in gauging video quality.
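
If you want to try VMAF on your own content, one common route is ffmpeg’s libvmaf filter; a minimal sketch, assuming an ffmpeg build compiled with libvmaf and using hypothetical file names:

```python
# Run VMAF via ffmpeg's libvmaf filter. The first input is the
# distorted (encoded) clip under test; the second is the pristine
# reference. File names here are hypothetical.
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "encoded.mp4",      # distorted clip under test
    "-i", "reference.mp4",    # pristine reference
    "-lavfi", "libvmaf=log_fmt=json:log_path=vmaf.json",
    "-f", "null", "-",        # discard decoded output, keep the scores
], check=True)
# The aggregate VMAF score (0-100) is printed to stderr and written,
# along with per-frame scores, to vmaf.json.
```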

Challenges are the focus of the second half of the talk: what still needs working on to improve VMAF? Christos zooms in on two areas: design dimensionality and noise. By design dimensionality, he means how VMAF can be extended to be more general, delivering a number which has a consistent meaning in different scenarios. As the VMAF model has been trained on AVC, how can we deal with the different artefacts seen with different codecs? Do we need a new model for HDR content instead of SDR, and how should viewing conditions, whether ambient light or the resolution and size of the display device, be brought into the metric? The second challenge Christos highlights is noise, as he reveals VMAF tends to give lower scores than it should to noisy sources. Codecs like AV1 have film-grain synthesis tools which need to be evaluated, so behaving correctly in the presence of video noise is important.

The talk finishes with Christos outlining that VMAF’s applicability to the industry is only increasing as new codecs such as LCEVC, VVC and AV1 arrive; such diversity in the codec ecosystem wasn’t an obvious prediction in 2014 when the initial research work started. Christos underlines that VMAF is a continually evolving metric which is Open Source and open to contributions. The Q&A covers failure cases, super-resolution and how to interpret close-call results which are only 1% apart.

Watch now!
Download the presentation
Speaker

Christos Bampis
Senior Software Engineer,
Netflix