Video: The early days of Netflix Streaming Days and Perspective

David Ronca has had a long history in the industry and is best known for his time at Netflix, where he was pivotal in the inception and implementation of many technologies. Because Netflix was one of the first companies streaming video on the internet, and at global scale, they are responsible for many innovations that the industry as a whole benefits from today and are the recipient of 7 technical Emmys. David is often pictured holding an Emmy awarded to Netflix for its role in the standardisation and promotion of Japanese subtitles, one of the less-talked-about innovations in contrast to VMAF, per-title encoding and per-shot encoding.

In this video, talking to John Porterfield, David discusses the early days at Netflix when it was pivoting from mailing DVDs to streaming. He talks about the move from Windows-based applications to cross-platform technologies, at the time Microsoft Silverlight, which was a big shift in direction for Netflix and for him. The first Silverlight implementation within Netflix was also the first adaptive bitrate (ABR) version of Netflix, which is where David found his next calling within Netflix: writing code to synchronise the segments after DRM.

The PS3, David recalls, was the world’s most powerful Blu-ray player, and part of the Blu-ray spec is a Java implementation. David recounts the six months he spent in a team of three working to implement a full adaptive bitrate streaming application within Blu-ray’s Java implementation. This was done to get around some contractual issues, and it worked by extending the features built into Blu-ray for downloading new trailers to show instead of those on disc. This YouTube review from 2009 shows a slick interface slowed down only by the speed of the internet connection.

David also talks about his close work with, and respect for, Netflix colleague Anne Aaron, who has been featured previously on The Broadcast Knowledge. He goes on to talk about the inception of VMAF, a metric for computationally determining the quality of video, developed by Netflix because they didn’t feel that any of the existing metrics, such as PSNR and MS-SSIM, captured the human opinion of video well enough. It’s widely understood that PSNR has its place but can give very different results to subjective evaluations. And, indeed, VMAF is not perfect either, as David mentions. However, using VMAF well and understanding its limits results in a much more accurate description of quality than many other metrics give and, unlike competing metrics such as SSIMWAVE’s SSIMPLUS, it is open source and royalty-free.
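To see why PSNR alone can mislead, here is a minimal PSNR computation in Python (using NumPy; the function and frame data are illustrative, not Netflix code). PSNR is a purely pixel-wise error measure, so it cannot tell apart distortions that viewers perceive very differently.

```python
import numpy as np

def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two frames."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10 * np.log10(max_value ** 2 / mse)

# A uniform brightness shift of 2 everywhere gives MSE = 4, so PSNR is
# 10*log10(255^2 / 4) ≈ 42.11 dB, regardless of how visible the error is.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
noisy = frame + 2.0
print(round(psnr(frame, noisy), 2))  # → 42.11
```

A localised but highly visible artefact with the same mean squared error would score identically, which is exactly the gap perceptual metrics like VMAF aim to close.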

David concludes his talk with John by saying that high-quality, well-delivered streaming is now everywhere. The struggles of the early years have resulted in a lot of well-learned lessons for the industry at large. This commoditisation is welcome and shows a maturity in the industry that begs the question of where the puck is going next. For David, environmental sustainability is one of the key goals. Both environmentally and financially, he says, streaming providers will now want to maximise the output-per-watt of their data centres. Data centre power is currently 3% of all global power consumption and is forecast to reach up to 20%. Looking to newer codecs is one way to achieve a reduction in power consumption. When David last spoke with John he discussed AV1, which delivers lower bitrates but with high computation requirements. At hyperscale, using dedicated ASIC chips to do the encoding is one way to drive down power consumption. An alternative route is the new MPEG codec LCEVC, which delivers better-than-AVC performance in software at much-reduced power consumption. With the prevalence of video, both for entertainment and beyond, body cams for example, moving to more power-efficient codecs and codec implementations seems the obvious and moral move.

Watch now!
Speakers

David Ronca
Director, Video Encoding,
Facebook
John Porterfield
Freelance Video Webcast Producer and Tech Evangelist
JP’sChalkTalks YouTube Channel

Video: Netflix – Delivering better video encodes for legacy devices

With over 139 million paying customers, Netflix is very much in the bandwidth optimisation game. It keeps their costs down, it keeps customers’ costs down for those on metered tariffs and a lower bitrate keeps the service more responsive.

As we’ve seen on The Broadcast Knowledge over the years, Netflix has tried hard to find new ways to encode video with Per-Title encoding, VMAF and, more recently, per-shot encoding as well as moving to more efficient codecs such as AV1.

Mariana Afonso from Netflix discusses what you do with devices that can’t decode the latest codecs, either because they are too old or because they can’t get certification. Techniques such as per-title encoding work well here because they are wholly managed in the encoder, whereas with codecs such as AV1 the decoder has to support it too, meaning it’s not as widely applicable an optimisation.

As per-title encoding was developed within Netflix before they had finished their VMAF metric, it still uses PSNR, explains Mariana. This means there is still an opportunity to bring down bitrates by using VMAF: because VMAF more accurately captures how the video looks, it’s able to guide optimisation algorithms better, and it shows gains in tests.

Better than per-title is per-chunk encoding. The per-chunk work modulates the average target bitrate from chunk to chunk, which avoids over-allocating bits to low-complexity scenes and results in more consistent quality, with gains of 6 to 16%.
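The core idea can be sketched in a few lines, assuming a simple per-chunk complexity score; the function name and scaling rule here are illustrative, not Netflix’s actual algorithm:

```python
def per_chunk_targets(chunk_complexities, average_target_kbps):
    """Scale each chunk's bitrate target by its relative complexity,
    keeping the overall average at the requested target."""
    mean_complexity = sum(chunk_complexities) / len(chunk_complexities)
    return [average_target_kbps * c / mean_complexity
            for c in chunk_complexities]

# Low-complexity chunks give up bits that high-complexity chunks can use,
# while the average across the title stays at 3000 kbps.
targets = per_chunk_targets([0.5, 1.0, 1.5], average_target_kbps=3000)
print(targets)  # → [1500.0, 3000.0, 4500.0]
```

In contrast, a fixed per-title target would spend 3000 kbps on every chunk, wasting bits on the easy scenes and starving the hard ones.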

Watch now!
Speaker

Mariana Afonso
Research Scientist, Video Algorithms,
Netflix

Video: Cloud Encoding – Overview & Best Practices

There are so many ways to work in the cloud. You can use a monolithic solution which does everything for you but which, by its nature, is almost guaranteed to under-deliver on features in one way or another for any non-trivial workflow. Or you can pick best-of-breed functional elements and plumb them together yourself. With the former, you get a fast time to market and built-in simplicity along with some known limitations. With the latter, you may have exactly what you need, to the standard you wanted, but there’s a lot of work to implement and test the system.

Tom Kuppinen from Bitmovin joins Christopher Olekas from SSIMWAVE, host of this Kitchener-Waterloo Video Tech talk on cloud encoding. After the initial introduction to ‘middle-aged’ startup Bitmovin, Tom talks about what ‘agility in the cloud’ means: being cloud-agnostic. This is the as-yet-unmentioned elephant in the room for broadcasters, who are so used to having extreme redundancy. Whether it’s the BBC’s “no closer than 70m” requirement for separation of circuits or the standard deployment methodology for systems using SMPTE’s ST 2110, which have two totally independent networks, putting everything into one cloud provider really isn’t in the same ballpark. AWS has availability zones, of course, which are one of a number of great ways of reducing the blast radius of problems. But surely there’s no better way of reducing the impact of an AWS problem than having part of your infrastructure in another cloud provider.

Bitmovin have implementations in Azure, Google Cloud and AWS along with other cloud providers. In this author’s opinion, it’s a sign of the maturity of the market that this is being thought about, but few companies are truly using multiple cloud providers in an agnostic way; this will surely change over the next 5 years. For reliable and repeatable deployments, API control is your best bet. For detailed monitoring, you will need to use APIs. For connecting together solutions from different vendors, you’ll need APIs. It’s no surprise that Bitmovin say they program ‘API First’; it’s a really important element of any medium-to-large deployment.

When it comes to the encoding itself, per-title encoding helps reduce bitrates and storage. Tom explains how it analyses each video and chooses the best combination of parameters for the title. In the Q&A, Tom confirms they are working on implementing per-scene encoding, which promises more savings still.

To add to the complexity of a best-of-breed encoding solution, using best-of-breed codecs is part and parcel of the value. Bitmovin were early with AV1, and they also support VP9 and HEVC. They can also distribute the encoding so that it’s done in parallel by as many cores as needed; their initial AV1 offering spread the encoding over more than 200 cores.

Tom talks about how the cloud-based codecs can integrate into workflows and reveals that HDR conversion, instance pre-warming, advanced subtitling support and AV1 improvements are on the roadmap, before leading on to the Q&A. Questions include whether it’s difficult to deploy on multiple clouds, which HDR standards are likely to become the favourites, what the pain points are in live streaming and how to handle metadata.

Watch now!
Speakers

Tom Kuppinen
Senior Sales Engineer,
Bitmovin
Moderator: Christopher Olekas
Senior Software Engineer,
SSIMWAVE Inc.

Video: Scaling Video with AV1!

A nuanced look at AV1. If we’ve learnt one thing about codecs over the last year or more, it’s that in the modern world pure bitrate efficiency isn’t the only game in town. JPEG 2000 and, now, JPEG XS have always been excused their high bitrates compared to MPEG codecs because they deliver low latency and high fidelity. Now, it’s clear that we also need to consider the computational demand of a codec when evaluating which to use in any one situation.

John Porterfield welcomes Facebook’s David Ronca to understand how AV1 is arriving on the market. David is the director of Facebook’s video processing team, so he is in pole position to understand how useful AV1 is in delivering video to viewers and how well it achieves its goals. The conversation looks at how to encode, the unexpected ways in which AV1 performs better than other codecs, and the state of the hardware and software decoder ecosystem.

David starts by looking at the convex hull, explaining that it’s a way of encoding content multiple times at different resolutions and bitrates and graphing the results. This graph allows you to find the best combination of bitrate and resolution for a target quality. This works well, but the multiple encodes burden the decision with a lot of extra computation to get the best set of encoding parameters. As proof of its effectiveness, David cites a time when a 200kbps max target was given for an encode of video plus audio; the convex hull method gave a good experience for small screens despite the compromises made in encoding fidelity. The important part is being flexible about which resolution you encode, because by allowing the resolution to drift up or down as well as the bitrate, higher-fidelity combinations can be found than by keeping the resolution fixed. This is called per-title encoding and was pioneered by Netflix, where David previously worked, as discussed in the linked talk; he authored this blog post on the topic.
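The selection step can be sketched as a Pareto-frontier simplification of the convex hull: from trial encodes at several resolution/bitrate pairs, keep only the points no other encode beats on both bitrate and quality. The data below is hypothetical, and a real pipeline would measure quality with a metric such as VMAF:

```python
def convex_hull_points(encodes):
    """Keep the Pareto-efficient encodes: those not beaten on both
    bitrate (lower is better) and quality (higher is better).
    Each encode is (resolution, bitrate_kbps, quality)."""
    efficient = []
    for res, rate, qual in encodes:
        dominated = any(r2 <= rate and q2 >= qual and (r2, q2) != (rate, qual)
                        for _, r2, q2 in encodes)
        if not dominated:
            efficient.append((res, rate, qual))
    return sorted(efficient, key=lambda e: e[1])

encodes = [
    ("1080p", 3000, 92), ("1080p", 1500, 70),  # 1080p starves at low bitrates
    ("720p",  1500, 80), ("720p",  800, 68),
    ("540p",  800, 72),                        # 540p wins at 800 kbps
]
print(convex_hull_points(encodes))
```

Note how the best resolution changes along the curve: 540p at 800 kbps, 720p at 1500 kbps and 1080p at 3000 kbps, which is exactly the flexibility David describes.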

It’s an accepted fact that encoder complexity increases with every generation. This holds particularly in the standard MPEG line, where MPEG-2 gave way to AVC, which gave way to HEVC, which is now being superseded by VVC, each achieving an approximately 50% compression improvement at the cost of a ten-fold computation increase. But David contends that this buries the lede. Whilst it’s true that the best (read: slowest) setting improves compression by 50% with a ten-fold complexity increase, it’s often missed that at the other end of the curve, one of the fastest settings of the newer codec can now match the best of the old codec with a 90% reduction in computation. For companies encoding in software, this is big news. David demonstrates this by graphing the SVT-AV1 encoder against the x265 HEVC encoder, and that against x264.
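The shape of that argument can be captured in a tiny selection function. The numbers below are hypothetical, chosen only to mirror the 90% claim; they are not measured results from the talk:

```python
def cheapest_encode(measurements, quality_floor):
    """From measured (codec, preset, quality, cpu_hours) points, pick the
    configuration that meets the quality floor at the least compute."""
    viable = [m for m in measurements if m[2] >= quality_floor]
    return min(viable, key=lambda m: m[3])

# Hypothetical points: the newer codec's fast preset matches the older
# codec's best quality at roughly 90% less compute.
measurements = [
    ("x265",    "veryslow", 90, 10.0),
    ("x265",    "fast",     84,  1.5),
    ("SVT-AV1", "preset 8", 90,  1.0),  # fast AV1 preset
    ("SVT-AV1", "preset 2", 95, 12.0),  # slow AV1 preset
]
print(cheapest_encode(measurements, quality_floor=90))
# → ('SVT-AV1', 'preset 8', 90, 1.0)
```

Comparing only the slowest presets would show the familiar 50%-better-for-10x-slower story; comparing at a fixed quality floor reveals the compute saving David highlights.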

David touches on an important point, that there is so much video encoding going on in the tech giants and distributed around the world, that it’s important for us to keep reducing the complexity year on year. As it is now, with the complexity increasing with each generation of encoder, something has to give in the future otherwise complexity will go off the scale. The Alliance for Open Media’s AV1 has something to say on the topic as it’s improved on HEVC with only a 5% increase in complexity. Other codecs such as MPEG’s LCEVC also deliver improved bitrate but at lower complexity. There is a clear environmental impact from video encoding and David is focused on reducing this.

AOM is also fighting the commercial problem that codecs have. Companies don’t mind paying for codecs, but they do mind uncertainty; after all, what’s the point in paying for a codec if you might still be approached for more money? Whilst MPEG’s approach with VVC and EVC aims to give more control to companies to help them manage their risk, AOM’s royalty-free codec, with a defence fund against legal attacks, arguably offers the most predictable risk of all. AOM’s aim, David explains, is to allow the web to expand without having to worry about royalty fees.

Next is some disappointing news for AV1 fans: hardware decoder deployments have been delayed until 2023/24, which probably means no meaningful mobile penetration until 2026/27. In the meantime, the very good dav1d decoder, and also gav1, are expected to fill the gap. Already quite fast, the aim is for them to deliver 720p60 decoding on average Android devices by 2024.

Watch now!
Speakers

David Ronca
Director, Video Encoding,
Facebook
John Porterfield
Freelance Video Webcast Producer and Tech Evangelist
JP’sChalkTalks YouTube Channel