Video: Super Resolution – The scaler of tomorrow, here today!

If we ever had a time when most displays were the same resolution, those days are long gone with smartphone and tablets with extremely high pixel density nestled in with laptop screens of various resolutions and 1080-line TVs which are gradually being replaced with UHD variants. This means that HD videos are nearly always being upscaled which makes ‘getting upscaling right’ a really worthwhile topic. The well-known basic up/downscaling algorithms have been around for a while, and even the best-performing Lanczos is well over 20 years old. The ‘new kid on the block’ isn’t another algorithm, it’s a whole technique of inferring better upscaling using machine learning called ‘super resolution’.

Nick Chadwick from Mux has been running the code and the numbers to see how well super resolution works. Taking to the stage at Demuxed SF, he starts by looking at where scaling is used and what type it is. The most common algorithms are nearest neighbour, bi-cubic, bi-linear and lanczos with nearest neighbour being the most basic and least-well performing. Nick shows, using VMAF that using these for up and downscaling, that the traditional opinions of how well these algorithms perform are valid. He then introduces some test videos which are designed to let you see whether your video path is using bi-linear or bi-cubic upscaling, presenting his results of when bi-cubic can be seen (Safari on a MacBook Pro) as opposed to bi-linear (Chrome on a MacBook Pro). The test videos are available here.

In the next part of the talk, Nick digs a little deeper into how super resolution works and how he tested ffmpeg’s implementation of super resolution. Though he hit some difficulties in using this young filter, he is able to present some videos and shows that they are, indeed, “better to view” meaning that the text looks sharper and is easier to see with details being more easy pick out. It’s certainly possible to see some extra speckling introduced by the process, but VMAF score is around 10 points higher matching with the subjective experience.

The downsides are a very significant increase in computational power needed which limits its use in live applications plus there is a need for good, if not very good, understanding of ML concepts and coding. And, of course, it wouldn’t be the online streaming community if clients weren’t already being developed to do super-resolution on the decode despite most devices not being practically capable of it. So Nick finishes off his talk discussing what’s in progress and papers relating to the implementation of super resolution and what it can borrow from other developing technologies.

Watch now!
Speaker

Nick Chadwick Nick Chadwick
Software Engineer,
Mux

Video: AV1 at Netflix

Netflix have continually been pushing forward video compression and analysis because their assets are played so many times that every bit saved is real money saved. VMAF is a great example of Netflix’s desire to push the state of the art forward. Developed by Netflix and two universities, this new objective metric allowed them to better evaluate the quality of videos using computer analysis and has continued to be the foundation of their work since.

One use of VMAF has been to verify the results of Netflix’s Per-Shot Encoding method which alters encoding parameters for each shot of the film rather than using a fixed set of parameters for the whole film. The Broadcast Knowledge has featured talks on their previous technique, per-title encoding (among others).

AV1, however must be the most famous innovation that Neflix is behind. A founding member of the Alliance for Open Media (AoM), Netflix saw a need a for a better codec and by making an open one, which also played to the needs of other internet giants such as Google, was a good way to create a vibrant community around it driving submissions to the codec itself but also, it is hoped, in the implementation and adoption.

In this two-part talk, LiWei Guo starts off by explaining the ways in which AV1 will be used by Netflix. Since this talk took place, Netflix has started streaming in AV1 to Android clients. LiWei points out that AV1 supports 10-bit video as standard – a notable difference from other codecs like AVC and HEVC. This allows Netflix to use 10-bit without worrying about decoder compatibility and he shows examples of skies and water which are significantly by the use of 10-bit.

Another feature of AV1 is the Film Grain synthesis which seeks to improve encoding efficiency by removing the random film grain of the source during the encode process then inserting a similar random noise on top to recreate the same look and feel. As anything random can’t be predicted, noise such of this is very wasteful for a codec to try and encode, therefore it’s not <a surprise that this can result in as much as a 30% reduction in bitrate. Before concluding, LiWei briefly explains per-shot encoding then shows data showing the overall improvements.

Andrey Norkin, also from Netflix explains their work with Intel on the SVT-AV1 software video encoder which leverages Intel’s SVT technology, a framework optimised for Xeon chips for video encoding and analysis. Netflix’s motivations are to further increase adoption by delivering a data centre-ready, optimised encoder and to create an AV1 encoder they can use to support their own internal research activities (did someone say AV2?). SVT allows for parallelisation, important for any computer nowadays with so many cores available.

Finishing up, Andrey points us to the Github repository, lets us know the development statement (as of November 2019) and looks at the speed increases that have taken off, comparing SVT-AV1 against the reference libaom encoder.

Watch now!
Speakers

Andrey Norkin Andrey Norkin
Senior Research Scientist,
Netflix
LiWei Guo LiWei Guo
Senior Software Engineer,
Netflix

Video: Delivering Better Manifests with Effective VMAF

Measuring video quality is done daily around the world between two video assets. But what happens when you want to take the aggregate quality of a whole manifest? With VMAF being a well regarded metric, how can we use that in an automatic way to get the overview we need?

In this talk, Nick Chadwick from Mux shares the examples and scripts he’s been using to analyse videos. Starting with an example where everything is equal other than quality, he explains the difficulties in choosing the ‘better’ option when the variables are much less correlated. For instance, Nick also examines the situations where a video is clearly better, but where the benefit is outweighed by the minimal quality benefit and the disproportionately high bitrate requirement.

So with all of this complexity, it feels like comparing manifests may be a complexity too far, particularly where one manifest has 5 renditions, the other only 4. The question being, how do you create an aggregate video quality metric and determine whether that missing rendition is a detriment or a benefit?

Before unveiling the final solution, Nick makes the point of looking at how people are going to be using the service. Depending on the demographic and the devices people tend to use for that service, you will find different consumption ratios for the various parts of the ABR ladder. For instance, some services may see very high usage on 2nd screens which, in this case, may take low-resolution video and also lot of ‘TV’ size renditions at 1080p50 or above with little in between. Similarly other services may seldom ever see the highest resolutions being used, percentage-wise. This shows us that it’s important not only to look at the quality of each rendition but how likely it is to be seen.

To bring these thoughts together into a coherent conclusion, Nick unveils an open-source analyser which takes into account not only the VMAF score and the resolution but also the likely viewership such that we can now start to compare, for a given service, the relative merits of different ABR ladders.

The talk ends with Nick answering questions on the tendency to see jumps between different resolutions – for instance if we over-optimise and only have two renditions, it would be easy to see the switch – how to compare videos of different resolutions and also on his example user data.

Watch now!
Speakers

Nick Chadwick Nick Chadwick
Software Engineer,
Mux

Video: What to do after per-title encoding

Per-title encoding is a common method of optimising quality and compression by changing the encoding options on a file-by-file basis. Although some would say the start of per-scene encoding is the death knell for per-title encoding, either is much better than the more traditional plan of applying exactly the same settings to each video.

This talk with Mux’s Nick Chadwick and Ben Dodson looks at what per-title encoding is and how to go about doing it. The initial work involves doing many encodes of the same video and analysing each for quality. This allows you to out which resolutions and bitrates to encode at and how to deliver the best vide.

Ben Dodson explains the way they implemented this at Mux using machine learning. This was done by getting computers to ‘watch’ videos and extract metadata. That metadata can then be used to inform the encoding parameters without the computer watching the whole of a new video.

Nick takes some time to explain MUX’s ‘convex hulls’ which give a shape to the content’s performance at different bitrates and helps visualise the optimum encoding parameters the content. Moreover we see that using this technique, we can explore how to change resolution to create the best encode. This doesn’t always mean reducing the resolution; there are some surprising circumstances when it makes sense to start at high resolutions, even for low bitrates.

The next stage after per-title encoding is to segment the video and encode each segment differently which Nick explores and explains how to deliver different resolutions throughout the stream seamlessly switching between them. Ben takes over and explains how this can be implemented and how to chose the segment boundaries correctly, again, using a machine learning approach to analysis and decision making.

Watch now!
Speakers

Nick Chadwick Nick Chadwick
Software Engineer,
Mux
Ben Dodson Ben Dodson
Data Scientist,
Mux