Video quality is measured every day, all over the world, by comparing one video asset with another. But what happens when you want the aggregate quality of a whole manifest? With VMAF being a well-regarded metric, how can we use it automatically to get the overview we need?
In this talk, Nick Chadwick from Mux shares the examples and scripts he’s been using to analyse videos. Starting with an example where everything is equal other than quality, he explains how difficult it becomes to choose the ‘better’ option when the variables are much less correlated. For instance, Nick examines situations where one video is clearly better, but only by a small margin, and that benefit is outweighed by the disproportionately high bitrate it requires.
So with all of this complexity, it feels like comparing manifests may be a step too far, particularly where one manifest has 5 renditions and the other only 4. The question is: how do you create an aggregate video quality metric and determine whether that missing rendition is a detriment or a benefit?
Before unveiling the final solution, Nick makes the point of looking at how people are going to be using the service. Depending on the demographic and the devices people tend to use for that service, you will find different consumption ratios for the various parts of the ABR ladder. For instance, some services may see very high usage on 2nd screens, which in this case may take low-resolution video, and also a lot of ‘TV’-size renditions at 1080p50 or above, with little in between. Similarly, other services may seldom see the highest resolutions being used, percentage-wise. This shows us that it’s important not only to look at the quality of each rendition but also at how likely it is to be seen.
To bring these thoughts together into a coherent conclusion, Nick unveils an open-source analyser which takes into account not only the VMAF score and the resolution but also the likely viewership such that we can now start to compare, for a given service, the relative merits of different ABR ladders.
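To make the idea of a viewership-weighted score concrete, here is a minimal sketch in Python. It is not Nick's analyser: the `Rendition` type, the rule that each viewer receives the highest-bitrate rendition fitting their bandwidth, and every number used with it are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    bitrate_kbps: int
    vmaf: float


def pick_rendition(ladder, bandwidth_kbps):
    """Assumed player behaviour: take the highest-bitrate rendition that fits
    the available bandwidth, falling back to the lowest rendition otherwise."""
    fitting = [r for r in ladder if r.bitrate_kbps <= bandwidth_kbps]
    if fitting:
        return max(fitting, key=lambda r: r.bitrate_kbps)
    return min(ladder, key=lambda r: r.bitrate_kbps)


def weighted_vmaf(ladder, viewers):
    """Average VMAF across the audience.

    `viewers` maps an available bandwidth (kbps) to the share of viewing
    time spent at that bandwidth. Shares are normalised, so they need not
    sum to exactly 1.
    """
    total_share = sum(viewers.values())
    weighted = sum(
        share * pick_rendition(ladder, bandwidth).vmaf
        for bandwidth, share in viewers.items()
    )
    return weighted / total_share
```

Because the score is taken over the audience rather than over the renditions, a 5-rendition ladder and a 4-rendition ladder can be compared directly: dropping a rendition only costs quality in proportion to the viewers who would have been served by it.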
The talk ends with Nick answering questions on the tendency to see jumps between different resolutions – for instance if we over-optimise and only have two renditions, it would be easy to see the switch – how to compare videos of different resolutions and also on his example user data.
AVC, now 16 years old, is long in the tooth but supported by billions of devices. The impetus to replace it comes from the drive to serve customers at a lower cost base and on a more capable platform. Cue the new contenders VVC and AV1, not to mention HEVC. It’s no surprise that they compress better than AVC (also known as MPEG-4 Part 10 and H.264), but do they deliver a cost-efficient, legally safe codec on which to build a business?
Thierry Fautier has done the measurements and presents them in this talk. He explains that the tests compared 1080p video using each codec’s reference code which, though unoptimised for speed, should represent the best quality possible from each codec. The results are reproduced in the IBC conference paper.
Licensing is an important topic. Some see HEVC as a failed codec, not in terms of its compression but because of the reluctance of many companies to deploy it, which stems from the business risk of uncertain licensing costs and/or the expense of the known ones. VVC faces the challenge of entering the market while avoiding these concerns, which MPEG is determined to do.
Thierry concludes by comparing AVC against HEVC, AV1 and VVC in terms of deployment dates, deployed devices and the deployment environment. He looks at the challenge of moving large video libraries over to high-complexity codecs, given the cost and time required to re-compress them. The session ends with questions from the audience. Watch now!
Speaker: Thierry Fautier
President-Chair at Ultra HD Forum,
VP Video Strategy, Harmonic
In the ongoing battle to find the minimum bitrate for good-looking video, automation is key to achieving this quickly and cheaply. However, metrics like PSNR don’t always give the best answers, meaning that eyes are still better at the job than silicon.
In this talk from the Demuxed conference, Intel’s Vasavee Vijayaraghavan shows us examples of computer analysis failing to identify the lowest bitrate that still looks good, leaving the encoder spending many megabits on video that looks only imperceptibly better. Furthermore, it’s clear that MOS, the Mean Opinion Score, which has a well-defined protocol behind it, continues to produce the best results, though setting up and co-ordinating the tests takes orders of magnitude more time and money.
Vasavee shows how she has developed a hybrid workflow which combines metrics and MOS, feeding much of the benefit of computer-generated metrics into the manual MOS process. This allows a much more targeted subjective quality assessment, speeding up the whole process while keeping the human touch where it’s most valuable.
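The summary above doesn't spell out the details of the workflow, so the following Python sketch is only one plausible reading of it: use an objective metric to discard encodes that are obviously too poor or obviously wasteful, then spend viewer time only on the remaining shortlist. The function names, the target/margin rule and the MOS threshold are all assumptions for illustration.

```python
from statistics import mean


def shortlist_for_mos(metric_by_bitrate, target, margin):
    """Return the bitrates whose metric score lies within `margin` of
    `target`, in ascending order. Only these go to human viewers."""
    return sorted(
        bitrate
        for bitrate, score in metric_by_bitrate.items()
        if abs(score - target) <= margin
    )


def mean_opinion_score(ratings):
    """MOS is the arithmetic mean of the individual viewer ratings."""
    return mean(ratings)


def lowest_acceptable_bitrate(mos_by_bitrate, min_mos):
    """Lowest shortlisted bitrate whose MOS reaches `min_mos`, or None."""
    acceptable = [b for b, mos in mos_by_bitrate.items() if mos >= min_mos]
    return min(acceptable) if acceptable else None
```

In this reading the metric never makes the final call. It only narrows the field so the slow, expensive subjective test covers a handful of encodes rather than all of them.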
Per-title encoding is a common method of optimising quality and compression by changing the encoding options on a file-by-file basis. Although some would say the start of per-scene encoding is the death knell for per-title encoding, either is much better than the more traditional plan of applying exactly the same settings to each video.
This talk with Mux’s Nick Chadwick and Ben Dodson looks at what per-title encoding is and how to go about doing it. The initial work involves doing many encodes of the same video and analysing each for quality. This allows you to work out which resolutions and bitrates to encode at and how to deliver the best video.
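As a rough illustration of that initial analysis step, the Python sketch below takes a set of trial encodes and, for each bitrate tried, keeps the resolution that scored best. The tuple layout and the function name are hypothetical, and the quality values would come from whatever metric you measure with; this is not Mux's implementation.

```python
def best_resolution_per_bitrate(trials):
    """For each bitrate tried, keep the resolution with the highest quality.

    `trials` is an iterable of (height, bitrate_kbps, quality) tuples, one
    per trial encode. Returns {bitrate_kbps: (height, quality)} ordered by
    ascending bitrate.
    """
    best = {}
    for height, bitrate, quality in trials:
        if bitrate not in best or quality > best[bitrate][1]:
            best[bitrate] = (height, quality)
    return dict(sorted(best.items()))
```

Running this once per title is what makes the ladder content-specific: a simple animation and a fast sports clip will usually end up with different resolutions at the same bitrate.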
Ben Dodson explains the way they implemented this at Mux using machine learning. This was done by getting computers to ‘watch’ videos and extract metadata. That metadata can then be used to inform the encoding parameters without the computer watching the whole of a new video.
Nick takes some time to explain Mux’s ‘convex hulls’, which give a shape to the content’s performance at different bitrates and help visualise the optimum encoding parameters for the content. Moreover, we see that with this technique we can explore how to change resolution to create the best encode. This doesn’t always mean reducing the resolution; there are some surprising circumstances when it makes sense to start at high resolutions, even for low bitrates.
The next stage after per-title encoding is to segment the video and encode each segment differently. Nick explores this and explains how to deliver different resolutions throughout the stream, switching seamlessly between them. Ben then takes over to explain how this can be implemented and how to choose the segment boundaries correctly, again using a machine learning approach to analysis and decision making.
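For readers who have not met the idea before, a convex hull over bitrate/quality measurements can be computed in a few lines. The Python sketch below uses the standard monotone-chain method to find the upper hull across all resolutions at once; the tuple layout is an assumption for illustration, and nothing here is taken from Mux's code.

```python
def _cross(o, a, b):
    """Z-component of (a - o) x (b - o), using only bitrate and quality."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points):
    """Upper convex hull of (bitrate_kbps, quality, height) points.

    Points from every resolution are pooled together. Those left on the
    hull are the encodes that no mix of the other encodes can beat on
    quality for the same bitrate.
    """
    hull = []
    for p in sorted(set(points)):
        # Same bitrate as the previous point: p sorts later, so its quality
        # is at least as high and it replaces the previous point.
        if hull and hull[-1][0] == p[0]:
            hull.pop()
        # Drop earlier points that fall on or below the line to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull
```

Reading the resolution off each hull point shows where the ladder should change resolution. With pooled measurements, a higher resolution can sit on the hull at a bitrate where a lower one might have been expected.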