Banding is one of the most common visual artefacts affecting both video and images. The scourge of the beautiful sunset and the enemy of natural skin tones, banding is very noticeable because it’s not seen in nature. It happens when there is not enough bit depth to allow for a smooth gradient of colour or brightness, which leads to a strip of one shade followed by an abrupt change to a strip of the next, clearly different, shade.
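The link between bit depth and band width can be seen with a few lines of arithmetic. The following is a minimal sketch, not taken from the talk: the helper names `quantise` and `band_widths`, the 1920-pixel width and the 10% brightness span are all assumptions chosen for illustration.

```python
def quantise(value, bits):
    """Map a value in [0.0, 1.0] to the nearest of 2**bits code levels."""
    levels = (1 << bits) - 1
    return round(value * levels) / levels


def band_widths(samples):
    """Return the width, in samples, of each strip of identical values."""
    widths = []
    run = 1
    for prev, cur in zip(samples, samples[1:]):
        if cur == prev:
            run += 1
        else:
            widths.append(run)
            run = 1
    widths.append(run)
    return widths


# Hypothetical test signal: a slow gradient covering 10% of the
# brightness range across a 1920-pixel-wide picture.
WIDTH = 1920
gradient = [0.20 + 0.10 * x / (WIDTH - 1) for x in range(WIDTH)]

results = {}
for bits in (8, 10):
    results[bits] = band_widths([quantise(v, bits) for v in gradient])
    print(f"{bits}-bit: {len(results[bits])} bands, "
          f"widest {max(results[bits])} px")
```

With fewer bits the same gradient is split into fewer, wider strips, which is why a gentle sky gradient is where banding tends to show first.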
In this Video Tech talk, SSIMWAVE’s Dr. Hojat Yeganeh explains what can be done to reduce or eliminate banding. He starts by explaining how banding is created during compression, where the quantiser has reduced the accuracy of otherwise unique pixels to very similar numbers leaving them looking the same.
Dr. Yeganeh explains why we see these edges so clearly, both by looking at how contrast is defined and by referencing Dolby’s famous graph of contrast steps against luminance. The graph plots 10-bit HDR against 12-bit HDR and shows that the 12-bit PQ curve is always below the ‘Barten limit’, the threshold below which contrast steps are not visible. It also shows that a 10-bit HDR image is always susceptible to showing quantised, i.e. banded, steps.
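The contrast step between two adjacent code values can be worked through numerically. This is a rough sketch that assumes the PQ EOTF constants from SMPTE ST 2084 and a simple full-range mapping of code values to signal (real video usually uses narrow range). The function names are made up for illustration, and the Barten threshold itself is not modelled here.

```python
# PQ (SMPTE ST 2084) EOTF constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_to_nits(signal):
    """Convert a PQ signal value in [0, 1] to luminance in cd/m2."""
    p = signal ** (1 / M2)
    return 10000 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)


def contrast_step_percent(signal, bits):
    """Contrast between the code value at `signal` and the next one up.

    Uses the definition 2 * (L_high - L_low) / (L_high + L_low).
    Assumes full-range coding, i.e. signal = code / (2**bits - 1).
    """
    levels = (1 << bits) - 1
    code = min(int(signal * levels), levels - 1)
    low = pq_to_nits(code / levels)
    high = pq_to_nits((code + 1) / levels)
    return 100 * 2 * (high - low) / (high + low)


for signal in (0.25, 0.5, 0.75):
    nits = pq_to_nits(signal)
    step10 = contrast_step_percent(signal, 10)
    step12 = contrast_step_percent(signal, 12)
    print(f"{nits:8.2f} cd/m2: 10-bit step {step10:.3f}%, "
          f"12-bit step {step12:.3f}%")
```

The 12-bit step is roughly a quarter of the 10-bit step at the same luminance, which is the gap the graph relies on when it places one curve under the visibility threshold and the other partly above it.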
Why do we deliver 10-bit HDR video if it can still show banding? Because in real footage, camera noise and film grain serve to break up the bands. Dr. Yeganeh explains that this random noise amounts to ‘dithering’. The technique is well known in both audio and video: when you add random noise which changes over time, humans stop being able to see the bands. TV manufacturers also apply dithering to the picture before displaying it, which can further break up banding at the cost of more noise in the image.
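A small numerical experiment shows why noise that changes over time helps. This is a minimal sketch under stated assumptions: uniform noise of one quantisation step is added before quantising, 6 bits are used to exaggerate the effect, and the pixel value, frame count and helper names are invented for the example. It is not how any particular camera or TV implements dithering.

```python
import random


def quantise(value, bits):
    """Clamp to [0, 1] and snap to the nearest of 2**bits code levels."""
    levels = (1 << bits) - 1
    clamped = min(max(value, 0.0), 1.0)
    return round(clamped * levels) / levels


def dithered(value, bits, rng):
    """Add uniform noise spanning one quantisation step, then quantise."""
    step = 1.0 / ((1 << bits) - 1)
    noise = rng.uniform(-0.5, 0.5) * step
    return quantise(value + noise, bits)


rng = random.Random(0)   # fixed seed so the run is repeatable
BITS = 6                 # deliberately coarse to exaggerate the effect
FRAMES = 200             # the eye averages over time; here we average frames
true_value = 0.3371      # hypothetical pixel value between two code levels

plain = quantise(true_value, BITS)
average = sum(dithered(true_value, BITS, rng) for _ in range(FRAMES)) / FRAMES

plain_error = abs(plain - true_value)
dither_error = abs(average - true_value)
print(f"error without dither: {plain_error:.5f}")
print(f"error of time-averaged dithered value: {dither_error:.5f}")
```

Without dither every frame lands on the same wrong code level. With dither, individual frames flip between the two neighbouring levels, and their average sits much closer to the intended value, so the hard edge between bands is traded for fine noise.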
How can you automatically detect banding? We hear that typical metrics like VMAF and SSIM aren’t usefully sensitive to banding. SSIMWAVE’s SSIMPLUS metric, on the other hand, can also produce a banding detection map which helps with the automatic identification of banding.
The video finishes with questions including when banding is part of artistic intention, types of artefacts not identifiable by typical metrics, and consumer display limitations, among others.
Automatic assessment of video quality is essential for creating encoders, selecting vendors, choosing operating points and, for online streaming services, in ongoing service improvement. But getting a computer to understand what looks good and what looks bad to humans is not trivial. When the computer doesn’t have the source video to compare against, it’s even harder.
In this talk, Dr. Ahmed Badr from SSIMWAVE looks at how video quality assessment (VQA) works and goes into detail on No-Reference (NR) techniques. He starts by stating the case for VQA, which is an extension of, and often a replacement for, subjective scoring by people. Subjective scoring is time-consuming, can be expensive because of the people and time involved, and requires specific viewing conditions. When done well, a whole, carefully decorated room is required. So when it comes to analysing all the video created by a TV station or automating per-title encoding optimisation, we know we have to remove the human element.
Ahmed moves on to discuss the challenges of No-Reference VQA, such as identifying intended blur or noise. NR VQA is a two-step process. The first step is extracting features from the video. These features are then mapped to a quality model, which can be done with a machine learning/AI process, and this is the technique Ahmed analyses next. The first task is to come up with a carefully chosen dataset of videos. Then it’s important to choose a metric to use for the training, for instance MS-SSIM or VMAF, so that the learning algorithm can get the feedback it needs to improve. The last two elements are choosing what you are optimising for, technically called a loss function, and choosing an AI model to use.
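The two-step structure, feature extraction followed by a learned mapping to a quality score, can be sketched in miniature. Everything below is a toy built on assumptions and is not SSIMPLUS or the method from the talk: the two features, the linear model, the synthetic noise frames and the target scores (a placeholder standing in for a training metric such as MS-SSIM or VMAF) are all hypothetical.

```python
import random


def box_blur(frame, radius):
    """Degrade a frame with a horizontal moving average (radius 0 = no-op)."""
    blurred = []
    for row in frame:
        out = []
        for x in range(len(row)):
            window = row[max(0, x - radius): x + radius + 1]
            out.append(sum(window) / len(window))
        blurred.append(out)
    return blurred


def extract_features(frame):
    """Step 1: reduce a frame (rows of 0..1 luma values) to a few numbers."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    diffs = [abs(b - a) for row in frame for a, b in zip(row, row[1:])]
    sharpness = sum(diffs) / len(diffs)
    return [sharpness, variance]


def predict(weights, bias, features):
    """Step 2: map features to a quality score with a linear model."""
    return bias + sum(w * f for w, f in zip(weights, features))


def mse_loss(weights, bias, dataset):
    """Loss function: mean squared error against the training targets."""
    errors = [(predict(weights, bias, f) - t) ** 2 for f, t in dataset]
    return sum(errors) / len(errors)


def train(dataset, epochs=2000, rate=0.5):
    """Fit the linear model by plain batch gradient descent on the MSE."""
    weights, bias = [0.0] * len(dataset[0][0]), 0.0
    n = len(dataset)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for features, target in dataset:
            error = predict(weights, bias, features) - target
            grad_b += 2 * error / n
            for i, f in enumerate(features):
                grad_w[i] += 2 * error * f / n
        bias -= rate * grad_b
        weights = [w - rate * g for w, g in zip(weights, grad_w)]
    return weights, bias


# Synthetic dataset: random frames blurred by increasing amounts.
# The targets are placeholders for scores from a chosen training metric.
rng = random.Random(1)
dataset = []
for _ in range(8):
    frame = [[rng.random() for _ in range(16)] for _ in range(16)]
    for radius in range(4):
        target = 1.0 - radius / 4
        dataset.append((extract_features(box_blur(frame, radius)), target))

initial_loss = mse_loss([0.0, 0.0], 0.0, dataset)
weights, bias = train(dataset)
final_loss = mse_loss(weights, bias, dataset)
print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Note that the model never sees the unblurred source at prediction time, only the features of the frame it is given. A toy like this would also mark intentionally soft footage as poor quality, which is exactly the intended-blur challenge mentioned above.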
The data set you create needs to be aimed at exploring a certain aspect or range of aspects of video. It could be that you want to optimise for sports, but if you need a broad array of genres, optimising for reducing compression or scaling artefacts may be the main theme of the video dataset. Ahmed talks about the millions of video samples that they have collated and how they’ve used that to create their metric called SSIMPLUS which can work both with a reference and without.
Zhou Wang explains how to compare HEVC & AVC with AV1 and shares his findings. Using various metrics such as VMAF, PSNR and SSIMPLUS, he explores the effects of resolution on bitrate savings and then turns his gaze to computational complexity.
This talk was given at the Mile High Video conference in Denver CO, 2018.
In this on-demand video, Streaming Learning Center’s Jan Ozer explains objective metrics to us and how they can be used to build better ABR ladders.
Choosing the number of streams in an adaptive group and configuring them is usually a subjective, touchy-feely exercise, with no way to really gauge the effectiveness and efficiency of the streams. However, by measuring stream quality via metrics such as PSNR, SSIMPLUS, and VQM, you can precisely assess the quality delivered by each stream and its relevance to the adaptive group.
This presentation identifies several key objective quality metrics, teaches how to apply them, and provides an objective framework for analyzing which streams are absolutely required in your adaptive group and their optimal configuration.
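As an illustration of the idea, PSNR can be computed per stream and then used to question rungs that add bitrate without adding much quality. This is a minimal sketch rather than the framework from the presentation: the `prune_ladder` helper, the 1 dB threshold and the ladder figures are hypothetical placeholders, not measurements.

```python
import math


def psnr(reference, distorted, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(reference) != len(distorted):
        raise ValueError("frames must be the same size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)


def prune_ladder(rungs, min_gain_db=1.0):
    """Keep a rung only if it improves on the last kept rung by min_gain_db.

    `rungs` is a list of (bitrate_kbps, psnr_db) tuples sorted by bitrate.
    """
    kept = [rungs[0]]
    for bitrate, quality in rungs[1:]:
        if quality - kept[-1][1] >= min_gain_db:
            kept.append((bitrate, quality))
    return kept


# Placeholder ladder: (bitrate in kbps, measured PSNR in dB).
# These numbers are invented purely to exercise the function.
ladder = [(500, 30.0), (1000, 34.0), (1500, 34.3), (3000, 38.0)]
kept = prune_ladder(ladder)
print("rungs worth keeping:", [bitrate for bitrate, _ in kept])
```

In practice each rung would be scored against the source at the display resolution, and the threshold would depend on the metric chosen, since a 1 dB step in PSNR does not correspond to a fixed perceptual difference.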