Video: Optimal Design of Encoding Profiles for Web Streaming

With us since 1998, ABR (Adaptive Bitrate) has been allowing streaming players to select a stream appropriate for their computer and bandwidth. But in this video, we hear that over 20 years on, we’re still developing ways to understand and optimise the performance of ABRs for delivery, finding the best balance of size and quality.

Brightcove’s Yuriy Reznik takes us deep into the theory, but start at the basics of what ABR is and why we. use it. He covers how it delivers a whole series os separate streams at different resolutions and bitrates. Whilst that works well, he quickly starts to show the downsides of ‘static’ ABR profiles. These are where a provider decides that all assets will be encoded at the same set bitrate of 6 or 7 bitrates even though some titles such as cartoons will require less bandwidth than sports programmes. This is where per-title and other encoding techniques come in.

Netflix coined the term ‘per-title encoding’ which has since been called content-aware encoding. This takes in to consideration the content itself when determining the bitrate to encode at. Using automatic processes to determine objective quality of a sample encode, it is able to determine the optimum bitrate.

Content & network-aware encoding takes into account the network delivery as part of the optimisation as well as the quality of the final video itself. It’s able to estimate the likelihood of a stream being selected for playback based upon its bitrate. The trick is combining these two factors simultaneously to find the optimum bitrate vs quality.

The last element to add in order to make this ABR optimisation as realistic as practical is to take into account the way people actually view the content. Looking at a real example from the US open, we see how on PCs, the viewing window can be many different sizes and you can calculate the probability of the different sizes being used. Furthermore we know there is some intelligence in the players where they won’t take in a stream with a resolution which is much bigger than the browser viewport.

Yuriy brings starts the final section of his talk by explaining that he brought in another quality metric from Westerink & Roufs which allows him to estimate how people see video which has been encoded at a certain resolution which is then scaled to a fixed interim resolution for decoding and then to the correct size for the browser windows.

The result of adding in this further check shows that fewer points on the ladder tend to be better, giving an overall higher quality value. Going much beyond 3 is typically not useful for the website. Shows only a few resolutions needed to get good average quality. Adding more isn’t so useful.

Yuriy finishes by introducing SSIM modeling of the noise of an encoder at different bitrates. Bringing together all of these factors, modelled as equations, allows him to suggest optimal ABR ladders.

Watch now!
Speaker

Yuriy Reznik Yuriy Reznik
Technology Fellow and Head of Research,
Brightcove

Video: No-Reference QoE Assessment: Knowledge-based vs. Learning-based

Automatic assessment of video quality is essential for creating encoders, selecting vendors, choosing operating points and, for online streaming services, in ongoing service improvement. But getting a computer to understand what looks good and what looks bad to humans is not trivial. When the computer doesn’t have the source video to compare against, it’s even harder.

In this talk, Dr. Ahmed Badr from SSIMWAVE looks at how video quality assessment (VQA) works and goes into detail on No-Reference (NR) techniques. He starts by stating the case for VQA which is an extension, and often replacement for subjective scoring by people. Clearly this is time-consuming, can be more expensive due to involvement of people (and the time) plus requires specific viewing conditions. When done well, a whole, carefully decorated room is required. So when it comes to analysing all the video created by a TV station or automating per-title encoding optimisation, we know we have to remove the human element.

Ahmed moves on to discuss the challenges of No Reference VQA such as identifying intended blur or noise. NR VQA is a two-step process with the first being extracting features from the video. These features are then mapped to a quality model which can be done with a machine learning/AI process which is the technique which Ahmed analyses next. The first task is to come up with a dataset of videos which should be carefully chosen, then it’s important to choose a metric to use for the training, for instance, MS-SSIM or VMAF. This is needed so that the learning algorithm can get the feedback it needs to improve. The last two elements are choosing what you are optimising for, technically called a loss function, and then choosing an AI model for use.

The data set you create needs to be aimed at exploring a certain aspect or range of aspects of video. It could be that you want to optimise for sports, but if you need a broad array of genres, optimising for reducing compression or scaling artefacts may be the main theme of the video dataset. Ahmed talks about the millions of video samples that they have collated and how they’ve used that to create their metric called SSIMPLUS which can work both with a reference and without.

Watch now!
Speaker

Dr. Ahmed Badr Dr. Ahmed Badr
SSIMWAVE

Video: Extension to 4K resolution of a Parametric Model for Perceptual Video Quality

Measuring video quality automatically is invaluable and, for many uses, essential. But as video evolves with higher frame rates, HDR, a wider colour gamut (WCG) and higher resolutions, we need to make sure the automatic evaluations evolve too. Called ‘Objective Metrics’, these computer-based assessments go by the name of PSNR, DMOS, VMAF and others. One use for these metrics is to automatically analyse an encoded video to determine if it looks good enough and should be re-encoded. This allows for the bitrate to be optimised for quality. Rafael Sotelo, from the Universidad de Montevideo, explains how his university helped work on an update to Predicted MOS to do just this.

MOS is the Mean Opinion Score and is a result derived from a group of people watching some content in a controlled environment. They vote to say how they feel about the content and the data, when combined gives an indication of the quality of the video. The trick is to enable a computer to predict what people will say. Rafael explains how this is done looking at some of the maths behind the predicted score.

In order to test any ‘upgrades’ to the objective metric, you need to test it against people’s actual score. So Rafael explains how he set up his viewing environments in both Uruguay and Italy to be compliant with BT.500. BT.500 is a standard which explains how a room should be in order to have viewing conditions which maximise the ability of the viewers to appreciate the pros and cons of the content. For instance, it explains how dim the room should be, how reflective the screens and how they should be calibrated. The guidelines don’t apply to HDR, 4K etc. so the team devised an extension to the standard in order to carryout the testing. This is called ‘subjective testing’.

With all of this work done, Rafael shows us the benefits of using this extended metric and the results achieved.

Watch now!
Speakers

Rafael Sotelo Rafael Sotelo
Director, ICT Department
Universidad de Montevideo

Video: Best Practices for Advanced Software Encoder Evaluations

Streaming Media East brings together Beamr, Netflix BAMTECH Media and SSIMWAVE to discuss the best ways to evaluate software encoders and we see there is much overlap with hardware encoder evaluation, too.

The panel gets into detail covering:

  • Test Design
  • Choosing source sequences
  • Rate Control Modes
  • Bit Rate or Quality Target Levls
  • Offline (VOD) vs Live (Linear)
  • Discrete vs. Multi-resolution/Bitrate
  • Subjective vs. objective measurements
  • Encoding Efficiency vs Performance
  • Video vs Still frames
  • PSNR Tuning
  • Evaluation at Encode Resolution Vs Display Resolution

Watch now for this comprehensive ‘How To’

Speakers

Anne Aaron Dr. Anne Aaron
Director of Video Algorithms,
Netflix
Scott Labrozzi Scott Labrozzi
VP Video Processing, Core Media Video Processing,
BAMTECH Media
Zhou Wang Dr. Zhou Wang
Chief Science Officer,
SSIMWAVE
Tom Vaughan Moderator: Tom Vaughan
VP Strategy,
Beamr