Video: LCEVC – The Latest MPEG Standard

Video is so pervasive in our world that we need to move past thinking of codecs and compression as being only about reducing bitrate. That will always be a major consideration, but encoding speed and the computation needed can also be deal breakers. Millions of embedded devices need to encode video but don't have the grunt available to live AV1-encoding clusters in the cloud. Furthermore, the structure of the final data itself can be important for later processing and decoding. So we can see how use cases arise out of the needs of various industries, far beyond broadcast, which mean codecs need to do more than make files small.

This year, LCEVC from MPEG will be standardised. Called Low Complexity Enhancement Video Coding, this codec provides compression both where computing is constrained and where it is plentiful. Guido Meardi, CEO of V-Nova, talks us through what LCEVC is, starting with a chart showing how computation has increased vastly as compression has improved. It's this trend the codec intends to put an end to by adding, Guido explains, an enhancement layer over lower-resolution video. By encoding at a lower resolution, computational load is minimised. When displayed, the enhancement layer allows this low-resolution video to be sharpened again, bringing it back towards the original.

After demonstrating the business benefits, we see the block diagram of the encoder and decoder, which helps visualise how this enhancement is calculated and applied. Guido then shows us what the enhancement layer looks like: a fairly flat image with lots of thin edges on it but, importantly, it also captures a lot of almost random detail which can't be guessed by upsamplers. This, of course, is the point. If it were possible to upscale the low-resolution video and infer all the missing data, we would always do that. Rather, downscaling and upscaling is a lossy process. Here, that loss is worth it because of the computational gains and because the enhancement layer puts back much of what was lost.
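To make the idea concrete, here is a minimal sketch of the base-plus-enhancement structure in Python. The 2x block averaging and pixel-repeat upsampler are crude stand-ins for illustration only; the standard defines its own upsampling filters and residual coding.

```python
# A toy model of the LCEVC idea: code a small base picture, then send
# the residual the upsampler cannot guess as an enhancement layer.
import numpy as np

def downscale2x(frame):
    """Halve resolution by averaging each 2x2 block."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale2x(frame):
    """Double resolution by repeating pixels (a crude upsampler)."""
    return frame.repeat(2, axis=0).repeat(2, axis=1)

original = np.random.rand(1080, 1920)

# Encoder side: the base codec only ever sees the small frame.
base = downscale2x(original)               # handed to e.g. an AVC encoder
enhancement = original - upscale2x(base)   # what upscaling alone loses

# Decoder side: upscale the base layer and add the enhancement back.
reconstructed = upscale2x(base) + enhancement
assert np.allclose(reconstructed, original)
```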

In order to demonstrate LCEVC's abilities, Guido shows graphs comparing LCEVC at UHD against x264, showing improvements of between 20 and 45%, along with image examples of artefacts which are avoided using LCEVC. We then see that when applied to AVC, HEVC and VVC, it speeds up encodes at least twofold. Guido finishes the presentation by showing how you can test the encoder and decoder yourself.

In the last segment of this video, Tarek Amara from Twitch sits down to talk with Guido about the codec and the background behind it. Their talk covers V-Nova's approach to open source, licensing, LCEVC's gradual improvements as it went through the proving process as part of MPEG standardisation, plus questions from the floor.

Watch now!
Speakers

Guido Meardi
CEO & Co-Founder,
V-Nova
Tarek Amara
Principal Video Specialist,
Twitch

Video: Extension to 4K resolution of a Parametric Model for Perceptual Video Quality

Measuring video quality automatically is invaluable and, for many uses, essential. But as video evolves with higher frame rates, HDR, a wider colour gamut (WCG) and higher resolutions, we need to make sure the automatic evaluations evolve too. Called 'objective metrics', these computer-based assessments go by names such as PSNR, DMOS and VMAF. One use for these metrics is to automatically analyse an encoded video to determine whether it looks good enough or should be re-encoded, allowing the bitrate to be optimised for quality. Rafael Sotelo, from the Universidad de Montevideo, explains how his university helped work on an update to Predicted MOS to do just this.

MOS is the Mean Opinion Score, a result derived from a group of people watching content in a controlled environment. They vote to say how they feel about the content and the data, when combined, gives an indication of the quality of the video. The trick is to enable a computer to predict what people would say. Rafael explains how this is done, looking at some of the maths behind the predicted score.
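As an illustration of the general approach (not the specific parametric model in the talk), a predicted MOS can be as simple as a model fitted to map objective measurements onto the panel's scores; the features and numbers below are entirely hypothetical.

```python
# Fit a simple least-squares model mapping objective measurements to the
# mean opinion scores a viewing panel produced. Features and values are
# hypothetical; the talk's parametric model has its own formulation.
import numpy as np

features = np.array([[32.1, 2.0],    # e.g. [PSNR in dB, bitrate in Mbps]
                     [35.7, 4.0],
                     [38.2, 8.0],
                     [41.0, 16.0]])
mos = np.array([2.1, 3.2, 4.0, 4.6])  # panel scores for the same clips

X = np.hstack([features, np.ones((len(features), 1))])  # add intercept
coeffs, *_ = np.linalg.lstsq(X, mos, rcond=None)

def predicted_mos(psnr, bitrate_mbps):
    """Estimate what the panel would score, from objective inputs alone."""
    return float(np.array([psnr, bitrate_mbps, 1.0]) @ coeffs)
```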

In order to test any 'upgrades' to the objective metric, you need to test it against people's actual scores. So Rafael explains how he set up viewing environments in both Uruguay and Italy to be compliant with BT.500, a standard which describes how a room should be arranged so that viewing conditions maximise the viewers' ability to appreciate the pros and cons of the content. For instance, it specifies how dim the room should be, how reflective the screens may be and how they should be calibrated. The guidelines don't apply to HDR, 4K etc., so the team devised an extension to the standard in order to carry out the testing. This is called 'subjective testing'.

With all of this work done, Rafael shows us the benefits of using this extended metric and the results achieved.

Watch now!
Speakers

Rafael Sotelo
Director, ICT Department
Universidad de Montevideo

Video: Live Closed Captioning and Subtitling in SMPTE 2110 (update)

The SMPTE ST 2110-40 standard specifies the real-time, RTP transport of SMPTE ST 291-1 Ancillary Data packets. It allows creation of IP essence flows carrying the VANC data familiar to us from SDI (like AFD, closed captions or ad triggering), complementing the existing video and audio portions of the SMPTE ST 2110 suite.
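For a feel of what is on the wire, here is a sketch of reading the fixed part of the ANC payload header, whose layout follows RFC 8331 (which ST 2110-40 builds on). The bit-packed 10-bit ANC words that follow (DID, SDID, user data) need a proper bit reader and are left out of this sketch.

```python
# Parse the fixed ANC RTP payload header laid out in RFC 8331: a 16-bit
# extended sequence number, a 16-bit length, an 8-bit count of ANC
# packets in this payload, and a 2-bit field flag.
import struct

def parse_anc_payload_header(payload: bytes) -> dict:
    ext_seq, length = struct.unpack_from("!HH", payload, 0)
    anc_count = payload[4]
    field = payload[5] >> 6   # 0b00 progressive, 0b10/0b11 field 1/2
    return {
        "extended_sequence_number": ext_seq,
        "length": length,
        "anc_count": anc_count,
        "field": field,
    }
```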

This presentation, by Bill McLaughlin from EEG, is an updated tutorial on subtitling, closed captioning, and other ancillary data workflows using the ST 2110-40 standard. Topics include synchronization, merging of data from different sources and standards conversion.

Building on Bill's previous presentation at the IP Showcase, this talk at NAB 2019 demonstrates a big increase in the number of vendors supporting the ST 2110-40 standard. Previously, a generic packet analyser like Wireshark with a dissector was recommended for troubleshooting IP ancillary data, but now most leading multiviewer/analyser products can display captioning, subtitling and timecode from 2110-40 streams. At the recent "JT-NM Tested Program" event, 29 products passed 2110-40 Reception Validation. Moreover, 27 products passed 2110-40 Transmitter Validation, which means that their output can be reconstructed into SDI video signals with appropriate timing and then decoded correctly.

Bill points out that ST 2110-40 is not really a new standard at this point; it only defines how to carry ancillary data from the traditional payloads over IP. Special care needs to be taken when different VANC data packets are concatenated in the IP domain. A lot of existing devices are simple ST 2110-40 receivers, which would require a kind of 'VANC funnel' to create a combined stream of all the relevant ancillary data, making sure that line numbers and packet types don't conflict, especially when signals need to be converted back to SDI.
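A hypothetical sketch of what such a funnel has to check follows; the names and structure here are illustrative, not taken from any shipping product.

```python
# Combine ancillary packets from several ST 2110-40 senders into one
# stream, refusing collisions of line number and packet type (DID/SDID)
# before the result is mapped back into SDI VANC space.
from dataclasses import dataclass

@dataclass(frozen=True)
class AncPacket:
    line: int            # VANC line number the packet is placed on
    did: int             # Data ID identifying the packet type
    sdid: int            # Secondary Data ID
    payload: bytes

def funnel(sources: list[list[AncPacket]]) -> list[AncPacket]:
    merged: dict[tuple[int, int, int], AncPacket] = {}
    for stream in sources:
        for pkt in stream:
            key = (pkt.line, pkt.did, pkt.sdid)
            if key in merged:
                raise ValueError(f"conflict on line {pkt.line}: "
                                 f"DID/SDID {pkt.did:#x}/{pkt.sdid:#x}")
            merged[key] = pkt
    # SDI embedders expect packets ordered by line number.
    return sorted(merged.values(), key=lambda p: (p.line, p.did, p.sdid))
```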

There is a new ST 2110-41 standard being developed for additional ancillary data which doesn't match up with the ancillary data standardised in ST 291-1. Another idea discussed is to move away from the SDI VANC data format and use a TTML track (Timed Text Markup Language – textual information associated with timing information) to carry ancillary information.

Watch now!

Download the slides.

Speakers

Bill McLaughlin
VP of Product Development
EEG

Video: AV1 at Netflix

Netflix have continually pushed forward video compression and analysis because their assets are played so many times that every bit saved is real money saved. VMAF is a great example of Netflix's desire to push the state of the art forward. Developed by Netflix with two universities, this objective metric allowed them to better evaluate the quality of videos using computer analysis and has continued to be the foundation of their work since.
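VMAF is open source, so you can run it yourself; for example, through FFmpeg's libvmaf filter, assuming a build with libvmaf enabled.

```python
# Score an encode against its source with FFmpeg's libvmaf filter
# (requires an FFmpeg build compiled with libvmaf). The first input is
# the distorted file, the second the reference.
import subprocess

subprocess.run([
    "ffmpeg", "-i", "distorted.mp4", "-i", "reference.mp4",
    "-lavfi", "libvmaf=log_fmt=json:log_path=vmaf.json",
    "-f", "null", "-",
], check=True)
# vmaf.json now holds per-frame and pooled VMAF scores.
```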

One use of VMAF has been to verify the results of Netflix's per-shot encoding method, which alters encoding parameters for each shot of the film rather than using a fixed set of parameters for the whole film. The Broadcast Knowledge has featured talks on their previous technique, per-title encoding (among others).

AV1, however, must be the most famous innovation that Netflix is behind. A founding member of the Alliance for Open Media (AoM), Netflix saw a need for a better codec and felt that making an open one, which also played to the needs of other internet giants such as Google, was a good way to create a vibrant community around it, driving contributions not only to the codec itself but also, it is hoped, to its implementation and adoption.

In this two-part talk, LiWei Guo starts off by explaining the ways in which AV1 will be used by Netflix. Since this talk took place, Netflix has started streaming in AV1 to Android clients. LiWei points out that AV1 supports 10-bit video as standard – a notable difference from other codecs like AVC and HEVC. This allows Netflix to use 10-bit without worrying about decoder compatibility, and he shows examples of skies and water which are significantly improved by the use of 10-bit.

Another feature of AV1 is film grain synthesis, which seeks to improve encoding efficiency by removing the random film grain of the source during the encode process and then inserting similar random noise on top to recreate the same look and feel. As anything random can't be predicted, noise such as this is very wasteful for a codec to try to encode, so it's no surprise that removing it can result in as much as a 30% reduction in bitrate. Before concluding, LiWei briefly explains per-shot encoding, then shows data demonstrating the overall improvements.
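Conceptually, the grain tool works something like the sketch below. AV1's real tool signals an autoregressive grain model in the bitstream, so the Gaussian blur, noise and parameters here are stand-ins only.

```python
# A toy model of film grain synthesis: strip the unpredictable grain
# before encoding, then add statistically similar noise back at decode.
import numpy as np
from scipy.ndimage import gaussian_filter

def encoder_side(frame):
    denoised = gaussian_filter(frame, sigma=1.0)      # remove the grain
    grain_strength = float((frame - denoised).std())  # model its strength
    return denoised, grain_strength  # code the clean frame + parameters

def decoder_side(denoised, grain_strength):
    rng = np.random.default_rng()
    grain = rng.normal(0.0, grain_strength, denoised.shape)
    return denoised + grain          # recreate the same look and feel
```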

Andrey Norkin, also from Netflix, explains their work with Intel on SVT-AV1, a software video encoder which leverages Intel's SVT technology, a framework optimised for Xeon chips for video encoding and analysis. Netflix's motivations are to further increase adoption by delivering a data-centre-ready, optimised encoder and to create an AV1 encoder they can use to support their own internal research activities (did someone say AV2?). SVT allows for parallelisation, important for any computer nowadays with so many cores available.
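If you want to try it, SVT-AV1 is also exposed through FFmpeg's libsvtav1 encoder (assuming an FFmpeg build with SVT-AV1 enabled); the preset trades encode speed against compression efficiency.

```python
# Encode with SVT-AV1 via FFmpeg's libsvtav1 wrapper (requires an FFmpeg
# build with SVT-AV1). Lower presets are slower but compress better.
import subprocess

subprocess.run([
    "ffmpeg", "-i", "source.mp4",
    "-c:v", "libsvtav1", "-preset", "8", "-crf", "35",
    "av1_out.mkv",
], check=True)
```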

Finishing up, Andrey points us to the GitHub repository, lets us know the development status (as of November 2019) and looks at the speed increases achieved, comparing SVT-AV1 against the reference libaom encoder.

Watch now!
Speakers

Andrey Norkin
Senior Research Scientist,
Netflix
LiWei Guo
Senior Software Engineer,
Netflix