Video: VVC – The new Versatile Video Coding standard

The codec landscape is a more nuanced place than it was 5 years ago, but there will always be a place for a traditional codec that cuts file sizes in half while harnessing recent increases in computation. Enter VVC (Versatile Video Coding), the successor to HEVC, created by MPEG and the ITU-T through JVET (the Joint Video Experts Team), which delivers up to 50% better compression by evolving the HEVC toolset and adding new features.

In this IEEE BTS webinar, Virginie Drugeon from Panasonic takes us through VVC's advances, its applications and its performance. VVC aims not only to deliver better compression but also places an emphasis on delivering at higher resolutions, with HDR and as 10-bit video. It also acknowledges that natural video isn't the only video in use nowadays, with much more content now comprising computer games and other computer-generated imagery. To achieve all this, VVC has had to up its toolset.

Any codec comprises a whole set of tools that carry out different tasks. The extent to which each of these tools is used to encode the video is controllable, to some extent, and is what gives rise to the different 'profiles', 'levels' and 'tiers' mentioned when dealing with MPEG codecs. These are necessary to make lower-powered decoding possible. Artificially constraining the capabilities of the encoder gives maximum performance guarantees for both the encoder and the decoder, which gives manufacturers control over the cost of their software and hardware products. Virginie walks us through many of these tools, explaining what's been improved.
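
To make the profile/tier/level idea concrete, here is a minimal sketch of the negotiation it enables: a decoder advertises the highest combination it supports and a stream is playable only if its signalled values fit. The values and the strict profile-equality check are illustrative simplifications, not the rules from the VVC specification.

```python
# Illustrative sketch of profile/tier/level capability matching.
# Values are examples, not taken from the VVC specification.
from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    profile: str   # e.g. "Main 10"
    tier: str      # "Main" (delivery) or "High" (contribution)
    level: float   # e.g. 5.1

def decoder_can_play(decoder: Capability, stream: Capability) -> bool:
    """True if the stream's signalled capability fits within the decoder's.
    Real conformance rules are more permissive across profiles."""
    tier_rank = {"Main": 0, "High": 1}
    return (stream.profile == decoder.profile
            and tier_rank[stream.tier] <= tier_rank[decoder.tier]
            and stream.level <= decoder.level)

decoder = Capability("Main 10", "Main", 5.1)                          # a typical 4K decoder
print(decoder_can_play(decoder, Capability("Main 10", "Main", 4.0)))  # True
print(decoder_can_play(decoder, Capability("Main 10", "High", 5.1)))  # False: tier too high
```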

Most codecs split the image up into blocks; not only the MPEG codecs but also the Chinese AVS codecs and AV1 do this. The more ways you have to do the splitting, the better the compression you can achieve, but this adds complexity to the encoding, so each generation adds more options to balance compression against the extra computing power available since the last codec. VVC allows rectangles rather than just squares to be used, and blocks can now be as large as 128×128 pixels, as also covered in this Bitmovin video. The partitioning can be done separately for the chroma and luma channels.
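
As a feel for what rectangular partitioning means, here is a toy recursive partitioner: high-detail blocks keep splitting, and because splits alternate direction, rectangular leaves appear wherever recursion stops between two splits. It is a sketch only; VVC's actual quadtree-plus-multi-type-tree signalling and split decisions are far richer than this variance heuristic.

```python
# Toy binary-tree block partitioner, loosely in the spirit of VVC's
# flexible splitting. The variance threshold is purely illustrative.
import numpy as np

def partition(block, x, y, min_size=8, thresh=100.0):
    """Yield (x, y, w, h) leaf blocks of a luma plane."""
    h, w = block.shape
    if block.var() < thresh or max(h, w) <= min_size:
        yield (x, y, w, h)          # leaf: may be square or rectangular
        return
    if w >= h:                      # split the longer (or equal) side
        half = w // 2
        yield from partition(block[:, :half], x, y, min_size, thresh)
        yield from partition(block[:, half:], x + half, y, min_size, thresh)
    else:
        half = h // 2
        yield from partition(block[:half, :], x, y, min_size, thresh)
        yield from partition(block[half:, :], x, y + half, min_size, thresh)

ctu = np.random.rand(128, 128) * 255      # one 128×128 CTU's luma samples
print(len(list(partition(ctu, 0, 0))))    # number of leaf blocks
```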

Virginie explains that encoding is done by predicting the next frame and sending corrections on top of that. This means the encoder needs a decoder within it so it can see what is actually decoded and measure the differences. There are three types of prediction: intra prediction, which uses the current frame to predict the content of a block; inter prediction, which uses other frames to predict video data; and, new to VVC, a hybrid mode which uses both. There are now 93 directional intra prediction angles, along with the introduction of matrix-based intra prediction. This is an early example of the move towards AI in codecs, a move seen as inevitable by The Broadcast Knowledge as we see more examples of traditional mathematical algorithms being improved upon by AI, machine learning and/or deep learning; super-resolution is a good example. In this case, Virginie says that machine learning was used to generate the matrices, meaning there's no neural network within the codec itself, but the matrices were created based on real-world data. It seems clear that as processing power increases, a neural network will be implemented in future codecs (whether MPEG or otherwise).
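
A rough sketch of the matrix-based intra prediction idea may help: the reconstructed boundary samples are reduced to a short vector, multiplied by an offline-trained matrix, and the small result is upsampled to the block size. The random matrix here merely stands in for the trained ones in the standard, and the exact reductions and upsampling filters are simplified.

```python
# Sketch of the matrix-based intra prediction (MIP) concept.
# The random matrix stands in for the ML-trained matrices in VVC.
import numpy as np

def mip_predict(top, left, matrix, block=8):
    # average pairs of boundary samples down to a short input vector
    reduced = np.concatenate([top.reshape(-1, 2).mean(axis=1),
                              left.reshape(-1, 2).mean(axis=1)])
    small = (matrix @ reduced).reshape(block // 2, block // 2)
    # upsample to the full block (nearest-neighbour here for brevity;
    # the standard uses linear interpolation)
    return np.kron(small, np.ones((2, 2)))

top = np.arange(8, dtype=float)      # reconstructed row above the block
left = np.arange(8, dtype=float)     # reconstructed column to the left
matrix = np.random.rand(16, 8)       # stand-in for a trained MIP matrix
pred = mip_predict(top, left, matrix)
print(pred.shape)                    # (8, 8) predicted block
```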

For screen content, we see that intra block copy (IBC), explained here from 17:30, is carried over from HEVC. IBC allows part of a frame to be copied to another part, which is a great technique for computer-generated content. Whilst IBC was in HEVC, it was not in HEVC's basic package of tools, meaning it was much less accessible as support in decoders was often lacking. Two new tools, block differential pulse code modulation and transform skip with adapted residual coding, are each discussed, along with IBC, in this free paper.
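
The core of IBC is simple enough to show in a few lines: a block is predicted by copying an already-reconstructed block from elsewhere in the same frame, addressed by a block vector, which suits the repeated text and UI elements typical of screen content. This is a conceptual sketch; the real tool restricts the vector to already-decoded areas and codes the vector and residual efficiently.

```python
# Minimal sketch of intra block copy (IBC) prediction.
import numpy as np

def ibc_predict(recon, x, y, bv, size=16):
    """Copy a size×size block from (x + bv_x, y + bv_y) of the same,
    already-reconstructed frame. Real IBC limits bv to decoded regions."""
    sx, sy = x + bv[0], y + bv[1]
    return recon[sy:sy + size, sx:sx + size].copy()

frame = np.random.randint(0, 255, (1080, 1920))
# predict the block at (512, 256) from identical content 200 pixels left
pred = ibc_predict(frame, 512, 256, bv=(-200, 0))
residual = frame[256:272, 512:528] - pred   # only the residual is coded
```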

Virginie moves on to coding performance, explaining that JVET's reference software, VTM, has been compared against HEVC's HM reference and has shown, using PSNR, an average 41% bitrate improvement on luminance, rising to 48% for screen content. Fraunhofer HHI's VVenC software has been shown to deliver 49%.
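
Percentage gains like these are conventionally computed as Bjøntegaard delta-rate (BD-rate): the average bitrate difference at equal quality, derived from curves fitted through several rate/PSNR points per codec. A compact sketch of that calculation, with invented numbers, is below.

```python
# Bjøntegaard delta-rate (BD-rate) between two rate/PSNR curves.
import numpy as np

def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test):
    """Average bitrate change (%) of 'test' vs 'ref' at equal PSNR."""
    p_ref = np.polyfit(psnr_ref, np.log(rates_ref), 3)    # log-rate vs quality
    p_test = np.polyfit(psnr_test, np.log(rates_test), 3)
    lo = max(min(psnr_ref), min(psnr_test))               # overlapping interval
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_diff = (int_test - int_ref) / (hi - lo)
    return (np.exp(avg_diff) - 1) * 100                   # negative = saving

# illustrative numbers only: the test codec needs ~40% less bitrate
print(f"{bd_rate([2, 4, 8, 16], [34, 37, 40, 43],
                 [1.2, 2.4, 4.8, 9.6], [34, 37, 40, 43]):.1f}%")  # ≈ -40.0%
```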

Along with its applicability to screen content and 360-degree video, the versatility in the codec's title also refers to its range of profiles, tiers and levels, which stretch from 4:2:0 10-bit video all the way up to 4:4:4, including spatial scalability. The Main tier is intended for delivery applications and the High tier for contribution applications, with framerates of up to 960 fps, up from 300 in HEVC. Levels are defined all the way up to 8K. Virginie spends some time explaining NAL units, which are in common with HEVC and AVC, explained here from slide 22, along with the VCL (Video Coding Layer), which Virginie also covers.

Random access has long been essential for linear broadcast video and now for streaming video too. It is provided by IDR (Instantaneous Decoding Refresh), CRA (Clean Random Access) and GDR (Gradual Decoding Refresh) pictures. IDR is well known already, but GDR is a new addition which seeks to smooth out the bitrate. With a traditional IBBPBBPBBI GOP structure, there is a periodic peak in bitrate because I frames are much larger than B and, indeed, P frames. The idea with GDR is to transmit the intra-coded data gradually over a number of frames, spreading out the peak. The disadvantage is that you need to wait longer until a fully refreshed frame is available.
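
A small sketch makes the GDR scheduling idea concrete: rather than one large I frame, a column of intra-coded CTUs sweeps across the picture over N consecutive frames, so the whole frame is refreshed once per cycle without a bitrate spike. The picture width and refresh period below are illustrative.

```python
# Sketch of a gradual decoding refresh (GDR) column schedule.
def gdr_schedule(width_ctus, refresh_period):
    """For each frame in a refresh cycle, yield the CTU columns
    that must be intra coded."""
    cols_per_frame = -(-width_ctus // refresh_period)   # ceiling division
    for frame in range(refresh_period):
        start = frame * cols_per_frame
        yield frame, range(start, min(start + cols_per_frame, width_ctus))

for frame, cols in gdr_schedule(width_ctus=30, refresh_period=8):
    print(f"frame {frame}: intra-code CTU columns {cols.start}..{cols.stop - 1}")
```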

Virginie introduces subpictures, a major development in VVC allowing separately encoded pictures within the same stream. Effectively creating a multiplexed stream, sections of the picture can be swapped out for other videos. For instance, if you wanted picture-in-picture, you could swap in the thumbnail video stream before the decoder, meaning you only need one decoder for the whole picture; to do the same without VVC, you would need two decoders. Subpictures have found use in 360-degree video, where manipulating the bitstream at the sender end reduces the bitrate by delivering only the part being watched in high quality.
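
The viewport-dependent streaming use case boils down to a selection step like the sketch below: the picture is tiled into independently decodable subpictures and the sender stitches high-quality versions only for the tiles the viewer is looking at. The grid size and quality labels are illustrative.

```python
# Sketch of viewport-dependent subpicture selection for 360° video.
def select_tiles(viewport_tiles, grid=4):
    """Return a quality label per subpicture for a grid×grid tiling."""
    return {(r, c): "high" if (r, c) in viewport_tiles else "low"
            for r in range(grid) for c in range(grid)}

# the viewer is looking at the four centre tiles
plan = select_tiles({(1, 1), (1, 2), (2, 1), (2, 2)})
high = sum(1 for q in plan.values() if q == "high")
print(f"{high} of {len(plan)} tiles sent in high quality")
```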

Before finishing by explaining that VVC can be carried in both MPEG's ISO BMFF and MPEG-2 Transport Streams, Virginie covers Reference Picture Resampling (RPR), also covered in this video from Seattle Video Tech, which allows reference frames of one resolution to be used as references by a stream at another resolution. This has applications in adaptive streaming and spatial scalability. Virginie also covers the enhanced timing available with the HRD (Hypothetical Reference Decoder).
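
What RPR enables can be sketched very simply: a stored reference frame is resampled on the fly to the resolution of the frame being decoded, which is what lets an ABR stream switch resolution without waiting for an intra frame. Nearest-neighbour sampling stands in here for the proper resampling filters the standard defines.

```python
# Sketch of reference picture resampling (RPR).
import numpy as np

def resample_reference(ref, new_w, new_h):
    """Nearest-neighbour resample; VVC specifies proper filters."""
    h, w = ref.shape
    ys = np.arange(new_h) * h // new_h
    xs = np.arange(new_w) * w // new_w
    return ref[np.ix_(ys, xs)]

ref_1080p = np.random.randint(0, 255, (1080, 1920))
ref_for_720p = resample_reference(ref_1080p, 1280, 720)  # predicts a 720p frame
print(ref_for_720p.shape)                                # (720, 1280)
```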

Watch now!
Video is free to watch
Speaker

Virginie Drugeon
Senior Engineer Digital TV Standardisation,
Panasonic

Video: Debugging Streaming Errors with Video Analytics

Errors in streaming often require the deep knowledge that system specialists and developers have, but getting them the data they need is often an uphill struggle. This video shows ways we can short-circuit this problem, looking at some of the approaches Bitmovin is taking to get the data to the right people. Bitmovin announced, yesterday, €25M of further investment in the company. We've featured Bitmovin many times here on The Broadcast Knowledge, talking about codecs, low-latency live streaming and super-resolution. Reading through this full list makes it clear that Bitmovin is interested in the whole chain from encode to delivery.

Christoph Prager sets the scene with an analysis of errors showing that only 15% have a clear reason, with 65% being ambiguous. If an error is ambiguous, you need data to drill into it and disambiguate the situation. This is exacerbated by standard aggregate metrics, which make getting to the root cause very difficult. Definitions of 'buffering percentage' and 'startup time' are very useful for gauging the scale of an issue, or for discovering there's a problem to begin with, but for developers they are like the foreword to the book they need to read to find the problem. This has led Bitmovin to approach the issue from the angle that errors are a lot more obvious when you have the data.

Daniel Hölbling-Inzko takes us through Bitmovin's new features for exposing the data surrounding errors. Whilst these will be coming to Bitmovin products, they show what a useful set of debugging tools looks like and can inspire the same in your own platform if you are able to customise those aspects of it. Daniel points out that the right detailed information can be useful to customer support, but it's the deeper information that he's interested in. Bitmovin can collate all the stack traces from problem places and also track segments from around the time of an error.

Segment tracking shows the status, type, download speed, time to first byte and size of each of the 10 segments from around the time the error was collected. Viewing these can reveal trends, such as diminishing bandwidth, or simply show that a problem happened abruptly. Daniel talks through three errors where segment tracking can help pinpoint problems: 'NETWORK_SEGMENT_DOWNLOAD_TIMEOUT', 'ANALYTICS_BUFFERING_TIMEOUT' and 'DRM: license request failed'. Because the requests are split out individually, it's easy to see the 403 error that is stopping the DRM, or how the connection speed is dropping and causing an analytics timeout. Daniel highlights that the trends are usually the most important part.
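
The mechanics behind this kind of feature amount to a rolling window of recent segment requests that travels with each error event. Here is a minimal sketch of that idea; the field names and payload shape are invented for illustration and are not Bitmovin's schema.

```python
# Sketch of segment tracking: keep the last N segment requests and
# attach them to any error event that fires.
from collections import deque
from dataclasses import dataclass, asdict

@dataclass
class SegmentRequest:
    url: str
    http_status: int
    ttfb_ms: float          # time to first byte
    download_kbps: float
    size_kb: float

class SegmentTracker:
    def __init__(self, window=10):
        self.recent = deque(maxlen=window)   # only the last N are kept

    def record(self, req):
        self.recent.append(req)

    def attach_to_error(self, error_code):
        """Payload to send alongside an analytics error event."""
        return {"error": error_code,
                "segments": [asdict(r) for r in self.recent]}

tracker = SegmentTracker()
tracker.record(SegmentRequest("seg_41.m4s", 200, 35.0, 8000.0, 950.0))
tracker.record(SegmentRequest("seg_42.m4s", 403, 30.0, 0.0, 0.0))  # blocked?
payload = tracker.attach_to_error("NETWORK_SEGMENT_DOWNLOAD_TIMEOUT")
```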

Watch now!
Free Registration Required
Speakers

Christoph Prager
Product Manager, Analytics
Bitmovin
Daniel Hölbling-Inzko
Engineering Director, Analytics
Bitmovin

Video: Update on CTA WAVE’s Tools

With a wide membership including Apple, Comcast, Google, Disney, Bitmovin, Akamai and many others, the WAVE interoperability effort is tackling web media encoding, playback and platform issues using global standards.

Bob Campbell from Eurofins explains that with so many streaming formats and device types, we tend to see inconsistent behaviour while streaming due to a lack of compliance with standards, which adds cost for content providers and suppliers. The Web Application Video Ecosystem (WAVE) tries to solve this problem not by creating standards, but by bringing together initiatives from across the industry to improve interoperability, as well as by creating test tools.

Core to the work are five technologies: CENC, DASH & CMAF, HLS, HTML5 video and DRM. For a deeper look at WAVE, watch this talk with Microsoft's John Simmons, who examines each of these in more depth. In this video, Bob looks at the test tools provided by CTA WAVE.

Bob looks first at an MPD validator aimed at people preparing and delivering content who need to check that their DASH manifests are correct. This can be done at https://conformance.dashif.org, where Bob walks us through the process and the types of errors and warnings available in the report. App developers are advised to develop to a document of guidelines rather than a test suite, whereas API compliance can be checked at WeBAPITests2018.ctawave.org. Bob finishes off with a sneak peek of a new device capabilities suite which will help automate the detection of problems such as non-smooth playback when switching between ABR rungs.
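
The DASH-IF service performs full conformance checking; as a taste of the structural checks involved, the sketch below only confirms that an MPD parses as XML and contains the expected top-level elements. The file path is hypothetical, and this is nothing like a complete validation.

```python
# Minimal local sanity check of a DASH MPD's top-level structure.
import xml.etree.ElementTree as ET

DASH_NS = "{urn:mpeg:dash:schema:mpd:2011}"

def sanity_check_mpd(path):
    problems = []
    root = ET.parse(path).getroot()
    if root.tag != f"{DASH_NS}MPD":
        problems.append(f"root element is {root.tag}, expected MPD")
    periods = root.findall(f"{DASH_NS}Period")
    if not periods:
        problems.append("no Period elements found")
    for i, period in enumerate(periods):
        if not period.findall(f"{DASH_NS}AdaptationSet"):
            problems.append(f"Period {i} has no AdaptationSet")
    return problems

print(sanity_check_mpd("manifest.mpd") or "basic structure looks OK")
```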

Watch now!
Speakers

Dr. Bob Campbell
Director of Engineering,
Eurofins Digital Testing

Video: Cloud Encoding – Overview & Best Practices

There are so many ways to work in the cloud. You can use a monolithic solution which does everything for you which is almost guaranteed by its nature to under-deliver on features in one way or another for any non-trivial workflow. Or you could pick best-of-breed functional elements and plumb them together yourself. With the former, you have a fast time to market and in-built simplicity along with some known limitations. With the latter, you may have exactly what you need, to the standard you wanted but there’s a lot of work to implement and test the system.

Tom Kuppinen from Bitmovin joins Christopher Olekas from SSIMWAVE, host of this Kirchner Waterloo Video Tech talk on cloud encoding. After the initial introduction to 'middle-aged' startup Bitmovin, Tom talks about what 'agility in the cloud' means, including being cloud-agnostic. This is the as-yet-unmentioned elephant in the room for broadcasters, who are used to extreme redundancy. Whether it's the BBC's "no closer than 70m" requirement for separation of circuits or the standard deployment methodology for SMPTE ST 2110 systems, which have two totally independent networks, putting everything into one cloud provider really isn't in the same ballpark. AWS has availability zones, of course, which are one of a number of great ways of reducing the blast radius of problems, but surely there's no better way of reducing the impact of an AWS problem than having part of your infrastructure in another cloud provider.

Bitmovin has implementations in Azure, Google Cloud and AWS, along with other cloud providers. In this author's opinion, it's a sign of the market's maturity that this is being thought about, but few companies are truly using multiple cloud providers in an agnostic way; this will surely change over the next 5 years. For reliable and repeatable deployments, API control is your best bet. For detailed monitoring, you will need to use APIs. For connecting together solutions from different vendors, you'll need APIs. It's no surprise that Bitmovin say they program 'API first'; it's a really important element of any medium-to-large deployment.
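
As a flavour of what API-first control looks like in practice, here is a sketch in which every step of an encoding workflow is a scriptable, repeatable call, with the target cloud just another parameter. The endpoint, headers and payload are hypothetical, not Bitmovin's actual API.

```python
# Hypothetical API-first encoding workflow; endpoint and schema invented.
import requests

API = "https://api.example-encoder.com/v1"
HEADERS = {"X-Api-Key": "YOUR_KEY"}

def create_encoding(input_url, cloud, region):
    """Start an encoding job on a chosen cloud provider, return its id."""
    job = {
        "input": input_url,
        "cloud": cloud,        # e.g. "aws", "gcp", "azure": the
        "region": region,      # cloud-agnostic part of the pitch
        "renditions": [
            {"codec": "h264", "height": 1080, "bitrate_kbps": 5000},
            {"codec": "h264", "height": 720, "bitrate_kbps": 2800},
        ],
    }
    resp = requests.post(f"{API}/encodings", json=job,
                         headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()["id"]

job_id = create_encoding("s3://bucket/master.mov", "gcp", "europe-west1")
status = requests.get(f"{API}/encodings/{job_id}",
                      headers=HEADERS, timeout=30).json()
```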

When it comes to the encoding itself, per-title encoding helps reduce bitrate and storage. Tom explains how it analyses each video and chooses the best combination of parameters for that title. In the Q&A, Tom confirms they are working on per-scene encoding, which promises further savings still.
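
The core of the per-title idea can be sketched as a convex-hull selection: run quick probe encodes of the specific title at several resolution/bitrate pairs, measure quality, then keep the resolution that wins at each bitrate. The probe data below is invented for illustration, and real systems use measured quality metrics.

```python
# Sketch of per-title ladder selection from probe encodes.
def best_ladder(probes):
    """probes: list of (bitrate_kbps, height, quality_score).
    Keep, per bitrate, the best resolution, then drop any rung
    that is not better than a cheaper one."""
    by_bitrate = {}
    for bitrate, height, quality in probes:
        if bitrate not in by_bitrate or quality > by_bitrate[bitrate][1]:
            by_bitrate[bitrate] = (height, quality)
    ladder, best_q = [], float("-inf")
    for bitrate in sorted(by_bitrate):
        height, quality = by_bitrate[bitrate]
        if quality > best_q:            # must improve on cheaper rungs
            ladder.append((bitrate, height))
            best_q = quality
    return ladder

probes = [(1000, 540, 78), (1000, 720, 74),   # at 1 Mbps, 540p beats 720p
          (3000, 720, 88), (3000, 1080, 85),
          (6000, 1080, 94), (6000, 720, 90)]
print(best_ladder(probes))   # [(1000, 540), (3000, 720), (6000, 1080)]
```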

To add to the complexity of a best-of-breed encoding solution, using best-of-breed codecs is part and parcel of the value. Bitmovin were early with AV1 and also support VP9 and HEVC. They can also distribute the encoding so that it runs in parallel across as many cores as needed; their initial AV1 offering was spread over more than 200 cores.
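
Distributed encoding of this kind boils down to splitting the source into chunks, encoding them in parallel and stitching the results. The sketch below uses local processes standing in for cloud instances; paths, chunk length and the ffmpeg settings are illustrative, and real systems split on keyframe boundaries to avoid artefacts at the joins.

```python
# Sketch of chunked parallel AV1 encoding with ffmpeg.
import subprocess
from concurrent.futures import ProcessPoolExecutor

CHUNK_SECS = 10

def encode_chunk(args):
    src, start, out = args
    # cut a chunk and encode it to AV1 with libaom in CRF mode
    subprocess.run(["ffmpeg", "-y", "-ss", str(start), "-t", str(CHUNK_SECS),
                    "-i", src, "-c:v", "libaom-av1", "-crf", "32", "-b:v", "0",
                    out], check=True)
    return out

if __name__ == "__main__":
    jobs = [("master.mov", i * CHUNK_SECS, f"chunk_{i:04d}.mkv")
            for i in range(12)]                        # a two-minute source
    with ProcessPoolExecutor(max_workers=4) as pool:   # cloud: hundreds of cores
        outputs = list(pool.map(encode_chunk, jobs))
    # stitch the chunks back together with ffmpeg's concat demuxer
    with open("chunks.txt", "w") as f:
        f.writelines(f"file '{o}'\n" for o in outputs)
    subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i",
                    "chunks.txt", "-c", "copy", "av1_out.mkv"], check=True)
```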

Tom talks about how the cloud-based codecs can integrate into workflows and reveals that HDR conversion, instance pre-warming, advanced subtitling support and AV1 improvements are on the roadmap, before leading on to the Q&A. Questions include whether it's difficult to deploy on multiple clouds, which HDR standards are likely to become the favourites, what the pain points of live streaming are, and how to handle metadata.

Watch now!
Speakers

Tom Kuppinen
Senior Sales Engineer,
Bitmovin
Moderator: Christopher Olekas
Senior Software Engineer,
SSIMWAVE Inc.