MPEG-5 Essential Video Coding (EVC) promises to do what no MPEG standard has done before: deliver great improvements in compression while giving assurances over patents. With a novel standardisation process, EVC provides a royalty-free base layer, and licensing details are provided upfront.
SMPTE 2019 saw Jonatan Samuelson, founder of Divideon and an editor of the evolving standard, take us through the details. Jonatan starts by explaining the codec landscape in terms of the new and recent codecs coming online, showing how EVC differs from them, including from its sister codec VVC, alongside which EVC is being developed.
Jonatan explains how the patents are being dealt with. Comparing to HEVC, he shows that EVC has a much simpler range of patent holders. Importantly, the codec offers very granular control to turn individual tools on and off, so you can exclude any you don’t wish to use for licensing reasons. This is the first time this level of control has been possible. Along with the royalty-free base layer, this codec hopes to give companies the control they need to use it safely, with predictable costs and without legal challenges.
Target applications for EVC are realtime encoding and video conferencing, but also newer ’emerging’ video formats such as 8K with HDR & WCG. To show how this is achieved, Jonatan explains the different blocks that make up the codec itself before walking us through the results.
AVC, now 16 years old, is long in the tooth but supported by billions of devices. The impetus to replace it comes from the drive to serve customers from a lower cost base and a more capable platform. Cue the new contenders VVC and AV1 – not to mention HEVC. It’s no surprise they compress better than AVC (also known as MPEG-4 AVC and H.264), but do they deliver a cost-efficient, legally safe codec on which to build a business?
Thierry Fautier has done the measurements and presents them in this talk. Thierry explains that the tests were done using reference code which, though unoptimised for speed, should represent the best quality possible from each codec. The comparison was run on 1080p video, and the results are reproduced in full in the IBC conference paper.
Licensing is an important topic as HEVC is seen by some as a failed codec – not in terms of its compression, but in the reticence of many companies to deploy it, driven by the business risk of uncertain licensing costs and/or the expense of the known licensing costs. VVC faces the challenge of entering the market whilst avoiding these concerns, which MPEG is determined to do.
Thierry concludes by comparing AVC against HEVC, AV1 and VVC in terms of deployment dates, deployed devices and the deployment environment. He looks at the challenge of moving large video libraries over to high-complexity codecs, given the cost and time required to re-compress. The session ends with questions from the audience. Watch now!
Speaker
President-Chair at Ultra HD Forum,
VP Video Strategy, Harmonic
Per-title encoding is a common method of optimising quality and compression by changing the encoding options on a file-by-file basis. Although some would say the arrival of per-scene encoding is the death knell for per-title encoding, either is much better than the more traditional approach of applying exactly the same settings to every video.
This talk with Mux’s Nick Chadwick and Ben Dodson looks at what per-title encoding is and how to go about doing it. The initial work involves doing many encodes of the same video and analysing each for quality. This allows you to work out which resolutions and bitrates to encode at and how to deliver the best video.
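To make the "many encodes, then analyse" step concrete, here is a minimal sketch of picking a bitrate ladder from trial-encode results. The resolutions, bitrates and quality scores below are made up for illustration (VMAF-like, 0–100), not taken from the talk.

```python
# Hypothetical per-title analysis: for each candidate bitrate, keep the
# resolution whose trial encode scored highest. Scores are illustrative.

trial_encodes = [
    # (resolution, bitrate_kbps, quality_score)
    ("1920x1080", 4000, 93.0),
    ("1920x1080", 1500, 78.0),
    ("1280x720",  1500, 82.0),
    ("1280x720",   800, 71.0),
    ("640x360",    800, 74.0),
    ("640x360",    400, 62.0),
]

def build_ladder(encodes):
    """For each bitrate, keep only the best-scoring resolution."""
    best = {}
    for res, kbps, score in encodes:
        if kbps not in best or score > best[kbps][1]:
            best[kbps] = (res, score)
    # Return rungs sorted from lowest to highest bitrate.
    return [(kbps, res) for kbps, (res, _) in sorted(best.items())]

ladder = build_ladder(trial_encodes)
# → [(400, '640x360'), (800, '640x360'), (1500, '1280x720'), (4000, '1920x1080')]
```

Note how, in this made-up data, 1500 kbps is best served by 720p rather than 1080p – exactly the kind of per-title decision a one-size-fits-all ladder misses.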
Ben Dodson explains the way they implemented this at Mux using machine learning. This was done by getting computers to ‘watch’ videos and extract metadata. That metadata can then be used to inform the encoding parameters without the computer watching the whole of a new video.
Nick takes some time to explain Mux’s ‘convex hulls’, which give a shape to the content’s performance at different bitrates and help visualise the optimum encoding parameters for the content. Moreover, we see that using this technique we can explore how changing resolution creates the best encode. This doesn’t always mean reducing the resolution; there are some surprising circumstances where it makes sense to use high resolutions, even for low bitrates.
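A rough sketch of the convex-hull idea, with made-up numbers rather than Mux’s data: pool (bitrate, quality) points from trial encodes at several resolutions, then keep only the upper hull – the encodes that no other combination beats on the rate–quality trade-off.

```python
# Illustrative (bitrate_kbps, quality) points from three resolutions.
points = [
    (400, 55.0), (800, 68.0), (1500, 80.0),    # 640x360 encodes
    (800, 64.0), (1500, 83.0), (3000, 91.0),   # 1280x720 encodes
    (1500, 79.0), (3000, 92.5), (6000, 96.0),  # 1920x1080 encodes
]

def upper_hull(pts):
    """Upper convex hull of (bitrate, quality) points (monotone chain)."""
    pts = sorted(set(pts))
    hull = []
    for x, y in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Pop the last point if it falls on or below the chord to (x, y).
            if (x2 - x1) * (y - y1) >= (x - x1) * (y2 - y1):
                hull.pop()
            else:
                break
        hull.append((x, y))
    return hull

hull = upper_hull(points)
# → [(400, 55.0), (800, 68.0), (1500, 83.0), (3000, 92.5), (6000, 96.0)]
```

In this toy data the hull at 1500 kbps comes from the 720p encode, not 1080p – the hull makes such resolution switches visible at a glance.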
The next stage after per-title encoding is to segment the video and encode each segment differently, which Nick explores, explaining how to deliver different resolutions throughout the stream, seamlessly switching between them. Ben takes over and explains how this can be implemented and how to choose the segment boundaries correctly, again using a machine-learning approach to analysis and decision-making.
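One simple way to think about segment boundaries – a hypothetical sketch, not the machine-learning method described in the talk – is to split wherever a per-frame difference metric jumps, treating those jumps as scene cuts. The scores below are invented.

```python
# Made-up per-frame difference scores (0 = identical to previous frame).
frame_diffs = [0.02, 0.03, 0.91, 0.04, 0.02, 0.88, 0.05]

def segment_boundaries(diffs, threshold=0.5):
    """Frame indices where a new segment should start (index 0 implied)."""
    return [i for i, d in enumerate(diffs) if d > threshold]

cuts = segment_boundaries(frame_diffs)
# → [2, 5]: three segments, each of which can get its own encode settings
```

A learned approach, as Ben describes, replaces the fixed threshold with a model that also weighs how each candidate split affects downstream encoding quality.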
Artificial Intelligence, Machine Learning and related technologies aren’t going to go away…the real question is where they are best put to use. Here, Dan Grois from Comcast shows their transformative effect on video.
Some of us can make a passable attempt at explaining what neural networks are, but to start to understand how this technology works, it helps to understand how our own neural networks work, and this is where Dan starts his talk. By walking us through the workings of our own bodies, he explains how we can get computers to mimic parts of this process. This all starts with creating a single neuron, but Dan goes on to explain the multi-layer perceptron formed by networking many together.
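The single-neuron-to-network progression can be sketched in a few lines: one artificial neuron is a weighted sum pushed through a non-linearity, and wiring several into layers gives a multi-layer perceptron. The weights below are hand-picked to compute XOR, purely for illustration – they are not from the talk.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias, through a sigmoid activation."""
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def mlp_xor(x1, x2):
    """Two hidden neurons feeding one output neuron: approximates XOR."""
    h1 = neuron([x1, x2], [20, 20], -10)    # fires if x1 OR x2
    h2 = neuron([x1, x2], [-20, -20], 30)   # fires unless x1 AND x2
    return neuron([h1, h2], [20, 20], -30)  # fires if both h1 AND h2 fire

# Rounded outputs: (0,0)→0, (0,1)→1, (1,0)→1, (1,1)→0
```

XOR is the classic example here because no single neuron can compute it – it takes the extra layer, which is exactly the point of the multi-layer perceptron.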
As we see examples of what these networks are able to do, piece by piece we start to see how they can be applied to video. These techniques can be applied to many parts of the HEVC encoding process: for instance, extrapolating multiple reference frames, generating interpolation filters, predicting variations and so on. Doing this, we can achieve a 10% encoding improvement. Indeed, a Deep Neural Network (DNN) can totally replace the DCT (Discrete Cosine Transform) widely used in MPEG codecs and beyond. Upsampling and downsampling can also be significantly improved – something that has already been successfully demonstrated in the market.
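As background on what a learned replacement for the DCT would be standing in for, here is a plain-Python DCT-II on a small block (illustrative only – real codecs use integer approximations of this transform): smooth signals concentrate their energy into the first few coefficients, which is what makes the subsequent quantisation cheap.

```python
import math

def dct_ii(x):
    """Unnormalised DCT-II of a list of samples."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k)
                for n in range(N))
            for k in range(N)]

# A smooth ramp of 8 samples: almost all the energy lands in the first
# two coefficients, and the even-indexed AC coefficients vanish.
coeffs = dct_ii([10, 11, 12, 13, 14, 15, 16, 17])
```

Coefficient 0 is simply the sum of the samples (108 here), while the higher coefficients decay rapidly for smooth content – the compaction property any DNN replacement has to at least match.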
Dan isn’t shy of tackling the reasons we haven’t seen the above gains widely in use: memory requirements and high computational costs. But this work is foundational in ensuring that these issues are overcome at the earliest opportunity, and in optimising the approach to implementing them as far as is possible today.
The last part of the talk is an interesting look at the logical conclusion of this technology.