Artificial Intelligence, Machine Learning and related technologies aren’t going to go away…the real question is where they are best put to use. Here, Dan Grois from Comcast shows their transformative effect on video.
Some of us can make a passable attempt at explaining what neural networks are, but to start to understand how this technology works, it helps to understand how our own neural networks work, and this is where Dan starts his talk. By walking us through the workings of our own bodies, he explains how we can get computers to mimic parts of this process. It all starts with creating a single neuron, but Dan explains the multi-layer perceptron by networking many together.
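To make the idea concrete, here is a minimal numpy sketch of a single artificial neuron and a small multi-layer perceptron. The weights, sizes and activation below are illustrative choices, not anything from Dan's talk:

```python
import numpy as np

def sigmoid(x):
    # Squashing activation: maps any real input into (0, 1),
    # loosely mimicking a biological neuron's firing rate.
    return 1.0 / (1.0 + np.exp(-x))

def neuron(inputs, weights, bias):
    # A single artificial neuron: weighted sum of inputs plus a bias,
    # passed through a non-linear activation.
    return sigmoid(np.dot(inputs, weights) + bias)

def mlp(inputs, layers):
    # A multi-layer perceptron: each layer's outputs feed the next.
    # `layers` is a list of (weight_matrix, bias_vector) pairs.
    activations = inputs
    for weights, biases in layers:
        activations = sigmoid(activations @ weights + biases)
    return activations

# Tiny 2-input -> 3-hidden -> 1-output network with random fixed weights.
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(2, 3)), np.zeros(3)),
          (rng.normal(size=(3, 1)), np.zeros(1))]
out = mlp(np.array([0.5, -1.0]), layers)
print(out.shape)  # (1,)
```

Training (adjusting the weights from data) is what makes such networks useful, but the structure alone shows how simple units compose into something more capable.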
As we see, piece by piece, examples of what these networks are able to do, we start to see how they can be applied to video. These techniques can be applied to many parts of the HEVC encoding process: extrapolating multiple reference frames, generating interpolation filters, predicting variations and so on. Doing this can achieve around a 10% encoding improvement. Indeed, a Deep Neural Network (DNN) can totally replace the Discrete Cosine Transform (DCT) widely used in MPEG codecs and beyond. Upsampling and downsampling can also be significantly improved – something that has already been successfully demonstrated in the market.
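For context on what a DNN would be replacing, here is a minimal numpy sketch of the orthonormal 8×8 DCT-II that MPEG-style codecs apply to pixel blocks – an illustration of the classic transform, not code from Dan's talk:

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix; rows are cosine basis vectors.
    # Applied to pixel blocks, it concentrates energy in few coefficients.
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)           # DC row scaling for orthonormality
    return m

D = dct_matrix(8)

# A smooth 8x8 luma block (a horizontal ramp): after the transform,
# most of the energy lands in the low-frequency coefficients.
block = np.tile(np.linspace(16, 235, 8), (8, 1))
coeffs = D @ block @ D.T              # forward 2-D DCT
restored = D.T @ coeffs @ D           # inverse 2-D DCT

print(np.allclose(block, restored))   # True: the transform itself is lossless
```

Compression comes from quantising the small high-frequency coefficients away; a learned transform aims to compact energy even better than these fixed cosine bases.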
Dan isn’t shy of tackling the reasons we haven’t seen the above gains widely in use: high memory requirements and computational costs. But this work is foundational in ensuring that these issues are overcome at the earliest opportunity and in optimising today’s implementations of the approach to the best extent possible.
The last part of the talk is an interesting look at the logical conclusion of this technology.
We’ve got used to a world of near-universal AVC/H.264 support, but in our desire to deliver better services, we need new codecs. VVC is nearing completion and is attracting increasing attention with its ability to deliver better compression than HEVC in a range of different situations.
Benjamin Bross from the Fraunhofer Institute talks at Mile High Video 2019 about what Versatile Video Coding (VVC) is and the different ways it achieves these results. Benjamin starts by introducing the codec, teasing us with details of machine learning which is used for block prediction and then explains the targets for the video codec.
Next we look at the bitrate curves showing how encoding has improved over the years and where we can expect VVC to fit in, before seeing results of testing the codec as it exists today, which already show a compression improvement. Encoding complexity and speed are also compared: as expected, complexity has increased and speed has reduced. This is always a challenge at the beginning of a new codec standard, but it is typically solved in due course. Benjamin also looks at the effect of resolution and frame rate on compression efficiency.
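Comparisons between rate-distortion curves like these are commonly summarised with the Bjøntegaard Delta rate (BD-rate). The talk's own curves aren't reproduced here; the sketch below uses made-up operating points purely to show how the metric works:

```python
import numpy as np

def bd_rate(psnr_anchor, rate_anchor, psnr_test, rate_test):
    # Bjontegaard Delta rate: average bitrate difference (in %) between
    # two rate-distortion curves over their overlapping quality range.
    # Fit log-rate as a cubic in PSNR, then compare the curves' integrals.
    la = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)
    lt = np.polyfit(psnr_test, np.log(rate_test), 3)
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    ia, it = np.polyint(la), np.polyint(lt)
    avg_a = (np.polyval(ia, hi) - np.polyval(ia, lo)) / (hi - lo)
    avg_t = (np.polyval(it, hi) - np.polyval(it, lo)) / (hi - lo)
    return (np.exp(avg_t - avg_a) - 1.0) * 100.0

# Illustrative (made-up) points: the "new" codec hits the same PSNR at
# 70% of the anchor's bitrate, i.e. a BD-rate of about -30%.
psnr = [34.0, 36.0, 38.0, 40.0]
anchor_kbps = [1000.0, 2000.0, 4000.0, 8000.0]
test_kbps = [0.7 * r for r in anchor_kbps]
print(round(bd_rate(psnr, anchor_kbps, psnr, test_kbps), 1))  # -30.0
```

A negative BD-rate means the test codec needs less bitrate for the same quality, which is how headline figures like "X% better than HEVC" are usually derived.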
Every codec has sets of tools which can be tuned and used in certain combinations to deal with different types of content so as to optimise performance. VVC is no exception and Benjamin looks at some of the highlights:
Screen Content Coding – specific tools to encode computer graphics rather than ‘natural’ video. With the sharp edges typical of computer-generated content, different techniques can produce better results.
Reference Picture Rescaling – allows resolution changes in the video stream. This can also be used to deliver multiple resolutions at the same time
Independent Sub-Pictures – separate pictures available in the same raster. This allows, for instance, sending a large resolution while letting decoders decode only part of the picture.
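The sub-picture idea can be pictured with a toy sketch: model the frame as a grid of independently decodable regions and fetch only the one you need. This is plain array slicing for illustration, not VVC syntax:

```python
import numpy as np

def extract_subpicture(frame, grid, row, col):
    # With independent sub-pictures, each grid cell is self-contained,
    # so a decoder can process just the region it needs. Here we model
    # that by slicing one cell out of a luma plane.
    h, w = frame.shape
    rows, cols = grid
    th, tw = h // rows, w // cols
    return frame[row * th:(row + 1) * th, col * tw:(col + 1) * tw]

# An 8K-ish frame split into a 2x2 grid; fetch only the top-left 4K tile.
frame = np.zeros((4320, 7680), dtype=np.uint8)
tile = extract_subpicture(frame, (2, 2), 0, 0)
print(tile.shape)  # (2160, 3840)
```

In a real bitstream the win is that each sub-picture is coded without referencing its neighbours, so a 360° or multi-view client can download and decode only the region currently in view.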
If there’s any talk that cuts through the AV1 hype, it must be this one. The talk from the @Scale conference starts by re-introducing AV1 and AoM but then moves quickly on to encoding techniques and the toolsets now available in AV1.
Starting by looking at the evolution from VP9 to AV1, Google engineer Yue Chen looks at:
Many companies would love to be using free codecs, unencumbered by patents, rather than paying for HEVC or AVC. Phil Cluff shows that, contrary to popular belief, it is possible to stream with free codecs and get good coverage on mobile and desktop.
Phil starts off by looking at the codecs available and whether they’re patent-encumbered, with an eye to how much of the market can actually decode them. Free codecs and containers like WebM, VP8 etc. are not supported by Safari, which cuts mobile penetration roughly in half. To prove the point, Phil presents the results of his trials using HEVC, AVC and VP8 on all major browsers.
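The coverage arithmetic behind such trials can be sketched as below. The browser shares and support flags are hypothetical placeholders for illustration, not figures from Phil's talk:

```python
# Hypothetical browser market shares and codec support — purely
# illustrative numbers, not data from the talk.
browser_share = {"chrome": 0.45, "safari": 0.35, "firefox": 0.10, "other": 0.10}
supports_vp8 = {"chrome": True, "safari": False, "firefox": True, "other": True}

def coverage(share, supported):
    # Fraction of the audience whose browser can decode a given codec:
    # the sum of the market share of every browser that supports it.
    return sum(s for browser, s in share.items() if supported[browser])

print(round(coverage(browser_share, supports_vp8), 2))  # 0.65
```

With numbers like these, a single Safari-shaped gap dominates the result, which is exactly the problem Phil's approach sets out to work around.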
Whilst this initially leaves a disappointing result for streaming with libre codecs on mobile, there is a solution! Phil explains how an idea from several years ago is being reworked to provide a free streaming protocol, MPEG-SASH, which avoids using DASH – itself based on the patent-encumbered ISO BMFF. He then explains how open video players like video.js can be modified to decode libre codecs.
With these two enhancements, we finally see that coverage of up to 80% on mobile is, in principle, possible.