Video: Deep Neural Networks for Video Coding

Artificial Intelligence, Machine Learning and related technologies aren’t going to go away; the real question is where they are best put to use. Here, Dan Grois from Comcast shows their transformative effect on video.

Some of us can make a passable attempt at explaining what neural networks are, but to understand how this technology works, it helps first to understand how our own neural networks work, and this is where Dan starts his talk. By walking us through the workings of our own bodies, he explains how we can get computers to mimic parts of this process. This all starts by creating a single neuron, but Dan then explains the multi-layer perceptron, built by networking many together.
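The idea of a single neuron, and of chaining layers of them into a multi-layer perceptron, can be sketched in a few lines. This is an illustrative toy (the weights and network shape below are made up for the example, not from the talk):

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: a weighted sum of its inputs plus a
    bias, squashed through a sigmoid activation into the range (0, 1)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, layers):
    """Forward pass through a multi-layer perceptron. Each layer is a
    list of (weights, bias) pairs, one per neuron; the outputs of one
    layer become the inputs of the next."""
    for layer in layers:
        x = [neuron(x, w, b) for w, b in layer]
    return x

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output.
layers = [
    [([0.5, -0.6], 0.1), ([0.3, 0.8], -0.2)],  # hidden layer
    [([1.0, -1.0], 0.0)],                       # output layer
]
out = mlp_forward([1.0, 0.0], layers)
```

Real networks differ mainly in scale (many more neurons and layers) and in how the weights are learned from data rather than written by hand.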

As we see examples of what these networks are able to do, piece by piece, we start to see how they can be applied to video. These techniques can be applied to many parts of the HEVC encoding process: for instance, extrapolating multiple reference frames, generating interpolation filters and predicting variations. Doing this can achieve around a 10% encoding improvement. Indeed, a Deep Neural Network (DNN) can totally replace the DCT (Discrete Cosine Transform) widely used in MPEG and beyond. Upsampling and downsampling can also be significantly improved – something that has already been successfully demonstrated in the market.
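For context on what a DNN would be replacing, here is a minimal sketch of the 1-D DCT-II and its inverse (codecs apply it in 2-D to blocks of pixels). The point of the transform is energy compaction: a smooth block of samples ends up with most of its energy in the first few coefficients, which is what makes it so amenable to compression:

```python
import math

def dct_ii(x):
    """Orthonormal 1-D DCT-II, the transform used (in 2-D form) for
    block-based transform coding in MPEG-era codecs."""
    N = len(x)
    out = []
    for k in range(N):
        s = sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k)
                for n in range(N))
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(scale * s)
    return out

def idct_ii(X):
    """Inverse of the orthonormal DCT-II (i.e. the DCT-III)."""
    N = len(X)
    out = []
    for n in range(N):
        s = X[0] * math.sqrt(1.0 / N)
        s += sum(X[k] * math.sqrt(2.0 / N) *
                 math.cos(math.pi / N * (n + 0.5) * k)
                 for k in range(1, N))
        out.append(s)
    return out

# A smooth 8-sample block (made-up values): after the transform, the
# energy concentrates in the low-frequency coefficients.
block = [100, 102, 104, 107, 110, 112, 113, 114]
coeffs = dct_ii(block)
restored = idct_ii(coeffs)
```

A learned transform aims to do the same job of decorrelating the block, but with a basis adapted to real content rather than the fixed cosine basis.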

Dan isn’t shy of tackling the reasons we haven’t seen the above gains widely in use: memory requirements and high computational cost. But this work is foundational in ensuring that these issues are overcome at the earliest opportunity and in optimising, as far as possible today, the approach to implementing them.

The last part of the talk is an interesting look at the logical conclusion of this technology.

Watch now!

Speaker

Dan Grois
Principal Researcher
Comcast

Video: Multiple Codec Live Streaming At Twitch

Twitch is constantly searching for better and lower cost ways of streaming and its move to include VP9 was one of the most high profile ways of doing this. In this talk, a team of Twitch engineers examine the reasons for this and other moves.

Tarek Amara first takes to the stage to introduce Twitch and its scale before looking at the codecs available and the fragmentation of support, as well as the drivers to improve the video delivered to viewers in terms of frame rate and resolution in addition to quality. The discussion turns to the reasons to implement VP9, and we see that if HEVC had been chosen instead, fewer than 3% of viewers would have been able to receive it.

Nagendra Babu explains the basic architecture employed at Twitch before going on to explain the challenges they met in testing and developing the backend and app. He also talks about the difficulty of running multiple transcodes in the cloud. FPGAs are an important tool for Twitch, and Nagendra discusses how they deal with programming them.

The last speaker is Nikhil Purushe, who talks about VP9 being delivered in fragmented MP4 (fMP4) rather than a transport stream, and then outlines the pros and cons of fMP4 before handing the floor to the audience.

Watch now!
Speakers

Tarek Amara
Principal Video Specialist,
Twitch
Nikhil Purushe
Senior Software Engineer,
Twitch
Nagendra Babu
Senior Software Engineer,
Twitch

Video: Versatile Video Coding (VVC) Standard on the Final Stretch

We’ve got used to a world of near-universal AVC/H.264 support, but in our desire to deliver better services, we need new codecs. VVC is nearing completion and is attracting increasing attention with its ability to deliver better compression than HEVC in a range of different situations.

Benjamin Bross from the Fraunhofer Institute talks at Mile High Video 2019 about what Versatile Video Coding (VVC) is and the different ways it achieves these results. Benjamin starts by introducing the codec, teasing us with details of machine learning which is used for block prediction and then explains the targets for the video codec.

Next, we look at bitrate curves showing how encoding has improved over the years and where we can expect VVC to fit in, before seeing results from testing the codec as it exists today, which already show an improvement in compression. Encoding complexity and speed are also compared: as expected, complexity has increased and speed has decreased. This is always a challenge at the beginning of a new codec standard but is typically solved in due course. Benjamin also looks at the effect of resolution and frame rate on compression efficiency.

Every codec has sets of tools which can be tuned and used in certain combinations to deal with different types of content so as to optimise performance. VVC is no exception and Benjamin looks at some of the highlights:

  • Screen Content Coding – specific tools to encode computer graphics rather than ‘natural’ video. With the sharp edges typical of computer screens, different techniques can produce better results.
  • Reference Picture Rescaling – allows resolution changes within the video stream. This can also be used to deliver multiple resolutions at the same time.
  • Independent Sub-Pictures – separate pictures available in the same raster. This allows, for instance, a large resolution to be sent while decoders decode only part of the picture.
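To make the Reference Picture Rescaling idea concrete: when the stream changes resolution, the decoder must predict from a reference frame stored at a different size, so that reference is resampled to match. VVC specifies proper interpolation filters for this; the nearest-neighbour sketch below (with made-up values) only illustrates the concept, not the standard’s filters:

```python
def rescale_nearest(frame, new_w, new_h):
    """Nearest-neighbour rescale of a 2-D frame (a list of rows).
    Illustrative only: real Reference Picture Rescaling uses the
    interpolation filters defined in the VVC specification."""
    old_h, old_w = len(frame), len(frame[0])
    return [[frame[y * old_h // new_h][x * old_w // new_w]
             for x in range(new_w)]
            for y in range(new_h)]

# A tiny 2x2 "reference frame" upscaled to 4x4, as if the stream had
# switched resolution mid-sequence and the decoder needed a reference
# at the new size.
ref = [[10, 20],
       [30, 40]]
up = rescale_nearest(ref, 4, 4)
```

The practical payoff is that an encoder can drop to a lower resolution when bandwidth is tight without forcing a new IDR point, since prediction can continue across the resolution change.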

Watch now!

Speaker

Benjamin Bross
Product Manager,
Fraunhofer HHI

Video: The Past, Present and Future of AV1

AV1 has strong backing from tech giants but is still seldom seen in the wild. Find out what the plans are for the future with Google’s Debargha Mukherjee.

Debargha’s intent in this talk is simple: to describe what AV1 can do and is doing today, framed by the history of the codec and a look forward to a potential AV2.

The talk starts by demonstrating the need for better video codecs, not least the statistic that by 2021, 81% of the internet’s traffic is expected to be video. On top of that, there is frustration with the slow, decade-long refresh cycle that is traditional for video codecs. To match the new internet landscape with its fast-evolving services, it seemed appropriate to have a codec which not only delivered better encoding but also followed a quicker five-year refresh cycle.

As a comparison to the royalty-free AV1, Debargha then looks at VP9 and how it is deployed, as well as VP10, whose development was stopped and diverted into the AV1 effort. That effort is the topic of the next part of the talk: the Alliance for Open Media, the standardisation process and then a look at some of the encoding tools available to achieve the stated aims.

To round off the description of what’s presently happening with AV1, trials of VP9, HEVC and AV1 are shown, demonstrating AV1’s ability to improve compression at a given quality. Bitmovin’s and Facebook’s tests are also highlighted, along with speed tests.

Looking now to the future, the talk finishes by explaining the roadmap for hardware decoding and other expected milestones in the coming years, plus software work such as SVT-AV1 for optimised encoding and dav1d for optimised decoding. With the promised five-year cycle, we need to look forward now to AV2, and Debargha discusses what it might be and what it would need to achieve.

Watch now!
Speaker

Debargha Mukherjee
Principal Software Engineer,
Google