Video: Predicting Viewer Attention in Video for use in Compression

Video compression is a never-ending endeavour with hundreds of possible techniques. Some of those not yet in use are waiting for computers to catch up or, as in this case, for someone to find the best way to apply new techniques, such as machine learning, to the task.

In this talk from Streaming Tech Sweden 2018, Fritz Barnes from Entecon explains that region-of-interest compression – where you compress the image more heavily in areas where the viewer won’t be looking – can significantly reduce bitrate.

Fritz looks at techniques to analyse video and work out where people will be looking. This technique is called ‘saliency detection’ and has been made practical by machine learning. He introduces Convolutional Neural Networks, the extensive training material available, and the model used to learn from it. Optical flow, a way of encoding the motion in the video, also forms part of the approach.
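To make the idea concrete – this is a rough sketch, not the model Fritz describes – the snippet below uses OpenCV’s spectral-residual saliency detector as a stand-in for a trained CNN and converts the resulting saliency map into per-block quantiser offsets, so that low-attention regions get coarser quantisation. The file name, block size and QP range are illustrative assumptions.

```python
# Rough sketch only: OpenCV's spectral-residual saliency (opencv-contrib-python)
# stands in for the trained CNN from the talk; block size and QP range are
# arbitrary example values.
import cv2
import numpy as np

def qp_offsets_from_saliency(frame_bgr, block=16, max_extra_qp=8):
    """Per-block QP offsets: 0 where viewers are likely to look,
    up to max_extra_qp in low-saliency regions."""
    detector = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, sal_map = detector.computeSaliency(frame_bgr)  # float map in [0, 1]
    if not ok:
        raise RuntimeError("saliency computation failed")

    h, w = sal_map.shape
    offsets = np.zeros((h // block, w // block), dtype=np.int32)
    for r in range(offsets.shape[0]):
        for c in range(offsets.shape[1]):
            tile = sal_map[r * block:(r + 1) * block, c * block:(c + 1) * block]
            # Low saliency -> viewer unlikely to look here -> coarser quantisation
            offsets[r, c] = int(round((1.0 - float(tile.mean())) * max_extra_qp))
    return offsets

frame = cv2.imread("frame.png")  # hypothetical input frame
print(qp_offsets_from_saliency(frame))
```

In the approach described in the talk, motion information (optical flow, computed for example with cv2.calcOpticalFlowFarneback between consecutive frames) would be fed to the model alongside the frame itself; it is omitted here for brevity.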

The talk finishes by looking at the results of this technique: both the successes and the problems.

Watch now!
Free registration required
Streaming Tech Sweden is an annual conference run by Eyevinn Technology in Sweden. Talks are recorded and available to delegates for several months before becoming freely available. Whilst registration is required on the website, it is free to register and to watch this video.

Video: Versatile Video Coding (VVC) Standard on the Final Stretch

We’ve got used to a world of near-universal AVC/h.264 support, but in our desire to deliver better services, we need new codecs. VVC is nearing completion and is attracting increasing attention with its ability to deliver better compression than HEVC in a range of different situations.

Benjamin Bross from the Fraunhofer Institute talks at Mile High Video 2019 about what Versatile Video Coding (VVC) is and the different ways it achieves its compression gains. Benjamin starts by introducing the codec, teasing us with details of the machine learning used for block prediction, and then explains the targets for the video codec.

Next, we look at bitrate curves showing how encoding efficiency has improved over the years and where we can expect VVC to fit in, before seeing results from testing the codec as it exists today, which already show an improvement in compression. Encoding complexity and speed are also compared: as expected, complexity has increased and speed has dropped. This is always a challenge at the beginning of a new codec standard but is typically solved in due course. Benjamin also looks at the effect of resolution and frame rate on compression efficiency.
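As background on how such bitrate curves are usually summarised – the talk doesn’t show code, and the numbers below are made up – the standard metric is the Bjøntegaard delta rate (BD-rate): fit each codec’s rate–distortion points, then average the bitrate gap between the curves over the overlapping quality range. A minimal sketch:

```python
# Minimal BD-rate sketch: fit a cubic polynomial to each codec's
# (PSNR, log-bitrate) points and integrate the rate difference over the
# overlapping PSNR range. All numbers below are illustrative, not from the talk.
import numpy as np

def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test):
    """Average bitrate difference (%) of 'test' vs 'ref' at equal quality."""
    p_ref = np.polyfit(psnr_ref, np.log(rates_ref), 3)
    p_test = np.polyfit(psnr_test, np.log(rates_test), 3)

    lo = max(min(psnr_ref), min(psnr_test))   # overlapping quality range
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)

    avg_log_diff = (int_test - int_ref) / (hi - lo)
    return (np.exp(avg_log_diff) - 1) * 100   # negative = bitrate saving

hevc_rates, hevc_psnr = [1000, 2000, 4000, 8000], [34.0, 36.5, 39.0, 41.5]
vvc_rates,  vvc_psnr  = [600, 1200, 2400, 4800],  [34.2, 36.8, 39.3, 41.8]
print(f"BD-rate vs HEVC: {bd_rate(hevc_rates, hevc_psnr, vvc_rates, vvc_psnr):.1f}%")
```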

Every codec has sets of tools which can be tuned and used in certain combinations to deal with different types of content so as to optimise performance. VVC is no exception and Benjamin looks at some of the highlights:

  • Screen Content Coding – specific tools to encode computer graphics rather than ‘natural’ video. With the sharp edges typical of screen content, different techniques can produce better results
  • Reference Picture Rescaling – allows resolution changes in the video stream. This can also be used to deliver multiple resolutions at the same time
  • Independent Sub Pictures – separate pictures available within the same raster. This allows, for instance, sending a large-resolution picture while letting decoders decode only the part they need.

Watch now!

Speaker

Benjamin Bross
Product Manager,
Fraunhofer HHI

Video: A Technical Overview of AV1

If there’s any talk that cuts through the AV1 hype, it must be this one. The talk from the @Scale conference starts by re-introducing AV1 and AoM but then moves quickly on to encoding techniques and the toolsets now available in AV1.

Starting by looking at the evolution from VP9 to AV1, Google engineer Yue Chen looks at:

  • Extended Reference Frames
  • Motion Vector Prediction
  • Dynamic Motion Vector Referencing
  • Overlapped Block Motion Compensation
  • Masked Compound Prediction
  • Warped Motion Compensation
  • Transform (TX) Coding, Kernels & Block Partitioning
  • Entropy Coding
  • AV1 Symbol Coding
  • Level-map TX Coefficient Coding
  • Restoration and Post-Processing
  • Constrained Directional Enhancement Filtering
  • In-loop restoration & super resolution
  • Film Grain Synthesis

The talk finishes by looking at the compression efficiency of AV1 against both HEVC (x265) and VP9 (libvpx), then coding complexity in terms of speed, plus what’s next on the roadmap!
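If you want a rough feel for such comparisons yourself – this is not how the talk’s figures were produced – one simple approach is to encode the same clip with each encoder via ffmpeg and compare the output sizes. The CRF values below are ballpark assumptions, not calibrated equivalents, and a proper comparison would use BD-rate over full rate–quality curves.

```python
# Rough, illustrative comparison (assumes an ffmpeg build with libaom-av1,
# libx265 and libvpx-vp9, and a local test clip). File size alone is not a
# quality metric; this just shows the mechanics.
import os
import subprocess

SOURCE = "clip.y4m"  # hypothetical test clip

ENCODERS = {
    "av1.mkv":  ["-c:v", "libaom-av1", "-crf", "32", "-b:v", "0", "-cpu-used", "4"],
    "hevc.mkv": ["-c:v", "libx265",    "-crf", "28"],
    "vp9.mkv":  ["-c:v", "libvpx-vp9", "-crf", "32", "-b:v", "0"],
}

for out, args in ENCODERS.items():
    subprocess.run(["ffmpeg", "-y", "-i", SOURCE, "-an", *args, out], check=True)
    print(f"{out}: {os.path.getsize(out) / 1e6:.2f} MB")
```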

Watch now!

Speaker

Yue Chen
Senior AV1 Engineer,
Google

Video: Per-Title Encoding, @Scale Conference

Per-title encoding with machine learning is the topic of this video from MUX.

Nick Chadwick explains that rather than encoding every video with the same set of parameters, the smart money is on finding the best balance of bitrate and resolution for each video. By analysing a large number of bitrate and resolution combinations, Nick shows you can build what he calls a ‘convex hull’ when graphing bitrate against quality, which allows you to find the optimal settings.
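As a toy illustration of the convex hull idea – not MUX’s actual pipeline, and with made-up numbers – the sketch below takes (bitrate, quality) results from trial encodes at several resolutions and keeps only the encodes on the upper convex hull of the rate–quality plot:

```python
# Toy per-title 'convex hull' sketch: from trial encodes at several
# resolutions, keep only those on the upper convex hull of bitrate vs quality.
# All numbers are made up for illustration.
def upper_convex_hull(points):
    """points = [(bitrate_kbps, quality, label), ...]; returns the hull left to right."""
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    hull = []
    for p in sorted(points, reverse=True):   # right-to-left pass builds the upper hull
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull[::-1]

# Hypothetical trial encodes: (bitrate kbps, quality score, rendition)
trials = [
    (400, 62, "360p"),   (800, 74, "360p"),
    (800, 71, "540p"),   (1600, 84, "540p"),
    (1600, 80, "720p"),  (3200, 92, "720p"),
    (3200, 90, "1080p"), (6400, 96, "1080p"),
]
for rate, score, label in upper_convex_hull(trials):
    print(f"{rate:>5} kbps  {label:>5}  score {score}")
```

Dominated renditions drop out automatically, leaving whichever resolution gives the best quality at each bitrate – the kind of ladder a per-title system would then serve.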

Doing this en masse is difficult, and Nick spends some time looking at the different ways of implementing it. In the end, Nick and data scientist Ben Dodson built a system which optimises the bitrate for each title using neural nets trained on data sets. This resulted in 84% of videos looking better with this method than with a static ladder.

Watch now!
Speaker

Nick Chadwick
Software Engineer,
Mux