Video: High-Efficiency Video Coding (HEVC) Primer

HEVC continues to gain adoption thanks to its bitrate savings over AVC (H.264), though much stands in the balance this year as AV1 continues to gain momentum and MPEG’s VVC is released. Both of which promise greater compression. Compression, however, is a compromise between encoding complexity (computation), quality and speed. HEVC stands on the shoulders of AVC and this video explains the techniques it uses to be better.

Christian Timmerer, co-founder of Bitmovin, builds on his previous video about AVC as he details the tools and capabilities of HEVC (all known as H.265). He summarises the performance of HEVC as providing twice as much compression for the same video quality (or getting better quality for a higher number of bits). Whilst it’s decoder requirements have gone up by 50%, it provides better parallelisation opportunities. Amongst the features that create this are variable block-size motion compensation, improved interpolation method and more directions for spatial prediction. Most of the improvements are specifically an expansion of the abilities laid out in AVC. For instance, making size or direction variable or providing more options.

After outlining some of the details behind the new capabilities, we look at the performance improvements of some HEVC implementations over AVC implementations showing up to a 65% improvement of bitrate averaging out at around 50%. Christian finishes by looking at the newer codecs coming out soon such as VVC, LCEVC

Watch now!
Speakers

Christian Timmerer Christian Timmerer
CIO & Cofounder, Bitmovin
Associate Professor, Universität Klagenfurt

Video: Benefits of IP Systems for Sporting Venues

As you walk around any exhibitions there seems to be a myriad of ‘benefits’ of IP working, many of which don’t resonate for particular use cases. Only the most extraordinary businesses need all of the benefits, so in this talk, Imagine Communication’s John Mailhot discusses how IP helps sports venues.

John sets the scene by separating out the function of OB trucks and the ‘inside production’ facilities which have a whole host of non-TV production to do including driving scoreboards, displays inside the venue, replays and importantly has to deal with over 250 events a year, not all of which will have an OB truck.

We see that the scale that IP can work at is a great benefit as many signals can fit down one fibre and 2022-7 seamless switching can easily provide full redundancy for every fibre and SFP. This is a level of redundancy which is simply not seen in SDI systems. With stadia being very large, necessitating cable runs of over 500m, the fact that IP needs fewer cables overall is a great benefit.

John shows an example of an Arista switch only 7U in height which provides 144x 100G ports meaning it could support over 4000 inputs and 4000 outputs. Such density is unprecedented and for OB trucks can be a dealbreaker. For sports venues, this can also be a big motivator but also allow more flexibility in distributing the solution rather than relying on a massive central interconnect with a 1100×1100 SDI router in a central CTA.

TV is nothing without audio and the benefits to audio in 2110 are non trivial since with the audio being split off from the video, we are no longer limited to dealing with just 16 channels per video and de-embedding from a video frame any time we want to touch it.

Timing is an interesting benefit. I say this because, whilst PTP can end up being quite complex compared to black and burst, it has some big benefits. First off, it can live in the same cables as your data where as black and burst requires a whole separate cable infrastructure. PTP also allows you to timestamp all essences which helps with lip-sync throughout your workflow.

John leads us through some examples of how this works for different areas finishing by summing up the relevant benefits such as scalability, multi-format, space efficient, and timing amongst others.

Watch now!
Download the slides
Speakers

John Mailhot John Mailhot
CTO, Networking & Infrastructure,
Imagine Communications

Video: DASH: from on-demand to large scale live for premium services

A bumper video here with 7 short talks from VideoLAN, Will Law and Hulu among others, all exploring the state of MPEG DASH today, the latest developments and the hot topics such as low latency, ad insertion, bandwidth prediction and one red-letter feature of DASH – multi-DRM.

The first 10 minutes sets the scene introducing the DASH Industry Forum (DASH IF) and explaining who takes part and what it does. Thomas Stockhammer, who is chair of the Interoperability Working Group explains that DASH IF is made of companies, headline members including Google, Ericsson, Comcast and Thomas’ employer Qualcomm who are working to promote the adoption of MPEG-DASH by working to improve the specification, advise on how to put it into practice in real life, promote interoperability, and being a liaison point for other standards bodies. The remaining talks in this video exemplify the work which is being done by the group to push the technology forward.

Meeting Live Broadcast Requirements – the latest on DASH low latency!
Akamai’s Will Law takes to the mic next to look at the continuing push to make low-latency streaming available as a mainstream option for services to use. Will Law has spoken about about low latency at Demuxed 2019 when he discussed the three main file-based to deliver low latency DASH, LHLS and LL-HLS as well as his famous ‘Chunky Monkey’ talk where he explains how CMAF, an implementation of MPEG-DASH, works in light-hearted detail.

In today’s talk, Will sets out what ‘low latency’ is and revises how CMAF allows latencies of below 10 seconds to be achieved. A lot of people focus on the duration of the chunks in reducing latency and while it’s true that it’s hard to get low latency with 10-second chunk sizes, Will puts much more emphasis on the player buffer rather than the chunk size themselves in producing a low-latency stream. This is because even when you have very small chunk sizes, choosing when to start playing (immediately or waiting for the next chunk) can be an important part of keeping the latency down between live and your playback position. A common technique to manage that latency is to slightly increase and decrease playback speed in order to manage the gap without, hopefully, without the viewer noticing.

Chunk-based streaming protocols like HLS make Adaptive Bitrate (ABR) relatively easy whereby the player monitors the download of each chunk. If the, say, 5-second chunk arrives within 0.25 seconds, it knows it could safely choose a higher-bitrate chunk next time. If, however, the chunk arrives in 4.8 seconds, it can choose to the next chunk to be lower-bitrate so as to receive the chunk with more headroom. With CMAF this is not easy to do since the segments all arrive in near real-time since the transferred files represent very small sections and are sent as soon as they are created. This problem is addressed in a later talk in this talk.

To finish off, Will talks about ‘Resync Elements’ which are a way of signalling mid-chunk IDRs. These help players find all the points which they can join a stream or switch bitrate which is important when some are not at the start of chunks. For live streams, these are noted in the manifest file which Will walks through on screen.

Ad Insertion in Live Content:Pre-, Mid- and Post-rolling
Whilst not always a hit with viewers, ads are important to many services in terms of generating the revenue needed to continue delivering content to viewers. In order to provide targeted ads, to ensure they are available and to ensure that there is a record of which ads were played when, the ad-serving infrastructure is complex. Hulu’s Zachary Cava walks us through the parts of the infrastructure that are defined within DASH such as exchanging information on ‘Ad Decision Parameters’ and ad metadata.

In chunked streams, ads are inserted at chunk boundaries. This presents challenges in terms of making sure that certain parameters are maintained during this swap which is given the general name of ‘Content Splice Conditioning.’ This conditioning can align the first segment aligned with the period start time, for example. Zachary lays out the three options provided for this splice conditioning before finishing his talk covering prepared content recommendations, ad metadata and tracking.

Bandwidth Prediction for Multi-bitrate Streaming at Low Latency
Next up is Comcast’s Ali C. Begen who follows on from Will Law’s talk to cover bandwidth prediction when operating at low-latency. As an example of the problem, let’s look at HTTP/1.1 which allows us to download a file before it’s finished being written. This allows us to receive a 10-second chunk as it’s being written which means we’ll receive it at the same rate the live video is being encoded. As a consequence, the time each chunk takes to arrive will be the same as the real-time chunk duration (in this example, 10 seconds.) When you are dealing with already-written chunks, your download time will be dependent on your bandwidth and therefore the time can be an indicator of whether your player should increase or decrease the bitrate of the stream it’s pulling. Getting back this indicator for low-latency streams is what Ali presents in this talk.

Based on this paper Ali co-authored with Christian Timmerer, he explains a way of looking at the idle time between consecutive chunks and using a sliding window to generate a bandwidth prediction.

Implementing DASH low latency in FFmpeg
Open-source developer Jean-Baptiste Kempf who is well known for his work on VLC discusses his work writing an MPEG-DASH implementation for FFmpeg called the DASH-LL. He explains how it works and who to use it with examples. You can copy and paste the examples from the pdf of his talk.

Managing multi-DRM with DASH
The final talk, ahead of Q&A is from NAGRA discussing the use of DRM within MPEG-DASH. MPEG-DASH uses Common Encryption (CENC) which allows the DASH protocol to use more than one DRM scheme and is typically seen to allow the use of ‘FairPlay’, ‘Widevine’ and ‘PlayReady’ encryption schemes on a single stream dependent on the OS of the receiver. There is complexity in having a single server which can talk to and negotiate signing licences with multiple DRM services which is the difficulty that Lauren Piron discusses in this final talk before the Q&A led by Ericsson’s VP of international standards, Per Fröjdh.

Watch now!
Speakers

Thomas Stockhammer Thomas Stockhammer
Director of Technical Standards,
Qualcomm
Will Law Will Law
Chief Architect,
Akamai
Zachary Cava Zachary Cava
Software Architect,
Hulu
Ali C. Begen Ali C. Begen
Technical Consultant, Video Architecture, Strategy and Technology group,
Comcast
Jean-Baptiste Kempf Jean-Baptiste Kempf
President & Lead VLC Developer
VideoLAN
Laurent Piron Laurent Piron
Principal Solution Architect
NAGRA
Per Fröjdh Moderator: Per Fröjdh
VP International Standards,
Ericsson

Video: Advanced Video Coding Standards AVC

Whilst the encoding landscape is shifting, AVC (AKA H.264) still dominates many areas of video distribution so, for many, understanding what’s under the hood opens up a whole realm of diagnostics and fault finding that wouldn’t be possible without. Whilst many understand that MPEG video is built around I, B and P frames, this short talk offers deeper details which helps how it behaves both when it’s working well and otherwise.

Christian Timmerer, co-founder of Bitmovin, starts his lesson on AVC with the summary of improvements in AVC over the basic MPEG 2 model people tend to learn as a foundation. Improvements such as variable block size motion compensation, multiple reference frames and improved adaptive entropy coding. We see that, as we would expect the input can use 4:2:0 or 4:2:2 chroma sub-sampling as well as full 4:4:4 representation with 16×16 macroblocks for luminance (8×8 for chroma in 4:2:0). AVC can handle Pictures split into several slices which are self-contained sequences of macroblocks. Slices themselves can then be grouped.

Intra-prediction is the next topic where by an algorithm uses the information within the slice to predict a macroblock. This prediction is then subtracted from the actual block and coded thereby reducing the amount of data that needs to be transferred. The decoder can make the same prediction and reconstruct the full block from the data provided.

The next sections talk about motion prediction and the different sizes of macroblocks. A macroblock is a fixed area on the picture which can be described by a mixture of some basic patterns but the more complex the texture in the block, the more patterns need to be combined to recreate it. By splitting up the 16×16 block, we can often find a simpler way to describe the 8×8 or 8×16 shapes than if they had to encompass a whole 16×16 block.

 

B-frames are fairly well understood by many, but even if they are unfamiliar to you, Christian explains the concept whereby B-frames provide solely motion information of macroblocks both from frames before and after. This allows macroblocks which barely change to be ‘moved around the screen’ so to speak with minimal changes other than location. Whilst P and I frames provide new macroblocks, B-frames are intended just to provide this directional information. Christian explains some of the nuances of B-frame encoding including weighted prediction.

Quantisation is one of the most important parts of the MPEG process since quantisation is the process by which information is removed and the codec becomes lossy. Thus the way this happens, and the optimisations possible are key so Christian covers the way this happens before explaining the deblocking filter available. After splitting the picture up into so many macroblocks which are independently processed, edges between the blocks can become apparent so this filter helps smooth any artefacts to make them more pleasing to the eye. Christian finishes talking about AVC by exploring entropy encoding and thinking about how AVC encoding can and can’t be improved by adding more memory and computation to the encoder.

Watch now!
Speaker

Christian Timmerer Christian Timmerer
CIO & Cofounder, Bitmovin
Associate Professor, Universität Klagenfurt