Video: Advanced Video Coding Standards AVC

Whilst the encoding landscape is shifting, AVC (AKA H.264) still dominates many areas of video distribution so, for many, understanding what’s under the hood opens up a whole realm of diagnostics and fault finding that wouldn’t otherwise be possible. Whilst many understand that MPEG video is built around I, B and P frames, this short talk offers deeper detail which helps explain how it behaves both when it’s working well and otherwise.

Christian Timmerer, co-founder of Bitmovin, starts his lesson on AVC with a summary of the improvements in AVC over the basic MPEG-2 model people tend to learn as a foundation, such as variable block size motion compensation, multiple reference frames and improved adaptive entropy coding. We see that, as we would expect, the input can use 4:2:0 or 4:2:2 chroma sub-sampling as well as full 4:4:4 representation, with 16×16 macroblocks for luminance (8×8 for chroma in 4:2:0). AVC can handle pictures split into several slices, which are self-contained sequences of macroblocks. Slices themselves can then be grouped.

Intra-prediction is the next topic, whereby an algorithm uses the information within the slice to predict a macroblock. This prediction is then subtracted from the actual block and the difference is coded, thereby reducing the amount of data that needs to be transferred. The decoder can make the same prediction and reconstruct the full block from the data provided.
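To make the predict, subtract and reconstruct loop concrete, here is a minimal Python sketch. It assumes a simple DC-style prediction (the mean of the neighbouring samples above and to the left) and skips the transform and quantisation stages, so it is lossless; the function names and sample values are illustrative only and are not taken from the talk.

```python
def dc_predict(top, left):
    """Predict one value for the whole block: the rounded mean of the
    already-decoded neighbouring samples above and to the left."""
    neighbours = list(top) + list(left)
    n = len(neighbours)
    return (sum(neighbours) + n // 2) // n  # integer mean, rounded to nearest


def encode_block(block, top, left):
    """Return the residual: actual samples minus the prediction."""
    pred = dc_predict(top, left)
    return [[sample - pred for sample in row] for row in block]


def decode_block(residual, top, left):
    """The decoder makes the same prediction and adds the residual back."""
    pred = dc_predict(top, left)
    return [[r + pred for r in row] for row in residual]


# Illustrative 4x4 block and its neighbours (made-up sample values).
top = [100, 102, 101, 99]
left = [101, 100, 98, 99]
block = [
    [100, 101, 102, 100],
    [99, 100, 101, 100],
    [100, 100, 99, 98],
    [101, 100, 100, 99],
]

residual = encode_block(block, top, left)
print(residual[0])  # small numbers, cheaper to code than the raw samples
assert decode_block(residual, top, left) == block
```

The residual values sit close to zero, which is what makes them cheaper to code than the original samples.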

The next sections talk about motion prediction and the different sizes of macroblocks. A macroblock is a fixed area on the picture which can be described by a mixture of some basic patterns but the more complex the texture in the block, the more patterns need to be combined to recreate it. By splitting up the 16×16 block, we can often find a simpler way to describe the 8×8 or 8×16 shapes than if they had to encompass a whole 16×16 block.


B-frames are fairly well understood by many, but even if they are unfamiliar to you, Christian explains the concept whereby B-frames mainly provide motion information of macroblocks both from frames before and after. This allows macroblocks which barely change to be ‘moved around the screen’, so to speak, with minimal changes other than location. Whilst P and I frames provide new macroblocks, B-frames are intended chiefly to provide this directional information. Christian explains some of the nuances of B-frame encoding including weighted prediction.
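As a rough illustration of predicting from both a past and a future frame, the sketch below blends two reference blocks with integer weights. It is a simplification for intuition only: the function name, the weights and the sample values are assumptions, and this is not the exact weighted-prediction formula from the AVC specification.

```python
def bi_predict(past_block, future_block, w_past=1, w_future=1):
    """Blend two reference blocks sample by sample.

    With equal weights this is a plain average; unequal weights let the
    prediction lean towards one reference, which is useful in fades.
    """
    denom = w_past + w_future
    return [
        [(w_past * a + w_future * b + denom // 2) // denom
         for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(past_block, future_block)
    ]


# Made-up 2x2 blocks from a scene fading from bright to dark.
past = [[100, 100], [100, 100]]
future = [[50, 50], [50, 50]]

print(bi_predict(past, future))        # equal weighting
print(bi_predict(past, future, 3, 1))  # leaning towards the past frame
```

In a fade, a frame that sits closer in time to the past reference is better predicted by weighting that reference more heavily, which is the motivation for weighting at all.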

Quantisation is one of the most important parts of the MPEG process since it is the step by which information is removed and the codec becomes lossy. The way this happens, and the optimisations possible, are therefore key, so Christian covers them before explaining the deblocking filter available. After splitting the picture up into so many macroblocks which are independently processed, edges between the blocks can become apparent, so this filter helps smooth any artefacts to make them more pleasing to the eye. Christian finishes talking about AVC by exploring entropy encoding and thinking about how AVC encoding can and can’t be improved by adding more memory and computation to the encoder.
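The lossy nature of quantisation can be seen in a few lines of Python. This sketch assumes a single uniform step size and truncation towards zero, which is a simplification; real encoders use per-coefficient scaling and rounding offsets. The coefficient values are made up for illustration.

```python
def quantise(coeffs, step):
    """Divide each coefficient by the step size and truncate towards zero."""
    return [int(c / step) for c in coeffs]


def dequantise(levels, step):
    """Scale the levels back up; the discarded remainder cannot be recovered."""
    return [level * step for level in levels]


coeffs = [52, -7, 3, 0, -1]  # made-up transform coefficients
step = 8

levels = quantise(coeffs, step)
restored = dequantise(levels, step)
print(levels)    # many small coefficients collapse to zero
print(restored)  # close to, but not the same as, the input
```

A larger step size zeroes more coefficients, which lowers the bit rate at the cost of a larger difference between the input and the restored values.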

Watch now!

Christian Timmerer
CIO & Cofounder, Bitmovin
Associate Professor, Universität Klagenfurt

Video: Hardware Transcoding Solutions For The Cloud

Hardware encoding is more pervasive, with Intel’s Quick Sync embedding GPUs inside CPUs and NVIDIA GPUs offering NVENC encoding support, so how does it compare with software encoding? For HEVC, can Xilinx’s FPGA solution be a boost in terms of quality or cost compared to software encoding?

Jan Ozer has stepped up to the plate to put this all to the test analysing how many real-time encodes are possible on various cloud computing instances, the cost implications and the quality of the output. Jan’s analytical and systematic approach brings us data rather than anecdotes giving confidence in the outcomes and the ability to test it for yourself.

Over and above these elements, Jan also looks at the bit rate stability of the encodes, which can be important for systems which are sensitive to variations, such as services running at scale. We see that the hardware AVC solutions perform better than x264.

Jan takes us through the way he set up these tests whilst sharing the relevant ffmpeg commands. Finally he shares BD plots and example images which exemplify the differences between the codecs.

Watch now!
Download the slides

Jan Ozer
Principal, Streaming Learning Center
Contributing Editor, Streaming Media

Video: Timing Tails & Buffers

Timing and synchronisation have always been a fundamental aspect of TV and as we move to IP, we see that timing is just as important. Whilst there are digital workflows that don’t need to be synchronised against each other, many do such as studio productions. However, as we see in this talk from The Broadcast Bridge’s Tony Orme, IP networks make timing all the more variable and accounting for this is key to success.

To start with, Tony looks at the way OBs, also known as REMIs, are moving to IP and need a timing plane across all of the different parts of production. We see why synchronisation has traditionally been needed and the effect of timing problems: not only missed data but, with all essences being sent separately, synchronisation problems between them that can easily creep in.

When it comes to IP timing itself, Tony explains how PTP is used to record the capture time of the media/essences and distribute it through the system. Looking at each packet on the wire and the interval between it and the previous one will show a distribution with, hopefully, a few microseconds of variation. This variation gives rise to jitter, which is a varying delay in data arrival. The larger the spread, the more difficult it will be to recover data. To examine this more closely, Tony looks at the reasons for and the impacts of congestion, jitter and reordering of data.
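The idea of looking at the interval between each packet and the previous one can be sketched in Python as follows. The arrival timestamps are invented for illustration, and taking the spread as maximum minus minimum is just one simple way to summarise the distribution.

```python
def inter_arrival_intervals(arrival_times_us):
    """Return the gap, in microseconds, between each packet and the previous one."""
    return [later - earlier
            for earlier, later in zip(arrival_times_us, arrival_times_us[1:])]


def interval_spread(intervals_us):
    """Summarise jitter as the difference between the longest and shortest gap."""
    return max(intervals_us) - min(intervals_us)


# Hypothetical arrival timestamps for packets nominally sent every 100 microseconds.
arrivals_us = [0, 100, 203, 298, 401, 500]

intervals_us = inter_arrival_intervals(arrivals_us)
print(intervals_us)
print(interval_spread(intervals_us))
```

With perfectly spaced packets every interval would be identical and the spread would be zero; the wider the spread, the more buffering a receiver needs to absorb it.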

Bursting, to take one of these as an example, is a much overlooked issue on networks. While it can occur in many scenarios without any undue problems, microbursting can be a major issue and one that you need to look for to find. This surrounds the issue of how you decide that a data flow is, say, 500Mbps. If you had an encoder which sent data at 1Gbps for 5 minutes and no data for 5 minutes, then over the 10 minute window, the average bitrate would have been 500Mbps. This clearly isn’t a 500Mbps encoder, but how narrow do you need to have your measurement window to be happy it is, indeed, 500Mbps by all reasonable definitions? Do you need to measure it over 1 second or 1 millisecond? Behind microbursting is the tendency of computers to send whatever data they have as quickly as possible; if a computer has a 10GbE NIC, then it will send at 10Gbps. What video receivers actually need is well spaced packets which always come a set time apart.
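The effect of the measurement window can be checked with a short Python sketch that models the example above as one traffic figure per second. The function name and the one-second granularity are assumptions made for illustration.

```python
def windowed_bitrates(bits_per_slot, window_slots):
    """Average the traffic over consecutive, non-overlapping windows.

    bits_per_slot holds the number of bits sent in each one-second slot,
    so the averages returned are in bits per second.
    """
    rates = []
    for start in range(0, len(bits_per_slot), window_slots):
        window = bits_per_slot[start:start + window_slots]
        rates.append(sum(window) / len(window))
    return rates


# 1Gbps for 5 minutes followed by nothing for 5 minutes.
traffic = [1_000_000_000] * 300 + [0] * 300

print(windowed_bitrates(traffic, 600))     # one 10 minute window
print(windowed_bitrates(traffic, 300))     # two 5 minute windows
print(max(windowed_bitrates(traffic, 1)))  # peak over 1 second windows
```

The 10 minute window reports 500Mbps, whereas the narrower windows reveal that the flow is really either 1Gbps or nothing. The same reasoning applies at millisecond scale for microbursts.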

Buffers are necessary for IP transmission; in fact, within a computer there are many buffers, so using and understanding them is very important. Tony takes us through the thought process of considering what buffers are and why we need them. With this groundwork laid, understanding their use and potential problems is easier and well illustrated in this talk. For instance, since there are buffers in many parts of the chain to send data from an application to a NIC and have it arrive at the destination, the best way to maximise the chances of having a deterministic delay in the Tx path is to insert PTP information almost at the point of egress in the NIC rather than in the application itself.

The talk concludes by looking at buffer fill models and the problems that come with streaming using TCP/IP rather than UDP/IP (or RTP), the latter being the most common.

Watch now!
Download the presentations!


Tony Orme
The Broadcast Bridge

Video: QoE Impact from Router Buffer sizing and Active Queue Management

Netflix take to the stage at Demux to tell us about the work they’ve been doing to understand and reduce latency by looking at the queue management of their managed switches. As Tony Orme mentioned yesterday, we need buffers in IP systems to allow asynchronous parts to interact. Here, we’re looking at how the core network fabric’s buffers can get in the way of the main video flows.

Te-Yuan Huang from Netflix explains their work in investigating buffers and how best to use them. She talks about the flows that occur due to the buffer models of standard switches, i.e. waiting until the buffer is full and then dropping everything else that comes in until the buffer has drained. There is an alternative approach, Active Queue Management (AQM), an example of which is FQ-CoDel, which drops packets based on probability before the buffer is full. By carefully choosing the probability, you can actually improve buffer handling and the impact it has on latency.
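To see why a buffer that is only emptied once it has filled up hurts latency, it helps to work out how long a packet waits behind a full buffer. The Python sketch below does that arithmetic; the buffer size and link speed are hypothetical figures chosen for illustration, not numbers from the talk.

```python
def worst_case_queueing_delay_s(buffer_bytes, link_bps):
    """Time for a packet at the back of a full buffer to reach the wire.

    Every byte ahead of it must be sent first, so the delay is the buffer
    size in bits divided by the link rate in bits per second.
    """
    return buffer_bytes * 8 / link_bps


# Hypothetical figures: a 1 MB buffer draining onto a 100Mbps link.
delay_s = worst_case_queueing_delay_s(1_000_000, 100_000_000)
print(f"{delay_s * 1000:.0f} ms")
```

Dropping packets before the buffer fills keeps the standing queue shorter, and by the same arithmetic a shorter queue means a shorter wait.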

Te-Yuan shows us results from tests that her team has done which show that FQ-CoDel does, indeed, reduce latency. After showing us the data, she summarises by saying that FQ-CoDel improves playback and QoE.

Watch now!

Te-Yuan Huang
Engineering Manager (Adaptive Streaming),