Video: High-Efficiency Video Coding (HEVC) Primer

HEVC continues to gain adoption thanks to its bitrate savings over AVC (H.264), though much hangs in the balance this year as AV1 continues to gain momentum and MPEG's VVC is released, both of which promise greater compression. Compression, however, is always a compromise between encoding complexity (computation), quality and speed. HEVC stands on the shoulders of AVC, and this video explains the techniques it uses to improve on it.

Christian Timmerer, co-founder of Bitmovin, builds on his previous video about AVC as he details the tools and capabilities of HEVC (also known as H.265). He summarises the performance of HEVC as providing twice the compression for the same video quality (or better quality at the same bitrate). Whilst its decoder requirements have gone up by 50%, it provides better parallelisation opportunities. Amongst the features that deliver this are variable block-size motion compensation, an improved interpolation method and more directions for spatial prediction. Most of the improvements are expansions of abilities already laid out in AVC, for instance making sizes or directions variable, or simply providing more options.
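
To make the variable block-size idea concrete, here is a minimal, purely illustrative sketch (not any real encoder's code) of the recursive quadtree decision HEVC's coding tree units allow: keep a large block where the picture is flat, split where there is detail. The frame data and cost function are invented stand-ins for real rate-distortion optimisation.

```python
# Purely illustrative, not any real encoder's code: HEVC replaces AVC's
# fixed 16x16 macroblocks with 64x64 coding tree units (CTUs) that can be
# recursively quadtree-split. The toy cost below stands in for real
# rate-distortion optimisation.
import random
from math import log2

random.seed(1)
FRAME = [[128] * 64 for _ in range(64)]       # a flat 64x64 frame...
for y in range(16):
    for x in range(16):
        FRAME[y][x] = random.randint(0, 255)  # ...with one detailed corner

def cost(x, y, size):
    """Toy bit cost: detail (variance) makes a block expensive to code;
    the constant models per-block signalling overhead."""
    px = [FRAME[y + j][x + i] for j in range(size) for i in range(size)]
    mean = sum(px) / len(px)
    var = sum((p - mean) ** 2 for p in px) / len(px)
    return size * size * log2(1 + var) + 500

def split_ctu(x, y, size, min_size=8):
    """Return the cheaper partitioning as a list of (x, y, size) blocks."""
    if size == min_size:
        return [(x, y, size)]
    half = size // 2
    quads = [(x, y), (x + half, y), (x, y + half), (x + half, y + half)]
    children = [b for qx, qy in quads for b in split_ctu(qx, qy, half)]
    if cost(x, y, size) <= sum(cost(*b) for b in children):
        return [(x, y, size)]                 # one large block is cheaper
    return children

print(split_ctu(0, 0, 64))  # flat areas stay large; the noisy corner splits
```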

After outlining some of the details behind the new capabilities, we look at the performance improvements of some HEVC implementations over AVC implementations, showing bitrate reductions of up to 65% and averaging around 50%. Christian finishes by looking at the newer codecs coming soon, such as VVC and LCEVC.
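
As a quick worked example of what those percentages mean in practice (the source bitrate here is assumed for illustration):

```python
# Quick arithmetic on the quoted savings (the AVC bitrate is hypothetical):
avc_mbps = 8.0                               # an example AVC 1080p stream
for saving in (0.50, 0.65):                  # average and best-case HEVC gain
    print(f"{saving:.0%} saving: {avc_mbps} Mbps AVC -> "
          f"{avc_mbps * (1 - saving):.1f} Mbps HEVC")
```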

Watch now!
Speakers

Christian Timmerer
CIO & Cofounder, Bitmovin
Associate Professor, Universität Klagenfurt

Video: Efficient Carriage of Sub-Rasters With ST 2110-20

One of the main promises of IP video is flexibility and what better way to demonstrate that than stepping off the well-worn path of broadcast resolutions? 1920×1080 is much loved nowadays, but not everything needs to be put into an HD-sized frame. SMPTE ST-2110 allows video of all shapes and sizes, so let’s not be afraid to use the control given to us.

Paul Briscoe, talking on behalf of Evertz, takes the podium to explain the idea. Using logo insertion as an example, he shows that if you want to put a small BUG/DOG/graphic on screen with a key, there's really not a lot of data that needs to be transferred. Typically a graphic needs a key and a fill; whilst the key is typically luma-only, the fill needs to be full colour.
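
The keying operation itself is simple per pixel: the key acts as an alpha that controls how much of the fill replaces the programme video underneath. A minimal sketch of a linear (non-premultiplied) key, with invented sample values:

```python
# Minimal per-pixel linear key (non-premultiplied): the luma-only key says
# how much of the full-colour fill replaces the programme video.
def key_over(background, fill, key):
    """Composite one pixel; key in [0.0, 1.0], colours as (R, G, B)."""
    return tuple(round(f * key + b * (1 - key))
                 for f, b in zip(fill, background))

# Hypothetical pixel: 60%-opaque white logo over mid-grey programme video.
print(key_over(background=(128, 128, 128), fill=(255, 255, 255), key=0.6))
```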

In the world of SDI, sending your key and fill around would need two whole HD signals and up to 6Gbps of data. When your graphic is only a small logo, these SDI signals are mostly redundant data. Using ST 2110-20 in the IP domain, however, we can be much more efficient. 2110 allows resolutions up to 32,000 pixels square, so we should be able to send just the information which is necessary.
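
Some back-of-envelope numbers (payloads only, with an assumed logo size) show the scale of the saving:

```python
# Back-of-envelope payload figures (RTP/IP overheads and blanking ignored).
def video_mbps(width, height, fps, bits_per_pixel=20):   # 4:2:2 10-bit
    return width * height * bits_per_pixel * fps / 1e6

print(f"Full HD fill: {video_mbps(1920, 1080, 50):.0f} Mbit/s")
print(f"320x180 logo: {video_mbps(320, 180, 50):.1f} Mbit/s")  # assumed size
```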

Paul introduces the idea of a “pixel group” (pgroup) which is the minimal group of video data samples that make up an integer number of pixels and also align to an octet boundary. Along with defining a size, we also get to define an X,Y position. Paul explains how using pgroups helps, and hinders, sending video this way and then delves into how timing would work. To finish off, Paul examines edge cases and talks about other examples such as stock tickers, not to mention the possibility of motion as we get to define the X, Y position.
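
The octet-boundary requirement is what fixes the pgroup sizes. The sketch below derives the well-known values for some common samplings; the derivation is for illustration, so check ST 2110-20 and RFC 4175 for the normative tables:

```python
def pgroup(bit_depth, samples_per_pixel, min_pixels):
    """Smallest pixel count that is a whole number of chroma groups
    (min_pixels) and whose samples land exactly on an octet boundary."""
    bits_per_pixel = bit_depth * samples_per_pixel  # average bits per pixel
    pixels = min_pixels
    while (pixels * bits_per_pixel) % 8:
        pixels += min_pixels
    return pixels, pixels * bits_per_pixel // 8     # (pixels, octets)

# (label, bit depth, average samples per pixel, pixels per chroma group)
for label, depth, spp, mp in [("YCbCr 4:2:2 8-bit", 8, 2, 2),
                              ("YCbCr 4:2:2 10-bit", 10, 2, 2),
                              ("YCbCr 4:2:2 12-bit", 12, 2, 2),
                              ("RGB/4:4:4 10-bit", 10, 3, 1)]:
    px, octets = pgroup(depth, spp, mp)
    print(f"{label}: {px} pixels in {octets} octets")
```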

Watch now!
This wall chart gives more info on pgroups and other low-level ST 2110-20 constructs.
Download the slides from this presentation

Speakers

Paul Briscoe
Principal Consultant,
Televisionary Consulting

Video: Introduction To AES67 & SMPTE ST 2110

While standardisation of video and audio over IP is welcome, it does leave us with a plethora of standards numbers, and interoperability edge cases, to keep track of. Audio-over-IP standard AES67 is part of the SMPTE ST 2110 standards suite and was born largely from RAVENNA, which is still in use in its own right. It's with this backdrop that Andreas Hildebrand from ALC NetworX, who have been developing RAVENNA for 10 years now, takes the mic to explain how it all fits together. Whilst there are many technologies at play, this webinar focusses on AES67 and ST 2110.

Andreas explains how AES67 started out of a plan to unite the many proprietary audio-over-IP formats. Synchronisation, for instance (like ST 2110's, as we'll see later), was based on PTP. Andreas gives an overview of this synchronisation and then shows how they looked at each of the OSI layers and defined a technology that could serve everyone. RTP, the Real-time Transport Protocol, has long been used for transport of video and audio, so it made a perfect option for the transport layer. Andreas highlights the important timing information in the headers and how the streams can be delivered by unicast or IGMP multicast.
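
That timing information lives mostly in the fixed 12-byte RTP header, chiefly the 32-bit timestamp alongside the sequence number. A minimal sketch of packing one, with invented field values:

```python
import struct

def rtp_header(seq, timestamp, ssrc, payload_type=97, marker=0):
    """Pack a minimal 12-byte RTP header (RFC 3550), version 2, no CSRCs."""
    byte0 = (2 << 6)                        # version=2, P=0, X=0, CC=0
    byte1 = (marker << 7) | payload_type
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc)

# Hypothetical values: the receiver places the media on the PTP-derived
# clock using the timestamp, and spots loss/reordering via the sequence no.
print(rtp_header(seq=1, timestamp=960, ssrc=0x12345678).hex())
```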

As for the audio itself, standard PCM is the choice here. Andreas details the different format options available, such as 24-bit with 8 channels and 48 samples per packet. By varying the format permutations, we can increase the sample rate to 96kHz or change the number of audio channels. To signal all of this format information, Session Description Protocol (SDP) messages are exchanged: small text documents, defined in RFC 4566, outlining the format of the upcoming audio. For a deeper introduction to IP basics and these topics, have a look at Ed Calverley's talk.
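
For a feel of what such an SDP looks like, here is a hypothetical one for the stream just described; every address, name and clock identifier below is invented for illustration:

```python
# Hypothetical SDP (RFC 4566) for an AES67-style stream: L24 (24-bit) PCM,
# 48 kHz, 8 channels, 1 ms packets. All addresses and IDs are invented.
sdp = "\r\n".join([
    "v=0",
    "o=- 1 1 IN IP4 192.168.1.10",
    "s=Example AES67 stream",
    "c=IN IP4 239.69.1.10/32",
    "t=0 0",
    "m=audio 5004 RTP/AVP 96",
    "a=rtpmap:96 L24/48000/8",   # payload 96 = linear 24-bit, 48 kHz, 8 ch
    "a=ptime:1",                 # 1 ms per packet = 48 samples per channel
    "a=ts-refclk:ptp=IEEE1588-2008:00-1D-C1-FF-FE-12-34-56:0",
    "a=mediaclk:direct=0",
])
print(sdp)
```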

The second half of the video is an introduction to ST-2110. A deeper dive can be found elsewhere on the site from Wes Simpson.
Andreas starts from the basis of ST 2022-6, showing how that is an SDI-based format where all the audio, video and metadata are carried together. ST 2110 brings the splitting of media into separate ‘essences’, which allows them to follow separate workflows without requiring lots of de-embedding and re-embedding processes.

Like many modern standards (ATSC 3.0 is another example), SMPTE ST 2110 is a suite of many standards documents. Andreas takes the time to explain each one, along with those currently being worked on. The first is ST 2110-10, which defines the use of PTP for timing and synchronisation, using SMPTE ST 2059 to relate PTP time to the phase of media essences.
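
The idea behind ST 2059 is that a stream's RTP timestamp is a pure function of PTP time, so independently built devices stamp the same media identically. A simplified sketch of the relationship (ignoring details such as leap-second handling between epochs):

```python
# Simplified sketch of the ST 2059-2 idea: RTP timestamps derive directly
# from PTP time, so any two devices agree without talking to each other.
def rtp_timestamp(ptp_seconds, media_clock_rate):
    """ptp_seconds: seconds since the PTP epoch (TAI); rate in Hz,
    e.g. 90_000 for 2110-20 video or 48_000 for 2110-30 audio."""
    return int(ptp_seconds * media_clock_rate) % 2**32

t = 1_700_000_000.5                        # a hypothetical PTP time
print(rtp_timestamp(t, 90_000), rtp_timestamp(t, 48_000))
```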

2110-20 is up next and is the main standard defining the use of uncompressed video, with headline features such as being raster- and resolution-agnostic and supporting a range of colour samplings. 2110-21 defines traffic shaping; Andreas takes time to explain why traffic shaping is necessary and what Narrow, Narrow-Linear and Wide mean in terms of packet timing. Finishing the video theme, 2110-22 defines the carriage of mezzanine-compressed video. Intended for codecs like TICO and JPEG XS which offer light, fast compression, this is the first time compressed media has entered the 2110 suite.
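
To see why shaping matters, consider the packet counts involved. The figures below are back-of-envelope, with an assumed payload size:

```python
# Back-of-envelope packet pacing for 1080p50, 4:2:2 10-bit video.
width, height, fps = 1920, 1080, 50
bytes_per_frame = width * height * 20 // 8   # 20 bits per pixel
payload = 1200                               # assumed video bytes per packet
packets = -(-bytes_per_frame // payload)     # ceiling division
frame_time_us = 1e6 / fps

print(f"{packets} packets per frame")
print(f"linear sender: one packet every {frame_time_us / packets:.2f} us")
# A 'narrow' (gapped) sender squeezes the same packets into the active
# picture time only, so instantaneous spacing is tighter still.
```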

2110-30 marks the beginning of the audio standards, describing how AES67 can be used. As Andreas demonstrates, AES67 has some modes which are not compatible, so he spends time explaining the constraints and how to implement them. For more detail on this topic, check out his previous talk on the matter. 2110-31 introduces AES3 audio which, as in SDI, provides the ability to carry both PCM audio and non-PCM audio such as Dolby E and Dolby D.
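
Much of the incompatibility comes down to packet time. This quick arithmetic (assuming 24-bit samples and 8 channels) shows how the permitted packet times translate into samples and payload bytes:

```python
# Samples-per-packet arithmetic behind 2110-30's constrained AES67 modes.
rate = 48_000                                # 2110-30 audio is 48 kHz PCM
for ptime_us in (1000, 125):                 # the two permitted packet times
    samples = rate * ptime_us // 1_000_000
    payload = samples * 3 * 8                # 24-bit (3-byte) samples, 8 ch
    print(f"{ptime_us} us packets: {samples} samples/channel, "
          f"{payload} payload bytes at 8 channels")
```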

Finishing up the talk, we hear about 2110-40, which governs transport of ancillary metadata, and get a look at the standards still being written: 2110-23 (single video essence over multiple 2110-20 streams), 2110-24 (transport of SD signals) and 2110-41 (transport of extensible, dynamic metadata).

Watch now!
Speaker

Andreas Hildebrand
Senior Product Manager,
ALC NetworX GmbH

Video: ATSC 3.0 Part II – Cutting Edge OFDM with IP

RF, modulation, Single Frequency Networks (SFNs): there's a lot to love about this talk, the second in a series of ATSC 3.0 seminars, though much of it is transferable to DVB. Today we're focussed on transmission, showing how ATSC 3.0 improves on DVB-T2, how it simultaneously delivers feeds with different levels of robustness, the benefits of SFNs and much more.

In the second in this series of ATSC 3.0 talks, GatesAir's Joe Seccia leads the proceedings, starting by explaining why ATSC 3.0 didn't simply adopt DVB-T2's modulation scheme. The answer, explained in detail by Joe, is that by putting in further work, they got closer to the Shannon limit than DVB-T2 does. He goes on to highlight the documents within the ATSC 3.0 standard which define the RF physical layer.
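
The Shannon limit in question is the classic capacity bound C = B log2(1 + SNR). A quick calculation for a 6MHz channel, with an assumed SNR:

```python
from math import log2

def shannon_capacity_mbps(bandwidth_hz, snr_db):
    """Channel capacity C = B * log2(1 + SNR), returned in Mbit/s."""
    snr = 10 ** (snr_db / 10)
    return bandwidth_hz * log2(1 + snr) / 1e6

# Hypothetical: a 6 MHz US broadcast channel at 20 dB SNR.
print(f"{shannon_capacity_mbps(6e6, 20):.1f} Mbit/s theoretical ceiling")
```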

After showing how the different processes such as convolutional encoding and multiplexing fit together in the transmission chain, Joe focuses on Layered Division Multiplexing (LDM), where a highly robust signal is carefully combined with a less robust signal such that, where one interferes with the other, there is enough separation to allow each to be decoded.
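
Conceptually, the combination injects the less robust (enhanced) layer a fixed number of dB below the robust (core) layer before transmission. A simplified sketch, with an assumed injection level:

```python
# Simplified sketch of LDM layer combination: the enhanced layer is added
# a fixed number of dB below the core layer (injection level assumed).
def ldm_combine(core, enhanced, injection_db=5.0):
    """Combine two complex baseband symbols, scaling the enhanced layer
    down by injection_db relative to the core layer."""
    scale = 10 ** (-injection_db / 20)
    return core + scale * enhanced

# Hypothetical robust core symbol plus a higher-order enhanced symbol.
print(ldm_combine(complex(1, 1), complex(0.7, -0.3)))
```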

Next we are introduced to PLPs, Physical Layer Pipes. These can also be found in DVB-T2 and DVB-S2 and are logical channels carrying one or more services, each with a modulation scheme and robustness particular to that individual pipe. Within those lie Frames and Subframes, and Joe gives a good breakdown of the difference between the three, the Frame sitting at the top of the pile and containing the other two. We see how the bootstrap signal, sent at a known modulation scheme and symbol rate, details what's coming next; this is what allows such dynamic working, with streams sent using different modulation settings. The bootstrap is also important as it carries Emergency Alert System (EAS) signalling.

We then return to Layered Division Multiplexing, which elicits questions from the audience. LDM is important because it allows two streams carrying independent or related broadcasts to be sent at the same time. For instance, it could deliver UHD content with an HD version underneath, the HD modulated to give much better robustness.

Another way of maintaining robustness is to establish an SFN, which is now possible with ATSC 3.0. Joe explains how this works and how the RF from different antennae can help with reception. Importantly, he also outlines how to work out the maximum separation between antennae and talks through different deployment techniques. He then works through some specific cases to understand the transmission power needed.
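
The separation limit follows from the guard interval: echoes from another transmitter arriving within it are constructive, while those arriving beyond it interfere. A quick calculation, with assumed guard intervals:

```python
# Max SFN transmitter separation ~= distance radio travels in one guard
# interval; beyond that, the far site's signal becomes interference.
C = 299_792_458                              # speed of light, m/s

def max_separation_km(guard_interval_us):
    return C * guard_interval_us * 1e-6 / 1000

for gi in (100, 200, 400):                   # hypothetical guard intervals
    print(f"{gi} us guard interval -> ~{max_separation_km(gi):.0f} km")
```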

As the end of the video nears, Joe talks about MIMO transmission, explaining how this, among other benefits, can allow channel bonding where two 6MHz channels are treated as a single 12MHz channel. He talks about how PTP can complement GPS in maintaining timing when diverse systems are linked by Ethernet, and finishes with a walkthrough of configuring a system.

Watch now!
Speakers

Joe Seccia
Manager, TV Transmission Market and Product Development Strategy
GatesAir