Video: Video Compression Basics

Video compression is used everywhere we look. It is so rarely practical to use uncompressed video that virtually everything in the consumer space is delivered compressed, so it pays to understand how this works, particularly if part of your job involves using video formats such as AVC, also known as H.264, or HEVC, also known as H.265.

Gisle Sælensminde from Vizrt takes us on this journey of creating compressed video. He starts by explaining why we need video compression and then talks about containers such as MPEG-2 Transport Streams, MP4, MOV and others. He explains that the container's job is partly to hold metadata such as the framerate, resolution and timestamps, among a long list of other things.

Gisle takes some time to look at the history of codecs in order to understand where we're going in light of what went before. As many use the same principles, Gisle looks at the different types of frame found inside most compressed formats – I, P and B frames – which are used in set patterns known as GOPs, or Group(s) of Pictures. A GOP defines how long the gap is between I frames. In the talk we learn that I frames are required for a decoder to be able to tune in part way through a feed and still start seeing some pictures. This is because it's the I frame which holds a whole picture, unlike the other frame types which don't.
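As an aside, the repeating pattern of frame types within a GOP is easy to sketch in code. The GOP length and the number of B frames below are illustrative values rather than anything specified in the talk:

```python
# A minimal sketch of a repeating GOP pattern. The GOP length (distance
# between I frames) and the number of B frames between reference frames
# are example values, not taken from the talk.

def gop_pattern(gop_length=12, b_frames=2):
    """Return the display-order frame types for one GOP, e.g. IBBPBBP..."""
    pattern = []
    for i in range(gop_length):
        if i == 0:
            pattern.append("I")   # full picture: a decoder can start here
        elif i % (b_frames + 1) == 0:
            pattern.append("P")   # predicted from earlier frames
        else:
            pattern.append("B")   # predicted from earlier and later frames
    return "".join(pattern)

print(gop_pattern())  # IBBPBBPBBPBB
```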

Colours are important, so Gisle looks at the way that colours are represented. Many people know about defining colours by the values of Red, Green and Blue, but fewer know about YUV. This is all covered in the talk, including conversion between the two representations.
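As a rough illustration of the kind of conversion discussed, here is an RGB-to-Y'CbCr mapping using the common BT.601 luma coefficients; the talk doesn't prescribe a specific matrix, so treat these numbers as one typical choice:

```python
# A hedged example of R'G'B' to Y'CbCr conversion using BT.601 luma
# coefficients (one common choice; BT.709 uses different weights).

def rgb_to_ycbcr(r, g, b):
    """Convert R'G'B' values in the range 0..1 to Y', Cb, Cr (also 0..1)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted brightness
    cb = (b - y) / 1.772 + 0.5              # blue-difference chroma
    cr = (r - y) / 1.402 + 0.5              # red-difference chroma
    return y, cb, cr

print(rgb_to_ycbcr(1.0, 0.0, 0.0))  # pure red -> low-ish luma, high Cr
```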

Almost synonymous with codecs such as HEVC and AVC are macroblocks. This is the name given to the parts of the raster which have been split up into squares, each of which will be analysed independently. We'll look at how these macroblocks are used, but Gisle also spends some time looking to the future, as HEVC, VP9 and now AV1 all use variable-size macroblock analysis.
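For those who like to see it concretely, here is a minimal sketch of dividing a luma plane into classic fixed 16x16 macroblocks; the block size and example resolution are assumptions for illustration, and newer codecs use larger, variable-size blocks:

```python
# An illustrative sketch of splitting a frame into fixed 16x16 macroblocks
# (classic AVC-style; HEVC/VP9/AV1 use larger, variable-size blocks).
import numpy as np

def split_into_macroblocks(frame, size=16):
    """Yield (row, col, block) for each size x size tile of a 2D frame."""
    h, w = frame.shape
    for y in range(0, h, size):
        for x in range(0, w, size):
            yield y, x, frame[y:y + size, x:x + size]

frame = np.zeros((1080, 1920), dtype=np.uint8)   # one 1080p luma plane
blocks = list(split_into_macroblocks(frame))
print(len(blocks))  # 120 * 68 = 8160 (1080 isn't divisible by 16, so the last row of blocks is only 8 pixels tall)
```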

A process which happens throughout broadcast is chroma subsampling. This topic, whereby we keep more detail in the luminance channel than in the colour channels, is explored ahead of looking at DCTs – Discrete Cosine Transforms – which are foundational to most video codecs. We see that by analysing these macroblocks with DCTs, we can express the image in a different way and even cut down on some of the detail we get from the DCTs in order to reduce the bitrate.
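To make the idea concrete, here is a small, self-contained 2D DCT of an 8x8 block, written straight from the textbook formula rather than any particular codec's integer approximation:

```python
# A small illustration of the 2D DCT-II on an 8x8 block, the transform at
# the heart of most DCT-based codecs.
import numpy as np

def dct2d(block):
    """Orthonormal 2D DCT-II of a square block."""
    n = block.shape[0]
    k = np.arange(n)
    # Basis matrix: c[u, x] = a(u) * cos(pi * (2x + 1) * u / (2n))
    c = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c *= np.sqrt(2 / n)
    c[0, :] = np.sqrt(1 / n)
    return c @ block @ c.T

block = np.add.outer(np.arange(8.0), np.arange(8.0)) * 8   # a smooth gradient
coeffs = dct2d(block - 128)       # centre around zero, as codecs typically do
print(np.round(coeffs, 1))        # energy concentrates in the top-left (low-frequency) corner
```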

The talk then moves on to some very useful demos looking at the result of varying quantisation across a picture, the difference signal between the source and the encoded picture, plus the deblocking technology used to hide some of the artefacts which can arise from DCT-based codecs when they are pushed for bandwidth.
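The quantisation step itself is also easy to sketch: dividing coefficients by a step size and rounding is where detail, and therefore bitrate, is actually discarded. The step size and coefficients below are arbitrary examples, not values from the demos:

```python
# A hedged sketch of quantisation: divide DCT coefficients by a step size
# and round. Larger steps mean fewer levels, lower bitrate and more loss.
import numpy as np

def quantise(coeffs, step):
    return np.round(coeffs / step)

def dequantise(qcoeffs, step):
    return qcoeffs * step

step = 16                                     # arbitrary example step size
coeffs = np.array([-180.0, 42.0, 7.0, -3.0, 1.0, 0.5])
recon = dequantise(quantise(coeffs, step), step)
difference = coeffs - recon                   # what was thrown away
print(recon, difference)                      # small coefficients vanish entirely
```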

Gisle finishes this talk at Media City Bergen by taking a number of questions from the floor.

Watch now!
Speaker

Gisle Sælensminde
Senior Software Engineer,
Vizrt

Video: ABA IP Fundamentals For Broadcast

IP is explained from the fundamentals in this talk from Wayne Pecena, building up a picture of networking from the basics. This talk discusses not just the essentials for uncompressed video over IP, SMPTE ST 2110 for instance, but for any use of IP within broadcast, even if just for management traffic. Networking is a fundamental skill, so even if you know what an IP address is, it's worth diving down and shoring up the foundations by listening to this talk from the President of the SBE and long-standing Director of Engineering at Texas A&M University.

This talk covers what a network is, what elements make up a network and gives an insight into how the internet developed out of a small number of these elements. Wayne then looks at the different standards organisations that specify protocols for use in networking and IP. He explains what they do and highlights the IETF's famous RFCs as well as the IEEE's 802 series of Ethernet standards, including 802.11 for Wi-Fi.

The OSI model is next, which is an important piece of the puzzle for understanding networking. Once you understand, as the OSI model lays out, that different aspects of networking are built on top of, but operate separately from, other parts, then fault-finding, designing networks and understanding the individual technologies becomes much easier. The OSI model explains how the standards that define the physical cables work underneath those for Ethernet as separate layers. There are layers all the way up to how your software works, but much of the broadcasting that takes place in studios and MCRs can be handled within the first four of the seven layers.
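As a quick aide-mémoire, the seven layers can be jotted down as below; the example protocols are common associations rather than a definitive mapping:

```python
# A simple reference list of the seven OSI layers discussed in the talk.
# The examples are typical associations, not an exhaustive mapping.
osi_layers = {
    1: ("Physical",     "cables, fibre, SFPs"),
    2: ("Data Link",    "Ethernet, MAC addresses"),
    3: ("Network",      "IP addressing and routing"),
    4: ("Transport",    "TCP, UDP"),
    5: ("Session",      "connection management"),
    6: ("Presentation", "data formats, encryption"),
    7: ("Application",  "HTTP, the software you use"),
}

# Much day-to-day broadcast engineering in studios and MCRs sits in layers 1-4.
for number, (name, examples) in osi_layers.items():
    print(f"Layer {number}: {name} ({examples})")
```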

The last section of the talk deals with how packets are formed by adding information from each layer to the data payload. Wayne then finishes off with a look at fibre interfaces, different types of SFP and the fibres themselves.
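A back-of-the-envelope sketch of that encapsulation is below; the header sizes are the usual minimums (IPv4 with no options, untagged Ethernet) and are meant only to illustrate the layering:

```python
# An illustrative sketch of encapsulation: each layer wraps the payload
# with its own header as it travels down the stack.

PAYLOAD = 1_000            # application data, bytes
UDP_HEADER = 8             # transport layer
IP_HEADER = 20             # network layer (IPv4, no options)
ETH_OVERHEAD = 14 + 4      # data-link layer: Ethernet header + frame check sequence

frame_size = PAYLOAD + UDP_HEADER + IP_HEADER + ETH_OVERHEAD
print(f"{PAYLOAD} bytes of data leaves the NIC as a {frame_size}-byte Ethernet frame")
```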

Watch now!
Speaker

Wayne Pecena
Director of Engineering, KAMU TV/FM at Texas A&M University
President, Society of Broadcast Engineers (SBE)

Video: The Basics of SMPTE ST 2110 in 60 Minutes

SMPTE ST 2110 is a growing suite of standards detailing uncompressed media transport over networks. Now at 8 documents, it's far more than just 'video over IP'. This talk looks at the new ways that video can be transported, deals with PTP timing and creating 'SDPs', and is a thorough look at all the documents.

Building on this talk from Ed Calverley which explains how we can use networks to carry uncompressed video, Wes Simpson goes through all the parts of the ST 2110 suite explaining how they work and interoperate as part of the IP Showcase at NAB 2019.

Wes starts by highlighting the new parts of 2110, namely the overview document which gives a high-level view of all the standards documents, the addition of constant bit-rate compressed video carriage and the recommended practice document for splitting a single video and sending it over multiple links; the latter two are detailed later in the talk.

SMPTE ST 2110 is fundamentally different, as highlighted next, in that it splits up all the separate parts of the signal (i.e. video, audio and metadata) so they can be transferred and processed separately. This is a great advantage in terms of reading metadata without having to ingest large amounts of video, meaning that the networking and processing requirements are much lighter than they would otherwise be. However, when essences are separated, putting them back together without any synchronisation issues is tricky.

ST 2110-10 deals with timing and knowing which packets of one essence are associated with packets of another essence at any particular point in time. It does this with PTP, which is detailed in IEEE 1588 and also in SMPTE ST 2059-2. Two standards are needed to make this work because the IEEE defined how to derive and carry timing over the network, while SMPTE then detailed how to match the PTP times to phases of media. Wes highlights that care needs to be taken when using PTP and AES67, as the audio standard requires specific timing parameters.
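As a hedged illustration of how the two fit together, the sketch below maps a PTP time onto a 32-bit RTP timestamp using the 90 kHz clock that ST 2110 video streams use; the exact epoch handling in a real device is more involved:

```python
# A hedged illustration of turning a PTP time into an RTP timestamp.
# ST 2110 video uses a 90 kHz RTP clock; the modulo-2^32 wrap comes from
# the 32-bit RTP timestamp field.

def rtp_timestamp(ptp_seconds: float, clock_rate: int = 90_000) -> int:
    """Map seconds since the PTP epoch to a 32-bit RTP timestamp."""
    return int(ptp_seconds * clock_rate) % (2 ** 32)

# Two devices locked to the same PTP grandmaster produce matching timestamps
# for the same instant, which is what lets separate essences be re-aligned.
print(rtp_timestamp(1_700_000_000.0))
```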

The next section moves into the video portion of 2110, dealing with video encapsulation on the network, pixel grouping and the headers needed for the packets. Wes then spends some time walking us through calculating the bitrate of a stream. Whilst for most people using a look-up table of standard formats would suffice, understanding how to calculate the throughput helps develop a very good understanding of the way 2110 is carried on the wire, as you have to take note not only of the video itself (4:2:2 10-bit, for instance) but also the pixel groupings and the UDP, RTP and IP headers.
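In the spirit of that walkthrough, here is a rough calculation for a 1080p50, 4:2:2 10-bit stream; the per-packet payload size is an assumption for illustration (and the per-packet payload header is ignored), so the result is approximate:

```python
# A rough ST 2110-20 bitrate estimate for 1080p50, 4:2:2 10-bit video.
# The payload-per-packet figure is an assumption for illustration.

WIDTH, HEIGHT, FPS = 1920, 1080, 50
PGROUP_BYTES, PGROUP_PIXELS = 5, 2        # 4:2:2 10-bit: 5 bytes carry 2 pixels
PAYLOAD_PER_PACKET = 1200                 # assumed video bytes per packet
HEADERS = 12 + 8 + 20 + 14 + 4            # RTP + UDP + IPv4 + Ethernet + FCS

bytes_per_frame = WIDTH * HEIGHT * PGROUP_BYTES // PGROUP_PIXELS
packets_per_frame = -(-bytes_per_frame // PAYLOAD_PER_PACKET)   # ceiling division
bits_per_second = (bytes_per_frame + packets_per_frame * HEADERS) * 8 * FPS

print(f"{bytes_per_frame} bytes/frame, {packets_per_frame} packets/frame")
print(f"~{bits_per_second / 1e9:.2f} Gbit/s on the wire")
```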

Timing of packets on the wire isn't anything new, as it's also important for compressed applications, but it's just as important here to ensure that packets are sent properly paced on the wire. This is to say that if you need to send 10 packets, you send them one at a time with equal time between them, not all at once right next to each other. Such 'micro-bursting' can cause problems not only for the receiver, which then needs to use more buffering, but also, when mixed with other streams on the network, it can affect the efficiency of the routers and switches, leading to jitter and possibly dropped packets. 2110-21 sets standards to govern the timing of network pacing for all of the 2110 suite.
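The arithmetic behind pacing is simple enough to show; 2110-21 defines the precise sender models, but the basic spacing idea is just this:

```python
# A small sketch of pacing: instead of bursting a frame's packets, spread
# them evenly across the frame period. Figures reuse the bitrate example above.

FPS = 50
PACKETS_PER_FRAME = 4320                  # from the worked example above

frame_period_us = 1_000_000 / FPS
spacing_us = frame_period_us / PACKETS_PER_FRAME
print(f"Send one packet roughly every {spacing_us:.2f} microseconds, not 4320 at once")
```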

Referring back to his earlier warning regarding timing and AES67, Wes now goes into detail on the 2110-30 standard which describes the use of audio for these uncompressed workflows. He explains how the sample rates and packet times relate to the ability to carry multiple audio channels, with some configurations allowing 64 channels in one stream rather than the typical 8.
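A quick, hedged calculation shows why shorter packet times open the door to more channels: the audio payload still has to fit inside a standard ~1500-byte Ethernet frame:

```python
# Why shorter packet times allow more channels: the payload must fit the MTU.

SAMPLE_RATE = 48_000
BYTES_PER_SAMPLE = 3                      # 24-bit linear PCM

for packet_time_us, channels in [(1000, 8), (125, 64)]:
    samples = int(SAMPLE_RATE * packet_time_us / 1_000_000)
    payload = samples * channels * BYTES_PER_SAMPLE
    print(f"{packet_time_us} us packets, {channels} ch: "
          f"{samples} samples -> {payload} bytes of payload")
```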

‘Essences’, rather than ‘media’, is a word often heard when talking about 2110, an acknowledgement that metadata is just as important as the video and audio. It's sent separately, as described by 2110-40. Wes explains the way captions/subtitles, ad triggers, timecode and more can be encapsulated in the stream as ancillary ‘ANC’ packets.

2110-22 is an exciting new addition as it enables the use of compressed video such as VC-2 and JPEG XS, ultra-low-latency codecs which allow the video stream's bitrate to be reduced to a half, a quarter or less. As described in this talk, the ability to create workflows on a single IP infrastructure, seamlessly moving into and out of compressed video, is allowing remote production across countries, with equipment centralised and people and control surfaces elsewhere.

Noted as ‘forthcoming’ by Wes, but having since been published, is RP 2110-23 which adds back a feature that was lost when migrating from 2022-6 to 2110 – the ability to send a UHD feed as 4x HD feeds. This can be useful where UHD is used as a production format but multiviewers only need to work in HD mode for monitoring. Wes explains the different modes available. The talk finishes by looking at RTP timestamps and SDPs.

Watch now!
The slides for this talk are available here
Speaker

Wes Simpson
President,
Telecom Product Consulting

Video: DOS Gaming Aspect Ratio – 320×200


Occasionally, talks about broadcast topics can be a little dry. Not this one, which discusses aspect ratios. For those who feel they are already well versed in 16:9, 4:3 and the many other standard aspect ratios in use in the film and broadcast industries, looking at them through the lens of retro computer gaming will be a breath of fresh air. For those who are new to anything that's not widescreen 16:9, this is a great intro to a topic of fundamental importance for anyone dealing with video.

This video is no surprise coming from YouTube channel Displaced Gamers who have previously been on The Broadcast Knowledge talking about 525-Line Analog Video and Analog Luma – A History and Explanation of Video. After a brief intro, we quickly start looking at what standard resolutions are today and their aspect ratios.

The aspect ratio of a video is a way of describing how wide it is compared to its height. This can be written as an actual ratio of width:height or expressed more mathematically as a decimal, such as 1.778 in the case of 16:9 widescreen. The video discusses how old CRTs display video and their use of analogue dials that changed the width and height of the image.

In today's world, pixels tend to be square, so those encountering any pixels which aren't square tend to work in archiving and preservation. But the reality today is that with so many second-screen devices, there are all sorts of resolutions and a variety of aspect ratios. As people working in media and entertainment, we have to understand the impact on the size and shape of the video when displaying it on different screens. This video shows the impacts vividly using figurines from Doom, comparing them with the in-game graphics, before looking at aspect ratios across the SNES, Amiga, Atari ST as well as IBM DOS.
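For the headline case in the title, the arithmetic is worth seeing: a 320x200 image filling a 4:3 screen cannot have square pixels. A quick calculation, assuming the full raster fills the display:

```python
# 320x200 filled a 4:3 CRT, so its pixels cannot be square.
from fractions import Fraction

display_aspect = Fraction(4, 3)      # the shape of the CRT screen
storage_aspect = Fraction(320, 200)  # the shape of the pixel grid (8:5)
pixel_aspect = display_aspect / storage_aspect

print(pixel_aspect)   # 5/6 -> each pixel is drawn 1.2x taller than it is wide
print(16 / 9)         # 1.777..., the decimal form of 16:9 mentioned above
```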

Watch now!
Speaker

Chris Kennedy
Displaced Gamers, YouTube Channel