Video: IP Fundamentals For Broadcast Seminar IV

“When networking gets real”, perhaps, could have been the title of this last of 4 talks about IP for broadcast. This session wraps up a number of topics from the classic ‘TCP Vs. UDP’ discussion to IPv6 and examines the switches and networks that make up a network as well as the architecture options. Not only that, but we also look at VPNs and firewalls finishing by discussing some aspects of network security. When viewed with the previous three talks, this discusses many of the nuances from the topics already covered bringing in the relevance of ‘real world’ situations.

Wayne Pecena, President of SBE, starts by discussing subnets and collision domains. The issue with any NIC (Network Interface Controller) is that it’s not to know when someone else is talking on the wire (i.e. when another NIC is sending a message by changing the voltage of the wire). It’s important that NICs detect when other NICs are sending messages and seek to avoid sending while this is happening. If this does’t work out well, then two messages on the same wire are seen as a ‘collision’. It’s no surprise that collisions are to be avoided which is the starting point of Wayne’s discussion.

Moving from Layer 2 to Layer 4, Wayne pits TCP against UDP looking at the pros and cons of each protocol. Whilst this is no secret, as part of the previous talks this is just what’s needed to round the topic off ahead of talking about network architecture.

“Building and Securing a Segmented IP Network Infrastructure” is the title of the next talk which starts to deal with real-world problems when an engineer gets back from a training session and starts to actually specify a network herself. How should the routers and switches be interconnected to deliver the functionality required by the business and, as we shall see, which routers/switches are actually needed? Wayne discusses some of the considerations of purchasing switches (layer 2) and routers (layer 3 & 2) including the differing terms used by HP and Cisco before talking about how to assign IP addresses, also called an IP space. Wayne takes us through IP addressing plans, examples of what they would look like in excel along with a lot of the real-world thinking behind it.

Security is next on the list, not just in terms of ‘cybersecurity’ in the general sense but in terms of best practice, firewalls and VPNs. Wayne takes a good segment of time out to discus the different aspects of firewalls – how they work, ACLs (Access-control Lists), and port security amongst other topics before doing the same for VPNs (Virtual Private Networks) before making the point that a VPN and a firewall are not the same. A VPN allows you to extend a network out from a building to be in another – the typical example being from your work’s address into your home. Whilst a VPN is secured so that only certain people can extend the network, a firewall more generally acts to prevent anything coming into a network.

As an addendum to this talk, Wayne explains IPV4 depletion and how IPv6 addressing works. In practice, for broadcasters deploying within their company in the year 2020, IPv6 is unlikely to be a topic needed. However, for people who are distributing to homes and working closer with CDNs and ISPs, there is a chance that this information is more relevant on a day-to-day basis. Whilst IP address depletion is a real thing, since every company has a 10.x.x.x address space to play with, most companies use internal equipment with an IPv4 address plan.
Watch now!

Wayne Pecena Wayne Pecena
Director of Engineering, KAMU TV/FM at Texas A&M University
President, Society of Broadcast Engineers AKA SBE

Video: PTP in Virtualized Media Environment

How do we reconcile the tension between the continual move towards virtualisation, microservices and docker-like deployments and the requirements of SMPTE 2110 to have highly precise timing so it can synchronise the video, audio and other essence streams? Virtualisation adds fluidity in to computing so it can share a single set of resources amongst many virtual computers yet PTP, the Precision Time Protocol a successor to NTP, requires close to nano-second precision in its timestamps.

Alex Vainman from Mellanox explains how to make PTP work in these cases and brings along a case study to boot. Starting with a little overview and a glossary, Alex explains the parts of the virtual machine and the environment in which it sits. There’s the physical layer, the hypervisor as well as the virtual machines themselves – each virtual machine being it’s own self-contained computer sitting on a shared computer. Hardware must be shared between, often, many different computers. However some devices aren’t intended to be shared. Take, for instance, a dongle that contains a licence for software. This should clearly be only owned by one computer therefore there is a ‘direct’ mode which means that it is only seen by one computer. Alex goes on to explain the different virtualisation I/O modes which allow devices which can be shared, a printer, storage or CPU for instance need to have access computers may need to wait until they have access to the device to enable sharing. Waiting, of course, is not good for a precision time protocol.

In order to understand the impact that virtualisation might have, Alex details the accuracy and other requirements necessary to have PTP working well enough to support SMPTE 2110 workflows. Although PTP is an IEEE standard, this is just a standard to define how to establish accurate time. It doesn’t help us understand how to phase and bring together media signals without SMPTE ST 2059-1 and -2 which provide the standard of the incoming PTP signal and the way by which we can compare timing and media signals (more info here.) All important is to understand how PTP can actually determine the accurate time given that we know every single message has an unknown propagation delay. By exchanging messages, Alex shows, it is quite practical to measure the delays involved and bring them into the time calculation.

We now have enough information to see why the increased jitter of VM-based systems is going to cause a problem as there are non-deterministic factors such as contention and traffic load to consider as well as factors such as software overhead. Alex takes us through the different options of getting PTP well synchronised in a variety of different VM architectures. The first takes the host clock and ensures that is synchronised to PTP. Using a dedicated PTP library within the VM, this then speaks to the host clock and synchronises the VM OS clock providing very accurate timing. Another, where hardware support in the VM’s hardware for PTP isn’t present, is to have NICs with dedicated PTP support which can directly support the VM OSes maintaining their own PTP clock. The major downside here is that each OS will have to make their own PTP calls creating more load on the PTP system as opposed to the previous architecture whereby the host clock of the VM was the only clock synchronising to the system PTP and therefore there was only ever one set of PTP messages no matter how many VMs were being supported.

To finish off, Alex explains how Windows VMs can be supported – for now through third-party software – and summarises the ways in which we can, in fact, create PTP ecosystems that incorporate virtual machines.

Watch now!
Download the slides

Alex Vainman Alex Vainman
Senior Staff Engineer,
Mellanox Technologies

Video: Super Resolution – The scaler of tomorrow, here today!

If we ever had a time when most displays were the same resolution, those days are long gone with smartphone and tablets with extremely high pixel density nestled in with laptop screens of various resolutions and 1080-line TVs which are gradually being replaced with UHD variants. This means that HD videos are nearly always being upscaled which makes ‘getting upscaling right’ a really worthwhile topic. The well-known basic up/downscaling algorithms have been around for a while, and even the best-performing Lanczos is well over 20 years old. The ‘new kid on the block’ isn’t another algorithm, it’s a whole technique of inferring better upscaling using machine learning called ‘super resolution’.

Nick Chadwick from Mux has been running the code and the numbers to see how well super resolution works. Taking to the stage at Demuxed SF, he starts by looking at where scaling is used and what type it is. The most common algorithms are nearest neighbour, bi-cubic, bi-linear and lanczos with nearest neighbour being the most basic and least-well performing. Nick shows, using VMAF that using these for up and downscaling, that the traditional opinions of how well these algorithms perform are valid. He then introduces some test videos which are designed to let you see whether your video path is using bi-linear or bi-cubic upscaling, presenting his results of when bi-cubic can be seen (Safari on a MacBook Pro) as opposed to bi-linear (Chrome on a MacBook Pro). The test videos are available here.

In the next part of the talk, Nick digs a little deeper into how super resolution works and how he tested ffmpeg’s implementation of super resolution. Though he hit some difficulties in using this young filter, he is able to present some videos and shows that they are, indeed, “better to view” meaning that the text looks sharper and is easier to see with details being more easy pick out. It’s certainly possible to see some extra speckling introduced by the process, but VMAF score is around 10 points higher matching with the subjective experience.

The downsides are a very significant increase in computational power needed which limits its use in live applications plus there is a need for good, if not very good, understanding of ML concepts and coding. And, of course, it wouldn’t be the online streaming community if clients weren’t already being developed to do super-resolution on the decode despite most devices not being practically capable of it. So Nick finishes off his talk discussing what’s in progress and papers relating to the implementation of super resolution and what it can borrow from other developing technologies.

Watch now!

Nick Chadwick Nick Chadwick
Software Engineer,

Video: Advanced Video Coding Standards AVC

Whilst the encoding landscape is shifting, AVC (AKA H.264) still dominates many areas of video distribution so, for many, understanding what’s under the hood opens up a whole realm of diagnostics and fault finding that wouldn’t be possible without. Whilst many understand that MPEG video is built around I, B and P frames, this short talk offers deeper details which helps how it behaves both when it’s working well and otherwise.

Christian Timmerer, co-founder of Bitmovin, starts his lesson on AVC with the summary of improvements in AVC over the basic MPEG 2 model people tend to learn as a foundation. Improvements such as variable block size motion compensation, multiple reference frames and improved adaptive entropy coding. We see that, as we would expect the input can use 4:2:0 or 4:2:2 chroma sub-sampling as well as full 4:4:4 representation with 16×16 macroblocks for luminance (8×8 for chroma in 4:2:0). AVC can handle Pictures split into several slices which are self-contained sequences of macroblocks. Slices themselves can then be grouped.

Intra-prediction is the next topic where by an algorithm uses the information within the slice to predict a macroblock. This prediction is then subtracted from the actual block and coded thereby reducing the amount of data that needs to be transferred. The decoder can make the same prediction and reconstruct the full block from the data provided.

The next sections talk about motion prediction and the different sizes of macroblocks. A macroblock is a fixed area on the picture which can be described by a mixture of some basic patterns but the more complex the texture in the block, the more patterns need to be combined to recreate it. By splitting up the 16×16 block, we can often find a simpler way to describe the 8×8 or 8×16 shapes than if they had to encompass a whole 16×16 block.


B-frames are fairly well understood by many, but even if they are unfamiliar to you, Christian explains the concept whereby B-frames provide solely motion information of macroblocks both from frames before and after. This allows macroblocks which barely change to be ‘moved around the screen’ so to speak with minimal changes other than location. Whilst P and I frames provide new macroblocks, B-frames are intended just to provide this directional information. Christian explains some of the nuances of B-frame encoding including weighted prediction.

Quantisation is one of the most important parts of the MPEG process since quantisation is the process by which information is removed and the codec becomes lossy. Thus the way this happens, and the optimisations possible are key so Christian covers the way this happens before explaining the deblocking filter available. After splitting the picture up into so many macroblocks which are independently processed, edges between the blocks can become apparent so this filter helps smooth any artefacts to make them more pleasing to the eye. Christian finishes talking about AVC by exploring entropy encoding and thinking about how AVC encoding can and can’t be improved by adding more memory and computation to the encoder.

Watch now!

Christian Timmerer Christian Timmerer
CIO & Cofounder, Bitmovin
Associate Professor, Universität Klagenfurt