Video: Maximise your video density with ST 2110

What can ST 2110 do for you? What problems can it solve? These questions and more are tackled in this video from BBright and Matrox.

Guillaume Arthuis from BBright kicks off the video by highlighting that SMPTE ST 2110 sends all media as separate streams. Known as essences, the components of a signal, such as metadata, audio and video, are each delivered separately. For a device which looks at subtitling, this saves having to receive a 3Gb/s stream just to get a few kbps of data. Sending of the video has also been improved as no blanking data is sent, which can bring bandwidth savings of up to 30% depending on the video format. It shouldn’t be forgotten that network cables are bi-directional and typically can carry many streams. This means the number of cables in a facility can be greatly reduced.
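As a rough illustration of where the blanking saving comes from, the sketch below compares the bit rate of the active picture with the nominal SDI line rate. It assumes 10-bit 4:2:2 video (20 bits per pixel) and ignores RTP/UDP/IP packet overhead, so real-world savings will be somewhat lower. The function names are illustrative only.

```python
def active_video_bps(width, height, fps, bits_per_pixel=20):
    """Bit rate of the active picture only (no blanking), in bits per second."""
    return width * height * fps * bits_per_pixel


def blanking_saving(sdi_bps, width, height, fps, bits_per_pixel=20):
    """Fraction of the SDI line rate that is not active picture."""
    return 1 - active_video_bps(width, height, fps, bits_per_pixel) / sdi_bps


# 1080p50 on a nominal 2.97 Gb/s 3G-SDI link
saving_1080p50 = blanking_saving(2.97e9, 1920, 1080, 50)

# 1080p59.94 on a nominal 2.97/1.001 Gb/s 3G-SDI link
saving_1080p5994 = blanking_saving(2.97e9 / 1.001, 1920, 1080, 60 / 1.001)

print(f"1080p50:    {saving_1080p50:.1%}")
print(f"1080p59.94: {saving_1080p5994:.1%}")
```

Under these assumptions the 50 Hz format saves roughly 30% while the 59.94 Hz format saves roughly 16%, which is why the figure depends on the video format.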

Marwan Al-Habbal from Matrox compares the pros and cons of SDI against ST 2110. SDI has incredible interoperability, has good reliability and ‘discovery’ is not really a problem since everything is point-to-point connected with uni-directional cabling. Point-to-point connection and uni-directional cabling are, of course, downsides compared to ST 2110. Marwan looks at whether we can be confident in 2110’s reliability, discovery and connectivity. Within the standard, ‘narrow’ and ‘wide’ senders are specified. Marwan makes the point that using narrow senders everywhere will give better determinism and can avoid momentary ‘blips’ in the network. Any problems on the network can be mitigated by using ST 2022-7 seamless switching whereby two feeds are sent over the network(s) and a single stream is reassembled from the received packets. Testing is the key to interoperability. JT-NM’s testing programme is, by another name, a ‘plugfest’ whereby as many vendors as possible connect to other vendors’ equipment in order to test compatibility. This is leading to confidence that inter-vendor workflows are generally achievable.
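The idea behind seamless switching can be sketched in a few lines: the receiver takes both copies of the stream and keeps the first copy it sees of each sequence number. This is a simplified illustration, not an implementation of ST 2022-7. It ignores sequence-number wraparound and the real-time buffering a receiver needs, and the `(sequence, payload)` tuples are a stand-in for RTP packets.

```python
def merge_redundant(path_a, path_b):
    """Rebuild one stream from two copies of it.

    Each path is an iterable of (sequence_number, payload) tuples.
    The first copy seen of each sequence number is kept.
    """
    seen = {}
    for seq, payload in list(path_a) + list(path_b):
        seen.setdefault(seq, payload)
    # Emit payloads in sequence order
    return [seen[seq] for seq in sorted(seen)]


# Path A loses packet 2, path B loses packet 4
path_a = [(1, "p1"), (3, "p3"), (4, "p4"), (5, "p5")]
path_b = [(1, "p1"), (2, "p2"), (3, "p3"), (5, "p5")]

merged = merge_redundant(path_a, path_b)
print(merged)
```

As long as the same packet is not lost on both paths, the reassembled stream is complete.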

Another major benefit of ST 2110 is density. Guillaume takes us through calculations showing that you can implement a 512×512 router using just a 1U switch at an approximate cost of $80. He also looks at future scaling approaches. One approach outlined is to use 25G interfaces today to leave room for expansion but the other is to implement JPEG XS running ST 2110-22. This is a relatively new standard which brings in the ability to use compressed video in 2110 for the first time. This would allow ‘HD’ bitrates for low-latency UHD streams.
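The scaling argument can be sketched as a count of how many streams fit on one link. This is not the speaker’s calculation: the 90% usable-capacity margin and the 4:1 JPEG XS compression ratio are assumptions chosen for illustration, the video is assumed to be 10-bit 4:2:2, and packet overhead is ignored.

```python
def streams_per_link(link_bps, stream_bps, usable_fraction=0.9):
    """Whole number of streams that fit in the usable part of a link."""
    return int((link_bps * usable_fraction) // stream_bps)


HD_1080P50_BPS = 1920 * 1080 * 50 * 20    # uncompressed active video
UHD_2160P50_BPS = 3840 * 2160 * 50 * 20   # uncompressed active video
UHD_JPEG_XS_BPS = UHD_2160P50_BPS // 4    # assumed 4:1 compression

LINK_25G = 25e9

hd_count = streams_per_link(LINK_25G, HD_1080P50_BPS)
uhd_count = streams_per_link(LINK_25G, UHD_2160P50_BPS)
uhd_xs_count = streams_per_link(LINK_25G, UHD_JPEG_XS_BPS)

print(hd_count, uhd_count, uhd_xs_count)
```

With these assumed figures a compressed UHD stream occupies the same bandwidth as an uncompressed HD stream, so the link carries as many of one as of the other.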

Watch now!
Speakers

Guillaume Arthuis
President,
BBright
Marwan Al-Habbal
OEM Product Manager,
Matrox

Video: The New Video Codec Landscape – VVC, EVC, HEVC, LC-EVC, AV1 and more

The codec arena is a lot more complex than before. Gone is the world of 5 years ago with AVC doing nearly everything. Whilst AVC is still a major force, we now have AV1 and VP9 being used globally with billions of uses a year. HEVC is not the dominant force it was once expected to be, but it is now seeing significant use on iPhones and overall adoption continues to grow. And now, in 2020, we see three new codecs on the scene: VVC, EVC and LCEVC.

To help us make sense of this SMPTE has invited Walt Husak and Sean McCarthy to take us through what the current codecs are, what makes them different, how well they work, how to compare them and what the future roadmaps hold.

Sean starts by explaining which codecs are maintained by which bodies, with the IEC, ITU and MPEG being involved, not to mention the corporate codecs (VP8 and VP9 from Google) and the Chinese AVS series of codecs. Sean explains that these share major common elements and that each is an evolution of those that came before. But why are all these codecs needed? Next, we see the use-cases that have brought these codecs into existence. Granted, AVC and HEVC entered the scene to reduce bitrate in an effort to make HD and UHD practical, respectively, but EVC and LC-EVC have different aims.

Sean gives a brief overview of the basics of encoding starting with partitioning the image, predicting parts of it, applying transformations, refining it (also known as applying ‘loop filters’) and finishing with entropy coding. All of these blocks are briefly explained and exist in all the codecs covered in this talk. The evolutions which make the newer codecs better are therefore evolutions of each of these elements. For instance, explains Sean, splitting the image into different sections, known as partitioning, has become more sophisticated in recent codecs, allowing for larger sections to be considered at once but, at the same time, smaller partitions to be created within each.

All codecs have profiles whereby the tools in use, or the complexity of their implementation, is standardised for certain types of video: 8-bit, 10-bit, HDR etc. This allows hardware implementers to understand the upper bounds of computation so they don’t end up over-provisioning hardware resources and increasing the cost. Sean looks at how VVC uses the same tools throughout all of its four profiles with only a few exceptions. Screen content sees two extra tools come for 4:2:2 formats and above. AV1 has the same tools throughout all the profiles but, deliberately, EVC doesn’t. Essential Video Coding has a royalty-free base layer that uses techniques that are not subject to any use payments. Using this layer gives you AVC-quality encoding, approximately. Using the main profile, however, gets you similar to HEVC encoding albeit with royalty payments.

The next part of the talk examines two main reasons for the increase in compression over recent codec generations, block size and partitioning, before highlighting some new tools in VVC and AV1. Block size refers to the size of the blocks that an image is split up into for processing. By using a larger block, the algorithms can spot patterns more efficiently so the continued increase from 16×16 in AVC to 128×128 now in VVC drives an increase in computation but also in compression. Once you have your block, splitting it up following the features of the images is the next stage. Called partitioning, we see the number of ways that the codecs can mathematically split a block has grown significantly. VVC can also partition chroma separately to luma. VVC and AV1 also include 64 and 16 ways, respectively, to diagonally partition rather than the typical vertical and horizontal partitioning modes.
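To make the idea of partitioning concrete, here is a minimal sketch of a plain quadtree split: a block is divided into four quadrants whenever its content is not flat enough, down to a minimum size. This is a deliberate simplification. Real encoders choose splits by weighing bitrate against distortion and support many more split shapes than the four-way split shown here. The flatness test and the threshold are assumptions made for illustration.

```python
def quadtree(img, x, y, size, min_size=4, threshold=10):
    """Return a list of (x, y, size) leaf blocks covering the square at (x, y)."""
    pixels = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    # Keep the block whole if it is already small or nearly flat
    if size <= min_size or max(pixels) - min(pixels) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree(img, x + dx, y + dy, half, min_size, threshold)
    return leaves


# A flat 16x16 image with a single bright pixel in the top-left corner
img = [[0] * 16 for _ in range(16)]
img[0][0] = 255

leaves = quadtree(img, 0, 0, 16)
print(leaves)
```

Only the quadrant containing the detail is split further, so flat areas are covered by a few large blocks and detailed areas by many small ones.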

Screen content coding tools are increasingly important. Pandemics aside, there has long been growth in the amount of computer-generated content being shared online whether that’s through esports, video conference screen sharing or elsewhere. Truth be told, HEVC has support for screen-content encoding but it’s not in the main profile so many implementations don’t support it. VVC not only evolves the screen-content tools, but also makes them available by default. AV1, also, was designed to work well with screen content. Sean takes some time to look at the IBC tool, intra-block copy, which allows the encoder to relate parts of the current frame to other sections of the same frame. Working at the prediction stage, with screen content that contains, for instance, lots of text, parts of that text will look similar and, to a first approximation, one part of the image can be duplicated in another. This is similar to motion compensation where a macroblock is ‘copied’ to another frame in a different position, but all the work is done on the present frame for Intra BC. Palette mode is another screen content tool that allows the colour of a section of the image to be described as a palette of colours rather than using the full RGB value for each and every pixel.
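The principle of palette mode can be shown with a toy example: a block with only a handful of distinct colours is stored as a short list of colours plus a small index per pixel. This is a sketch of the concept only and does not follow the syntax of any particular codec.

```python
def palette_encode(block):
    """Split a block of RGB tuples into a palette and per-pixel palette indices."""
    palette = []
    indices = []
    for row in block:
        index_row = []
        for pixel in row:
            if pixel not in palette:
                palette.append(pixel)
            index_row.append(palette.index(pixel))
        indices.append(index_row)
    return palette, indices


def palette_decode(palette, indices):
    """Rebuild the block of RGB tuples from the palette and indices."""
    return [[palette[i] for i in row] for row in indices]


WHITE = (255, 255, 255)
BLACK = (0, 0, 0)

# A tiny piece of 'text-like' screen content using two colours
block = [
    [WHITE, BLACK, WHITE, WHITE],
    [WHITE, BLACK, BLACK, WHITE],
    [WHITE, BLACK, WHITE, WHITE],
]

palette, indices = palette_encode(block)
print(palette)
print(indices)
```

With two colours each pixel needs only a one-bit index instead of a full RGB value, which is why the tool suits text and graphics.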

Sean covers the scaled prediction between resolutions in VVC and super-resolution in AV1, VVC’s 360-degree video optimisations and luma mapping before handing over to Walt Husak who goes into more detail on how the newer codecs work, starting with LCEVC.

LCEVC is a codec that improves the performance of already-deployed codecs and is typically used to enhance spatial resolution. If you wanted to encode HD, the codec would downsample the HD to an SD resolution and encode that with AVC, HEVC or another codec. At the same time, it would upsample that encoded video again and generate two correction layers that correct for artefacts and add sharpness. This information is added into the base codec and sent to the decoder. This can allow a software-only enhancement to a hardware deployment, fully utilising the hardware which has already been deployed. Walt notes that the enhancement layers are much the same technology as has already been standardised by SMPTE as VC6 (ST 2117). LCEVC has been found to be computationally efficient, allowing it to address markets such as embedded devices where hardware restrictions would otherwise prohibit the use of resolutions higher than those for which the hardware was originally designed. Very low bitrate performance is also very good.
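The layered approach can be sketched on a one-dimensional signal: downsample, pass the result through a base codec, upsample the decoded base and record the difference from the original as a correction layer. This is a conceptual sketch only. It uses a single, lossless correction layer rather than the two layers described above, and coarse quantisation stands in for a real base codec such as AVC or HEVC.

```python
def downsample(signal):
    """Halve the resolution by averaging neighbouring pairs (even length assumed)."""
    return [(signal[i] + signal[i + 1]) // 2 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Double the resolution by repeating each sample."""
    return [s for s in signal for _ in range(2)]


def base_codec(signal, step=8):
    """Stand-in for a lossy base codec: coarse quantisation."""
    return [(s // step) * step for s in signal]


def encode(signal):
    base = base_codec(downsample(signal))
    predicted = upsample(base)
    residual = [s - p for s, p in zip(signal, predicted)]
    return base, residual


def decode(base, residual):
    return [p + r for p, r in zip(upsample(base), residual)]


source = [10, 12, 50, 54, 200, 180, 90, 91]
base, residual = encode(source)
restored = decode(base, residual)
print(base, residual, restored)
```

A decoder that only understands the base layer still gets a usable half-resolution signal, while one that also applies the correction layer recovers the full-resolution detail.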

Sean introduces us to his “Dos and Don’ts” of codec comparisons. The theme running through them is to take care that you are comparing like for like. Codecs can be set to run ‘fast’ or ‘slow’ each of which holds its own compromises in terms of encoding time and resulting quality. Similarly, there are some implementations that are made simply to implement the standard as rigorously as possible which is an invaluable tool when developing the codec or an implementation. Such a reference implementation for codec X, clearly, shouldn’t be compared to production implementations of a codec Y as the times are guaranteed to be very different and you will not learn anything from the process. Similarly, there are different tools that give codecs much more time to optimise known as single- and double-pass which shouldn’t be cross-compared.

The talk draws to a close with a look at codec performance. Sean shows a number of graphs showing how VVC performs against HEVC. Interestingly the metrics clearly show a 40% increase in efficiency of VVC over HEVC, but when seen in subjective tests, the ratings show a 50% improvement. VVC’s encoder is approximately 10x as complex as HEVC’s.

HEVC and AV1 perform similarly for the same bit rate. Overall, Sean says, AV1 is a little blurrier in regions of spatial detail and can have some temporal flickering. HEVC is more likely to have blocking and ringing artefacts. EVC’s main profile is up to 29% better than HEVC. LCEVC performs up to 8% better than AVC when using an AVC base layer and also slightly better than HEVC when using an HEVC base layer. Sean makes the point that AVC has been continually updated since its initial release and is now on version 27, so it’s not strictly true to simply say it’s an ‘old’ codec. HEVC similarly is on version 7. Sean runs down part of the roadmap for AVC which leads on to the use of AI in codecs.

Finishing the video, Walt looks at the use of Deep Learning in codecs. Deep learning is also known as machine learning and referred to as AI (Artificial Intelligence). For most people, these terms are interchangeable and refer to the ability of a signal to be manipulated not by a fixed equation or algorithm (such as Lanczos scaling) but by a computer that has been trained through many millions of examples to recognise what looks ‘right’ and to replicate that effect in new scenarios.

Walt talks about JPEG’s AI research on still images, which aims to complete an ‘end-to-end’ study of compression with AI tools. There’s also MPEG’s Deep Neural Network-based Video Coding which is looking at which tools within codecs can be replaced with AI. Also, recently we have seen the foundation of the MPAI (Moving Picture, Audio and Data Coding by Artificial Intelligence) organisation by Leonardo Chiariglione, an industry body devoted to the use of AI in compression. With all this activity, it’s clear that future advances in compression will be driven by the increasing use of these techniques.

The video ends with a Q&A session.

Watch now!
Find out more on SMPTE’s site
Speakers

Sean McCarthy
Director, Video Strategy and Standards,
Dolby Laboratories
Walt Husak
Director, Image Technologies,
Dolby Laboratories

Video: RAVENNA AM824 & SMPTE ST 2110-31 Applications



Audio has a long heritage in IP compared to video, so there’s plenty of overlap and edge cases abound when working between RAVENNA, AES67 and SMPTE ST 2110-30 and -31. SMPTE’s 2110 suite of standards currently holds two methods of carrying audio, including a way of carrying encoded audio such as Dolby AC-4 and Dolby E.

RAVENNA Evangelist Andreas Hildebrand is joined by Dolby Labs architect James Cowdery to discuss the compatibility of -30 and -31 with AES67 and how non-PCM data can be carried in -31 whether that be lightly compressed audio, object audio for immersive experiences or even just pure metadata.

Andreas starts by revising the key differences between AES67 and RAVENNA. The core of AES67 fits neatly within RAVENNA’s capabilities including the transport of up to 24-bit linear PCM with 48 samples per packet and up to 8 channels of 48kHz audio. RAVENNA offers more sample rates, more channels and adds discovery and redundancy with modes such as ‘MADI’ and ‘High performance’ which help constrain and select the relevant parameters.
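The figures quoted for AES67 translate directly into a packet size and packet time. The sketch below does that arithmetic for the audio payload only, leaving out RTP, UDP and IP headers. The function name is illustrative.

```python
def rtp_audio_payload(samples_per_packet, channels, bits_per_sample, sample_rate_hz):
    """Return (payload size in bytes, packet time in seconds) for linear PCM."""
    payload_bytes = samples_per_packet * channels * bits_per_sample // 8
    packet_time_s = samples_per_packet / sample_rate_hz
    return payload_bytes, packet_time_s


# 48 samples per packet, 8 channels, 24-bit, 48 kHz
payload_bytes, packet_time_s = rtp_audio_payload(48, 8, 24, 48000)
packets_per_second = round(1 / packet_time_s)
payload_bps = payload_bytes * 8 * packets_per_second

print(payload_bytes, packet_time_s, packets_per_second, payload_bps)
```

So 48 samples at 48kHz gives a 1 ms packet time, and a full 8-channel, 24-bit packet carries 1152 bytes of audio.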

SMPTE ST 2110-30 is based on AES67 but adds its own constraints such that any -30 stream can be received by an AES67 decoder. However, an AES67 sender needs to be aware of -30’s constraints for its stream to be correctly decoded by a -30 receiver. Andreas says that all AES67 senders now have this capability.


In contrast to 2110-30, 2110-31 is all about AES3 and the ability of AES3 to carry both linear PCM and non-PCM data. We look at the structure of AES3, which contains audio blocks, each of which has 192 frames. These frames are split into subframes: 2 in the case of stereo, 64 in the case of MADI. Within each of these subframes, we finally find the preamble and the 24-bit data. Andreas explains how this is linked to AM824 and the SDP details needed.
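The AES3 structure described here lends itself to a quick calculation. The sketch below covers the stereo case only and assumes the usual 32-slot subframe made up of a 4-slot preamble, 24 data bits and 4 status bits (validity, user, channel status and parity). The constant names are illustrative.

```python
PREAMBLE_SLOTS = 4
DATA_BITS = 24
STATUS_BITS = 4           # validity, user, channel status, parity
SUBFRAME_SLOTS = PREAMBLE_SLOTS + DATA_BITS + STATUS_BITS

SUBFRAMES_PER_FRAME = 2   # stereo case
FRAMES_PER_BLOCK = 192


def block_duration_s(sample_rate_hz):
    """Duration of one 192-frame block; one frame is sent per sample period."""
    return FRAMES_PER_BLOCK / sample_rate_hz


def interface_bps(sample_rate_hz):
    """Total slots per second on the interface, before channel coding."""
    return sample_rate_hz * SUBFRAMES_PER_FRAME * SUBFRAME_SLOTS


slots_per_block = FRAMES_PER_BLOCK * SUBFRAMES_PER_FRAME * SUBFRAME_SLOTS

print(SUBFRAME_SLOTS, slots_per_block)
print(block_duration_s(48000), interface_bps(48000))
```

At 48kHz a block therefore lasts 4 ms and the interface carries 3.072 Mb/s, of which 24 of every 32 slots are the audio or data word.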

James Cowdery leads the second part of today’s talk first talking about SMPTE ST 337 which details how to send non-PCM audio and data in an AES3 serial digital audio interface. It can carry AC-3, AC-4 for object audio delivering immersive audio experiences, Dolby E and also the metadata standards KLV and Serial ADM.

‘Why use Dolby E?’ asks James. Dolby E has a number of advantages although, as bandwidth has become more available, it is increasingly replaced by uncompressed audio. However, legacy workflows may now be reliant on IP infrastructure between the receiver and decoder, so it’s important to be able to carry it. Dolby E also packs a whole set of surround sound within a single data stream, removing any problems of relative phase, and can be carried over MPEG-2 transport streams so it still has plenty of flexibility and use cases.

Its strength can bring fragility and one way in which you can destroy a Dolby E feed is by switching between two videos containing Dolby E in the middle of the data rather than waiting for the gap between packets which is called the guardband. Dolby E needs to be aligned to the video so that you can crossfade and switch between videos without breaking the audio. James makes the point that one reason to use -31 and not -30 to carry Dolby E, or any other non-PCM data, is that -30 assumes that a sample rate converter can be used and so there is usually little control over when an SRC is brought into use. A sample rate converter, of course, would destroy any non-PCM data.

RAVENNA AM824 and 2110-31 gateways will preserve the line position of Dolby data. Dolby E transport can therefore be supported by a vendor without Dolby support. James notes that your Dolby E packets need to be 125 microseconds long to achieve packet-level switching without missing a guardband and corrupting data.

Immersive audio requires metadata. sADM is an open specification for metadata interchange, the aim of which is to help interoperability between vendors. sADM metadata can be embedded in SDI, transported uncompressed as SMPTE 302 in MPEG-2 Transport Streams and, for 2110, is carried in -31. It’s based on an XML description of metadata from the Audio Definition Model and James advises using the GZip compression mode to reduce the bitrate as it can be sent per-frame. An alternative metadata standard is SMPTE ST 336 which is an open format providing a binary payload which makes it a lower-latency method for sending metadata. These methods of sending metadata made sense in the past, but now, with SMPTE ST 2110 having its own section for metadata essences, we see 2110-41 taking shape to allow data like this to be carried on its own.
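The benefit of compressing XML metadata before sending it every frame is easy to demonstrate with Python’s standard `gzip` module. The XML below is a made-up, repetitive placeholder and not a real Audio Definition Model document, and the sketch says nothing about how the compressed data is framed inside the transport.

```python
import gzip


def pack_metadata(xml_text):
    """Compress an XML string with gzip."""
    return gzip.compress(xml_text.encode("utf-8"))


def unpack_metadata(blob):
    """Recover the XML string from its gzip-compressed form."""
    return gzip.decompress(blob).decode("utf-8")


# Hypothetical per-frame metadata describing 32 audio objects
xml_frame = "<frame>" + "".join(
    f'<object id="{i}" x="0.0" y="0.0" z="0.0" gain="1.0"/>' for i in range(32)
) + "</frame>"

packed = pack_metadata(xml_frame)
print(len(xml_frame.encode("utf-8")), "bytes uncompressed")
print(len(packed), "bytes compressed")
```

XML is verbose and repetitive, so it tends to compress well, which matters when the same kind of document is sent many times a second.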

Watch now!
Speakers

James Cowdery
Senior Staff Architect
Dolby Laboratories
Andreas Hildebrand
RAVENNA Evangelist,
ALC NetworX

Video: Decentralised Production Tips and Best Practices

Live sports production has seen a massive change during COVID. We looked at how this changed at the MCR recently on The Broadcast Knowledge, hearing how Sky Sports had radically changed along with Arsenal TV. This time we look to see how life in the truck has changed. The headline is that most people are staying at home, so how do you keep people at home and mix a multi-camera event?

Ken Kerschbaumer from Sports Video Group talks to VidOvation’s Jim Jachetta and James Japhet from Hawk-Eye to understand the role they’ve been playing in bringing live sports to screen where the REMI/Outside Broadcast has been pared down to the minimum and most staff are at home. The conversation starts with the backdrop of The Players Championship, part of the PGA Tour, which was produced by 28 operators in the UK who mixed 120+ camera angles and the audio to produce 25 live streams including graphics for broadcasters around the world.

Lip-sync and genlock aren’t optional when it comes to live sports. Jim explains that his equipment can do up to fifty cameras with genlock synchronisation over bonded cellular and this is how The Players worked, with a bonded cellular unit on each camera. Jim discusses how audio also has to be frame-accurate as they had many, many mics always open going back to the sound mixer at home.

James from Hawk-Eye explained that part of their decision to leave equipment on-site was due to lip-sync concerns. Their system worked differently to VidOvation’s, allowing people to ‘remote desktop’, using a Hawk-Eye-specific low-latency technology dedicated to video transport. This also works well for events where there isn’t enough connectivity to support streaming of 10, 20 or 50+ feeds to different locations from the location.

The production has to change to take account of two factors: the chance a camera’s connectivity might go down and latency. It’s important to plan shots ahead of time to account for these factors, outlining what the backup plan is, say going to a wide shot on camera 3, if camera 1 can’t be used. When working with bonded cellular, latency is an unavoidable factor and can be as high as 3 seconds. In this scenario, Jim explains it’s important to explain to the camera operators what you’re looking for in a shot and let them work more autonomously than you might traditionally do.

Latency is also very noticeable for the camera shaders who usually rack cameras with milliseconds of latency. CCUs are not used to waiting a long time for responses, so a lot of faked messages need to be sent to keep the CCU and controller happy. The shader operator then needs to get used to the latency, which won’t be as high as the video latency, and take things a little slower in order to get the job done.

Not travelling everywhere has been received fairly well by freelancers who can now book in more jobs and don’t need to suffer reduced pay for travel days. There are still people travelling to site, Jim says, but these are usually people who can drive there and who then sit in the control room behind shields. For the PGA Tour, the savings are racking up. Whilst there are a lot of other costs/losses at the moment for so many industries, it’s clear that the reduced travel and hosting will continue to be beneficial after restrictions are lifted.

Watch now!
Speakers

Jim Jachetta
EVP & CTO: Wireless Video & Cellular Uplinks
VidOvation
James Japhet
Managing Director
Hawk-Eye North America
Ken Kerschbaumer
Editorial Director,
Sports Video Group