AES3 Archives – The Broadcast Knowledge

Video: RAVENNA AM824 & SMPTE ST 2110-31 Applications

Posted on 24th November 2020 by Russell Trafford-Jones

Audio has a long heritage in IP compared to video, so there’s plenty of overlap and there are edge cases abound when working between RAVENNA, AES67 and SMPTE ST 2110-30 and -31. SMPTE’s 2110 suite of standards currently holds two methods of carrying audio including a way of carrying encoded audio such as Dolby AC4 and Dolby E.

RAVENNA Evangelist Andreas Hildebrand is joined by Dolby Labs architect James Cowdrey to discuss the compatibility of -30 and -31 with AES67 and how non-PCM data can be carried in -31 whether that be lightly compressed audio, object audio for immersive experiences or even just pure metadata.

Andreas starts by revising the key differences between AES67 and RAVENNA. The core of AES67 fits neatly within RAVENNA’s capabilities including the transport of up to 24-bit linear PCM with 48 samples per packet and up to 8 channels of 48kHz audio. RAVENNA offers more sample rates, more channels and adds discovery and redundancy with modes such as ‘MADI’ and ‘High performance’ which help constrain and select the relevant parameters.

SMPTE ST 2110-30 is based on AES67 but adds its own constraints such that any -30 stream can be received by an AES67 decoder, however, an AES67 sender needs to be aware of -30’s constraints for it to be correctly decoded by a -30 receiver. Andreas says that all AES67 senders now have this capability.

Source: ALC NetworX

In contrast to 2110-30, 2110-31 is all about AES3 and the ability of AES3 to carry both linear PCM and non-PCM data. We look at the structure of the AES3 which contains audio blocks each of which has 192 Frames. These frames are split into 2, in the case of stereo, 64 in the case of MADI. Within each of these subframes, we finally find the preamble and the 24-bit data. Andreas explains how this is linked to AM824 and the SDP details needed.

James Cowdery leads the second part of today’s talk first talking about SMPTE ST 337 which details how to send non-PCM audio and data in an AES3 serial digital audio interface. It can carry AC-3, AC-4 for object audio delivering immersive audio experiences, Dolby E and also the metadata standards KLV and Serial ADM.

‘Why use Dolby E?’ asks James. Dolby E has a number of advantages although as bandwidth has become more available, it is increasingly replaced by uncompressed audio. However legacy workflows may now be reliant on IP infrastructure between the receiver and decoder, so it’s important to be able to carry it. Dolby E also packs a whole set of surround sound within a single data stream removing any problems of relative phase and can be carried over MPEG-2 transport streams so it still has plenty of flexibility and uses cases.

Its strength can bring fragility and one way which you can destroy a Dolby E feed is by switching between two videos containing Dolby E in the middle of the data rather than waiting for the gap between packets which is called the guardband. Dolby E needs to be aligned to the video so that you can crossfade and switch between videos without breaking the audio. James makes the point that one reason to use -31 and not -30 to carry Dolby E, or any other non-PCM data, is that -30 assumes that a sample rate converter can be used and so there is usually little control over when an SRC is brought in to use. A sample rate converter, of course, would destroy any non-PCM data.

RAVENNA 824 and 2110-31 gateways will preserver the line position of Dolby data. Can support Dolby E transport can therefore be supported by a vendor without Dolby support. James notes that your Dolby E packets need to be 125 microseconds to achieve packet-level switching without missing a guardband and corrupting data.

Immersive audio requires metadata. sADM is an open specification for metadata interchange, the aim of which is to help interoperability between vendors. sADM metadata can be embedded in SDI, transported uncompressed as SMPTE 302 in MPEG-2 Transport Streams and for 2110, is carried in -31. It’s based on XML description of metadata from the Audio Definition Model and James advises using the GZip compression mode to reduce the bitrate as it can be sent per-frame. An alternative metadata standard is SMPTE ST 336 which is an open format providing a binary payload which makes it a lower-latency method for sending Metadata. These methods of sending metadata made sense in the past, but now, with SMPTE ST 2110 having its own section for metadata essences, we see 2110-41 taking shape to allow data like this to be carried on its own.

Watch now!
Speakers

	James Cowdery Senior Staff Architect Dolby Laboratories
	Andreas Hildebrand RAVENNA Evangelist, ALC NetworX

Video: AES67 & ST 2110 Deeper Dive – The Audio Files

Posted on 22nd May 2020 by Russell Trafford-Jones

A deeper dive here, in the continuing series of videos looking at AES67, SMPTE ST 2110 and Ravenna. Andreas Hildebrand from ALC Networx is back to investigate the next level down on how AES67 and ST 2110 operate and how they can be configured. The talk, however, remains accessible throughout and starts with an reminder of what AES67 is and why it exists. This is was also covered in his first talk.

After explaining the AES67 was created as a way for multiple audio-over-IP standards to interoperate, Andreas looks at the stack, stepping through it to explain each element. The first topic is timing. He explains that every device on the AES67 network is not only governed by PTP, but it’s also runs its own clock which is called the Local Clock. From the Local Clock, the device then also creates a Media Clock which is based on the Local Clock time but is used to crate any frequency needed for the media (48KHz, for instance). Finally an RTP clock is kept for transmission over the network.

https://www.youtube.com/watch?v=3V54wsTlU5Y

AES67 & ST 2110 Deeper Dive – The Audio Files (https://www.youtube.com/watch?v=3V54wsTlU5Y)

The next item featured on the stack is encoding. AES67 is baed on linear audio, also known as PCM. AES67 ensures that 48KHz, 16 & 24-bit audio is supported on all devices and allows up to 8 channels per stream. Importantly, Andreas explains the different versions of packet time which are supported, 1ms being mandatory which allows 48 samples of 48 KHz audio into teach IP packet.

SDP – Session Description Protocol is next which describes in a simple text file what’s in the AES67 stream giving its configuration. Then Andreas looks at what Link Offset is and examines its role in determining latency and the types of latency it’s been made to compensate for. He then talks you through working out what latency setting you need to use including taking into account the number of switches in a network and our frame size.

SMPTE ST 2110 is the focus for the last part of the talk. This, Andreas explains, is a way of moving, typically uncompressed, professional media (also known as essences) around a network for live production with very low latency. It sends audio separately to the video and uses AES67 to do so. This is defined in standard ST 2110-30. However, there are some important configurations for AES67 which are mandated in order to be compatible which Andreas explains. One example is forcing all devices to be slave only, another is setting the RTP clock offset to zero. Andreas finishes the talk by summarising what parts of ST 2110 and AES67 overlap including discussing the frame sizes supported.

Watch now!
Download the presentation
Speaker

Andreas Hildebrand
Senior Product Manager,
ALC NetworX Gmbh.

Video: SMPTE Technical Primers

Posted on 28th November 2019 by Russell Trafford-Jones

The Broadcast Knowledge exists to help individuals up-skill whatever your starting point. Videos like this are far too rare giving an introduction to a large number of topics. For those starting out or who need to revise a topic, this really hits the mark particularly as there are many new topics.

John Mailhot takes the lead on SMPTE 2110 explaining that it’s built on separate media (essence) flows. He covers how synchronisation is maintained and also gives an overview of the many parts of the SMPTE ST 2110 suite. He talks in more detail about the audio and metadata parts of the standard suite.

Eric Gsell discusses digital archiving and the considerations which come with deciding what formats to use. He explains colour space, the CIE model and the colour spaces we use such as 709, 2100 and P3 before turning to file formats. With the advent of HDR video and displays which can show bright video, Eric takes some time to explain why this could represent a problem for visual health as we don’t fully understand how the displays and the eye interact with this type of material. He finishes off by explaining the different ways of measuring the light output of displays and their standardisation.

Yvonne Thomas talks about the cloud starting by explaining the different between platform as a service (PaaS), infrastructure as a service (IaaS) and similar cloud terms. As cloud migrations are forecast to grow significantly, Yvonne looks at the drivers behind this and the benefits that it can bring when used in the right way. Using the cloud, Yvonne shows, can be an opportunity for improving workflows and adding more feedback and iterative refinement into your products and infrastructure.

Looking at video deployments in the cloud, Yvonne introduces video codecs AV1 and VVC both, in their own way, successors to HEVC/h.265 as well as the two transport protocols SRT and RIST which exist to reliably send video with low latency over lossy networks such as the internet. To learn more about these protocols, check out this popular talk on RIST by Merrick Ackermans and this SRT Overview.

Rounding off the primer is Linda Gedemer from Source Sound VR who introduces immersive audio, measuring sound output (SPL) from speakers and looking at the interesting problem of forward speakers in cinemas. The have long been behind the screen which has meant the screens have to be perforated to let the sound through which interferes with the sound itself. Now that cinema screens are changing to be solid screens, not completely dissimilar to large outdoor video displays, the speakers are having to move but now with them out of the line of sight, how can we keep the sound in the right place for the audience?

This video is a great summary of many of the key challenges in the industry and works well for beginners and those who just need to keep up.

Watch now!
Speakers

	John Mailhot Systems Architect for IP Convergence, Imagine Communications
	Eric Gsell Staff Engineer, Dolby Laboratories
	Linda Gedemer, PhD Technical Director, VR Audio Evangelist Source Sound VR
	Yvonne Thomas Strategic Technologist Digital TV Group

Video: The Audio Parts of ST 2110 Explained

Posted on 21st December 2018 by Russell Trafford-Jones

At the IBC 2018 IP Showcase, Andreas Hildebrand explains how AES67 and 2110 work together and how technologies like Dante, RAVENNA and Livewire fit in.

While there are lots of resources for working with 2110 video, but this is one of the few which tackles Audio. Andreas covers one of the ‘gotchas’ in 2110 – the compatability requirements for AES within the standard. He then looks at the timing requirements of 2110 and how they differ to those of AES67 and finally discusses AES3 while explaining the ST 2110-31 standard.

Presenter

Andreas Hildebrand
Senior Product Manager and Evangelist for the RAVENNA technology developed by ALC NetworX, Germany,

Subscribe to get daily updates