Video: RAVENNA AM824 & SMPTE ST 2110-31 Applications



Audio has a long heritage in IP compared to video, so there’s plenty of overlap, and edge cases abound when working between RAVENNA, AES67 and SMPTE ST 2110-30 and -31. SMPTE’s 2110 suite of standards currently holds two methods of carrying audio, including a way of carrying encoded audio such as Dolby AC-4 and Dolby E.

RAVENNA Evangelist Andreas Hildebrand is joined by Dolby Labs architect James Cowdery to discuss the compatibility of -30 and -31 with AES67 and how non-PCM data can be carried in -31, whether that be lightly compressed audio, object audio for immersive experiences or even just pure metadata.

Andreas starts by revising the key differences between AES67 and RAVENNA. The core of AES67 fits neatly within RAVENNA’s capabilities including the transport of up to 24-bit linear PCM with 48 samples per packet and up to 8 channels of 48kHz audio. RAVENNA offers more sample rates, more channels and adds discovery and redundancy with modes such as ‘MADI’ and ‘High performance’ which help constrain and select the relevant parameters.
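To put numbers on that AES67 core, here is a minimal sketch that works out the packet time and audio payload size for the figures quoted above. The function name and defaults are our own for illustration, and the payload figure ignores RTP, UDP and IP headers.

```python
def aes67_packet(sample_rate_hz=48_000, samples_per_packet=48,
                 channels=8, bits_per_sample=24):
    """Return (packet_time_ms, payload_bytes) for a linear PCM stream."""
    # Time covered by one packet: samples divided by the sample rate.
    packet_time_ms = 1000 * samples_per_packet / sample_rate_hz
    # Audio bytes per packet: every channel contributes one sample per sample period.
    payload_bytes = samples_per_packet * channels * bits_per_sample // 8
    return packet_time_ms, payload_bytes


packet_time_ms, payload_bytes = aes67_packet()
print(f"{packet_time_ms} ms per packet, {payload_bytes} bytes of audio")
```

With 48 samples per packet at 48kHz, each packet covers one millisecond of audio.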

SMPTE ST 2110-30 is based on AES67 but adds its own constraints such that any -30 stream can be received by an AES67 decoder. However, an AES67 sender needs to be aware of -30’s constraints for its stream to be correctly decoded by a -30 receiver. Andreas says that all AES67 senders now have this capability.


In contrast to 2110-30, 2110-31 is all about AES3 and the ability of AES3 to carry both linear PCM and non-PCM data. We look at the structure of AES3, which contains audio blocks, each of which has 192 frames. These frames are split into subframes: 2 in the case of stereo, 64 in the case of MADI. Within each of these subframes, we finally find the preamble and the 24-bit data. Andreas explains how this is linked to AM824 and the SDP details needed.
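The arithmetic behind that structure can be sketched as follows. The 192 frames and 24-bit data come from the description above; the 32 bits per AM824 subframe (8 label bits plus 24 data bits) is our understanding of the format rather than something stated in the talk summary, so treat it as an assumption.

```python
FRAMES_PER_BLOCK = 192   # frames in one AES3 audio block
DATA_BITS = 24           # data bits carried in each subframe
AM824_BITS = 32          # assumption: 8 label bits + 24 data bits per subframe


def block_duration_ms(sample_rate_hz=48_000):
    """Duration of one AES3 block; one frame is sent per sample period."""
    return 1000 * FRAMES_PER_BLOCK / sample_rate_hz


def am824_payload_bytes(samples_per_packet, subframes_per_frame=2):
    """Payload bytes when every subframe occupies one 32-bit AM824 word."""
    return samples_per_packet * subframes_per_frame * AM824_BITS // 8


print(block_duration_ms())        # block length in milliseconds at 48kHz
print(am824_payload_bytes(48))    # stereo payload for a 48-sample packet
```

Under these assumptions a stereo AM824 packet is a third larger than the same audio sent as 24-bit linear PCM, which is the price of carrying the extra AES3 bits alongside the data.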

James Cowdery leads the second part of today’s talk, starting with SMPTE ST 337 which details how to send non-PCM audio and data in an AES3 serial digital audio interface. It can carry AC-3, AC-4 for object audio delivering immersive audio experiences, Dolby E and also the metadata standards KLV and Serial ADM.

‘Why use Dolby E?’ asks James. Dolby E has a number of advantages although, as bandwidth has become more available, it is increasingly replaced by uncompressed audio. However, legacy workflows may now be reliant on IP infrastructure between the receiver and decoder, so it’s important to be able to carry it. Dolby E also packs a whole set of surround sound within a single data stream, removing any problems of relative phase, and can be carried over MPEG-2 transport streams so it still has plenty of flexibility and use cases.

Its strength can bring fragility, and one way in which you can destroy a Dolby E feed is by switching between two videos containing Dolby E in the middle of the data rather than waiting for the gap between packets, which is called the guardband. Dolby E needs to be aligned to the video so that you can crossfade and switch between videos without breaking the audio. James makes the point that one reason to use -31 and not -30 to carry Dolby E, or any other non-PCM data, is that -30 assumes that a sample rate converter can be used and so there is usually little control over when an SRC is brought into use. A sample rate converter, of course, would destroy any non-PCM data.

RAVENNA AM824 and 2110-31 gateways will preserve the line position of Dolby data. Dolby E transport can therefore be supported by a vendor without Dolby support. James notes that your packets carrying Dolby E need a packet time of 125 microseconds to achieve packet-level switching without missing a guardband and corrupting data.
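Reading the 125 microsecond figure as a packet time (an assumption on our part), a quick sketch shows how few samples that leaves in each packet compared with a one-millisecond packet. The function name is hypothetical.

```python
def samples_per_packet(packet_time_us, sample_rate_hz=48_000):
    """Number of sample periods covered by one packet of the given duration."""
    return round(packet_time_us * sample_rate_hz / 1_000_000)


# A switch made on a packet boundary can only land on multiples of the
# packet time, so shorter packets give finer switching granularity.
for packet_time_us in (125, 1000):
    print(packet_time_us, "us ->", samples_per_packet(packet_time_us), "samples")
```

The shorter the packet, the more candidate switching points there are, which makes it easier for one of them to fall inside the guardband.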

Immersive audio requires metadata. sADM is an open specification for metadata interchange, the aim of which is to help interoperability between vendors. sADM metadata can be embedded in SDI, transported uncompressed as SMPTE 302 in MPEG-2 Transport Streams and, for 2110, is carried in -31. It’s based on an XML description of metadata from the Audio Definition Model, and James advises using the GZip compression mode to reduce the bitrate as it can be sent per-frame. An alternative metadata standard is SMPTE ST 336 (KLV), which is an open format providing a binary payload, making it a lower-latency method for sending metadata. These methods of sending metadata made sense in the past, but now, with SMPTE ST 2110 having its own section for metadata essences, we see 2110-41 taking shape to allow data like this to be carried on its own.

Watch now!
Speakers

James Cowdery
Senior Staff Architect
Dolby Laboratories
Andreas Hildebrand
RAVENNA Evangelist,
ALC NetworX

Video: AES67 over WAN

Deeply embedded in the audio industry and adopted into SMPTE ST 2110, AES67 workflows surround us. Increasingly our workflows are in multiple locations so moving AES67 on the WAN and the internet is essential. If networks were always perfect, this would be easy but as that’s not the case, this RAVENNA talk examines what the problems are and how to solve them.

Andreas Hildebrand introduces the video with an examination of how the WAN, whether that’s a company’s managed wide area network or the internet at large, is different from a LAN. Typical issues are packet loss, varying latency meaning the packets arrive with jitter, lack of PTP and multicast. With this in mind, Nicolas Sturmel from Merging Technologies takes the reins to examine the solutions.

Nicolas explains that EBU Tech 3326 (also known as ACIP) is typically used for WAN contribution. It specifies how a sender and receiver communicate and the codecs to be used. Although PCM is available, many codecs such as AptX are also prescribed for use. Nicolas says that ACIP is great for most applications but if you need low latency, precise timing and PCM quality, staying with AES67 may be the best policy, even over the WAN.

Having identified your AES67-over-WAN workflow, the question is how to pull it off. Nicolas looks at three methods. The first is FEC, whereby you are constantly sending redundant data. FEC can send up to around 25% extra data so that if any is lost, the extra information sent can be leveraged to determine the lost values and reconstruct the stream. This can work well but requires sending this extra data constantly, therefore putting up your bandwidth. It can also only deal with certain losses, requiring them to be of a short duration.
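To illustrate the principle, here is a minimal sketch of the simplest kind of FEC: one XOR parity packet protecting a group of four data packets, which works out at 25% extra data. This is a teaching example with hypothetical function names, not the specific scheme discussed in the talk, and real FEC schemes arrange their protection differently.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Build one parity packet protecting a group of equal-length packets."""
    return reduce(xor_bytes, packets)


def recover(received, parity):
    """Repair a group in which exactly one packet (marked None) was lost."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one loss per group")
    present = [p for p in received if p is not None]
    repaired = list(received)
    # XORing the parity with every surviving packet leaves the lost one.
    repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


group = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = make_parity(group)
print(recover([group[0], None, group[2], group[3]], parity) == group)
```

The `ValueError` branch shows the limitation mentioned above: lose two packets from the same group and this scheme has nothing left to work with.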

Instead of FEC, you can use RIST, SRT or a similar re-transmission technology. These will actively recover any lost packets and have the benefit that you only transmit more data when you have lost data. Lastly, he mentions SMPTE ST 2022-7 which uses two paths of identical data to cover losses in any one of them. Although this is 100% extra data, the benefit is that it can deal with any type of loss including a complete path failure, which neither of the others can do. It is, however, possible to combine FEC or RIST with a 2022-7 workflow so you can have two levels of protection.
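The idea behind the two-path approach can be sketched in a few lines: the receiver takes whichever copy of each packet turns up and orders the result by sequence number. This is a simplified illustration with a hypothetical function name. It ignores sequence number wraparound and the real-time buffering a practical receiver needs.

```python
def merge_paths(path_a, path_b):
    """Merge two lists of (sequence_number, payload); the first copy seen wins."""
    seen = {}
    for seq, payload in list(path_a) + list(path_b):
        seen.setdefault(seq, payload)
    # Hand packets on in sequence order, whichever path delivered them.
    return [seen[seq] for seq in sorted(seen)]


path_a = [(1, "p1"), (3, "p3"), (4, "p4")]   # packet 2 lost on path A
path_b = [(1, "p1"), (2, "p2"), (4, "p4")]   # packet 3 lost on path B
print(merge_paths(path_a, path_b))
```

As long as a given packet is not lost on both paths, the merged output is complete. If one path fails entirely, the other simply supplies every packet.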

Timing over the WAN is not ideal as PTP loses accuracy over long-latency links and it assumes symmetry. On the internet, it’s possible to get links where the latency is longer in one direction than the other. An easy, though potentially costly, workaround for distributing PTP over the WAN is to use GPS, GLONASS or similar to synchronise grandmaster clocks at each location.
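The effect of asymmetry can be seen in a small simulation of a two-way time exchange. The offset formula is the usual two-way estimate, which assumes equal delay in each direction. The variable names, the one-millisecond turnaround and the delay values are our own, and all times are in microseconds.

```python
def ptp_offset(t1, t2, t3, t4):
    """Estimate the device's clock offset, assuming a symmetric path."""
    return ((t2 - t1) - (t4 - t3)) / 2


def simulate(true_offset, delay_to_device, delay_to_grandmaster):
    """Return the offset a device would calculate over the given path delays."""
    t1 = 0                                           # grandmaster sends (grandmaster clock)
    t2 = t1 + delay_to_device + true_offset          # device receives (device clock)
    t3 = t2 + 1000                                   # device replies 1 ms later (device clock)
    t4 = t3 - true_offset + delay_to_grandmaster     # grandmaster receives (grandmaster clock)
    return ptp_offset(t1, t2, t3, t4)


print(simulate(100, 5000, 5000))    # symmetric link: estimate matches the true offset
print(simulate(100, 5000, 25000))   # asymmetric link: estimate is wrong
```

The error in the estimate is half the difference between the two path delays, so a link that is 20 milliseconds slower in one direction misleads the device by 10 milliseconds, far outside the microsecond-level accuracy audio needs.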

Watch now!
Speakers

Nicolas Sturmel
Product Manager & Senior Technologist
Merging Technologies
Andreas Hildebrand
RAVENNA Evangelist,
ALC NetworX

Video: Timing Requirements in Broadcast Applications

How does timing for AES67 and SMPTE ST 2110-30 work? All is revealed in this short video by Andreas Hildebrand who explains why we need PTP timing and how we relate the absolute time to the signals themselves.

In a network for audio streams, Andreas starts, we want all the streams to run at their native sample rate and use the same clock, but we also want the possibility of multiple concurrent streams using different sample rates. It’s also important to have a deterministic end-to-end latency and that, when streams arrive, they are suitably aligned. We achieve all of this by distributing time around the system. Audio has very high accuracy requirements of down to within 10 microseconds for typical 48kHz broadcast signals, but AES11 requires within 1 microsecond, which is why the Precision Time Protocol (PTP), defined by the standard IEEE 1588, is used. For more information on PTP, check out our PTP back library.

End devices run their own local clocks, synchronised to the PTP on the network. In charge of it all, there is a grandmaster locked to GPS which can then distribute to other secondary clocks which feed the end devices. The end device can generate a media clock from the PTP and by using PTP, different facilities can be kept in time with each other. All media is then timestamped with the time when they were generated. For advice on architecting PTP, have a listen to this talk from Arista’s Gerard Phillips.

RTP is used to carry professional media streams like AES67. RTP builds on top of UDP to add the critical timing information we need, namely the timestamp but also the sequence number. Andreas looks at the structure of the RTP packet header to see where the timestamp and identifiers go. To follow up on the IT basics underpinning AES67 and SMPTE ST 2110, check out Ed Calverley’s presentation on the topic.
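As a rough illustration of where those fields sit, the sketch below unpacks the fixed 12-byte RTP header as laid out in RFC 3550. It skips CSRC entries and header extensions, and the function name and example values are ours.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte RTP header at the start of a packet."""
    if len(packet) < 12:
        raise ValueError("an RTP packet has at least a 12-byte header")
    # Two flag bytes, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC,
    # all in network (big-endian) byte order.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


example = struct.pack("!BBHII", 0x80, 97, 1234, 480000, 0xDEADBEEF)
print(parse_rtp_header(example))
```

The sequence number lets a receiver spot lost or reordered packets, while the timestamp says where the packet's first sample belongs on the media clock.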

‘Profiles’ are required to link the time of day to media flows – to give the time some meaning in terms of the expected signal. The AES67 Media Profile does this for AES67 as an annexe in the standard. SMPTE use ST 2059 to define how to use AES67 as well as all the other essences it supports and relate them all back to an originating epoch time in 1970.
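A simplified sketch of that relationship: count the samples that have elapsed since the epoch at the media clock rate, then keep the lowest 32 bits to fit the RTP timestamp field. The function and its `offset` parameter are hypothetical, and details such as leap seconds are left out entirely.

```python
def rtp_timestamp(ptp_seconds, ptp_nanoseconds=0, sample_rate_hz=48_000, offset=0):
    """Map a time since the epoch to a 32-bit RTP timestamp at the media clock rate."""
    total_ns = ptp_seconds * 1_000_000_000 + ptp_nanoseconds
    # Whole samples elapsed since the epoch, using integer maths to avoid rounding.
    samples_since_epoch = total_ns * sample_rate_hz // 1_000_000_000
    # The RTP timestamp field is 32 bits wide, so the count wraps around.
    return (samples_since_epoch + offset) % 2**32


print(rtp_timestamp(1))              # one second after the epoch
print(rtp_timestamp(0, 1_000_000))   # one millisecond after the epoch
```

At 48kHz the 32-bit counter wraps roughly once a day, which is why a receiver needs the absolute time from PTP, not just the RTP timestamp, to place audio correctly.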

The talk finishes by looking at the overlap in timing specs for AES67 and ST 2110-30 (AES67 for 2110). For more information on how AES67 and ST 2110 work (and don’t work) together, watch Andreas’s ‘Deeper dive’ on the topic.

Watch now!
Speakers

Andreas Hildebrand
RAVENNA Evangelist
ALC NetworX

Video: AES67/SMPTE ST 2110 Audio Transport & Routing (NMOS IS-08)

Let’s face it, SMPTE ST 2110 isn’t trivial to get up and running at scale. It carries audio as AES67, though with some restrictions which can cause problems for full interoperability with non-2110 AES67 systems. But once all of this is up and running, you’re still lacking discoverability, control and management. These aspects are covered by AMWA’s NMOS IS-04, IS-05 and IS-08 projects.

Andreas Hildebrand, Evangelist at ALC NetworX, takes the stand at the AES exhibition to explain how this can all work together. He starts by reiterating one of the main benefits of the move to 2110 over 2022-6, namely that audio devices don’t need to receive and de-embed audio. With a dependency on PTP, SMPTE ST 2110-30 and -31 define carriage of AES67 and AES3.

We take a look at IS-04 and IS-05 which define registration, discovery and configuration. Usually using an address received from DHCP, new devices on the network will put an entry into an IS-04 registry which can be queried by an API to find out what senders and receivers are available in a system. IS-05 can then use this information to create connections between devices. IS-05, Andreas explains, is able to issue a create connection request to endpoints asking them to connect. It’s up to the endpoints themselves to initiate the request as appropriate.
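As a hedged sketch of what such a connection request might look like, the snippet below builds the kind of JSON body a controller could send to a receiver. The field names reflect our reading of the IS-05 specification and should be checked against it before use. The sender ID and SDP text are placeholders, and no network call is made.

```python
import json


def build_receiver_request(sender_id: str, sdp_text: str) -> dict:
    """Assemble a connection request body asking a receiver to subscribe to a sender."""
    return {
        "sender_id": sender_id,                         # which sender to connect to
        "master_enable": True,                          # switch the receiver on
        "activation": {"mode": "activate_immediate"},   # apply the change straight away
        "transport_file": {"type": "application/sdp", "data": sdp_text},
    }


# Placeholder values for illustration only.
body = build_receiver_request("hypothetical-sender-id", "v=0\r\n")
print(json.dumps(body, indent=2))
```

The controller only describes the connection it wants. As noted above, it is the endpoint that then acts on the request.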

Once a connection has been made, there remains the problem of dealing with audio mapping. Andreas uses the example of a single stream containing multiple channels. Where a device only needs to use one or two of these, IS-08 can be used to tell the receiver which audio it should be decoding. This is ideal when delivering audio to a speaker. Andreas then walks us through worked examples.
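What the mapping achieves can be sketched without any of the API details. The snippet below is not the IS-08 API, just an illustration, with hypothetical names, of picking a couple of channels out of a multichannel stream for a device that only needs two.

```python
def apply_channel_map(frame, channel_map):
    """Build output samples from one multichannel frame.

    frame: one sample per input channel.
    channel_map: for each output channel, the input channel index, or None to mute.
    """
    return [0 if src is None else frame[src] for src in channel_map]


eight_channel_frame = [10, 20, 30, 40, 50, 60, 70, 80]
# A stereo speaker that should play input channels 3 and 4 (indices 2 and 3).
print(apply_channel_map(eight_channel_frame, [2, 3]))
```

The stream on the network is unchanged. Only the receiver's view of which channels matter is different.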

Watch now!
Speaker

Andreas Hildebrand
Ravenna Technology Evangelist,
ALC NetworX