Video: RAVENNA AM824 & SMPTE ST 2110-31 Applications



Audio has a long heritage in IP compared to video, so there’s plenty of overlap, and edge cases abound when working between RAVENNA, AES67 and SMPTE ST 2110-30 and -31. SMPTE’s 2110 suite of standards currently holds two methods of carrying audio, including a way of carrying encoded audio such as Dolby AC-4 and Dolby E.

RAVENNA Evangelist Andreas Hildebrand is joined by Dolby Labs architect James Cowdery to discuss the compatibility of -30 and -31 with AES67 and how non-PCM data can be carried in -31, whether that be lightly compressed audio, object audio for immersive experiences or even just pure metadata.

Andreas starts by revising the key differences between AES67 and RAVENNA. The core of AES67 fits neatly within RAVENNA’s capabilities including the transport of up to 24-bit linear PCM with 48 samples per packet and up to 8 channels of 48kHz audio. RAVENNA offers more sample rates, more channels and adds discovery and redundancy with modes such as ‘MADI’ and ‘High performance’ which help constrain and select the relevant parameters.
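
To put the AES67 figures above in context, here is a minimal sketch of the arithmetic for one packet of linear PCM. The function name and parameters are illustrative only, and the calculation covers the audio payload alone, not the RTP, UDP or IP headers.

```python
def pcm_packet(sample_rate_hz, samples_per_packet, channels, bits_per_sample):
    """Return (packet time in ms, audio payload size in bytes) for linear PCM."""
    packet_time_ms = 1000 * samples_per_packet / sample_rate_hz
    payload_bytes = samples_per_packet * channels * bits_per_sample // 8
    return packet_time_ms, payload_bytes


# 48 samples per packet, 8 channels of 24-bit audio at 48kHz
packet_time_ms, payload_bytes = pcm_packet(48_000, 48, 8, 24)
print(f"packet time: {packet_time_ms} ms, payload: {payload_bytes} bytes")
```

With these parameters each packet carries one millisecond of audio, and the payload stays comfortably inside a standard 1500-byte Ethernet MTU.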

SMPTE ST 2110-30 is based on AES67 but adds its own constraints such that any -30 stream can be received by an AES67 decoder. However, an AES67 sender needs to be aware of -30’s constraints for its stream to be correctly decoded by a -30 receiver. Andreas says that all AES67 senders now have this capability.


In contrast to 2110-30, 2110-31 is all about AES3 and the ability of AES3 to carry both linear PCM and non-PCM data. We look at the structure of AES3, which contains audio blocks, each of which has 192 frames. These frames are split into subframes: 2 in the case of stereo, 64 in the case of MADI. Within each of these subframes, we finally find the preamble and the 24-bit data. Andreas explains how this is linked to AM824 and the SDP details needed.
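
The block structure described above can be turned into a few numbers. This is a sketch under stated assumptions: an AES3 subframe occupies 32 time slots (preamble, 24 bits of data and the validity, user, channel status and parity bits), and each subframe is assumed to travel as one 32-bit word in the AM824 payload. The function names are made up for illustration.

```python
FRAMES_PER_BLOCK = 192   # frames in one AES3 audio block
BITS_PER_SUBFRAME = 32   # time slots in one AES3 subframe


def aes3_figures(frame_rate_hz, subframes_per_frame=2):
    """Return (interface bit rate, block duration in ms, channel status bytes per block)."""
    bit_rate = frame_rate_hz * subframes_per_frame * BITS_PER_SUBFRAME
    block_ms = 1000 * FRAMES_PER_BLOCK / frame_rate_hz
    # One channel status bit per subframe per frame, so 192 bits per channel per block
    channel_status_bytes = FRAMES_PER_BLOCK // 8
    return bit_rate, block_ms, channel_status_bytes


def am824_payload_bytes(samples_per_packet, subframes_per_frame=2):
    """Payload size, assuming one 32-bit word per subframe."""
    return samples_per_packet * subframes_per_frame * (BITS_PER_SUBFRAME // 8)


print(aes3_figures(48_000))        # stereo AES3 at 48kHz
print(am824_payload_bytes(48, 2))  # 1 ms of stereo at 48kHz
```

Compared with 24-bit PCM, the extra byte per subframe is what lets the status bits and block position survive the trip across the network.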

James Cowdery leads the second part of today’s talk, first talking about SMPTE ST 337, which details how to send non-PCM audio and data in an AES3 serial digital audio interface. It can carry AC-3, AC-4 for object audio delivering immersive audio experiences, Dolby E and also the metadata standards KLV and Serial ADM.

‘Why use Dolby E?’ asks James. Dolby E has a number of advantages although, as bandwidth has become more available, it is increasingly replaced by uncompressed audio. However, legacy workflows may now be reliant on IP infrastructure between the receiver and decoder, so it’s important to be able to carry it. Dolby E also packs a whole set of surround sound within a single data stream, removing any problems of relative phase, and can be carried over MPEG-2 transport streams so it still has plenty of flexibility and use cases.

Its strength can bring fragility, and one way in which you can destroy a Dolby E feed is by switching between two videos containing Dolby E in the middle of the data rather than waiting for the gap between packets, which is called the guardband. Dolby E needs to be aligned to the video so that you can crossfade and switch between videos without breaking the audio. James makes the point that one reason to use -31 and not -30 to carry Dolby E, or any other non-PCM data, is that -30 assumes that a sample rate converter can be used, and so there is usually little control over when an SRC is brought into use. A sample rate converter, of course, would destroy any non-PCM data.

RAVENNA AM824 and 2110-31 gateways will preserve the line position of Dolby data. Dolby E transport can therefore be supported by a vendor without Dolby support. James notes that packets carrying Dolby E need to be 125 microseconds long to achieve packet-level switching without missing a guardband and corrupting data.
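
A packet-level switch can only happen on a packet boundary, so the packet time sets how finely a switch point can be placed. The sketch below compares 125 microsecond and 1 millisecond packets. The function names are illustrative, and the length of the guardband itself is not modelled because it isn't given here.

```python
from fractions import Fraction


def samples_per_packet(sample_rate_hz, packet_time_us):
    """Audio samples carried in each packet for a given packet time."""
    return Fraction(sample_rate_hz * packet_time_us, 1_000_000)


def packets_per_video_frame(frame_rate, packet_time_us):
    """Number of packet boundaries (possible switch points) per video frame."""
    return Fraction(1_000_000, packet_time_us) / frame_rate


for packet_time_us in (125, 1000):
    print(
        packet_time_us, "us:",
        samples_per_packet(48_000, packet_time_us), "samples per packet,",
        packets_per_video_frame(25, packet_time_us), "packets per 25fps frame,",
        float(packets_per_video_frame(Fraction(30_000, 1001), packet_time_us)),
        "packets per 29.97fps frame",
    )
```

Shorter packets give eight times as many candidate switch points per frame, which is what makes it practical to land the switch inside the guardband.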

Immersive audio requires metadata. sADM is an open specification for metadata interchange, the aim of which is to help interoperability between vendors. sADM metadata can be embedded in SDI, transported uncompressed as SMPTE 302 in MPEG-2 Transport Streams and, for 2110, is carried in -31. It’s based on an XML description of metadata from the Audio Definition Model, and James advises using the GZip compression mode to reduce the bitrate as it can be sent per-frame. An alternative metadata standard is SMPTE ST 336 (KLV), which is an open format providing a binary payload, making it a lower-latency method for sending metadata. These methods of sending metadata made sense in the past, but now, with SMPTE ST 2110 having its own section for metadata essences, we see 2110-41 taking shape to allow data like this to be carried on its own.
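
To illustrate why compression helps with per-frame XML, here is a rough sketch using Python’s standard gzip module. The XML below is a made-up placeholder with hypothetical element names. It is not a valid sADM document, and the sADM framing itself is not shown.

```python
import gzip

# Placeholder XML only: element and attribute names are hypothetical.
objects = "".join(
    f'<object id="OBJ_{i:04d}" name="Object {i}"><gain>1.0</gain></object>'
    for i in range(32)
)
xml_frame = f"<frame>{objects}</frame>".encode("utf-8")

compressed = gzip.compress(xml_frame)
print(f"uncompressed: {len(xml_frame)} bytes, gzip: {len(compressed)} bytes")

# The receiver recovers exactly the same bytes
restored = gzip.decompress(compressed)
```

XML is verbose and repetitive, so it tends to compress well. The actual saving depends on the real metadata being sent.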

Watch now!
Speakers

James Cowdery
Senior Staff Architect
Dolby Laboratories
Andreas Hildebrand
RAVENNA Evangelist,
ALC NetworX

Video: Timing Requirements in Broadcast Applications

How does timing for AES67 and SMPTE ST 2110-30 work? All is revealed in this short video by Andreas Hildebrand, who explains why we need PTP timing and how we relate the absolute time to the signals themselves.

In a network for audio streams, Andreas starts, we want all the streams to run on their native sample rate and use the same clock, but we also want the possibility of multiple concurrent streams using different sample rates. Also, it’s important to have a deterministic end-to-end latency and that, when streams arrive, they should be suitably aligned. We achieve all of this by distributing time around the system. Audio has very high accuracy requirements, down to within 10 microseconds for typical 48kHz broadcast signals, but AES11 requires within 1 microsecond, which is why the Precision Time Protocol (PTP), defined by the standard IEEE 1588, is used. For more information on PTP, check out our PTP back library.
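
One way to put the accuracy figures above in context is to compare them with the length of a single audio sample. This is a small sketch, not something shown in the talk, and the function name is illustrative.

```python
def sample_period_us(sample_rate_hz):
    """Duration of one audio sample in microseconds."""
    return 1_000_000 / sample_rate_hz


period = sample_period_us(48_000)
for tolerance_us in (10, 1):
    share = tolerance_us / period
    print(f"{tolerance_us} us is {share:.0%} of a {period:.2f} us sample period")
```

At 48kHz a sample lasts just under 21 microseconds, so 10 microseconds is roughly half a sample and 1 microsecond is a small fraction of one.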

End devices run their own local clocks, synchronised to the PTP on the network. In charge of it all, there is a grandmaster locked to GPS which can then distribute time to other secondary clocks which feed the end devices. The end device can generate a media clock from the PTP and, by using PTP, different facilities can be kept in time with each other. All media is then timestamped with the time at which it was generated. For advice on architecting PTP, have a listen to this talk from Arista’s Gerard Phillips.

RTP is used to carry professional media streams like AES67. RTP builds on top of UDP to add the critical timing information we need, namely the timestamp but also the sequence number. Andreas looks at the structure of the RTP packet header to see where the timestamp and identifiers go. To follow up on the IT basics underpinning AES67 and SMPTE ST 2110, check out Ed Calverley’s presentation on the topic.
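
As a sketch of where those fields sit, the fixed 12-byte RTP header defined in RFC 3550 can be unpacked with Python’s struct module. The example packet is synthetic, and CSRC entries and header extensions are ignored.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte RTP header (RFC 3550), ignoring CSRCs and extensions."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least 12 bytes of header")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "extension": bool(byte0 & 0x10),
        "csrc_count": byte0 & 0x0F,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence_number": sequence,   # detects loss and reordering
        "timestamp": timestamp,        # media clock value of the first sample
        "ssrc": ssrc,                  # identifies the stream source
    }


# Synthetic packet: version 2, payload type 96, followed by 6 bytes of dummy payload
example = struct.pack("!BBHII", 0x80, 96, 1234, 48_000, 0xDEADBEEF) + b"\x00" * 6
header = parse_rtp_header(example)
print(header)
```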

‘Profiles’ are required to link the time of day to media flows – to give the time some meaning in terms of the expected signal. The AES67 Media Profile does this for AES67 as an annexe in the standard. SMPTE use ST 2059 to define how to use AES67 as well as all the other essences it supports and relate them all back to an originating epoch time in 1970.
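
The link between time of day and the media can be sketched as a simple calculation: count the samples elapsed since the epoch and wrap the result to the 32-bit RTP timestamp field. This is a simplified illustration. The function and the `offset` parameter are hypothetical names; as we understand it, ST 2110 expects a zero offset, whereas AES67 allows an offset which is signalled in the SDP.

```python
def rtp_timestamp(ptp_time_ns: int, sample_rate_hz: int, offset: int = 0) -> int:
    """Map a PTP time (nanoseconds since the epoch) to a 32-bit RTP timestamp."""
    samples_since_epoch = ptp_time_ns * sample_rate_hz // 1_000_000_000
    return (samples_since_epoch + offset) % 2**32


print(rtp_timestamp(1_000_000_000, 48_000))        # one second after the epoch
print(rtp_timestamp(90_000 * 10**9, 48_000))       # after the 32-bit counter has wrapped
```

At 48kHz the 32-bit counter wraps roughly every 25 hours, which is why receivers work with the timestamp relative to the current PTP time rather than treating it as an absolute value.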

The talk finishes by looking at the overlap in timing specs for AES67 and ST 2110-30 (AES67 for 2110). For more information on how AES67 and ST 2110 work (and don’t work) together, watch Andreas’s ‘Deeper dive’ on the topic.

Watch now!
Speakers

Andreas Hildebrand
RAVENNA Evangelist
ALC NetworX

Video: RIST Unfiltered – Q&A Session

RIST is a protocol which allows for reliable streaming over lossy networks like the internet. Whilst many people know that much, they may not know more and may have questions. Today’s video aims to answer the most common questions. For a technical presentation of RIST, look no further than this talk and this article.

Kieran Kunhya deals out the questions to the panel from the RIST Forum, RIST members and AWS, asking:
Does RIST need 3rd party equipment?
Is there an open-source implementation of RIST?
Are there any RIST learning courses?
Why should companies use RIST over SRT?

RIST, we hear, is based on RTP which is a very widely deployed technology for real-time media transport and is widely used for SMPTE 2022-2 and -6 streams, SMPTE 2110, AES67 and other audio protocols. So not only is it proven, but it’s also based on RFCs along with much of RIST. SRT, the panel says, is based on the UDT file transfer protocol which is not an RFC and wasn’t designed for live media transport although SRT does perform very well for live media.

“Why are there so many competitors in RIST?” is another common question which is answered by talking about the need for interoperability. Fostering widespread interoperability will grow the market for these products much more than it would with many smaller protocols. “What new traction is RIST getting?” is answered by David Griggs from AWS who says they are committed to the protocol and find that customers like the openness of the protocol and are thus willing to invest their time in creating workflows based on it. Adi Rozenberg lists many examples of customers who are using the technology today. You can hear David Griggs explain RIST from his perspective in this talk.

Other questions handled are the licence that RIST is available under and the open-source implementations, the latency involved in using RIST and whether it can carry NDI. Sergio explains that NDI is a TCP-based protocol so you can transmit it by extracting UDP out of it, using multicast or using a Vizrt tool for extracting the media without recompressing. Finally, the panel looks at how to join the RIST Activity Group in the VSF and the RIST Forum. They talk about the origin of RIST being in an open request to the industry from ESPN and what is coming in the upcoming Advanced Profile.

Watch now!
Speakers

Rick Ackermans
RIST AG Chair,
Director of RF & Transmission Engineering, CBS Television
David Griggs
Senior Product Manager, Media Services,
AWS Elemental
Sergio Ammirata
RIST AG Member,
Chief Science Officer, SipRadius
Adi Rozenberg
RIST Forum Director
AG Member, Co-Founder & CTO, VideoFlow
Ciro Noronha
RIST Forum President and AG Member
EVP of Engineering, Cobalt Digital
Paul Atwell
RIST Forum Director,
President, Media Transport Solutions
Wes Simpson
RIST AG Co-Chair,
President & Founder, LearnIPvideo.com
Kieran Kunhya
RIST Forum Director
Founder & CEO, Open Broadcast Systems

Video: SRT Protocol Overview

SRT’s ability to make lossy networks seem like perfect video circuits is increasingly well known, testified to by the SRT Alliance having just surpassed 400 member companies. But this isn’t your average ‘overview’. It dispenses with the technology introductions and goes straight into the detail, so it is ideal for people who already know the basics and want some deeper knowledge plus a look at the new features to come.

For those wanting an introduction, this article What is SRT? is a good starter which also links to two other intro videos. But today we’re going to join Haivision’s Maxim Sharabayko to look below the surface of SRT.

Maxim starts by introducing the open-source Git repository and the open-source integrations available before heading into the feature matrix. This shows what is and isn’t in SRT. We see that on top of ARQ, it has FEC, encryption, stream multiplexing and, soon, connection bonding. Addressing the major feature areas one by one, we start with connectivity.

SRT has two modes to establish a connection which Maxim shows on handshake diagrams. We can see that establishment need only take two round trips so is quick. This allows Maxim to show how firewall traversal is accomplished, though NAT traversal is not yet implemented.

Next on the list of topics is access control whereby we need to ensure that only authorised users can gain access. This is achieved using the Stream ID field within SRT control packets which can contain up to 512 characters meaning it can be used to transfer usernames, passwords (in the form of keys) and requests. Maxim then explains the AES PSK encryption function and discusses the potential implementation of TLS and DTLS.
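
As an illustration of how a Stream ID can carry this information, the sketch below parses a string in the `#!::key=value,key=value` style which, as far as we’re aware, is the convention recommended in the SRT project’s access control guidelines. The function name, the `raw` key and the example values are hypothetical, and this is not part of the SRT library itself.

```python
MAX_STREAM_ID_LEN = 512  # limit mentioned in the talk
PREFIX = "#!::"          # assumed marker for the structured key=value form


def parse_stream_id(stream_id: str) -> dict:
    """Split a structured Stream ID into its key/value fields."""
    if len(stream_id) > MAX_STREAM_ID_LEN:
        raise ValueError("Stream ID is longer than 512 characters")
    if not stream_id.startswith(PREFIX):
        # Free-form Stream ID: hand it back untouched under a hypothetical key
        return {"raw": stream_id}
    fields = {}
    for item in stream_id[len(PREFIX):].split(","):
        key, separator, value = item.partition("=")
        if not separator:
            raise ValueError(f"malformed field: {item!r}")
        fields[key] = value
    return fields


# Assumed meanings: u = user name, r = resource, m = mode of the request
print(parse_stream_id("#!::u=alice,r=studio1,m=request"))
```

A listener could use fields like these to decide whether to accept the connection and which stream to serve, before any media flows.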

Content delivery is next under the magnifying glass, starting with the structure of SRT packets and the difference between the two types, Data and Control, the former being restricted to only containing payload or FEC data. Maxim covers the positive acknowledgement which is contained within SRT, with the range of received packets being acknowledged every 10ms and, where 64 packets come in less than 10ms, a low-overhead acknowledgement being sent for each group of 64 data packets. But of course, it’s the NAK packets which are the most important part of the protocol. Maxim explains they are able to send back one sequence number or a range of lost packets and talks about when they are sent. We see how this then fits into the Timestamp Based Packet Delivery (TSBPD) mechanism, a feature of SRT which delivers packets to the receiver with the same timing as they arrived at the sender. The last thing we look at in the section is a worked example of Too-Late Packet Drop which explains when and why packets are dropped.
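
Two of these ideas can be sketched in a few lines: collapsing lost sequence numbers into the single values and ranges a NAK reports, and deciding when a packet has missed its delivery time. This is a simplified illustration with hypothetical function names. It ignores sequence number wraparound and clock drift, and it is not the SRT library’s implementation.

```python
def loss_ranges(lost):
    """Collapse lost sequence numbers into inclusive (first, last) ranges."""
    ranges = []
    for seq in sorted(set(lost)):
        if ranges and seq == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], seq)   # extend the current range
        else:
            ranges.append((seq, seq))           # start a new range
    return ranges


def delivery_time_us(time_base_us, packet_timestamp_us, latency_us):
    """Time at which TSBPD would hand the packet to the application."""
    return time_base_us + packet_timestamp_us + latency_us


def is_too_late(now_us, time_base_us, packet_timestamp_us, latency_us):
    """True once a still-missing packet can no longer be delivered on time."""
    return now_us > delivery_time_us(time_base_us, packet_timestamp_us, latency_us)


print(loss_ranges([5, 6, 7, 10, 12, 13]))
# Packet stamped at 40 ms with 120 ms of latency: due at 160 ms on the receiver clock
print(is_too_late(150_000, 0, 40_000, 120_000))
print(is_too_late(170_000, 0, 40_000, 120_000))
```

The configured latency is therefore the budget within which a NAK and its retransmission must complete. Once it is spent, waiting any longer would only delay the packets that did arrive.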

ARQ isn’t the only recovery mechanism in SRT; it also provides FEC and, soon, channel bonding. FEC can be useful but does have downsides which should be understood. There is a permanent bandwidth overhead, even when the circuit is working well, and further latency is needed in order to generate the necessary recovery packets. Bonding allows you to send the same stream over more than one circuit and use data from circuit B to fill in any gaps in circuit A; this technique is used in SMPTE ST 2022-7. Connection bonding, though, can also use multiple connections at once with dynamic balancing across them. Maxim sums up the pros and cons of the different techniques in the table below.
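
The idea of filling gaps in circuit A with data from circuit B can be sketched as a merge keyed on sequence number. This is a toy model with hypothetical names. A real receiver merges continuously within a bounded buffer rather than over complete dictionaries.

```python
def merge_paths(path_a, path_b):
    """Combine two copies of a stream, preferring path A and filling gaps from path B.

    Each path maps sequence number -> payload for the packets that arrived.
    """
    merged = dict(path_b)
    merged.update(path_a)   # where both paths delivered a packet, keep A's copy
    return [merged[seq] for seq in sorted(merged)]


path_a = {1: b"p1", 2: b"p2", 4: b"p4"}   # packet 3 lost on circuit A
path_b = {2: b"p2", 3: b"p3", 4: b"p4"}   # packet 1 lost on circuit B
print(merge_paths(path_a, path_b))
```

As long as the two circuits don’t lose the same packet, the output is complete, at the cost of sending everything twice.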

Pros and cons of different packet recovery techniques. Source: Haivision

The talk finishes with a look at stream multiplexing, congestion control and ways in which you can use the SRT statistics which are constantly updated to manage your connectivity.

Watch now!
Speakers

Maxim Sharabayko
Senior Software Developer,
Haivision