Video: Audio networking – ask anything you want!

It’s open season with these AES67 audio-over-Ip experts who are all the questions put to them on working with AES67. Not only was AES67 baked in to SMPTE ST 2110-30, it’s also a standard that brings compatability between Dante and RAVENNA as well as other AoIP technologies.

After a quick summary of what AES66 is, this talk quickly moves into answering these, and other questions:

  • How much bandwidth does stereo AES67 require?
  • Can multicast be used within Ravenna
  • Will there be a slipless switching/2022-7 style function?
  • Should receivers automatically adjust to original stream
  • Is it possible to avoid using PTP in an audio-only system?
  • Cost of PTP-capable switches
  • What’s the difference between Boundary Clocks and Transparent Clocks
  • Can AES67 go over the internet?
  • Tools for spotting problems
  • IPMX for Pro-AV update (See this talk)
  • Is NMOS ‘the answer’ for discovery and configuration?
  • Latency for Ravenna and AES67
  • New advancements in the PTP standard.

Watch now!
Speakers

Andreas Hildebrand Andreas Hildebrand
Evangelist,
ALC NetworX
Claude Cellier Claude Cellier
President & CEO
Merging Technologies SA
Claudio Becker-Foss
CTO,
DirectOut
Daniel Boldt Daniel Boldt
Head of Software Development,
Meinberg
Terry Holton Terry Holton
Audio subgroup Chairman,
AIMS
Roland Hemming Moderator: Roland Hemming
Audio Consultant
RH Consulting

Video: AES67 & SMPTE ST 2110 Timing and Synchronization

Good timing is essential in production for AES67 audio and SMPTE ST 2110. Delivering timing is no longer a matter of delivering a signal throughout your facility, over IP timing is bidirectional and forms a system which should be monitored and managed. Timing distribution has always needed design and architecture, but the detail and understanding needed are much more. At the beginning of this talk, Andreas Hildebrand explains why we need to bother with such complexity, after all, we got along very well for many years without it! Non-IP timing signals are distributed on their own cables as part of their own system. There are some parts of the chain which can get away without timing signals, but when they are needed, they are on a separate cable. With IP, having a separate network for distribution of timing doesn’t make sense so, whether you have an analogue or digital timing signal, that needs to be moving into the IP domain. But how much accuracy in timing to you need? Network devices already widely use NTP which can achieve an accuracy of less than a millisecond. Andreas explains that this isn’t enough for professional audio. At 48Khz, AES samples happen at an accuracy of plus or minus 10 microseconds with 192KHz going down to 2.5 microseconds. As your timing signal has to be less than the accuracy you need, this means we need to achieve nanosecond precision.

Daniel Boldt from timing specialists Meinberg is the focus of this talk explaining how we achieve this nano-second precision. Enter PTP, the Precision Time Protocol. This is a cross-industry standard from the IEEE uses in telcoms, power, finance and in many others wherever a network and its devices need to understand the time. It’s not a static standard, Daniel explains, and it’s just about to see its third revision which, like the last, adds features.

Before finding out about the latest changes, Daniel explains how PTP works in the first place; how is it possible to accurately derive time down to the nanosecond over a network which will have variable propagation times? We see how timestamps are introduced into the network interface controller (NIC) at the last moment allowing the timestamps to be created in hardware which removes some of the variable delays that is typical in software. This happens, Daniel shows, in the switch as well as in the server network cards. This article will refer to either a primary clock or a grand master. Daniel steps us through the messages exchanged between the primary and secondary clock which is the interaction at the heart of the protocol. The key is that after the primary has sent a timestamp, the secondary sends its timestamp to the primary which replies saying the time it received the secondary the reply. The secondary ends up with 4 timestamps that it can combine to determine its offset from the primary’s time and the delay in receiving messages. Applying this information allows it to correct the clock very accurately.

PTP Primary-Secondary Message Exchange.
Source: Meinberg

Most broadcasters would prefer to have more than one grandmaster clock but if there are multiple clocks, how do you choose which to sync from? Timing systems have long used strata whereby clocks are rated based on accuracy, either for internal accuracy & stability or by what they are synched to. This is also true for PTP and is part of the considerations in the ‘Best Master Clock Algorithm’. The BMCA starts by allowing a time source to assess its own accuracy and then search for better options on the network. Clocks announce themselves to the network and by listening to other announcements, a clock can decide if it should become a primary clock if, for instance, it hears no announce messages at all. For devices which should never be a grand primary, you can force them never to decide to become grand masters. This is a requisite for audio devices participating in ST 2110-3x.

Passing PTP around the network takes some care and is most easily done by using switches which understand PTP. These switches either run a ‘boundary clock’ or are ‘transparent clocks’. Daniel explores both of these scenarios explaining how the boundary clock switch is able to run multiple primary and secondary clocks depending on what is connected on each interface. We also see what work the switches have to do behind the scenes to maintain timing precision in transparent mode. In summary, Daniel summaries boundary clocks as being good for hierarchical systems and scales well but requires continuous monitoring whereas transparent clocks are simpler to deploy and require minimal monitoring. The main issue with transparent clocks is that they don’t scale well as all your timing messages are still going back to one main clock which could get overwhelmed.

SMPTE 2022-7 has been a very successful standard as its reliance only on RTP has allowed it to be widely applicable to compressed and uncompressed IP flows. It is often used in 2110 networks, too, where two separate networks are run and brought together at the receiving device. That device, on a packet-by-packet basis, is free to derive its audio/video stream from either network. This requires, however, exactly the same timing on both networks so Daniel looks at an example diagram where this PTP sharing is shown.

PTP’s still evolving and in this next section, Daniel takes us through some of the coming improvements which are also outlined at Meinberg’s blog. These are profile isolation, multi-domain clocks, security improvements and more.

Andreas takes the final section of the webinar to explain how we use PTP in media networks. All receivers will have the same clock which could be derived from GPS removing the need to distribute PTP between sites. 2110 is based on RTP which requires a timestamp to be added to every packet delivered to the network. RTP is a wrapper around IP packets which includes a timestamp which can be derived from the media clock counter.

Andreas looks at how accurate RTP delivery is achieved, dealing with offset values, populating the timestamp from the PTP clock for realties streams and he explains how the playout delay is calculated from the link offset. Finally, he shows the relatively simple process of synchronisation art the playout device. With all the timestamps in the system, synchronising playback of audio, video and metadata using buffers can be achieved fairly easily. Unfortunately, timestamps are easily destroyed by secondary processing (for instance loudness adjustment for an audio stream). Clearly, if this happened, synchronisation at the receiver would be broken. Whilst this will be addressed by out-of-band messaging in future standards, for now, this is managed by a broadcast controller which can take delay information from processing stages and distribute this to receivers.

Watch now!
Speakers

Daniel Boldt Daniel Boldt
Head of Software Development,
Meinberg
Andreas Hildebrand Andreas Hildebrand
RAVENNA Technology Evangelist,
ALC NetworX

Video: AES67 & ST 2110 Deeper Dive – The Audio Files

A deeper dive here, in the continuing series of videos looking at AES67, SMPTE ST 2110 and Ravenna. Andreas Hildebrand from ALC Networx is back to investigate the next level down on how AES67 and ST 2110 operate and how they can be configured. The talk, however, remains accessible throughout and starts with an reminder of what AES67 is and why it exists. This is was also covered in his first talk.

After explaining the AES67 was created as a way for multiple audio-over-IP standards to interoperate, Andreas looks at the stack, stepping through it to explain each element. The first topic is timing. He explains that every device on the AES67 network is not only governed by PTP, but it’s also runs its own clock which is called the Local Clock. From the Local Clock, the device then also creates a Media Clock which is based on the Local Clock time but is used to crate any frequency needed for the media (48KHz, for instance). Finally an RTP clock is kept for transmission over the network.

The next item featured on the stack is encoding. AES67 is baed on linear audio, also known as PCM. AES67 ensures that 48KHz, 16 & 24-bit audio is supported on all devices and allows up to 8 channels per stream. Importantly, Andreas explains the different versions of packet time which are supported, 1ms being mandatory which allows 48 samples of 48 KHz audio into teach IP packet.

SDP – Session Description Protocol is next which describes in a simple text file what’s in the AES67 stream giving its configuration. Then Andreas looks at what Link Offset is and examines its role in determining latency and the types of latency it’s been made to compensate for. He then talks you through working out what latency setting you need to use including taking into account the number of switches in a network and our frame size.

SMPTE ST 2110 is the focus for the last part of the talk. This, Andreas explains, is a way of moving, typically uncompressed, professional media (also known as essences) around a network for live production with very low latency. It sends audio separately to the video and uses AES67 to do so. This is defined in standard ST 2110-30. However, there are some important configurations for AES67 which are mandated in order to be compatible which Andreas explains. One example is forcing all devices to be slave only, another is setting the RTP clock offset to zero. Andreas finishes the talk by summarising what parts of ST 2110 and AES67 overlap including discussing the frame sizes supported.

Watch now!
Download the presentation
Speaker

Andreas Hildebrand Andreas Hildebrand
Senior Product Manager,
ALC NetworX Gmbh.

Video: Introduction To AES67 & SMPTE ST 2110

While standardisation of video and audio over IP is welcome, this does leave us with a plethora of standards numbers to keep track of along with interoperability edge cases to keep track of. Audio-over-IP standard AES67 is part of the SMPTE ST-2110 standards suite and was born largely from RAVENNA which is still in use in it’s own right. It’s with this backdrop that Andreas Hildebrand from ALC NetworX who have been developing RAVENNA for 10 years now, takes the mic to explain how this all fits together. Whilst there are many technologies at play, this webinar focusses on AES67 and 2110.

Andreas explains how AES67 started out of a plan to unite the many proprietary audio-over-IP formats. For instance, synchronisation – like ST 2110 as we’ll see later – was based on PTP. Andreas gives an overview of this synchronisation and then we shows how they looked at each of the OSI layers and defined a technology that could service everyone. RTP, the Real-time Transport Protocol has been in use for a long time for transport of video and audio so made a perfect option for the transport layer. Andreas highlights the important timing information in the headers and how it can be delivered by unicast or IGMP multicast.

As for the audio, standard PCM is the audio of choice here. Andreas details the different format options available such as 24-bit with 8 channels and 48 samples per packet. By varying the format permutations, we can increase the sample rate to 96kHz or modify the number of audio tracks. To signal all of this format information, Session Description Protocol messages are sent which are small text files outlining the format of the upcoming audio. These are defined in RFC 4566. For a deeper introduction to IP basics and these topics, have a look at Ed Calverley’s talk.

The second half of the video is an introduction to ST-2110. A deeper dive can be found elsewhere on the site from Wes Simpson.
Andreas starts from the basis of ST 2022-6 showing how that was an SDI-based format where all the audio, video and metadata were combined together. ST 2110 brings the splitting of media, known as ‘essences’, which allows them to follow separate workflows without requiring lots of de-embedding and embedding processes.

Like most modern standards, ATSC 3.0 is another example, SMPTE ST 2110 is a suite of many standards documents. Andreas takes the time to explain each one and the ones currently being worked on. The first standard is ST 2110-10 which defines the use of PTP for timing and synchronisation. This uses SMPTE ST 2059 to relate PTP time to the phase of media essences.

2110-20 is up next and is the main standard that defines use of uncompressed video with headline features such as being raster/resolution agnostic, colour sampling and more. 2110-21 defines traffic shaping. Andreas takes time to explain why traffic shaping is necessary and what Narrow, Narrow-Linear, Wide mean in terms of packet timing. Finishing the video theme, 2110-22 defines the carriage of mezzanine-compressed video. Intended for compression like TICO and JPEG XS which have light, fast compression, this is the first time that compressed media has entered the 2110 suite.

2110-30 marks the beginning of the audio standards describing how AES67 can be used. As Andreas demonstrates, AES67 has some modes which are not compatible, so he spends time explaining the constraints and how to implement this. For more detail on this topic, check out his previous talk on the matter. 2110-31 introduces AES3 audio which, like in SDI, provides both the ability to have PCM audio, but also non-PCM audio like Dolby E and D.

Finishing up the talk, we hear about 2110-40 which governs transport of ancillary metadata and a look to the standards still being written, 2110-23 Single Video essence over multiple 2110-20 streams, 2110-24 for transport of SD signals and 2110-41 Transport of extensible, dynamic metadata.

Watch now!
Speaker

Andreas Hildebrand Andreas Hildebrand
Senior Product Manager,
ALC NetworX Gmbh.