Video: AES67 & SMPTE ST 2110 Timing and Synchronization

Good timing is essential in production for AES67 audio and SMPTE ST 2110. Delivering timing is no longer a matter of delivering a signal throughout your facility: over IP, timing is bidirectional and forms a system which should be monitored and managed. Timing distribution has always needed design and architecture, but the detail and understanding needed are now much greater. At the beginning of this talk, Andreas Hildebrand explains why we need to bother with such complexity; after all, we got along very well for many years without it! Non-IP timing signals are distributed on their own cables as part of their own system. There are some parts of the chain which can get away without timing signals, but when they are needed, they are on a separate cable. With IP, having a separate network for distribution of timing doesn’t make sense so, whether you have an analogue or digital timing signal, it needs to move into the IP domain. But how much accuracy in timing do you need? Network devices already widely use NTP which can achieve an accuracy of less than a millisecond. Andreas explains that this isn’t enough for professional audio. At 48 kHz, AES samples need to be timed to within plus or minus 10 microseconds, with 192 kHz going down to 2.5 microseconds. As your timing signal has to be more accurate than the accuracy you need, this means we need to achieve nanosecond precision.
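As a rough check on those figures, the sketch below assumes the tolerance quoted is half of one sample period (an interpretation, not something stated in the talk). The function names are made up for illustration.

```python
def sample_period_us(sample_rate_hz: float) -> float:
    """Duration of one audio sample in microseconds."""
    return 1_000_000 / sample_rate_hz


def half_sample_tolerance_us(sample_rate_hz: float) -> float:
    """Half a sample period: the +/- window around each sample instant (assumed meaning)."""
    return sample_period_us(sample_rate_hz) / 2


for rate in (48_000, 96_000, 192_000):
    print(f"{rate} Hz: period {sample_period_us(rate):.2f} us, "
          f"tolerance +/- {half_sample_tolerance_us(rate):.2f} us")
```

Under that assumption the results land close to the rounded 10 and 2.5 microsecond values quoted above.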

Daniel Boldt from timing specialists Meinberg is the focus of this talk, explaining how we achieve this nanosecond precision. Enter PTP, the Precision Time Protocol. This is a cross-industry standard from the IEEE used in telecoms, power, finance and many other fields, wherever a network and its devices need to understand the time. It’s not a static standard, Daniel explains, and it’s just about to see its third revision which, like the last, adds features.

Before finding out about the latest changes, Daniel explains how PTP works in the first place; how is it possible to accurately derive time down to the nanosecond over a network which will have variable propagation times? We see how timestamps are introduced in the network interface controller (NIC) at the last moment, allowing the timestamps to be created in hardware which removes some of the variable delays that are typical in software. This happens, Daniel shows, in the switch as well as in the server network cards. This article will refer to either a primary clock or a grand master. Daniel steps us through the messages exchanged between the primary and secondary clock which is the interaction at the heart of the protocol. The key is that after the primary has sent a timestamp, the secondary sends its own timestamped message to the primary, which replies with the time at which it received the secondary’s message. The secondary ends up with 4 timestamps that it can combine to determine its offset from the primary’s time and the delay in receiving messages. Applying this information allows it to correct the clock very accurately.

PTP Primary-Secondary Message Exchange.
Source: Meinberg
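
The way the four timestamps combine can be sketched as follows. This is a minimal illustration that assumes the path delay is the same in both directions; the `Sync` and `Delay_Req` labels are the usual PTP message names, while the function name and the numbers are made up for the example.

```python
from typing import NamedTuple


class PtpResult(NamedTuple):
    offset_ns: float  # secondary clock minus primary clock
    delay_ns: float   # one-way path delay, assumed equal in both directions


def offset_and_delay(t1: int, t2: int, t3: int, t4: int) -> PtpResult:
    """Combine the four PTP timestamps (all in nanoseconds).

    t1: primary sends Sync          (read from the primary clock)
    t2: secondary receives Sync     (read from the secondary clock)
    t3: secondary sends Delay_Req   (read from the secondary clock)
    t4: primary receives Delay_Req  (read from the primary clock)
    """
    primary_to_secondary = t2 - t1  # equals delay + offset
    secondary_to_primary = t4 - t3  # equals delay - offset
    offset = (primary_to_secondary - secondary_to_primary) / 2
    delay = (primary_to_secondary + secondary_to_primary) / 2
    return PtpResult(offset, delay)


# Illustrative numbers: the secondary runs 500 ns ahead and the path takes 1000 ns.
result = offset_and_delay(t1=0, t2=1_500, t3=5_000, t4=5_500)
print(result)
```

Any asymmetry between the two directions shows up directly as an error in the computed offset, which is one reason hardware timestamping and PTP-aware switches matter.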

Most broadcasters would prefer to have more than one grandmaster clock but if there are multiple clocks, how do you choose which to sync from? Timing systems have long used strata whereby clocks are rated based on accuracy, either for internal accuracy & stability or by what they are synced to. This is also true for PTP and is part of the considerations in the ‘Best Master Clock Algorithm’. The BMCA starts by allowing a time source to assess its own accuracy and then search for better options on the network. Clocks announce themselves to the network and, by listening to other announcements, a clock can decide whether it should become a primary clock if, for instance, it hears no announce messages at all. Devices which should never be a grandmaster can be configured so that they never decide to become one. This is a requisite for audio devices participating in ST 2110-3x.
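
The core of that decision can be sketched as an ordered comparison of the fields each clock announces. The field order below follows the dataset comparison in IEEE 1588 (lower values win), but this is a simplified sketch: the real algorithm also considers topology, and the `never_master` flag, clock names and values here are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class AnnouncedClock:
    name: str                    # hypothetical label for this example
    priority1: int
    clock_class: int
    clock_accuracy: int
    variance: int                # offsetScaledLogVariance
    priority2: int
    clock_identity: int
    never_master: bool = False   # hypothetical flag: device must never become master

    def rank(self) -> Tuple[int, ...]:
        # Compared field by field, left to right; the lowest tuple wins.
        return (self.priority1, self.clock_class, self.clock_accuracy,
                self.variance, self.priority2, self.clock_identity)


def best_master(clocks: Iterable[AnnouncedClock]) -> Optional[AnnouncedClock]:
    """Pick the best candidate, ignoring devices that must never be master."""
    candidates = [c for c in clocks if not c.never_master]
    return min(candidates, key=lambda c: c.rank()) if candidates else None


# Illustrative values only.
gm_gps = AnnouncedClock("gm-gps", 128, 6, 0x21, 0x4E5D, 128, 0x01)
gm_backup = AnnouncedClock("gm-backup", 128, 7, 0x23, 0x4E5D, 128, 0x02)
speaker = AnnouncedClock("speaker", 128, 255, 0xFE, 0xFFFF, 128, 0x03, never_master=True)

print(best_master([gm_backup, speaker, gm_gps]).name)
```

Because priority1 is compared first, an operator can override everything else by giving a preferred grandmaster a lower priority1 value.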

Passing PTP around the network takes some care and is most easily done by using switches which understand PTP. These switches either run a ‘boundary clock’ or are ‘transparent clocks’. Daniel explores both of these scenarios, explaining how the boundary clock switch is able to run multiple primary and secondary clocks depending on what is connected on each interface. We also see what work the switches have to do behind the scenes to maintain timing precision in transparent mode. In summary, Daniel describes boundary clocks as being good for hierarchical systems and scaling well but requiring continuous monitoring, whereas transparent clocks are simpler to deploy and require minimal monitoring. The main issue with transparent clocks is that they don’t scale well as all your timing messages are still going back to one main clock which could get overwhelmed.
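
The behind-the-scenes work of a transparent clock can be sketched as follows: each switch measures how long a timing message spent inside it and adds that residence time to a correction value carried with the message, so the receiver can subtract it. This is a simplified illustration with hypothetical function names and numbers, not a description of any particular switch.

```python
def add_residence_time(correction_ns: int, ingress_ns: int, egress_ns: int) -> int:
    """Accumulate the time a message spent inside one transparent-clock switch."""
    return correction_ns + (egress_ns - ingress_ns)


def corrected_transit_ns(t1_ns: int, t2_ns: int, correction_ns: int) -> int:
    """Transit time with the variable switch queuing delay removed."""
    return (t2_ns - t1_ns) - correction_ns


# A Sync message crosses two switches with different queuing delays.
correction = 0
correction = add_residence_time(correction, ingress_ns=10_000, egress_ns=13_200)
correction = add_residence_time(correction, ingress_ns=20_000, egress_ns=20_450)

# Sent at t1 = 0 and received at t2 = 4000 ns: only the fixed cable delay remains.
print(correction, corrected_transit_ns(0, 4_000, correction))
```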

SMPTE 2022-7 has been a very successful standard as its reliance only on RTP has allowed it to be widely applicable to compressed and uncompressed IP flows. It is often used in 2110 networks, too, where two separate networks are run and brought together at the receiving device. That device, on a packet-by-packet basis, is free to derive its audio/video stream from either network. This requires, however, exactly the same timing on both networks so Daniel looks at an example diagram where this PTP sharing is shown.
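
The packet-by-packet selection can be sketched as taking the first copy of each RTP sequence number to arrive from either network and discarding the later duplicate. This minimal sketch ignores sequence-number wraparound, reordering and buffer limits, which a real receiver has to handle; the names are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

Arrival = Tuple[str, int, bytes]  # (network name, RTP sequence number, payload)


def seamless_merge(arrivals: Iterable[Arrival]) -> Iterator[Tuple[int, bytes]]:
    """Yield the first copy of each sequence number, whichever network delivered it."""
    seen = set()
    for _network, seq, payload in arrivals:
        if seq in seen:
            continue  # duplicate from the other network
        seen.add(seq)
        yield seq, payload


arrivals = [
    ("A", 1, b"p1"), ("B", 1, b"p1"),
    ("B", 2, b"p2"),                   # packet 2 was lost on network A
    ("A", 3, b"p3"),                   # packet 3 was lost on network B
    ("A", 4, b"p4"), ("B", 4, b"p4"),
]
merged = [seq for seq, _ in seamless_merge(arrivals)]
print(merged)
```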

PTP’s still evolving and in this next section, Daniel takes us through some of the coming improvements which are also outlined at Meinberg’s blog. These are profile isolation, multi-domain clocks, security improvements and more.

Andreas takes the final section of the webinar to explain how we use PTP in media networks. All receivers will have the same clock which could be derived from GPS, removing the need to distribute PTP between sites. 2110 is based on RTP which requires a timestamp to be added to every packet delivered to the network. RTP wraps the media payload (and is itself carried in UDP/IP packets), adding a header which includes a timestamp that can be derived from the media clock counter.
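
As a rough illustration of deriving a timestamp from the media clock, the sketch below counts media clock ticks since the PTP epoch and keeps the low 32 bits, since the RTP timestamp field is 32 bits wide. The clock rates used (48 kHz for audio, 90 kHz for video) are the commonly used values and are assumptions here, as is the function name.

```python
RTP_WRAP = 2 ** 32  # the RTP timestamp field is 32 bits wide
NS_PER_S = 1_000_000_000


def rtp_timestamp(ptp_ns: int, clock_rate_hz: int) -> int:
    """Media clock ticks since the PTP epoch, truncated to 32 bits."""
    ticks = ptp_ns * clock_rate_hz // NS_PER_S
    return ticks % RTP_WRAP


# One second after the epoch, for an audio and a video media clock.
print(rtp_timestamp(NS_PER_S, 48_000), rtp_timestamp(NS_PER_S, 90_000))
```

Integer arithmetic is used throughout so that large PTP times do not lose precision in floating point.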

Andreas looks at how accurate RTP delivery is achieved, dealing with offset values and populating the timestamp from the PTP clock for real-time streams, and he explains how the playout delay is calculated from the link offset. Finally, he shows the relatively simple process of synchronisation at the playout device. With all the timestamps in the system, synchronising playback of audio, video and metadata using buffers can be achieved fairly easily. Unfortunately, timestamps are easily destroyed by secondary processing (for instance loudness adjustment for an audio stream). Clearly, if this happened, synchronisation at the receiver would be broken. Whilst this will be addressed by out-of-band messaging in future standards, for now, this is managed by a broadcast controller which can take delay information from processing stages and distribute this to receivers.
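
One way to picture the link offset is as a fixed number of media clock ticks added to each packet’s RTP timestamp to give the moment that sample should be played out. The sketch below assumes the link offset is configured in whole milliseconds; the names and values are hypothetical and it is not taken from the talk’s slides.

```python
RTP_WRAP = 2 ** 32  # RTP timestamps wrap at 32 bits


def link_offset_ticks(link_offset_ms: int, clock_rate_hz: int) -> int:
    """Convert a link offset in milliseconds to media clock ticks."""
    return link_offset_ms * clock_rate_hz // 1000


def playout_timestamp(rtp_ts: int, link_offset_ms: int, clock_rate_hz: int) -> int:
    """Media clock value at which the sample stamped rtp_ts should play out."""
    return (rtp_ts + link_offset_ticks(link_offset_ms, clock_rate_hz)) % RTP_WRAP


# A sample stamped 48000 with a 2 ms link offset at 48 kHz.
print(playout_timestamp(48_000, 2, 48_000))
```

Receivers that share the same PTP-derived media clock and the same link offset will play the same sample at the same instant, which is what keeps separate essences aligned.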

Watch now!
Speakers

Daniel Boldt
Head of Software Development,
Meinberg
Andreas Hildebrand
RAVENNA Technology Evangelist,
ALC NetworX

Video: Introduction To AES67 & SMPTE ST 2110

While standardisation of video and audio over IP is welcome, this does leave us with a plethora of standards numbers, along with interoperability edge cases, to keep track of. Audio-over-IP standard AES67 is part of the SMPTE ST 2110 standards suite and was born largely from RAVENNA which is still in use in its own right. It’s against this backdrop that Andreas Hildebrand from ALC NetworX, who have been developing RAVENNA for 10 years now, takes the mic to explain how this all fits together. Whilst there are many technologies at play, this webinar focusses on AES67 and 2110.

Andreas explains how AES67 started out of a plan to unite the many proprietary audio-over-IP formats. For instance, synchronisation (like ST 2110, as we’ll see later) was based on PTP. Andreas gives an overview of this synchronisation and then shows how they looked at each of the OSI layers and defined a technology that could service everyone. RTP, the Real-time Transport Protocol, has been in use for a long time for transport of video and audio so made a perfect option for the transport layer. Andreas highlights the important timing information in the headers and how it can be delivered by unicast or IGMP multicast.
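
To make the timing information in the headers concrete, here is a small sketch that unpacks the 12-byte fixed RTP header, whose layout is defined in RFC 3550. The `RtpHeader` class, the function name and the sample values are made up for the example, and header extensions and CSRC lists are ignored.

```python
import struct
from dataclasses import dataclass


@dataclass
class RtpHeader:
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Unpack the 12-byte fixed RTP header (network byte order)."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least a 12-byte header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Illustrative header: version 2, payload type 96, sequence 1, timestamp 48000.
sample = struct.pack("!BBHII", 0x80, 96, 1, 48_000, 0x12345678)
header = parse_rtp_header(sample)
print(header)
```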

As for the audio, standard PCM is the audio of choice here. Andreas details the different format options available such as 24-bit with 8 channels and 48 samples per packet. By varying the format permutations, we can increase the sample rate to 96kHz or modify the number of audio tracks. To signal all of this format information, Session Description Protocol messages are sent which are small text files outlining the format of the upcoming audio. These are defined in RFC 4566. For a deeper introduction to IP basics and these topics, have a look at Ed Calverly’s talk.
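
As an illustration of how a format such as 24-bit, 8 channels and 48 samples per packet is signalled, the sketch below builds the media-level lines of an SDP description. The port and payload type are arbitrary example values, the helper name is hypothetical, and a complete SDP file would also need session-level lines and clock reference attributes that are omitted here.

```python
from typing import List


def audio_media_lines(port: int, payload_type: int, bit_depth: int,
                      sample_rate_hz: int, channels: int,
                      samples_per_packet: int) -> List[str]:
    """Build the media-level SDP lines for a linear PCM audio stream."""
    ptime_ms = samples_per_packet * 1000 / sample_rate_hz
    return [
        f"m=audio {port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} L{bit_depth}/{sample_rate_hz}/{channels}",
        f"a=ptime:{ptime_ms:g}",
    ]


lines = audio_media_lines(port=5004, payload_type=96, bit_depth=24,
                          sample_rate_hz=48_000, channels=8,
                          samples_per_packet=48)
print("\n".join(lines))
```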

The second half of the video is an introduction to ST-2110. A deeper dive can be found elsewhere on the site from Wes Simpson.
Andreas starts from the basis of ST 2022-6 showing how that was an SDI-based format where all the audio, video and metadata were combined together. ST 2110 brings the splitting of media, known as ‘essences’, which allows them to follow separate workflows without requiring lots of de-embedding and embedding processes.

Like most modern standards (ATSC 3.0 is another example), SMPTE ST 2110 is a suite of many standards documents. Andreas takes the time to explain each one and the ones currently being worked on. The first standard is ST 2110-10 which defines the use of PTP for timing and synchronisation. This uses SMPTE ST 2059 to relate PTP time to the phase of media essences.

2110-20 is up next and is the main standard that defines use of uncompressed video with headline features such as being raster/resolution agnostic, colour sampling and more. 2110-21 defines traffic shaping. Andreas takes time to explain why traffic shaping is necessary and what Narrow, Narrow-Linear, Wide mean in terms of packet timing. Finishing the video theme, 2110-22 defines the carriage of mezzanine-compressed video. Intended for compression like TICO and JPEG XS which have light, fast compression, this is the first time that compressed media has entered the 2110 suite.

2110-30 marks the beginning of the audio standards describing how AES67 can be used. As Andreas demonstrates, AES67 has some modes which are not compatible, so he spends time explaining the constraints and how to implement this. For more detail on this topic, check out his previous talk on the matter. 2110-31 introduces AES3 audio which, like in SDI, provides both the ability to have PCM audio, but also non-PCM audio like Dolby E and D.

Finishing up the talk, we hear about 2110-40 which governs transport of ancillary metadata and a look to the standards still being written, 2110-23 Single Video essence over multiple 2110-20 streams, 2110-24 for transport of SD signals and 2110-41 Transport of extensible, dynamic metadata.

Watch now!
Speaker

Andreas Hildebrand
Senior Product Manager,
ALC NetworX GmbH

Video: Benefits of IP Systems for Sporting Venues

As you walk around any exhibition there seems to be a myriad of ‘benefits’ of IP working, many of which don’t resonate for particular use cases. Only the most extraordinary businesses need all of the benefits, so in this talk, Imagine Communications’ John Mailhot discusses how IP helps sports venues.

John sets the scene by separating out the function of OB trucks and the ‘inside production’ facilities which have a whole host of non-TV production to do, including driving scoreboards, displays inside the venue and replays, and which importantly have to deal with over 250 events a year, not all of which will have an OB truck.

We see that the scale that IP can work at is a great benefit as many signals can fit down one fibre and 2022-7 seamless switching can easily provide full redundancy for every fibre and SFP. This is a level of redundancy which is simply not seen in SDI systems. With stadia being very large, necessitating cable runs of over 500m, the fact that IP needs fewer cables overall is a great benefit.

John shows an example of an Arista switch only 7U in height which provides 144x 100G ports meaning it could support over 4000 inputs and 4000 outputs. Such density is unprecedented and for OB trucks can be a dealbreaker. For sports venues, this can also be a big motivator but also allow more flexibility in distributing the solution rather than relying on a massive central interconnect with a 1100×1100 SDI router in a central CTA.
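
The talk’s exact working for the 4000 figure is not given here, but the order of magnitude is easy to check. The sketch below assumes each signal needs about 3 Gb/s (a nominal figure for a 1080p stream, chosen for illustration) and that each full-duplex port carries that many signals in each direction.

```python
def streams_per_port(port_gbps: int, stream_gbps: int) -> int:
    """Whole streams that fit in one direction of a port."""
    return port_gbps // stream_gbps


def switch_capacity(ports: int, port_gbps: int, stream_gbps: int) -> int:
    """Streams the switch can accept (and, being full duplex, also emit)."""
    return ports * streams_per_port(port_gbps, stream_gbps)


# 144 ports of 100 Gb/s with an assumed 3 Gb/s per signal.
print(switch_capacity(ports=144, port_gbps=100, stream_gbps=3))
```

With these assumed numbers the result comfortably exceeds 4000 in each direction, consistent with the claim above.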

TV is nothing without audio and the benefits to audio in 2110 are non-trivial: with the audio split off from the video, we are no longer limited to dealing with just 16 channels per video, nor do we need to de-embed from a video frame any time we want to touch it.

Timing is an interesting benefit. I say this because, whilst PTP can end up being quite complex compared to black and burst, it has some big benefits. First off, it can live in the same cables as your data whereas black and burst requires a whole separate cable infrastructure. PTP also allows you to timestamp all essences which helps with lip-sync throughout your workflow.

John leads us through some examples of how this works for different areas finishing by summing up the relevant benefits such as scalability, multi-format, space efficient, and timing amongst others.

Watch now!
Download the slides
Speakers

John Mailhot
CTO, Networking & Infrastructure,
Imagine Communications

Video: The Basics of SMPTE ST 2110 in 60 Minutes

SMPTE ST 2110 is a growing suite of standards detailing uncompressed media transport over networks. Now at 8 documents, it’s far more than just ‘video over IP’. This talk looks at the new ways that video can be transported, dealing with PTP timing, creating ‘SDPs’ and is a thorough look at all the documents.

Building on this talk from Ed Calverly which explains how we can use networks to carry uncompressed video, Wes Simpson goes through all the parts of the ST 2110 suite explaining how they work and interoperate as part of the IP Showcase at NAB 2019.

Wes starts by highlighting the new parts of 2110, namely the overview document which gives a high level overview of all the standard docs, the addition of compressed bit-rate video carriage and the recommended practice document for splitting a single video and sending it over multiple links; both of which are detailed later in the talk.

SMPTE ST 2110 is fundamentally different, as highlighted next, in that it splits up all the separate parts of the signal (i.e. video, audio and metadata) so they can be transferred and processed separately. This is a great advantage in terms of reading metadata without having to ingest large amounts of video meaning that the networking and processing requirements are much lighter than they would otherwise be. However, when essences are separated, putting them back together without any synchronisation issues is tricky.

ST 2110-10 deals with timing and knowing which packets of one essence are associated with packets of another essence at any particular point in time. It does this with PTP, which is detailed in IEEE 1588 and also in SMPTE ST 2059-2. Two standards are needed to make this work because the IEEE defined how to derive and carry timing over the network, SMPTE then detailed how to match the PTP times to phases of media. Wes highlights that care needs to be used when using PTP and AES67 as the audio standard requires specific timing parameters.
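
Matching PTP time to the phase of media can be pictured as follows: if frames are treated as aligned to the PTP epoch, any device can work out from the current PTP time alone when the next frame boundary falls. The sketch below is a simplified illustration of that idea rather than the ST 2059 procedure itself, and the function name is hypothetical. Exact fractions avoid rounding drift at rates such as 30000/1001.

```python
import math
from fractions import Fraction

NS_PER_S = 1_000_000_000


def next_alignment_ns(ptp_ns: int, frame_rate: Fraction) -> int:
    """PTP time (ns since epoch) of the next frame boundary after ptp_ns."""
    frames_elapsed = math.floor(Fraction(ptp_ns, NS_PER_S) * frame_rate)
    next_frame_start_s = (frames_elapsed + 1) / frame_rate
    return math.ceil(next_frame_start_s * NS_PER_S)


# 25 fps, 1.01 s after the epoch: the next boundary is frame 26 at 1.04 s.
print(next_alignment_ns(1_010_000_000, Fraction(25)))
```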

The next section moves into the video portion of 2110, dealing with video encapsulation on the network, pixel grouping and the headers needed for the packets. Wes then spends some time walking us through calculating the bitrate of a stream. Whilst for most people using a look-up table of standard formats would suffice, understanding how to calculate the throughput helps develop a very good understanding of the way 2110 is carried on the wire as you have to take note not only of the video itself (4:2:2 10 bit, for instance) but also the pixel groupings, UDP, RTP and IP headers.
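
A simplified version of that calculation is sketched below for 4:2:2 10-bit video, where a pixel group of 5 bytes carries 2 pixels. The packetisation is an assumption made for illustration: each packet carries 480 pixels from a single line and 48 bytes of IP, UDP, RTP and payload headers, and Ethernet framing is left out. It is not the worked example from the talk.

```python
import math

IP_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12
PAYLOAD_HEADER = 2 + 6  # extended sequence number plus one line header (assumed one line per packet)
HEADER_BYTES = IP_HEADER + UDP_HEADER + RTP_HEADER + PAYLOAD_HEADER


def active_video_bps(width: int, height: int, fps: float,
                     bytes_per_pgroup: int = 5, pixels_per_pgroup: int = 2) -> float:
    """Bit rate of the pixel data alone."""
    bits_per_pixel = bytes_per_pgroup * 8 / pixels_per_pgroup
    return width * height * bits_per_pixel * fps


def wire_bps(width: int, height: int, fps: float, pixels_per_packet: int = 480,
             bytes_per_pgroup: int = 5, pixels_per_pgroup: int = 2) -> float:
    """Bit rate including per-packet IP, UDP, RTP and payload headers."""
    line_payload = width * bytes_per_pgroup / pixels_per_pgroup
    packets_per_line = math.ceil(width / pixels_per_packet)
    bytes_per_line = line_payload + packets_per_line * HEADER_BYTES
    return height * bytes_per_line * 8 * fps


fps = 60000 / 1001
active = active_video_bps(1920, 1080, fps)
wire = wire_bps(1920, 1080, fps)
print(f"active {active / 1e9:.3f} Gb/s, with headers {wire / 1e9:.3f} Gb/s")
```

Changing `pixels_per_packet` shows how the header overhead grows as packets get smaller.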

Timing of packets on the wire isn’t anything new as it is also important for compressed applications, but it is of similar importance here to ensure that packets are sent properly paced on the wire. This is to say that if you need to send 10 packets, you send them one at a time with equal time between them, not all at once right next to each other. Such ‘micro bursting’ can cause problems not only for the receiver which then needs to use more buffers, but also when mixed with other streams on the network it can affect the efficiency of the routers and switches leading to jitter and possibly dropped packets. 2110-21 sets standards to govern the timing of network pacing for all of the 2110 suite.
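
The difference between paced and bursty sending in the 10-packet example can be sketched as two send schedules. This only illustrates even spacing; it does not implement the sender models that 2110-21 defines. The function names, the 1 ms window and the 1 microsecond per-packet wire time are illustrative assumptions.

```python
from typing import List


def paced_offsets_ns(packet_count: int, window_ns: int) -> List[int]:
    """Send times spread evenly across the available window."""
    return [i * window_ns // packet_count for i in range(packet_count)]


def burst_offsets_ns(packet_count: int, wire_time_ns: int) -> List[int]:
    """Send times when packets go out back to back as fast as the link allows."""
    return [i * wire_time_ns for i in range(packet_count)]


paced = paced_offsets_ns(10, 1_000_000)   # 10 packets across a 1 ms window
burst = burst_offsets_ns(10, 1_000)       # same packets, back to back

print("paced gap:", paced[1] - paced[0], "ns; burst gap:", burst[1] - burst[0], "ns")
print("burst finishes after", burst[-1], "ns, then the link is idle for the rest of the window")
```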

Referring back to his warning earlier regarding timing and AES67, Wes now goes into detail on the 2110-30 standard which describes the use of audio for these uncompressed workflows. He explains how the sample rates and packet times relate to the ability to carry multiple audios with some configurations allowing 64 audios in one stream rather than the typical 8.
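
The relationship between packet time and channel count can be sketched with a little arithmetic: a shorter packet time means fewer samples per channel in each packet, which leaves room for more channels in the same payload. The 1 ms and 125 microsecond packet times and 24-bit samples used below are assumptions for illustration, and the function names are hypothetical.

```python
def samples_per_packet(sample_rate_hz: int, packet_time_us: int) -> int:
    """Samples of each channel carried in one packet."""
    return sample_rate_hz * packet_time_us // 1_000_000


def payload_bytes(channels: int, sample_rate_hz: int, packet_time_us: int,
                  bytes_per_sample: int = 3) -> int:
    """Audio payload size of one packet (24-bit samples by default)."""
    return channels * samples_per_packet(sample_rate_hz, packet_time_us) * bytes_per_sample


# 8 channels at 1 ms versus 64 channels at 125 microseconds, both at 48 kHz.
print(payload_bytes(8, 48_000, 1_000), payload_bytes(64, 48_000, 125))
```

Both configurations produce the same payload size, which is why the shorter packet time can carry eight times as many channels without larger packets.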

‘Essences’, rather than media, is a word often heard when talking about 2110. This is an acknowledgement that metadata is just as important as the media described in 2110. It’s sent separately as described by 2110-40. Wes explains the way captions/subtitles, ad triggers, timecode and more can be encapsulated in the stream as ancillary ‘ANC’ packets.

2110-22 is an exciting new addition as this enables the use of compressed video such as VC-2 and JPEG-XS which are ultra low latency codecs allowing the video stream to be reduced by half, a quarter or more. As described in this talk the ability to create workflows on a single IP infrastructure seamlessly moving into and out of compressed video is allowing remote production across countries allowing for equipment to be centralised with people and control surfaces elsewhere.

Noted as ‘forthcoming’ by Wes, but having since been published, is RP 2110-23 which adds back in a feature that was lost when migrating from 2022-6 into 2110 – the ability to send a UHD feed as 4x HD feeds. This can be useful to allow for UHD to be used as a production format but for multiviewers to only need to work in HD mode for monitoring. Wes explains the different modes available. The talk finishes by looking at RTP timestamps and SDPs.

Watch now!
The slides for this talk are available here
Speakers

Wes Simpson
President,
Telecom Product Consulting