Video: Next Generation TV Audio

Often not discussed, audio is essential to television and film so as the pixels get better, so should the sound. All aspects of audio are moving forward with more processing power at the receiver, better compression at the sender and a seismic shift in how audio is handled, even in the consumer domain. It’s fair to say that Dolby have been busy.

Larry Schindel from Linear Acoustic is here thanks to the SBE to bring us up to date on what’s normally called ‘Next Generation Audio’ (NGA). He starts from the basics looking at how audio has been traditionally delivered by channels. Stereo sound is delivered as two channels, one for each speaker, with the sound engineer choosing how the audio is split between them. With the move to 5.1 and beyond, this continued with the delivery of 6, 8 or even more channels of audio. The trouble is this was always fixed at the time it went through the sound suite. Mixing sound into channels makes assumptions about the layout of your speakers. Sometimes it’s not possible to put your speakers in the ideal position and your sound suffers.

Dolby Atmos has heralded a mainstream move to object-based audio where sounds are delivered with information about their position in the sound field as opposed to the traditional channel approach. Object-based audio leaves the downmixing to the receiver which can be set to take into account its unique room and speaker layout. It represents a change in thinking about audio, a move from thinking about the outputs to the inputs. Larry introduces Dolby Atmos and details the ways it can be delivered and highlights that it can work in a channel or object mode.
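To make the channel versus object distinction concrete, here is a minimal sketch of a receiver-side renderer. It is an illustration only and not how Dolby Atmos is implemented: the function name `render_gains`, the flat ring of speakers described by azimuth angles, and the constant-power panning rule are all assumptions chosen for simplicity. The point is that the same object position produces different speaker gains depending on the layout the receiver actually has.

```python
import math

def render_gains(object_azimuth, speaker_azimuths):
    """Return one gain per speaker for a sound object at object_azimuth (degrees).

    Hypothetical renderer: constant-power panning between the two speakers
    either side of the object, on a flat ring of speakers.
    """
    if len(speaker_azimuths) < 2:
        raise ValueError("need at least two speakers")
    # Sort speakers around the circle, remembering their original positions.
    order = sorted(range(len(speaker_azimuths)),
                   key=lambda i: speaker_azimuths[i] % 360)
    angles = [speaker_azimuths[i] % 360 for i in order]
    target = object_azimuth % 360
    gains = [0.0] * len(speaker_azimuths)
    for k in range(len(angles)):
        nxt = (k + 1) % len(angles)
        span = (angles[nxt] - angles[k]) % 360 or 360
        pos = (target - angles[k]) % 360
        if pos <= span:
            # Object lies between speaker k and the next one round the ring.
            frac = pos / span
            gains[order[k]] = math.cos(frac * math.pi / 2)
            gains[order[nxt]] = math.sin(frac * math.pi / 2)
            break
    return gains

# The same object, straight ahead, rendered for two different rooms.
print(render_gains(0, [-30, 30]))                 # stereo pair
print(render_gains(0, [-30, 30, 0, -110, 110]))   # layout with a centre speaker
```

With only a stereo pair the object is shared equally between left and right, whereas a layout with a centre speaker sends it to that speaker alone. A channel-based mix would have baked one of those decisions in at the sound suite.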

Larry then looks at where you can get media with Dolby Atmos. Cinemas are an obvious starting point, but there is a long list of streaming and pay-TV services which use it, too. Larry talks about the upcoming high-profile events which will be covered in Dolby Atmos showing that delivering this enhanced experience is something being taken seriously by broadcasters across the board.

Consumers still have the problem of getting the audio in the right place in their awkward, often small, rooms. Larry looks at some of the options for getting great audio in the home, which include soundbars and speakers which bounce sound off the ceiling.

One of the key technologies for delivering Dolby Atmos is Dolby AC-4, the improved audio codec taking compression a step further from AC-3. We see that data rates have tumbled: for example, 5.1 surround on AC-3 would be 448 kbps, but can now be done in 144 kbps with AC-4. Naturally, it supports channel and object modes and Larry explains how it can deliver a base mix with other audio elements over the top for the decoder to place, allowing better customisation. This can include other languages or audio description/video description services. Importantly, AC-4, like Dolby E, can be sent so that it doesn’t overlap video frames, allowing it to accompany routed audio. Without this awareness of video, any time a video switch was made, the audio would become corrupted and there would be a click.

Dolby Atmos and AC-4 stand on their own and are widely applicable to much of the broadcast chain. Larry finishes this presentation by mentioning that Dolby AC-4 will be the audio of choice for ATSC 3.0. We’ve covered ATSC 3.0 extensively here at The Broadcast Knowledge so if you want more detail than there is in this section of the presentation, do dig in further.

Watch now!

Speaker

Larry Schindel
Senior Product Manager,
Linear Acoustic

Video: Introduction to IPMX

The Broadcast Knowledge has documented over 100 videos and webinars on SMPTE ST 2110. It’s a great suite of standards but it’s not always simple to implement. For smaller systems, many of the complications and nuances don’t occur so a lot of the deeper dives into ST 2110 and its associated specifications such as NMOS from AMWA focus on the work done in large systems in tier-1 broadcasters such as the BBC, tpc and FIS Skiing for SVT.

ProAV, the professional end of the AV market, is a different market. Very few companies have a large AV department, if they have one at all. So the ProAV market needs technologies which are much more ‘plug and play’, particularly those in the events side of the market. To date, the ProAV market has been successful in adopting IP technology with quick deployments by using heavily proprietary solutions like ZeeVee, SDVoE and NDI to name a few. These achieve interoperability by having the same software or hardware in each and every implementation.

IPMX aims to change this by bringing together a mix of standards and open specifications: SMPTE ST 2110, NMOS specs and AES. Any individual or company can gain access and develop a service or product to meet them.

Andreas gives a brief history of IP to date, outlining AES67, ST 2110, ST 2059 and the IS specifications, his point being that the work is not yet done. ProAV has needs beyond, though complementary to, those of broadcast.

AES67 is already the answer to a previous interoperability challenge, explains Andreas, as the world of audio over IP was once a purely federated world of proprietary standards which had no, or limited, interoperability. AES67 defined a way to allow these standards to interoperate and has now become the main way audio is moved in SMPTE 2110 under ST 2110-30 (2110-31 allows for AES3). Andreas explains the basics of 2110, AES, as well as the NMOS specifications. He then shows how they fit together in a layered design.

Andreas brings the talk to a close by looking at some of the extensions that are needed. He highlights the ability to be more flexible with the quality-bandwidth-latency trade-off. Some ProAV applications require pixel perfection, but some are dictated by lower bandwidth. The current ecosystem, even if you include ST 2110-22’s ability to carry JPEG XS instead of uncompressed video, allows only very coarse control of this. HDMI, naturally, is of great importance for ProAV, not only because so many HDMI interfaces are in play but also because of the wide variety of resolutions and framerates that are found outside of broadcast. Work is ongoing to enable HDCP to be carried, suitably encrypted, in these systems. Finally, there is a plan to specify a way to reduce the highly strict PTP requirements.

Watch now!
Speaker

Andreas Hildebrand
Evangelist,
ALC NetworX

Video: The Five Ws of 5G

Following on from last week’s deep dive below the hype of 5G, this shorter talk looks at both the promise and the implementation challenges of a technology which promises so much to so many different walks of life.

Michael Heiss takes the stage and starts a short history lesson with 1G (an analogue technology) and shows how it stepped up through 2G, A.K.A. GSM, and moved into 4G, LTE and now 5G. Michael’s hypothesis is that this is the fourth industrial revolution. The first, he proposes, is what we know as the Industrial Revolution, which started with harnessing steam power. But until the invention of electricity, you had to be close to your power source. Electricity was the game-changer in enabling people, albeit with the relevant and long wires, to have the machines abstracted from the power generation. Similarly, while data and computing have transformed our world in the past 5 decades or more, Michael says 5G is the technology which will give us that same abstraction: just as electricity let people work remotely from power production, 5G promises to allow people in general to not have to be next to a computer (where the data is). Michael outlines the ability of higher speeds and lower latency to enable new use-cases. He outlines consumer applications, medical use cases, and business uses.

As with any new technology, there is always a battle for dominance, so Michael goes through some of the different words and phrases in use and explains what they mean. If you see “NR”, that stands for New Radio and comes from 3GPP. There are a number of frequency bands which 5G can occupy which Michael introduces. The current bands for 2G and 3G between 700 and 1400 MHz can be used. There are also a number of new frequencies up to and including some C-band frequencies which are in use. These are known, collectively, by some as the ‘sub 6’ frequencies to differentiate them from the millimetre-wave (mm-wave) frequencies which have been opened up starting at 24 GHz up to 47 GHz.

It’s an inconvenient truth of physics that higher frequency RF is more highly attenuated in general. This means that the mm-wave frequencies, being so high, are actually only effective with almost direct ‘line of sight’ to the device. They can’t penetrate walls or windows. 5G will need many more cell sites outdoors thanks to the higher sub 6 frequencies, but to use mm-wave, telcos will be restricted to line-of-sight transmitter-to-transmitter links or deploying highly local micro or femtocells on lamp posts (light poles) or ceiling mounted internal relays. Michael finishes his talk discussing these implementation difficulties.
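As a rough worked example of how frequency affects attenuation, the sketch below uses the standard free-space path loss formula. This is only one part of the picture, as it says nothing about walls, windows or foliage, and the three frequencies and the 100 m distance are picked purely for illustration rather than taken from the talk.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def free_space_path_loss_db(distance_m, frequency_hz):
    """Free-space path loss in dB between isotropic antennas."""
    return 20 * math.log10(4 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT)

# Illustrative frequencies: a low band, a C-band example and a mm-wave example.
for label, freq in [("700 MHz", 700e6), ("3.5 GHz", 3.5e9), ("28 GHz", 28e9)]:
    loss = free_space_path_loss_db(100, freq)
    print(f"{label}: {loss:.1f} dB over 100 m")
```

Every tenfold increase in frequency adds 20 dB of loss over the same distance, before any obstruction is considered, which is part of why the higher bands need more, and closer, cell sites.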

Watch now!
Speakers

Michael Heiss
Principal Consultant
M. Heiss Consulting

Video: AES67 & SMPTE ST 2110 Timing and Synchronization

Good timing is essential in production for AES67 audio and SMPTE ST 2110. Delivering timing is no longer a matter of delivering a signal throughout your facility: over IP, timing is bidirectional and forms a system which should be monitored and managed. Timing distribution has always needed design and architecture, but the detail and understanding needed are now much greater. At the beginning of this talk, Andreas Hildebrand explains why we need to bother with such complexity; after all, we got along very well for many years without it! Non-IP timing signals are distributed on their own cables as part of their own system. There are some parts of the chain which can get away without timing signals, but when they are needed, they are on a separate cable. With IP, having a separate network for distribution of timing doesn’t make sense so, whether you have an analogue or digital timing signal, that needs to move into the IP domain. But how much accuracy in timing do you need? Network devices already widely use NTP which can achieve an accuracy of less than a millisecond. Andreas explains that this isn’t enough for professional audio. At 48 kHz, AES samples happen at an accuracy of plus or minus 10 microseconds with 192 kHz going down to 2.5 microseconds. As your timing signal has to be more accurate than the accuracy you need, this means we need to achieve nanosecond precision.
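The figures quoted for 48 kHz and 192 kHz can be sanity-checked with a little arithmetic. The sketch below assumes that the ‘plus or minus’ tolerance corresponds to roughly half a sample period, which is our reading of the numbers rather than something stated in the talk.

```python
def sample_period_us(sample_rate_hz):
    """Duration of one audio sample in microseconds."""
    return 1_000_000 / sample_rate_hz

for rate in (48_000, 96_000, 192_000):
    period = sample_period_us(rate)
    # Assumption: the usable tolerance is about half a sample either way.
    print(f"{rate} Hz: period {period:.2f} us, half period {period / 2:.2f} us")
```

Since NTP’s sub-millisecond accuracy is still far coarser than a single sample period at any of these rates, it is clear why something much more precise is needed.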

Daniel Boldt from timing specialists Meinberg is the focus of this talk, explaining how we achieve this nanosecond precision. Enter PTP, the Precision Time Protocol. This is a cross-industry standard from the IEEE used in telecoms, power, finance and many other fields, wherever a network and its devices need to understand the time. It’s not a static standard, Daniel explains, and it’s just about to see its third revision which, like the last, adds features.

Before finding out about the latest changes, Daniel explains how PTP works in the first place; how is it possible to accurately derive time down to the nanosecond over a network which will have variable propagation times? We see how timestamps are introduced into the network interface controller (NIC) at the last moment, allowing the timestamps to be created in hardware which removes some of the variable delays that are typical in software. This happens, Daniel shows, in the switch as well as in the server network cards. This article will refer to either a primary clock or a grandmaster. Daniel steps us through the messages exchanged between the primary and secondary clock which is the interaction at the heart of the protocol. The key is that after the primary has sent a timestamp, the secondary sends its own timestamped message to the primary, which replies with the time at which it received the secondary’s message. The secondary ends up with 4 timestamps that it can combine to determine its offset from the primary’s time and the delay in receiving messages. Applying this information allows it to correct the clock very accurately.

PTP Primary-Secondary Message Exchange.
Source: Meinberg
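The way the four timestamps combine can be sketched in a few lines. The names `t1` to `t4` follow common PTP convention (primary sends, secondary receives, secondary sends, primary receives), and the calculation assumes the network delay is the same in both directions, which is the assumption PTP itself relies on. The example numbers are invented.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Combine the four timestamps of one PTP exchange (all in nanoseconds).

    t1: primary sends its message        (primary's clock)
    t2: secondary receives that message  (secondary's clock)
    t3: secondary sends its request      (secondary's clock)
    t4: primary receives the request     (primary's clock)
    """
    forward = t2 - t1   # path delay plus the secondary's offset
    reverse = t4 - t3   # path delay minus the secondary's offset
    delay = (forward + reverse) / 2
    offset = (forward - reverse) / 2
    return offset, delay

# Invented example: the secondary runs 1500 ns ahead, the path takes 400 ns.
t1 = 1_000_000
t2 = t1 + 400 + 1500        # arrival as read on the secondary's clock
t3 = t2 + 10_000            # secondary replies a little later
t4 = (t3 - 1500) + 400      # arrival as read on the primary's clock
offset, delay = ptp_offset_and_delay(t1, t2, t3, t4)
print(offset, delay)        # 1500.0 400.0
```

The secondary then subtracts the offset from its own clock. If the path is not symmetric, the asymmetry shows up directly as an error in the offset, which is why hardware timestamping and PTP-aware switches matter.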

Most broadcasters would prefer to have more than one grandmaster clock but if there are multiple clocks, how do you choose which to sync from? Timing systems have long used strata whereby clocks are rated based on accuracy, either for internal accuracy & stability or by what they are synced to. This is also true for PTP and is part of the considerations in the ‘Best Master Clock Algorithm’. The BMCA starts by allowing a time source to assess its own accuracy and then search for better options on the network. Clocks announce themselves to the network and, by listening to other announcements, a clock can decide if it should become a primary clock if, for instance, it hears no announce messages at all. Devices which should never be a grandmaster can be configured so that they never decide to become one. This is a requisite for audio devices participating in ST 2110-3x.
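A much-simplified sketch of this decision is shown below. It compares the attributes carried in announce messages in the order used by IEEE 1588 (priority1, clock class, accuracy, variance, priority2, then identity as a tie-break, lower winning each time) but leaves out the topology-related steps of the real algorithm. The class and function names and the example values are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Announce:
    """Subset of the clock attributes a PTP clock announces about itself."""
    priority1: int
    clock_class: int
    clock_accuracy: int
    variance: int
    priority2: int
    clock_identity: str

    def rank(self):
        # Lower is better at every step; identity is the final tie-break.
        return (self.priority1, self.clock_class, self.clock_accuracy,
                self.variance, self.priority2, self.clock_identity)

def best_master(candidates):
    """Pick the best clock from a list of announcements."""
    return min(candidates, key=Announce.rank)

def should_become_master(own, heard, slave_only=False):
    """Decide whether this clock should take over as primary."""
    if slave_only:
        return False  # e.g. an audio endpoint that must never be grandmaster
    return best_master([own, *heard]) == own

# Illustrative values: a GPS-locked clock versus a free-running one.
gps_locked = Announce(128, 6, 0x21, 0x4E5D, 128, "00:00:00:00:00:01")
free_running = Announce(128, 248, 0xFE, 0xFFFF, 128, "00:00:00:00:00:02")

print(best_master([free_running, gps_locked]).clock_identity)
print(should_become_master(free_running, heard=[]))            # hears nobody
print(should_become_master(free_running, heard=[gps_locked]))  # better clock exists
```

A clock that hears no announcements wins by default, which matches the behaviour described above, and the `slave_only` flag stands in for the configuration that keeps endpoints out of the election.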

Passing PTP around the network takes some care and is most easily done by using switches which understand PTP. These switches either run a ‘boundary clock’ or are ‘transparent clocks’. Daniel explores both of these scenarios, explaining how the boundary clock switch is able to run multiple primary and secondary clocks depending on what is connected on each interface. We also see what work the switches have to do behind the scenes to maintain timing precision in transparent mode. Daniel summarises boundary clocks as being good for hierarchical systems and scaling well but requiring continuous monitoring, whereas transparent clocks are simpler to deploy and require minimal monitoring. The main issue with transparent clocks is that they don’t scale well as all your timing messages are still going back to one main clock which could get overwhelmed.
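The work a transparent clock does can be illustrated with a toy simulation. In PTP, a transparent clock measures how long each timing message spends inside the switch (its residence time) and adds that to a correction field in the message, so the receiver can subtract the queuing delay. The function names, the 50 ns cable delay and the queuing figures below are all hypothetical, and every timestamp is read from a single ideal timeline to keep clock offsets out of the picture.

```python
CABLE_NS = 50  # hypothetical cable delay per hop

def add_residence_time(correction_ns, ingress_ns, egress_ns):
    """What a transparent clock does to a passing PTP event message."""
    return correction_ns + (egress_ns - ingress_ns)

def corrected_transit(sent_ns, received_ns, correction_ns):
    """Transit time as seen by the receiver once switch delays are removed."""
    return (received_ns - sent_ns) - correction_ns

def simulate(queueing_per_switch_ns):
    """Send one message through a chain of transparent-clock switches."""
    now = 0
    correction = 0
    for queueing in queueing_per_switch_ns:
        ingress = now + CABLE_NS
        egress = ingress + queueing
        correction = add_residence_time(correction, ingress, egress)
        now = egress
    arrival = now + CABLE_NS
    return corrected_transit(0, arrival, correction)

print(simulate([200, 300]))        # lightly loaded switches
print(simulate([5_000, 12_000]))   # heavily loaded switches
```

Both runs report the same corrected transit time, equal to the three cable hops, even though the second message spent far longer queued in the switches.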

SMPTE 2022-7 has been a very successful standard as its reliance only on RTP has allowed it to be widely applicable to compressed and uncompressed IP flows. It is often used in 2110 networks, too, where two separate networks are run and brought together at the receiving device. That device, on a packet-by-packet basis, is free to derive its audio/video stream from either network. This requires, however, exactly the same timing on both networks so Daniel looks at an example diagram where this PTP sharing is shown.
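The packet-by-packet choice can be sketched as a merge keyed on the RTP sequence number. This is a deliberately simplified, offline illustration: a real receiver works on live streams within a bounded buffer and has to handle the 16-bit sequence number wrapping round, neither of which is attempted here. The function name and the packet representation as `(sequence, payload)` tuples are assumptions.

```python
def seamless_merge(path_a, path_b):
    """Rebuild one stream from two copies, taking each packet from either path.

    Each path is a list of (sequence_number, payload) tuples, possibly with gaps.
    """
    seen = set()
    merged = []
    for seq, payload in sorted(path_a + path_b, key=lambda packet: packet[0]):
        if seq not in seen:          # first copy wins, the duplicate is dropped
            seen.add(seq)
            merged.append((seq, payload))
    return merged

# Each network has lost a different packet; together they are complete.
network_a = [(1, "a"), (2, "b"), (4, "d")]
network_b = [(1, "a"), (3, "c"), (4, "d")]
print(seamless_merge(network_a, network_b))
```

The merge only works if packets carrying the same sequence number on both networks really are the same media at the same time, which is why both networks need identical timing.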

PTP’s still evolving and in this next section, Daniel takes us through some of the coming improvements which are also outlined at Meinberg’s blog. These are profile isolation, multi-domain clocks, security improvements and more.

Andreas takes the final section of the webinar to explain how we use PTP in media networks. All receivers will have the same clock, which could be derived from GPS, removing the need to distribute PTP between sites. 2110 is based on RTP which requires a timestamp to be added to every packet delivered to the network. RTP wraps each packet’s media payload in a header which includes a timestamp which can be derived from the media clock counter.
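As a minimal sketch of how a media clock counter yields an RTP timestamp, the function below counts media clock ticks since the PTP epoch and keeps the lowest 32 bits, since the RTP timestamp field is 32 bits wide. The function name is hypothetical, and 48 kHz and 90 kHz are used as typical audio and video media clock rates.

```python
def rtp_timestamp(ptp_time_ns, media_clock_hz):
    """RTP timestamp for a given PTP time, as a 32-bit media clock count."""
    ticks = ptp_time_ns * media_clock_hz // 1_000_000_000
    return ticks % 2**32  # the RTP timestamp field wraps at 32 bits

one_second = 1_000_000_000
print(rtp_timestamp(one_second, 48_000))   # 48000 audio ticks
print(rtp_timestamp(one_second, 90_000))   # 90000 video ticks
```

Because every sender derives its count from the same PTP time, two devices stamping media at the same instant arrive at the same value without talking to each other. At 48 kHz the counter wraps roughly once a day, so receivers need to cope with that.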

Andreas looks at how accurate RTP delivery is achieved, dealing with offset values, populating the timestamp from the PTP clock for real-time streams and he explains how the playout delay is calculated from the link offset. Finally, he shows the relatively simple process of synchronisation at the playout device. With all the timestamps in the system, synchronising playback of audio, video and metadata using buffers can be achieved fairly easily. Unfortunately, timestamps are easily destroyed by secondary processing (for instance loudness adjustment for an audio stream). Clearly, if this happened, synchronisation at the receiver would be broken. Whilst this will be addressed by out-of-band messaging in future standards, for now, this is managed by a broadcast controller which can take delay information from processing stages and distribute this to receivers.
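The idea of aligning playout using timestamps and a link offset can be sketched as follows. This is an assumption-laden illustration rather than the method shown in the webinar: the function name is hypothetical, the link offset is treated as a fixed delay in milliseconds agreed for the stream, and the receiver is assumed to hold each packet in a buffer until its own media clock reaches the computed tick.

```python
def playout_tick(rtp_ts, link_offset_ms, media_clock_hz):
    """Media clock value at which a packet's first sample should be played."""
    offset_ticks = link_offset_ms * media_clock_hz // 1000
    return (rtp_ts + offset_ticks) % 2**32  # stay within the 32-bit RTP range

# Two receivers given the same packet and the same link offset.
packet_ts = 48_000
receiver_a = playout_tick(packet_ts, link_offset_ms=1, media_clock_hz=48_000)
receiver_b = playout_tick(packet_ts, link_offset_ms=1, media_clock_hz=48_000)
print(receiver_a, receiver_b)  # 48048 48048
```

As both receivers share the same PTP-derived media clock, they play the sample at the same moment regardless of how long the packet took to reach each of them, provided the link offset is longer than the worst network delay.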

Watch now!
Speakers

Daniel Boldt
Head of Software Development,
Meinberg
Andreas Hildebrand
RAVENNA Technology Evangelist,
ALC NetworX