Video: Get it in the mixer! Achieving better audio immersiveness


Immersive audio is pretty much standard for premium sports coverage and can take many forms. Typically, immersive audio is explained as ‘better than surround sound’ and is often delivered to the listener as object audio such as AC-4. Delivering audio as objects allows the listener’s decoder to place the sounds appropriately for their specific setup, whether they have 3 speakers, 7, a ceiling-bounce array or just headphones. This video looks at how these objects can be carefully manipulated to maximise the immersiveness of the experience, and the video itself is available in a binaural version.

This conversation from SVG Europe, hosted by Roger Charlesworth, brings together three academics who are applying their research to live, on-air sports. First to speak is Hyunkook Lee, who discusses how to capture 360-degree sound fields using microphone arrays. To capture audio from all around, we need to use multiple microphones but, as Hyunkook explains, any difference in location between microphones leads to a time difference, and hence a phase difference, in the audio they capture. This delay between two microphones gives us the spatial sound of the audio, just as the spacing of our ears helps us understand the soundscape. This effect can be considered separately in the horizontal and vertical domains, and it’s in the horizontal domain that these delay cues matter.
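The delay cue at work here is the same one our ears use: the time-of-arrival difference between two spaced receivers. A minimal sketch of the standard far-field approximation (the spacing and angle values below are illustrative, not figures from the talk):

```python
import math

def time_difference(spacing_m: float, angle_deg: float, c: float = 343.0) -> float:
    """Approximate arrival-time difference (seconds) between two receivers
    separated by spacing_m for a distant source at angle_deg off-centre,
    using the simple far-field model: delta_t = d * sin(theta) / c."""
    return spacing_m * math.sin(math.radians(angle_deg)) / c

# A source 30 degrees off-axis of an ear-spaced pair (~17 cm apart)
itd = time_difference(0.17, 30.0)
print(f"{itd * 1e6:.0f} us")  # a fraction of a millisecond
```

Even these sub-millisecond differences are enough for the brain, or a spaced microphone pair, to localise a source.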

Talking about vertical separation, Hyunkook discusses the ‘pitch-height’ effect, whereby our perception of a sound’s height is driven by its pitch rather than by any delays between different sound sources. Modulating the amplitude, however, can be effective. Now, when bringing together into one mix multiple versions of the same audio which have been slightly delayed, the result is comb filtering of the audio. As such, a high-level microphone used to capture ambience can colour the overall audio. Hyunkook shows that this colouration can be mitigated by reducing the upper microphone’s signal by 7dB, which can also be achieved by angling the microphone upwards. He finishes by playing binaural recordings made with his microphone arrays.
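Comb filtering arises because a delayed copy alternately reinforces and cancels the direct signal, with notches at regular frequency intervals. A small sketch of the frequency response of “signal plus delayed copy”, showing how attenuating the copy by 7dB (the figure from the talk; the 1ms delay is an assumption for illustration) makes the notches far shallower:

```python
import numpy as np

def comb_response_db(delay_s: float, atten_db: float, freqs_hz: np.ndarray) -> np.ndarray:
    """Magnitude response (dB) of a signal summed with a delayed,
    attenuated copy of itself: |1 + a * exp(-j*2*pi*f*tau)|."""
    a = 10 ** (-atten_db / 20)
    h = 1 + a * np.exp(-2j * np.pi * freqs_hz * delay_s)
    return 20 * np.log10(np.abs(h))

freqs = np.linspace(20, 2000, 1000)
# Equal-level copies: deep notches at odd multiples of 1/(2*tau), here 500 Hz
deep = comb_response_db(1e-3, 0.0, freqs)
# Copy attenuated by 7 dB: the notches are only around 5 dB deep
shallow = comb_response_db(1e-3, 7.0, freqs)
print(f"worst dip, equal level: {deep.min():.1f} dB")
print(f"worst dip, -7 dB copy:  {shallow.min():.1f} dB")
```

The worst-case cancellation drops from tens of dB to roughly 5dB, which is why pulling the ambience mic down relative to the main array tames the colouration.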

Second up is Ben Shirley, who talks about supporting the sound supervisor’s mix with AI. Ben highlights that a sound supervisor will not just be in charge of the main programme mix, but also the comms system. As such, if that breaks – which could endanger the wider production – their attention will have to go to that rather than to mixing. Whilst this may not be so much of an issue with simpler games, producing high-end mixes with object audio is a very skilled job which requires constant attention. Naturally, the more immersive an experience is, the more obvious it is when mistakes happen. The solution created by Ben’s company is to use AI to create a pitch-effects mix which can act as a sustaining feed, covering moments when the sound supervisor can’t give the attention needed, but also allowing them more flexibility to work on the finer points of the mix rather than ‘chasing the ball’.

The AI-trained system is able to create a constant-power mix of the on-pitch audio. By analysing the many microphones, it’s also able to detect ball kicks which aren’t close to any microphone and, indeed, may not be ‘heard’ by those mics at all. When it detects the vestiges of a ball kick, it can pull from a large range of ball-kick sounds and insert one on the fly in place of the real kick which wasn’t usefully recorded by any mic. This comes into its own, says Ben, when used with VR or 360-degree audio. Part of what makes immersive audio special is the promise of customising the sound to your needs. What does that mean? At its most basic, it means the decoder understands how many speakers you have and where they are, so it can create a downmix which correctly places the sounds for you. Ideally, you would also be able to add your own compression for situations where wide dynamic range isn’t a good thing, for instance listening at night at a ‘constant’ volume without waking the neighbours. Ben’s example is that in-stadium, people don’t want to hear the commentary as they don’t need to be told what to think about each shot.
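One common reading of “constant-power mix” is that the channel gains are normalised so total mix power stays steady as individual mic gains ride up and down. A sketch of that idea (this normalisation rule is an assumption for illustration, not Salsa Sound’s actual algorithm):

```python
import math

def constant_power_gains(raw_gains: list[float]) -> list[float]:
    """Scale a set of non-negative fader gains so the sum of squared
    gains (total mix power for uncorrelated sources) equals 1."""
    power = sum(g * g for g in raw_gains)
    if power == 0:
        return list(raw_gains)  # silence stays silence
    scale = 1 / math.sqrt(power)
    return [g * scale for g in raw_gains]

# Ride one pitch mic well above the others; total power is unchanged
quiet = constant_power_gains([1.0, 1.0, 1.0, 1.0])
loud = constant_power_gains([4.0, 1.0, 1.0, 1.0])
print(sum(g * g for g in quiet))
print(sum(g * g for g in loud))
```

However hard one fader is pushed, the overall level stays put, so the automated mix never pumps as the action moves between mics.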

Last in the order is Felix Krückels, who talks about his work in German football to deal with object audio in a more nuanced way, improving the overall mix using the tools and plugins already available. Felix starts by showing how the closeball/field-of-play mic contains a lot of the audio that the crowd mics contain; in fact, he says the closeball mic carries 90% of the crowd sound. When mixing that into stereo, and also 5.1, this spill in the closeball mic can colour the sound. Some stadia have dedicated left and right crowd mics. Felix then talks about personalisation, giving the example of watching in a pub: there is already plenty of local crowd noise, so a mix heavy with in-stadium crowd noise isn’t helpful. Much better, in that environment, to have clear commentary and ball effects with a lower-than-normal ambience. Felix plays a number of examples to show how using plugins to vary the delays can help produce the mixes needed.
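One reason delay plugins help with spill is simple time alignment: the same crowd sound arrives at the closeball mic and the distant crowd mics at different times, and summing the two without compensation smears (and comb-filters) the event. A toy sketch of whole-sample alignment before mixing (the 10ms offset and 48kHz rate are illustrative assumptions, not figures from the talk):

```python
def delay_samples(signal: list[float], n: int) -> list[float]:
    """Delay a mono signal by n whole samples, zero-padding the start
    and truncating the end so the length is preserved."""
    if n <= 0:
        return list(signal)
    return [0.0] * n + list(signal[:len(signal) - n])

fs = 48000
offset = int(0.010 * fs)  # crowd sound reaches the closeball mic ~10 ms early

crowd_mic = [0.0] * 1000
closeball = [0.0] * 1000
closeball[100] = 1.0           # crowd transient spilling into the closeball mic
crowd_mic[100 + offset] = 1.0  # the same event arriving later at the crowd mic

# Delay the closeball mic so the two copies of the event line up
aligned = delay_samples(closeball, offset)
print(aligned.index(1.0), crowd_mic.index(1.0))
```

Once the transients coincide, the two mics sum coherently instead of producing two arrivals of the same event; real plugins also offer fractional-sample delays for finer control.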

Watch now!
Binaural Version
Speakers

Felix Krückels
Audio Engineer,
Consultant and Academic
Hyunkook Lee
Director of the Centre for Audio and Psychoacoustic Engineering,
University of Huddersfield
Ben Shirley
Director and co-Founder at Salsa Sound and Senior Lecturer and researcher in audio technology,
University of Salford
Moderator: Roger Charlesworth
Independent consultant on media production technology

Video: Next-generation audio in the European market – The state of play

Next-generation audio refers to a range of new technologies which allow for immersive audio like 3D sound, increased accessibility, better personalisation and anything else which delivers a step-change in the listener experience. NGA technologies can stand on their own but are often part of next-generation broadcast technologies like ATSC 3.0 or UHD/8K transmissions.

This talk from the Sports Video Group and Dolby presents one of the few 2020 case studies of delivering NGA over the air to homes. First, though, Dolby’s Jason Power looks at what NGA is and brings us up to date on how it has been deployed so far.

Whilst ‘3D sound’ is an easy-to-understand feature, ‘increased personalisation’ is less so. Jason introduces ideas for personalisation such as choosing which team you’re interested in and getting a different crowd mix dependent on that, or changing your listening position to mics on the pitch or in the stands. The possibilities are vast, and we’re only just starting to experiment with what’s possible and determine what people actually want.

What can I do if I want to hear next-generation audio? Jason explains that four out of five TVs now ship with NGA audio and all of the top five manufacturers support at least one NGA technology. Among these technologies are Dolby’s AC-4 and S-ADM. AC-4 allows delivery of Dolby Atmos, an object-based audio format which gives the receiver much more freedom to render the sound correctly based on the current speaker setup. Should you change how many speakers you have, the decoder can render the sound differently to ensure the ‘stereo’ image remains correct.
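A toy illustration of why object audio survives speaker changes: the object carries its position as metadata, and the renderer derives gains for whatever layout exists. Here a simple constant-power pan places an object on a ±30-degree stereo pair (a stand-in for a real Atmos renderer, which handles full 3D layouts and is far more sophisticated):

```python
import math

def render_to_stereo(azimuth_deg: float) -> tuple[float, float]:
    """Constant-power pan of an object between a +/-30 degree stereo pair.
    -30 maps to full left, +30 to full right; gains satisfy L^2 + R^2 = 1."""
    clamped = max(-30.0, min(30.0, azimuth_deg))
    pan = (clamped + 30.0) / 60.0 * 90.0  # map azimuth to a 0..90 pan angle
    left = math.cos(math.radians(pan))
    right = math.sin(math.radians(pan))
    return left, right

l, r = render_to_stereo(0.0)  # a centred object lands equally on both speakers
print(round(l, 3), round(r, 3))
```

Swap the layout, swap only the rendering function: the object’s stored position is unchanged, which is exactly the freedom the decoder exploits when your speaker count changes.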

To find out more about the technologies behind NGA, take a look at this talk from the Telos Alliance.

Next, Matthieu Parmentier talks about the Roland Garros event in 2020, which was delivered using S-ADM plus Dolby AC-4. S-ADM (Serial ADM) is an open specification for metadata interchange, the aim of which is to help interoperability between vendors. The S-ADM metadata is embedded in the SDI and the audio is then transported uncompressed as SMPTE 302M.

ATEME’s Mickaël Raulet completes the picture by explaining their approach, which included setting up a full end-to-end system for testing and diagnosis. The event itself, we see, had three transmission paths: an SDR satellite backup and two feeds into the DVB-T2 transmitter at the Eiffel Tower.

The session ends with an extensive Q&A session where they discuss the challenges they faced and how they overcame them as well as how their businesses are changing.

Watch now!
Speakers

Jason Power
Senior Director of Commercial Partnerships & Standards,
Dolby
Mickaël Raulet
Vice President of Innovation,
ATEME
Matthieu Parmentier
Head of Data & Artificial Intelligence
France Télévisions
Moderator: Roger Charlesworth
Charlesworth Media