Video: Tweaking Error Correction Protocol Performance: A libRIST Deep Dive

There’s a false assumption that if you send video with one of the newer error-correcting protocols such as RIST or SRT, you just need to send the stream and it’ll get healed and everything will be good. But people often don’t consider what actually happens when things go wrong. To heal the stream, more data needs to be sent. Do you have enough headroom to cope with these resends? And if part of your circuit becomes temporarily saturated, how will the feed cope? The reality is that re-request storms could kill it permanently.

In this video from VidTrans21, Sergio Ammirata from SipRadius talks about how the error-correcting protocol within RIST works and how it’s been improved to cope even better in a crisis. Joined by Adi Rozenberg, he reminds us of the key points of RIST and libRIST. As a reminder, RIST is one of many protocols which allow the receiver to let the sender know which packets it has missed so that they can be resent. For a proper overview of RIST and SRT, have a look at this talk explaining RIST and SRT or the multitude of talks here on The Broadcast Knowledge on RIST or SRT. Today’s video is not so much about why people use RIST, but how to make it performant with difficult circuits.

libRIST is a free, open-source library which implements the RIST specification. The aim of libRIST is to allow companies to easily implement RIST within their own commercial and free programmes. Sergio points out that it’s an active project with over 675 commits in the last year, bringing RIST to many platforms including ARM, AWS, Darwin, iOS and Windows. It is now on version 0.2.0 and is soon to be in VLC 4.0 and FFmpeg 4.3.

To understand why getting error correction right is important, we can look at the effects of a simplistic implementation of the negative acknowledgement error recovery method. When the receiver doesn’t receive a packet, it sends back a request for a resend of that packet. The sender will send that and, hopefully, it will be received. Let’s imagine, though, that you’re in a data centre sending to someone on a 100Mbps leased line. If the incoming bitrate of your receiver’s internet connection started getting close to 100Mbps due to the aggregate traffic coming into the site, the receiver may start missing occasional packets, leading it to ask the sender for resends. The sender’s bitrate then increases, which reduces the margin available in the incoming circuit, resulting in more lost packets. This cycle continues until the line is saturated. It’s important to remember that saturating an incoming link doesn’t mean traffic can’t get out. It’s quite possible there are hundreds of megabits available outgoing, so there’s plenty of bandwidth to shout for more and more resends. The sender is quite happy to answer these re-requests as it’s on a 10GbE link and has plenty of headroom left. Only by stopping the receiver would you be able to break this positive-feedback loop.
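
The loop described above can be sketched with a toy model. This is only an illustration under stated assumptions, not how any RIST implementation behaves: loss is spread evenly over everything offered to the link, the function name and the requests_per_loss parameter are hypothetical, and the traffic figures are made up. With one request answered per lost packet the sender settles at a slightly higher bitrate, but once several requests for the same loss are all answered, the offered bitrate and the loss climb together.

```python
def simulate_nack_loop(capacity_mbps, stream_mbps, other_mbps,
                       requests_per_loss, steps):
    """Toy model of naive NACK recovery over a congested incoming link.

    Returns a list of (stream_offered_mbps, loss_fraction), one per interval.
    """
    retransmit_mbps = 0.0
    history = []
    for _ in range(steps):
        stream_offered = stream_mbps + retransmit_mbps
        total_offered = stream_offered + other_mbps
        # Whatever does not fit down the link is dropped, shared evenly
        loss = max(0.0, 1.0 - capacity_mbps / total_offered)
        # Naive recovery: every request for every lost packet is answered
        retransmit_mbps = requests_per_loss * stream_offered * loss
        history.append((stream_offered, loss))
    return history


# Hypothetical figures: 100Mbps line, 20Mbps stream, 85Mbps of other traffic
calm = simulate_nack_loop(100, 20, 85, requests_per_loss=1, steps=50)
storm = simulate_nack_loop(100, 20, 85, requests_per_loss=4, steps=8)
print(f"one request per loss settles at {calm[-1][0]:.2f}Mbps")
for offered, loss in storm:
    print(f"{offered:7.2f}Mbps offered, {loss:.1%} lost")
```

The only difference between the two runs is how many redundant requests get answered, which is why filtering them out matters so much.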

Now, all protocols deliver some form of control over what’s re-requested to try to manage difficult situations. Sergio agrees that other implementations of RIST work well in normal situations with less than 10% packet loss, for example. But where bursts of packet loss exceed 20% or the circuit headroom dips below 20%, Sergio says implementations tend to struggle.

As a lead-up to the recent improvements made in congestion management, Sergio outlines how libRIST uses internal QoS to maintain a bandwidth cap. It will also monitor the RTT every tenth of a second to help spread retries over time. By checking how the RTT is changing in these extreme conditions, libRIST is able to throw away redundant re-requests, leaving more bandwidth for useful requests. The fact that the sender is doing this work means that even if the receiver is on an older version of libRIST or on another implementation, the link can still benefit from the checking that libRIST 0.2.0 is doing. The upshot of all this work is that libRIST is no longer limited to coping with 50% packet loss; it can now deliver an unblemished stream up to just shy of 70% packet loss.
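
To make the idea of discarding redundant re-requests concrete, here is a minimal sketch of a sender-side filter. It is not libRIST’s code or API; the class and method names are hypothetical, and the rule it applies (ignore a repeat request for a packet that was resent less than one RTT ago, because that resend may still be in flight) is an assumption chosen to illustrate the principle.

```python
class RetransmitFilter:
    """Hypothetical sender-side filter for redundant re-requests."""

    def __init__(self, rtt_seconds):
        self.rtt_seconds = rtt_seconds
        self._last_sent = {}  # sequence number -> time of last resend

    def update_rtt(self, rtt_seconds):
        # Called whenever a fresh RTT measurement is available
        self.rtt_seconds = rtt_seconds

    def should_resend(self, seq, now):
        last = self._last_sent.get(seq)
        # A resend sent less than one RTT ago may not have arrived yet,
        # so a repeat request for the same packet is treated as redundant
        if last is not None and now - last < self.rtt_seconds:
            return False
        self._last_sent[seq] = now
        return True


flt = RetransmitFilter(rtt_seconds=0.1)
first = flt.should_resend(42, now=0.00)      # first request: resend
duplicate = flt.should_resend(42, now=0.05)  # within one RTT: drop
retry = flt.should_resend(42, now=0.12)      # a full RTT later: resend
print(first, duplicate, retry)
```

Because the decision is made entirely at the sender, a filter like this works regardless of what the receiver is running.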

Watch now!
Speakers

Sergio Ammirata Ph.D.
Chief Scientist,
SipRadius LLC
Adi Rozenberg
CTO & Co-founder,
VideoFlow

Video: Secrets of Near Field Monitoring

We don’t need to be running a recording studio to care about speaker placement. Broadcast facilities are full of audio monitoring rooms for a range of uses. The principles discussed in this talk by award-winning studio designer Carl Tatz can be put into practice wherever you want to sit in a room and listen to decent, flat audio.

Joining producer Mike Rodriguez, who moderates this webinar for the Audio Engineering Society (AES), Carl focuses this discussion on getting the right sound in audio control rooms. This is done through the ‘Null Positioning Ensemble’ (NPE), which considers the mixing console, listener and speakers ‘as one’ unit that can be moved around the room. The ensemble puts the two speakers about 1.71m apart behind the console, firing across it. Their audio intersects 45cm in front of the console where the listener can sit, forming an equilateral triangle. By sitting between the console and where the speakers cross, Carl says you hear the source rather than the speakers, thus giving the best audio reproduction.

This effect works if the tweeters are at the same height as the listener’s ears, says Carl, so they should be adjusted to suit the listener. High frequencies are more directional than lower frequencies so, for accurate listening, it’s important the speakers aren’t pointing too far off-axis. Exactly where to place your ensemble can seem daunting, but Carl has a calculator on his website which gives a great start, allowing you to model your room as a rectangle and find out where the null points are going to be. The nulls are where sound cancels out due to reflections, so moving your ensemble to avoid these nulls is the key to a great sound. Carl details how this is done and how, then, to optimise for the ‘real world’ room rather than the mathematical model.
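
The talk doesn’t spell out how Carl’s calculator works, so the following is not a reconstruction of it. As a rough illustration of what modelling a room as a rectangle can tell you, this sketch uses the textbook formula for axial standing waves between two parallel, rigid walls: mode n along a dimension of length L has frequency n × c / 2L, with pressure nulls at odd multiples of L / 2n. The function name and the 5m room length are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 degrees C


def axial_modes(length_m, max_order=3):
    """Return (order, frequency_hz, null_positions_m) for one room dimension."""
    modes = []
    for n in range(1, max_order + 1):
        frequency_hz = n * SPEED_OF_SOUND / (2 * length_m)
        # Pressure nulls sit at odd multiples of L / (2n) from either wall
        nulls_m = [(2 * k - 1) * length_m / (2 * n) for k in range(1, n + 1)]
        modes.append((n, frequency_hz, nulls_m))
    return modes


# Hypothetical room, 5m front to back
for order, freq, nulls in axial_modes(5.0):
    positions = ", ".join(f"{x:.2f}m" for x in nulls)
    print(f"mode {order}: {freq:.1f}Hz, nulls at {positions}")
```

A real room has three dimensions, furniture and walls that are far from rigid, which is why the mathematical model is only the starting point.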

Carl talks about the importance of sound treatment to remove reflections and stop the room from being too lively, with some specific suggestions. In general, the aim is to remove first reflections, have the back stone dead, the ceiling dead and bass traps in the corners. This should allow you to clap your hands without hearing reflections. But you can’t fix every problem with such treatment, Carl says, bringing up a frequency chart of a typical monitor setup which shows a 10dB dip around 125Hz. This is found in all monitoring setups and appears to come from sound from the speakers bouncing off the floor under the console. He says that this needs to be filled in with subwoofers rather than being fixed with EQ or acoustic treatments.

Watch now!
Speakers

Carl Tatz
Founder,
Carl Tatz Design LLC
Moderator: Mike Rodriguez
Freelance Director & Producer

Video: Per-Title Encoding in the Wild

How deep do you want to go to make sure viewers get the absolute best quality streamed video? Over the past few years it’s become common not just to choose 7 bitrates for a streamed service and encode everything to those bitrates, but to at least vary the bitrate for each video. In this talk we examine why stopping there leaves bitrate savings on the table and how going further means lower bitrates for your viewers, faster time-to-play and an overall better experience.

Jan Ozer starts with a look at the evolution of bitrate optimisation. It started with Beamr and everyone’s favourite, FFmpeg, both of which work frame by frame to reach a target quality. FFmpeg’s CRF mode will change the quantisation parameter for each frame to maintain the same quality throughout the whole file, though with a variable bitrate. Beamr would encode each frame repeatedly, reducing the bitrate until it got the desired quality. These worked well but missed out on a big trick…

Over the years, it’s been clear that sometimes 720p at 1Mbps looks better than 1080p at 1Mbps. This isn’t always the case and depends on the source footage. Much rolling news will be different from premium sports content in terms of sharpness and temporal content. So, really, the resolution needs to be assessed alongside data rate. This idea was brought into Netflix’s per-title encoding. By re-encoding a title hundreds of times with different resolutions and data rates, they were able to determine the ‘convex hull’, which is a graph showing the optimum balance between quality, bitrate and resolution. That was back in 2015. Moving beyond that, we’ve started to consider more factors.
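
As a rough illustration of the principle, the sketch below takes a handful of trial encodes and keeps, for each bitrate, the resolution that scored best. It is a simplification rather than Netflix’s method: it picks the upper envelope of the measured points instead of fitting a true convex hull, the function name is hypothetical and the quality scores are invented for the example.

```python
def best_rungs(trial_encodes):
    """Pick the best resolution per bitrate from (resolution, kbps, quality)."""
    best_at_bitrate = {}
    for resolution, bitrate, quality in trial_encodes:
        current = best_at_bitrate.get(bitrate)
        if current is None or quality > current[1]:
            best_at_bitrate[bitrate] = (resolution, quality)

    ladder = []
    best_quality = float("-inf")
    for bitrate in sorted(best_at_bitrate):
        resolution, quality = best_at_bitrate[bitrate]
        # Skip rungs that cost more bits without improving quality
        if quality > best_quality:
            ladder.append((bitrate, resolution, quality))
            best_quality = quality
    return ladder


# Invented scores on a 0-100 scale for one hypothetical title
trials = [
    ("720p", 1000, 78), ("1080p", 1000, 72),
    ("720p", 3000, 88), ("1080p", 3000, 91),
    ("720p", 6000, 90), ("1080p", 6000, 96),
]
ladder = best_rungs(trials)
for bitrate, resolution, quality in ladder:
    print(f"{bitrate}kbps: {resolution} (score {quality})")
```

With these made-up numbers, 720p wins the lowest rung and 1080p takes over higher up, which is exactly the crossover the per-title approach is designed to find.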

The next evolution is fairly obvious really, and that’s to make these evaluations not for each video, but for each shot. Doing this, Jan explains, offers bitrate improvements of 28% for AVC and more for other codecs. This is more complex than per-title because aspects of the stream itself, such as GOP sizes, change from shot to shot. So whilst we know this is something Netflix is using, there are currently no commercial implementations available.

Pushing these ideas further, perhaps the streaming service should take into account the device on which you are viewing. Some TVs typically only ever take the top two rungs on the ladder, while many mobile devices have low-resolution screens and never get around to pulling the higher bitrates. So profiling a device based on either its model or historic activity can allow you to offer different ABR ladders to allow for a better experience.

All of this needs to be enabled by automatic, objective metrics, so the metrics need to look out for the right aspects of the video. Jan explains that PSNR and MS-SSIM, though tried and trusted in the industry, only measure spatial information. Jan gives an overview of the alternatives. VMAF, he says, adds a detail loss metric, but it’s not until we start using PW-SSIM from Brightcove that aspects such as device information are taken into account. SSIMPLUS does this and also considers wide colour gamut, HDR and frame rates. Similarly, ATEME’s ‘Quality Vector’ considers frame rate and HDR.

Dr. Abdul Rehman follows Jan with his introduction to SSIMWAVE’s technologies and focuses on their ability to understand what quality the viewer will see. This allows a provider to choose whether to deliver a quality of ‘70’ or, say, ‘80’. Each service is different and the demographics will expect different things. It’s important to meet viewer expectations to avoid churn, but it’s in everyone’s interest to keep the data rate as low as possible.

Abdul gives the example of banding which is something that is not easily picked up by many metrics and so can be introduced as the encode optimiser continues to reduce the bitrate oblivious to the obvious banding. He says that since SSIMPLUS is not referenced to a source, this can give an accurate viewer score no matter the source material. Remember that if you use PSNR, you are comparing against your source. If the source is poor, your PSNR score might end up close to the maximum. The trouble is, your viewers will still see the poor video you send them, not caring if this is due to encoding or a bad source.
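
The point about PSNR being tied to the source is easy to see from its definition: it is computed purely from the difference between the reference and the encode. The sketch below is a minimal implementation over flat lists of 8-bit sample values; the function name and the tiny ‘frames’ are hypothetical.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(distorted):
        raise ValueError("reference and distorted must be the same length")
    squared_errors = [(r - d) ** 2 for r, d in zip(reference, distorted)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)


# A hypothetical poor source (blocky, only two levels) encoded faithfully
poor_source = [10, 10, 200, 200]
faithful_encode = [10, 11, 200, 199]
score = psnr(poor_source, faithful_encode)
print(f"{score:.1f}dB")
```

The score is high because the encode matches the source closely; nothing in the calculation knows whether the source looked good in the first place.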

The video ends with a Q&A.

Watch now!
Speakers

Jan Ozer
Principal, Streaming Learning Center
Contributing Editor, Streaming Media
Abdul Rehman
CEO,
SSIMWAVE

Video: IP for Broadcast, Virtual Immersive Studios, Esports

A wide range of topics today covering live virtual production, lenses, the reasons to move to IP, esports careers and more. This is a recording of the SMPTE Toronto section’s February meeting with guest speakers from Arista, ARRI, TFO and Ross Video.

The first talk of the evening was from Ryan Morris of Arista talking about the importance of the move to IP. Those with an IP infrastructure have noticed that it’s easier to continue using their system during lockdown when access to the equipment itself is limited. While there will always be a need to move a 100GbE fibre at some point or other, a running 2110 system easily allows new connections without needing SDI cables to be plugged in. This is down to IP’s ability to carry multiple signals, in both directions, down a single cable. A 100 gigabit fibre can carry 65 1080i59.94 signals, for instance, which is in stark contrast to SDI cabling. Similarly, when using an IP router, you can route thousands of flows in a few U of space whereas a 1152×1152 SDI router takes up a whole rack.
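
A quick back-of-the-envelope check shows where a figure in that region comes from. The recording doesn’t say how the number was derived, so this is only a sanity check: it assumes each signal needs the nominal HD-SDI rate of 1.485/1.001 Gbps, and the function name and 5% headroom value are hypothetical. That gives an upper bound of 67 signals on a 100 gigabit link before any packet overhead, and 64 if 5% of the link is kept free.

```python
import math


def flows_per_link(link_gbps, flow_gbps, headroom=0.0):
    """How many whole flows fit on a link after reserving some headroom."""
    usable_gbps = link_gbps * (1.0 - headroom)
    return math.floor(usable_gbps / flow_gbps)


HD_SDI_GBPS = 1.485 / 1.001  # nominal HD-SDI rate for 59.94Hz formats

upper_bound = flows_per_link(100, HD_SDI_GBPS)
with_headroom = flows_per_link(100, HD_SDI_GBPS, headroom=0.05)
print(upper_bound, with_headroom)
```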

Ryan moves to an overview of the protocols that make broadcast on IP networks possible, starting with unicast, multicast and broadcast. The latter he likens to a baby screaming. Multicast is like you talking to a group of friends. Multicast is the method used for audio, video and other essences when being sent over IP, whether as part of SMPTE ST 2110 or ST 2022-6. And whilst it works well, the protocol managing it, IGMP, isn’t really as smart as we need it to be. IGMP knows nothing about the bandwidth of the flow being sent and has no knowledge of capacity or loading of any link. As such, links can get saturated using this method and it can even mean that routine maintenance overloads the backup path, resulting in an outage. Ryan says that SDN resolves this problem. He explains IGMP as analogous to knowing which address you need to drive to and simply setting off in the right direction, reacting to any traffic jams and roadblocks you find. In contrast, he says SDN is like having GPS where everything is taken into account from the beginning and you know the whole path before you set off. Both will get you there, but SDN will be more efficient, predictable and accountable.
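
The difference can be sketched in a few lines. This is a deliberately simplified model, not how IGMP or any SDN controller is actually implemented: the Link class and both join functions are hypothetical, and the only thing being illustrated is that one approach places a flow without looking at link load while the other checks bandwidth before choosing a path.

```python
class Link:
    """A hypothetical network link carrying multicast flows."""

    def __init__(self, name, capacity_gbps):
        self.name = name
        self.capacity_gbps = capacity_gbps
        self.flows = {}  # flow id -> bandwidth in Gbps

    @property
    def load_gbps(self):
        return sum(self.flows.values())


def igmp_style_join(link, flow_id, flow_gbps):
    # No knowledge of flow bandwidth or link load: the join always succeeds
    link.flows[flow_id] = flow_gbps


def sdn_style_join(links, flow_id, flow_gbps):
    # The controller knows every flow's bandwidth and every link's load
    for link in links:
        if link.load_gbps + flow_gbps <= link.capacity_gbps:
            link.flows[flow_id] = flow_gbps
            return link.name
    return None  # refuse rather than oversubscribe


blind = Link("uplink", 10)
for i in range(7):
    igmp_style_join(blind, f"cam{i}", 1.5)

primary, backup = Link("primary", 10), Link("backup", 10)
placements = [sdn_style_join([primary, backup], f"cam{i}", 1.5) for i in range(7)]
print(blind.load_gbps, placements)
```

In the first case seven 1.5Gbps flows land on one 10Gbps link and it is oversubscribed; in the second, the seventh flow is steered to the other link because the load was known before the decision was made.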

To understand more about IP, watch these talks:
“Is IP really better than SDI?” by Ed Calverly, detailing how video over IP works, and
“Network design for live production” by Ryan’s colleague, Gerard Philips

Next in the line-up is François Gauthier, who takes us through the history of cinema-related technologies showing how, at each stage, standards helped the increasingly global industry work together. SMPTE’s earliest well-known standardisation efforts, around the time of World War 1, were to aid the interchange of film between projectors and cameras. Similarly, ARRI started in 1917 and has benefited from and worked to create SMPTE standards in cameras, lighting, workflows, colour grading and now mixed reality. François eloquently takes us on this journey showing at each stage the motivation for standardisation and how ARRI has developed in step.

A different type of innovation is on show in the next talk, given by Cliff Lavalée, who updates us on the latest improvements to his immersive studio. It was featured in a previous SMPTE Toronto section talk, when he explained the benefits of having a gaming-based 3D engine in this green-screen studio with camera tracking. In fact, it was the first studio of its kind when it came online in 2016. Since then, game engines have made great inroads into studio production.

Having a completely virtual studio with camera tracking and 3D objects available to be live-rendered in response to the scene, has a number of benefits, Cliff explains. He can track the talent and make objects appear in front or behind them as appropriate in response to their movements. Real-time rendering and the green blank canvas gives design freedom as well as the ability to see what scenes will look like during the shoot rather than after. It’s no surprise that there are also cost savings. In one of a number of videos he shows, we see a children’s programme which takes place in a small village. By using the green screen, the live-action puppets can quickly change sets from place to place integrating real props with virtual backgrounds which move with the camera.

The last talk is from Cameron Reed who’s a former esports director and now works for Ross Video. Cameron gives a brief overview of how esports is split up into developers who make the game, tournament organisers, teams, live production companies and distribution platforms. The Broadcast Knowledge has followed esports for a while. Check out the back catalogue for more detailed videos on the subject.

It’s no surprise that the developers own the game. What’s interesting is that a computer game is much more complex and directly malleable than traditional sports games. Whilst FIFA might control football/soccer worldwide, there is little it can do to change the game. Formula 1 is, perhaps, closest to the esports model, where rules will come and go about engines, tyres, refuelling strategies etc. With esports, aspects of the game can change week to week in response to fans. Cameron explains esports as ‘free’ advertising for the developers. Although they won’t always make money, even if they make 90% of their money back directly from the tournament and events for that year, it means they’ve had a 90% discount on their advertising budget. All the while, they’ve managed to inject life into their game and extend the amount of interest it’s garnered. Cameron gives a brief acknowledgement that for distribution “Twitch is king” but underlines that this platform doesn’t support UHD as of the date of the meeting, which doesn’t sit well with the efforts of the gaming industry to increase resolution and detail in games.

Cameron’s presentation finishes with a look at career progressions in esports, following both a non/semi-technical path and a technical path. The market holds a lot of interesting opportunities.

The session ends with a Q&A for all the panelists.

Watch now!
Speakers

Ryan Morris
Systems Engineer,
Arista Networks
François Gauthier
TSR,
ARRI
Cliff Lavalée
Director of LUV Studio Services,
Groupe Média TFO
Cameron Reed
Esports Business Development Manager,
Ross Video