In networking there are many possible bottlenecks, but the most pervasive is congestion caused by links operating at capacity and saturating the buffers. Full buffers are unable to fully adapt to the incoming traffic, increasing the chances of dropped packets, but the extra latency added by full buffer after full buffer quickly adds up and this extra latency further degrades the quality of the connection for the data that does make it through.
It’s no surprise then, that a lot of work goes into finding the best ‘congestion’ algorithms to allow data senders to back off when a link stops responding well. This talk, from Facebook engineer Nitin Garg, examines old and new approaches to keeping streams fast and responsive by running a 4-million-data-point test of three contenders, Cubic, BBR and Copa.
Nitin starts by introducing what we mean by ‘congestion’, how and why it occurs. The simple example is that your computer can send data, typically, at up to 1Gbps, yet your uplink to the internet is likely below this number. So congestion control is a feedback mechanism which lets your computer realise that sending at 1Gbps isn’t working and allows it to throttle back to a speed which fits within your upload bandwidth. The same is true further down the pipe. If you have 50Mbps uplink to the internet, but you are sending to a server which only has 10Mbps left, not only does your computer need to throttle below 50, but also 10Mbps.
We then walk through how Cubic, BBR and Copa work with Nitin explaining the differences. <a href=”https://web.mit.edu/copa/” rel=”noopener” target=”_blank>Copa is the newest of the protocols comes from MIT and comes with the unique ability to tune it to your need; throughput or low latency. As discussed above, to keep latency down, buffer size needs to be minimised which stops you being aggressive in loading up links which leads to latency and throughput being at opposite ends of a see-saw.
Nitin’s test was on mobile phones using Facebook’s Live streaming app on Android and iOS for live streaming with ABR where the app itself adapts to ensure that it is streaming with as high a quality as possible, but willing to reduce the bitrate when needed. Testing from global markets, they measured round trip times and the amount of delivered data. Nitin walks through the results both for latency and throughput and shows that when Copa is optimised for latency, in the worst conditions it leads the other two protocols in latency reduction.
Low latency can be a differentiator for a live streaming service, or just a way to ensure you’re not beaten to the punch by social media or broadcast TV. Either way, it’s seen as increasingly important for live streaming to be punctual breaking from the past where latencies of thirty to sixty seconds were not uncommon. As the industry has matured and connectivity has enough capacity for video, simply getting motion on the screen isn’t enough anymore.
Steve Heffernan from MUX takes us through the thinking about how we can deliver low latency video both into the cloud and out to the viewers. He starts by talking about the use cases for sub-second latency – anything with interaction/conversations – and how that’s different from low-latency streaming which is one to many, potentially very large scale distribution. If you’re on a video call with ten people, then you need sub-second latency else the conversation will suffer. But distributing to thousands or millions of people, the sacrifice in potential rebuffering of operating sub-second, isn’t worth it, and usually 3 seconds is perfectly fine.
Steve talks through the low-latency delivery chain starting with the camera and encoder then looking at the contribution protocol. RTMP is still often the only option, but increasingly it’s possible to use WebRTC or SRT, the latter usually being the best for streaming contribution. Once the video has hit the streaming infrastructure, be that in the cloud or otherwise, it’s time to look at how to build the manifest and send the video out. Steve talks us through the options of Low-Latency HLS (LHLS) CMAF DASH and Apple’s LL-HLS. Do note that since the talk, Apple removed the requirement for HTTP/2 push.
The talk finishes off with Steve looking at the players. If you don’t get the players logic right, you can start off much farther behind than necessary. This is becoming less of a problem now as players are starting to ‘bend time’ by speeding up and slowing down to bring their latency within a certain target range. But this only underlines the importance of the quality of your player implementation.
Can any stream be too low-latency? For some matching broadcast latency, is all they need. But for others, particularly for gaming, gambling or more interactive services, sub-second is a must and they are happy to swap out parts of their technology stack to make that happen. WebRTC is often seen as the best choice for anyone wanting to go achieve an almost instant stream. Started by Google in 2011 for video conferencing applications, WebRTC hit a 1.0 release in 2018 and has been adopted by a number of companies catering to the broadcast market.
WebRTC stands out among the plethora of streaming protocols since it is an actual stream of data and not a series of files transferred just in time. Traditionally buffers have been heavily used in streaming because it was so hard to get data to the player when the mainstream internet was starting out in the 90s and as the mobile internet was establishing itself 10 years later. Whilst those buffers are very helpful in dealing with delayed data, they are a big set back in delivering a low-latency stream. With WebRTC, there is very little buffering, so when using the protocol you have to understand that you may not get all your data delivered and if packets are missing glitches will be seen. This is one significant difference since MPEG DASH and HLS will either show you a blank screen or a perfect rendition of the file chunk that was sent thanks to TCP. This is an example of the compromises of going to sub-second latency; there are no second chances to get the packet again. And whilst this compromise may be a great exchange for an auction site or betting service, for other streaming services, it may be better to use CMAF with 3-second latency.
In this talk, Limelight Networks Video Architect Andrew Crowe introduces WebRTC and explains how it can be deployed. He starts by talking about the video codecs it contains. VP9 has recently been added to the options and for a long time, it was a VP8 technology. Andrew explains how the codecs it carries does have a knock-on effect on its compatibility with browsers. UDP is the underlying technology to all low-latency technologies since the bureaucracy of TCP/IP gets in the way of real-time media streams. Andrew also explains how security pervades WebRTC from its use of DTLS (which is like HTTPS/TLS for UDP) to secure RTP and SRTCP.
The last part of the talk discusses the architectures that CDN LimeLight uses to enable large-scale WebRTC streams including the need to get through firewalls. Andrew discusses how some features of the technology suit small-scale events, but can’t be used with thousands of viewers. He also discusses how adaptive bitrate streams can be delivered, although not within WebRTC itself, there is enough information to implement ABR in addition to the standard stream.
Even after restrictions are lifted, it’s estimated that overall streaming subscriptions will remain 10% higher than before the pandemic. We’ve known for a long time that streaming is here to stay and viewers want their live streams to arrive quickly and on-par with broadcast TV. There have been a number of attempts at this, the streaming community extended HLS to create LHLS which brought down latency quite a lot without making major changes to the defacto standard.
MPEG’s DASH also has created a standard for low-latency streaming allowing CMAF to be used to get the latency down even further than LHLS. Then Apple, the inventors of the original HLS, announced low-latency HLS (LL-HLS). We’ve looked at all of these previously here on The Broadcast Knowledge. This Online Streaming Primer is a great place to start. If you already know the basics, then there’s no better than Will Law to explain the details.
The big change that’s happened since Will Law’s talk above, is that Apple have revised their original plan. This talk from CTO and Founder of THEOplayer, Pieter-Jan Speelmans, explains how Apple’s modified its approach to low-latency. Starting with a reminder of the latency problem with HLS, Pieter-Jan explains how Apple originally wanted to implement LL-HLS with HTTP/2 push and the problems that caused. This has changed now, and this talk gives us the first glimpse of how well this works.
Pieter-Jan talks about how LL-DASH streams can be repurposed to LL-HLS, explains the protocol overheads and talks about the optimal settings regarding segment and part length. He explains how the segment length plays into both overall latency but also start-up latency and the ability to navigate the ABR ladder without buffering.
There was a lot of frustration initially within the community at the way Apple introduced LL-HLS both because of the way it was approached but also the problems implementing it. Now that the technical issues have been, at least partly, addressed, this is the first of hopefully many talks looking at the reality of the latest version. With an expected ‘GA’ date of September, it’s not long before nearly all Apple devices will be able to receive LL-HLS and using the protocol will need to be part of the playbook of many streaming services.
Views and opinions expressed on this website are those of the author(s) and do not necessarily reflect those of SMPTE or SMPTE Members.
This website is presented for informational purposes only. Any reference to specific companies, products or services does not represent promotion, recommendation, or endorsement by SMPTE