Video: LL-HLS Discussion with THEO, Wowza & Fastly

Roundtable discussion with Fastly, Theo and Wowza

iOS 14 has finally started to hit devices and, with it, LL-HLS is now available on millions of devices. Low-Latency HLS is Apple’s latest evolution of HLS, a streaming protocol which has been widely used for over a decade. Its typical latency has gradually come down from 60 seconds to between 6 and 15 seconds now. There are still a lot of companies that want to bring that down further, and LL-HLS is Apple’s answer to people who want to operate at around 2-4 seconds total latency, which matches or beats traditional broadcast.

LL-HLS was introduced last year and had a rocky reception. It came after a community-driven low-latency scheme called LHLS and after MPEG DASH announced CMAF’s ability to hit the same 2-4 second window. Famously, this original context, as well as the technical questions over the new proposal, were summed up well in Phil Cluff’s blog post, which was soon followed by a series of talks trying to make sense of LL-HLS ahead of implementation. This is the Apple video introducing LL-HLS in its first form, and these are the reactions from Al Shenker from CBS Interactive, Marina Kalkanis from M2A Media and Akamai’s Will Law, whose talk also nicely sums up the other two contenders. Apple have now changed some of the spec in response to their own further research and external feedback. The changes were received positively and are summed up in THEO CTO Pieter-Jan Speelmans’ recent webinar bringing us the updates.

In this panel, Pieter-Jan is joined by Chris Buckley from Fastly Inc. and Wowza’s Jamie Sherry to discuss pressing LL-HLS into action. Moderator Alison Kolodny hosts the talk, which covers a wide variety of points.

“Wide adoption” is seen as the day-1 benefit. If you support LL-HLS then you’ll know you’re able to reach a large number of iPads, iPhones and Macs. Typically Apple sees a high percentage of its userbase upgrade fairly swiftly, and can easily see more than 75% of devices updated within four months of release. The panel then discusses how implementation has become easier now that the protocol no longer uses HTTP/2’s push technology, which would have made typical CDN techniques, like hosting the playlists separately from the media, impossible. Overall, CDN implementation has become more practical since, with preload hints, a CDN can hold many, many incoming connections, all waiting for a certain chunk, and collapse them down to a single link to the origin.
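The collapsing of many waiting viewer connections into a single origin request can be sketched roughly as follows. This is a toy illustration only, not how any particular CDN implements it: the CollapsingCache class, the slow_origin function and the file name are all hypothetical.

```python
import asyncio


class CollapsingCache:
    """Toy edge cache: concurrent requests for one URI share a single origin fetch."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # hypothetical async callable: uri -> bytes
        self._in_flight = {}             # uri -> pending asyncio task
        self.origin_requests = 0

    async def get(self, uri):
        task = self._in_flight.get(uri)
        if task is None:
            # First request for this URI: open the one connection to the origin.
            self.origin_requests += 1
            task = asyncio.ensure_future(self._fetch(uri))
            self._in_flight[uri] = task
            task.add_done_callback(lambda _: self._in_flight.pop(uri, None))
        # Every later request simply waits on the same pending fetch.
        return await task


async def demo():
    async def slow_origin(uri):
        await asyncio.sleep(0.05)  # stands in for waiting until the part exists
        return b"bytes-for-" + uri.encode()

    cache = CollapsingCache(slow_origin)
    # 100 viewers ask for the same hinted part at the same moment.
    results = await asyncio.gather(
        *(cache.get("seg42.part3.mp4") for _ in range(100))
    )
    return cache.origin_requests, results


origin_requests, results = asyncio.run(demo())
```

In this sketch all 100 requests are answered from one origin fetch, which is the behaviour the panel describes as making LL-HLS practical at the CDN edge.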

One aspect of implementation which has improved, we hear from Pieter-Jan, is building effective Adaptive Bit Rate (ABR) switching. With low-latency protocols, you are so close to live that it becomes very hard to download a chunk of video ahead of time and measure the download speed to see if it arrived quicker than realtime. If it did, you’d infer there was spare bit rate. LL-HLS’s use of rendition reports, however, makes that a lot easier. Pieter-Jan also points out that SSAI is easier with rendition reports.

The rest of the discussion covers device support for LL-HLS, subtitles workflows, the benefits of TLS 1.3 being recommended, and low-latency business cases.

Watch now!
The webinar is free to watch, on demand, in exchange for your email details. The link is emailed to you immediately.
Speakers

Chris Buckley
Senior Sales Engineer,
Fastly Inc.
Pieter-Jan Speelmans
CTO,
THEO Technologies
Jamie Sherry
Senior Product Manager,
Wowza
Moderator: Alison Kolodny
Senior Product Manager of Media Services,
Frame.io

Video: Low Latency Live Streaming At Scale

Low latency can be a differentiator for a live streaming service, or just a way to ensure you’re not beaten to the punch by social media or broadcast TV. Either way, it’s seen as increasingly important for live streaming to be punctual, breaking from the past where latencies of thirty to sixty seconds were not uncommon. As the industry has matured and connectivity has enough capacity for video, simply getting motion on the screen isn’t enough anymore.

Steve Heffernan from MUX takes us through the thinking about how we can deliver low latency video both into the cloud and out to the viewers. He starts by talking about the use cases for sub-second latency – anything with interaction/conversations – and how that’s different from low-latency streaming, which is one-to-many, potentially very large scale distribution. If you’re on a video call with ten people, then you need sub-second latency, otherwise the conversation will suffer. But when distributing to thousands or millions of people, the increased risk of rebuffering that comes with operating sub-second isn’t worth it, and usually 3 seconds is perfectly fine.

Steve talks through the low-latency delivery chain starting with the camera and encoder, then looking at the contribution protocol. RTMP is still often the only option, but increasingly it’s possible to use WebRTC or SRT, the latter usually being the best for streaming contribution. Once the video has hit the streaming infrastructure, be that in the cloud or otherwise, it’s time to look at how to build the manifest and send the video out. Steve talks us through the options of Low-Latency HLS (LHLS), CMAF DASH and Apple’s LL-HLS. Do note that since the talk, Apple removed the requirement for HTTP/2 push.

The talk finishes off with Steve looking at the players. If you don’t get the player’s logic right, you can start off much farther behind than necessary. This is becoming less of a problem now as players are starting to ‘bend time’ by speeding up and slowing down to bring their latency within a certain target range. But this only underlines the importance of the quality of your player implementation.

Watch now!
Speaker

Steve Heffernan
Founder & Head of Product, MUX
Creator of video.js

Video: A State-of-the-Industry Webinar: Apple’s LL-HLS is finally here

Even after restrictions are lifted, it’s estimated that overall streaming subscriptions will remain 10% higher than before the pandemic. We’ve known for a long time that streaming is here to stay and viewers want their live streams to arrive quickly and on a par with broadcast TV. There have been a number of attempts at this. The streaming community extended HLS to create LHLS, which brought down latency quite a lot without making major changes to the de facto standard.

MPEG’s DASH has also created a standard for low-latency streaming, allowing CMAF to be used to get the latency down even further than LHLS. Then Apple, the inventors of the original HLS, announced low-latency HLS (LL-HLS). We’ve looked at all of these previously here on The Broadcast Knowledge. This Online Streaming Primer is a great place to start. If you already know the basics, then there’s no one better than Will Law to explain the details.

The big change that’s happened since Will Law’s talk above is that Apple have revised their original plan. This talk from CTO and Founder of THEOplayer, Pieter-Jan Speelmans, explains how Apple has modified its approach to low-latency. Starting with a reminder of the latency problem with HLS, Pieter-Jan explains how Apple originally wanted to implement LL-HLS with HTTP/2 push and the problems that caused. This has changed now, and this talk gives us the first glimpse of how well this works.

Pieter-Jan talks about how LL-DASH streams can be repurposed to LL-HLS, explains the protocol overheads and talks about the optimal settings regarding segment and part length. He explains how the segment length plays into both overall latency and start-up latency, as well as the ability to navigate the ABR ladder without buffering.

There was a lot of frustration initially within the community at the way Apple introduced LL-HLS, both because of the way it was approached and because of the problems implementing it. Now that the technical issues have been, at least partly, addressed, this is the first of hopefully many talks looking at the reality of the latest version. With an expected ‘GA’ date of September, it’s not long before nearly all Apple devices will be able to receive LL-HLS and using the protocol will need to be part of the playbook of many streaming services.

Watch now to get the full detail

Speaker

Pieter-Jan Speelmans
CTO & Founder
THEOplayer

Video: Tech Talks: Low-Latency Live Streaming

There are a number of techniques for achieving low-latency streaming. This talk is one of the few which introduces them in easy-to-understand ways and then puts them in context, briefly showing the manifests or JavaScript examples of how these would be seen in the wild. Whilst there are plenty of companies who don’t need low-latency streaming, for many it’s a key part of their offering or it’s part of the business model itself. Knowing the techniques in play helps you better understand internet streaming in general.

Jameson Steiner from Bitmovin starts by explaining why there is a motivation to cut the latency. One big motivation, aside from the standard live sports examples, is user-generated content like on Twitch, where it’s very clear to the streamer, and quite off-putting, when there are large amounts of delay. Whilst delay can be adapted to, the more there is, the less interaction is possible. In this situation, it’s the ‘handwaving’ latency that comes into play. You want the hand on the screen to wave pretty much at the same time as your hand waves in front of the camera. Jameson places different types of distribution on a chart showing latency and we see that low-latency of 5 seconds or less will not only match traditional TV broadcasts, but also work well for live streamers.

Naturally, to fix a problem you need to understand the problem, so Jameson breaks down the legacy methods of delivery to show why the latency exists. The issue comes down to how video is split into sections, say 6 seconds, so that the player downloads a section at a time, reassembles and plays them. Looking from the player’s perspective, if the network suddenly broke or reduced its throughput, it makes sense to have several chunks in reserve. Having three 6-second chunks, a sensible precaution, makes you 18 seconds behind the curve from the off.
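The buffering arithmetic above is simple enough to write down. A minimal sketch, assuming the only contribution to latency is the player’s reserve of whole segments (encoding, network and CDN delays are ignored, and the function name is made up for illustration):

```python
def buffered_latency_seconds(segment_seconds, segments_in_reserve=3):
    """Latency contributed by holding whole segments in the player buffer."""
    return segment_seconds * segments_in_reserve


# Three 6-second segments in reserve put the player 18 seconds behind live.
legacy_latency = buffered_latency_seconds(6)

# Shorter segments shrink that figure proportionally.
shorter_latency = buffered_latency_seconds(3)
```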

Clearly reducing the segment size is a winner in this scenario. Three 3-second segments will give you just 9 seconds latency; why not go to 1 second? Well, encoding inefficiency is one reason. If you reduce the amount of video a temporal codec has to work with, its efficiency will drop and bitrate will increase to maintain quality. Jameson explains the other knock-on effects such as CDN inefficiencies and network requests. The standardised way to avoid these problems is to use CMAF (Common Media Application Format), which is based on MPEG DASH and ISO BMFF. CMAF, and DASH in general, has the benefit of coming from a standards body whose aim was to remove the vendor lock-in that may be felt with HLS and was certainly felt with RTMP. Check out MPEG’s short white paper on the topic (zipped .docx file).

CMAF uses chunked transfer, meaning that as the encoder writes the data to the disk, the web server sends it to the client. This is different to the default, where a file is only sent after it’s been completely written. This removes the wait of up to 6 seconds before a 6-second chunk can even start being sent, a wait to which the download time would also need to be added. Rather, almost as soon as the chunk has been finished by the encoder, it has arrived at the destination. This is a feature of HTTP/1.1 and later, so is not new, but it still needs to be enabled and considered as part of the delivery.
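To make the idea concrete, here is a minimal sketch of HTTP/1.1 chunked transfer framing, in which each piece of data is sent with its length in hexadecimal as soon as it is available and a zero-length chunk ends the body. The helper names and the payload bytes are invented for illustration; a real web server handles this framing for you.

```python
def encode_chunked(pieces):
    """Frame each piece as an HTTP/1.1 chunk as soon as it is available."""
    for piece in pieces:
        if piece:  # a zero-length chunk would terminate the body early
            yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk marker, no trailers


def decode_chunked(wire):
    """Reassemble the body from a complete chunked byte stream."""
    body, pos = [], 0
    while True:
        eol = wire.index(b"\r\n", pos)
        size = int(wire[pos:eol], 16)  # chunk size is hexadecimal
        if size == 0:
            return b"".join(body)
        start = eol + 2
        body.append(wire[start:start + size])
        pos = start + size + 2  # skip the data and its trailing CRLF


# Stand-ins for pieces of a segment as the encoder writes them out.
pieces = [b"moof+mdat-1", b"moof+mdat-2"]
frames = list(encode_chunked(pieces))
body = decode_chunked(b"".join(frames))
```

Because encode_chunked is a generator, the first frame can be put on the wire before the second piece even exists, which is the property the talk relies on.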

CMAF goes beyond simple HTTP/1.1 chunked transfer, which is a technique used in low-latency HLS, covered later, by creating extra structure within the 6-second segment (until now, called a chunk in this article). This extra structure allows the segment to be downloaded in smaller chunks, decoupling the segment length from the player latency. Chunked transfer does cause a notable problem, however, which has not yet been conclusively solved. Jameson explains how traditionally each large segment typically arrives faster than realtime. By measuring how fast it arrives, given the player knows the duration, it can estimate the bandwidth available at that time on the network. With chunked transfer, as we saw, we are receiving data as it’s being created. By definition, we are now getting it in realtime so there is no opportunity to receive it any quicker. The bandwidth estimation element, as shown in the presentation, is used to work out if the player needs to go down or could go up to another stream at a different bitrate – part of standard ABR. So the catch here is that going down in latency has hampered our ability to switch bitrates and, whilst the viewer can see the video close to real-time, who’s to say if they are seeing it at the best quality?
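The estimation problem can be shown with a few lines of arithmetic. This is a sketch with made-up figures (a 6-second segment encoded at 3 Mbit/s, a 1.5-second burst download); the function and variable names are illustrative rather than taken from any player.

```python
def throughput_bps(num_bytes, download_seconds):
    """Average throughput seen while downloading one segment, in bits per second."""
    return num_bytes * 8 / download_seconds


SEGMENT_SECONDS = 6.0
ENCODED_BPS = 3_000_000
segment_bytes = int(SEGMENT_SECONDS * ENCODED_BPS / 8)

# Traditional delivery: the finished segment arrives in a burst, faster than realtime.
burst_estimate = throughput_bps(segment_bytes, download_seconds=1.5)

# Chunked transfer: bytes arrive as the encoder produces them, so the
# download lasts about as long as the segment itself.
chunked_estimate = throughput_bps(segment_bytes, download_seconds=SEGMENT_SECONDS)

# The burst reveals spare capacity; the chunked figure only mirrors the bitrate.
headroom_visible = burst_estimate > ENCODED_BPS
headroom_hidden = chunked_estimate <= ENCODED_BPS
```

In the chunked case the measurement says nothing about whether a higher rendition would fit, which is why ABR becomes harder.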

Low-Latency HLS/DASH is a way of extending DASH and HLS without using CMAF. Jameson explains some techniques such as advertising segments in advance to allow players to pre-request. It also relies on finding the compromise point of encoding inefficiency vs segment length, typically held to be around 2 seconds, to minimise the latency. At this point we start seeing examples of the techniques in manifests and JavaScript, allowing us to understand how this is actually signalled and implemented.

Apple is on its second major revision of LL-HLS, which has responded to many of the initial complaints from the community. The original design used HTTP/2 to help push segments out, but this caused problems in practice, so it now uses preload hints which, as Jameson explains, remove round-trip times from requests. Jameson looks at Apple’s other techniques and shows how they look in manifest files.
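As a rough illustration of how a preload hint appears in a manifest, the sketch below parses a hand-written playlist fragment. The EXT-X-PART-INF, EXT-X-PART and EXT-X-PRELOAD-HINT tags are part of LL-HLS, but the fragment is not a complete valid playlist, the file names and durations are invented, and the parser is a simplification that only handles the attribute forms shown here.

```python
import re

# Hypothetical tail of a live media playlist.
PLAYLIST_TAIL = """\
#EXT-X-PART-INF:PART-TARGET=0.5
#EXTINF:2.0,
segment100.mp4
#EXT-X-PART:DURATION=0.5,URI="segment101.part0.mp4"
#EXT-X-PART:DURATION=0.5,URI="segment101.part1.mp4"
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="segment101.part2.mp4"
"""

# KEY=value pairs where the value is either quoted or runs to the next comma.
ATTRIBUTE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attributes(text):
    """Turn 'A=1,B="x"' into {'A': '1', 'B': 'x'}."""
    return {key: value.strip('"') for key, value in ATTRIBUTE.findall(text)}


def preload_hints(playlist):
    """Return the attributes of every preload hint in the playlist."""
    prefix = "#EXT-X-PRELOAD-HINT:"
    return [
        parse_attributes(line[len(prefix):])
        for line in playlist.splitlines()
        if line.startswith(prefix)
    ]


hints = preload_hints(PLAYLIST_TAIL)
```

A player reading this fragment can request the hinted part before it exists and have the server hold the request open until the data is ready, saving a round trip.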

The final section looks at problems in implementing these features, such as chunks being fragmented across TCP packets, the bandwidth estimation question and dealing with playback speed in order to adjust the player’s position in time – speed-ups and slow-downs of 5 to 10% can be possible depending on content.
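A simple way to picture the playback-speed adjustment is a controller that nudges the rate whenever the player drifts outside a target latency window. The sketch below is an assumption-laden illustration, not any player’s actual algorithm: the target, tolerance and 5% limit are example values and the function names are invented.

```python
def playback_rate(current_latency, target_latency, tolerance=0.25, max_change=0.05):
    """Pick a playback rate that moves latency back towards the target."""
    error = current_latency - target_latency
    if abs(error) <= tolerance:
        return 1.0                # close enough: play at normal speed
    if error > 0:
        return 1.0 + max_change   # too far behind live: speed up
    return 1.0 - max_change       # too close to live: slow down


def seconds_to_recover(latency_error, rate):
    """Wall-clock time needed to absorb a latency error at a given rate."""
    return abs(latency_error) / abs(rate - 1.0)


# A player 2 seconds behind a 3-second target speeds up by 5%.
rate = playback_rate(current_latency=5.0, target_latency=3.0)
recovery = seconds_to_recover(2.0, rate)
```

At a 5% speed-up, each second of playback only claws back 0.05 seconds, so recovering 2 seconds takes roughly 40 seconds, which is why starting close to the target matters.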

Watch now!
Download the presentation
Speaker

Jameson Steiner
Software Engineer,
Bitmovin