Video: CMAF with ByteRange – A Unified & Efficient Solution for Low Latency Streaming

Apple’s LL-HLS protocol is the most recent technology to offer low-latency streams with just 2 or 3 seconds of delay to the viewer. Before that, CMAF, which is built on MPEG DASH, also enabled low-latency streaming. This panel with Ateme, Akamai and THEOplayer asks how they both work and how they differ, and maps out a way to deliver both at once, covering the topic from the perspective of the encoder manufacturer, the CDN and the player client.

We start with ATEME’s Mickaël Raulet, who outlines CMAF, starting with its inception in 2016 with Microsoft and Apple. CMAF was published in 2018 and most recently received detailed guidelines for low-latency best practice in 2020 from the DASH Industry Forum. He explains that the idea of CMAF is to build on DASH to find a single way of delivering both DASH and HLS using one set of media. The aim is to minimise hits on the cache as well as storage. Harnessing ISO BMFF, CMAF adds the ability to break chunks into fragments, opening up the promise of low-latency delivery.



Mickaël discusses the methods of getting hold of these short fragments. If you store the fragments separately as well as the whole segment, you double your storage, as 4 fragments make up a whole segment. So it’s better to have all the fragments written as a single segment. Byte-range requests are the way forward: the client asks the server to start delivering a file from a certain number of bytes into the file. We can even request this ahead of time, using a preload hint, so that the server can push this data when it’s ready.
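
To make the byte-range idea concrete, here is a minimal sketch of how a client might build the standard HTTP `Range` header for one fragment inside a segment file. The function name and the byte counts are hypothetical and only for illustration; the talk does not prescribe any particular code.

```python
from typing import Dict, Optional


def range_header(offset: int, length: Optional[int] = None) -> Dict[str, str]:
    """Build an HTTP Range header starting `offset` bytes into a file.

    With no length the range is open-ended ("bytes=offset-"), which asks
    for everything from the offset to the end of the resource.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if length is None:
        return {"Range": f"bytes={offset}-"}
    if length <= 0:
        raise ValueError("length must be positive")
    # HTTP byte ranges are inclusive at both ends, hence the "- 1".
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


# Hypothetical example: the first fragment is 250,000 bytes long.
print(range_header(0, 250_000))   # {'Range': 'bytes=0-249999'}
# Everything from byte 490,000 onwards, whatever its final size.
print(range_header(490_000))      # {'Range': 'bytes=490000-'}
```

The open-ended form is the useful one when the fragment is still being written, as the client cannot yet know how long it will be.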

Next we hear from Akamai’s Will Law, who examines how Apple’s LL-HLS protocol can work within the CDN to provide either CMAF or LL-HLS from the same media files. He uses the example of 4-second segments, each made up of four one-second parts. A standard-latency player would want to download the whole 4-second segment, whereas an LL-HLS player would want the parts. DASH has similar requirements, so Will focusses on how to bring all of these requirements down to the minimum set of files needed, which he calls a ‘common cache footprint’, using CMAF.

He shows how byte-range requests work and how to structure them, and explains that, to help with bandwidth estimation, the server will wait until the whole of the byte range is available before it sends any data, thus allowing the client to download at wire speed. Moreover, a single request can deliver the rest of the segment, meaning 7 requests get collapsed into 1 or 2, which is an important saving for CDNs working at scale. It is possible to use longer GOPs for a 4-second video clip than for 1-second parts, but for this technique to work, it’s important to maintain the same structure within the large 4-second clip as in the 1-second parts.
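
As a rough illustration of the ‘common cache footprint’, the sketch below generates LL-HLS `EXT-X-PART` playlist tags whose `BYTERANGE` attributes all point into one segment file, so the same cached object can serve both the part requests and a whole-segment request. The file name, part sizes and helper name are assumptions made for the example, and optional attributes such as `INDEPENDENT` are left out for brevity.

```python
from typing import List


def part_tags(uri: str, part_sizes: List[int], part_duration: float) -> List[str]:
    """Return one EXT-X-PART tag per part, all addressing the same file.

    Each BYTERANGE is written as "length@offset", so consecutive parts
    simply follow one another inside the segment.
    """
    tags = []
    offset = 0
    for size in part_sizes:
        tags.append(
            f'#EXT-X-PART:DURATION={part_duration:.3f},'
            f'URI="{uri}",BYTERANGE="{size}@{offset}"'
        )
        offset += size
    return tags


# Hypothetical 4-second segment made of four 1-second parts.
for tag in part_tags("segment42.mp4", [251_000, 248_500, 250_200, 249_300], 1.0):
    print(tag)
```

A standard-latency player would ignore these tags and fetch `segment42.mp4` in one go, which is why only one file needs to sit in the cache.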

THEOplayer’s Pieter-Jan Speelmans takes the floor next, explaining his view from the player end of the chain. He discusses support for LL-HLS across different platforms such as Android, Android TV, Roku etc. and concludes that there is, perhaps surprisingly, fairly wide support for Apple’s LL-HLS protocol. Pieter-Jan spends some time building on Will’s discussion about reducing request numbers: for browsers, CORS checking can cause extra requests to be needed when using byte-range requests. For implementing ABR, it’s important to understand how close you are to the available bandwidth. Pieter-Jan says that you shouldn’t only use the download time to determine throughput, but also metadata from the player, to get as exact an estimate as possible. We also hear about dealing with subtitles, which can need to be on screen for longer than the duration of any of the parts or even of the segment. These need to be adapted so that they are shown repeatedly and each chunk contains the correct information. This can lead to flashing on re-display so, as with many things in modern players, it needs to be dealt with carefully and intentionally to ensure the correct user experience.
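
The subtitle point can be sketched as follows. This is only one possible way of repeating a long cue in every part it overlaps, assuming fixed-length parts and times held as integer milliseconds; the function name is hypothetical and it is not taken from the talk.

```python
from typing import List, Tuple


def split_cue(start_ms: int, end_ms: int, part_ms: int) -> List[Tuple[int, int]]:
    """Split a subtitle cue [start_ms, end_ms) at part boundaries.

    Returns one (start, end) piece per part the cue overlaps, so each
    part can carry its own copy of the text.
    """
    if part_ms <= 0:
        raise ValueError("part_ms must be positive")
    pieces = []
    t = start_ms
    while t < end_ms:
        # End of the part that contains time t.
        boundary = (t // part_ms + 1) * part_ms
        piece_end = min(end_ms, boundary)
        pieces.append((t, piece_end))
        t = piece_end
    return pieces


# A cue shown from 0.5 s to 3.2 s, carried in 1-second parts.
print(split_cue(500, 3200, 1000))
# [(500, 1000), (1000, 2000), (2000, 3000), (3000, 3200)]
```

Because the pieces are contiguous, a player that renders them back to back without merging them is where the flashing mentioned above can creep in.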

The last part of the video is a Q&A which covers:

  • Use of HTTP2 and QUIC/HTTP3
  • Dynamic Ad Insertion for low latency
  • The importance of playlist blocking
  • Player synchronisation with playback rate adjustment
  • Player analytics
  • DRM insertion problems at low-latency

    Watch now!

    Will Law
    Chief Architect, Edge Technology Group,
    Mickaël Raulet
    Pieter-Jan Speelmans
    CTO & Founder,
    Video: What is HESP Ultra-Low-Latency Streaming?

    Is it possible to improve on CMAF’s offer of an ultra-low-latency, scalable protocol with good viewer experience? This is what HESP, the High-Efficiency Streaming Protocol, promises. With almost instant channel change times and sub-second latency, it’s worth taking a look at this protocol, created by THEOplayer, to understand where it might work in your workflows.

    Presented by Pieter-Jan Speelmans and Johan Vounckx from THEO, we hear some more detail surrounding HESP’s inception. Quality, latency and bitrate are often referred to as a triangle where, if you improve one or even two, the remaining factor will get worse to compensate. HESP plays in the triangle connecting ‘viewer experience’, ‘low latency’ and ‘scalability’. If you compare WebRTC with CMAF, you see that WebRTC prioritises low-latency streaming but suffers in terms of scalability. CMAF, with latency 2-5 seconds higher, has much better scalability, but the channel zapping times are high, which affects viewer experience as well as overall latency. HESP, contends Pieter-Jan, actually improves all three. It’s able to do this because it’s not extending existing protocols which weren’t designed to meet all these requirements; rather, it’s bringing in new techniques which shift the whole equation.

    THEOplayer has created the HESP Alliance, which is devoted to standardising the HESP technology through the IETF or another avenue, promoting adoption through marketing and the creation of tools, certification, and management of intellectual property. The talk outlines the decoder royalties, which can be payable per subscriber, per subscriber per hour, or per device.

    Source: THEOPlayer

    Looking at the technical details, we find out that you can actually start playing an HESP stream without downloading the manifest. While HESP does have manifest files, they change very infrequently. If one is changed at short notice, the server can ask players to download the new one by embedding a message in the stream. The channel zapping speed is achieved using two streams, an initialisation stream and a continuation stream. The initialisation stream contains just I and P frames, allowing you to start playing immediately. The continuation stream is intended to be the low-bitrate stream used after the establishment of the stream.

    HESP uses two modes: Maximal Gain and Maximal Compatibility. Maximal Gain aims to have the lowest latency, lowest bandwidth and lowest zapping times. It has long segments with 1-frame chunks, each containing one I or P frame. The Maximal Compatibility mode, however, allows you to reuse Low-Latency DASH and LL-HLS streams and uses 6-second segments with 200 ms chunks including B frames.

    THEOplayer claim 7x less delivery delay, 20x lower zapping times and a 20% bandwidth saving over CMAF, with broad compatibility across many TVs, Android, iOS, web and streaming devices.

    Watch now!

    Pieter-Jan Speelmans
    CTO & Founder,
    Johan Vounckx
    Vice President, Innovation,

    Video: LL-HLS Discussion with THEO, Wowza & Fastly

    Roundtable discussion with Fastly, Theo and Wowza

    iOS 14 has finally started to hit devices and, with it, LL-HLS is now available on millions of devices. Low-Latency HLS is Apple’s latest evolution of HLS, a streaming protocol which has been widely used for over a decade. Its typical latency has gradually come down from 60 seconds to between 6 and 15 seconds now. There are still a lot of companies that want to bring that down further, and LL-HLS is Apple’s answer to people who want to operate at around 2-4 seconds total latency, which matches or beats traditional broadcast.

    LL-HLS was introduced last year and had a rocky reception. It came after a community-driven low-latency scheme called LHLS and after MPEG DASH announced CMAF’s ability to hit the same 2-4 second window. Famously, this context, as well as the technical questions over the new proposal, was summed up well in Phil Cluff’s blog post, which was soon followed by a series of talks trying to make sense of LL-HLS ahead of implementation. These include the Apple video introducing LL-HLS in its first form and the reactions from Al Shenker from CBS Interactive, Marina Kalkanis from M2A Media and Akamai’s Will Law, whose talk also nicely sums up the other two contenders. Apple have now changed some of the spec in response to their own further research and external feedback. The changes were received positively and are summed up in THEO CTO Pieter-Jan Speelmans’ recent webinar bringing us the updates.

    In this panel, Pieter-Jan is joined by Chris Buckley from Fastly Inc. and Wowza’s Jamie Sherry to discuss pressing LL-HLS into action. Moderator Alison Kolodny hosts the talk, which covers a wide variety of points.

    “Wide adoption” is seen as the day-1 benefit. If you support LL-HLS then you know you’re able to reach a large number of iPads, iPhones and Macs. Typically Apple sees a high percentage of the userbase upgrade fairly swiftly, easily seeing more than 75% of devices updated within four months of release. The panel then discusses how implementation has become easier given the change in protocol: the use of HTTP/2’s push technology, which would have made typical CDN techniques like hosting the playlists separately from the media impossible, was dropped. Overall, CDN implementation has become more practical since, with preload hints, a CDN can hold many, many connections, all waiting for a certain chunk, and collapse them down to a single link to the origin.

    One aspect of implementation which has improved, we hear from Pieter-Jan, is building effective Adaptive Bit Rate (ABR) switching. With low-latency protocols, you are so close to live that it becomes very hard to download a chunk of video ahead of time and measure the download speed to see if it arrived quicker than realtime. If it did, you’d infer there was spare bit rate. LL-HLS’s use of rendition reports, however, makes that a lot easier. Pieter-Jan also points out that SSAI is easier with rendition reports.
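
    A small worked example shows why the download-time measurement breaks down near the live edge. The numbers and the function name are hypothetical: a 1-second part of 250,000 bytes, which is a 2,000 kbps encode.

```python
def measured_kbps(size_bytes: int, download_seconds: float) -> float:
    """Throughput implied by how long a download took, in kilobits/second."""
    if download_seconds <= 0:
        raise ValueError("download_seconds must be positive")
    return size_bytes * 8 / 1000 / download_seconds


PART_BYTES = 250_000  # hypothetical 1-second part at 2,000 kbps

# Part already complete on the server: it arrives as fast as the link
# allows, so the measurement reflects the real capacity.
print(measured_kbps(PART_BYTES, 0.25))  # 8000.0

# Part still being produced: the bytes trickle in at the rate the encoder
# makes them, so the download takes about 1 second whatever the link speed.
print(measured_kbps(PART_BYTES, 1.0))   # 2000.0
```

    In the second case the player only ever measures the encoded bitrate, so it cannot tell whether there is headroom to switch up.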

    The rest of the discussion covers device support for LL-HLS, subtitles workflows, the benefits of TLS 1.3 being recommended, and low-latency business cases.

    Watch now!
    The webinar is free to watch, on demand, in exchange for your email details. The link is emailed to you immediately.

    Chris Buckley
    Senior Sales Engineer,
    Fastly Inc.
    Pieter-Jan Speelmans
    THEO Technologies
    Jamie Sherry
    Senior Product Manager,
    Moderator: Alison Kolodny
    Senior Product Manager of Media Services,

    Video: A State-of-the-Industry Webinar: Apple’s LL-HLS is finally here

    Even after restrictions are lifted, it’s estimated that overall streaming subscriptions will remain 10% higher than before the pandemic. We’ve known for a long time that streaming is here to stay and that viewers want their live streams to arrive quickly and on a par with broadcast TV. There have been a number of attempts at this. The streaming community extended HLS to create LHLS, which brought down latency quite a lot without making major changes to the de facto standard.

    MPEG’s DASH has also created a standard for low-latency streaming, allowing CMAF to be used to get the latency down even further than LHLS. Then Apple, the inventors of the original HLS, announced low-latency HLS (LL-HLS). We’ve looked at all of these previously here on The Broadcast Knowledge. This Online Streaming Primer is a great place to start. If you already know the basics, then there’s no one better than Will Law to explain the details.

    The big change that’s happened since Will Law’s talk above is that Apple have revised their original plan. This talk from the CTO and Founder of THEOplayer, Pieter-Jan Speelmans, explains how Apple has modified its approach to low latency. Starting with a reminder of the latency problem with HLS, Pieter-Jan explains how Apple originally wanted to implement LL-HLS with HTTP/2 push and the problems that caused. This has changed now, and this talk gives us the first glimpse of how well this works.

    Pieter-Jan talks about how LL-DASH streams can be repurposed to LL-HLS, explains the protocol overheads and talks about the optimal settings regarding segment and part length. He explains how the segment length plays into overall latency, start-up latency and the ability to navigate the ABR ladder without buffering.

    There was a lot of frustration initially within the community at the way Apple introduced LL-HLS, both because of the way it was approached and because of the problems implementing it. Now that the technical issues have been, at least partly, addressed, this is the first of hopefully many talks looking at the reality of the latest version. With an expected ‘GA’ date of September, it’s not long before nearly all Apple devices will be able to receive LL-HLS, and using the protocol will need to be part of the playbook of many streaming services.

    Watch now to get the full detail


    Pieter-Jan Speelmans
    CTO & Founder