Video: CMAF with ByteRange – A Unified & Efficient Solution for Low Latency Streaming

Apple’s LL-HLS protocol is the most recent technology offering to deliver low-latency streams of just 2 or 3 seconds to the viewer. Before that, CMAF which is built on MPEG DASH also enabled low latency streaming. This panel with Ateme, Akamai and THEOplayer asks how they both work, their differences and also maps out a way to deliver both at once covering the topic from the perspective of the encoder manufacturer, the CDN and the player client.

We start with ATEME’s Mickaël Raulet who outline’s CMAF starting with its inception in 2016 with Microsoft and Apple. CMAF was published in 2018 and most recently received detailed guidelines for low latency best practice in 2020 from the DASH Industry Forum. He outlines that the idea of CMAF is to build on DASH to find a single way of delivering both DASH and HLS using once set of media. THe idea here is to minimise hits on the cache as well as storage. Harnessing the ISO BMFF CMAF adds on the ability to break chunks in to fragments opening up the promise of low latency delivery.



Mickaël discusses the methods of getting hold of these short fragments. If you store the fragments separately, then you double your storage as 4 fragments make up a whole segment. So it’s better to have all the fragments written as a segment. We see that Byterange requests are the way forward whereby the client asks the server to start delivering a file from a certain number of bytes into the file. We can even request this ahead of time, using a preload hint, so that the server can push this data when it’s ready.

Next we hear from Akamai’s Will Law who examines how Apples LL-HLS protocol can work within the CDN to provide either CMAF for LL-HLS from the same media files. He uses the example of a 4-second segments with four second-long parts. A standard latency player would want to download the whole 4-second segment where as a LL-HLS player would want the parts. DASH, has similar requirements and so Will focusses on how to bring all of these requirements down into the mimum set of files needed which he calls a ‘common cache footprint’ using CMAF.

He shows how byterange requests work, how to structure them and explains that, to help with bandwidth estimation, the server will wait until the whole of the byterange is delivered before it sends any data thus allowing the client to download a wire speed. Moreover a single request can deliver the rest of the segments meaning 7 requests get collapsed into 1 or 2 requests which is an important saving for CDNs working at scale. It is possible to use longer GOPs for a 4-second video clip than for 1-second parts, but for this technique to work, it’s important to maintain the same structure within the large 4-second clip as in the 1-second parts.

THEOplayer’s Pieter-Jan Speelmans takes the floor next explaining his view from the player end of the chain. He discusses support for LL-HLS across different platforms such as Android, Android TV, Roku etc. and concludes that there is, perhaps surprisingly, fairly wide support for Apple’s LL-HLS protocol. Pieter-Jan spends some time building on Will’s discussion about reducing request numbers for browsers, CORS checking can increase cause extra requests to be needed when using Byterange requests. For implementing ABR, it’s important to understand how close you are to the available bandwidth. Pieter-Jan says that you shouldn’t only use the download time to determine throughput, but also metadata from the player to get as an exact estimate as possible. We also hear about dealing with subtitles which can need to be on screen longer than the duration of any of the parts or even of the segment length. These need to be adapted so that they are shown repeatedly and each chunk contains the correct information. This can lead to flashing on re-display so, as with many things in modern players, needs to be carefully and intentionally dealt with to ensure the correct user experience.

The last part of the video is a Q&A which covers:

  • Use of HTTP2 and QUIC/HTTP3
  • Dynamic Ad Insertion for low latency
  • The importance of playlist blocking
  • Player synchronisation with playback rate adjustment
  • Player analytics
  • DRM insertion problems at low-latency

    Watch now!

    Will Law Will Law
    Chief Architect, Edge Technology Group,
    Mickaël Raulet Mickaël Raulet
    Pieter-Jan Speelmans Pieter-Jan Speelmans
    CTO & Founder,
  • Video: DVB-I. Linear Television with Internet Technologies

    Outside of computers, life is rarely binary. There’s no reason for all TV to be received online, like Netflix or iPlayer, or all over-the-air by satellite or DVB-T. In fact, by using a hybrid approach, broadcasters can reach more people and deliver more services than before including securing an easier path to higher definition or next-gen pop-up TV channels.

    Paul Higgs explains the work DVB have been doing to standardise a way of delivering this promise: linear TV with internet technologies. DVB-I is split into three parts:

    1. Service discovery

    DVB-I lays out ways to find TV services including auto-discovery and recommendations. The A177 Bluebook provides a mechanism to find IP-based TV services. Service lists bring together channels and geographic information whereas service lists registries are specified to provide a place to go to in order to discover service lists.

    2. Delivery
    Internet delivery isn’t a reason for low-quality video. It should be as good or better than traditional methods because, at the end of the day, viewers don’t actually care which medium was used to receive the programmes. Streaming with DVB-I is based on MPEG DASH and defined by DVB-DASH (Bluebook A168). Moreover, DVB-I services can be simulcast so they are co-timed with broadcast channels. Viewers can, therefore, switch between broadcast and internet services.



    Naturally, a plethora of metadata can be delivered alongside the media for use in EPGs and on-screen displays thus including logos, banners, programme guide data and content protection information.

    Ian explains that this is brought together with three tools: the DVB-I reference client player which works on Android and HbbTV, DVB-DASH reference streams and a DVB-DASH validator.

    Finishing up, Ian adds that network operators can take advantage of the complementary DVB Multicast ABR specification to reduce bitrate into the home. DVB-I will be expanded in 2021 and beyond to include targetted advertising, home re-distribution and delivering video in IP but over traditional over-the-air broadcast networks.

    Watch now!

    Paul Higgs Paul Higgs
    Chairman – TM-I Working Group, DVB Project
    Vice President, Video Industry Development, Huawei

    Video: Broadcasting WebRTC over Low Latency Dash

    Using sub-second WebRTC with the scalability of CMAF: Allowing panelists and presenters to chat in real-time is really important to foster fluid conversations, but broadcasting that out to thousands of people scales more easily with CMAF based on MPEG DASH. In this talk, Mux’s Dylan Jhaveri (formerly CTO, explains how they’ve combined WebRTC and CMAF to keep latencies low for everyone.

    Speaking at the San Francisco VidDev meetup, Dylan explains that the Crowdspace webpage allows you to watch a number of participants talk in real-time as a live stream with live chat down the side of the screen. The live chat, naturally, feeds into the live conversation so latency needs to be low for the viewers as much as the on-camera participants. For them, WebRTC is used as this is one of the very few options that provides reliable sub-second streaming. To keep the interactivity between the chat and the participants, Crowdcast decided to look at ultra-low-latency CMAF which can deliver between 1 and 5 second latency depending on your risk threshold for rebuffering. So the task became to convert a WebRTC call into a low-latency stream that could easily be received by thousands of viewers.


    Dylan points out that they were already taking WebRTC into the browser as that’s how people were using the platform. Therefore, using headless Chrome should allow you to pipe the video from the browser into ffmpeg and create an encode without having to composite individual streams whilst giving Crowdcast full layout control.

    After a few months of tweaking, Dylan and his colleagues had Chrome going into ffmpeg then into a nodejs server which delivers CMAF chunks and manifests (click to learn more about how CMAF works). In order to scale this, Dylan explains the logic implemented in a CDN to use the nodejs server running in a docker container as an origin server. Using HLS they have a 95% cache hit rate and achieve 15 seconds latency. The tests at the time of the talks, Dylan explains, show that the CMAF implementation hits 3 seconds of latency and was working as expected.

    The talk ends with a Q&A covering how they get the video out of the headerless Chrome, whether CMAF latency could be improved and why there are so many docker containers.

    Watch now!

    Dylan Jhaveri Dylan Jhaveri
    Senior Software Engineer, Mux
    Formerly CTO & Co-founder,

    Video: Player Optimisations

    If you’ve ever tried to implement your own player, you’ll know there’s a big gap between understanding the HLS/DASH spec and getting an all-round great player. Finding the best, most elegant, ways of dealing with problems like buffer exhaustion takes thought and experience. The same is true for low-latency playback.

    Fortunately, Akamai’s Will Law is here to give us the benefit of his experience implementing his own and helping customers monitor the performance of their players. At the end of the day, the player is the ‘kingpin’ of streaming, comments Will. Without it, you have no streaming experience. All other aspects of the stream can be worked around or mitigated, but if the player’s not working, no one watches anything.

    Will’s first tip is to implement ‘segment abandonment’. This is when a video player foresees that downloading the current segment is taking too long; if it continues, it will run out of video to play before the segment has arrived. A well-programmed player will sport this and try to continue the download of this segment from another server or CDN. However, Will says that many will simply continue to wait for the download and, in the meantime, the download will fail.

    Tip two is about ABR switching in low-latency, chunked transfer streams. The playback buffer needs to be longer than the chunk duration. Without this precaution, there will not be enough time for the player to make the decision to switch down layers. Will shows a diagram of how a 3-second playback buffer can recover as long as it uses 2-second segments.

    Will’s next two suggestions are to put your initial chunk in the manifest by base64-encoding it. This makes the manifest larger but removes the round-trip which would otherwise be used to request the chunk. This can significantly improve the startup performance as the RTT could be a quarter of a second which is a big deal for low-latency streams and anyone who wants a short time-to-play. Similarly, advises Will, make those initial requests in parallel. Don’t wait for the init file to be downloaded before requesting the media segment.

    Whilst many of points in this talk focus on the player itself, Will says it’s wise for the player to provide metrics back to the CDN, hidden in the request headers or query args. This data can help the CDN serve media smarter. For instance, the player could send over the segment duration to the CDN. Knowing how long the segment is, the CDN can compare this to the download time to understand if it’s serving the data too slow. Perhaps the simplest idea is for the player to pass back a GUID which the CDN can put in the logs. This helps identify which of the millions of lines of logs are relevant to your player so you can run your own analysis on a player-by-player level.

    Will’s other points include advice on how to avoid starting playing at the lowest bandwidth and working up. This doesn’t look great and is often unnecessary. The player could run its own speed test or the CDN could advise based on the initial requests. He advises never trusting the system clock; use an external clock instead.

    Regarding playback latency, it pays to be wise when starting out. If you blindly start an HLS stream, then your latency will be variable within the duration of a segment. Will advocates HEAD requests to try to see when the next chunk is available and only then starting playback. Another technique is to vary your playback rate o you can ‘catch up’. The benefit of using rate adjustment is that you can ask all your players to be at a certain latency behind realtime so they can be close to synchronous.

    Two great tips which are often overlooked: Request multiple GOPs at once. This helps open up the TCP windows giving you a more efficient download. For mobile, it can also help the battery allowing you to more efficiently cycle the radio on and off. Will mentions that when it comes to GOPs, for some applications its important to look at exactly how long your GOP should be. Usually aligning it with an integer number of audio frames is the way to choose your segment duration.

    The talk finishes with an appeal to move to using CMAF containers for streaming ask they allow you to deliver HLS and DASH streams from the same media segments and move to a common DRM. Will says that CBCS encrypted content is now becoming nearly all-pervasive. Finally, Will gives some tips on how players are best to analyse which CDN to use in multi-CDN environments.

    Watch now!

    Will Law Will Law
    Chief Architect,