Video: CMAF with ByteRange – A Unified & Efficient Solution for Low Latency Streaming

Apple’s LL-HLS protocol is the most recent technology offering to deliver low-latency streams of just 2 or 3 seconds to the viewer. Before that, CMAF which is built on MPEG DASH also enabled low latency streaming. This panel with Ateme, Akamai and THEOplayer asks how they both work, their differences and also maps out a way to deliver both at once covering the topic from the perspective of the encoder manufacturer, the CDN and the player client.

We start with ATEME’s Mickaël Raulet who outline’s CMAF starting with its inception in 2016 with Microsoft and Apple. CMAF was published in 2018 and most recently received detailed guidelines for low latency best practice in 2020 from the DASH Industry Forum. He outlines that the idea of CMAF is to build on DASH to find a single way of delivering both DASH and HLS using once set of media. THe idea here is to minimise hits on the cache as well as storage. Harnessing the ISO BMFF CMAF adds on the ability to break chunks in to fragments opening up the promise of low latency delivery.

 

 

Mickaël discusses the methods of getting hold of these short fragments. If you store the fragments separately, then you double your storage as 4 fragments make up a whole segment. So it’s better to have all the fragments written as a segment. We see that Byterange requests are the way forward whereby the client asks the server to start delivering a file from a certain number of bytes into the file. We can even request this ahead of time, using a preload hint, so that the server can push this data when it’s ready.

Next we hear from Akamai’s Will Law who examines how Apples LL-HLS protocol can work within the CDN to provide either CMAF for LL-HLS from the same media files. He uses the example of a 4-second segments with four second-long parts. A standard latency player would want to download the whole 4-second segment where as a LL-HLS player would want the parts. DASH, has similar requirements and so Will focusses on how to bring all of these requirements down into the mimum set of files needed which he calls a ‘common cache footprint’ using CMAF.

He shows how byterange requests work, how to structure them and explains that, to help with bandwidth estimation, the server will wait until the whole of the byterange is delivered before it sends any data thus allowing the client to download a wire speed. Moreover a single request can deliver the rest of the segments meaning 7 requests get collapsed into 1 or 2 requests which is an important saving for CDNs working at scale. It is possible to use longer GOPs for a 4-second video clip than for 1-second parts, but for this technique to work, it’s important to maintain the same structure within the large 4-second clip as in the 1-second parts.

THEOplayer’s Pieter-Jan Speelmans takes the floor next explaining his view from the player end of the chain. He discusses support for LL-HLS across different platforms such as Android, Android TV, Roku etc. and concludes that there is, perhaps surprisingly, fairly wide support for Apple’s LL-HLS protocol. Pieter-Jan spends some time building on Will’s discussion about reducing request numbers for browsers, CORS checking can increase cause extra requests to be needed when using Byterange requests. For implementing ABR, it’s important to understand how close you are to the available bandwidth. Pieter-Jan says that you shouldn’t only use the download time to determine throughput, but also metadata from the player to get as an exact estimate as possible. We also hear about dealing with subtitles which can need to be on screen longer than the duration of any of the parts or even of the segment length. These need to be adapted so that they are shown repeatedly and each chunk contains the correct information. This can lead to flashing on re-display so, as with many things in modern players, needs to be carefully and intentionally dealt with to ensure the correct user experience.

The last part of the video is a Q&A which covers:

  • Use of HTTP2 and QUIC/HTTP3
  • Dynamic Ad Insertion for low latency
  • The importance of playlist blocking
  • Player synchronisation with playback rate adjustment
  • Player analytics
  • DRM insertion problems at low-latency

    Watch now!
    Speakers

    Will Law Will Law
    Chief Architect, Edge Technology Group,
    Akamai
    Mickaël Raulet Mickaël Raulet
    CTO,
    ATEME
    Pieter-Jan Speelmans Pieter-Jan Speelmans
    CTO & Founder,
    THEOPlayer
  • Video: The OTT Quality Challenge

    Quality of Experience (QoE) has a wider meaning than Quality of Service (QoS) even though viewers have a worse time if either are impacted. What’s the difference and how are companies trying to deal with maximising enjoyment of their services? This panel from Streaming Media brings together Akamai’s Will Law, Robert Colantuoni from Disney Streaming Services, CJ Harvey from HBO Max. and Ian Greenblatt from JD Power detail the nuances of Quality of Experience.

    The panel starts by outlining some of the differences between QoS and QoE. Ian explains that QoE is about the whole experience of the UI, recommendations, search, rebuffering and much more. QoS can impact QoE but is restricted to the success of the delivery the stream itself. QoS measures impairments such as rebuffering, macroblocking, video quality, time to play etc. Whilst poor QoS will usually reduce QoE, there’s a lot that a well-written player can do to mitigate the effects of QoS. Having good QoE is ensuring the viewer can put trust in each of their ‘clicks’, that they will know what will happen and won’t have to wait.

     

     

    Measuring QoE is not without its challenges, afterall what should you measure? Rebuffering measured second-to-second gives you different results than measuring over 10-second windows. Will Law highlighted CTA 2066 which is a free specification. There is also a QoE best practices white paper from Akamai.

    “Multi-CDN is the new norm” declares Will Law, as the conversation turns to how players should deal with CDN selection. The challenge is to be picking for the CDN which works best for the user. Robert points out that a great CDN in one geography may not perform so well in another. A player making a ping-based choice at the beginning of playback is going to make a much worse choice overall than a player which samples each CDN in turn and continues to pick the best. This needs to be done carefully though, giving each CDN time to warm up and usefully affect its pre-fetch capabilities.

    Where QoE raises itself over QoS is in questions of perception. A good player will not simply target high bitrate, but take in to account colour volume depth, resolution and device to name but three.

    There are plenty of questions from the audience covering load balancers, jarring changes between sharp, high budget productions and old episodes of 4:3 TV dramas plus a look-ahead to the next two years of streaming.

    Watch now!
    Speakers

    Will Law Will Law
    Chief Architect, Edge Technology Group,
    Akamai
    CJ Harvey CJ Harvey
    VP Product Management,
    HBO Max
    Robert Colantuoni Robert Colantuoni
    Content Distribution Performance Architect,
    Disney Streaming Services
    Ian Greenblatt Ian Greenblatt
    Managing Director,
    J.D. Power
    Tim Siglin Moderator: Tim Siglin
    Contributing Editor,
    Streaming Media

    Video: Player Optimisations

    If you’ve ever tried to implement your own player, you’ll know there’s a big gap between understanding the HLS/DASH spec and getting an all-round great player. Finding the best, most elegant, ways of dealing with problems like buffer exhaustion takes thought and experience. The same is true for low-latency playback.

    Fortunately, Akamai’s Will Law is here to give us the benefit of his experience implementing his own and helping customers monitor the performance of their players. At the end of the day, the player is the ‘kingpin’ of streaming, comments Will. Without it, you have no streaming experience. All other aspects of the stream can be worked around or mitigated, but if the player’s not working, no one watches anything.

    Will’s first tip is to implement ‘segment abandonment’. This is when a video player foresees that downloading the current segment is taking too long; if it continues, it will run out of video to play before the segment has arrived. A well-programmed player will sport this and try to continue the download of this segment from another server or CDN. However, Will says that many will simply continue to wait for the download and, in the meantime, the download will fail.

    Tip two is about ABR switching in low-latency, chunked transfer streams. The playback buffer needs to be longer than the chunk duration. Without this precaution, there will not be enough time for the player to make the decision to switch down layers. Will shows a diagram of how a 3-second playback buffer can recover as long as it uses 2-second segments.

    Will’s next two suggestions are to put your initial chunk in the manifest by base64-encoding it. This makes the manifest larger but removes the round-trip which would otherwise be used to request the chunk. This can significantly improve the startup performance as the RTT could be a quarter of a second which is a big deal for low-latency streams and anyone who wants a short time-to-play. Similarly, advises Will, make those initial requests in parallel. Don’t wait for the init file to be downloaded before requesting the media segment.

    Whilst many of points in this talk focus on the player itself, Will says it’s wise for the player to provide metrics back to the CDN, hidden in the request headers or query args. This data can help the CDN serve media smarter. For instance, the player could send over the segment duration to the CDN. Knowing how long the segment is, the CDN can compare this to the download time to understand if it’s serving the data too slow. Perhaps the simplest idea is for the player to pass back a GUID which the CDN can put in the logs. This helps identify which of the millions of lines of logs are relevant to your player so you can run your own analysis on a player-by-player level.

    Will’s other points include advice on how to avoid starting playing at the lowest bandwidth and working up. This doesn’t look great and is often unnecessary. The player could run its own speed test or the CDN could advise based on the initial requests. He advises never trusting the system clock; use an external clock instead.

    Regarding playback latency, it pays to be wise when starting out. If you blindly start an HLS stream, then your latency will be variable within the duration of a segment. Will advocates HEAD requests to try to see when the next chunk is available and only then starting playback. Another technique is to vary your playback rate o you can ‘catch up’. The benefit of using rate adjustment is that you can ask all your players to be at a certain latency behind realtime so they can be close to synchronous.

    Two great tips which are often overlooked: Request multiple GOPs at once. This helps open up the TCP windows giving you a more efficient download. For mobile, it can also help the battery allowing you to more efficiently cycle the radio on and off. Will mentions that when it comes to GOPs, for some applications its important to look at exactly how long your GOP should be. Usually aligning it with an integer number of audio frames is the way to choose your segment duration.

    The talk finishes with an appeal to move to using CMAF containers for streaming ask they allow you to deliver HLS and DASH streams from the same media segments and move to a common DRM. Will says that CBCS encrypted content is now becoming nearly all-pervasive. Finally, Will gives some tips on how players are best to analyse which CDN to use in multi-CDN environments.

    Watch now!
    Speaker

    Will Law Will Law
    Chief Architect,
    Akamai

    Video: Reducing peak bandwidth for OTT

    ‘Flattening the curve’ isn’t just about dealing with viruses, we learn from Will Law. Rather, this is one way to deal with network congestion brought on by the rise in broadband use during the global lockdown. This and other key ways such as per-title encoding and removing the top tier are just two other which are explored in this video from Akamai and Bitmovin.

    Will Law starts the talk explaining why congestion happens in a world where ABR (adaptive bitrate streaming) is supposed to deal with this. With Akamai’s traffic up by around 300%, it’s perhaps not a surprise there’s a contest for bandwidth. As not all traffic is a video stream, congestion will still happen when fighting with other, static, data transfers. However deeper than that, even with two ABR streams, the congestion protocol in use has a big impact as will shows with a graph showing Akamai’s FastTCP and BBR where BBR steals all the bandwidth rather than ‘playing fair’.

    Using a webpage constructed for the video, Will shows us a baseline video playback and the metrics associated with it such as data transferred and bitrate which he uses to demonstrate the different benefits of bitrate production techniques. The first is covered by Bitmovin’s Sean McCarthy who explains Bitmovin’s per-title encoding technology. This approach ensures that each asset has encoder settings tuned to get the best out of the content whilst reducing bandwidth as opposed to simply setting your encoder to a fairly-high, safe, static bitrate for all content no matter how complex it is. Will shows on the demo that the bitrate reduces by over 50%.

    Swapping codecs is an obvious way to reduce bandwidth. Unlike per-title encoding which is transparent to the end-user, using AV1, VP9 or HEVC requires support by the final device. Whilst you could offer multiple versions of your assets to make sure you still cover all your players despite fragmentation, this has the downside of extra encoding costs and time.

    Will then looks at three ways to reduce bandwidth by stopping the highest-bitrate rendition from being used. Method one is to manually modify the manifest file. Method two demonstrates how to do so using the Bitmovin player API, and method three uses the CDN itself to manipulate the manifests. The advantage of doing this in the CDN is because this allows much more flexibility as you can use geolocation rules, for example, to deliver different manifests to different locations.

    The final method to reduce peak bandwidth is to use the CDN to throttle download speed of the stream chunks. This means that while you may – if you are lucky – have the ability to download at 100Mbps, the CDN only delivers 3- or 5-times the real-time bitrate. This goes a long way to smoothing out the peaks which is better for the end user’s equipment and for the CDN. Seen in isolation, this does very little, as the video bitrate and the data transferred remain the same. However, delivering the video in this much more co-operative way is much more likely to cause knock-on problems for other traffic. It can, of course, be used in conjunction with the other techniques. The video concludes with a Q&A.

    Watch now!
    Speakers

    Will Law Will Law
    Chief Architect,
    Akamai
    Sean McCarthy Sean McCarthy
    Technical Product Marketing Manager,
    Bitmovin