Video: CMAF Live Media Ingest Protocol Masterclass

We’ve heard before on The Broadcast Knowledge about CMAF’s success at bringing down the latency for live streaming to around 3 seconds. CMAF is standards-based and works with Apple devices, Android, Windows and much more. And while that’s gaining traction for delivery to the home, many are asking whether it could be a replacement technology for contribution into the cloud.

Rufael Mekuria from Unified Streaming has been working on bringing CMAF to encoders and packagers. The work in the DASH Industry Forum has centred on two key points in the streaming architecture: the first is the interface between the output of the encoder and the input of the packager; the second sits between the packager and the origin. This work has been ongoing for over a year and a half, so let’s pause to ask why we need a new protocol for ingest.

RTMP and Smooth Streaming have not been deprecated, but they have not been specified to carry the latest codecs. While searching for alternatives, people have started to use fragmented MP4 and CMAF-style technologies for contribution in their own, non-interoperable ways. Push-based DASH and HLS are common but in need of standardisation, and the same work can address support for timed metadata such as splice information for ads.

The result of the work is a method of using a separate TCP connection for each essence track: there is a POST command for each subtitle stream, metadata track, video track and so on. This can be done with fixed-length POSTs, but is better achieved with chunked transfer encoding, as sketched below.
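
As a rough illustration of that pattern, here’s what one per-track ingest connection might look like in Python. The URL, track name and fragment source are assumptions for the sketch; handing `requests` a generator makes it send the body with `Transfer-Encoding: chunked`, so each CMAF fragment goes out as soon as it’s ready.

```python
import requests

def cmaf_fragments(track_path):
    """Yield pieces of a fragmented MP4 track as they become available.
    A real encoder would yield complete moof+mdat fragments."""
    with open(track_path, "rb") as f:
        while chunk := f.read(64 * 1024):
            yield chunk

# One long-lived POST per essence track; a generator body makes `requests`
# use chunked transfer encoding rather than a fixed Content-Length.
INGEST_URL = "https://ingest.example.com/live/channel1/video_500k.cmfv"  # hypothetical
response = requests.post(INGEST_URL, data=cmaf_fragments("video_500k.cmfv"))
print(response.status_code)
```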

Rufael next shows us an example of a CMAF track. Based on the ISO BMFF standard, CMAF specifies which ‘boxes’ can be used, and it provides optional boxes for use in the CMAF fragments. Time is important, so it is carried in the base media decode time (the ‘tfdt’ box), a UNIX-style timestamp that can be inserted into both the fragment and the CMAF header.
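
For a feel of what that looks like on the wire, here is a minimal sketch of building a version-1 ‘tfdt’ (TrackFragmentBaseMediaDecodeTime) box in Python, with the decode time derived from the Unix epoch. The timescale value is an assumption for the example.

```python
import struct
import time

def tfdt_box(decode_time: int) -> bytes:
    """Build a version-1 'tfdt' box:
    size(4) | 'tfdt'(4) | version+flags(4) | baseMediaDecodeTime(8)."""
    payload = struct.pack(">B3xQ", 1, decode_time)  # version=1, flags=0
    return struct.pack(">I4s", 8 + len(payload), b"tfdt") + payload

TIMESCALE = 10_000_000  # ticks per second; an assumption for this sketch
box = tfdt_box(int(time.time() * TIMESCALE))
print(box.hex())
```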

With all media being sent separately, the standard provides a way to define groups of essences, both implicitly and explicitly. Redundancy and hot failover are provided for by having multiple sources ingest to multiple origins; using the timestamp synchronisation, identical fragments can be detected, as sketched below.
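
The spec doesn’t mandate how an origin detects duplicates, so the keying scheme below is an assumption, but it shows the idea: fragments from a redundant, timestamp-synchronised source carry the same decode time and can simply be dropped.

```python
fragments: dict[tuple[str, int], bytes] = {}

def accept_fragment(track_id: str, decode_time: int, data: bytes) -> bool:
    """Keep the first copy of each fragment; identical fragments pushed by a
    second, timestamp-synchronised source for hot failover are dropped."""
    key = (track_id, decode_time)
    if key in fragments:
        return False  # duplicate from the redundant source
    fragments[key] = data
    return True
```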

The additional timed metadata track is based on the ISO BMFF standard and can be fragmented just like other media. This work has extended the standard to allow carrying the DASH EventMessageBox in the timed metadata track, in order to reuse existing specifications like ID3 and SCTE 214 for carrying SCTE 35 messages.
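
To make that concrete, here is a minimal sketch of a version-1 DASH EventMessageBox (‘emsg’) as defined in MPEG-DASH; the timing values are placeholders and the SCTE 35 payload is omitted.

```python
import struct

def emsg_v1(scheme: str, value: str, timescale: int, presentation_time: int,
            duration: int, event_id: int, message: bytes) -> bytes:
    """Build a version-1 DASH EventMessageBox ('emsg'): FullBox header,
    then timescale(4), presentation_time(8), event_duration(4), id(4),
    two null-terminated strings, and the raw message payload."""
    payload = struct.pack(">B3xIQII", 1, timescale, presentation_time,
                          duration, event_id)
    payload += scheme.encode() + b"\x00" + value.encode() + b"\x00" + message
    return struct.pack(">I4s", 8 + len(payload), b"emsg") + payload

# e.g. a splice signal carried per SCTE 214; the binary payload is omitted here
box = emsg_v1("urn:scte:scte35:2013:bin", "1", 90000, 900000, 0, 1, b"")
```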

Rufael finishes by explaining how SCTE messages are inserted with reference to IDR frames, outlines how the DASH/HLS ingest interface between the packager and origin server works, and shows a demo.

Watch now!
Speaker

Rufael Mekuria
Head of Research & Standardisation,
Unified Streaming

Video: Broadcasting WebRTC over Low Latency DASH


Using sub-second WebRTC with the scalability of CMAF: Allowing panelists and presenters to chat in real-time is really important to foster fluid conversations, but broadcasting that out to thousands of people scales more easily with CMAF based on MPEG DASH. In this talk, Mux’s Dylan Jhaveri (formerly CTO, Crowdcast.io) explains how they’ve combined WebRTC and CMAF to keep latencies low for everyone.

Speaking at the San Francisco VidDev meetup, Dylan explains that the Crowdcast webpage allows you to watch a number of participants talk in real-time as a live stream, with live chat down the side of the screen. The live chat, naturally, feeds into the live conversation, so latency needs to be low for the viewers as much as for the on-camera participants. For the participants, WebRTC is used, as this is one of the very few options that provides reliable sub-second streaming. To keep the interactivity between the chat and the participants, Crowdcast decided to look at ultra-low-latency CMAF, which can deliver between 1 and 5 seconds’ latency depending on your risk threshold for rebuffering. So the task became to convert a WebRTC call into a low-latency stream that could easily be received by thousands of viewers.

Dylan points out that they were already taking WebRTC into the browser, as that’s how people were using the platform. Therefore, using headless Chrome should allow them to pipe the video from the browser into ffmpeg and create an encode without having to composite individual streams, whilst giving Crowdcast full layout control. A rough sketch of the idea follows.
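
This is a hedged sketch of that pipeline, not Crowdcast’s actual implementation: render the page on a virtual display and screen-grab it with ffmpeg. The display number, page URL, resolution and encoder settings are all assumptions.

```python
import os
import subprocess
import time

DISPLAY = ":99"                        # virtual display; an assumption
ROOM_URL = "https://example.com/room"  # hypothetical event page

# A virtual framebuffer for Chrome to render into.
subprocess.Popen(["Xvfb", DISPLAY, "-screen", "0", "1280x720x24"])
time.sleep(2)  # crude: give Xvfb a moment to come up

# Chrome joins the WebRTC call exactly as a viewer's browser would.
subprocess.Popen(
    ["google-chrome", "--kiosk", "--autoplay-policy=no-user-gesture-required", ROOM_URL],
    env={**os.environ, "DISPLAY": DISPLAY},
)
time.sleep(5)  # crude: wait for the page to load

# ffmpeg grabs the composited page and packages it as segmented DASH.
# (Audio capture is omitted for brevity.)
subprocess.run([
    "ffmpeg",
    "-f", "x11grab", "-video_size", "1280x720", "-framerate", "30", "-i", DISPLAY,
    "-c:v", "libx264", "-preset", "veryfast", "-tune", "zerolatency",
    "-f", "dash", "-seg_duration", "2", "-streaming", "1",
    "out/stream.mpd",
])
```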

After a few months of tweaking, Dylan and his colleagues had Chrome going into ffmpeg, then into a Node.js server which delivers CMAF chunks and manifests (click to learn more about how CMAF works). In order to scale this, Dylan explains the logic implemented in a CDN to use the Node.js server, running in a Docker container, as an origin server. Using HLS they have a 95% cache-hit rate and achieve 15 seconds of latency. The tests at the time of the talk, Dylan explains, showed that the CMAF implementation hits 3 seconds of latency and was working as expected.

The talk ends with a Q&A covering how they get the video out of the headless Chrome, whether CMAF latency could be improved, and why there are so many Docker containers.

Watch now!
Speaker

Dylan Jhaveri
Senior Software Engineer, Mux
Formerly CTO & Co-founder, Crowdcast.io

Video: Player Optimisations

If you’ve ever tried to implement your own player, you’ll know there’s a big gap between understanding the HLS/DASH spec and getting an all-round great player. Finding the best, most elegant, ways of dealing with problems like buffer exhaustion takes thought and experience. The same is true for low-latency playback.

Fortunately, Akamai’s Will Law is here to give us the benefit of his experience implementing his own and helping customers monitor the performance of their players. At the end of the day, the player is the ‘kingpin’ of streaming, comments Will. Without it, you have no streaming experience. All other aspects of the stream can be worked around or mitigated, but if the player’s not working, no one watches anything.

Will’s first tip is to implement ‘segment abandonment’. This is when a video player foresees that downloading the current segment is taking too long: if it continues, the player will run out of video to play before the segment has arrived. A well-programmed player will spot this and try to continue the download of the segment from another server or CDN. However, Will says that many players will simply continue to wait for the download and, in the meantime, playback will stall. A sketch of the idea follows.
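
Here is a minimal sketch of the idea, assuming we can estimate throughput mid-download; the chunk size and fallback behaviour are placeholders.

```python
import time
import requests

def fetch_with_abandonment(url: str, segment_bytes: int,
                           buffer_s: float) -> bytes | None:
    """Download one segment, but give up early if the projected finish time
    would exhaust the playback buffer; the caller can then retry the segment
    from another server or CDN, as Will suggests."""
    deadline = time.monotonic() + buffer_s
    start = time.monotonic()
    body = b""
    with requests.get(url, stream=True) as resp:
        for chunk in resp.iter_content(chunk_size=16 * 1024):
            body += chunk
            elapsed = time.monotonic() - start
            rate = len(body) / max(elapsed, 1e-6)           # bytes/sec so far
            eta_s = (segment_bytes - len(body)) / max(rate, 1.0)
            if time.monotonic() + eta_s > deadline:
                return None  # abandon: it won't arrive before we run dry
    return body
```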

Tip two is about ABR switching in low-latency, chunked-transfer streams. The playback buffer needs to be longer than the chunk duration; without this precaution, there will not be enough time for the player to make the decision to switch down a layer. Will shows a diagram of how a 3-second playback buffer can recover as long as it uses 2-second segments. That constraint reduces to a simple check, sketched below.
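
The numbers mirror Will’s example; the half-second decision margin is an assumption.

```python
def can_switch_down(buffer_s: float, segment_s: float,
                    decision_margin_s: float = 0.5) -> bool:
    """The buffer must outlast the segment currently downloading plus the
    time needed to notice trouble and request a lower rendition."""
    return buffer_s > segment_s + decision_margin_s

print(can_switch_down(buffer_s=3.0, segment_s=2.0))  # True, as in Will's diagram
```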

Will’s next two suggestions are, first, to put your initialisation chunk in the manifest by base64-encoding it. This makes the manifest larger but removes the round trip which would otherwise be used to request the chunk, which can significantly improve startup performance: the RTT could be a quarter of a second, a big deal for low-latency streams and anyone who wants a short time-to-play (a sketch follows). Similarly, advises Will, make those initial requests in parallel: don’t wait for the init file to be downloaded before requesting the media segment.
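
A hedged sketch of the base64 idea: embed the init segment as a data: URL so the player needs no extra round trip. Whether your packager and player support this particular form is an assumption to verify.

```python
import base64

with open("init.mp4", "rb") as f:   # the stream's initialisation segment
    init_b64 = base64.b64encode(f.read()).decode()

# Embedded in the manifest as a data: URL, so no separate request is needed.
initialization = f'<Initialization sourceURL="data:video/mp4;base64,{init_b64}"/>'
print(initialization[:80])
```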

Whilst many of the points in this talk focus on the player itself, Will says it’s wise for the player to provide metrics back to the CDN, hidden in the request headers or query args; this data can help the CDN serve media more intelligently. For instance, the player could send the segment duration to the CDN: knowing how long the segment is, the CDN can compare this to the download time to understand whether it’s serving the data too slowly. Perhaps the simplest idea is for the player to pass back a GUID which the CDN can put in its logs. This helps identify which of the millions of lines of logs are relevant to your player, so you can run your own analysis on a player-by-player level. A sketch of the idea follows.
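
As a simple sketch, query arguments like these are all it takes; the parameter names are hypothetical, not from any spec.

```python
import uuid
import requests

PLAYER_GUID = str(uuid.uuid4())  # one GUID per player session, logged by the CDN

def fetch_segment(url: str, segment_duration_s: float) -> bytes:
    """Piggyback player metrics onto the request as query args (names are
    hypothetical) so the CDN can compare download time to media duration."""
    params = {"sid": PLAYER_GUID, "sd": segment_duration_s}
    resp = requests.get(url, params=params)
    return resp.content
```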

Will’s other points include advice on how to avoid starting playback at the lowest bandwidth and working up, which doesn’t look great and is often unnecessary; the player could run its own speed test, or the CDN could advise based on the initial requests. He also advises never trusting the system clock; use an external clock instead.

Regarding playback latency, it pays to be wise when starting out. If you blindly start an HLS stream, then your latency will be variable within the duration of a segment. Will advocates HEAD requests to see when the next chunk is available, and only then starting playback. Another technique is to vary your playback rate so you can ‘catch up’. The benefit of using rate adjustment is that you can ask all your players to sit at a certain latency behind realtime so they can be close to synchronous, as sketched below.
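
A minimal sketch of latency-targeted rate adjustment; the gain and rate limits are assumptions, but the shape is the common one.

```python
def playback_rate(latency_s: float, target_s: float,
                  gain: float = 0.05, max_dev: float = 0.04) -> float:
    """Speed up slightly when behind the target latency, slow down when
    ahead, so a fleet of players converges on the same distance from live."""
    rate = 1.0 + gain * (latency_s - target_s)
    return max(1.0 - max_dev, min(1.0 + max_dev, rate))

print(playback_rate(latency_s=4.5, target_s=3.0))  # 1.04: catch up gently
```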

Two great tips which are often overlooked: request multiple GOPs at once. This helps open up the TCP window, giving you a more efficient download; for mobile, it can also help the battery by allowing you to cycle the radio on and off more efficiently. Will mentions that when it comes to GOPs, for some applications it’s important to look at exactly how long your GOP should be; usually, aligning the segment duration with an integer number of audio frames is the way to choose it. A worked example follows.
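
As a worked example of that alignment (48 kHz AAC and 30 fps video are assumptions): an AAC frame is 1024 samples, about 21.33 ms at 48 kHz, which doesn’t divide evenly into round segment durations. This sketch finds durations that hold a whole number of both audio and video frames.

```python
from fractions import Fraction
from math import gcd, lcm

audio = Fraction(1024, 48000)  # AAC frame at 48 kHz -> 8/375 s
video = Fraction(1, 30)        # 30 fps frame        -> 1/30 s

# LCM of two reduced fractions: lcm of numerators over gcd of denominators.
step = Fraction(lcm(audio.numerator, video.numerator),
                gcd(audio.denominator, video.denominator))

for k in range(1, 7):
    d = k * step
    print(f"{float(d):.4f}s = {d / video} video frames, {d / audio} audio frames")
# e.g. 1.6000s = 48 video frames, 75 audio frames
```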

The talk finishes with an appeal to move to using CMAF containers for streaming, as they allow you to deliver HLS and DASH streams from the same media segments and move to a common DRM; Will says that CBCS-encrypted content is now becoming nearly all-pervasive. Finally, Will gives some tips on how players can best analyse which CDN to use in multi-CDN environments.

Watch now!
Speaker

Will Law
Chief Architect,
Akamai

Video: Low-latency DASH Streaming Using Open Source Tools

Low-Latency DASH, also known as LL-DASH, is a modification of MPEG DASH that allows it to operate with close to two seconds’ latency, bringing it down to meet, or beat, standard broadcast signals.

Brightcove’s Bo Zhang starts by outlining the aims and the methods of getting there. For instance, he explains, HTTP/1.1 chunked transfer encoding is key to low-latency streaming as it allows the server to start sending a video segment as it’s being written, rather than waiting until the file is complete. LL-DASH also has the ability to state an availability window (‘availabilityTimeOffset’), sketched below.
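
As a rough sketch of how availabilityTimeOffset shifts things (the formula follows the usual DASH segment-availability calculation; treat the details as an assumption to check against the spec):

```python
from datetime import datetime, timedelta, timezone

def segment_available_at(availability_start: datetime, index: int,
                         seg_duration_s: float, ato_s: float) -> datetime:
    """A segment normally becomes available once fully written, i.e. at its
    end time; availabilityTimeOffset (ato) lets the server advertise it
    earlier, because chunked transfer can serve it while it is still growing."""
    end_of_segment = availability_start + timedelta(seconds=(index + 1) * seg_duration_s)
    return end_of_segment - timedelta(seconds=ato_s)

ast = datetime(2020, 1, 1, tzinfo=timezone.utc)
print(segment_available_at(ast, index=0, seg_duration_s=2.0, ato_s=1.5))
```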

As LL-DASH is a living standard, there are updates on the way: resync points will allow a player to receive a list of places where it can join a stream, using SAP types from the ISO BMFF spec; the server can send a ‘service description’ to the player, which can use the information to adjust its latency; and event messages can now be inserted in the middle of segments.

Bo then moves on to explain that he and the team have set up an experiment to gain experience with LL-DASH and test overall latency. He shows that they decided to stream RTMP out of OBS, into a GitHub project called ‘node-gpac-dash’ and then to the dash.js player, all between Boston and Seattle. This test ran at 800×600, 30fps with a bitrate of 2.5Mbps and showed results of between 2.5 and 5 seconds depending on the network conditions.

As Bo moves towards the Q&A, he says that low-latency streaming is less scalable because a TCP connection needs to be kept open between the player and the CDN, which is a burden. Another compromise is that smaller chunk sizes in LL-DASH give reduced latency but increase I/O, meaning sometimes you may have to increase the chunk sizes (and hence the latency of the stream) to allow for better performance. He also adds that adverts are more difficult with low-latency streams due to the short amount of time available to request and receive the advertising.

Watch now!
More detail about the experiments in this talk can be found in Bo’s blog post.
Speaker

Bo Zhang
Staff Video System Engineer, Research
Brightcove