Video: Tech Talks: Low-Latency Live Streaming

There are a number of techniques for achieving low-latency streaming. This talk is one of the few which introduces them in easy to understand ways and then puts them in context briefly showing the manifests or javascript examples of how these would be seen in the wild. Whilst there are plenty of companies who don’t need low-latency streaming, for many it’s a key part of their offering or it’s part of the business model itself. Knowing the techniques in play is to better understand internet streaming in general.

Jameson Steiner from Bitmovin starts by explaining why there is a motivation to cut the latency. One big motivation, aside from the standard live sports examples, is user-generated content like on Twitch where it’s very clear to the streamer, and quite off-putting, when there is large amounts of delay. Whilst delay can be adapted to, the more there is the less interaction is possible. In this situation, it’s the ‘handwaving’ latency that comes in to play. You want the hand on the screen to wave pretty much at the same time as your hand waves in front of the camera. Jameson places different types of distribution on a chart showing latency and we see that low-latency of 5 seconds or less will not only match traditional TV broadcasts, but also work well for live streamers.

Naturally, to fix a problem you need to understand the problem, so Jameson breaks down the legacy methods of delivery to show why the latency exists. The issue comes down to how video is split into sections, say 6 seconds, so that the player downloads a section at a time, reassembles and plays them. Looking from the player’s perspective, if the network suddenly broke or reduced its throughput, it makes sense to have several chunks in reserve. Having three 6-second chunks, a sensible precaution, makes you 18 seconds behind the curve from the off.

Clearly reducing the segement size is a winner in this scenario. Three 3 second segments will give you just 9 seconds latency; why not go to 1 second? Well encoding inefficiency is one reason. If you reduce the amount of time a temporal codec has of a video, its efficiency will drop and bitrate will increase to maintain quality. Jameson explains the other knock-on effects such as CDN inefficiencies and network requests. The standardised way to avoid these problems is to use CMAF (Common Media Application Format) which is based on MPEG DASH and ISO BMFF. CMAF, and DASH in general, has the benefit of coming from a standards body whose aim was to remove vendor lock-in that may be felt with HLS and was certainly felt with RTMP. Check out MPEG’s short white paper on the topic (zipped .docx file)

CMAF uses chunked transfer meaning that as the encoder writes the data to the disk, the web server sends it to the client. This is different to the default where a file is only sent after it’s been completely written. This has the effect of the not having to wait up to 6 seconds to a 6-second chunk to start being sent; the download time also needs to be counted. Rather, almost as soon as the chunk has been finished by the encoder, it’s arrived at the destination. This is a feature of HTTP 1.1 and after so is not new, but it still needs to be enabled and considered as part of the delivery.

CMAF goes beyond simple HTTP 1.1 chunked transfer which is a technique used in low-latency HLS, covered later, by creating extra structure within the 6-second segment (until now, called a chunk in this article). This extra structure allows the segment to be downloaded in smaller chunks decoupling the segment length from the player latency. Chunked transfer does cause a notable problem however which has not yet been conclusively solved. Jameson explains how traditionally each large segment typically arrives faster than realtime. By measuring how fast it arrives, given the player knows the duration, it can estimate the bandwidth available at that time on the network. With chunked transfer, as we saw, we are receiving data as it’s being created. By definition, we are now getting it in realtime so there is no opportunity to receive it any quicker. The bandwidth estimation element, as shown the presentation, is used to work out if the player needs to go down or could go up to another stream at a different bitrate – part of standard ABR. So the catastrophe here is the going down in latency has hampered our ability to switch bitrates and whilst the viewer can see the video close to real-time, who’s to say if they are seeing it at the best quality?

Low-Latency HLS/DASH is a way of extending DASH and HLS without using CMAF. Jameson explains some techniques such as advertising segments in advance to allow players to pre-request. It also relies on finding the compromise point of encoding inefficiency vs segment length, typically held to be around 2 seconds, to minimise the latency. At this point we start seeing examples of the techniques in manifests and javascript allowing us to understand how this is actually signalled and implemented.

Apple is on its second major revision of LL-HLS which has responded to many of the initial complaints from the community. Whilst it can use HTTP/2 to help push segments out, this caused problems in practice so it can now preload hints, as Jameson explains in order to remove round-trip times from requests. Jameson looks at the other of Apple’s techniques and shows how they look in manifest files.

The final section looks at problems in implementing these features such as chunks being fragmented across TCP packets, the bandwidth estimation question and dealing with playback speed in order to adjust the players position in time – speed-ups and slow-downs of 5 to 10% can be possible depending on content.

Watch now!
Download the presentation
Speaker

Jameson Steiner Jameson Steiner
Software Engineer,
Bitmovin

Video: Encoding and packaging for DVB-I services

There are many ways of achieving a hybrid of OTT-delivered and broadcast-delivered content, but they are not necessarily interoperable. DVB aims to solve the interoperability issue, along with the problem of service discovery with DVB-I. This specification was developed to bring linear TV over the internet up to the standard of traditional broadcast in terms of both video quality and user experience.

DVB-I supports any device with a suitable internet connection and media player, including TV sets, smartphones, tablets and media streaming devices. The medium itself can still be satellite, cable or DTT, but services are encapsulated in IP. Where both broadband and broadcast connections are available, devices can present an integrated list of services and content, combining both streamed and broadcast services.

DVB-I standard relies on three components developed separately within DVB: the low latency operation, multicast streaming and advanced service discovery. In this webinar, Rufael Mekuria from Unified Streaming focuses on low latency distributed workflow for encoding and packaging.

 

The process starts with an ABR (adaptive bit rate) encoder responsible for producing streams with multiple bit rates and clear segmentation – this allows clients to automatically choose the best video quality depending on available bandwidth. Next step is packaging where streaming manifests are added and content encryption is applied, then data is distributed through origin servers and CDNs.

Rufael explains that low latency mode is based on an enhancement to the DVB-DASH streaming specification known as DVB Bluebook A168. This incorporates the chunked transfer encoding of the MPEG CMAF (Common Media Application Format), developed to enable co-existence between the two principle flavors of adaptive bit rate streaming: HLS and DASH. Chunked transfer encoding is a compromise between segment size and encoding efficiency (shorter segments make it harder for encoders to work efficiently). The encoder splits the segments into groups of frames none of which requires a frame from a later group to enable decoding. The DASH packager then puts each group of frames into a CMAF chunk and pushes it to the CDN. DVB claims this approach can cut end-to-end stream latency from a typical 20-30 seconds down to 3-4 seconds.

The other topics covered are: encryption (exhanging key parameters using CPIX), content insertion, metadata, supplemental descriptors, TTML subitles and MPD proxy.

Watch now!

Download the slides.

Speaker

Rufael Mekuria Rufael Mekuria
Head of Research & Standardization
Unified Streaming

Video: CMAF and DASH-IF Live ingest protocol

Of course without live ingest of content into the cloud, there is no live streaming so why would we leave such an important piece of the puzzle to an unsupported protocol like RTMP which has no official support for newer codecs. Whilst there are plenty of legacy workflows that still successfully use RTMP, there are clear benefits to be had from a modern ingest format.

Rufael Mekuria from Unified Streaming, introduces us to DASH-IF’s CMAF-based live ingest protocol which promises to solve many of these issues. Based on the ISO BMFF container format which underpins MPEG DASH. Whilst CMAF isn’t intrinsically low-latency, it’s able to got to much lower latencies than standard HLS, for instance.

This work to create a standard live ingest protocol was born out of an analysis, Rufael explains, of which part of the content delivery chain were most ripe for standardisation. It was felt that live ingest was an obvious choice partly because of the decaying RTMP protocol which was being sloppy replaced by individual companies doing their own thing, but also because there everyone contributing in the same way is of a general benefit to the industry. It’s not typically, at the protocol level, an area where individual vendors differentiate to the detriment of interoperability and we’ve already seen the, then, success of RMTP being used inter-operably between vendor equipment.

MPEG DAHS and HLS can be delivered in a pull method as well as pushed, but not the latter is not specified. There are other aspects of how people have ‘rolled their own’ which benefit from standardisation too such as timed metadata like ad triggers. Rufael, explaining that the proposed ingest protocol is a version of CMAF plus HTTP POST where no manifest is defined, shows us the way push and pull streaming would work. As this is a standardisation project, Rufael takes us through the timeline of development and publication of the standard which is now available.

As we live in the modern world, ingest security has been considered and it comes with TLS and authentication with more details covered in the talk. Ad insertion such as SCTE 35 is defined using binary mode and Rufael shows slides to demonstrate. Similarly in terms of ABR, we look at how switching sets work. Switching sets are sets of tracks that contain different representations of the same content that a player can seamlessly switch between.

Watch now!
Speaker

Rufael Mekuria Rufael Mekuria
Head of Research & Standardisation,
Unified Streaming

Video: Integrating CMAF Into A VOD Workflow

CMAF is often seen as the best hope for streaming to match the latency of broadcast. Fully standards based, many see this as the best route over Apple’s LL-HLS. Another benefit of it over LL-HLS is that it’s already a completed standard with growing support.

This talk from Tomas Bacik starts by explaining CMAF to us. Standing for the Common Media Application Format, it’s based on the standardised ISOBMFF container format and whilst CMAF isn’t by default low-latency, it is flexible enough to deliver just that. However, as Tomas from CDN77 points out, there are other major benefits in terms of its use of the Common Encryption format, reduces storage fees and more.

MPEG DASH is a commonly found streaming format based on ISO BMFF. It has always had the benefit of supporting other codecs such as HEVC and AV1 over HLS which is an AVC-only specification. CMAF is an extension of MPEG DASH which goes one step further in that it can deal with both HLS-style manifest files (.hls) as well as MPEG DASH format (.mpd) inheriting, of course, the multi-codec ability of DASH itself.

Next is central theme of the talk, looking at VoD workflows showing how CMAF fits in and, indeed, changes workflows for the better. CMAF directly impacts packaging, storage and CDN which is where we focus now. Given that some devices can play HLS and some can play DASH, if you try to serve both, you will double your requirements of packaging, storage etc. Dynamic packaging allows for immediately repackaging your chunks into either HLS or DASH as needed. Whilst this reduces the storage requirements, it increases processing and also increases the time to first byte. As you might expect, using CMAF throughout, Tomas explains in this talk, allows you to package once and store once which solves these problems.

Tomas continues by explaining the DRM abilities of CMAF including AES-CBC and finishes by taking questions from the audience.

Watch now!
See Streamflow’s blog post supporting the talk
Speakers

Tomas Bacik Tomas Bacik
VP of Product Development, Streamflow by CDN77
CDN77