Video: What is HESP Ultra-Low-Latency Streaming?

Is it possible to improve on CMAF’s offer of an ultra-low-latency, scalable protocol with good viewer experience? This is what HESP, the High-Efficiency Streaming Protocol, promises. With almost instant channel change times and sub-second latency, it’s worth taking a look at this protocol, created by THEOPlayer, to understand where it might work in your workflows.

Presented by Pieter-Jan Speelmans and Johan Vounckx from THEO, the talk gives more detail on HESP’s inception. Quality, latency and bitrate are often referred to as a triangle where, if you improve one or even two, the remaining factor will get worse to compensate. HESP plays in the triangle connecting ‘viewer experience’, ‘low latency’ and ‘scalability’. If you compare WebRTC with CMAF, you see that WebRTC prioritises low-latency streaming but suffers in terms of scalability. CMAF, with 2-5 seconds more latency, scales much better, but its channel zapping times are high, which affects viewer experience as well as overall latency. HESP, contends Pieter-Jan, actually improves all three. It’s able to do this because it’s not extending existing protocols which weren’t designed to meet all these requirements; rather, it’s bringing in new techniques which shift the whole equation.

THEOPlayer has created the HESP Alliance, which is devoted to standardising the HESP technology through the IETF or another avenue, promoting adoption through marketing and the creation of tools, and handling certification and the management of intellectual property. The talk outlines the decoder royalties, which can be payable per subscriber, per subscriber per hour, or per device.

Looking at the technical details, we find out that you can actually start playing an HESP stream without downloading the manifest. While HESP does have manifest files, they change very infrequently. If the manifest does change at short notice, the server can ask players to download the new one by embedding a message in the stream. The channel zapping speed is achieved using two streams, an initialisation stream and a continuation stream. The initialisation stream contains just I and P frames, allowing you to start playing immediately. The continuation stream is intended to be the low-bitrate stream used once playback has been established.
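The two-stream start-up can be pictured with a small Python sketch. This is not THEO's implementation: the `Frame` type and `frames_for_join` function are hypothetical names, and the sketch assumes that the initialisation stream can supply an independently decodable frame at whatever position the viewer joins, with the continuation stream carrying everything after it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    index: int   # position on the shared timeline
    kind: str    # frame type, e.g. "I" or "P"
    stream: str  # "init" or "cont" (hypothetical labels)


def frames_for_join(join_index: int, count: int) -> List[Frame]:
    """List the frames a player would request when joining at join_index.

    Assumption: the first frame comes from the initialisation stream so
    decoding can start at once; later frames come from the continuation
    stream.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    first = Frame(join_index, "I", "init")
    rest = [Frame(join_index + i, "P", "cont") for i in range(1, count)]
    return [first] + rest


if __name__ == "__main__":
    for frame in frames_for_join(join_index=100, count=4):
        print(frame)
```

Only the first frame is taken from the initialisation stream, so the player does not have to wait for the next segment boundary before it can start decoding.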

HESP uses two modes: Maximal Gain and Maximal Compatibility. Maximal Gain aims to have the lowest latency, lowest bandwidth and lowest zapping times. It has long segments with one-frame chunks, each containing one I or P frame. The Maximal Compatibility mode, however, allows you to reuse Low-Latency DASH and LL-HLS streams and uses 6-second segments with 200 ms chunks including B frames.
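To get a feel for the chunk sizes involved, here is a small Python calculation. The 25 fps frame rate is an assumption made for illustration only; the talk summary does not state one, and the function names are hypothetical.

```python
def chunk_duration_ms(frames_per_chunk: int, fps: float) -> float:
    """Duration of one chunk in milliseconds."""
    return 1000.0 * frames_per_chunk / fps


def chunks_per_segment(segment_s: float, chunk_ms: float) -> int:
    """Number of chunks that make up one segment."""
    return round(segment_s * 1000.0 / chunk_ms)


if __name__ == "__main__":
    # One-frame chunks at an assumed 25 fps last 40 ms each.
    print(chunk_duration_ms(frames_per_chunk=1, fps=25))
    # A 6-second segment split into 200 ms chunks holds 30 chunks.
    print(chunks_per_segment(segment_s=6, chunk_ms=200))
```

Under that assumption, a one-frame chunk is five times shorter than a 200 ms chunk, which is one way to see why the mode with one-frame chunks targets the lowest latency.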

THEOPlayer claim 7x less delivery delay, 20x lower zapping times and a 20% bandwidth saving over CMAF, with broad compatibility across many TVs, Android, iOS, web and streaming devices.

Pieter-Jan Speelmans
CTO & Founder,
Johan Vounckx
Vice President, Innovation,

Video: AV1 Commercial Readiness Panel

With two years of development and deployments under its belt, AV1 is still emerging onto the codec scene. That’s not to say that it’s not in use billions of times a year, but compared to the incumbents, there’s still some distance to go. Known as very slow to encode and computationally impractical, today’s panel is here to say that’s old news and AV1 is now a real-time codec.

Brought together by Jill Boyce from Intel, we hear from Amazon, Facebook, Google, Twitch, Netflix and Tencent in this panel. Intel and Netflix have been collaborating on the SVT-AV1 encoder and decoder framework for two years. The goal of SVT-AV1 was to be a high-performance and scalable encoder and decoder, using parallelisation to achieve this aim.

Yueshi Shen from Amazon and Twitch is first to present, explaining that for them, AV1 is a key technology in the 5G era. They have put together a 1440p, 120fps games demo which has been enabled by AV1. They feel that this resolution and framerate will be a critical feature for Twitch in the next two years as computer games increasingly extend beyond typical broadcast boundaries. Another key feature is achieving an end-to-end latency of 1.5 seconds which, he says, will partly be achieved using AV1. His company has been working with SoC vendors to accelerate the adoption of AV1 decoders as their proliferation is key to a successful transition to AV1 across the board. Simultaneously, AWS has been adding AV1 capability to MediaConvert and is planning to continue AV1 integration in other turnkey content solutions.

David Ronca from Facebook says that AV1 gives them the opportunity to reduce video egress bandwidth whilst also helping increase quality. For them, SVT-AV1 has brought using AV1 into the practical domain and they are able to run AV1 payloads in production as well as launch a large-scale decoder test across a large set of mobile devices.

Matt Frost represents Google Chrome and Android’s point of view on AV1. Early adopters, having been streaming partly using AV1 since 2018 at resolutions small and large, they have recently added support in Duo, their Android video-conferencing application. As with all such services, the pandemic has shown how important they can be and how important it is that they can scale. Their move to AV1 streaming has had favourable results, which is the start of the return on their investment in the technology.

Google’s involvement with the Alliance for Open Media (AOM), along with the other founding companies, was born out of a belief that in order to achieve the scales needed for video applications, the only sensible future was with cheap-to-deploy codecs, so it made a lot of sense to invest time in the royalty-free AV1.

Andrey Norkin from Netflix explains that they believe AV1 will bring a better experience to their members. Netflix has been using AV1 in streaming since February 2020 on Android devices using a software decoder. This has allowed them to get better quality at lower bitrates than VP9, and they are testing AV1 on other platforms. Intent on only using 10-bit encodes across all devices, Andrey explains that this mode gives the best efficiency. As well as being founding members of AOM, Netflix has also developed AVIF, which is an image format based on AV1. According to Andrey, they see better performance than most other formats out there. As AVIF works better with text on pictures than other formats, Netflix are intending to use it in their UI.

Tencent’s Shan Liu explains that they are part of AOM because video compression is key for most Tencent businesses in their vast empire. Tencent Cloud has already launched an AV1 transcoding service and supports AV1 in VoD.

The panel discusses low-latency use of AV1, with Dave Ronca explaining that, with the performance improvements of the encoder and decoders alongside the ability to tune the decode speed of AV1 by turning certain tools on and off, real-time AV1 is now possible. Amazon is paying attention to low-end, sub-$300 handsets, according to Yueshi, as they believe this will be where the most 5G growth will occur. They cite recent tests showing AV1 decoding using only 3.5 cores on a mobile SoC as encouraging, as it’s standard to have 8 or more. They have now moved on to researching battery life.

The panel finishes with a Q&A touching on encoding speed, the VVC and LCEVC codecs, the Sisvel AV1 patent pool, the next ramp-up in deployments and the roadmap for SVT-AV1.

Yueshi Shen
Principal Engineer
AWS & Twitch
David Ronca
Video Infrastructure Team,
Matt Frost
Product Manager, Chrome Media Technologies,
Andrey Norkin
Emerging Technologies Team
Dr Shan Liu
Chief Scientist & General Manager,
Tencent Media Lab
Jill Boyce

Video: Real-time AV1 in WebRTC

AV1 seems to be shaking off its reputation for slow encoding, now only 2x slower than HEVC. How practical, then, is it to use AV1 as a real-time codec aiming for sub-second latency? This is exactly what the Alliance for Open Media are working on, as parts of AV1 are perfectly suited to the use case.

Dr Alex from CoSMo Software took the podium at the Alliance for Open Media Research Symposium to lay out the whys and wherefores of updating WebRTC to deliver AV1. He started by outlining the different requirements of real-time vs VoD. With non-live content, encoding time is often unrestricted allowing for complex encoding methods to achieve lower bitrates. Even live CMAF streams aiming to achieve a relatively low 3-second latency have time enough for much more complex encoding than real-time. Encoding, ingest, storage and delivery can all be separated into different parts of the workflow for VoD, whereas real-time is forced to collapse logical blocks down as much as possible. Unsurprisingly, Dr Alex outlines latency as the most important driver in the WebRTC use case.

When streaming, ABR isn’t quite as simple as with chunked formats. The different bit rate streams need to be generated at the encoder to save any transcoding delays. There are two ways of delivering these streams. One is to deliver them as separate streams, the other is to deliver only one, layered stream. The latter method is known as Scalable Video Coding (SVC), which sends a base layer containing a low-resolution version of the video which can be decoded on its own. Also within that stream is the information which builds on top of that video to create a higher-resolution version of the same stream. You can have multiple layers and hence provide information for 3, 4 or more streams.
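The layering idea can be sketched in a few lines of Python. This is a simplified model rather than the AV1 bitstream itself: the `Layer` type, the example resolutions and bitrates, and the rule that each layer depends on every layer below it are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Layer:
    layer_id: int  # 0 is the base layer
    height: int    # picture height reached once this layer is decoded
    kbps: int      # extra bitrate this layer adds (illustrative values)


def highest_decodable(layers: List[Layer], received: Set[int]) -> Optional[Layer]:
    """Return the best layer that can be decoded from the received layer ids.

    Assumption: each layer builds on all layers below it, so the first
    missing layer makes every layer above it undecodable.
    """
    best = None
    for layer in sorted(layers, key=lambda item: item.layer_id):
        if layer.layer_id not in received:
            break
        best = layer
    return best


if __name__ == "__main__":
    ladder = [Layer(0, 180, 150), Layer(1, 360, 350), Layer(2, 720, 1000)]
    print(highest_decodable(ladder, {0, 1, 2}))  # all layers present
    print(highest_decodable(ladder, {0, 2}))     # layer 1 missing
    print(highest_decodable(ladder, {1, 2}))     # base layer missing
```

The base layer is always usable on its own, while an enhancement layer is only useful if everything beneath it has also arrived.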

Managing which streams get to the decoder is done through an SFU (Selective Forwarding Unit), which is a server to which WebRTC clients connect to receive just the stream, or parts of a stream, they need for their current bandwidth capability. It’s important to remember that, unlike video conferencing solutions based on WebRTC, streaming using WebRTC scales linearly. Whilst it’s difficult to hold a meeting with 50 people in a room, it’s possible to optimise what video is sent to everyone by only showing the last 5 speakers in full resolution and the others as thumbnails. Such optimisations are not available for video distribution; rather, SFUs and media servers need to be scaled and cascaded. This should be simple, but testing can be difficult, yet it’s necessary to ensure quality and network resilience at scale.
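As a rough illustration of the forwarding decision, the Python sketch below picks which layers of a layered stream to pass on to a client. It is a toy model, not how any particular SFU works: the function name, the per-layer bitrates and the simple greedy rule are assumptions.

```python
from typing import List


def layers_to_forward(layer_kbps: List[int], client_kbps: int) -> List[int]:
    """Choose the layer ids to forward to one client.

    layer_kbps[i] is the extra bitrate of layer i, with layer 0 as the
    base layer. Layers are added from the base upwards until the next
    one would exceed the client's available bandwidth.
    """
    forwarded: List[int] = []
    total = 0
    for layer_id, kbps in enumerate(layer_kbps):
        if total + kbps > client_kbps:
            break
        total += kbps
        forwarded.append(layer_id)
    return forwarded


if __name__ == "__main__":
    ladder = [150, 350, 1000]  # illustrative bitrates in kbps
    for bandwidth in (100, 600, 2000):
        print(bandwidth, layers_to_forward(ladder, bandwidth))
```

Because the decision is made per client, each viewer can receive a different subset of the same encoded stream without the server transcoding anything.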

Cisco have already demonstrated the first real-time AV1-based WebRTC system, though without SVC support. Work is ongoing to deliver improvements to RTP encapsulation of AV1 in WebRTC. One example is providing Decoding Target Information, which carries information about frames that can be read without needing to decode the video itself. This information explains how important each frame is and how it relates to the rest of the video. Such metadata can be used by the SFU or the decoder to understand which frames to drop and which to send or decode.
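A heavily simplified Python sketch shows how per-frame metadata of this kind could drive a drop-or-forward decision. The `FrameInfo` structure and its `needed_by` field are hypothetical and do not reproduce the actual RTP payload format; they only illustrate filtering frames without touching the encoded video.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class FrameInfo:
    frame_id: int
    needed_by: FrozenSet[int]  # decode targets that require this frame


def frames_for_target(frames: List[FrameInfo], target: int) -> List[int]:
    """Return the ids of frames to forward for one decode target.

    Only the metadata is inspected; the video payload is never decoded.
    """
    return [frame.frame_id for frame in frames if target in frame.needed_by]


if __name__ == "__main__":
    stream = [
        FrameInfo(1, frozenset({0, 1})),  # needed by both targets
        FrameInfo(2, frozenset({1})),     # only the higher target needs it
        FrameInfo(3, frozenset({0, 1})),
    ]
    print(frames_for_target(stream, target=0))
    print(frames_for_target(stream, target=1))
```

In this toy model, target 0 stands for a lower-quality rendition and target 1 for the full one, so a constrained client is simply sent fewer frames.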

Dr Alex Gouaillard
Video Codec Working Group – Real-time subgroup, Alliance for Open Media
Founder, Director & CEO, CoSMo Software Consulting Pte. Ltd.
Co-founder & CTO, Millicast

Video: Layer 4 in the CDN

Caching is a critical element of the streaming video delivery infrastructure, but with the proliferation of streaming services, managing caching is complex and problematic. Open Caching is an initiative by the Streaming Video Alliance to bring this under control allowing ISPs and service providers a standard way to operate.

By caching objects as close to the viewer as possible, you can reduce round-trip times which helps reduce latency and can improve playback but, more importantly, moving the point at which content is distributed closer to the customer allows you to reduce your bandwidth costs, and create a more efficient delivery chain.
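The bandwidth argument can be illustrated with a toy cache simulation in Python. It is not part of the Open Caching specifications: the least-recently-used eviction policy, the function name and the request pattern are assumptions chosen to keep the example small. Every miss stands for a fetch from further upstream, and every hit for a request served close to the viewer.

```python
from collections import OrderedDict
from typing import Iterable, Tuple


def simulate_edge_cache(requests: Iterable[str], capacity: int) -> Tuple[int, int]:
    """Replay object requests against an LRU cache and count (hits, misses)."""
    cache: "OrderedDict[str, None]" = OrderedDict()
    hits = 0
    misses = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            misses += 1
            cache[key] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits, misses


if __name__ == "__main__":
    pattern = ["a", "b", "a", "c", "a", "b"]
    print(simulate_edge_cache(pattern, capacity=2))
    print(simulate_edge_cache(pattern, capacity=0))  # no cache at all
```

With no cache, all six requests travel upstream; with room for two objects, two of them are answered locally.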

This video sees Disney Streaming Services, Viasat and StackPath discussing Open Caching with Jason Thibeault, Executive Director of the Streaming Video Alliance. Eric Klein from Disney explains that one driver for Open Caching comes from content producers, who find it hard to scale and to deliver content in a consistent manner across many different networks. Standardising the interfaces will help remove this barrier to scale. Alongside the drive from content producers are the needs of the network operators, who are interested in moving caching onto their networks, which reduces the back-and-forth traffic and can help cope with peaks.

Dan Newman from Viasat builds on these points, looking at the edge storage project. This is a project to move caching to the edge of the networks, which is an extension of the original open caching concept. The idea stretches to putting caching directly into the home. One use of this, he explains, is to cache UHD content which would otherwise be too big to be downloaded over lower-bandwidth links.

Josh Chesarek from StackPath says that their interest in being involved in the Open Caching initiative is to get consistency and interoperability between CDNs. The Open Caching group is looking at creating these standard APIs for capacity, configuration etc. Also, Eric underlines the interest in interoperability by the close work they are doing with the IETF to find better standards on which to base their work.

Looking at the test results, the average bitrate increases by 10% when using open caching, and there is also a 20-40% improvement in connection use rebuffer ratio, which shows viewers are seeing an improved experience. Viasat have used multicast ABR plus open caching. This shows there’s certainly promise behind the work that’s ongoing. The panel finishes by looking towards what’s next in terms of the project and CDN optimisation.

Eric Klein
Director, CDN Technology,
Dan Newman
Product Manager,
Josh Chesarek
VP, Sales Engineering & Support
Jason Thibeault
Executive Director, Streaming Video Alliance