Video: What is HESP Ultra-Low-Latency Streaming?

Is it possible to improve on CMAF’s offer of an ultra-low-latency, scalable protocol with good viewer experience? This is what HESP, the High-Efficiency Streaming Protocol, promises. With almost instant channel-change times and sub-second latency, it’s worth taking a look at this protocol created by THEOPlayer to understand where it might work in your workflows.

Presented by Pieter-Jan Speelmans and Johan Vounckx from THEO, the talk gives us more detail surrounding HESP’s inception. Quality, latency and bitrate are often referred to as a triangle: if you improve one or even two, the remaining factor gets worse to compensate. HESP plays in the triangle connecting ‘viewer experience’, ‘low latency’ and ‘scalability’. If you compare WebRTC with CMAF, you see that WebRTC prioritises low-latency streaming but suffers in terms of scalability. CMAF, carrying 2-5 seconds more latency, has much better scalability, but its channel-zapping times are high, which affects viewer experience as well as overall latency. HESP, contends Pieter-Jan, actually improves all three. It’s able to do this because it’s not extending existing protocols which weren’t designed to meet all these requirements; rather, it’s bringing in new techniques which shift the whole equation.

THEOPlayer has created the HESP Alliance, which is devoted to standardising the HESP technology through the IETF or other avenues, promoting adoption through marketing and the creation of tools, certification, and management of intellectual property. The talk outlines the decoder royalties, which can be payable per subscriber, per subscriber per hour, or per device.

Source: THEOPlayer

Looking at the technical details, we find out that you can actually start playing an HESP stream without downloading the manifest. While HESP does have manifest files, they change very infrequently. If the manifest does change at short notice, the server can ask players to download a new one by embedding a message in the stream. The channel-zapping speed is achieved using two streams: an initialisation stream and a continuation stream. The initialisation stream contains just I and P frames, allowing you to start playing immediately. The continuation stream is the low-bitrate stream used once playback has been established.
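
To make that mechanism concrete, here is a hand-wavy sketch of the two-stream startup, not the real HESP wire format or API: the URLs, stream layout and decoder stub are all invented for illustration.

```python
# Sketch only: fetch one immediately decodable frame from the initialisation
# stream to light the screen up at once, then play on from the continuation
# stream. Endpoint names and the decoder are placeholders, not HESP-defined.
import urllib.request

def decode_and_display(data: bytes) -> None:
    """Stand-in for a real decoder/renderer."""
    print(f"displaying {len(data)} bytes")

def zap_to_channel(base_url: str) -> None:
    # Initialisation stream: any fetch returns a frame that can be decoded
    # at once, so playback starts without waiting for a segment boundary.
    first = urllib.request.urlopen(f"{base_url}/init").read()
    decode_and_display(first)

    # Continuation stream: the ordinary stream picks up from the same point
    # and carries playback from here on.
    with urllib.request.urlopen(f"{base_url}/continuation") as stream:
        for chunk in iter(lambda: stream.read(1400), b""):
            decode_and_display(chunk)
```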

HESP uses two modes: Maximal Gain and Maximal Compatibility. Maximal Gain aims for the lowest latency, bandwidth and zapping times; it uses long segments with single-frame chunks, each containing one I or P frame. Maximal Compatibility mode, however, allows you to reuse Low-Latency DASH and LL-HLS streams, using 6-second segments with 200ms chunks which include B frames.

THEOPlayer claims 7x lower delivery delay, 20x lower zapping times and a 20% bandwidth saving over CMAF, with broad compatibility across many TVs, Android, iOS, the web and streaming devices.

Watch now!
Speakers

Pieter-Jan Speelmans
CTO & Founder,
THEOPlayer
Johan Vounckx
Vice President, Innovation,
THEOPlayer

Video: Measuring Video Quality with VMAF – Why You Should Care

VMAF, from Netflix, has become a popular tool for evaluating video quality since its launch as an Open Source project in 2017. Coming out of research from the University of Southern California and The University of Texas at Austin, it’s seen as one of the leading ways to automate video assessment.

Netflix’s Christos Bampis gives us a brief overview of VMAF’s origins and its aims. VMAF came about because other metrics such as MS-SSIM and, in particular, PSNR aren’t close enough indicators of quality. Indeed, Christos shows that for animated content (e.g. anime and cartoons) subjective scores can be very high while the PSNR score can be the same as that of a live-action clip which humans rate a lot lower. Moreover, in less extreme examples, Christos explains, PSNR is often 5% or so away from the actual subjective score in either direction.

To a simple approximation, VMAF is a method of extracting the spatial and temporal information from a video frame in a way which emphasises the types of things humans are attuned to, such as contrast masking. Christos shows an example of a picture where artefacts in the trees are much harder to see than similar artefacts on a colour gradient such as a sky or still water. These extraction methods take account of situations like this, and the results are fed into a trained model which matches them to the scores humans would have given. The idea is that, trained on many examples, the model can correctly predict a human’s score given a set of data extracted from a picture. Christos shows examples of how well VMAF outperforms PSNR in gauging video quality.
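
If you want to try this on your own content, a minimal sketch follows, assuming an FFmpeg build that includes the libvmaf filter; the file names are placeholders.

```python
# Score a distorted encode against its reference using FFmpeg's libvmaf
# filter. FFmpeg prints the pooled VMAF score to its log and writes a
# per-frame breakdown to vmaf.json.
import subprocess

def vmaf_score(distorted: str, reference: str) -> None:
    subprocess.run(
        [
            "ffmpeg",
            "-i", distorted,    # first input: the encode under test
            "-i", reference,    # second input: the pristine reference
                                # (inputs must match in resolution; scale
                                # first if they don't)
            "-lavfi", "libvmaf=log_path=vmaf.json:log_fmt=json",
            "-f", "null", "-",  # decode and compare only; no output file
        ],
        check=True,
    )

vmaf_score("encode.mp4", "reference.mp4")
```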


Challenges are in focus in the second half of the talk. What are the things which still need working on to improve VMAF? Christos zooms in on two: design dimensionality and noise. By design dimensionality, he means how can VMAF be extended to be more general, delivering a number which has a consistent meaning in different scenarios? As the VMAF model has been trained on AVC, how can we deal with different artefacts which are seen with different codecs? Do we need a new model for HDR content instead of SDR and how should viewing conditions, whether ambient light or resolution and size of the display device, be brought into the metric? The second challenge Christos highlights is noise as he reveals VMAF tends to give lower scores than it should to noisy sources. Codecs like AV1 have film-grain synthesis tools and these need to be evaluated, so behaving correctly in the presence of video noise is important.

The talk finishes with Christos outlining that VMAF’s applicability to the industry is only increasing with new codecs coming out such as LCEVC, VVC, AV1 and more – such diversity in the codec ecosystem wasn’t an obvious prediction in 2014 when the initial research work was started. Christos underlines the fact that VMAF is a continually evolving metric which is Open Source and open to contributions. The Q&A covers failure cases, super-resolution and how to interpret close-call results which are only 1% different.

Watch now!
Download the presentation
Speaker

Christos Bampis
Senior Software Engineer,
Netflix

Video: Case Study: Dropbox HQ ST 2110

Dropbox is embedded in many production workflows – official and otherwise – so it’s a beautiful symmetry that they’re using broadcast’s latest technology, SMPTE ST 2110, within their own headquarters. Dropbox has AV throughout the building and a desire to enable professional video production from anywhere. This desire was a driving factor in building an IP-based production facility, allowing mobile production platforms to move from room to room with only a single cable connecting to the wall and into the production infrastructure.

David Carroll’s integration company delivered this project, and he joins Wes Simpson along with colleague Kevin Gross to discuss the case study. David explains that they delivered fibre to seventy locations throughout the building, making most places potential production locations.

Dropbox being an IT company at heart, the ST 2110 network was built in the traditional way, but with connections into the corporate network which many broadcasters wouldn’t allow. ST 2110 works best with two separate networks, often called Red and Blue, both delivering the same video. This uses ST 2022-7 to fail over seamlessly if one network loses a packet, or even if it stops working altogether. This is the technique used at Dropbox, although there the two networks are connected together, so they are not one hundred per cent isolated. This link, however, has the benefit of allowing PTP traffic between the two networks.
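
The heart of ST 2022-7 is easy to picture: the receiver takes the same RTP flow from both networks and keeps whichever copy of each packet arrives first. Below is a toy sketch of that deduplication; real equipment also aligns timing and handles sequence-number wrap-around, which this deliberately ignores.

```python
# Toy ST 2022-7-style hitless merge: pass each RTP sequence number through
# once, whichever path delivers it first. The unbounded 'seen' set is fine
# for a demo but not for a long-running receiver.
from typing import Iterator, Tuple

def hitless_merge(packets: Iterator[Tuple[str, int, bytes]]) -> Iterator[bytes]:
    """packets yields (path, rtp_sequence_number, payload) in arrival order."""
    seen = set()
    for path, seq, payload in packets:
        if seq in seen:
            continue           # duplicate already delivered by the other path
        seen.add(seq)
        yield payload          # first copy wins, regardless of path

# Red delivers 1 and 3; Blue fills the gap at 2 and duplicates the rest.
arrivals = [("red", 1, b"a"), ("blue", 1, b"a"),
            ("blue", 2, b"b"), ("red", 3, b"c"), ("blue", 3, b"c")]
print(list(hitless_merge(iter(arrivals))))   # [b'a', b'b', b'c']
```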

PTP topology typically sees two grandmasters in a facility, and it makes sense to connect one to the Red network and the other to the Blue. In order to have proper redundancy, though, there should really be a path from both grandmasters to both networks. This is usually done with a specially-configured ‘PTP only’ link between the two; in this case, there are other reasons for a wider link between the networks, which also serves as the PTP link. Another element of PTP topology is acknowledging the need for two PTP domains. A PTP domain allows two PTP systems to operate on the same network without interfering with one another. Dante requires PTP version 1 whereas ST 2110, and most other things, require v2. Although this situation is improving, the typical way to solve it today is to run the two separately and block v1 from areas of the network where it’s not needed.
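
As a concrete illustration of what separates that traffic, here is a small sketch of the two header fields involved. The byte offsets are those of the IEEE 1588-2008 (PTPv2) common header; the sample message is invented.

```python
# In a PTPv2 common header, the low nibble of byte 1 carries the protocol
# version and byte 4 carries the domain number, which is what lets two PTP
# systems share a wire without their clocks interfering.
def inspect_ptp(header: bytes) -> tuple[int, int]:
    version = header[1] & 0x0F   # 1 = PTPv1 (Dante), 2 = PTPv2 (ST 2110)
    domain = header[4]           # PTPv2 domain number (0 by default)
    return version, domain

# A v2 Sync message (messageLength 44) on domain 127 starts like this:
hdr = bytes([0x00, 0x02, 0x00, 0x2C, 0x7F, 0x00])
print(inspect_ptp(hdr))          # (2, 127)
```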

PTP disruptions can also happen with multicast packet loss: if packets are lost at the wrong time, a grandmaster election can be triggered. Finally, on PTP, they also saw the benefits of using boundary-clock switches to isolate the grandmasters. The grandmasters have to send out the time eight times a second, and each end device then replies to ascertain the propagation delay. Dealing with every single device can overwhelm grandmasters, so boundary-clock switches can be very helpful. On a four-core Arista, David and Kevin found that a whole core would be occupied dealing with the PTP requests.
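
Some back-of-envelope arithmetic shows why that load adds up; the device count here is an assumption for illustration.

```python
# With Sync sent eight times a second and every end device answering with
# delay requests at a similar rate, the grandmaster's delay-message load
# grows linearly with device count. A boundary-clock switch absorbs the
# requests of the devices behind it, shrinking 'devices' as seen by the GM.
SYNC_RATE = 8          # Sync messages per second
devices = 500          # end devices timing directly off the grandmaster

delay_requests = devices * SYNC_RATE      # Delay_Req arriving per second
delay_responses = delay_requests          # each needs a Delay_Resp back
print(delay_requests + delay_responses)   # 8000 delay msgs/s at the GM
```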

A more extensive write-up of the project from David Carroll can be found here.

Watch now!

Speakers

Kevin Gross
Media Network Consultant
AVA Networks
David Carroll
President,
David Carroll Associates, Inc.
Wes Simpson
Owner, LearnIPVideo.com

Video: AES67 over WAN

Deeply embedded in the audio industry and adopted into SMPTE ST 2110, AES67 workflows surround us. Increasingly, our workflows span multiple locations, so moving AES67 over the WAN and the internet is essential. If networks were always perfect this would be easy, but as that’s not the case, this RAVENNA talk examines what the problems are and how to solve them.

Andreas Hildebrand introduces the video with an examination of how the WAN, whether that’s a company’s managed wide area network or the internet at large, differs from a LAN. Typical issues are packet loss, varying latency meaning packets arrive with jitter, and the lack of PTP and multicast. With this in mind, Nicolas Sturmel from Merging Technologies takes the reins to examine the solutions.

Nicolas explains that typically EBU Tech 3326 (also known as ACIP) is used for WAN contribution; it specifies how a sender and receiver communicate and the codecs to be used. Although PCM is available, many codecs such as aptX are also prescribed for use. Nicolas says that ACIP is great for most applications, but if you need low latency, precise timing and PCM quality, staying with AES67 may be the best policy, even over the WAN.

Having identified your AES67-over-WAN workflow, the question is how to pull it off. Nicolas looks at three methods. The first is FEC, whereby you are constantly sending redundant data: up to around 25% extra, so that if any packets are lost, the extra information can be leveraged to determine the lost values and reconstruct the stream. This can work well, but it requires sending the extra data constantly, pushing up your bandwidth, and it can only deal with losses of short duration.
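
To see why the overhead buys recovery, here is a toy row-parity FEC in Python. Real schemes such as SMPTE ST 2022-5 add headers and column parity, so treat this only as the core idea.

```python
# For every group of four packets, send one XOR parity packet (25% overhead).
# That is enough to rebuild any single missing packet in the group: XOR the
# three survivors with the parity and the lost packet falls out.
from functools import reduce

def parity(packets: list[bytes]) -> bytes:
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

group = [b"pkt1", b"pkt2", b"pkt3", b"pkt4"]
fec = parity(group)                 # sent alongside the media packets

# Packet 3 is lost in transit; the survivors plus the parity restore it.
received = [group[0], group[1], group[3]]
print(parity(received + [fec]))     # b'pkt3'
```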

Instead of FEC, you can use RIST, SRT or a similar retransmission technology. These actively recover any lost packets and have the benefit that you only transmit more data when you have actually lost data. Lastly, he mentions SMPTE ST 2022-7, which sends two paths of identical data to cover losses in either of them. Although this means 100% extra data, the benefit is that it can deal with any type of loss, including a complete path failure, which neither of the others can do. It is, however, possible to combine FEC or RIST with a 2022-7 workflow so that you have two levels of protection.
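
The retransmission approach can be pictured in a few lines: the receiver watches sequence numbers and asks only for what went missing. This is a bare sketch of the gap detection, not the RIST or SRT protocol itself, which adds timers, RTT estimates and wrap-around handling.

```python
# Detect gaps in the arriving sequence numbers and NACK only the missing
# ones, so extra bandwidth is spent only after a loss.
def on_packet(seq: int, state: dict) -> list[int]:
    """Return the sequence numbers to NACK given a newly arrived packet."""
    expected = state.get("next", seq)
    missing = list(range(expected, seq))   # everything that was skipped over
    state["next"] = seq + 1
    return missing

state: dict = {}
for seq in [10, 11, 14, 15]:
    for lost in on_packet(seq, state):
        print(f"NACK {lost}")              # -> NACK 12, NACK 13
```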

Timing over the WAN is not ideal as PTP loses accuracy over long-latency links and it assumes symmetry. On the internet, it’s possible to get links where the latency is longer in one direction than the other. An easy, though potentially costly, workaround for distributing PTP over the WAN is to use GPS, GLONASS or similar to synchronise grandmaster clocks at each location.
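
The symmetry assumption matters because PTP estimates the one-way delay as half the round trip. A worked example, with invented figures, shows how much error an asymmetric path introduces.

```python
# With 30 ms out and 10 ms back, PTP assumes a 20 ms one-way delay, so a
# perfectly disciplined clock still ends up 10 ms wrong: the error is half
# the difference between the two path delays.
t_forward = 0.030                  # master -> slave path delay (s)
t_reverse = 0.010                  # slave -> master path delay (s)

assumed_one_way = (t_forward + t_reverse) / 2   # what PTP computes: 20 ms
error = t_forward - assumed_one_way             # offset error: 10 ms
print(f"clock error = {error * 1000:.0f} ms")
```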

Watch now!
Speakers

Nicolas Sturmel
Product Manager & Senior Technologist
Merging Technologies
Andreas Hildebrand
RAVENNA Evangelist,
ALC NetworX