Video: Deep Neural Networks for Video Coding

We know AI is going to stick around. Whether it’s called AI, Machine Learning, Deep Learning or something else, it all adds up to the same thing: we’re breaking away from fixed algorithms where one equation ‘does it all’ to a much more nuanced approach with a better result. This is true across all industries. Within the broadcast industry, one way it can be used is in video and audio compression. Want to make an image smaller? Downsample it with a Convolutional Neural Network and it will look better than Lanczos. No surprise, then, that this is coming in full force to a compression technology near you.

In this talk from Comcast’s Dan Grois, we hear about the ongoing work to super-charge the recently released VVC by replacing functional blocks with neural-network-based technologies. VVC has already achieved 40-50% improvements over HEVC. From the work Dan’s involved with, we hear that further gains look promising using neural networks.

Dan explains that deep neural networks recognise images in layers. The brain does the same thing, with one area sensitive to lines and edges, another to objects, another to faces and so on. A Deep Neural Network works in a similar way.

During the development of VVC, Dan explains, neural network techniques were considered but deemed too memory- or computationally-intensive. Six years on from the inception of VVC, these techniques are now practical and are likely to result in a VVC version 2 with further compression improvements.

Dan enumerates the tests so far, swapping out each of the functional blocks in turn: intra- and inter-frame prediction, up- and down-scaling, in-loop filtering etc. He even shows what it would look like in the encoder. Some blocks show improvements of less than 5% but, added together, there are significant gains to be had. Whilst this update to VVC is still in the early stages, it seems clear that it will provide real benefits for those who can implement these improvements which, Dan highlights at the end, are likely to require more memory and computation than the current version of VVC. For some, this will be well worth the savings.

Watch now!
Speaker

Dan Grois
Principal Researcher,
Comcast

Video: SCTE 224? ESNI? What is Everyone Talking About?

Ad insertion with SCTE 35 is common enough in streaming today, but it can be a blunt tool when it comes to running a complex service which requires complex scheduling and switching plus more detailed control of advert playback and geographical deployment. SCTE 224 is here to meet the challenge by increasing the range of metadata that can be signalled.

Stuart Kurkowski from Comcast explains this need for SCTE 224 and what it delivers. For instance, a lot of SCTE 224 is devoted to controlling the US-style blackouts where viewers close to a sports game can’t watch the game live on TV. Whilst this is relatively easy to deal with in the US for local terrestrial transmitters, in OTT, this is a new ability. SCTE 224, however, isn’t just about blackouts. It also transmits accurate, multi-level schedule information which helps to schedule complex ad breaks, providing detailed, frame-accurate, local ad insertion.

It shouldn’t be thought that SCTE 35 and SCTE 224 are mutually exclusive. SCTE 35 can provide very accurate updates for unscheduled programmes and delays, while the 224 information still carries the rich metadata.

Find out more in this short primer!
Speakers

Stuart Kurkowski
Distinguished Engineer and Principal Architect,
Comcast Technology Solutions

Video: Demystifying Video Delivery Protocols

Let’s face it, there are a lot of streaming protocols out there, both for contribution and distribution. RTMP for internet ingest is being displaced by RIST and SRT, whilst low-latency contenders such as CMAF and LL-HLS are vying for position as they try to oust HLS and DASH in existing services streaming to the viewer.

This panel, hosted by Jason Thibeault from the Streaming Video Alliance, talks about all these protocols and attempts to put each in context, both in the broadcast chain and in terms of its features. Two of the main contribution technologies are RIST and SRT, both UDP-based protocols which recover lost packets by re-requesting them from the sender. This results in a very high resilience to packet loss, which is ideal for internet deployments.
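The re-request idea can be illustrated with a toy simulation. This is only a sketch of the general retransmission principle, not the SRT or RIST wire protocol: the function names, the in-memory "sender buffer" and the set of dropped packets are all hypothetical.

```python
def find_gaps(received_seqs):
    """Return sequence numbers missing between the lowest and highest seen."""
    seen = set(received_seqs)
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


def receive_with_retransmit(sender_buffer, dropped):
    """Simulate one round of loss recovery (hypothetical, in-memory only).

    sender_buffer: dict of sequence number -> payload the sender still holds.
    dropped: set of sequence numbers lost on the first transmission.
    """
    # First transmission: everything arrives except the dropped packets.
    received = {s: p for s, p in sender_buffer.items() if s not in dropped}

    # The receiver spots gaps in the sequence numbers and asks for them again.
    requested = find_gaps(received)

    # The sender answers each request from its buffer.
    for seq in requested:
        received[seq] = sender_buffer[seq]

    ordered_payloads = [received[s] for s in sorted(received)]
    return ordered_payloads, requested


packets = {i: f"pkt{i}" for i in range(6)}
payloads, requested = receive_with_retransmit(packets, dropped={2, 4})
print(requested)  # sequence numbers that had to be re-requested
print(payloads)
```

Note that a gap can only be spotted once a later packet has arrived, so this toy version would not notice the loss of the very last packet. Real protocols also have to bound how long they wait for a retransmission.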

First, we hear about SRT from Maxim Sharabayko. He lists some of the 350 members of the SRT Alliance, a group of companies who are delivering SRT in their products and collaborating to ensure interoperability. Maxim explains that SRT, which is based on the UDT protocol, is able to do live streaming for contribution as well as optimised file transfer. He also explains that it’s free for commercial use and can be found on GitHub. SRT has been featured a number of times on The Broadcast Knowledge. For a deeper dive into SRT, have a look at videos such as this one, or the ones under the SRT tag.

Next, Kieran Kunhya explains that RIST was a response to an industry request to have a vendor-neutral protocol for reliable delivery over the internet or other dedicated links. Not only does vendor-neutrality help remove reticence for users or vendors to adopt the technology, but interoperability is also a key benefit. Kieran calls out hitless switching across multiple ISPs and cellular bonding as important features of RIST. For a summary of all of RIST’s features, read this article. For videos with a deeper dive, have a look at the RIST tag here on The Broadcast Knowledge.

Demystifying Video Delivery Protocols from Streaming Video Alliance on Vimeo.

Barry Owen represents WebRTC in this webinar, though Wowza deal with many protocols in their products. WebRTC’s big advantage is sub-second delivery which is not possible with either CMAF or LL-HLS. Whilst it’s heavily used for video conferencing, for which it was invented, there are a number of companies in the streaming space using it for delivery to the user because of its almost instantaneous delivery speed. Whilst a perfect rendition of the video isn’t guaranteed, unlike with CMAF and LL-HLS, for auctions, gambling and interactive services, latency is always king. For contribution, Barry explains, the flexibility of being able to contribute from a browser can be enough to make this a compelling technology, although it does bring with it quality/profile/codec restrictions.

Josh Pressnell and Ali C. Begen talk about the protocols which are for delivery to the user. Josh explains how Smooth Streaming has exited, leaving the ground to DASH, CMAF and HLS. They discuss the lack of a true CENC (Common Encryption) mechanism, which leads to duplication of assets. Similarly, the discussion moves to the fact that many streaming services have to keep duplicate assets because of differences in target device support.

Looking ahead, the panel is buoyed by the promise of QUIC. There is concern that QUIC, the Google-invented protocol for HTTP delivery over UDP, is both under standardisation proceedings in the IETF and is also being modified by Google separately and at the same time. But the prospect of a UDP-style mode and the higher efficiency seems to instil hope across all the participants of the panel.

Watch now to hear all the details!
Speakers

Ali C. Begen
Technical Consultant, Comcast
Kieran Kunhya
Founder & CEO, Open Broadcast Systems
Director, RIST Forum
Barry Owen
VP, Solutions Engineering
Wowza Media Systems
Josh Pressnell
CTO,
Penthera Technologies
Maxim Sharabayko
Senior Software Developer,
Haivision
Moderator: Jason Thibeault
Executive Director,
Streaming Video Alliance

Video: Bandwidth Prediction for Multi-Bitrate Streaming at Low Latency

Low-latency protocols like CMAF are wreaking havoc with traditional ABR algorithms. We’re having to come up with new ways of assessing if we’re running out of bandwidth. Traditionally, this is done by looking at how long a video chunk takes to download and comparing that with its playback duration. If you’re downloading at the same speed it’s playing, it’s time to consider changing to a lower-bandwidth stream.
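The traditional check described above can be sketched in a few lines. This is a minimal illustration rather than any particular player’s ABR logic: the function names and the `safety_factor` margin are assumptions introduced for the example.

```python
def measured_throughput_bps(chunk_size_bytes, download_time_s):
    """Classic throughput estimate: bits delivered divided by time taken."""
    return chunk_size_bytes * 8 / download_time_s


def should_switch_down(chunk_duration_s, download_time_s, safety_factor=0.8):
    """Flag a down-switch when the download takes nearly as long as playback.

    safety_factor is an assumed margin: with 0.8, a 4-second chunk that takes
    3.2 seconds or more to download is treated as a warning sign.
    """
    return download_time_s >= chunk_duration_s * safety_factor


# A 4-second chunk of 1,000,000 bytes that downloads in 1 second is healthy.
print(measured_throughput_bps(1_000_000, 1.0))
print(should_switch_down(chunk_duration_s=4.0, download_time_s=1.0))
# The same chunk taking 3.9 seconds leaves almost no headroom.
print(should_switch_down(chunk_duration_s=4.0, download_time_s=3.9))
```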

As latencies have come down, servers will now start sending data from the beginning of a chunk as it’s being written, which means it can’t be downloaded any quicker than it is produced. To learn more about this, look at our article on ISO BMFF and this streaming primer. Since the file can’t be downloaded any quicker, we can’t ascertain if we should move up in bitrate to a better quality stream, so while we can switch down if we start running out of bandwidth, we can’t find a time to go up.
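A small model makes the problem concrete. It is a deliberate simplification, assuming the request is made just as the chunk starts being written, so data can never arrive faster than it is produced; the function name and parameters are hypothetical.

```python
def observed_download_time_s(chunk_size_bytes, chunk_duration_s,
                             bandwidth_bps, chunked_transfer):
    """Time the client sees between requesting a chunk and receiving all of it."""
    network_time_s = chunk_size_bytes * 8 / bandwidth_bps
    if chunked_transfer:
        # The server sends data as it is written, so the transfer lasts at
        # least as long as the chunk itself, however fast the link is.
        return max(network_time_s, chunk_duration_s)
    return network_time_s


# A 2-second chunk encoded at 2 Mbit/s is 500,000 bytes.
for bandwidth_bps in (10_000_000, 20_000_000):
    whole_file = observed_download_time_s(500_000, 2.0, bandwidth_bps, False)
    chunked = observed_download_time_s(500_000, 2.0, bandwidth_bps, True)
    print(bandwidth_bps, whole_file, chunked)
```

With whole-file delivery, doubling the bandwidth halves the download time. With chunked transfer, both links show the same 2-second download, so the apparent throughput just equals the encoding bitrate and gives no hint that a higher rendition would fit.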

Ali C. Begen and team have been working on a way around this. The problem is that with the newer protocols, you pre-request files which start getting sent when they are ready. As such, you don’t actually know the time the chunk starts downloading to you. Whilst you know when it’s finished, you don’t have access, via JavaScript, to when the file started being sent to you, robbing you of a way of determining the download time.

Ali’s algorithm uses the time the last chunk finished downloading in place of the missing timestamp, figuring that the new chunk is going to load pretty soon after the old. Now, looking at the data, we see that the gap between one chunk finishing and the next one starting does vary. This led Ali’s team to move to a sliding window moving average taking the last 3 download durations into consideration. This is assumed to be enough to smooth out some of those variances and provides the data to allow them to predict future bandwidth and make a decision to change bitrate or not. There have been a number of alternative suggestions over the last year or so, all of which perform worse than this technique, called ACTE.
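The two ideas in this paragraph, substituting the previous finish time for the unknown start time and averaging over the last three chunks, can be sketched as follows. This is an illustration built only from the description above, not the ACTE implementation: the class name, its methods and the choice to divide total bits by total time are assumptions.

```python
from collections import deque


class SlidingWindowEstimator:
    """Estimate bandwidth from chunk finish times alone (hypothetical sketch)."""

    def __init__(self, window=3):
        # Only the most recent `window` chunks are kept.
        self.durations_s = deque(maxlen=window)
        self.sizes_bytes = deque(maxlen=window)
        self.last_finish_s = None

    def on_chunk_finished(self, finish_time_s, size_bytes):
        if self.last_finish_s is not None:
            # The true start time is unknown, so assume the chunk started
            # when the previous one finished.
            self.durations_s.append(finish_time_s - self.last_finish_s)
            self.sizes_bytes.append(size_bytes)
        self.last_finish_s = finish_time_s

    def estimate_bps(self):
        total_time_s = sum(self.durations_s)
        if total_time_s <= 0:
            return None  # not enough data yet
        return sum(self.sizes_bytes) * 8 / total_time_s


estimator = SlidingWindowEstimator(window=3)
for finish_time_s in (0.0, 1.0, 2.0, 3.0, 5.0):
    estimator.on_chunk_finished(finish_time_s, size_bytes=125_000)
print(estimator.estimate_bps())
```

The first chunk only sets the reference time, since nothing precedes it. After that, one slow chunk pulls the estimate down without dominating it, which is the smoothing effect the window is there for.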

In the last section of this talk, Ali explores the entry he was part of in a Twitch-sponsored competition to keep playback latency close to a second in test conditions with varying bitrate. Playback speed is key to much work in low-latency streaming as it’s the best way to trim off a little bit of latency when things are going well and allows you to buy time if you’re waiting for data; the big challenge is doing it without the viewer noticing. The entry used both a heuristic and a machine-learning approach, which worked so well that they were runners-up in the contest.
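A simple heuristic shows how playback speed trades against latency. This is not the competition entry; the function name, the one-second target, the buffer threshold and the 5% speed adjustment are all assumed values chosen for illustration.

```python
def choose_playback_rate(latency_s, buffer_s, target_latency_s=1.0,
                         min_buffer_s=0.3, max_adjust=0.05):
    """Pick a playback rate from current latency and buffer level.

    All thresholds are assumptions; a small max_adjust keeps the speed
    change subtle enough that viewers are unlikely to notice it.
    """
    if buffer_s < min_buffer_s:
        # Buffer nearly empty: slow down slightly to buy time for more data.
        return 1.0 - max_adjust
    if latency_s > target_latency_s:
        # Healthy buffer but behind live: speed up slightly to catch up.
        return 1.0 + max_adjust
    return 1.0


print(choose_playback_rate(latency_s=1.5, buffer_s=1.0))
print(choose_playback_rate(latency_s=1.5, buffer_s=0.1))
print(choose_playback_rate(latency_s=0.9, buffer_s=1.0))
```

Protecting the buffer is checked first because a stall is far more visible to the viewer than a little extra latency.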

Watch now!
Speaker

Ali C. Begen
Technical Consultant, Comcast
Professor, Computer Science, Özyeğin University