Video: Futuristic Codecs and a Healthy Obsession with Video Startup Time

The next 12 months will see three new MPEG standards released. What does this mean for the industry? How useful will they be and when can we start using them? MPEG is coming to market with a range of commercial models to show it’s learning from the mistakes of the past, so it will be interesting to see the adoption levels in the year after their release. This is part of the second session of the Vienna Video Tech Meetup and also delves into startup time for streaming services.

In the first talk, Dr. Christian Feldmann explains the current codec landscape, highlighting the ubiquitous AVC (H.264), UHD’s friend HEVC (H.265), and the newer VP9 and AV1. The latter two differentiate themselves by being free to use and open, particularly AV1. Whilst slow, both are seeing increasing adoption in streaming, but no one’s suggesting that AVC isn’t still the go-to codec for most online streaming.

Christian then introduces the three new codecs, EVC (Essential Video Coding), LCEVC (Low-Complexity Enhancement Video Coding) and VVC (Versatile Video Coding), all of which have different aims. We start by looking at EVC, whose aim is to replicate the encoding efficiency of HEVC but, importantly, to produce a royalty-free baseline profile as well as a main profile which improves efficiency further but with royalties. This will be the first time that you’ve been able to use an MPEG codec in this way to eliminate your liability for royalty payments. There is further protection in that if any of the tools is found to have patent problems, it can be individually turned off, the idea being that companies can have more confidence in deploying the new technology.

The next codec in the spotlight is LCEVC, which uses an enhancement technique to encode video. The aim of this codec is to enable lower-end hardware to access higher resolutions and/or lower bitrates. This can be useful in set-top boxes and for online streaming, but also for non-broadcast applications like small embedded recorders. It can achieve a slight improvement in compression over HEVC, and it’s well known that HEVC is very computationally heavy.

LCEVC reduces computational needs by only encoding a lower resolution version (say, SD) of the video in a codec of your choice, whether that be AVC, HEVC or otherwise. The decoder will then decode this and upscale the video back to the original resolution, HD in this example. This would look soft, normally, but LCEVC also sends enhancement data to add back in the edges and detail that would have otherwise been lost. This can be done in CPU whilst the other decoding could be done by the dedicated AVC/HEVC hardware and naturally encoding/decoding a quarter-resolution image is much easier than the full resolution.
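The base-plus-enhancement idea described above can be illustrated with a toy sketch. This is not the LCEVC algorithm itself: the function names are made up for illustration, the base layer is left uncompressed rather than passed through AVC or HEVC, and the enhancement data is stored as raw pixel differences, whereas a real encoder would transform and quantise it.

```python
def downscale(frame):
    """Halve width and height by averaging each 2x2 block of pixels."""
    return [
        [
            (frame[y][x] + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
            for x in range(0, len(frame[0]), 2)
        ]
        for y in range(0, len(frame), 2)
    ]


def upscale(frame):
    """Double width and height by repeating each pixel (nearest neighbour)."""
    out = []
    for row in frame:
        wide = [pixel for pixel in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def enhancement_layer(original, upscaled):
    """Per-pixel difference between the original and the soft upscaled picture."""
    return [[o - u for o, u in zip(orow, urow)] for orow, urow in zip(original, upscaled)]


def reconstruct(upscaled, residuals):
    """Add the enhancement data back onto the upscaled base picture."""
    return [[u + r for u, r in zip(urow, rrow)] for urow, rrow in zip(upscaled, residuals)]


# A tiny 4x4 'full resolution' picture with some detail in the lower half.
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 90, 200, 200],
    [10, 10, 200, 50],
]

base = downscale(original)                     # quarter-resolution picture for the base codec
soft = upscale(base)                           # what the decoder gets without enhancement
residuals = enhancement_layer(original, soft)  # the detail the base layer lost
restored = reconstruct(soft, residuals)        # base plus enhancement
```

Here `base` holds a quarter of the pixels of `original`, which is where the saving in base-codec work comes from, and `restored` matches `original` only because the residuals in this sketch are kept losslessly.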

Lastly, VVC goes under the spotlight. This is the direct successor to HEVC and is also known as H.266. VVC naturally has the aim of improving compression over HEVC by the traditional 50% target but also has important optimisations for more types of content such as 360 degree video and screen content such as video games.

To finish this first Vienna Video Tech Meetup, Christoph Prager lays out the reasons he thinks that everyone involved in online streaming should obsess about Video Startup Time. He defines this as the time between pressing play and seeing the first frame of video. The assumption is that the longer the wait, the more users won’t bother watching. To understand what video streaming should be like, he examines the example of Spotify, who have always had the goal of bringing the audio start time down to 200ms. Christoph points to this podcast for more details on what Spotify has done to optimise this metric, which includes activating GUI elements before, strictly speaking, they can do anything because the audio still hasn’t loaded. This nevertheless gives an impression of immediacy, with perception being half the battle.

“for every additional second of startup delay, an additional 5.8% of your viewership leaves”

Christoph also draws on Akamai’s 2012 white paper which, among other things, investigated how startup time puts viewers off. He also cites research from Snap, who found that within 2 seconds, the entirety of the audience for that video would have gone. Snap, of course, specialise in very short videos but, taken with the right caveats, this could indicate that Akamai’s numbers, if the research were repeated today, may be higher for 2020. Christoph finishes up by looking at the individual components which add latency to the user experience: player startup time, DRM load time, ad load time and ad tag load time.
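As a rough illustration of how the quoted 5.8% figure and the listed components combine, the sketch below adds up a set of component timings and applies the figure linearly. The timings are invented for the example, the components are assumed to load one after another, and the linear model and the `grace_seconds` parameter are assumptions made here rather than something taken from the talk.

```python
ABANDON_PER_EXTRA_SECOND = 0.058  # the 5.8% per additional second quoted above


def startup_time(components):
    """Total startup time in seconds, assuming the components load sequentially."""
    return sum(components.values())


def estimated_abandonment(startup_seconds, grace_seconds=0.0):
    """Fraction of viewers lost, using a simple linear model capped at 100%.

    grace_seconds is a hypothetical allowance before viewers start leaving.
    """
    extra = max(0.0, startup_seconds - grace_seconds)
    return min(1.0, extra * ABANDON_PER_EXTRA_SECOND)


# Hypothetical timings in seconds for each component of the startup delay.
components = {
    "player_startup": 0.8,
    "drm_load": 0.6,
    "ad_tag_load": 0.4,
    "ad_load": 1.2,
}

total = startup_time(components)
lost = estimated_abandonment(total)
print(f"startup {total:.1f}s, estimated abandonment {lost:.1%}")
```

Measuring each component separately like this makes it easier to see which one is worth optimising first.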

Dr. Christian Feldmann
Team Lead Encoding,
Christoph Prager
Product Manager, Analytics
Markus Hafellner
Product Manager, Encoding

Video: Versatile Video Coding (VVC)

MPEG’s VVC is the next iteration along from HEVC (H.265). Whilst there are other codecs being finalised such as EVC and LCEVC, this talk looks at how VVC builds on HEVC, but also turns its hand to screen content and VR, becoming a more versatile codec than HEVC and meeting the world’s changing needs. For an overview of these emerging codecs, this interview covers them all.

VVC is a joint project between ITU-T and MPEG (AKA ISO/IEC). Its aim is to achieve a 50% reduction in bitrate for the same picture quality, with the emphasis on higher resolutions, HDR and 10-bit video. At the same time, it acknowledges that optimising codecs for natural video is no longer the core requirement for a lot of people. Its versatility comes from being able to handle screen content, independent sub-picture encoding and scalable encoding, among others.
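To put the 50% target into numbers, the short sketch below applies the same saving once per codec generation. The starting bitrate is a made-up figure, and the 50% is the stated target rather than a measured result.

```python
def bitrate_after_generations(start_mbps, saving_per_generation, generations):
    """Apply the same fractional bitrate saving once per codec generation."""
    bitrate = start_mbps
    for _ in range(generations):
        bitrate *= 1 - saving_per_generation
    return bitrate


avc_mbps = 16.0  # hypothetical bitrate for a stream encoded with AVC
hevc_mbps = bitrate_after_generations(avc_mbps, 0.5, 1)  # one generation on
vvc_mbps = bitrate_after_generations(avc_mbps, 0.5, 2)   # two generations on

print(f"AVC {avc_mbps} Mbps, HEVC target {hevc_mbps} Mbps, VVC target {vvc_mbps} Mbps")
```

If each generation hits its target, the same picture quality needs a quarter of the AVC bitrate by the time VVC arrives.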

Gary Sullivan from Microsoft Technology & Research talks us through what all this means. He starts by outlining the case for a new codec, particularly the reach for another 50% bitrate saving, which may come at further computational cost. Gary points out that, as video use continues to increase, anything that can be done to significantly reduce bitrates will either drive down costs or allow people to use video in better ways.

Any codec is a set of tools all working together to create the final product. Some tools are not always needed, say if you are running on a lower-power system, allowing the codec to be tuned for the situation. Gary puts up a list of some of the tools in VVC, many of which are an evolution of the same tool in HEVC, and highlights a few to give an insight into the improvements under the hood.

Gary’s pick of the big hitters in the tool-set are: the Adaptive Loop Filter, which reduces artefacts and prediction errors; affine motion compensation, which can model motion such as rotation and zoom rather than only translation; triangle partitioning mode, which is a high-computation improvement in inter prediction; bi-directional optical flow (BIO) for motion prediction; and intra-block copy, which is useful for screen content where an identical block is found elsewhere in the same frame.
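The intra-block copy idea can be sketched in a few lines: rather than coding a block's pixels again, the encoder points back at an identical block earlier in the same frame. This is an illustration only. The function names are invented, the search is restricted to block-aligned positions and exact matches, and a real VVC encoder searches a constrained region and codes a residual.

```python
def get_block(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def find_block_vector(frame, top, left, size):
    """Look for an identical block earlier in raster order in the same frame.

    Returns the (dy, dx) displacement to the match, or None if there is none.
    Only block-aligned candidates are checked to keep the sketch short.
    """
    target = get_block(frame, top, left, size)
    for y in range(0, top + 1, size):
        for x in range(0, len(frame[0]) - size + 1, size):
            if y == top and x >= left:
                break  # reached the current block; later blocks are not yet coded
            if get_block(frame, y, x, size) == target:
                return (y - top, x - left)
    return None


# A 4x4 frame where the bottom-right 2x2 block repeats the top-left one,
# as might happen with repeated text or icons in screen content.
frame = [
    [1, 2, 5, 5],
    [3, 4, 5, 5],
    [7, 7, 1, 2],
    [7, 7, 3, 4],
]

vector = find_block_vector(frame, 2, 2, 2)
print(vector)
```

A block with a match can be signalled with just the displacement, which is why the tool pays off on screen content with many exact repeats.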

Gary highlights SCC, Screen Content Coding, which was in HEVC but not in the base profile. This has changed for VVC, so all VVC implementations will have SCC whereas very few HEVC implementations do. Reference Picture Resampling (RPR) allows the resolution to change from picture to picture, so reference pictures can be stored at a different resolution from the current picture. Independent sub-pictures allow parts of the video frame to be re-arranged, or for only one region to be decoded. This works well for VR and video conferencing, and allows the creation of composite videos without intermediate decoding.

As usual, doing more thinking about how to compress a picture brings further computational demands. MPEG’s LCEVC is the standards body’s way of fighting against this, as notable bitrate improvements are possible even for low-power devices. With VVC, however, versatility is the aim. Decoders see a 60% increase in decode complexity. MPEG specifications are all about the decoder, which allows a lot of ongoing innovation in encoding techniques, but current encoder examples are about 8 or 9 times slower. Performance is better for screen content and at higher resolutions. Whilst the coding part of VVC is mature, the versatility features are still being worked on, but the aim is to publish within about 2 months.

The video finishes with a Q&A that covers implementing VVC in DASH as part of a low-latency video workflow and how CMAF will be specified to use VVC. It also covers live workflows, which Gary explains always come after the initial file-based work and are best understood after the first attempts at encoder implementations, noting that hardware lags by 2 years. He goes on to explain that chipmakers need to see the demand. At the moment, there is a lot of focus from implementors on AV1, not to mention EVC, so the question is how much demand can be generated.

This talk is based on a talk from Benjamin Bross originally given to an ITU workshop (PDF), then presented at Mile High Video by Benjamin and updated by Gary for this conversation with the Seattle Video Tech community.

Bitmovin has an article highlighting many of the improvements in VVC written by Christian Feldmann who has given many talks on both AV1 and VVC.

Gary Sullivan
Microsoft Technology & Research

Video: Codecs, standards and UHD formats – where is the industry headed?

UHD transmissions have been available for many years now and form a growing, albeit slow-growing, percentage of channels available. The fact that major players such as Sky and BT Sport in the UK, and NBCUniversal and the ailing DirecTV in the US, see fit to broadcast sports in UHD shows that the technology is trusted and mature. But given the prevalence of 4K in films from Netflix and Apple TV+, streaming is actually the largest delivery mechanism for 4K/UHD video into the home.

Following on from last week’s DVB webinar, now available on demand, this webinar from the DVB Project replaces what would have been part of the DVB World 2020 conference and looks at the work that’s gone into getting UHD to where it is now in terms of developing HEVC (also known as H.265), integrating it into broadcast standards and getting manufacturer support. It then finishes by looking at the successor to HEVC, VVC (Versatile Video Coding).

The host, Ben Schwarz from the Ultra HD Forum, first introduces Ralf Schaefer, who explores the work that was done in order to make UHD for distribution a reality. He’ll do this by looking at the specifications and standards that were created in order to get us where we are today before looking ahead to see what may come next.

Yvonne Thomas from the UK’s Digital TV Group is next and will follow on from Ben by looking at codecs for video and audio. HEVC is seen as the go-to codec for UHD distribution. As the uncompressed bitrate for UHD is often 12Gbps, HEVC’s higher compression ratio compared to AVC and relatively wide adoption make it a good choice for wide dissemination of a signal. But UHD is more than just video. With UHD and 4K services usually carrying sports or films, ‘next generation audio’ is really important. Yvonne looks at the video and audio aspects of delivering HEVC and the devices that need to receive it.
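For a sense of where a figure of around 12Gbps comes from, the arithmetic below works out the active-picture bitrate of one common UHD format. The format (3840x2160 at 60 frames per second, 10-bit, 4:2:2) and the 20 Mbps distribution bitrate are assumptions chosen for the example, not figures from the webinar. The result counts picture samples only, whereas an interface rate such as 12G-SDI also carries blanking and ancillary data.

```python
def active_video_bitrate(width, height, fps, bit_depth, samples_per_pixel):
    """Uncompressed bitrate of the active picture in bits per second."""
    return width * height * fps * bit_depth * samples_per_pixel


# 4:2:2 sampling carries one luma sample per pixel plus two chroma samples
# shared between every two pixels, so two samples per pixel on average.
uhd_bps = active_video_bitrate(3840, 2160, 60, 10, 2)

distribution_bps = 20_000_000  # hypothetical HEVC distribution bitrate
compression_ratio = uhd_bps / distribution_bps

print(f"uncompressed {uhd_bps / 1e9:.2f} Gbps, compression ratio about {compression_ratio:.0f}:1")
```

Under these assumptions the picture alone comes to just under 10 Gbps, so a distribution codec has to achieve a ratio of several hundred to one.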

Finally we look at VVC, also known as H.266, the successor to HEVC. ATEME’s Sassan Pejhan gives us a look into why VVC was created, where it currently is within MPEG standardisation and what it aims to achieve in terms of compression. VVC has been covered previously on The Broadcast Knowledge in dedicated talks such as ‘VVC, EVC, LCEVC, WTF?’, ‘VVC Standard on the Final Stretch’, and ‘AV1/VVC Update’.

Ben Schwarz
Communication Working Group Chair,
Ultra HD Forum
Ralf Schaefer
VP Standards R&I
InterDigital Inc.
Yvonne Thomas
Strategic Technologist
DTG (Digital TV Group)
Sassan Pejhan
VP Technology,

Video: SMPTE Technical Primers

The Broadcast Knowledge exists to help individuals up-skill, whatever their starting point. Videos like this, which give an introduction to a large number of topics, are far too rare. For those starting out or who need to revise a topic, this really hits the mark, particularly as there are many new topics.

John Mailhot takes the lead on SMPTE ST 2110, explaining that it’s built on separate media (essence) flows. He covers how synchronisation is maintained and also gives an overview of the many parts of the SMPTE ST 2110 suite. He talks in more detail about the audio and metadata parts of the standard suite.

Eric Gsell discusses digital archiving and the considerations which come with deciding what formats to use. He explains colour space, the CIE model and the colour spaces we use such as 709, 2100 and P3 before turning to file formats. With the advent of HDR video and displays which can show bright video, Eric takes some time to explain why this could represent a problem for visual health as we don’t fully understand how the displays and the eye interact with this type of material. He finishes off by explaining the different ways of measuring the light output of displays and their standardisation.

Yvonne Thomas talks about the cloud, starting by explaining the difference between platform as a service (PaaS), infrastructure as a service (IaaS) and similar cloud terms. As cloud migrations are forecast to grow significantly, Yvonne looks at the drivers behind this and the benefits that it can bring when used in the right way. Using the cloud, Yvonne shows, can be an opportunity for improving workflows and adding more feedback and iterative refinement into your products and infrastructure.

Looking at video deployments in the cloud, Yvonne introduces the video codecs AV1 and VVC, both, in their own way, successors to HEVC/H.265, as well as the two transport protocols SRT and RIST, which exist to reliably send video with low latency over lossy networks such as the internet. To learn more about these protocols, check out this popular talk on RIST by Merrick Ackermans and this SRT Overview.
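Both SRT and RIST recover lost packets by retransmission: the receiver notices gaps in the packet sequence numbers and asks the sender for the missing ones. The sketch below illustrates just that idea. The function names and the dictionary-based buffers are invented for the example and do not reflect either protocol's wire format, and the timers and latency windows a real implementation needs are left out.

```python
def missing_sequence_numbers(received, first, last):
    """List the sequence numbers in [first, last] that never arrived."""
    seen = set(received)
    return [seq for seq in range(first, last + 1) if seq not in seen]


def recover(received_packets, sender_buffer, first, last):
    """Request the missing packets and return the stream in order.

    sender_buffer stands in for the copy of recent packets that the
    sender keeps so it can answer retransmission requests.
    """
    packets = dict(received_packets)
    requests = missing_sequence_numbers(packets.keys(), first, last)
    for seq in requests:
        if seq in sender_buffer:  # only recoverable while still buffered
            packets[seq] = sender_buffer[seq]
    ordered = [packets[seq] for seq in range(first, last + 1) if seq in packets]
    return ordered, requests


sender_buffer = {seq: f"pkt{seq}" for seq in range(1, 7)}
# Packets 3 and 5 were lost on the way across the network.
received = {seq: sender_buffer[seq] for seq in (1, 2, 4, 6)}

stream, requested = recover(received, sender_buffer, 1, 6)
print(requested, stream)
```

The amount of buffering allowed for these round trips is the trade-off between reliability and latency.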

Rounding off the primer is Linda Gedemer from Source Sound VR, who introduces immersive audio, measuring sound output (SPL) from speakers and looking at the interesting problem of forward speakers in cinemas. They have long been behind the screen, which has meant the screens have to be perforated to let the sound through, which interferes with the sound itself. Now that cinema screens are changing to be solid screens, not completely dissimilar to large outdoor video displays, the speakers are having to move. But with them now out of the line of sight, how can we keep the sound in the right place for the audience?

This video is a great summary of many of the key challenges in the industry and works well for beginners and those who just need to keep up.

John Mailhot
Systems Architect for IP Convergence,
Imagine Communications
Eric Gsell
Staff Engineer,
Dolby Laboratories
Linda Gedemer, PhD
Technical Director, VR Audio Evangelist
Source Sound VR
Yvonne Thomas
Strategic Technologist
Digital TV Group