Video: Overview of MPEG’s Network-Based Media Processing

Building complex services from microservices is not simple. While building a static workflow is practical, if time-consuming, building one that can easily change to match a business's evolving needs is another matter. If an abstraction layer could be placed on top of the microservices themselves, people could concentrate on getting the workflow right and leave the abstraction layer to orchestrate the microservices below. This is what MPEG's Network-Based Media Processing (NBMP) standard achieves.

Developed to counter the fragmentation of cloud and single-vendor deployments, NBMP delivers a unified way to describe a workflow independently of the platform running beneath it. Iraj Sodagar spoke at Mile High Video 2020 to introduce NBMP, now published as ISO/IEC 23090-8. NBMP provides a framework for deploying and controlling media processing using existing building blocks called functions, fed by sources and sinks, also known as inputs and outputs. A Workflow Manager process starts and controls the media processing, fed with a workflow description that specifies the processing wanted as well as the I/O formats to use. This is complemented by a Function Discovery API and a Function Repository used to discover and retrieve the functions needed. The Workflow Manager fetches each function and uses the Task API to initiate the processing of media; it also deals with finding storage and understanding the networking involved.
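To make the idea concrete, here is a simplified, illustrative sketch of what an NBMP-style workflow description might contain. The field names and values below are indicative only, not the normative Workflow Description Document schema from ISO/IEC 23090-8: a source feeds a chain of functions which produce a sink.

```python
import json

# Illustrative sketch only: field names are placeholders, not the exact
# NBMP Workflow Description Document schema defined in ISO/IEC 23090-8.
workflow_description = {
    "input": {
        "media-parameters": [{
            "stream-id": "camera-1",      # the source feeding the workflow
            "protocol": "rtmp",
            "mime-type": "video/mp4",
        }]
    },
    "output": {
        "media-parameters": [{
            "stream-id": "abr-out",       # the sink the workflow produces
            "protocol": "dash",
            "mime-type": "application/dash+xml",
        }]
    },
    "processing": {
        # Functions the Workflow Manager should look up in the
        # Function Repository and connect into a workflow.
        "connection-map": [
            {"from": "decoder", "to": "scaler"},
            {"from": "scaler", "to": "encoder"},
            {"from": "encoder", "to": "packager"},
        ],
    },
}

print(json.dumps(workflow_description["processing"]["connection-map"], indent=2))
```

The Workflow Manager's job is to turn a description like this into running tasks on whatever platform sits underneath.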

Next, Iraj takes us through the framework APIs which allow the abstraction layer to operate, in principle, across multiple cloud providers. The standard defines three APIs: Workflow, Task and Function. Each follows a CRUD pattern, with Create, Update, Discover, Delete and similar actions applying to workflows, tasks and functions, e.g. CreateWorkflow. The APIs can operate synchronously or asynchronously.
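As a hypothetical sketch of how such CRUD operations might map onto HTTP verbs, consider the table below. The endpoint paths and payload shapes are assumptions for illustration; consult ISO/IEC 23090-8 for the normative API definitions.

```python
# Hypothetical mapping of NBMP-style CRUD operations to HTTP verbs and
# paths. These paths are illustrative assumptions, not the standard's own.
OPERATIONS = {
    "CreateWorkflow":    ("POST",   "/workflows"),
    "UpdateWorkflow":    ("PATCH",  "/workflows/{id}"),
    "GetWorkflow":       ("GET",    "/workflows/{id}"),
    "DeleteWorkflow":    ("DELETE", "/workflows/{id}"),
    # The Task and Function APIs follow the same pattern, e.g.:
    "CreateTask":        ("POST",   "/tasks"),
    "DiscoverFunctions": ("GET",    "/functions"),
}

def request_for(operation: str, **params) -> tuple:
    """Return the (verb, path) pair for a named NBMP-style operation."""
    verb, path = OPERATIONS[operation]
    return verb, path.format(**params)

print(request_for("UpdateWorkflow", id="wf-42"))  # ('PATCH', '/workflows/wf-42')
```

Whether a given deployment exposes these synchronously or asynchronously is left to the implementation, as the talk notes.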

Split rendering is possible by dividing the workflow into sub-workflows, which lets you run certain tasks nearer to particular resources, say storage, or in particular locations, as in edge computing, where you maintain low latency by processing close to the user. In fact, NBMP was created with a view to use by 5G operators and is the subject of two study items in 3GPP.
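The partitioning idea can be sketched very simply. This is not from the standard, just an illustration of grouping a workflow's tasks into sub-workflows by their preferred location, so latency-sensitive tasks run at the edge and the rest in a central cloud.

```python
from collections import defaultdict

# Illustrative only: task names and locations are made up for the example.
tasks = [
    {"name": "capture",   "location": "edge"},   # keep close to the user
    {"name": "encode",    "location": "edge"},
    {"name": "transcode", "location": "cloud"},
    {"name": "package",   "location": "cloud"},
]

def split_into_subworkflows(tasks):
    """Group task names into sub-workflows keyed by target location."""
    subworkflows = defaultdict(list)
    for task in tasks:
        subworkflows[task["location"]].append(task["name"])
    return dict(subworkflows)

print(split_into_subworkflows(tasks))
# {'edge': ['capture', 'encode'], 'cloud': ['transcode', 'package']}
```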

Watch now!

Iraj Sodagar
Principal Researcher
Tencent America

Video: MPEG-5 Essential Video Coding (EVC) Standard

Learning from the patent missteps of HEVC, MPEG has released MPEG-5 EVC, which brings bitrate savings, faster encoding and clearer licensing terms, including a royalty-free implementation. The hope is that, with more control over their exposure to patent risk, companies large and small will adopt EVC as they improve and launch streaming services now and in the future.

At Mile High Video 2020, Kiho Choi introduced MPEG-5 Essential Video Coding (EVC). Naturally, the motivation to produce a new codec was partly the continued need to reduce video bitrates. With estimates of video's share of internet traffic, both now and in the future, hovering between 75% and 90%, any reduction in bitrate will have a wide benefit, best exemplified by Netflix and Facebook's decision to reduce the bitrate at the top of their ABR ladders during the pandemic, which reduced the quality available to viewers. The unspoken point of the talk is that if the top rung had used EVC, viewers wouldn't have noticed a drop in quality.

The most important point about EVC, in contrast to last year's jointly defined standard VVC, is that it gives businesses a lot of control over their exposure to patent royalties. It's no secret that HEVC adoption has been hampered by the risk that large users could be approached for licensing fees. Whilst it has made its way into Apple devices, which is no small success, big players like ESPN won't have anything to do with it. EVC tackles this problem in two ways. One is a baseline profile which provides bitrate savings over its predecessors but uses a combination of technologies which are either old enough to no longer be eligible for royalty payments or have been validated as free to use. Companies should therefore be able to use this profile without any reasonable concern over legal exposure. Moreover, the main profile, which does use patented technologies, allows each individual tool to be switched off, meaning anyone encoding EVC has control, assuming the vendor makes this possible, over which technologies they are using and hence their exposure to risk. Kiho points out that this business-requirements-first approach is new and in contrast to many codecs.
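The per-tool switch-off idea can be sketched abstractly. The tool names below are placeholders, not the real EVC main-profile tool list; the point is only that each coding tool is an individually toggleable flag in the encoder's configuration.

```python
# Hypothetical sketch of EVC-style per-tool control: every main-profile
# coding tool can be individually disabled, so an operator can exclude
# tools whose licensing position they are unhappy with. "tool_a" etc. are
# placeholder names, not actual EVC tools.
main_profile_tools = {
    "tool_a": True,
    "tool_b": True,
    "tool_c": True,
}

def disable_tools(tools, to_disable):
    """Return a tool configuration with the named tools switched off."""
    return {name: (enabled and name not in to_disable)
            for name, enabled in tools.items()}

config = disable_tools(main_profile_tools, {"tool_b"})
print(config)  # {'tool_a': True, 'tool_b': False, 'tool_c': True}
```

In practice this control only exists if the encoder vendor exposes it, as the talk notes.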

Kiho highlights a number of the individual tools within both the baseline and main profiles which provide the bitrate savings, before showing us the results of the objective and subjective testing. Within the EVC documents, the testing methodology is spelt out to allow EVC to be compared against its predecessors AVC and HEVC. The baseline profile shows an improvement of 38% for 1080p60 material and 35% for UHD material compared to AVC doing the same tasks, yet encoding is quicker (less compute needed) and decoding complexity is approximately the same. The main profile, being more efficient, is compared against HEVC, which is itself around 50% more efficient than AVC. Against HEVC, Kiho says, EVC main profile produces an encoding gain of around 30% for UHD footage and 25% for 1080p60 footage. Encoding takes close to 5x longer and decoding around 1.5x longer than HEVC.
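Chaining those quoted figures gives a rough sense of the overall saving. This is back-of-envelope arithmetic on the numbers from the talk, not an additional measurement: if HEVC needs roughly half of AVC's bitrate and EVC main profile saves a further ~30% over HEVC for UHD, the ratios multiply.

```python
# Back-of-envelope arithmetic using the figures quoted in the talk:
# HEVC is ~50% more efficient than AVC, and EVC main profile saves a
# further ~30% (UHD) over HEVC.
hevc_vs_avc = 0.50        # HEVC needs ~50% of AVC's bitrate
evc_vs_hevc = 1 - 0.30    # EVC main needs ~70% of HEVC's bitrate (UHD)

evc_vs_avc = hevc_vs_avc * evc_vs_hevc
print(f"EVC main profile at ~{evc_vs_avc:.0%} of the AVC bitrate")
# EVC main profile at ~35% of the AVC bitrate
```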

Kiho finishes by summarising subjective testing of SDR and HDR videos which shows that, in contrast to the objective savings calculated by computers, perceived quality in practice is higher and enables a greater bitrate reduction, a phenomenon which has been seen in other codec comparisons such as LCEVC. SDR results show a 50% encoding gain for 4K and 30% for 1080p60 against AVC. Against HEVC, the main profile delivers 50% coding gains for 4K content and 40% for 1080p60. For HDR, the main profile provides an approximately 35% encoding gain for both 1080p60 and 4K.

Watch now!

Kiho Choi
Senior Engineer & Technical Lead for Multimedia Standards at Samsung Electronics
Lead Editor of MPEG-5 Part 1 Essential Video Coding

Video: Don’t let latency ruin your longtail: an introduction to “dref MP4” caching

So it turns out that simply having an .mp4 file isn't enough. MP4s work well for low-latency streaming, but for very fast start times, there's optimisation work to be done.

Unified Streaming's Boy van Dijk looks at how MP4s are put together (AKA ISO BMFF) to explain how simply restructuring the data can speed up your time-to-play.

Part of the motivation to optimise is financial: storing media on Amazon's S3 is relatively cheap and can deal with a decent amount of throughput, but it costs latency. The way to work around this, explains Boy, is to bring the metadata out of the media so you can cache it separately and, if possible, elsewhere. Within the spec is the ability to bring the index information out of the original media and into a separate file using the dref, the Data Reference box.
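To see why this separation is possible, it helps to recall the ISO BMFF layout: a file is a sequence of boxes, each prefixed by a 4-byte big-endian size and a 4-byte type, with the index metadata ('moov') distinct from the bulk media data ('mdat'). A minimal sketch of walking the top-level boxes, using a tiny synthetic file rather than a real MP4:

```python
import struct

def top_level_boxes(data: bytes):
    """Yield (type, size) for each top-level ISO BMFF box in a byte string."""
    offset = 0
    while offset + 8 <= len(data):
        # Each box header: 4-byte big-endian size, then 4-byte type code.
        size, box_type = struct.unpack(">I4s", data[offset:offset + 8])
        yield box_type.decode("ascii"), size
        offset += size

# Synthetic example: a tiny 'moov' (metadata) followed by a large 'mdat'
# (media data). It's this small/large split that dref-based caching exploits.
fake_mp4 = (
    struct.pack(">I4s", 16, b"moov") + b"\x00" * 8 +
    struct.pack(">I4s", 1008, b"mdat") + b"\x00" * 1000
)
print(list(top_level_boxes(fake_mp4)))  # [('moov', 16), ('mdat', 1008)]
```

Because the metadata box is tiny relative to the media, it can live in fast, close storage while the media stays on cheap storage like S3.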

Boy explains that, by working statelessly, we can see why latency is reduced. Typically three requests would be needed, but these can be cut to just one; moreover, stateless architectures scale better.
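The saving is easy to quantify in the abstract. Assuming each sequential round trip to remote storage costs some fixed time (the 50ms figure below is an assumption for illustration, not from the talk), collapsing three requests into one saves two round trips of startup latency:

```python
# Back-of-envelope sketch: sequential requests each cost one round trip.
rtt_ms = 50                # assumed round-trip time to storage (illustrative)
before = 3 * rtt_ms        # three sequential requests to get going
after = 1 * rtt_ms         # one request, with metadata cached close by
print(f"saved {before - after}ms per start")  # saved 100ms per start
```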

The longtail of your video library is affected most by this technique as it is, by proportion, the largest part but gets the fewest requests. Storing the metadata closer, or in faster storage, can vastly reduce startup times. Dref files point to the media data, allowing a system to bring that metadata closer. For a just-in-time packaging system, drefs work as a middle-man. The beauty is that a dref for a film of many gigabytes is only a few tens of megabytes.

Unified Origin, across different tests, saw reductions of 1,160ms to 15ms, 185ms to 13ms and 240ms to 160ms, depending on what exactly was being tested, which Boy explains in more detail in the talk. Overall they have shown a non-trivial improvement in startup delay.

Watch now!
Download a detailed presentation

Boy van Dijk
Streaming Solutions Engineer,
Unified Streaming

Video: LCEVC, The Compression Enhancement Standard

MPEG released three codecs last year: VVC, LCEVC and EVC. Which one was unlike the others? LCEVC is the only one that is an enhancement codec, working in tandem with a second codec running underneath. Each MPEG codec from last year addressed specific needs, with VVC aiming at comprehensive bitrate savings while EVC aims to push encoding further whilst offering a royalty-free baseline profile.

In this talk, we hear from Guido Meardi of V-Nova, who explains why LCEVC is needed and how it works. LCEVC was made, Guido explains, to cater for an increasingly crowded network environment with more and more devices sending and receiving video, both in residential and enterprise settings. LCEVC helps by reducing the bitrate needed for a certain quality level but, crucially, also reduces the computation needed to achieve good quality video, which benefits not only IoT and embedded devices but also general computing.

LCEVC uses a 'base codec', which is any other codec, often AVC or HEVC, running at a lower resolution than the source video. With this hybrid technique, LCEVC aims to get the best video compression out of the base codec while, by running the encode at a quarter resolution, allowing this to be done on low-power hardware. LCEVC then deals with reconstructing two enhancement layers and a relatively simple super-resolution upsample. This is all achieved with a simple toolset, and all of the LCEVC computation can be done on CPU, GPU or other types of processor; it's not bound to hardware acceleration.
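The hybrid base-plus-enhancement idea can be illustrated with a toy example. This is not real LCEVC: the "codec" here is lossless, the signal is 1-D, and the upsampler is nearest-neighbour rather than LCEVC's smarter one. It only shows the shape of the scheme: encode a low-resolution base, upsample it, and carry the difference from the full-resolution source as a residual enhancement layer.

```python
# Toy illustration (not real LCEVC) of base layer + enhancement residual.
def downsample(signal):
    """Stand-in for encoding at reduced resolution: keep every 2nd sample."""
    return signal[::2]

def upsample(signal):
    """Nearest-neighbour upsample, a stand-in for LCEVC's upsampler."""
    return [s for s in signal for _ in (0, 1)]

source = [10, 12, 14, 16, 20, 22, 30, 28]
base = downsample(source)                                # what the base codec carries
predicted = upsample(base)                               # decoder-side upsample
residual = [s - p for s, p in zip(source, predicted)]    # enhancement layer

# Reconstruction: upsampled base + residual recovers the full-res source.
reconstructed = [p + r for p, r in zip(predicted, residual)]
print(reconstructed == source)  # True
```

In the real codec the residual layers are themselves compressed, which is where the bitrate saving comes from; the sketch only shows why an enhancement layer can restore detail the upsample alone cannot.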

Guido presents a number of results from tests against a whole range of codecs, from VVC to AV1 to plain old AVC. These tests have been carried out by a number of people, including Jan Ozer, who undertook an extensive comparison. All of these tests point to the ability of LCEVC to extend the bandwidth savings of existing codecs, new and old.

Guido shows an example of a video comprising only edges (apart from mid-grey) and says that LCEVC encodes this not only better than HEVC but with an algorithm two orders of magnitude less complex. We then see a comparison of a pure upsample and an LCEVC encode. Upsampling alone can look good, but it can't restore information, and when there are small textual elements, the benefit of an enhancement layer bringing those back into the upsampled video is clear.

On the decode side, Guido presents tests showing that decoding is also quicker, by at least two times if not more, and because most of the decoding work is in decoding the base layer, this is still done using hardware acceleration (for AVC, HEVC and other codecs, depending on the platform). Because we can still rely on hardware decoding, battery life isn't impacted.

Watch now!

Guido Meardi
CEO & Co-Founder,