Video: Cloud Encoding – Overview & Best Practices

There are many ways to work in the cloud. You can use a monolithic solution which does everything for you but which, by its nature, is almost guaranteed to under-deliver on features in one way or another for any non-trivial workflow. Or you can pick best-of-breed functional elements and plumb them together yourself. With the former, you have a fast time to market and in-built simplicity along with some known limitations. With the latter, you may get exactly what you need, to the standard you want, but there’s a lot of work to implement and test the system.

Tom Kuppinen from Bitmovin joins Christopher Olekas from SSIMWAVE, host of this Kitchener-Waterloo Video Tech talk on cloud encoding. After an initial introduction to the ‘middle-aged’ startup Bitmovin, Tom talks about what ‘agility in the cloud’ means: being cloud-agnostic. This is the as-yet-unmentioned elephant in the room for broadcasters, who are used to having extreme redundancy. Whether it’s the BBC’s “no closer than 70m” requirement for separation of circuits or the standard deployment methodology for systems using SMPTE’s ST 2110, which will have two totally independent networks, putting everything into one cloud provider really isn’t in the same ballpark. AWS has availability zones, of course, which are one of a number of great ways of reducing the blast radius of problems. But surely there’s no better way of reducing the impact of an AWS problem than having part of your infrastructure in another cloud provider.

Bitmovin have implementations in Azure, Google Cloud and AWS along with other cloud providers. In this author’s opinion, it’s a sign of the maturity of the market that this is being thought about, but few companies are truly using multiple cloud providers in an agnostic way; this will surely change over the next 5 years. For reliable and repeatable deployments, API control is your best bet. For detailed monitoring, you will need to use APIs. For connecting together solutions from different vendors, you’ll need APIs. It’s no surprise that Bitmovin say they program ‘API First’; it’s a really important element to any medium to large deployment.

When it comes to the encoding itself, per-title encoding helps reduce bitrates and storage. Tom explains how it analyses each video and chooses the best combination of parameters for the title. In the Q&A, Tom confirms they are working on implementing per-scene encoding, which promises more savings still.
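The idea behind per-title encoding can be sketched in a few lines. This is only an illustration, not Bitmovin’s algorithm: the function name, the quality scores and the two example titles are all made up, and a real system would tune far more than bitrate. The sketch assumes some trial encodes have already been scored, then picks the cheapest bitrate that reaches a quality target.

```python
def pick_bitrate(measurements, target_quality):
    """Return the lowest bitrate (kbps) whose measured quality meets the target.

    measurements: list of (bitrate_kbps, quality_score) pairs from trial encodes.
    Falls back to the best-scoring option if nothing reaches the target.
    """
    good_enough = [bitrate for bitrate, quality in measurements
                   if quality >= target_quality]
    if good_enough:
        return min(good_enough)
    return max(measurements, key=lambda m: m[1])[0]


# Made-up trial-encode results for two hypothetical titles.
cartoon = [(1000, 93.0), (2000, 96.5), (4000, 98.0)]
sports = [(1000, 78.0), (2000, 88.0), (4000, 94.0)]

cartoon_kbps = pick_bitrate(cartoon, target_quality=92.0)
sports_kbps = pick_bitrate(sports, target_quality=92.0)
print(cartoon_kbps, sports_kbps)
```

With these invented numbers the easy-to-compress cartoon gets a much lower bitrate than the sports clip for the same quality target, which is where the bitrate and storage savings come from.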

To add to the complexity of a best-of-breed encoding solution, using best-of-breed codecs is part and parcel of the value. Bitmovin were early with AV1 and they support VP9 and HEVC. They can also distribute the encoding so that it’s encoded in parallel by as many cores as needed. This was their initial offering for AV1 encoding which was spread over more than 200 cores.
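As a rough sketch of how encoding can be spread over many workers, the example below splits a title into time ranges and processes them concurrently. It is an assumption-laden illustration rather than Bitmovin’s implementation: `encode_chunk` is a hypothetical placeholder for a real encoder call, and threads stand in for what would be separate cores or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(duration_s, chunk_s):
    """Return (start, end) second ranges covering the whole duration."""
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    chunks = []
    start = 0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        start = end
    return chunks


def encode_chunk(chunk):
    """Hypothetical stand-in for a real encoder call on one time range."""
    start, end = chunk
    return f"encoded[{start}-{end}s]"


def encode_in_parallel(duration_s, chunk_s, workers):
    chunks = split_into_chunks(duration_s, chunk_s)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so segments stay in sequence.
        return list(pool.map(encode_chunk, chunks))


segments = encode_in_parallel(duration_s=10, chunk_s=4, workers=3)
print(segments)
```

Because each chunk is independent, adding workers shortens the wall-clock time of a slow codec without changing the work done per chunk.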

Tom talks about how the cloud-based codecs can integrate into workflows and reveals that HDR conversion, instance pre-warming, advanced subtitling support and AV1 improvements are on the roadmap, which leads on to the Q&A. Questions include whether it’s difficult to deploy on multiple clouds, which HDR standards are likely to become the favourites, what the pain points of live streaming are and how to handle metadata.

Watch now!
Speakers

Tom Kuppinen
Senior Sales Engineer,
Bitmovin
Moderator: Christopher Olekas
Senior Software Engineer,
SSIMWAVE Inc.

Video: Super Resolution: What’s the buzz and why does it matter?

“Enhance!” the captain shouts as the blurry image on the main screen becomes sharp and crisp again. This was sci-fi – and this still is sci-fi – but super-resolution techniques are showing that it’s really not that far-fetched. Able to increase the sharpness of video, machine learning can enable upscaling from HD to UHD as well as increasing the frame-rate.

Bitmovin’s Adithyan Ilangovan is here to explain the success they’ve seen with super-resolution and though he concentrates on upscaling, this is just as relevant to improving downscaling. Here are our previous articles covering super resolution.

Adithyan outlines two main enablers of super resolution, allowing it to displace traditional methods such as bicubic and Lanczos. Enabler one is the advent of machine learning, which now has a good foundation of libraries and documentation for coders, making it fairly accessible to a wide audience. Enabler two is the proliferation of GPUs and, particularly for mobile devices, neural engines. Using the GPUs inside CPUs or in desktop PCI slots allows the analysis to be done locally without transferring great amounts of video to the cloud solely for the purpose of processing or identification. And if your workflow is already in the cloud, it’s now easy to rent GPUs and FPGAs to handle such workloads.

Using machine learning doesn’t only allow for better upscaling on a frame-by-frame basis; it can also form a view of the whole file, or at least the whole scene. With a better understanding of the type of video it’s analysing (cartoon, sports, computer screen etc.) it can tune the upscaling algorithm to deal with this optimally.

Anime has seen a lot of tuning for super resolution. Due to anime’s long history, there are a lot of old cartoons which are both noisy and low resolution. They are still enjoyed now but would benefit from more resolution to match the screens we now routinely use.

Adithyan finishes by asking how we should best take advantage of super resolution. Codecs such as LCEVC use it directly within the codec itself, but for systems which have pre- and post-processing around the encoder, Adithyan suggests it’s viable to consider reducing the bitrate to cut CDN costs, knowing that by using super-resolution at the decoder, the video quality can actually be maintained.

The video ends with a Q&A.

Watch now!
Download the slides
Speaker

Adithyan Ilangovan
Encoding Engineer,
Bitmovin

Video: State of Compression: Versatile Video Coding – H.266/VVC

An evolution from HEVC, VVC is a codec that not only delivers the traditional 50% bit rate reduction over its predecessor but also has specific optimisations for screen content (e.g. computer gaming) and 360-degree video.

Christian Feldmann from Bitmovin explains how VVC manages to deliver this bitrate reduction. VVC makes no claims to be a totally new codec, and Christian explains that the fundamental way the codec works, at a basic level, is the same as all block-based codecs including MPEG-2 and AV1. The bitrate savings come from incremental improvements in technique or from embracing a higher computation load to perform one function more thoroughly.

Block partitioning is one good example. Whilst AVC macroblocks are all 16×16 pixels in size, VVC allows 128×128 blocks. For larger areas of ‘solid’ colour, this allows for more efficiency. But the main advance comes from the fact that you can sub-divide each of these blocks into different-sized rectangles. Whilst sub-dividing has been possible as far back as AVC, there are more possible shapes available now, allowing the divisions to be created in closer alignment with the video.
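To make the benefit of variable block sizes concrete, here is a deliberately simplified sketch. It is not how a VVC encoder decides its partitions: it only performs square four-way splits (VVC also allows rectangular splits), and it uses a made-up ‘flatness’ test in place of a real rate-distortion decision.

```python
def is_flat(frame, x, y, size, tolerance):
    """True if all samples in the square block differ by at most `tolerance`."""
    values = [frame[row][col]
              for row in range(y, y + size)
              for col in range(x, x + size)]
    return max(values) - min(values) <= tolerance


def partition(frame, x, y, size, min_size, tolerance=0):
    """Recursively split a square block into four until each piece is flat.

    Returns a list of (x, y, size) leaf blocks.
    """
    if size <= min_size or is_flat(frame, x, y, size, tolerance):
        return [(x, y, size)]
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += partition(frame, x + dx, y + dy, half, min_size, tolerance)
    return blocks


# 8x8 frame: flat background with one bright sample in the top-left corner.
frame = [[0] * 8 for _ in range(8)]
frame[0][0] = 255

blocks = partition(frame, 0, 0, size=8, min_size=2)
print(blocks)
```

Only the corner containing detail is split down to the smallest size, while the three flat quadrants stay as single large blocks, so fewer blocks need to be signalled overall.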

Tiles and slices are a way of organising the macroblocks, allowing them to be treated together as a group. This grouping isn’t taken lightly; each group can be decoded separately. This allows the video to be split into sub-videos, which can be used for multiviewer-style applications or, for instance, to allow multiple 4K decoders to decode a 16K video. This could be one of those features which sees lots of innovative use…or, if it’s too complicated/restricted, will see no mainstream take-up.

Christian outlines other techniques such as intra-prediction, where macroblocks are predicted from already-decoded macroblocks. Any time a codec can predict a value, this tends to reduce bitrate. This is not because it necessarily gets the prediction right, but because it then only needs to send a correction, typically a smaller number, to arrive at the correct value. Similarly, prediction is also possible now between the Y, U and V channels.
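The principle that a prediction plus a small correction is cheaper than the raw value can be shown with a toy example. This is plain left-neighbour prediction on a single row of samples, chosen for brevity; it is not one of VVC’s actual intra-prediction modes, and the sample values are invented.

```python
def encode_row(samples):
    """Replace each sample with its difference from the previous sample."""
    residuals = []
    prediction = 0  # nothing decoded yet, so predict zero for the first sample
    for sample in samples:
        residuals.append(sample - prediction)
        prediction = sample
    return residuals


def decode_row(residuals):
    """Rebuild the samples by adding each residual to the running prediction."""
    samples = []
    prediction = 0
    for residual in residuals:
        sample = prediction + residual
        samples.append(sample)
        prediction = sample
    return samples


row = [100, 102, 101, 105, 104]
residuals = encode_row(row)
print(residuals)  # [100, 2, -1, 4, -1]
restored = decode_row(residuals)
```

After the first sample, every value sent is a small number near zero, which an entropy coder can represent in far fewer bits than the original samples, and the decoder still recovers the row exactly.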

Finishing off, Christian covers geometric partitioning, similar to AV1, which allows diagonal splitting of macroblocks with each section having separate motion vectors. He also explains affine motion prediction, allowing blocks to scale, rotate, change aspect ratio and shear. Finally, Christian discusses the performance possible from the codec.
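A minimal sketch of the affine idea follows, assuming a generic model in which motion vectors are given at three corners of a block and interpolated linearly everywhere else. The function name and the example vectors are illustrative, and the sketch ignores the sub-block granularity and fixed-point arithmetic a real codec would use.

```python
def affine_mv(x, y, width, height, v0, v1, v2):
    """Motion vector at position (x, y) inside a block.

    v0, v1, v2 are (mvx, mvy) control-point vectors at the top-left,
    top-right and bottom-left corners respectively.
    """
    mvx = v0[0] + (v1[0] - v0[0]) * x / width + (v2[0] - v0[0]) * y / height
    mvy = v0[1] + (v1[1] - v0[1]) * x / width + (v2[1] - v0[1]) * y / height
    return (mvx, mvy)


# A 16x16 block that is zooming: the top-left corner is still, the
# top-right corner moves right and the bottom-left corner moves down.
v0, v1, v2 = (0, 0), (4, 0), (0, 4)

centre = affine_mv(8, 8, 16, 16, v0, v1, v2)
print(centre)  # (2.0, 2.0)
```

With a single translational vector every sample in the block would move identically. Here each position gets its own vector from just three transmitted ones, which is what lets one block describe scaling, rotation or shear.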

To find out more about VVC, including the content-based tuning such as for screen graphics, which is partly where the ‘versatile’ in VVC’s name comes from, listen to this talk, from 19 minutes in, given by Benjamin Bross from Fraunhofer. For Christian’s summary of all this year’s new MPEG codecs, see his previous video in the series.

Watch now!
Free to watch
Speaker

Christian Feldmann
Team Lead, Encoding
Bitmovin

Video: The Video Codec Landscape 2020

2020 has brought a bevy of new codecs from MPEG. These codecs represent a new recognition that the right codec is the one which fits your hardware and your business case. We have the natural evolution of HEVC, namely VVC which trades on complexity to achieve impressive bit rate savings. There’s a recognition that sometimes a better codec is one that has lower computation, namely LCEVC which enables a step-change in quality for lower-power equipment. And there’s also EVC which has a license-free mode to reduce the risk for companies which prefer low-risk deployments.

Christian Feldmann from Bitmovin takes the stage in this video to introduce these three new contenders in an increasingly busy codec landscape. Christian starts by talking about the incumbents, namely AVC, HEVC, VP9 and AV1. He puts their propositions up against the promises of these new codecs, which are all at the point of finalisation/publication. With the current codecs, Christian looks at what the hardware and software support is like as well as the licensing.

EVC (Essential Video Coding) is the first focus of the presentation, and its headline feature is a more reliable licensing landscape. The first offer is the baseline profile, which needs no licensing as it uses technologies which are old enough to be outside of patents. The main profile does require licensing but allows much better performance. Furthermore, the advanced tools in the main profile can each be turned off individually, hence avoiding patents that you don’t want to license. The hope is that this will encourage the patent holders to license the technology in a timely manner, else the customer can, relatively easily, walk away. Using the baseline profile alone should provide a 32% improvement over AVC, and the main profile can give up to a 25% benefit over HEVC.

LCEVC (Low Complexity Enhancement Video Coding) is next. This is a new technique for encoding which is actually two codecs working together. It uses a ‘base’ codec, such as AVC, HEVC or AV1, at low resolution. This low-fidelity version is then accompanied by enhancement information so that the low-resolution base can be upscaled to the desired resolution and then corrected, with relevant edges etc. added. The overall effect is that complexity is kept low. It’s designed as a software codec which can fit into almost any hardware by using the hardware decoders in SoCs/CPUs (e.g. Intel Quick Sync) plus the CPU itself, which deals with applying the enhancement. This ability to fit around hardware makes the codec ideal for improving the decoding capability of existing hardware. It stands up well against AVC, providing at least a 36% improvement, and at worst improves slightly upon HEVC bitrates but with much-reduced encoder computation.
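The base-plus-enhancement structure can be sketched on a one-dimensional signal. This is a conceptual toy, not the LCEVC bitstream or its actual tools: the pair-averaging downscaler, the sample-repeating upscaler and the example values are all assumptions, the base layer is passed through untouched where a real system would compress it with AVC, HEVC, AV1 etc., and the enhancement data is left uncompressed.

```python
def downscale(signal):
    """Halve the resolution by averaging neighbouring pairs (even length only)."""
    return [(signal[i] + signal[i + 1]) // 2 for i in range(0, len(signal), 2)]


def upscale(signal):
    """Double the resolution by repeating each sample."""
    return [value for value in signal for _ in range(2)]


def encode(signal):
    base = downscale(signal)      # would be handed to the base codec
    predicted = upscale(base)     # what the decoder can rebuild by itself
    enhancement = [s - p for s, p in zip(signal, predicted)]
    return base, enhancement


def decode(base, enhancement):
    return [p + e for p, e in zip(upscale(base), enhancement)]


original = [10, 12, 50, 54, 20, 20]
base, enhancement = encode(original)
print(base)         # [11, 52, 20]
print(enhancement)  # [-1, 1, -2, 2, 0, 0]
restored = decode(base, enhancement)
```

The expensive codec only ever sees the half-resolution base, while the enhancement layer carries small corrections that are cheap to apply, which matches the description of the CPU handling the enhancement alongside a hardware base decoder.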

VVC (Versatile Video Coding) is discussed by Christian but not in great detail as Bitmovin will be covering that separately. As an evolution of HEVC, it’s no surprise that bitrate is reduced by at least 40%, though encoding complexity has gone up 10-fold. This is similar to HEVC compared to its predecessor AVC. VVC has some built-in features not delivered as standard before such as special modes for screen content (such as computer games) and 360-degree video.

Free to watch now!

Speaker

Christian Feldmann
Lead encoding engineer,
Bitmovin