For many transcoding workflows, efficiency or quality are the primary factors defining how they are created. But when ingesting user-generated videos like those uploaded to the online video platform, Vimeo, life gets difficult. Dealing with the wide variety of formats uploaded and the many edge cases in the way that otherwise normal AVC videos are delivered means throwing out any assumptions you ever had and analysing every aspect of the file.
Senior video encoding engineer Derek Buitenhuis takes us through the many lessons he and his colleagues have learnt over the years. Don’t, he says, assume that properties stay constant between frames: sometimes they change in every single frame. Assuming that you have a single frame rate throughout the video is another ‘no no’, as there are many variable frame rate videos.
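To make the frame rate point concrete, here is a minimal sketch of how a variable frame rate could be spotted from presentation timestamps alone. It is an illustration, not the method described in the talk: the function names, the `tolerance` parameter and the sample timestamps (in ticks of a 1/30000 s timebase) are all assumptions.

```python
def frame_durations(pts):
    """Return the gaps between consecutive presentation timestamps.

    `pts` is a list of integer timestamps in timebase ticks (hypothetical
    input; a real demuxer would supply these).
    """
    ordered = sorted(pts)  # presentation order; decode order can differ
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def is_variable_frame_rate(pts, tolerance=0):
    """Report True when the gaps between frames differ by more than `tolerance` ticks."""
    durations = frame_durations(pts)
    if not durations:
        return False  # zero or one frame: nothing to compare
    return max(durations) - min(durations) > tolerance


# Assumed sample data: a constant 29.97 fps clip and one with uneven gaps.
constant = [0, 1001, 2002, 3003, 4004]
variable = [0, 1001, 2002, 4004, 5005, 8008]

print(is_variable_frame_rate(constant))
print(is_variable_frame_rate(variable))
```

A single average frame rate would hide the uneven gaps in the second clip, which is why the check looks at every frame-to-frame duration.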
Derek also looks at dealing with samples stamped with negative timestamps, the need for sample durations, the myriad issues with seeking through a file, the fun of having some frames that aren’t displayed, and multiple-track videos.
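As a rough illustration of the timestamp problems mentioned here, the sketch below shifts negative timestamps so that the first sample starts at zero and derives per-sample durations from the gaps. This is one possible approach under stated assumptions, not necessarily what Vimeo does; the function names and the `last_duration` parameter are hypothetical.

```python
def normalise_timestamps(pts):
    """Shift timestamps so that none are negative.

    In a real file the same offset would need applying to every track,
    otherwise audio and video would drift apart.
    """
    if not pts:
        return []
    offset = min(pts)
    if offset >= 0:
        return list(pts)
    return [t - offset for t in pts]


def derive_durations(pts, last_duration=None):
    """Derive each sample's duration from the gap to the next sample.

    The final sample has no successor, so its duration cannot be inferred;
    it stays None unless the caller supplies `last_duration`.
    """
    if not pts:
        return []
    ordered = sorted(pts)
    durations = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    durations.append(last_duration)
    return durations


shifted = normalise_timestamps([-2002, -1001, 0, 1001])
print(shifted)
print(derive_durations(shifted))
```

The `None` left on the last sample shows why explicit sample durations matter: timestamps alone never say how long the final frame should stay on screen.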
Colour spaces, to no one’s surprise, cause handling difficulties, for example when the colour properties in the bitstream differ from those in the container. As the talk finishes, we’re left considering old MPEG-2 files that can have unavoidable banding, replicating looping MOV files, and dealing with QuickTime special effects channels that animate a fire on the screen.
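The container-versus-bitstream mismatch can be pictured with a small sketch like the one below. It assumes the colour metadata from each layer has already been parsed into dictionaries; the field names, the example values and the function name are illustrative assumptions rather than any real library's API.

```python
# Hypothetical field names for the three usual colour properties.
COLOUR_FIELDS = ("primaries", "transfer", "matrix")


def colour_mismatches(container, bitstream):
    """Return {field: (container_value, bitstream_value)} for every disagreement.

    A field missing from one side is reported too, with None in its place.
    """
    mismatches = {}
    for field in COLOUR_FIELDS:
        container_value = container.get(field)
        bitstream_value = bitstream.get(field)
        if container_value != bitstream_value:
            mismatches[field] = (container_value, bitstream_value)
    return mismatches


# Assumed example: the container claims BT.709 throughout, while the
# bitstream signals a different matrix and no transfer function at all.
container_info = {"primaries": "bt709", "transfer": "bt709", "matrix": "bt709"}
bitstream_info = {"primaries": "bt709", "matrix": "smpte170m"}

print(colour_mismatches(container_info, bitstream_info))
```

A check like this only flags the disagreement; deciding which layer to trust is the hard part, and the sketch deliberately leaves that policy out.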
The ever-popular, always analytical Jan Ozer spends time here evaluating the quality of these codecs against the ever-present H.264. As the team here at The Broadcast Knowledge takes a short break, we’re recapping the most popular posts of the year. Interestingly, this post is from over a year ago but is still seeing top-10 traffic. This is no surprise since, as I said in my interview with SMPTE on the subject of codecs, everyone touches codecs in some way, even if only at home.
Jan takes a careful approach to explaining the penetration and abilities of H.264 in order to see at what point we can break even and start to benefit from using alternative codecs. He then takes each codec in turn, looking at its pros and cons to paint a picture of the options available for those willing and able to go beyond H.264.
FPGAs are flexible, reprogrammable chips which can do certain tasks faster than CPUs, for example, video encoding and other data-intensive tasks. Once the domain of expensive hardware broadcast appliances, FPGAs are now available in the cloud allowing for cheaper, more flexible encoding.
In fact, according to NGCodec founder Oliver Gunasekara, video transcoding makes up a large percentage of cloud workloads and this is increasing year on year. The demand for more video and the demand for more efficiently compressed video both push up the encoding requirements. HEVC and AV1 both need much more encoding power than AVC, but the reduced bitrate can be worth it as long as the transcoding is quick enough and at the right cost.
Oliver looks at how the future adoption of new codecs is likely to play out, which will directly feed into the quality of experience: start-up time, visual quality and buffering are all helped by reduced bitrate requirements.
It’s worth looking at the differences and benefits of CPUs, FPGAs and ASICs. The talk examines the CPU-time needed to encode HEVC showing the difficulty in getting real-time frame rates and the downsides of software encoding. It may not be a surprise that NGCodec was acquired by FPGA manufacturer Xilinx earlier in 2019. Oliver shows us the roadmap, as of June 2019, of the codecs, VQ iterations and encoding densities planned.
The talk finishes with a variety of questions covering the applicability of machine learning to encoding, such as scene detection and upscaling algorithms, the applicability of C++ to Verilog conversion, and the need for a CPU for supporting tasks.
Twitch is constantly searching for better and lower-cost ways of streaming, and its move to include VP9 was one of the most high-profile ways of doing this. In this talk, a team of Twitch engineers examines the reasons for this and other moves.
Tarek Amara first takes to the stage to introduce Twitch and its scale before looking at the codecs available, the fragmentation of support and the drivers to improve the video delivered to viewers in terms of frame rate and resolution as well as quality. The discussion turns to the reasons to implement VP9, and we see that if HEVC were chosen instead, less than 3% of people would be able to receive it.
Nagendra Babu explains the basic architecture employed at Twitch before going on to explain the challenges they met in testing and developing the backend and app. He also talks about the difficulty of running multiple transcodes in the cloud. FPGAs are an important tool for Twitch, and Nagendra discusses how they deal with their programming.
The last speaker is Nikhil, who talks about the format of VP9 being fMP4 delivered by transport stream and then outlines the pros and cons of fragmented MP4 before handing the floor to the audience.