AV1’s no longer the slow codec it was when it was released. Real-time encodes and decodes are now practical with open-source software implementations called rav1e for encoding and dav1d for decoding. We’ve also seen in previous talks the SVT-AV1 provides real-time encoding and WebRTC now has a real-time version with the AV1 codec.
In this talk, rav1e contributor Vibhoothi explains more about these projects and how the ARM chipset helps speed up encoding. The Dav1d started project started in 2018 with the intention of being a fast, cross-platform AV1 encoder with a small binary which Vibhoothi says is exactly what we have in 2021. Dav1d is the complementary decoder project. AV1 decoding is found in many places now including in Android Q, in Microsoft’s media extension for it, VLC supports AV1 on linux and macOS thanks to dav1d, AV1 is supported in all major browsers, on NVIDIA and AMD GPUs plus Intel Tiger Lake CPUs. Netflix even use dav1d to stream AV1 onto some mobile devices. Overall, then, we see that AV1 has ‘arrived’ in the sense that it’s in common and increasing use.
The ARM CPU architecture underpins nearly all smartphones and most tablets so ARM is found in a vast number of devices. It’s only relatively recently that ARM has made it into mainstream servers. One big milestone has been the release of Neoverse which is an ARM chip for infrastructure. AWS now offer ARM instances that have a 40% higher performance but a 20% reduced cost. These have been snapped up by Netflix but also by a plethora of non-media companies. Recently Apple has made waves with their introduction of the M1 ARM-based chip for desktops which has benchmarks far in excess of the previous x86 offering which shows that the future for ARM-based implementations of the rav1e encoder and dav1d decoder are bright.
Vibhoothi outlines how dav1d works better on ARM then x86 with improved threading support including hand-written asm optimisations and support for 10-bit assembly. rav1e has wide support in VLC, GStreamer, FFmpeg, libavif and others.
The talk finishes with a range of benchmarks showing how better-than-real-time encoding and decoding is possible and how the number of threads relates to the throughput. Vibhoothi’s final thoughts focus on what’s still missing in the ARM implementations.
Research Assistant, Trinity College Dublin,
Codec Development, rav1e, Mozilla