Streamline is a reference system design for premium-quality, end-to-end live streaming, all the way from SDI to a player fed from a CDN that works on the web, iOS, and Android devices. It uses commodity computer hardware, free software, and AWS to create an affordable way to learn how to build a high-quality live streaming system.
Already capable of 4K, this project is ideal as a learning tool for getting first-hand experience of how live video works end to end. Now, the project is being extended to handle four 4K 60fps feeds, or a single 8K stream. This update is called Streamline 2.
Colleen Henry from Facebook introduces the hardware behind the feat as comprising two NVIDIA Quadro GPUs and one large CPU – a Ryzen Threadripper 3990X. The equipment is perfectly capable of 8K, but the goal is actually to have enough power to deal with 10-bit, 4K, HDR, high-frame-rate feeds. The kit is also intended to be able to encode AV1, LCEVC and VP9. Colleen suggests considering the Lenovo ThinkStation P620 as a pre-built Threadripper desktop rather than building one yourself.
Code for the project can be found at https://streamline.wtf. After encoding, the rest of the work is done in AWS. Caitlin O’Callaghan talks us through the AWS setup: launching an m4.xlarge server with the correct firewall rules, building the code from the Streamline 2 repository, and then installing the encoder.
AV1 has been famous for very low encoding speed but, as we’ve seen from panels like this, AV1 encoding times have dropped into a practical range and it’s starting to gain traction. Zoe Liu, CEO of Visionular, is here to talk at Mile High Video 2020 about how careful use of encoding parameters can deliver faster encodes and smooth decodes, and yet balance that with codec efficiency.
Zoe starts by outlining the good work that’s been done with the SVT-AV1 encoder, which leaves it ready for deployment, as we heard previously from David Ronca of Facebook. Similarly, the dav1d decoder has recently made many speed improvements and can now easily decode 24fps on mobiles using between 1.5 and 3 Snapdragon cores, depending on resolution. Power consumption has been measured as higher than AVC decoding but lower than HEVC. Further to that, hardware support is arriving in many devices such as TVs.
Zoe then continues to show ways in which encoding can be sped up by reducing the calculations done which, in turn, increases decoder speed. Zoe’s work has exposed settings that significantly speed up decoding but have very little effect on the compression efficiency of the codec. This opens up use cases where decoding was the blocker and a 5% reduction in the ability to compress is a price worth paying. One example cited is ignoring partition sizes of less than 8×8. These small partitions can be numerous and bog down calculations, but their overall contribution to bitrate reduction is very low.
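To get a feel for why the smallest partitions dominate the workload, here is a minimal sketch that counts the square blocks an exhaustive quadtree split would visit inside one 64×64 superblock. It is a simplification: a real AV1 encoder also considers rectangular partitions and prunes the search early, and the function name `count_square_blocks` is made up for illustration.

```python
def count_square_blocks(block_size: int, min_size: int) -> int:
    """Count square blocks visited by an exhaustive quadtree split.

    Assumption: every block is split into four equal squares until
    min_size is reached. Real encoders prune this tree heavily.
    """
    if block_size < min_size:
        return 0
    # This block itself, plus its four children at half the size.
    return 1 + 4 * count_square_blocks(block_size // 2, min_size)


# 64x64 superblock, splitting all the way down to 4x4
blocks_down_to_4 = count_square_blocks(64, 4)
# Same superblock, but ignoring anything smaller than 8x8
blocks_down_to_8 = count_square_blocks(64, 8)

print(blocks_down_to_4, blocks_down_to_8)
```

In this toy model, the 4×4 level alone accounts for 256 of the 341 candidate blocks, so stopping at 8×8 removes roughly three quarters of the blocks to evaluate.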
All of these techniques are brought together under the heading of Decoder Complexity Aware AV1 Encoding Optimization which, Zoe explains, can result in an encoding speed-up of over two times the original framerate, i.e. twice real-time, on an Intel i5. Zoe concludes that this creates a great opportunity to apply AV1 to VOD use cases.
Yesterday we learnt about machine learning improving VVC. But VVC has a fundamental property which limits its ability to compress: it’s raster-based. Vector graphics are infinitely scalable with no loss of quality and are very efficient. Instead of describing 100 individual pixels, you can just define a line 100 pixels long. This video introduces a vector-based video codec which dramatically reduces bitrate.
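The 100-pixel line comparison can be made concrete with a rough sketch. It compares an uncompressed row of RGB pixels against an SVG `<line>` element. This is only an illustration: real raster codecs compress far better than raw RGB, and the helper names are invented for this example.

```python
def raster_row_bytes(length_px: int) -> int:
    """Size of one uncompressed row of black RGB pixels (3 bytes each)."""
    return len(bytes([0, 0, 0] * length_px))


def svg_line_bytes(length_px: int) -> int:
    """Size of an SVG element describing a horizontal line of the same length."""
    element = f'<line x1="0" y1="0" x2="{length_px}" y2="0" stroke="black"/>'
    return len(element.encode("utf-8"))


for length in (100, 1000, 10000):
    print(length, raster_row_bytes(length), svg_line_bytes(length))
```

The raster size grows in step with the length of the line, while the vector description only grows by a character each time the length gains a digit.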
Sam Bhattacharyya from Vectorly introduces this technique which uses SVG graphics, a well-established graphics standard available in all major web browsers. It describes shapes with XML and is similar to WebGL. The once universal Adobe Flash was able to animate SVG shapes as part of its distinctive ‘flash animations’. The new aspect here is not to start with SVG shapes and animate them, but to create those shapes from video footage and recreate that same video but with vectors.
Sam isn’t shy to acknowledge that video vectorisation is a technique which works well on animation with solid colours, Peppa Pig being the example shown. But on more complex imagery without solid colours and sharp lines, this technique doesn’t result in useful compression. To deal with shaded animation, he explains a technique of using mesh gradients and diffusion curves to represent gradually changing colours and shades. Sam is interested in exploring a hybrid mode whereby traditional video has graphics overlaid using this low-bandwidth vector-based codec.
The technique uses machine learning/AI techniques to identify the shapes, track them and put them into keyframes. The codec plays this back by interpolating the motion. This can produce files playable at HD at only 100kbps. For the right content, this can be a great option given it’s based on established standards, is low bitrate and can be hardware accelerated.
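Playback by interpolating between keyframes can be sketched as below. This is not Vectorly’s implementation. It assumes plain linear interpolation and that a shape keeps the same number of points, in the same order, in both keyframes. `interpolate_shape` is a hypothetical helper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interpolate_shape(start: List[Point], end: List[Point], t: float) -> List[Point]:
    """Linearly interpolate each vertex of a shape between two keyframes.

    t = 0.0 gives the first keyframe, t = 1.0 gives the second.
    """
    if len(start) != len(end):
        raise ValueError("both keyframes must have the same number of points")
    return [
        (x0 + (x1 - x0) * t, y0 + (y1 - y0) * t)
        for (x0, y0), (x1, y1) in zip(start, end)
    ]


# A two-point shape that moves 10 units right and 10 units down between keyframes
keyframe_a = [(0.0, 0.0), (10.0, 0.0)]
keyframe_b = [(10.0, 10.0), (20.0, 10.0)]

halfway = interpolate_shape(keyframe_a, keyframe_b, 0.5)
print(halfway)  # [(5.0, 5.0), (15.0, 5.0)]
```

Only the keyframes need to be stored or transmitted. Every frame in between is computed by the player, which is where the bitrate saving comes from.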
Sam’s looking for interest from the community at large to help move this work forward.
Streaming is such a success because it manages to deliver video even as your network capacity varies while you are watching. This is called ABR (Adaptive Bitrate), and this short talk asks how we can allow low-latency streams to nimbly adapt to network conditions whilst keeping the bitrate low in the new AV1 codec.
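As a reminder of what the player is doing when it adapts, here is a minimal sketch of a throughput-based rendition choice. The bitrate ladder, the 0.8 safety factor and the function name are all assumptions for illustration. Production players use more elaborate logic, such as buffer-based heuristics.

```python
from typing import List


def select_rendition(ladder_kbps: List[int], throughput_kbps: float,
                     safety_factor: float = 0.8) -> int:
    """Pick the highest rendition that fits within a fraction of measured throughput.

    Falls back to the lowest rendition when nothing fits.
    """
    budget = throughput_kbps * safety_factor
    candidates = [bitrate for bitrate in ladder_kbps if bitrate <= budget]
    return max(candidates) if candidates else min(ladder_kbps)


# Hypothetical bitrate ladder in kbps
ladder = [400, 800, 1800, 3500, 6000]

print(select_rendition(ladder, 3000))  # 1800
print(select_rendition(ladder, 100))   # 400
```

Each time the choice changes, the decoder has to join a different rendition, which it can only do at a frame that allows entry into the stream.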
Tarek Amara from Twitch explains the idea in AV1 of introducing S-Frames, sometimes called ‘switch frames’, which take the role of the more traditional I or IDR frames. If a frame is marked as an IDR frame, this means the decoder knows it can start decoding from this frame without worrying that it’s referencing some data that came before this frame. By doing this, you can allow frequent points at which a decoder can enter a stream. IDR frames are typically I frames which are the highest bandwidth frames, by a large proportion. This is because they are a complete rendition of a frame without any of the predictions you find in P and B frames.
Because IDR frames are so large, if you want to keep overall bandwidth down, you should reduce the number of them. However, reducing the number of IDR frames reduces the number of ‘in points’ for the stream, meaning a decoder then has to wait longer before it can start displaying the stream to the viewer. An S-Frame brings the benefits of an IDR in that it still marks a place in the stream where the decoder can join, free of dependencies on data previously sent. But the S-Frame takes up much less space.
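The trade-off between entry-point frequency and bitrate is simple arithmetic, sketched below. All frame sizes here are invented for illustration and are not figures from the talk. The only assumption that matters is that an S-Frame is smaller than an IDR frame but larger than an ordinary inter frame.

```python
def stream_kbps(entry_frame_kbit: float, inter_frame_kbit: float,
                interval_frames: int, fps: int) -> float:
    """Average bitrate when one entry frame is sent every interval_frames frames."""
    bits_per_interval = entry_frame_kbit + (interval_frames - 1) * inter_frame_kbit
    return bits_per_interval * fps / interval_frames


# Hypothetical frame sizes in kilobits
IDR_KBIT = 200.0
S_FRAME_KBIT = 60.0
INTER_KBIT = 20.0
FPS = 30

idr_every_2s = stream_kbps(IDR_KBIT, INTER_KBIT, interval_frames=60, fps=FPS)
idr_every_1s = stream_kbps(IDR_KBIT, INTER_KBIT, interval_frames=30, fps=FPS)
s_frame_every_1s = stream_kbps(S_FRAME_KBIT, INTER_KBIT, interval_frames=30, fps=FPS)

print(idr_every_2s, idr_every_1s, s_frame_every_1s)  # 690.0 780.0 640.0
```

With these made-up numbers, doubling the number of IDR entry points raises the bitrate from 690 to 780 kbps, whereas offering the same one-second entry points with S-Frames comes in at 640 kbps.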
Tarek looks at how an S-Frame is created and the parameters it needs to obey, and explains how the frames are signalled. To finish off, he presents tests showing the bitrate improvements that were achieved.