RTMP hasn’t left us yet, although between HLS, DASH, SRT and RIST the industry is doing its best to get rid of it. When it was introduced, RTMP’s latency was seen as low and it became a de facto standard. As it hasn’t gone away, it pays to take a little time to understand how it works.
Nick Chadwick from Mux is our guide in this ‘quick deep-dive’ into the protocol itself. To start off he explains the history of the Adobe-created protocol to help put into context why it was useful and how the specification that Adobe published wasn’t quite as helpful as it could have been.
Nick then gives us an overview of the protocol, explaining that it’s TCP-based and allows for multiple, bi-directional streams. He explains that RTMP multiplexes larger messages, such as video, along with very short data requests, such as RPC, by breaking the messages down into chunks which can be interleaved over a single TCP connection. Multiplexing at the chunk level allows RTMP to ask the other end a question at the same time as delivering a long message.
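The chunking idea can be sketched in a few lines of Python. This is a minimal illustration rather than a wire-accurate implementation: chunk headers are left out, the round-robin scheduling is an assumption made for clarity, and the function names and chunk stream IDs are hypothetical. The 128-byte chunk size is RTMP's default.

```python
from collections import deque
from typing import Dict, Iterable, Iterator, List, Tuple

CHUNK_SIZE = 128  # RTMP's default maximum chunk payload size, in bytes


def split_into_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut one message payload into pieces of at most chunk_size bytes."""
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


def interleave(messages: Dict[int, bytes],
               chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Round-robin the chunks of several messages onto one ordered connection.

    Keys are hypothetical chunk stream IDs. Real RTMP prefixes every chunk
    with a header, which is omitted here, and a real sender is free to
    schedule chunks differently.
    """
    queues = {csid: deque(split_into_chunks(data, chunk_size))
              for csid, data in messages.items()}
    while queues:
        for csid in list(queues):          # copy the keys so we can delete safely
            yield csid, queues[csid].popleft()
            if not queues[csid]:
                del queues[csid]


def reassemble(chunks: Iterable[Tuple[int, bytes]]) -> Dict[int, bytes]:
    """Rebuild each message by concatenating its chunks in arrival order."""
    out: Dict[int, bytes] = {}
    for csid, piece in chunks:
        out[csid] = out.get(csid, b"") + piece
    return out


# A long "video" message and a short "RPC" message share the connection.
messages = {6: b"V" * 300, 3: b"rpc"}
wire = list(interleave(messages))
```

Here the short message goes out as the second chunk on the wire instead of waiting behind all 300 bytes of the long one, which is the behaviour the talk describes.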
Nick has a great ability to make describing the protocol and showing ASCII tables accessible and interesting. We quickly move on to the chunk header, with Nick explaining the different chunk types and how the headers can be compressed to save bitrate. He also describes how the RTMP timestamp works, along with the control message and command message mechanisms. Before taking questions, Nick outlines the difficulty of extending RTMP to new codecs, which stems from the hard-coded list of codecs that can be used, and recommends improvements to the protocol. It’s worth noting that this talk is from 2017. Whilst everything about RTMP itself will still be correct, it’s worth remembering that SRT, RIST and Zixi have since taken the place of a lot of RTMP workflows.
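To make the header compression concrete, here is a small Python sketch of parsing the first part of a chunk, the 'basic header', as laid out in Adobe's published specification. The function and constant names are hypothetical and this is an illustration of the layout, not code from the talk. The top two bits select one of four header formats, and the later formats leave out fields that can be inherited from the previous chunk on the same chunk stream.

```python
from typing import Tuple

# Bytes of message header that follow the basic header, keyed by format type.
# Type 0 carries every field; type 3 carries none and inherits them all.
MESSAGE_HEADER_SIZE = {0: 11, 1: 7, 2: 3, 3: 0}


def parse_basic_header(data: bytes) -> Tuple[int, int, int]:
    """Return (fmt, chunk_stream_id, bytes_consumed) for the start of a chunk."""
    first = data[0]
    fmt = first >> 6       # top two bits: header format type
    csid = first & 0x3F    # low six bits: chunk stream ID, or an escape value
    if csid == 0:          # two-byte form: IDs 64 to 319
        return fmt, 64 + data[1], 2
    if csid == 1:          # three-byte form: IDs 64 to 65599
        return fmt, 64 + data[1] + (data[2] << 8), 3
    return fmt, csid, 1


fmt, csid, used = parse_basic_header(bytes([0xC4]))
```

For the single byte `0xC4` this gives format 3 on chunk stream 4, so the whole header for that chunk is one byte, compared with twelve bytes for a full format 0 header.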
Low latency can be a differentiator for a live streaming service, or just a way to ensure you’re not beaten to the punch by social media or broadcast TV. Either way, it’s seen as increasingly important for live streaming to be punctual, breaking from the past where latencies of thirty to sixty seconds were not uncommon. As the industry has matured and connectivity has enough capacity for video, simply getting motion on the screen isn’t enough anymore.
Steve Heffernan from Mux takes us through the thinking about how we can deliver low-latency video both into the cloud and out to the viewers. He starts by talking about the use cases for sub-second latency (anything involving interaction or conversation) and how that differs from low-latency streaming, which is one-to-many and potentially very large in scale. If you’re on a video call with ten people, then you need sub-second latency or else the conversation will suffer. But when distributing to thousands or millions of people, the extra risk of rebuffering that comes with operating at sub-second latency isn’t worth it, and 3 seconds is usually perfectly fine.
Steve talks through the low-latency delivery chain, starting with the camera and encoder and then looking at the contribution protocol. RTMP is still often the only option, but increasingly it’s possible to use WebRTC or SRT, the latter usually being the best for streaming contribution. Once the video has hit the streaming infrastructure, be that in the cloud or otherwise, it’s time to look at how to build the manifest and send the video out. Steve talks us through the options of Low-Latency HLS (LHLS), CMAF DASH and Apple’s LL-HLS. Do note that since the talk, Apple has removed the requirement for HTTP/2 push.
The talk finishes off with Steve looking at the players. If you don’t get the player’s logic right, you can start off much farther behind live than necessary. This is becoming less of a problem now as players are starting to ‘bend time’ by speeding up and slowing down playback to bring their latency within a certain target range. But this only underlines the importance of the quality of your player implementation.
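As a rough illustration of ‘bending time’, the sketch below picks a playback rate from the current live latency. It is a simple clamped proportional controller written for this summary, not the logic of any particular player; the function name, the 3-second target, the tolerance, the gain and the 5% cap are all assumed values.

```python
def playback_rate(latency_s: float,
                  target_s: float = 3.0,
                  tolerance_s: float = 0.5,
                  gain: float = 0.02,
                  max_adjust: float = 0.05) -> float:
    """Pick a playback rate that nudges live latency back toward the target.

    All default values are illustrative assumptions, not player defaults.
    """
    error = latency_s - target_s
    if abs(error) <= tolerance_s:
        return 1.0                      # close enough: play at normal speed
    # Speed up when too far behind live, slow down when too close to it,
    # but never by more than max_adjust so the change stays unobtrusive.
    adjust = max(-max_adjust, min(max_adjust, gain * error))
    return 1.0 + adjust


rate_when_far_behind = playback_rate(10.0)   # capped speed-up
rate_when_too_close = playback_rate(1.0)     # gentle slow-down
```

A player would call something like this periodically and apply the result to its media element, returning to normal speed once latency is back inside the tolerance band.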
Getting colours right is tricky. Many of us get away without considering colour spaces in both our professional and personal lives. But if you’ve ever wanted to print a logo in exactly the right colour, you may have found out the hard way that the colour in your JPEG doesn’t always match the CMYK of the printer. Here we’re talking, of course, about colour in video. With SD’s 601 and HD’s 709 colour spaces, how do we keep colours correct?
In this talk starting 28 minutes into the Twitch feed, Matt Szatmary exposes a number of problems. The first is the inconsistent, and sometimes wrong, way that browsers interpret colours in videos. Second is that FFmpeg only maintains colour space information in certain circumstances and, lastly, he exposes the colour changes that can occur when you’re not careful about maintaining the ‘chain of custody’ of colour space information.
Matt starts by explaining that the ‘VUI’, the Video Usability Information found in AVC and HEVC, conveys colour space information among other things such as aspect ratio. These fields were new with AVC; they are not used by the encoder but indicate to decoders things to consider during the decoding process. We then see a live demonstration of Matt using FFmpeg to move videos through different colour spaces and the immediate results in different browsers.
This is an illuminating talk for anyone who cares about actually displaying the correct colours and brightnesses, particularly given there are many processes based on FFmpeg. Matt demonstrates how to ensure FFmpeg is maintaining the correct information.
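As a hedged illustration of what explicitly carrying colour information through FFmpeg can look like, the Python sketch below builds an argument list that tags an H.264 encode as BT.709 using FFmpeg's `-color_primaries`, `-color_trc`, `-colorspace` and `-color_range` options. This is not the command Matt uses in the talk, the function name and file names are hypothetical, and these options only signal the colour properties in the output; they do not convert pixel values, so they are only appropriate when the source really is BT.709.

```python
from typing import List


def tag_bt709_command(src: str, dst: str) -> List[str]:
    """Build an FFmpeg argument list that signals BT.709 colour in the output.

    Assumes the source pixels are already BT.709, limited ("tv") range.
    """
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",
        "-color_primaries", "bt709",   # chromaticity of the primaries
        "-color_trc", "bt709",         # transfer characteristics
        "-colorspace", "bt709",        # YUV <-> RGB matrix coefficients
        "-color_range", "tv",          # limited range
        "-c:a", "copy",
        dst,
    ]


cmd = tag_bt709_command("input.mp4", "output.mp4")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine with FFmpeg installed, and the result checked with `ffprobe` to confirm the tags survived.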
If we ever had a time when most displays were the same resolution, those days are long gone, with smartphones and tablets with extremely high pixel density nestled in with laptop screens of various resolutions and 1080-line TVs which are gradually being replaced with UHD variants. This means that HD videos are nearly always being upscaled, which makes ‘getting upscaling right’ a really worthwhile topic. The well-known basic up/downscaling algorithms have been around for a while, and even the best-performing, Lanczos, is well over 20 years old. The ‘new kid on the block’ isn’t another algorithm, it’s a whole technique of inferring better upscaling using machine learning called ‘super resolution’.
Nick Chadwick from Mux has been running the code and the numbers to see how well super resolution works. Taking to the stage at Demuxed SF, he starts by looking at where scaling is used and what type it is. The most common algorithms are nearest neighbour, bi-linear, bi-cubic and Lanczos, with nearest neighbour being the most basic and the worst performing. Using VMAF to score both upscaling and downscaling, Nick shows that the traditional opinions of how well these algorithms perform are valid. He then introduces some test videos which are designed to let you see whether your video path is using bi-linear or bi-cubic upscaling, presenting his results of where bi-cubic can be seen (Safari on a MacBook Pro) as opposed to bi-linear (Chrome on a MacBook Pro). The test videos are available here.
In the next part of the talk, Nick digs a little deeper into how super resolution works and how he tested FFmpeg’s implementation of super resolution. Though he hit some difficulties in using this young filter, he is able to present some videos and shows that they are, indeed, “better to view”, meaning that the text looks sharper and is easier to read, with details being easier to pick out. It’s certainly possible to see some extra speckling introduced by the process, but the VMAF score is around 10 points higher, matching the subjective experience.
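To show why nearest neighbour sits at the bottom of the list, here is a minimal Python sketch that upscales a single row of pixel values with nearest neighbour and with linear interpolation, the one-dimensional core of bi-linear scaling. It is a simplified illustration written for this summary: the function names are hypothetical, and production scalers align samples on pixel centres and work in two dimensions.

```python
from typing import List, Sequence


def upscale_nearest(row: Sequence[float], factor: int) -> List[float]:
    """Repeat each source sample factor times."""
    return [row[i // factor] for i in range(len(row) * factor)]


def upscale_linear(row: Sequence[float], factor: int) -> List[float]:
    """Blend the two nearest source samples by distance (1-D bi-linear)."""
    out = []
    last = len(row) - 1
    for i in range(len(row) * factor):
        pos = min(i / factor, last)   # position in source coordinates, clamped at the edge
        left = int(pos)
        right = min(left + 1, last)
        frac = pos - left
        out.append(row[left] * (1 - frac) + row[right] * frac)
    return out


edge = [0, 100]                       # a hard black-to-grey edge
blocky = upscale_nearest(edge, 2)     # [0, 0, 100, 100]
smooth = upscale_linear(edge, 2)      # [0.0, 50.0, 100.0, 100.0]
```

Nearest neighbour simply duplicates pixels, which gives blocky edges, whereas linear interpolation introduces intermediate values. Bi-cubic and Lanczos extend the same idea by weighting more neighbouring samples.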
The downsides are a very significant increase in computational power needed which limits its use in live applications plus there is a need for good, if not very good, understanding of ML concepts and coding. And, of course, it wouldn’t be the online streaming community if clients weren’t already being developed to do super-resolution on the decode despite most devices not being practically capable of it. So Nick finishes off his talk discussing what’s in progress and papers relating to the implementation of super resolution and what it can borrow from other developing technologies.
Views and opinions expressed on this website are those of the author(s) and do not necessarily reflect those of SMPTE or SMPTE Members.
This website is presented for informational purposes only. Any reference to specific companies, products or services does not represent promotion, recommendation, or endorsement by SMPTE.