Video: RTMP: A Quick Deep-Dive

RTMP hasn’t left us yet, although between HLS, DASH, SRT and RIST the industry is doing its best to get rid of it. When it was introduced, RTMP’s latency was seen as low and it became a de facto standard. As it hasn’t gone away, it pays to take a little time to understand how it works.

Nick Chadwick from Mux is our guide in this ‘quick deep-dive’ into the protocol itself. To start off he explains the history of the Adobe-created protocol to help put into context why it was useful and how the specification that Adobe published wasn’t quite as helpful as it could have been.

Nick then gives us an overview of the protocol, explaining that it’s TCP-based and allows for multiple, bi-directional streams. He explains that RTMP multiplexes larger messages, say video, along with very short data requests, such as RPC, by breaking the messages down into chunks which can be interleaved over just the one TCP connection. Multiplexing at this chunk level allows RTMP to ask the other end a question at the same time as delivering a long message.
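As a rough illustration of that chunking idea, here is a minimal Python sketch. It is a toy rather than an RTMP implementation: the 128-byte figure is RTMP’s default chunk size, but the function names, the stream IDs, the payloads and the strict round-robin ordering are all assumptions made for the example, and real chunks carry headers which are left out here.

```python
from itertools import zip_longest

CHUNK_SIZE = 128  # RTMP's default maximum chunk size in bytes


def split_into_chunks(stream_id, payload, chunk_size=CHUNK_SIZE):
    """Cut one message into (stream_id, piece) chunks of at most chunk_size bytes."""
    return [
        (stream_id, payload[i:i + chunk_size])
        for i in range(0, len(payload), chunk_size)
    ]


def interleave(*chunked_messages):
    """Round-robin chunks from several messages onto one ordered 'connection'."""
    wire = []
    for group in zip_longest(*chunked_messages):
        wire.extend(chunk for chunk in group if chunk is not None)
    return wire


def reassemble(wire):
    """Rebuild each message by concatenating its chunks in arrival order."""
    messages = {}
    for stream_id, piece in wire:
        messages[stream_id] = messages.get(stream_id, b"") + piece
    return messages


video = bytes(1000)               # stand-in for a large video message
rpc = b"hypothetical-rpc-call"    # stand-in for a short command (made-up name)

# Stream IDs 4 and 3 are arbitrary values chosen for the example.
wire = interleave(split_into_chunks(4, video), split_into_chunks(3, rpc))
```

The short command goes out as the second chunk on the wire rather than waiting behind all eight video chunks, and `reassemble(wire)` still recovers both messages intact.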

Nick has a great ability to make describing the protocol and showing ASCII tables accessible and interesting. We quickly start looking at the chunk header, with Nick explaining what the different chunk types are and how you can compress the headers to save bitrate. He also describes how the RTMP timestamp works, along with the control message and command message mechanisms. Before answering Q&A questions, Nick outlines the difficulty of extending RTMP to new codecs due to the hard-coded list of codecs that can be used, and recommends improvements to the protocol. It’s worth noting that this talk is from 2017. Whilst everything about RTMP itself will still be correct, it’s worth remembering that SRT, RIST and Zixi have taken the place of a lot of RTMP workflows.
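To make the header compression concrete, the sketch below parses the first part of a chunk, the ‘basic header’, following the layout in Adobe’s published RTMP specification as far as this write-up understands it. The function and table names are invented for the example, and the extended timestamp field is ignored.

```python
# Message-header length in bytes for each chunk format (fmt) value.
# fmt 0 carries a full header; fmt 3 reuses everything from the previous chunk.
MESSAGE_HEADER_BYTES = {0: 11, 1: 7, 2: 3, 3: 0}


def parse_basic_header(data):
    """Return (fmt, chunk_stream_id, basic_header_length) from the start of a chunk."""
    first = data[0]
    fmt = first >> 6            # top two bits select the header format
    csid = first & 0x3F         # low six bits: chunk stream ID, or an escape value
    if csid == 0:               # two-byte form: IDs 64..319
        return fmt, data[1] + 64, 2
    if csid == 1:               # three-byte form: IDs 64..65599
        return fmt, data[2] * 256 + data[1] + 64, 3
    return fmt, csid, 1         # one-byte form: IDs 2..63


fmt, csid, length = parse_basic_header(bytes([0xC4]))
total_header = length + MESSAGE_HEADER_BYTES[fmt]
```

For the byte `0xC4` this gives format 3 on chunk stream 4, so the whole header is a single byte instead of the 12 bytes a format 0 chunk would need on the same stream. That difference is the saving Nick describes.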

Watch now!
Speaker

Nick Chadwick
Software Engineer,
Mux

Video: Scalable Per-User Ad Insertion in Live OTT

Targeted ads are the most valuable ads, but making sure the right person gets the right ad is tricky, not only in deciding who to show which ad to, but also in scaling – and keeping track of – the ad infrastructure to thousands or millions of viewers. This video explains how this complexity arises and the techniques that Hulu have implemented to improve the situation.

Zachary Cava from Hulu lays out the way that standard advertising works for live streams. Whilst he uses MPEG DASH as an example, much the same is true of HLS. This starts with cutting up the video into sections which all start with an IDR frame for seeking. SCTE 35 is used to indicate times when ads can be inserted. These are called SCTE Markers. As DASH has the principle of defining a period (exactly as it sounds, just a way of marking a section of time), we can define periods of ‘programme’ and periods for ‘ads’. This allows the possibility of swapping out a whole period for a section of several ads.

If it were as simple as just swapping out whole periods, that would be Server-Side Ad Insertion. For per-user targeted ads, the streaming service has to keep track of every ad which was given to a user so that when they rewind, they have a consistent experience. This can mean remembering millions of ads for services which have a large rewind buffer. Moreover, traffic can become overwhelming because, since the requests are unique, a CDN can’t help by caching them. Whilst you can scale your system, the cost can spiral beyond what the ad revenue justifies.

Enter MPD Patch Requests. This addition to MPEG DASH requires the client to remember the whole of the manifest. Where the client has a gap in its knowledge, it can simply request that section from the server, which generates a ‘diff’, returning only the changes, which the client then assimilates into memory. The benefit here is that all the clients end up converging on only requesting what’s happening ‘now’ and so CDNs come back into play. Zachary explains how this works in more detail and shows examples before explaining how URLQueryInfo helps reduce the complexity of URL parameters, again in order to interoperate better with CDNs, and allows the ad system to be scaled separately from the main video assets.
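The ‘diff’ idea can be sketched in a few lines of Python. This is only a toy model under loud assumptions: real MPD patches are XML documents, whereas here a manifest is a plain list of period dictionaries, and the names `make_patch` and `apply_patch` are invented for the illustration.

```python
def make_patch(server_periods, client_last_id):
    """Server side: return only the periods newer than the last one the client holds."""
    ids = [p["id"] for p in server_periods]
    # If the client's last period is unknown, fall back to sending everything.
    start = ids.index(client_last_id) + 1 if client_last_id in ids else 0
    return server_periods[start:]


def apply_patch(client_periods, patch):
    """Client side: append the new periods to the manifest kept in memory."""
    return client_periods + patch


server = [
    {"id": "p1", "kind": "programme"},
    {"id": "p2", "kind": "ads"},
    {"id": "p3", "kind": "programme"},
]
client = server[:2]                            # client already knows p1 and p2
patch = make_patch(server, client[-1]["id"])   # only p3 travels over the network
client = apply_patch(client, patch)
```

Every client that is up to date asks for the same small ‘what is new since p2’ response, which is what makes the request cacheable again.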

Finally, Zachary takes a look at coming back from an ad break, where you may find that your ads were longer than the ad period allotted or that the programme hasn’t returned before the ads finished. During the ad break, the client is still polling for updates so it’s possible to quickly update the manifest and swap back to programme video early. Similarly, at the end of a break, if there is still no content, the server can start issuing its own ad or content, effectively moving back to server-side ad insertion. However, this is not necessarily just plain ad insertion, explains Zachary; rather, Hulu call it ‘Server-Guided’ ad insertion. There is no stitching on the server, but the server is informing you where to get the next video from. It also allows for some levels of user separation where some larger geographies can see different ads to those from other areas.

Zachary finishes by outlining the work Hulu is doing to feed this learning back into the DASH spec via the DASH Industry Forum, and their work with the industry at large to bring more consistency to SCTE 35 markers.

Watch now!
Speaker

Zachary Cava
Software Architect,
Hulu

Video: CDN Trends in FPGAs & GPUs

As technology continues to improve, immersive experiences are all the more feasible. This video looks at how the CDNs can play their part in enabling technologies which seem to rely on fast, local, compute. However, as with many internet services, low latency is very important.

Greg Jones from Nvidia and Nehal Mehta from Intel give us the lowdown in this video on what’s happening today to enable low-latency CDNs and what the future might look like. Intel, owners of FPGA makers Altera, and Nvidia are both interested in how their products can be of as much service at the edge as in the core datacentres.

Greg is involved in XR development at Nvidia. ‘XR’ is a term which refers to an outcome rather than any specific technology. Ostensibly ‘eXtended’ reality, it includes some VR, some augmented reality and anything else which helps improve the immersive experience. Greg explains the importance of getting the ‘motion to photon’ delay to within 20ms. CDNs can play a role in this by moving compute to the edge. This tracks with the current trend of wanting to reduce backhaul, with edge computation already on the rise.

Greg also touches on recent power improvements in newer GPUs. This is similar to what we heard the other day from Gerard Phillips from Arista, who said that switch manufacturers were still using technology that CPUs were on several years ago, meaning there’s plenty in the bank for speed increases over the coming years. According to Greg, the same is true for GPUs. Moreover, it’s important to compare compute per watt rather than in absolute terms.

Nehal Mehta explains that, in the same way that GPUs can offload certain tasks from the CPU, so can FPGAs. At scale, this can be critical for tasks like deep packet inspection, encryption or even dynamic ad insertion at the edge.

The second half of the video looks at what’s happening during the pandemic. Nehal explains that the need for encryption has increased, and Greg sees that large engineering functions are now, or are soon likely to be, done in the cloud. Greg sees XR as going a long way to helping people collaborate around a large digital model, which may help to reduce travel.

The last point made is regarding video conferencing all day long leaving people wanting “more meaningful interactions”. We are seeing attempts at richer and richer meeting experiences, both with and without XR.
Watch now!
Speakers

Greg Jones
Global Business Development, XR
NVIDIA
Nehal Mehta
Director Visual Cloud, CDN Segment,
Intel
Moderator: Tim Siglin
Founding Executive Director,
Help Me Stream

Video: Doing Better Congestion Control with BBR & Copa

In networking there are many possible bottlenecks, but the most pervasive is congestion caused by links operating at capacity and saturating the buffers. Full buffers are unable to absorb incoming traffic, increasing the chances of dropped packets. On top of that, the extra latency added by full buffer after full buffer quickly adds up, and this further degrades the quality of the connection for the data that does make it through.

It’s no surprise, then, that a lot of work goes into finding the best ‘congestion’ algorithms to allow data senders to back off when a link stops responding well. This talk, from Facebook engineer Nitin Garg, examines old and new approaches to keeping streams fast and responsive by running a 4-million-data-point test of three contenders: Cubic, BBR and Copa.


Nitin starts by introducing what we mean by ‘congestion’, and how and why it occurs. The simple example is that your computer can send data, typically, at up to 1Gbps, yet your uplink to the internet is likely below this number. So congestion control is a feedback mechanism which lets your computer realise that sending at 1Gbps isn’t working and allows it to throttle back to a speed which fits within your upload bandwidth. The same is true further down the pipe. If you have a 50Mbps uplink to the internet, but you are sending to a server which only has 10Mbps left, your computer needs to throttle not just below 50Mbps, but below 10Mbps.
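The example above can be put into a small Python sketch. The first function simply takes the slowest link on the path; the second is a toy feedback loop using the classic additive-increase/multiplicative-decrease pattern. That pattern is chosen here purely as an illustration of feedback and is not how Cubic, BBR or Copa behave; the step size, back-off factor and function names are assumptions for the example.

```python
def bottleneck_mbps(link_capacities_mbps):
    """The usable end-to-end rate is capped by the slowest link on the path."""
    return min(link_capacities_mbps)


def aimd(capacity_mbps, start_mbps=1.0, step_mbps=1.0, backoff=0.5, rounds=50):
    """Toy feedback loop: speed up while data gets through, back off on overload."""
    rate = start_mbps
    history = []
    for _ in range(rounds):
        if rate > capacity_mbps:   # stands in for loss or rising delay feedback
            rate *= backoff        # multiplicative decrease
        else:
            rate += step_mbps      # additive increase
        history.append(rate)
    return history


path = [1000, 50, 10]              # NIC, uplink, far-end server (Mbps)
limit = bottleneck_mbps(path)
rates = aimd(limit)
```

The sender never learns the 10Mbps figure directly. It only sees that pushing past it stops working, so its rate ends up oscillating just around the bottleneck.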

We then walk through how Cubic, BBR and Copa work, with Nitin explaining the differences. Copa (https://web.mit.edu/copa/) is the newest of the protocols. It comes from MIT and has the unique ability to be tuned to your need: throughput or low latency. As discussed above, to keep latency down, buffer occupancy needs to be minimised, which stops you being aggressive in loading up links. This puts latency and throughput at opposite ends of a see-saw.
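That see-saw can be shown numerically. The sketch below uses the target-rate rule from the published Copa description as this write-up understands it: the target rate is 1 / (delta × queuing delay), where the queuing delay is the recent ‘standing’ RTT minus the minimum RTT seen. The function name and the RTT and delta values are illustrative assumptions and are not figures from Nitin’s test.

```python
def copa_target_rate(rtt_standing_s, rtt_min_s, delta=0.5):
    """Target sending rate in packets/second: 1 / (delta * queuing_delay)."""
    queuing_delay = rtt_standing_s - rtt_min_s
    if queuing_delay <= 0:
        return float("inf")    # no measured queue: no delay-based cap in this sketch
    return 1.0 / (delta * queuing_delay)


# Same network conditions (10ms of queuing), two different tunings.
latency_tuned = copa_target_rate(0.060, 0.050, delta=0.5)
throughput_tuned = copa_target_rate(0.060, 0.050, delta=0.1)
```

With the same measured queue, the smaller delta asks for a higher sending rate and so tolerates more buffering, while the larger delta holds the rate down to keep queues, and therefore latency, short.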

Nitin’s test was run on mobile phones using Facebook’s Live streaming app on Android and iOS. The app streams with ABR, adapting to ensure that it is streaming at as high a quality as possible while being willing to reduce the bitrate when needed. Testing from global markets, they measured round trip times and the amount of delivered data. Nitin walks through the results for both latency and throughput and shows that when Copa is optimised for latency, in the worst conditions it leads the other two protocols in latency reduction.

Watch now!
Speakers

Nitin Garg
Software Engineer, Videos Infra,
Facebook