Moving high bitrate flows such as uncompressed media through cloud infrastructure = which is designed for scale rather than real-time throughput requires more thought than simply using UDP and multicast. That traditional approach can certainly work, but is liable to drop the occasional packet compromising the media.
In this video, Thomas Edwards and Evan Statton outline the work underway at Amazon Web Services (AWS) for reliable real-time delivery. On-prem 2110 network architectures usually have two separate networks. Media essences are sent as single, high bandwidth flows over both networks allowing the endpoint to use SMPTE ST 2022-7 seamless switching to deal with any lost packets. Network architectures in the cloud differ compared to on-prem networks. They are usually much wider and taller providing thousands of possible paths to get to any one destination.
AWS have been working to find ways of harnessing the cloud network architectures and have come up with two protocols. The first to discuss is Scalable Reliable Delivery, SRD, a protocol created by Amazon which guarantees delivery of packets. Delivery is likely to be out of order, so packet order needs to be restored by a layer above SRD. Amazon have custom network cards called ‘Nitro’ and it’s these cards which run the SRD protocol to keep the functionality as close to the physical layer as possible.
SRD capitalises on hyperscale networks by splitting each media flow up into many smaller flows. A high bandwidth uncompressed video flow could be over 1 Gbps. SRD would deliver this over one or more hundred ‘flowlets’ each leaving on a different path. Paths are partially controlled using ECMP, Equal Cost Multipath, routing whereby the egress port used on a switch is chosen by hashing together a number of parameters such as the source IP and destination port. The sender controls the ECMP path selection by manipulating packet encapsulation. SRD employs a specialized congestion control algorithm that helps further decrease the chance of packet drops and minimize retransmit times, by keeping queuing to a minimum. SRD keeps an eye on the RTT (round trip time) of each of the flowlets and adjusts the bandwidth appropriately. This is particularly useful as a way to deal with the problem where upstream many flowlets may end up going through the same interface which is close to being overloaded, known as ‘incast congestion’. In this way, SRD actively works to reduce latency and congestion. SRD is able to monitor round trip time since it also has a very small retransmit buffer so that any packets which get lost can be resent. Similar to SRT and RIST, SRD does expect to receive acknowledgement packets and by looking at when these arrive and the timing between packets, RTT and bandwidth estimations can be made.
CDI, the Cloud Digital Interface, is a layer on top of SRD which acts as an interface for programmers. Available on Github under a BSD licence, it gives access to the incoming essence streams in a way similar to SMPTE’s ST 2110 making it easy to deal with pixel data, get access to RGB graphics including an alpha layer as well as providing metadata information for subtitles or SCTE 104 signalling.
Thomas Edwards Principal Solutions Architect & Evangelist, Amazon Web Services |
|
Evan Statton Principal Architect, Amazon Web Services (AWS) |