“The AV over IP market has really matured [giving us] great quality, low latency and the kind of stability and features that customers are looking for,” says Andrew Starks from Macnica Technology. If that’s the case, why do we need another standard by the name of IPMX? Intended to open up the AV-over-IP market and provide customers with a better deal, Andrew takes us through the motivations of AIMS, AMWA, VSF, SMPTE and the other organisations involved.
IPMX is a set of open standards and specifications which seek to bring a technology platform to the Pro AV industry on which all vendors can interoperate and innovate. Built on SMPTE’s ST 2110 suite of standards and the accompanying NMOS APIs from AMWA, IPMX adds essential capabilities such as HDMI, HDCP and USB support to create a complete and reliable foundation for AV events and installations.
Whilst there are a number of successful AV initiatives such as SDVoE, these are typically alliances built around a single-vendor hardware solution which is available to vendors in the alliance. This provides interoperability within that ecosystem but, explains Andrew, it prevents wider interoperability between vendors of different alliances. It also makes it hard to any vendor to innovate in the core feature set since that’s delivered from the single source relegating innovation to ‘plumbing’. For the vendors, at best, this means they have to contend with multiple, incompatible product lines and complicated support. Overall this results in a bad end user experience as they operate multiple islands which can have conflicting network requirements, i.e. 10GbE vs 1GbE.
IPMX can be implemented in software as well as hardware using compressed or uncompressed video with a focus on fully featured discovery as this has been identified as being as important as the ability to carry video. Timing has been made flexible such that it can operate with or without PTP which is one of a number of ways that it’s anticipated IPMX will be able to merge in with ST 2110 infrastructures.
Andrew finishes off his talk with a look at the tech stack of IPMX with layer 2 options from 1 to 100GbE connections supported on which RTP and PTP run. SMPTE’s ST 2110 standards feature heavily alongside a new standard for HDCP in 2110, a VSF spec for FEC and new specifications from AMWA for asynchronous control traffic like EDID, Serial, CEC, USB etc. Finally, there are the main APIs such as IS-04, -05 etc. as well as the application layer which uses OAuth2 for authenticating and has an RDS server for discovery. Lastly, there is a look at the JT-NM roadmap to see how the IPMX work will continue to advance throughout this year.
WebRTC continues to live two lives; one of massive daily use in video conferencing in apps from Google, Facebook as well as many others, and one as a side-lined streaming protocol in hte broadcast and streaming industry. WebRTC is now an IETF/W3C standard, is a decade old and is seeing continued work and innovation from Google, other large companies and smaller specialists pushing it forward.
In this extended Streaming Media Connect video with Millicast’s Ryan Jespersen, we explore where WebRTC is up to now, how it can replace RTMP, how real-time AV1 not only shows the innovation within the technology but also enables several use cases and upcoming technologies such as end-to-end encryption for streaming workflows. The video is in sections: product demos, technology discussion and overviews of use cases.
A clear first question is why bother with WebRTC at all. Ryan’s quick to point out that WebRTC is in daily use not only in many of the big video call apps but also in Clubhouse, the high-scale WebRTC-based interactive audio platform. He also establishes that it’s commonly in use on CDNs such as Limelight and Millicast to deliver ultra-low-latency streams to end-users for auctions, gambling and interactive streams, but also as part of broadcast workflows. NFL, for instance, used WebRTC for low-latency monitoring of 122 cameras for the Super Bowl. As far as end-users are concerned, Ryan sees the ‘interactivity’ market as a way, as yet untapped, to release value in many verticals and will be the fastest-growing sector of the streaming industry over the next few years.
Looking back at Flash, Ryan explains that we came from a point where we had a low-latency protocol in the name of RTMP. Its latency was in the realms of 1 to 3 seconds, it had end-to-end security, encoder control and interactivity. RTMP was displaced due to three main factors, security concerns, rejection of the proprietary nature of the protocol and the move to HLS which provided improved scalability and was enthusiastically adopted by CDNs.
WebRTC, Ryan contends, learns from the mistakes of RTMP. WebRTC has ways to recover lost packets, is content agnostic, has a solution for NAT traversal, is non-proprietary and has no plugins. These latter two points address many of the security concerns of RTMP. Now a standard, the W3C has documented many upcoming use cases for this free, Open Source, technology.
Why, then, do we not see WebRTC much more prevalent in video streaming such as Netflix or Peacock? This is a question that Russell Trafford-Jones discussed in this IBC panel with nanocosmos, M2A and VisualOn. One view from that panel is that sub-second is lower than needed for some services. For instance, a public broadcaster may not wish to deliver online faster than it does over the air. Also, there’s a quality issue to contend with. One strength of WebRTC is that it prioritises latency over quality, always. This is great for face-to-face communication, but tier-1 broadcasters want people to see video in the same quality that left their encoders and if that means waiting for packets to be recovered instead of showing an impaired signal, that’s what they will do. As ever, therefore, this is a business decision that has to pay careful attention to the needs of the viewers, the quality aspirations of the viewers and broadcaster/provider as well as the technical pros and cons of each approach.
Ryan tlks about Real-time AV1 in WebRTC covered also in this talk
Moving on to AV1, Ryan explains that this royalty-free codec has been sped up significantly since the early days when it required thousands of CPUs for real-time encoding. Using AV1 is a boon for WebRTC for two reasons: screen content and scalable video coding. Screen Content Coding is a set of techniques to adapt encoding specifically for screen content meaning computer graphics whether that be in games or just sending a computer desktop. With straighter lines and the possibility for many parts of the screen to be identical or close to identical to other parts, it’s possible to get much better encoding for screen content if you can detect it and optimise for it.
Ryan moves on to AV1’s use in shoring up security. Although a codec and not a security measure in and of itself, AV1’s ability to send multiple resolutions in one stream is a big deal for securing communications. Scalable video coding, SVC, is not a new technology, but AV1 is one of the first mainstream, modern codecs which has it by default. This enables an encoder to encode to, say, sub-SD, SD and HD resolution and send these all at once in one stream. These are not simply 3 encodes squeezed down the same pipe, but they encode that build on top of each other. The sub-HD provides a foundation on which the SD feed provides enhancement information. You need both the sub-SD and SD layer to get SD. Adding on the HD layer to those two gives you that full-resolution HD. By only delivering the extra information needed for HD rather than all the underlying data again, a lot of bitrate can be saved. Importantly, by generating all the encoding at the source, you can encrypt at the source for an end-to-end encrypted workflow and also deliver multiple bitrates. Ryan explains that the move to ABR streaming, whether HLS, DASH or otherwise breaks the end-to-end security model as the need to transcode the media necessitates being able to view it. Using AV1’s SVC is one way around the need for mid-workflow transcoding.
One aspect is missing, though, for modern streaming workflows. If you don’t want to do peer-to-peer networking, some form of traffic manipulation will be needed in your CDN and/or delivery infrastructure. This is why Ryan says that Millicast has proposed that ‘secure frames’ are added to the WebRTC spec. Whilst this talk doesn’t detail their functionality they add a way of encrypting data twice such that the media can be encrypted for end-to-end workflows, but also each hop can be separately encrypted. This provides just enough access to the metadata of the stream for traffic manipulation, but without allowing access to the underlying media.
As the video comes to end, Ryan gives us a glimpse into one other upcoming technology that may be added to WebRTC called WHIP. The RFC explains the intention of WHIP:
The WebRTC-HTTP ingest protocol (WHIP) uses an HTTP POST request to
perform a single shot SDP offer/answer so an ICE/DTLS session can be
established between the encoder/media producer and the broadcasting
Once the ICE/DTLS session is set up, the media will flow
unidirectionally from the encoder/media producer broadcasting
ingestion endpoint. In order to reduce complexity, no SDP
renegotiation is supported, so no tracks or streams can be added or
removed once the initial SDP O/A over HTTP is completed.
Ryan closes his video with a demonstration of the Millicast platform and looks at how other use cases might be architected such as watch parties.
Synchronised origins in streaming means that a player can switch from one origin to another without any errors or having to restart decoding allowing a much more seamless viewing experience. Adam Ross, speaking from his experience on the Comcast linear video packing team, takes us through the pros and cons of two approaches to synchronisation. This discussion centres around video going into an encoder, transcoder and then packager. This video is either split from a single source which helps keep the video and audio clocks aligned or the clocks are aligned in the encoder or transcoder through communication site A and B.
Keeping segments aligned isn’t too difficult as we just need to keep naming the same and keep them timed together. Whilst not trivial, manifests have many more layers of metadata to synchronised in the form of short-term metadata like content currently present in the manifest and long-term metadata like the dash period. For DASH streams, the [email protected] and [email protected] need to be the same. SegmentTimelines need to have the same start number mapping to the same content. For HLS, variant playlists need to be the same as well as the sequence numbering.
Adam proposes two methods of doing this. the first is Co-operative Packaging where each site sends metadata between the packagers so that they each make the same, more informed decisions. However, this is complicated to implement and produces a lot of cross-site traffic which can live-point introduce latency. The alternative is a Minimal Synchronisation strategy which relies much more on determinism. Given the same output from the transcoder, the packagers should make the same decisions. Each packager does still need to look at the other’s manifest to ensure it stays in sync and it can resync if not deemed impactful. Overall this second method is much simpler.
PTP is foundational for SMPTE ST 2110 systems. It provides the accurate timing needed to make the most out of almost zero-latency professional video systems. In the strictest sense, some ST 2110 workflows can work without PTP where they’re not combining signals, but for live production, this is almost never the case. This is why a lot of time and effort goes into getting PTP right from the outset because making it work perfectly from the outset gives you the bedrock on which to build your most valuable infrastructure upon.
In this video, Gerard Phillips from Arista, Leigh Whitcomb from Imagine Communications and Telestream’s Mike Waidson join forces to run down their top 15 best practices of building a PTP infrastructure you can rely on.
Gerard kicks off underlining the importance of PTP but with the reassuring message that if you ‘bake it in’ to your underlying network, with PTP-aware equipment that can support the scale you need, you’ll have the timing system you need. Thinking of scale is important as PTP is a bi-directional protocol. That is, it’s not like the black and burst and TLS that it replaces which are simply waterfall signals. Each endpoint needs to speak to a clock so understanding how many devices you’ll be having and where is important to consider. For a look a look at PTP itself, rather than best practices, have a look at this talk free registration required or this video with Meinberg.
Gerard’s best practices advice continues as he recommends using a routed network meaning having multiple layer 2 networks with layer 3 routing between This reduces the broadcast domain size which, in turn, increases stability and resilience. JT-NM TR-1001 can help to assist in deployments using this network architecture. Gerard next cautions about layer 2 IGMP snoopers and queriers which should exist on every VLAN. As the multicast traffic is flooded to the snooping querier in layer 2, it’s important to consider traffic flows.
When Gerard says PTP should be ‘baked in’, it’s partly boundary clocks he’s referring to. Use them ‘everywhere you can’ is the advice as they bring simplicity to your design and allow for easier debugging. Part of the simplicity they bring is in helping the scalability as they shed load from your GM, taking the brunt of the bi-directional traffic and can reduce load on the endpoints.
It’s long been known that audio devices, for instance, older versions of Dante before v4.2, use version one of PTP which isn’t compatible with SPMTE ST 2059’s requirement to use PTP v2. Gerard says that, if necessary, you should buy a version 1 to version 2 converter from your audio vendor to join the v1 island to your v2 infrastructure. This is linked to best practice point 6; All GMs must have the same time. Mike makes the point that all GMs should be locked to GPS and that if you have multiple sites, they should all have an active, GPS-locked GM even if they do send PTP to each other over a WAN as that is likely to deliver less accurate timing even if it is useful as a backup.
Even if you are using physically separate networks for your PTP and ST 2110 main and backup networks, it’s important to have a link between the two GMs for ST 2022-7 traffic so a link between the two networks just for PTP traffic should be established.
The next 3 points of advice are about the ongoing stability of the network. Firstly, ST 2059-2 specifies the use of TLV messages as part of a mechanism for media notes to generate drop-frame timecode. Whilst this may not be needed day 1, if you have it running and show your PTP system works well with it on, there shouldn’t be any surprises in a couple of years when you need to introduce an end-point that will use it. Similarly, the advice is to give your PTP domain a number which isn’t a SMPTE or AES default for the sole reason that if you ever have a device join your network which hasn’t been fully configured, if it’s still on defaults it will join your PTP domain and could disrupt it. If, part of the configuration of a new endpoint is changing the domain number, the chances of this are notably reduced. One example of a configuration item which could affect the network is ‘ptp role master’ which will stop a boundary clock from taking part in BCMA and prevents unauthorised end-points taking over.
Gerard lays out the ways in which to do ‘proper commissioning’ which is the way you can verify, at the beginning, that your PTP network is working well-meaning you have designed and built your system correctly. Unfortunately, PTP can appear to be working properly when in reality it is not for reasons of design, the way your devices are acting, configuration or simply due to bugs. To account for this, Gerard advocates separate checklists for GM switches and media nodes with a list of items to check…and this will be a long list. Commissioning should include monitoring the PTP traffic, and taking a packet capture, for a couple of days for analysis with test and measurement gear or simply Wireshark.
Leigh finishes up the video talking about verifying functionality during redundancy switches and on power-up. Commissioning is your chance to characterise the behaviour of the system in these transitory states and to observe how equipment attached is affected. His last point before summarising is to implement a PTP monitoring solution to capture the critical parameters and to detect changes in the system. SMPTE RP 2059-15 will define parameters to monitor, with the aim that monitoring across vendors will provide some sort of consistent metrics. Also, a new version of IEEE-1588, version 2.1, will add monitoring features that should aid in actively monitoring the timing in your ST 2110 system.
Views and opinions expressed on this website are those of the author(s) and do not necessarily reflect those of SMPTE or SMPTE Members.
This website is presented for informational purposes only. Any reference to specific companies, products or services does not represent promotion, recommendation, or endorsement by SMPTE