In this talk from Seattle Video Tech, HLS.js maintaner Rob Walch talks us through HLS.js from 2015 to today and touches on future capabilities. Rob takes us through some analytics comparing HLS.js performance against playback with the native player in a range of browsers and shows better bitrate stability. The player today is built on the work under Guillaume but also John Bartos, the previous maintainer who also worked on the LHLS protocol which, while useful, was ultimately sidelined by Apple when they launched ‘Low Latency HLS‘. Needless to say that HLS.js is the result of many contributors.
Rob takes us through his demo page which has a lot of analysis tools showing the current and next chunks, playback timelines, real-time metrics, details on audio tracks, live stats for Apple LL-HLS and much more. The session concludes with questions about how browser agnostic HLS.js is, its support for DRM including on streams with ads in the clear and encyrpted content, support for SSAI discontinuities and testing.
Low Latency Dash also known as LL-DASH is a modification of MPEG DASH to allow it to operate with close to two seconds’ latency bringing it down to meet, or beat, standard broadcast signals.
Brightcove’s Bo Zhang starts by outlining the aims and methods of getting there. For instance, he explains, the HTTP 1.1 Chunked Transfer element is key to low-latency streaming as it allows the server to start sending a video segment as its being written, not waiting until the file is complete. LL-DASH also has the ability to state an availability window (‘availabilityTimeOffset’).
As LL-MPEG DASH is a living standard, there are updates on the way: Resync points will allow a player to receive a list of places where it can join a stream using SAP types in the ISO-BMFF spec, the server can send a ‘service description’ to the player which can use the information to adjust its latency. Event messages can now be inserted in the middle of segments.
Bo then moves on to explain that he and the team have set up and experiment to gain experience with LL-DASH and test overall latency. He shows that they decided to stream RTMP out of OBS, into a github project called ‘node-gpac-dash’ then to the dash.js player all. between Boston and Seattle. This test runs at 800×600, 30fps with a bitrate of 2.5Mbps and shows results of between 2.5 and 5 seconds depending on the network conditions.
As Bo moves towards the Q&A, he says that low-latency streaming is less scalable because a TCP connection needs to be kept open between the player and the CDN which is a burden.
Another compromise is that smaller chunk sizes in LL-DASH give reduced latency but IO increases meaning sometimes you may have to increase the chunk sizes (and hence latency of the stream) to allow for better performance. He also adds that adverts are more difficult with low-latency streams due to the short amount of time to request and receive the advertising.
With us since 1998, ABR (Adaptive Bitrate) has been allowing streaming players to select a stream appropriate for their computer and bandwidth. But in this video, we hear that over 20 years on, we’re still developing ways to understand and optimise the performance of ABRs for delivery, finding the best balance of size and quality.
Brightcove’s Yuriy Reznik takes us deep into the theory, but start at the basics of what ABR is and why we. use it. He covers how it delivers a whole series os separate streams at different resolutions and bitrates. Whilst that works well, he quickly starts to show the downsides of ‘static’ ABR profiles. These are where a provider decides that all assets will be encoded at the same set bitrate of 6 or 7 bitrates even though some titles such as cartoons will require less bandwidth than sports programmes. This is where per-title and other encoding techniques come in.
Netflix coined the term ‘per-title encoding’ which has since been called content-aware encoding. This takes in to consideration the content itself when determining the bitrate to encode at. Using automatic processes to determine objective quality of a sample encode, it is able to determine the optimum bitrate.
Content & network-aware encoding takes into account the network delivery as part of the optimisation as well as the quality of the final video itself. It’s able to estimate the likelihood of a stream being selected for playback based upon its bitrate. The trick is combining these two factors simultaneously to find the optimum bitrate vs quality.
The last element to add in order to make this ABR optimisation as realistic as practical is to take into account the way people actually view the content. Looking at a real example from the US open, we see how on PCs, the viewing window can be many different sizes and you can calculate the probability of the different sizes being used. Furthermore we know there is some intelligence in the players where they won’t take in a stream with a resolution which is much bigger than the browser viewport.
Yuriy brings starts the final section of his talk by explaining that he brought in another quality metric from Westerink & Roufs which allows him to estimate how people see video which has been encoded at a certain resolution which is then scaled to a fixed interim resolution for decoding and then to the correct size for the browser windows.
The result of adding in this further check shows that fewer points on the ladder tend to be better, giving an overall higher quality value. Going much beyond 3 is typically not useful for the website. Shows only a few resolutions needed to get good average quality. Adding more isn’t so useful.
Yuriy finishes by introducing SSIM modeling of the noise of an encoder at different bitrates. Bringing together all of these factors, modelled as equations, allows him to suggest optimal ABR ladders.
The internet has been a continuing story of proprietary technologies being overtaken by open technologies, from the precursors to TCP/IP, to Flash/RTMP video delivery, to HLS. Understanding the history of why these technologies appear, why they are subsumed by open standards and how boost in popularity that happens at that transition is important to help us make decisions now and foresee how the technology landscape may look in five or ten years’ time.
This talk, by Jonn Simmons, is a talk of two halves. Looking first at the history of how our standards coalesced into what we have today will fill in many blanks and make the purpose of current technologies like MPEG DASH & CMAF clearer. He then looks at how we can understand what we have today in light of similar situations in the past answering the question of whether we are at an inflexion point in technology.
John first looks at the importance of making DRM-protected content portable in the same way as non-protected content was easy to move between computers and systems. This was in response to a WIPO analysis which, as many would agree, concluded that this was essential to enable legal video use on the internet. In 2008, Mircosoft analysed all the elements needed, beyond the simple encryption, to allow such media to be portable. It would require HTML extensions for delivery, DRM signalling, authentication, a standard protocol for Adaptive Delivery (also known as ABR) and an adaptive container format. We then take a walk through the timeline starting in 2009 through to 2018 seeing the beginnings and published availability of such technologies Common Encryption, MPEG DASH and CMAF.
John then walks through these key technologies starting with the importance of Common Encryption (also known as CENC). Previously all the DRM methods had their own container formats. Harmonisation of DRM is, likely, never going to happen so we’ll always have Apple’s own, Google’s own, Microsoft’s and plenty of others. For streaming providers, it’s a major problem to deliver all the different formats and makes for messy, duplicative workflows. Common Encryption allows for one container format which can contain any DRM information allowing for a single workflow with different inputs. On the player side, the player can, now, simply accept a single stream of DRM information, authenticate with the appropriate service and decode the video.
In the second part of the presentation, John asked ‘where will we end up?’ John draws upon two examples. One is the number of TCP/IP hosts between 1980 and 1992. He shows it was clear that when TCP/IP was publicly available there was an exponential increase in adoption of TCP/IP, moving on from proprietary network interfaces available in the years before. Similarly with websites between 1990 and 1997. Exponential growth happened after 1993 when the standard was set for Web Clients. This did take a few years to have a marked effect, but the number of websites moved from a flat ‘less than 100’ number to 600, then 10,000 in 1994 increasing to a quarter of a million by 1995 and then over one million in 1996. This shows the difference between the power ‘walled garden’ environments and the open internet.
John sees media technology today as still having a number of ‘traditional’ walled gardens such as DISH and Sky TV. He sees people self-serving multiple walled gardens to create their own larger pool of media options, typically known as ‘cord cutters’. He, therefore, sees two options for the future. One is ever larger walled gardens where large companies aggregate the content of smaller content owners/providers. The other option is having cloud services that act as a one-stop-shop for your media, but dynamically authenticate against whichever service is needed. This is a much more open environment without the need to be separately subscribing to each and every outlet in the traditional sense.
W3C Evangelist, Media & Entertainment
Subscribe to get daily updates
Views and opinions expressed on this website are those of the author(s) and do not necessarily reflect those of SMPTE or SMPTE Members.
This website is presented for informational purposes only. Any reference to specific companies, products or services does not represent promotion, recommendation, or endorsement by SMPTE