As we saw yesterday, there’s an increasingly buoyant market for video codecs. Whilst this is a breath of fresh air after AVC’s multi-decade dominance, we will likely never again see a market which isn’t fragmented: several dominant players, say AV1, AVC, VVC and VP9, will share around 85% of the market relatively equally, with ‘the rest’ bringing up the rear. So multi-codec distribution to home viewers is going to have to deal with delivering different codecs to different people.
fuboTV do this today and Nick Krzemienski is here to tell us how. Starting with an overview of what fuboTV primarily streams, both live and on VOD, Nick shows us the workflow they use and then explains how their combined AVC & HEVC workflow is set up. Starting with the ideal case where a single fMP4 is encoded into both AVC and HEVC, he proposes you would simply package both into an HLS and a DASH manifest and let players work out the rest. Depending on your players, you may have to split out your manifests into single-codec files.
DRM’s very important for a sports broadcaster so Nick looks at how this might be achieved. CMAF allows you to deliver m3u8 and mpd files using CENC (Common ENCryption). This promises a single DRM process ahead of packaging, but the reality, we hear from Nick, is that you’ll need two sets of media for HLS and DASH if you’re going to use CENC.
When you’re delivering multiple manifests and, hence, multiple sources, how do you manage this? Nick outlines how he achieves this at the edge and shows the code. Using Lambda, he’s able to look at the incoming requests and the files that exist at the CDN to deliver the right asset, with the logic done close to the viewer. Nick closes with his thoughts on the future of streaming and answers questions from the audience.
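The edge decision described above can be sketched in a few lines. This is only an illustration and not the code Nick shows: the device hints, the `_avc`/`_hevc` file naming and the `pick_manifest` function are all assumptions made for the example, and a real deployment would sit inside whatever request hook the CDN provides.

```python
# Hypothetical substrings that mark HEVC-capable devices (an assumption,
# not fuboTV's real rules).
HEVC_CAPABLE_HINTS = ("AppleTV", "iPhone", "iPad")


def pick_manifest(requested_uri: str, user_agent: str, available: set) -> str:
    """Return the manifest URI to serve for this request.

    requested_uri: the generic manifest the player asked for,
                   e.g. "/live/ch1/master.m3u8"
    available:     manifest URIs known to exist at the CDN
    """
    stem, _, ext = requested_uri.rpartition(".")
    wants_hevc = any(hint in user_agent for hint in HEVC_CAPABLE_HINTS)

    # Assumed naming scheme: master_hevc.m3u8 / master_avc.m3u8
    avc_uri = f"{stem}_avc.{ext}"
    preferred = f"{stem}_hevc.{ext}" if wants_hevc else avc_uri

    # Try the preferred codec first, then fall back to AVC,
    # then to whatever was originally requested.
    for candidate in (preferred, avc_uri):
        if candidate in available:
            return candidate
    return requested_uri
```

Checking the candidate against the files that actually exist means a missing HEVC rendition degrades to AVC rather than to a 404.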
These next 12 months are going to see 3 new MPEG standards being released. What does this mean for the industry? How useful will they be and when can we start using them? MPEG’s coming to the market with a range of commercial models to show it’s learning from the mistakes of the past so it should be interesting to see the adoption levels in the year after their release. This is part of the second session of the Vienna Video Tech Meetup and delves into startup time for streaming services.
In the first talk, Dr. Christian Feldmann explains the current codec landscape, highlighting the ubiquitous AVC (H.264), UHD’s friend HEVC (H.265), and the newer VP9 & AV1. The latter two differentiate themselves by being free to use and open, particularly AV1. Whilst slow, both of the latter are seeing increasing adoption in streaming, but no one’s suggesting that AVC isn’t still the go-to codec for most online streaming.
Christian then introduces the three new codecs, EVC (Essential Video Coding), LCEVC (Low-Complexity Enhancement Video Coding) and VVC (Versatile Video Coding), all of which have different aims. We start by looking at EVC, whose aim is to replicate the encoding efficiency of HEVC but, importantly, to produce a royalty-free baseline profile as well as a main profile which improves efficiency further but with royalties. This will be the first time that you’ve been able to use an MPEG codec in this way to eliminate your liability for royalty payments. There is further protection in that if any of the tools is found to have patent problems, it can be individually turned off, the idea being that companies can have more confidence in deploying the new technology.
The next codec in the spotlight is LCEVC which uses an enhancement technique to encode video. The aim of this codec is to enable lower-end hardware to access high resolutions and/or lower bitrates. This can be useful in set-top boxes and for online streaming, but also for non-broadcast applications like small embedded recorders. It can achieve a light improvement in compression over HEVC, but it’s well known that HEVC is very computationally heavy.
LCEVC reduces computational needs by only encoding a lower resolution version (say, SD) of the video in a codec of your choice, whether that be AVC, HEVC or otherwise. The decoder will then decode this and upscale the video back to the original resolution, HD in this example. This would look soft, normally, but LCEVC also sends enhancement data to add back in the edges and detail that would have otherwise been lost. This can be done in CPU whilst the other decoding could be done by the dedicated AVC/HEVC hardware and naturally encoding/decoding a quarter-resolution image is much easier than the full resolution.
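The base-plus-enhancement idea can be shown with a toy example on a single row of pixel values. This is only a sketch of the principle and not the LCEVC algorithm itself: the function names are invented for illustration, the ‘base codec’ is lossless here, the enhancement data is left unquantised, and the row is assumed to have an even number of samples.

```python
def downscale(row):
    """Halve the resolution by averaging neighbouring pairs."""
    return [(row[i] + row[i + 1]) // 2 for i in range(0, len(row) - 1, 2)]


def upscale(row):
    """Double the resolution by repeating each sample (deliberately soft)."""
    return [value for value in row for _ in range(2)]


def encode(row):
    """Return the low-resolution base plus the enhancement residuals."""
    base = downscale(row)        # in practice handed to AVC, HEVC or similar
    predicted = upscale(base)    # what a decoder would see without enhancement
    enhancement = [orig - pred for orig, pred in zip(row, predicted)]
    return base, enhancement


def decode(base, enhancement):
    """Upscale the base, then add the detail back in."""
    return [pred + extra for pred, extra in zip(upscale(base), enhancement)]


original = [10, 12, 200, 202, 50, 90, 90, 50]
base, enhancement = encode(original)
restored = decode(base, enhancement)
```

The base has half as many samples as the original, and adding the residuals to the upscaled base recovers the edges that upscaling alone would have smoothed away.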
Lastly, VVC goes under the spotlight. This is the direct successor to HEVC and is also known as H.266. VVC naturally has the aim of improving compression over HEVC by the traditional 50% target but also has important optimisations for more types of content such as 360 degree video and screen content such as video games.
To finish this first Vienna Video Tech Meetup, Christoph Prager lays out the reasons he thinks that everyone involved in online streaming should obsess about Video Startup Time, which he defines as the time between pressing play and seeing the first frame of video. The assumption is that the longer the wait, the more users won’t bother watching. To understand what video streaming should be like, he examines the example of Spotify, who have always had the goal of bringing the audio start time down to 200ms. Christoph points to this podcast for more details on what Spotify has done to optimise this metric, which includes activating GUI elements before, strictly speaking, they can do anything because the audio still hasn’t loaded. This, however, gives an impression of immediacy, with perception being half the battle.
“for every additional second of startup delay, an additional 5.8% of your viewership leaves”
Christoph also draws on Akamai’s 2012 white paper which, among other things, investigated how startup time puts viewers off. He also cites research from Snap, who found that within 2 seconds the entirety of the audience for that video would have gone. Snap, of course, specialise in very short videos but, taken with the right caveats, this could indicate that Akamai’s numbers, if the research were repeated today, may be higher for 2020. Christoph finishes up by looking at the individual components which add latency to the user experience: player startup time, DRM load time, ad load time and ad tag load time.
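To get a feel for what the quoted 5.8% figure means, the components listed above can be summed and fed into a simple linear model. This is a rough sketch: the timings are made up, reading the quote as a straight line is an assumption, and `grace_seconds` (how long viewers wait before anyone leaves) is a hypothetical parameter rather than something stated in the talk.

```python
ABANDONMENT_PER_SECOND = 0.058  # the 5.8% per additional second quoted above


def startup_time(components):
    """Total startup delay in seconds from its individual components."""
    return sum(components.values())


def estimated_abandonment(startup_seconds, grace_seconds=0.0):
    """Fraction of viewers lost, assuming a linear 5.8% per extra second."""
    extra = max(0.0, startup_seconds - grace_seconds)
    return min(1.0, ABANDONMENT_PER_SECOND * extra)


# Made-up example timings, in seconds.
components = {"player": 0.8, "drm": 0.6, "ad_tag": 0.4, "ad": 1.2}
total = startup_time(components)
lost = estimated_abandonment(total)
```

Breaking the total down this way shows which component is worth attacking first, since every second saved is worth the same in this model.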
Running the live streaming for an event can be fraught, so preparation needs to be the number one priority. In this talk, Robert Reinhardt, a highly experienced streaming consultant, takes us through choosing encoders, finding out what the client wants, helping the client understand what needs to be done, choosing software and ensuring the event stays on air.
This is a wide-ranging and very valuable talk for anyone who’s going to be involved with a live streaming event. In this article, I’ll highlight 3 of the big topics nestled in with the continuous stream of tips and nuances that Rob unearths.
System Architecture. Reliability is usually a big deal for live streaming and this needs to be a consideration not only in the streaming infrastructure in the cloud, but in contribution and the video equipment itself. No one wants to have a failed stream due to a failed camera, so have two. Can you afford a hardware switcher/vision mixer? Rob prefers hardware units in terms of reliability (no random OS reboots), but he acknowledges this is not always practical or possible. Audio, too, needs to be remembered and catered for. It’s always better to have black vision and hear the programme than to have silent video. Getting your streams from the event into the cloud can also be done resiliently, either by having dual streams into a Wowza server or similar, or by having some other switching in the cloud. Rob spends some time discussing whether to use AVC or HEVC, plus the encoder manufacturers that can help.
Discovery and Budget Setting. This is the most important part of Rob’s talk. Finding out what your customer wants to achieve in a structured, well recorded way is vital in order to ensure you meet their expectations and that their expectations are realistic. This discovery process can also be used as a way to take the customer through the options available and decisions that need to be made. For many clients, this discovery process then starts to happen on both sides. Once the client is fully aware of what they need, this can directly feed into the budget setting.
Discovery is more than just helping get the budget right and ensuring the client has thought of all aspects of the event. It’s also vital in drawing a boundary around your work and allows you to document your touchpoints, the people who will be providing you with things like video, slides and connectivity. Rob suggests using a survey to get this information and offers, as an example, the survey he uses with clients. This part of the talk finishes with Rob highlighting costs that you may incur and need to ensure are included. Rob has also written up his advice.
Setup and Testing. Much of the final part of the presentation is well understood by people who have done events before and is summarised as ‘test and test again’. But it’s always helpful to have this reiterated and, in this case, from the streaming angle. Rob goes through a long list of what to determine ahead of the event, what to test on-site ahead of the event and again what to test just before the event.
KPIs are under the microscope as Milan’s Video Tech meet up fights against the pandemic by having its second event online and focused on measuring, and therefore improving, streaming services.
Looking at ‘Data-Driven Business Decision Making‘, Federico Preli kicks off the event by explaining how to harness user data to improve the user experience. He explains this using Netflix’s House of Cards as an example. Netflix commissioned 2 seasons of House of Cards based not on a pilot, but on data they already had. They knew the British version had been a hit on the platform and they could see that the people who enjoyed that also watched other films from Kevin Spacey or David Fincher (the director of House of Cards). As such, this large body of data showed that, though success was not guaranteed, there was good cause to expect people to be receptive to this new programme.
Federico goes on to explain how to balance recommendations based upon user data. A balance is necessary, he explains, to avoid a bubble around a viewer where the same things keep on getting recommended, and to avoid exaggerating someone’s interests to the detriment of nuance and of their less prominent predilections. He outlines the 5 parts of a balanced recommendations experience: serendipity, diversity, coverage, fairness & trust. Balancing these equally will provide a rounded experience. Finally, Federico discusses how some platforms may choose to under-invest in some of these due to the nature of their platforms. Relevance, for instance, may be less important for an ultra-niche platform where everything has relevance.
‘Performance Video KPIs at the Edge‘ is the topic of Luca Moglia‘s talk. A media solutions engineer from Akamai, he looks at how to derive more KPI information from logs at the edge. Much data comes from client-side KPIs, that is, data reported directly by the video player itself to the service. Client-side information is vital as only the client knows which button you clicked on, for instance, and how long you spent in certain parts of the GUI. But in terms of video playback, there is a lot to be understood by looking at the edge, the part of the CDN which is closest to the client.
One aspect that client-side reporting doesn’t cover is use of the platform by clients which aren’t fully supported meaning they report back less information. Alternatively, for some services, it may be possible to access them with clients which don’t report at all. Depending on how reporting is done, this could be blocked by ad blockers or DNS rules. As such, this is an important gap which can be largely filled by analysis of CDN logs. This allows you to enhance the data analysis done elsewhere and validate it.
Luca gives examples of KPIs that can be measured or inferred from the edge, such as ‘hand-waving latency’ which can be understood from the edge-to-origin latency and time to manifest. He also shows an example graph analysing the number of segments served at the edge within the segment duration time. This helps indicate how many streams weren’t rebuffering. Overall, Luca concludes, analysing data from the edge helps track improvements, gives you better visibility on consumer/global events and allows you to enhance the performance of the platform.
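The ‘segments served within the segment duration’ measure can be sketched as follows. The `SegmentLog` record and its field names are hypothetical stand-ins for whatever a real CDN log contains, so treat this as an illustration of the idea rather than Akamai’s method.

```python
from dataclasses import dataclass


@dataclass
class SegmentLog:
    """One hypothetical edge log entry for a delivered media segment."""
    session_id: str
    segment_duration_s: float  # how much playback time the segment holds
    serve_time_s: float        # how long the edge took to deliver it


def on_time_ratio(logs):
    """Fraction of segments delivered faster than they take to play."""
    if not logs:
        return 0.0
    on_time = sum(1 for entry in logs
                  if entry.serve_time_s <= entry.segment_duration_s)
    return on_time / len(logs)


logs = [
    SegmentLog("a", 4.0, 0.9),
    SegmentLog("a", 4.0, 1.4),
    SegmentLog("b", 4.0, 5.2),  # slower than real time: likely rebuffering
    SegmentLog("b", 4.0, 3.8),
]
ratio = on_time_ratio(logs)
```

A segment that takes longer to deliver than it takes to play is a strong hint that the player’s buffer was draining, which is why this ratio works as a server-side proxy for rebuffering.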
Bitmovin’s Andrea Fassina covers ‘Client KPIs – Five Analytics Metrics That Matter‘ which he summarises at the beginning of his talk ahead of explaining each individually. ‘Impressions & Total Hours Watched’ is first. This metric has really shown its importance as the SARS-CoV-2 pandemic has rolled around the globe. Understanding how much more people are watching is important in understanding how your platform is reacting. After all, if a platform is struggling this could be for many reasons that are correlated with, but not because of, more hours streamed. For instance, in boxing matches, it’s often the payment system which struggles before the streaming does.
Video startup time is next. Andrea explains the statistics of lost viewers as your time-to-play increases. You can look at startup time across each device, see where the low-hanging fruit for improvements is and prioritise your work accordingly. This metric can be extended to ad playing and DRM load time, which need to be brought into the overall equation.
Third is Video Bitrate Heatmap, which allows you to see which types of chunk are most used and, similarly, which rungs on your ABR ladder aren’t needed (or could be improved). The fourth KPI discussed is Error Types and Codes. Analysing the codes generated can give you early warning of issues, allow you to understand whether you suffer more problems than the industry average (6.6%) and let you proactively talk to connectivity providers to reduce problems. Lastly, Andrea explains how Rebuffering percentage helps you understand where there are gaps in your service in terms of devices/apps which are particularly struggling.
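As a small illustration of the last metric, rebuffering percentage can be broken down per device to find the platforms that are struggling. The definition used here (rebuffering time as a share of playing plus rebuffering time) and the tuple layout are assumptions for the sketch; analytics products may define the metric differently.

```python
from collections import defaultdict


def rebuffering_percentage_by_device(sessions):
    """Map each device to its rebuffering percentage.

    sessions: iterable of (device, played_seconds, rebuffered_seconds)
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # device -> [played, rebuffered]
    for device, played_s, rebuffered_s in sessions:
        totals[device][0] += played_s
        totals[device][1] += rebuffered_s

    return {
        device: 100.0 * rebuffered / (played + rebuffered)
        for device, (played, rebuffered) in totals.items()
        if played + rebuffered > 0
    }


# Made-up sessions for illustration.
sessions = [
    ("tv", 90.0, 10.0),
    ("tv", 100.0, 0.0),
    ("web", 50.0, 50.0),
]
per_device = rebuffering_percentage_by_device(sessions)
```

Sorting the result by value would put the worst-performing device or app at the top of the list.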
Source: Andrea Fassina, Bitmovin
‘Video Quality Metrics‘ rounds off the session as Fabio Sonnati tackles the tricky problem of how to know what quality of video each viewer is seeing. Given that the publisher has each and every chunk and can view them, many would think this would mean you could see exactly what each stream would look like. But a streaming service can only see what each chunk looks like on their own device in their own environment. When you view a chunk encoded at 1080i on an underpowered SD device, what does the user actually see, and would they have been better off receiving a lower resolution, lower bitrate chunk instead?
In order to understand video quality, Fabio briefly explains some objective metrics such as VMAF, SSIM and PSNR. He then discusses the way that Sky Italia have chosen to create their own metric by combining metrics, subjective feedback and model training. The motivation for doing this is to tailor your metric to the unique issues that your platform has to contend with. This metric, called SynthEYE, has been expanded to be able to run without a reference, i.e. it doesn’t require the source as well as the encoded version. Fabio shows results of how well SynthEYE Absolute predicts VMAF and MOS scores. He concludes by saying that using an absolute metric is useful because it gives you the ability to analyse chunk-by-chunk and then match that up with resolution and other analytics data to better understand the performance of the platform.
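Of the objective metrics mentioned, PSNR is the simplest to compute and shows why such metrics need a reference: it compares the encoded pixels against the source. The sketch below uses the standard definition, 10·log10(MAX² / MSE), on flat lists of 8-bit pixel values; the function name and list-based input are choices made for this example only.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and the same length")

    # Mean squared error between source and encoded pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


source = [52, 55, 61, 66, 70, 61, 64, 73]
light_loss = [52, 54, 61, 67, 70, 61, 65, 73]
heavy_loss = [40, 60, 50, 80, 60, 70, 50, 90]
```

A no-reference metric has no `reference` argument to lean on, which is what makes the ‘absolute’ approach described above both harder and more useful for monitoring at scale.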