Video: Machine Learning for Per-title Encoding

AI continues its march into streaming with this new approach to optimising encoder settings to keep down the bitrate and improve quality for viewers. Under its more appropriate name, ‘machine learning’, computers learn how to characterise video, avoiding hundreds of encodes whilst still determining the best way to encode video assets.

Daniel Silhavy from Fraunhofer FOKUS takes the stand at Mile High Video 2020 to detail the latest technique in per-title and per-scene encoding. Daniel starts by outlining the problem with a fixed ABR ladder: it misses the efficiencies gained by being flexible with both resolution and bitrate.

Netflix were the best-known pioneers of the per-title encoding idea where, for each different video asset, many, many encodes are done to determine the best overall bitrate to choose. This is great because it will provide for animation-based files to be treated differently than action films or sports. Efficiency is gained.

However, per-title delivers an average benefit. There are still parts of the video which are simple and could see reduced bitrate, and parts where complexity isn’t accounted for. When bitrate is higher than necessary to achieve a certain VMAF score, Daniel calls this ‘wasted quality’. This means bitrate was spent making the quality better than we needed it to be. Whilst better quality sounds like a boon, it’s not always possible for it to be seen, hence having a target VMAF at a lower level.
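
As a rough illustration of ‘wasted quality’, the sketch below finds the cheapest measured bitrate that reaches a target VMAF and reports how much a fixed rung overshoots it. The measurements, the function names and the target of 90 are all assumptions made for the example; none of them come from the talk.

```python
# Hypothetical measurements for one scene: (bitrate in kbps, VMAF score).
# The numbers are illustrative only, not taken from the talk.
measurements = [(500, 71.0), (1000, 84.0), (1500, 91.0), (2500, 95.0), (4000, 97.0)]


def cheapest_bitrate_for_target(points, target_vmaf):
    """Return the lowest measured bitrate whose VMAF meets the target, or None."""
    qualifying = [bitrate for bitrate, vmaf in points if vmaf >= target_vmaf]
    return min(qualifying) if qualifying else None


def wasted_bitrate(points, fixed_bitrate, target_vmaf):
    """Bitrate spent above what the target VMAF needed (0 if nothing is wasted)."""
    needed = cheapest_bitrate_for_target(points, target_vmaf)
    if needed is None:
        # The target is never reached, so no bitrate counts as wasted.
        return 0
    return max(0, fixed_bitrate - needed)


print(wasted_bitrate(measurements, fixed_bitrate=4000, target_vmaf=90.0))
```

Here a fixed 4000kbps rung overshoots the 1500kbps that the hypothetical scene needed for the target score.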

Naturally, rather than varying the resolution mix and bitrate for each file, it would be better to do it for each scene. Working this way, variations in complexity can be quickly accounted for. This can also be done without machine learning, but more encodes are needed. The rest of the talk looks at using machine learning to take a short-cut through some of that complexity.

The standard workflow is to perform a complexity analysis on the video, working out a VMAF score at various bitrate and resolution combinations. This produces a ‘convex hull estimation’, allowing determination of the best parameters which then feed into the production encoding stage.
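
The convex hull step can be sketched in a few lines. This is a minimal version, assuming test encodes are available as (bitrate, VMAF, resolution) tuples; the data and the function names are hypothetical, and a production pipeline would differ in detail.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o) in the (bitrate, quality) plane."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_ladder(encodes):
    """Return the encodes lying on the upper convex hull of bitrate vs VMAF.

    encodes: iterable of (bitrate_kbps, vmaf, resolution) test encodes.
    """
    # Keep only the best-scoring resolution at each bitrate.
    best = {}
    for bitrate, vmaf, resolution in encodes:
        if bitrate not in best or vmaf > best[bitrate][1]:
            best[bitrate] = (bitrate, vmaf, resolution)
    points = [best[b] for b in sorted(best)]

    # Monotone-chain upper hull: drop any point lying on or below the line
    # joining its neighbours.
    hull = []
    for p in points:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    if not hull:
        return []

    # Discard anything past the quality peak: more bitrate for less quality.
    peak = max(range(len(hull)), key=lambda i: hull[i][1])
    return hull[: peak + 1]


# Hypothetical test encodes, for illustration only.
encodes = [
    (400, 60, "360p"), (800, 65, "360p"), (1600, 70, "360p"),
    (400, 45, "720p"), (800, 66, "720p"), (1600, 86, "720p"), (3200, 92, "720p"),
    (800, 55, "1080p"), (1600, 84, "1080p"), (3200, 95, "1080p"), (6400, 97, "1080p"),
]
for rung in hull_ladder(encodes):
    print(rung)
```

With this made-up data the 800kbps point falls below the hull and is dropped, and 1080p appears at two bitrates.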

Machine learning can replace the section which predicts the best bitrate-resolution pairs. Fed with some details on complexity, it can avoid multiple encodes and deliver a list of parameters to the encoding stage. Moreover, it can also receive feedback from the player allowing further optimisation of this prediction module.

Daniel shows a demo of this working where we see that the end result has fewer rungs on the ABR ladder, a lower-resolution top rung and fewer resolutions in general, some repeated at different bitrates. This is in line with the findings of Facebook, which we covered last week, who found that if they removed their ‘one bitrate per resolution’ rule they could improve viewers’ experience. In total, for an example Fraunhofer received from a customer, they saw a 53% reduction in storage needed.

Watch now!
Download the slides
Speakers

Daniel Silhavy
Scientist & Project Manager,
Fraunhofer FOKUS

Video: Providing better video experiences for the next billion users

What’s the best way for a billion people all on mobile networks to have a universally great streaming experience? It’s not trivial, and no service is perfect, but Facebook set out to find out what problems existed and find ways to fix them. This video explains their approach and solutions.

Denise Noyes from Facebook spoke at Demuxed 2020 about their work in India over the year. For Facebook, India is unique for this research as it represents such a large number of people almost universally using Android phones and mobile data. Not only does this allow them to understand the low-bitrate performance of video, but the Android penetration level simplifies comparisons.

The problems that Denise and her colleagues identified were gaps in the bitrate ladders where the ABR ladder either wasn’t well optimised or didn’t go low enough. There were also some ABR logic/decisions that were seen to be causing problems along with server delays from the CDN and internal congestion within the app. The research looked at ‘average bad sessions per user’ rather than the overall number of bad sessions which would be skewed by how many videos people generally watched.
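
To see why a per-user average behaves differently from the overall count, here is a small sketch. It takes one possible reading of ‘average bad sessions per user’, namely the mean of each user’s bad-session rate; the session data and function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical session log: (user, session_was_bad). Invented data.
sessions = (
    [("heavy_user", True)] + [("heavy_user", False)] * 7
    + [("light_a", True), ("light_a", False)]
    + [("light_b", True), ("light_b", False)]
)


def overall_bad_rate(log):
    """Bad sessions as a fraction of all sessions, dominated by heavy viewers."""
    return sum(1 for _, bad in log if bad) / len(log)


def mean_per_user_bad_rate(log):
    """Average each user's own bad-session rate, so every user counts equally."""
    counts = defaultdict(lambda: [0, 0])  # user -> [bad sessions, total sessions]
    for user, bad in log:
        counts[user][0] += int(bad)
        counts[user][1] += 1
    return sum(bad / total for bad, total in counts.values()) / len(counts)


print(overall_bad_rate(sessions))        # 3 bad sessions out of 12
print(mean_per_user_bad_rate(sessions))  # mean of 1/8, 1/2 and 1/2
```

The heavy viewer pulls the overall rate down, whereas the per-user figure shows that two of the three users had half of their sessions go badly.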

Covid had a bearing on the research as this was being conducted by in-person interviews within India. These teams had to come home but the relevance of the research was acutely highlighted by the networks in other countries which worsened in response to the rising amount of traffic making them closer to the Indian example.

Denise’s team worked with colleagues throughout the company to create improvements across the whole network and delivery stack. On the encoding front, they decreased the lowest encoding level to 100kbps. This doesn’t look amazing, as seen by the metric score, but it’s better than buffering and can be watchable depending on the content. The GOP size was also increased from 2 seconds to 5. Longer GOPs are known to improve bitrate efficiency, in this case by up to 8%, but there is a tradeoff in latency and in how frequently you can move up/down the ABR ladder. Facebook found that the tradeoffs were worth the improvement for the viewers.

Denise introduces FB-MOS, Facebook’s objective model of the subjective MOS (mean opinion score) metric. The lower the number, the worse the video looks. Facebook have used the fact that encoding resolution ‘A’ at, say, 400kbps and 200kbps can look better than encoding resolution ‘A’ at 400kbps and using a lower resolution ‘B’ for the 200kbps encode. This has led to the ABR ladder having 360p at two bitrates and 480p at two bitrates.

That FB-MOS score comes in handy for avoiding the lowest rungs of the ABR ladder. As their MOS score is quite low, the player will only choose one of them if it really has no choice; otherwise, if it isn’t able to go up the ladder, it will prefer to settle on a higher-quality version. Ironically, they have also implemented logic to limit who gets the highest-bandwidth streams, since most users would prefer to spend less on data than get that disproportionately low improvement in quality.
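
The sketch below is one possible reading of that player logic, not Facebook’s implementation. The ladder values, the `min_score` threshold, the `stretch` allowance and the `max_bitrate_kbps` data-saving cap are all hypothetical parameters introduced for illustration.

```python
# Hypothetical ladder: (bitrate_kbps, quality_score). Invented values.
LADDER = [(100, 1.5), (200, 2.6), (400, 3.2), (800, 3.8), (1600, 4.2), (3000, 4.4)]


def choose_rung(ladder, bandwidth_kbps, min_score=2.5, stretch=1.5,
                max_bitrate_kbps=1600):
    """Pick a rung, avoiding low-scoring rungs unless there is no alternative."""
    # Data-saving cap: never consider rungs above max_bitrate_kbps.
    rungs = sorted(r for r in ladder if r[0] <= max_bitrate_kbps)
    affordable = [r for r in rungs if r[0] <= bandwidth_kbps]

    # Normal case: the highest affordable rung with an acceptable score.
    good = [r for r in affordable if r[1] >= min_score]
    if good:
        return good[-1]

    # Assumption: allow a modest stretch beyond the bandwidth estimate to
    # reach the cheapest acceptable rung before falling back to a poor one.
    reachable = [r for r in rungs
                 if r[1] >= min_score and r[0] <= bandwidth_kbps * stretch]
    if reachable:
        return reachable[0]

    # No choice left: take the best low-scoring rung, or the lowest rung.
    return affordable[-1] if affordable else rungs[0]


for bw in (5000, 250, 150, 120, 50):
    print(bw, choose_rung(LADDER, bw))
```

Only the last two bandwidth estimates end up on the 100kbps rung; the 5000kbps estimate is held at 1600kbps by the cap.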

In playback, Denise explains that they have reduced the impact of occasional anomalies on the bandwidth estimation and adjusted prefetching to prefetch the first chunk of all videos it would like to prefetch before getting the next chunk. This has reduced the chance that someone is able to choose a video which hasn’t yet been buffered and hence have to wait for it to start.
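
The prefetch ordering described here amounts to going breadth-first across the queued videos. A minimal sketch, with hypothetical video names and a hypothetical function name:

```python
def prefetch_order(videos, chunks_per_video):
    """Order fetches so every video gets chunk 0 before any video gets chunk 1."""
    order = []
    for chunk_index in range(chunks_per_video):
        for video in videos:
            order.append((video, chunk_index))
    return order


# Hypothetical feed of three upcoming videos, prefetching two chunks of each.
for fetch in prefetch_order(["video_a", "video_b", "video_c"], 2):
    print(fetch)
```

Fetching depth-first instead would finish both chunks of the first video before the others had anything buffered, which is the situation the change is meant to avoid.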

Lastly, Denise covers the work done at the network layer, seeing a move from HTTP/2 to QUIC. We see how the removal of head-of-line blocking has helped and that not only has the move to QUIC seen an overall improvement in performance but, as congestion increased, QUIC traffic has shown a disproportionate improvement.

Denise concludes by highlighting that this work across the network stack, with wide collaboration, has not only delivered the desired results but is a vital approach for any company looking to make marked improvements in customer experience.

Watch now!
Speaker

Denise Noyes
Software Developer,
Facebook

Video: Making a case for DVB-MABR

Multicast ABR (mABR) is a way of delivering traditional HTTP-based streams like HLS and DASH over multicast. On a managed telco network, the services are multicast to thousands of homes and only within the home itself does the stream get converted back to unicast HTTP. Devices in the home then access streaming services in exactly the same way as they would Netflix or iPlayer over the internet, but the content is served locally. Streaming is a point-to-point service so each device takes its own stream. If you have 3 devices in the home watching a service, you’ll be sending 3 streams out to them. With mABR, the core network only ever sees one stream to the home and the linear scaling is done internally. Not only does this help remove peaks in traffic, but it also significantly reduces the load on the upstream networks and the origin servers, and smooths out the bandwidth use.
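
The scaling argument can be put as simple arithmetic. The sketch below assumes, as a simplification, that every active device in a home is watching the same service; the household numbers and function names are invented.

```python
def streams_unicast(devices_per_home):
    """Unicast: each active device pulls its own stream across the network."""
    return sum(devices_per_home)


def streams_mabr(devices_per_home):
    """mABR: one stream reaches each active home; the gateway fans it out."""
    return sum(1 for devices in devices_per_home if devices > 0)


# Hypothetical neighbourhood: number of active devices in each home.
homes = [3, 1, 0, 2]
print(streams_unicast(homes))  # one stream per device
print(streams_mabr(homes))     # one stream per active home
```

The unicast figure grows with every extra device, whilst the mABR figure is bounded by the number of homes.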

This video from DVB lays out the business cases which are enabled by mABR. DVB has approved the specification, which is now going for standardisation within ETSI. It’s already gained some traction with deployments in the field, so this talk looks at what the projects that drive the continued growth in mABR may look like.

Williams Tovar starts by making the case for OTT over satellite. With OTT services continuing to take viewing time away from traditional broadcast services, satellite providers are working to ensure they retain relevance and offer value. Delivering these OTT services themselves is, thus, clearly beneficial to satellite providers, but why would anyone want OTT by satellite? On top of the mABR benefits briefly outlined above, this business case recognises that not everyone is served by a good internet connection. Distributing OTT by satellite can provide high-bitrate OTT experiences to areas with bad broadband and could also be an efficient way to deliver to large public places such as hotels and ships.

Julien Lemotheux from Orange presents a business case for next-generation IPTV. The idea here is to bring down the cost of STBs by replacing CA security with DRM and replacing the chipset with a cheaper one which is less specialised. As DASH and HLS streaming are CPU-based tasks and well understood, general, mass-produced chipsets can be used, which are cheaper, and removing CA removes some hardware from the box. Also to be considered is that the OTT ecosystem is continually seeing innovation, so delivering services in the same format allows providers to keep their offerings up to date without custom development in the IPTV software stack.

Xavier Leclercq from Broadpeak looks, next, at Scaling ABR Delivery. This business case is a consideration of what the ultimate situation will be regarding MPEG2 transport streams and ABR. Why don’t we provide all services as Netflix-style ABR streams? One reason is that the scale is enormous: with one connection per device, CDNs and national networks would still not be able to cope. Another is that the QoS for MPEG2 transport streams is very good and, whilst it is possible to have bad reception, there is little else that causes interruption to the stream.

mABR can address both of these challenges. By delivering one stream to each home and having the local gateway do the scaling, mass delivery of streamed content becomes both predictable and practical. Whilst there is still a lot of bandwidth involved, the load on the CDNs is much more predictable and controlled and, with lower peaks, the CDN cost is reduced as this is normally based on the maximum throughput. mABR can also be delivered with a higher QoS than public internet traffic, which allows it to benefit from better reliability and could move it into the realm of the traditional transport-stream-based services. Xavier explains that if you put the gateway within a TV, you are able to deliver a set-top-box-less service, whilst if you want to address all devices in your home, you can provide a separate gateway.

Before the video finishes with a Q&A session, Williams delivers the business case for Backhauling over Satellite for CDNs and IP backhaul for 5G Networks. The use case for both has similarities. The CDN backhauling example looks at using satellite to efficiently deliver directly to CDN PoPs in hard-to-reach areas which may have limited internet links. The satellite could deliver a high-bandwidth set of streams to many PoPs. A similar issue presents itself for 5G: as there is so much bandwidth available, there is a concern about getting enough into the transmitter. Whether by satellite or IP multicast, mABR could be used for CDN backhauling to 5G networks, delivering into a Mobile Edge Computing (MEC) cache. A further benefit in doing this is avoiding issues with CDN and core network scalability where, again, keeping the individual requests and streams away from the CDN and the network is a big benefit.

Watch now!
Download the slides from this video
Speakers

Williams Tovar
Solution Pre-sales Manager,
ENENSYS Technologies
Julien Lemotheux
Standardisation Expert,
Orange Labs
Xavier Leclercq
VP Business Development,
Broadpeak
Moderator: Christophe Berdinat
Chairman CM-I MABR, DVB
Innovation and Standardisation Manager, ENENSYS

Video: Optimal Design of Encoding Profiles for Web Streaming

With us since 1998, ABR (Adaptive Bitrate) has been allowing streaming players to select a stream appropriate for their computer and bandwidth. But in this video, we hear that over 20 years on, we’re still developing ways to understand and optimise the performance of ABRs for delivery, finding the best balance of size and quality.

Brightcove’s Yuriy Reznik takes us deep into the theory, but starts with the basics of what ABR is and why we use it. He covers how it delivers a whole series of separate streams at different resolutions and bitrates. Whilst that works well, he quickly starts to show the downsides of ‘static’ ABR profiles. These are where a provider decides that all assets will be encoded at the same set of 6 or 7 bitrates even though some titles, such as cartoons, will require less bandwidth than sports programmes. This is where per-title and other encoding techniques come in.

Netflix coined the term ‘per-title encoding’ which has since been called content-aware encoding. This takes into consideration the content itself when determining the bitrate to encode at. Using automatic processes to determine the objective quality of a sample encode, it is able to determine the optimum bitrate.

Content & network-aware encoding takes into account the network delivery as part of the optimisation as well as the quality of the final video itself. It’s able to estimate the likelihood of a stream being selected for playback based upon its bitrate. The trick is combining these two factors simultaneously to find the optimum bitrate vs quality.
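
A toy version of that combined optimisation is to score a candidate ladder by its expected quality over a distribution of viewer bandwidths. This is only a sketch under stated assumptions: the bandwidth distribution, the quality values, the rule that a viewer plays the best rung that fits, and scoring a viewer with no playable rung as 0 are all hypothetical, not Yuriy’s model.

```python
def expected_quality(ladder, bandwidth_probs):
    """Average quality delivered over a distribution of viewer bandwidths.

    ladder: list of (bitrate_kbps, quality) rungs.
    bandwidth_probs: list of (bandwidth_kbps, probability) pairs summing to 1.
    """
    total = 0.0
    for bandwidth, probability in bandwidth_probs:
        # Assumption: the player picks the best-quality rung that fits.
        playable = [quality for bitrate, quality in ladder if bitrate <= bandwidth]
        # Assumption: a viewer who cannot play any rung scores 0.
        total += probability * (max(playable) if playable else 0.0)
    return total


# Hypothetical audience: a quarter of viewers on slow links.
bandwidth_probs = [(500, 0.25), (2000, 0.5), (8000, 0.25)]

ladder_a = [(400, 60), (1600, 86), (3200, 95)]
ladder_b = [(1600, 86), (3200, 95), (6400, 97)]

print(expected_quality(ladder_a, bandwidth_probs))
print(expected_quality(ladder_b, bandwidth_probs))
```

With this invented audience, the ladder that includes a low rung scores higher overall even though its top rung is lower quality, because it serves the slow-link viewers at all.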

The last element to add in order to make this ABR optimisation as realistic as practical is to take into account the way people actually view the content. Looking at a real example from the US Open, we see how on PCs the viewing window can be many different sizes, and you can calculate the probability of the different sizes being used. Furthermore, we know there is some intelligence in the players, which won’t take in a stream with a resolution much bigger than the browser viewport.

Yuriy starts the final section of his talk by explaining that he brought in another quality metric, from Westerink & Roufs, which allows him to estimate how people see video which has been encoded at a certain resolution, then scaled to a fixed interim resolution for decoding and then to the correct size for the browser window.

The result of adding in this further check shows that fewer points on the ladder tend to be better, giving an overall higher quality value. Going much beyond 3 is typically not useful for the website: only a few resolutions are needed to get good average quality, and adding more brings little benefit.

Yuriy finishes by introducing SSIM modeling of the noise of an encoder at different bitrates. Bringing together all of these factors, modelled as equations, allows him to suggest optimal ABR ladders.

Watch now!
Speaker

Yuriy Reznik
Technology Fellow and Head of Research,
Brightcove