Video: Things Developers Believe About Video Files (Proven Wrong by User Uploads)


For many transcoding workflows, efficiency or quality is the primary factor shaping how they are built. But when ingesting user-generated videos like those uploaded to the online video platform Vimeo, life gets difficult. Dealing with the wide variety of formats uploaded, and the many edge cases in the way that otherwise normal AVC videos are delivered, means throwing out any assumptions you ever had and analysing every aspect of the file.

Senior video encoding engineer Derek Buitenhuis takes us through the many lessons he and his colleagues have learnt over the years. Don’t, he says, assume that properties don’t change between frames; sometimes they change in every single frame. Assuming a single frame rate throughout the video is another ‘no no’, as variable-frame-rate files are common.

Derek also looks at dealing with samples stamped with negative timestamps, the need for sample durations, the myriad issues with seeking through a file, the fun of frames that are never displayed, and multi-track videos.
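To give a flavour of the level of checking this implies, here’s a minimal sketch, assuming the PyAV bindings to FFmpeg, that walks every packet in a file’s first video track and flags negative timestamps and variable frame durations rather than trusting the container’s declared frame rate:

    import av  # PyAV, FFmpeg's Python bindings: pip install av

    def inspect_video_timing(path: str) -> None:
        """Flag negative timestamps and variable frame durations."""
        with av.open(path) as container:
            stream = container.streams.video[0]
            durations = set()
            for packet in container.demux(stream):
                if packet.pts is None:
                    continue  # some muxers emit packets with no timestamp
                if packet.pts < 0 or (packet.dts is not None and packet.dts < 0):
                    print(f"negative timestamp: pts={packet.pts} dts={packet.dts}")
                if packet.duration:
                    durations.add(packet.duration)
            if len(durations) > 1:
                print(f"variable frame rate: {len(durations)} distinct durations")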

Colour spaces, no surprise to anyone, cause handling difficulties, for example when the colour properties signalled in the bitstream differ from those in the container. As the talk finishes, we’re left considering old MPEG-2 files that can have unavoidable banding, replicating looping MOV files, and dealing with QuickTime special-effects channels that animate a fire on the screen.

Watch now!
Speakers

Derek Buitenhuis
Senior Video Encoding Engineer,
Vimeo

Video: Three Roads to Jerusalem

With his usual entertaining vigour, Will Law explains the differences between the three approaches to low-latency streaming: DASH, LHLS and Apple’s LL-HLS. Likening them partly to religions that all get you to the same destination, he shows how they differ and some of the reasons why.

Please note: Since this video was recorded, Apple has released a new draft of LL-HLS. As described in this great article from Mux, the update’s changes are:

  • “Delivering shorter sub-segments of the video stream (Apple call these parts) more frequently (every 0.3 – 0.5s)
  • Using HTTP/2 PUSH to deliver these smaller parts, pushed in response to a blocking playlist request
  • Blocking playlist requests, eliminating the current speculative manifest request polling behaviour in HLS
  • Smaller, delta rendition playlists, which reduces playlist size, which is important since playlists are requested more frequently
  • Faster rendition switching, enabled by rendition reports, which allows clients to see what is happening in another playlist without requesting it in its entirety”[0]

Read the full article for the details and implications, some of which address points made in the talk.
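To make the first few bullets concrete, here’s a rough sketch of the query directives the new LL-HLS draft defines for blocking and delta playlist requests; the playlist URL is hypothetical, and a real server only honours these directives when it advertises support for them in the playlist:

    import requests  # plain HTTP client: pip install requests

    PLAYLIST = "https://example.com/live/720p.m3u8"  # hypothetical rendition URL

    # Blocking playlist request: the server holds the response until Media
    # Sequence Number 1803, Part 2 exists, replacing speculative polling.
    update = requests.get(PLAYLIST, params={"_HLS_msn": 1803, "_HLS_part": 2})

    # Delta update: ask the server to skip older segments the client already
    # has, keeping the frequently refetched playlist small.
    delta = requests.get(PLAYLIST, params={"_HLS_skip": "YES"})
    print(update.status_code, len(delta.text))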

Furthermore, THEOplayer have released this talk explaining the changes and discussing implementation.

Anyone who saw last year’s Chunky Monkey video will recognise Will’s near-Oscar-winning animation style as he sets the scene, introducing the contenders for the low-latency streaming crown.

We then look at a bullet list of features across each of the three low-latency technologies (note Apple’s recent update), which leads on to a discussion of chunked transfer delivery and the challenges of line-rate delivery. A simple view of the universe would say that the ideal way to deliver a live stream encoded at a constant bitrate would be to send it constantly at that bitrate to the receiver. Whilst this is, indeed, the best way to go, when we stream we’re also keeping one eye on whether we need to change bitrate: if more bandwidth becomes available it might be best to upgrade to a better quality, and if we suddenly hit contested, slow wifi it might be time for an emergency drop to the lowest-bitrate stream.

When a stream is delivered as individual files, you can measure how long each takes to download to estimate your available bandwidth: a file fetched over a 1Gbps link should arrive at close to 1Gbps, so if it arrives more slowly we know there is a bandwidth restriction and can make adjustments. Will explains that for streams delivered with chunked transfer, or in real time as in LL-HLS, this estimation no longer works because the data is simply never available to download at 1Gbps. He then explains some of the work that has been undertaken to develop more nuanced ways of estimating available bandwidth. It’s well worth noting that the smaller the files you transfer, the less accurate the bandwidth estimation, as TCP takes time to ramp up to line rate; small 320ms-long video segments are not ideal for maximising throughput.
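As a minimal sketch of the estimator being described, and of why chunked delivery defeats it, consider the following (the segment URL is hypothetical):

    import time
    import urllib.request  # stdlib HTTP client

    def estimate_throughput(url: str) -> float:
        """Classic ABR estimate: segment size divided by time to download."""
        start = time.monotonic()
        with urllib.request.urlopen(url) as resp:
            data = resp.read()
        elapsed = time.monotonic() - start
        return 8 * len(data) / elapsed  # bits per second

    # A whole 4s segment fetched over an idle 1Gbps link downloads almost
    # instantly, so the estimate approaches 1Gbps. The same segment delivered
    # chunked in real time takes ~4s to arrive, and the estimate collapses to
    # the stream's own encoded bitrate, revealing nothing about headroom.
    print(estimate_throughput("https://example.com/live/segment1803.m4s"))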

Continuing through the differences, we next look at request rates, with DASH at 20 requests per second compared to LL-HLS at 720. This leads naturally to an analysis of the benefits of the HTTP/2 PUSH technology used in LL-HLS and the savings it can offer. Will explores the implications of, and some of the problems with, last year’s version of the LL-HLS spec, some of which have since been mitigated.

The talk concludes with some work Akamai has done to try to establish a single, common workflow, with examples and a GitHub repository. Will shows how this works and the limitations of the approach, finishing with a look at the commonalities between the approaches.

[0] From “Low Latency HLS 2: Judgment Day” https://mux.com/blog/low-latency-hls-part-2/

Watch now!
Speakers

Will Law
Chief Architect,
Akamai

Webinar: HDR Dynamic Mapping

HDR broadcast is on the rise, as we saw from the increased number of ways to watch this week’s Super Bowl in HDR, but SDR will be with us for a long time. Not only will broadcasters have to move seamlessly between SDR and HDR services, but there is also a technique that allows HDR itself to be dynamically adjusted to better match the display it’s shown on.

Introduced in July 2019, Dynamic Mapping (DM) allows content to be more accurately represented on any specific display, particularly lower-end TVs. DM applies to PQ-10, the 10-bit version of Dolby’s Perceptual Quantizer (PQ) HDR format standardised as SMPTE ST 2084. Because HLG (ARIB STD-B67) works differently, it doesn’t need dynamic mapping. The dynamic metadata that supports this function is defined in SMPTE ST 2094-10 and ST 2094-40, and also as part of ETSI TS 103 433-2.
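To see why mapping to the display matters, here’s an illustrative sketch of the ST 2084 PQ EOTF, which turns a normalised code value into absolute luminance; the constants come from the standard, but the function here is only a demonstration:

    m1 = 2610 / 16384        # 0.1593017578125
    m2 = 2523 / 4096 * 128   # 78.84375
    c1 = 3424 / 4096         # 0.8359375
    c2 = 2413 / 4096 * 32    # 18.8515625
    c3 = 2392 / 4096 * 32    # 18.6875

    def pq_eotf(code: float) -> float:
        """Normalised code value in [0, 1] -> display luminance in cd/m^2."""
        e = code ** (1 / m2)
        y = (max(e - c1, 0.0) / (c2 - c3 * e)) ** (1 / m1)
        return 10000.0 * y

    # Peak code maps to 10,000 cd/m^2, far beyond what most consumer panels
    # can show, which is exactly the gap dynamic mapping has to bridge.
    print(pq_eotf(1.0))   # 10000.0
    print(pq_eotf(0.58))  # ~202, close to the HDR reference white of 203 cd/m^2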

Stitching all of this together and helping us navigate delivering the best HDR are Dolby’s Jason Power and Panasonic’s Virginie Drugeon in this webinar organised by DVB.

Register now!
Speakers

Virginie Drugeon
Senior Engineer for Digital TV Standardisation, Panasonic
Chair, DVB TM-AVC Group
Jason Power
Senior Director, Commercial Partnerships and Standards, Dolby Laboratories
Chair, DVB CM-AVC Group

Webinar: An Overview of the ATSC 3.0 Interactive Environment

Allowing viewers to interact with television services is an obvious next step for the IP-delivered ATSC 3.0 service. Taking cues from the European HbbTV standard, the aim here is to give viewers as many practical ways as possible to direct their viewing, opening up new avenues for television channels and programme creators.

Mark Corl is chair of TG3/S38, the Specialist Group on Interactive Environment, whose aim is to support interactive applications and their companion devices. It has produced the A/344 standard, which is based on W3C technologies with APIs that support the needs of broadcast television. It describes the Interactive Environment Content Display model, allowing video to be mixed with app graphics in a composite display. Mark is also part of the ATSC group TG3-9, which looks at how the different layers of ATSC 3.0 can communicate with each other where necessary.
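To make ‘W3C technologies with APIs’ a little more concrete: A/344 has the broadcaster application talk to the receiver over a WebSocket using JSON-RPC 2.0. The sketch below assumes that model; the endpoint URL is hypothetical (a real receiver supplies it to the app at launch) and the method name should be checked against A/344 for the receiver in question:

    import asyncio
    import json
    import websockets  # pip install websockets

    # Hypothetical endpoint; a real receiver hands the app its URL at launch.
    RECEIVER_WS = "ws://127.0.0.1:8080/atsc"

    async def query_current_service() -> None:
        # JSON-RPC 2.0 request in the spec's org.atsc.* namespace; verify the
        # exact method names your receiver supports against A/344.
        async with websockets.connect(RECEIVER_WS) as ws:
            await ws.send(json.dumps({
                "jsonrpc": "2.0",
                "method": "org.atsc.query.service",
                "id": 1,
            }))
            print(json.loads(await ws.recv()))

    asyncio.run(query_current_service())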

Also from the TG3 group is the A/338 Companion Device standard, which details the discovery of second devices such as smartphones and enables them to communicate with the ATSC 3.0 receiver.

In this webinar from the IEEE BTS, Mark marries an understanding of these documents with the practical aspects of deploying interactive broadcaster applications to receivers, including some of the motivations for doing so, such as improving revenue through the introduction of Dynamic Ad Insertion and personalisation.

Register now!
Speakers

Mark Corl
Chair, TG3/S38 Specialist Group on Interactive Environment
Co-chair, TG3-9 AHG on Interlayer Communications in the ATSC 3.0 Ecosystem
Senior Vice President, Emergent Technology Development, Triveni Digital