With the Tokyo Olympics less than two weeks away, 8K is back in focus. NHK has famously been a key innovator and promoter of 8K for many years, has launched an 8K satellite channel and will be broadcasting the games in 8K. That’s all very well, but is 8K a viable broadcast format for other public and commercial broadcasters? One problem for 8K is simply getting it to people: whilst there are plenty of bandwidth problems to contend with during production, all of that will be for naught if we can’t get the result to the customer.
This panel, run by the 8K Association in conjunction with SMPTE, looks to new codecs to reduce the burden on connectivity, whether RF or IP networks. The feeling is that HEVC just can’t deliver practical bandwidths, so what are the options? The video starts with Bill Mandel from Samsung introducing the topics of HDR display using HDR10+, streaming with CMAF and bandwidth. Bill discusses future connectivity improvements which should come into play and then looks at codec options.
Bill and Stephan Wenger give their views on the codecs, which were explained in detail in this SMPTE deep dive video, so do take a look at that article for more context. AV1 is the first candidate for 8K distribution many think of, since it’s known to compress better than HEVC, is seeing some hardware support in TVs and is being trialled by YouTube. However, the trialled 8K streams run at around 50Mbps, beyond many consumer connections. Looking for better performance, MPEG’s EVC is a potential candidate, offering further improvement over AV1 and a better licensing model than HEVC. Stephan’s view on codecs is that users really don’t care what the codec is; they just need the service to work. He points towards VVC, the direct successor to HEVC, as a way forward for 8K, since it delivers a 40 to 50% bandwidth reduction, opening up the possibility of a 25Mbps video channel. Now a published MPEG standard, VVC awaits patent information and vendor implementations before the market can move.
Stephan talks about MPEG’s LCEVC standard which has similarities to Samsung’s Scalenet which Bill introduced. The idea is to encode at a lower resolution and use upscaling to get the desired resolution using AI/machine learning to make the scaling look good and, certainly in the case of LCEVC, a low-bandwidth stream of enhancement data which adds in key parts of video, such as edges, which would otherwise be lost. Stephan says that he awaits implementations in the market to see how well this works. Certainly, taking into account LCEVC’s ability to produce compression using less computation, it may be an important way to bring 8K to some devices and STBs.
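The base-plus-enhancement idea behind LCEVC and Scalenet-style approaches can be sketched in a few lines of NumPy. This is a toy illustration only: real implementations use proper scaling filters (or machine learning) for the upscale and entropy-code the residual into the enhancement stream.

```python
import numpy as np

def downscale2x(img):
    """Average-pool 2x2 blocks to halve resolution (base-layer input)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale2x(img):
    """Nearest-neighbour upscale; real systems use far smarter filters."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

# Toy 4x4 "frame" with a hard vertical edge.
frame = np.array([[0, 0, 0, 9]] * 4, dtype=float)

base = downscale2x(frame)             # what the base codec would encode
predicted = upscale2x(base)           # decoder-side upscale of the base
residual = frame - predicted          # sparse enhancement data (edges)
reconstructed = predicted + residual  # enhancement restores the detail

assert np.allclose(reconstructed, frame)
```

On real footage the residual is near-zero across flat areas and concentrated on edges, which is why it compresses into a low-bandwidth enhancement stream.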
The discussion is rounded off by Mickael Raulet, CTO of ATEME, who talks us through an end-to-end test broadcast done using VVC. This was delivered by satellite to set-top boxes and over streaming, with a UHD channel at 15Mbps. His takeaway from the experience is that VVC is a viable option for broadcasters and 8K, and that similar results may be possible using EVC’s Main profile. The video finishes with an audience Q&A.
This video looks at the whole streaming stack, asking what’s in use now, which trends are coming to the fore and how things will be done better in the future. Whatever part of the stack you’re optimising, it’s vital to have a way to measure the viewer’s QoE (Quality of Experience). In most workflows, a lot of work goes into redundancy so that the viewer sees no impact despite problems happening upstream.
The Streaming Video Alliance’s Jason Thibeault digs deeper with Harmonic’s Thierry Fautier, Brenton Ough from Touchstream, SSIMWAVE’s Hojatollah Yeganeh and Damien Lucas from ATEME.
Talking about codecs, Thierry makes the point that only 7% of devices currently support AV1 and, with 10 billion devices in the world supporting AVC, he sees a lot of benefit in continuing to optimise AVC rather than waiting for VVC support to become commonplace. When asked to identify trends in the marketplace, the panel singled out the move to the cloud, which is driving not only the ability to scale but also the functions themselves. Gone are the days, Brenton says, when vendors ‘lift and shift’ into the cloud. Rather, products are becoming cloud-native, a vital step towards functions and products which take full advantage of the cloud, such as being able to swap the order of functions in a workflow. Just-in-time packaging is cited as one example.
Other changes are that server-side ad insertion (SSAI) is a lot better in the cloud, and sub-partitioning of viewers, where you deliver different ads to different people, is more practical. Real-time access to CDN data, giving near-immediate feedback into your streaming process, is another game-changer that is increasingly available.
Open Caching is discussed on the panel as a vital step forward and one of many areas where standardisation is desperately needed. ISPs are fed up, we hear, with each service bringing its own caching box; it’s time ISPs took a cloud-based approach to their infrastructure and enabled multi-use servers, potentially containerised, to ease this ‘bring your own box’ mentality and take back control of their internal infrastructure.
HDR gets a brief mention in light of the Euro soccer championships currently on air and the Tokyo Olympics soon to come. Thierry says 38% of Euro viewership is over OTT, and HDR is increasingly common, though SDR is still in the majority. HDR is more complex than just upping the resolution and requires much more care over the screen it’s watched on. This makes adopting HDR more difficult, which may be one reason adoption is not yet higher.
The discussion ends with a Q&A after a look at uses for ‘edge’ processing, which the panel agrees is a really important part of cloud delivery. Processing API requests at the edge, performing SSAI and applying content blackouts are examples of where the lower-latency response of edge compute works really well in the workflow.
The codec landscape is a more nuanced place than five years ago, but there will always be a place for a traditional codec that cuts file sizes in half while harnessing recent increases in computation. Enter VVC (Versatile Video Coding), the successor to HEVC, created by MPEG and the ITU through JVET (the Joint Video Experts Team), which delivers up to 50% compression improvement by evolving the HEVC toolset and adding new features.
In this IEEE BTS webinar, Virginie Drugeon from Panasonic takes us through VVC’s advances, its applications and its performance. VVC aims not only to deliver better compression but places an emphasis on delivering higher resolutions, HDR and 10-bit video. It also acknowledges that natural video isn’t the only content nowadays, with much more of it being computer games and other computer-generated imagery. To achieve all this, VVC has had to up its toolset.
Any codec comprises a whole set of tools that carry out different tasks. The extent to which each of these tools is used to encode the video is controllable, to some degree, and is what gives rise to the different ‘profiles’, ‘levels’ and ‘tiers’ mentioned when dealing with MPEG codecs. These exist to make lower-powered decoding possible: artificially constraining the encoder’s capabilities gives performance guarantees for both encoder and decoder, which gives manufacturers control over the cost of their software and hardware products. Virginie walks us through many of these tools, explaining what’s been improved.
Most codecs split the image up into blocks; this is true not only of the MPEG codecs but also of the Chinese AVS codecs and AV1. The more ways you have to do this, the better compression you can achieve, but each adds complexity to the encoding, so each generation adds more options, balancing compression against the extra computing power available since the last codec. VVC allows rectangles rather than just squares, and blocks can now be up to 128×128 pixels, as also covered in this Bitmovin video. Partitioning can be done separately for the chroma and luma channels.
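To see why rectangular splits help, here is a much-simplified partitioner: it recursively splits a block when its content isn’t flat, choosing between a horizontal and a vertical binary split. This is only a sketch under invented assumptions; real encoders search VVC’s full multi-type tree (including quad and ternary splits) and minimise rate-distortion cost, not variance.

```python
import numpy as np

def partition(block, x, y, max_depth=3, thresh=4.0):
    """Return (x, y, w, h) leaves, splitting while the block isn't flat."""
    h, w = block.shape
    if max_depth == 0 or block.var() <= thresh:
        return [(x, y, w, h)]  # leaf: code this region as one block
    # VVC's multi-type tree adds rectangular binary splits; pick the
    # direction that leaves the flatter (cheaper-to-code) halves.
    top, bottom = block[: h // 2], block[h // 2 :]
    left, right = block[:, : w // 2], block[:, w // 2 :]
    if top.var() + bottom.var() <= left.var() + right.var():
        return (partition(top, x, y, max_depth - 1, thresh)
                + partition(bottom, x, y + h // 2, max_depth - 1, thresh))
    return (partition(left, x, y, max_depth - 1, thresh)
            + partition(right, x + w // 2, y, max_depth - 1, thresh))

# A frame that's flat on the left and detailed on the right splits unevenly:
frame = np.zeros((8, 8))
frame[:, 4:] = np.arange(32).reshape(8, 4)
leaves = partition(frame, 0, 0)  # the flat half stays one tall rectangle
```

The flat left half is kept as a single tall rectangle, something HEVC’s square-only splitting could not express, while the detailed right half is subdivided further.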
Virginie explains that encoding is done by predicting the next frame and sending corrections on top of that. This means the encoder needs a decoder within it, so it can see what is actually decoded and work out the differences. Virginie explains there are three types of prediction: intra prediction, which uses the current frame to predict the content of a block; inter prediction, which uses other frames to predict video data; and, new to VVC, a hybrid mode which uses both. There are now 93 directional intra prediction angles, plus the introduction of matrix-based intra prediction. This is an example of the beginning of the move to AI for codecs, a move The Broadcast Knowledge sees as inevitable as more traditional mathematical algorithms are improved upon by AI, machine learning and/or deep learning; super-resolution is a good example. In this case, Virginie says that machine learning was used to generate the matrices used for the prediction, meaning there’s no neural network within the codec, but the matrices were derived from real-world data. It seems clear that as processing power increases, a neural network will be implemented in future codecs (whether MPEG or otherwise).
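As a minimal illustration of intra prediction (values invented for the example; real VVC predictors interpolate between neighbouring samples at fractional angles), here a block is predicted horizontally from its already-decoded left neighbours, so only a small residual needs transmitting:

```python
import numpy as np

def intra_horizontal(left_col, h, w):
    """Horizontal intra prediction: each row repeats the reconstructed
    pixel to its left (one of the directional modes)."""
    return np.tile(left_col.reshape(h, 1), (1, w))

# Reconstructed neighbours, available to both encoder and decoder.
left = np.array([10.0, 12.0, 14.0, 16.0])
block = np.array([[10, 10, 11, 11],
                  [12, 12, 13, 13],
                  [14, 14, 15, 15],
                  [16, 16, 17, 17]], dtype=float)

pred = intra_horizontal(left, 4, 4)
residual = block - pred    # only this (mostly zero) data is transmitted
decoded = pred + residual  # decoder rebuilds the identical block
```

Because both sides build the same prediction from the same reconstructed neighbours, only the residual crosses the channel, and here it is mostly zeros.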
For screen encoding, we see that intra block copying (IBC) is carried over from HEVC, explained here from 17:30. IBC allows part of a frame to be copied to another part, a great technique for computer-generated content. Whilst this was in HEVC, it wasn’t in HEVC’s basic package of tools, meaning it was much less accessible as support in decoders was often lacking. Two new tools, block differential pulse code modulation and transform skip with adapted residual coding, are each discussed, along with IBC, in this free paper.
Virginie moves on to coding performance, explaining that the JVET reference software, VTM, has been compared against HEVC’s HM reference and has shown, using PSNR, an average 41% bitrate improvement on luminance, with screen content at 48%. Fraunhofer HHI’s VVenC software has shown a 49% improvement.
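PSNR itself is simple to compute; the percentages quoted are bitrate savings at matched PSNR quality. A minimal implementation:

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and a
    decoded frame; higher means closer to the original."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * np.log10(max_val ** 2 / mse)

ref = np.full((4, 4), 128.0)
noisy = ref + 2.0  # a uniform error of 2 gives MSE = 4
print(round(psnr(ref, noisy), 2))  # 42.11
```

In codec comparisons this is run per frame on the luma (and chroma) planes, then averaged across a test sequence at several bitrates.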
Along with its applicability to screen content and 360-degree video, the versatility in the codec’s title also refers to its range of profiles and tiers, which stretch from 4:2:0 10-bit video all the way up to 4:4:4, including spatial scalability. The Main tier is intended for delivery applications and the High tier for contribution, with framerates up to 960 fps, up from 300 in HEVC. Levels are defined all the way up to 8K. Virginie spends some time explaining NAL units, which VVC has in common with HEVC and AVC, explained here from slide 22, along with the VCL (Video Coding Layer).
Random access has long been essential for linear broadcast video and now for streaming video too. It is provided by IDR (Instantaneous Decoding Refresh), CRA (Clean Random Access) and GDR (Gradual Decoding Refresh). IDR is well known already, but GDR is a new addition which seeks to smooth out the bitrate. With a traditional IBBPBBPBBI GOP structure, there is a periodic peak in bitrate because I frames are much larger than B and, indeed, P frames. The idea with GDR is to transmit the intra-coded data gradually over a number of frames, spreading out the peak. The disadvantage is that you need to wait longer until a full intra refresh is available.
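The smoothing effect is easy to model with invented frame costs (these numbers are purely illustrative):

```python
# Hypothetical sizes in arbitrary units: an I frame costs 10x a B frame.
I, P, B = 30.0, 6.0, 3.0

gop = [I, B, B, P, B, B, P, B, B]  # classic IBBPBBPBB structure
# GDR: drop the standalone I frame and instead give every frame an
# intra-refreshed strip, spreading the I-frame cost across the GOP.
gdr = [f + I / len(gop) for f in [B, B, B, P, B, B, P, B, B]]

peak_ratio = max(gop) / max(gdr)  # the bitrate peak is far lower with GDR
```

The total bits per GOP stay comparable (here slightly higher for GDR), matching the trade-off Virginie describes: a much flatter peak in exchange for a longer wait before a complete refresh point.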
Virginie introduces subpictures, a major development in VVC allowing separately encoded pictures within the same stream. Effectively creating a multiplexed stream, sections of the picture can be swapped out for other videos. For instance, for picture-in-picture you could swap the thumbnail video stream before the decoder, meaning you only need one decoder for the whole picture; to do the same without VVC, you would need two decoders. Subpictures have found use in 360-degree video where, by manipulating the bitstream at the sender end, only the part being watched is delivered in high quality, reducing the bitrate.
Before finishing by explaining that VVC can be carried in both MPEG’s ISO BMFF and MPEG-2 transport streams, Virginie covers Reference Picture Resampling (also covered in this video from Seattle Video Tech), which allows reference frames of one resolution to be used as references for a stream at another resolution. This has applications in adaptive streaming and spatial scalability. Virginie also covers the enhanced timing available with the HRD.
Many of the bottlenecks in processing video today are related to bandwidth, but most codecs that solve this problem require a lot of compute power and/or add a lot of latency. For those who wish to work with high-quality video, such as within cameras and in TV studios, what’s really needed is a ‘zero’-latency codec that maintains visually lossless quality but drops the data rate from gigabits to megabits. This is what JPEG XS does, and Jean-Baptiste Lorent joined the NVIDIA GTC21 conference to explain why it’s so powerful.
Created by intoPIX who are not only active in compression intellectual property but also within standards bodies such as JPEG, MPEG, ISO, SMPTE and others, JPEG XS is one of the latest technologies to come to market from the company. Lorent explains that it’s designed both to live inside equipment compressing video as it moves between parts of a device such as a phone where it would enable higher resolutions to be used and minimise energy use, and to drive down bandwidths between equipment in media workflows. We’ve featured case studies of JPEG XS in broadcast workflows previously.
JPEG XS prioritisation of quality & latency over compression. Source: intoPIX
The XS in JPEG XS stands for Xtra Small, Xtra Speed, and this underlines how the technology approaches compression differently from MPEG, AV1 and similar codecs. As discussed in this interview, the codec market is maturing and exploiting benefits other than pure bitrate. Nowadays, we need codecs that make it easy for AI/ML algorithms to quickly access video; we need low-complexity codecs for embedded devices, from old set-top boxes to new devices like body cams; and we need ultra-low-delay codecs, with an encode delay in the microseconds, not milliseconds, so that even multiple encode generations seem instantaneous. JPEG XS is unique in delivering the latter.
With visually lossless results at compression ratios as high as 20:1, JPEG XS is expected to be used by most at 10:1, at which point it can deliver uncompressed-quality HD 1080i at around 200Mbps, down from 1.5Gbps, or bring 76Gbps down to 5Gbps or less. Lorent explains that the algorithm’s maths is low-complexity and highly parallelisable, a key benefit on modern multi-core CPUs. Moreover, important for implementation in GPUs and FPGAs, it needs no external memory and is light on logic.
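Those bandwidth figures are simple arithmetic on the raw video rate. The helper below (illustrative only: active video, ignoring blanking and ancillary data) shows the kind of saving 10:1 gives on a UHD signal:

```python
def xs_rate_gbps(width, height, fps, bits_per_pixel, ratio):
    """Active-video bitrate in Gbps after a given compression ratio."""
    return width * height * fps * bits_per_pixel / ratio / 1e9

# UHD 2160p60, 10-bit 4:2:2 (20 bits/pixel), at the commonly used 10:1:
raw = xs_rate_gbps(3840, 2160, 60, 20, 1)  # ~9.95 Gbps uncompressed
xs = xs_rate_gbps(3840, 2160, 60, 20, 10)  # ~1.0 Gbps compressed
```

That takes a signal needing dedicated 12G-SDI or a fat ST 2110 pipe down to something a single 10GbE link carries with room to spare, which is exactly the workflow pitch.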
The talk finishes with Lorent highlighting that JPEG XS has been designed flexibly, agnostic to colour space, chroma subsampling, bit depth, resolution and more. Standardised as ISO/IEC 21122, its carriage is defined in SMPTE ST 2110-22, over RTP, in an MPEG transport stream and, in the file domain, as MXF, HEIF, JXS and MP4 (ISO BMFF).
Views and opinions expressed on this website are those of the author(s) and do not necessarily reflect those of SMPTE or SMPTE Members.
This website is presented for informational purposes only. Any reference to specific companies, products or services does not represent promotion, recommendation, or endorsement by SMPTE