Video compression is everywhere we look. Uncompressed video is so rarely practical that almost everything in the consumer space is delivered compressed, so it pays to understand how it works, particularly if part of your job involves video formats such as AVC (also known as H.264) or HEVC (also known as H.265).
Gisle Sælensminde from Vizrt takes us on this journey of creating compressed video. He starts by explaining why we need uncompressed video and then talks about containers such as MPEG-2 Transport Streams, MP4, MOV and others. He explains that the container’s job is partly to hold metadata such as the frame rate, resolution and timestamps, among a long list of other things.
Gisle takes some time to look at the history of codecs, so that we can understand where we’re going from what went before. As many codecs use the same principles, Gisle looks at the different types of frame inside most compressed formats: I, P and B frames, which are used in set patterns known as GOPs (Groups of Pictures). The GOP length defines the interval between I frames. In the talk we learn that I frames are required for a decoder to be able to tune in part way through a feed and still start seeing pictures. This is because only the I frame holds a whole picture; the other types of frame don’t.
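To make the tune-in point concrete, here is a minimal Python sketch. The function names, the 12-frame GOP and the two B frames between reference frames are illustrative assumptions rather than figures from the talk. Frames are listed in display order and details such as open GOPs are ignored.

```python
def gop_pattern(gop_length, b_frames):
    """Build the frame types of one GOP in display order, e.g. IBBPBB..."""
    frames = []
    for i in range(gop_length):
        if i == 0:
            frames.append("I")          # the only frame holding a whole picture
        elif i % (b_frames + 1) == 0:
            frames.append("P")          # predicted from an earlier frame
        else:
            frames.append("B")          # predicted from frames on both sides
    return "".join(frames)


def frames_until_picture(stream, join_index):
    """Count the frames a decoder must discard after joining at join_index
    before it reaches an I frame and can start showing pictures."""
    for offset, frame_type in enumerate(stream[join_index:]):
        if frame_type == "I":
            return offset
    return None  # no I frame left in this stretch of the stream


stream = gop_pattern(12, 2) * 3         # three GOPs back to back
print(stream)
print(frames_until_picture(stream, 5))  # joining part way through the first GOP
```

A longer GOP spends fewer bits on I frames, but the worst-case wait before a newly joined decoder can show a picture grows with it.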
Colours are important, so Gisle looks at the way that colours are represented. Many people know about defining colours by their Red, Green and Blue values, but fewer know about YUV. This is all covered in the talk, including how to convert between the two representations.
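As a rough illustration of the conversion, the sketch below uses the BT.709 luma weights with values normalised to the range 0 to 1. The choice of BT.709 and the function names are assumptions for the example; the talk may use a different matrix, and what is loosely called YUV is, in digital video, usually Y'CbCr.

```python
# BT.709 luma weights (assumed for this sketch; BT.601 and BT.2020 differ).
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def rgb_to_ycbcr(r, g, b):
    """Convert normalised R'G'B' (0..1) to Y (0..1) and Cb, Cr (-0.5..0.5)."""
    y = KR * r + KG * g + KB * b      # brightness: a weighted sum of R, G, B
    cb = (b - y) / (2.0 * (1.0 - KB)) # blue-difference chroma
    cr = (r - y) / (2.0 * (1.0 - KR)) # red-difference chroma
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Invert rgb_to_ycbcr."""
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b


print(rgb_to_ycbcr(1.0, 0.0, 0.0))                   # pure red
print(ycbcr_to_rgb(*rgb_to_ycbcr(0.25, 0.5, 0.75)))  # round trip
```

Any grey input (equal R, G and B) gives zero chroma, which is what makes it possible to treat brightness and colour separately later in the chain.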
Almost synonymous with codecs such as HEVC and AVC are macroblocks. This is the name given to the squares into which the raster is split up, each of which is analysed independently. We look at how these macroblocks are used, but Gisle also spends some time looking to the future, as HEVC, VP9 and now AV1 all use variable-size block analysis.
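A small sketch of the fixed-size case, assuming the 16x16 macroblock used by AVC; the function names are made up for illustration. Newer codecs choose block sizes adaptively, which this does not attempt to model.

```python
import math


def macroblock_grid(width, height, size=16):
    """Return (columns, rows) of size x size blocks needed to cover a frame."""
    return math.ceil(width / size), math.ceil(height / size)


def macroblock_origins(width, height, size=16):
    """List the top-left (x, y) corner of every block in raster order."""
    cols, rows = macroblock_grid(width, height, size)
    return [(col * size, row * size) for row in range(rows) for col in range(cols)]


cols, rows = macroblock_grid(1920, 1080)
print(cols, rows, cols * rows)  # 120 68 8160
```

Note that 1080 is not a multiple of 16, so the last row of blocks extends past the bottom of the picture. This is why an AVC encoder codes 1088 lines and signals that the extra ones should be cropped.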
A process which happens throughout broadcast is chroma subsampling. This topic, whereby we keep more of the luminance channel than of the colour channels, is explored ahead of looking at DCTs (Discrete Cosine Transforms), which are foundational to most video codecs. We see that by analysing these macroblocks with DCTs, we can express the image in a different way and even discard some of the detail in the DCT output in order to reduce the bitrate.
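Chroma subsampling can be sketched in a few lines. The example below assumes 4:2:0, even picture dimensions and a plain 2x2 average; real systems use specific filters and chroma sample positions, and the function names are hypothetical.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 group.
    Assumes the plane (a list of rows) has even width and height."""
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[0]), 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(total / 4.0)
        out.append(row)
    return out


def samples_per_frame(width, height, subsampled):
    """Count samples per frame: one full luma plane plus two chroma planes."""
    luma = width * height
    chroma = (width // 2) * (height // 2) if subsampled else luma
    return luma + 2 * chroma


cb = [[10, 12, 50, 52],
      [14, 16, 54, 56],
      [90, 90, 20, 20],
      [90, 90, 20, 20]]
print(subsample_420(cb))
print(samples_per_frame(1920, 1080, False), samples_per_frame(1920, 1080, True))
```

With 4:2:0 the luma plane is untouched and each chroma plane shrinks to a quarter of its size, so the frame carries half as many samples before any codec has run.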
The talk then moves on to some very useful demos looking at the result of varying quantisation across a picture, the difference signal between the source and the encoded picture, and deblocking technology, which hides some of the artefacts that can arise from DCT-based codecs when they are pushed for bandwidth.
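The idea behind these demos can be sketched as a transform, quantise and reconstruct loop on a single block. Everything specific here is an assumption for illustration: the 8x8 block size, the single uniform quantiser step (real codecs use per-frequency scaling and a quantisation parameter), the test pattern and the function names.

```python
import math

N = 8  # block size, assumed for this sketch


def _scale(k):
    """Normalisation factor that makes the DCT orthonormal."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(n, k):
    """Cosine basis function of frequency k sampled at position n."""
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block given as a list of rows."""
    return [[_scale(u) * _scale(v)
             * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Inverse of dct_2d."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v]
                 * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantise(coeffs, step):
    """Snap every coefficient to the nearest multiple of step."""
    return [[round(c / step) * step for c in row] for row in coeffs]


def encode_decode(block, step):
    """Return (non-zero coefficients kept, largest absolute pixel error)."""
    quantised = quantise(dct_2d(block), step)
    kept = sum(1 for row in quantised for c in row if c != 0)
    recon = idct_2d(quantised)
    diff = [[p - q for p, q in zip(src, out)] for src, out in zip(block, recon)]
    return kept, max(abs(d) for row in diff for d in row)


# A smooth ramp as a stand-in for picture content.
block = [[16 * y + 2 * x for y in range(N)] for x in range(N)]
for step in (1, 16, 64):
    print(step, encode_decode(block, step))
```

A larger step leaves fewer non-zero coefficients to send, which lowers the bitrate, while the difference between source and reconstruction grows. That trade-off is what the quantisation and difference-signal demos make visible.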
Gisle finishes this talk at Media City Bergen by taking a number of questions from the floor.
Watch now!
Speaker
Gisle Sælensminde, Senior Software Engineer, Vizrt