Artificial Intelligence, Machine Learning and related technologies aren’t going to go away…the real question is where they are best put to use. Here, Dan Grois from Comcast shows their transformative effect on video.
Some of us can have a passable attempt at explaining what neural networks, but to start to understand how this technology works understanding how our neural networks work is important and this is where Dan starts his talk. By walking us through the workings of our own bodies, he explains how we can get computers to mimic parts of this process. This all starts by creating a single neuron but Dan explains multi-layer perception by networking many together.
As we see examples of what these networks are able to do, piece by piece, we start to see how these can be applied to video. These techniques can be applied to many parts of the HEVC encoding process. For instance, extrapolating multiple reference frames, generating interpolation filters, predicting variations etc. Doing this we can achieve a 10% encoding improvements. Indeed, a Deep Neural Network (DNN) can totally replace the DCT (Discrete Cosine Transform) widely used in MPEG and beyond. Upsampling and downsampling can also be significantly improved – something that has already been successfully demonstrated in the market.
Dan isn’t shy of tackling the reason we haven’t seen the above gains widely in use; those of memory requirements and high computational costs. But this work is foundational in ensuring that these issues are overcome at the earliest opportunity and in optimising the approach to implementing them to the best extent possible to day.
The last part of the talk is an interesting look at the logical conclusion of this technology.