JPEG artifact removal with Convolutional Neural Network (AI)

Uncategorized

JPEG images, at high compression ratios, often have blocky artefacts which look particularly unpleasant. JPEG images look particularly bad around the sharp edges of objects.

There are already ways to reduce the artefacts, however, these methods often don’t use very sophisticated techniques, only using the information from adjacent pixels to smooth out the actual transition between the blocks. An example of this can be found in Kitbi and Dawood’s work https://ieeexplore.ieee.org/document/4743789/ which gave me the original inspiration for this.

An alternative way is to use a convolutional neural network (CNN) to intelligently estimate an optimal block given the original block and the eight surrounding blocks. And then tile over the image to estimate the optimal mapping for each block.

The network design was a 5 layer fully convolutional one, using only small filters. Several different architectures were used, which all gave largely similar results. The compromise between effectiveness and speed (i.e. 1/size) was found with a small network with only 57,987 parameters. Training the network was surprisingly fast, taking only a few hours without a GPU.

The network takes in full colour information and outputs the full colour too. The reason for this is to use all of the possible information. The colour channels are highly correlated with one another. It would be possible to train the network on monochrome images, but that would lose the relation which naturally exists.

So, does it work?

In my opinion, yes it does work. I think my method performs best when there are complicated edges, such as around glasses, or on hairs which are resolved as partial blocks. It works least well, in comparison to the method which photoshop currently uses, when there are large smooth areas.

Text was not present in the training data set. So the poor effectiveness of the network on text is not a significant point of comparison. The above images are quite small. A larger example is below, along with a zoomed in video comparing the original, JPEG, photoshop artefact removal, and my method.

I did calculate root mean squared error values one point comparing the network performance with the JPEG image to the original. In some cases the network was reliably out performing the JPEG – which is impressive, but not too surprising, as this was how it was trained. I don’t think that those sorts of values are too important in this case. The aim isn’t to restore the original image, but to instead reduce the visual annoyance of the artefacts.

If you really wanted to reduce the annoyance of the artefacts you should use JPEG2000 or else have a read of this paper https://arxiv.org/pdf/1611.01704.pdf