Reading: CAR-DRN — Compression Artifacts Reduction based on Dual-Residual Network (JPEG Filtering)

Outperforms ARCNN and VRCNN.

Left: JPEG, Right: CAR-DRN

In this story, Compression Artifacts Reduction based on Dual-Residual Network (CAR-DRN), by Nanjing University of Aeronautics and Astronautics, is briefly presented.

  • Artifacts such as blocking and ringing are especially severe at low bitrates.
  • In this paper, a novel dual-residual network is proposed to reduce compression artifacts caused by lossy compression codecs.

This is a paper in 2020 Springer Journal of Signal, Image and Video Processing. (Sik-Ho Tsang @ Medium)

Outline

  1. CAR-DRN: Network Architecture
  2. Experimental Results

1. CAR-DRN: Network Architecture

CAR-DRN: Network Architecture
  • CAR-DRN is formed from four parts plus residual learning.
  • In summary, there are 8 convolutional layers in the whole network.

1.1. Feature Extraction

  • For the first layer, 64 filters of size 9×9×c are used, where c is the number of channels of the input image (3 for color images, 1 for grayscale ones), to generate feature maps, followed by the Leaky ReLU activation function with its parameter r set to 5.
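As a rough sketch (not the authors' code), the Leaky ReLU used here can be written as follows, assuming that r = 5 denotes the divisor applied to the negative part, i.e. a negative slope of 1/r = 0.2:

```python
import numpy as np

def leaky_relu(x, r=5.0):
    """Leaky ReLU: positive inputs pass through, negative inputs are
    scaled by 1/r (assumption: the paper's r = 5 is the negative-part
    divisor, giving slope 0.2)."""
    return np.where(x >= 0, x, x / r)

x = np.array([-5.0, -1.0, 0.0, 2.0])
print(leaky_relu(x))  # negative values shrink to -1.0 and -0.2
```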

1.2. Feature Enhancement

Residual Block
  • First, a convolutional layer with 64 filters maps the extracted features, followed by a residual block, to increase the nonlinearity.
  • The residual block consists of three convolutional layers with batch normalization and Leaky ReLU activation.
  • Then, a convolutional layer with 16 filters generates the enhanced features.
  • Dilated convolution, which originated in DeepLab and DilatedNet, is applied with rate 2 to enlarge the receptive field of the network. This matters because a larger receptive field lets the network take larger regions into account at once. Dilated convolution expands the receptive field without introducing excessive parameters.
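The receptive-field benefit of dilation can be checked with simple arithmetic: a k×k kernel with dilation d covers d·(k−1)+1 input positions. A minimal sketch, assuming stride-1 layers; the example stack below is hypothetical (the paper's exact per-layer kernel sizes beyond those stated are not listed here) and only illustrates the calculation:

```python
def effective_kernel(k, dilation=1):
    # A k x k kernel with dilation d covers d*(k-1) + 1 input positions.
    return dilation * (k - 1) + 1

def receptive_field(layers):
    # layers: list of (kernel_size, dilation) pairs, stride 1 throughout.
    # Each layer grows the receptive field by (effective kernel - 1).
    rf = 1
    for k, d in layers:
        rf += effective_kernel(k, d) - 1
    return rf

# A 3x3 conv with rate 2 sees as much as a 5x5 conv, at 3x3 cost.
print(effective_kernel(3, 2))  # 5

# Hypothetical stack: 9x9 extraction, then 3x3 convs, one of them dilated.
print(receptive_field([(9, 1), (3, 1), (3, 2), (3, 1)]))  # 17
```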

1.3. Nonlinear Mapping

  • A convolutional layer with 16 filters is taken to transform representations.

1.4. Reconstruction

  • As the last layer of the network, c filters of size 3×3×16 reconstruct the output image patches (color or gray) from the high-dimensional representations.

1.5. Residual Learning

  • By adding the original input to the final output, the network learns the residual between input and label rather than the entire image directly.
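A minimal sketch of this residual-learning skip connection; `toy_mapping` below is a hypothetical stand-in for the convolutional trunk, which in practice predicts the per-pixel correction:

```python
import numpy as np

def forward_with_residual(x, mapping):
    # The trunk predicts only the residual; the compressed input is
    # added back, so the network learns (label - input), not the image.
    return x + mapping(x)

# Toy stand-in for the trunk: predicts a constant correction of 0.5.
toy_mapping = lambda x: np.full_like(x, 0.5)

x = np.zeros((2, 2))
print(forward_with_residual(x, toy_mapping))  # every pixel lifted by 0.5
```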

1.6. Loss Function

  • Mean squared error (MSE) is used as the loss function.
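Pixel-wise MSE between the restored patch and the ground-truth patch can be sketched as:

```python
import numpy as np

def mse_loss(pred, target):
    # Mean squared error over all pixels, the training loss of CAR-DRN.
    return np.mean((pred - target) ** 2)

pred = np.array([1.0, 2.0, 3.0])
target = np.array([1.0, 2.0, 5.0])
print(mse_loss(pred, target))  # 4/3 ≈ 1.333
```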

2. Experimental Results

2.1. Complexity

Complexity study of the networks
  • CAR-DRN has a slightly larger model size than ARCNN and VRCNN, but a much smaller one than OTORRR.

2.2. PSNR, SSIM, PSNR-B

Comparison results with QF = 5
Comparison results with QF = 10
Comparison results with QF = 15
  • The BSDS500 database is used for training.
  • For testing, 31 images are selected from the commonly used datasets Set5, Set12 and Set14.
  • PSNR, SSIM and PSNR including blocking effects (PSNR-B) are evaluated.
  • PSNR-B is a block-sensitive image quality index designed to measure blocking artifacts; it takes the gray-level discontinuities around block boundaries into consideration.
  • CAR-DRN ranks second only to OTORRR in PSNR and SSIM, while it achieves the highest PSNR-B scores among all compared networks.
  • Besides, it can be inferred that the model works better at extremely low bitrates, and its advantage over other methods decreases at larger QF.
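For reference, plain PSNR for 8-bit images can be sketched as below; PSNR-B additionally penalizes gray-level discontinuities at block boundaries, which this sketch does not include:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    # Peak signal-to-noise ratio in dB for 8-bit images (higher is better).
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.full((8, 8), 100.0)
test_img = np.full((8, 8), 110.0)   # constant error of 10 -> MSE = 100
print(round(psnr(ref, test_img), 2))  # 28.13
```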
Visual comparisons of the proposed and compared methods
  • The JPEG-decoded image is dominated by blocking artifacts, and ABF and ARCNN improve PSNR, SSIM and PSNR-B only slightly.
  • VRCNN partially removes the blocking artifacts, while the edges remain blurred to some extent.
  • The recently proposed OTORRR achieves the best scores in PSNR and SSIM.
  • However, CAR-DRN achieves the best PSNR-B among all methods.

PhD, Researcher. I share what I've learnt and done. :) My LinkedIn: https://www.linkedin.com/in/sh-tsang/, My Paper Reading List: https://bit.ly/33TDhxG
