Reading: CAR-DRN — Compression Artifacts Reduction based on Dual-Residual Network (JPEG Filtering)

Outperforms ARCNN and VRCNN.

Left: JPEG, Right: CAR-DRN

In this story, Compression Artifacts Reduction based on Dual-Residual Network (CAR-DRN), by Nanjing University of Aeronautics and Astronautics, is briefly presented.

  • Artifacts such as blocking and ringing are especially severe at low bitrates.
  • In this paper, a novel dual-residual network is proposed to reduce compression artifacts caused by lossy compression codecs.

This is a paper in 2020 Springer Journal of Signal, Image and Video Processing. (Sik-Ho Tsang @ Medium)


  1. CAR-DRN: Network Architecture
  2. Experimental Results

1. CAR-DRN: Network Architecture

CAR-DRN: Network Architecture
  • There are four parts as well as the residual learning to form CAR-DRN.
  • In summary, there are 8 convolutional layers in the whole network.
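
The 8-layer layout described in the sections below can be sketched as follows. This is a hypothetical reconstruction: the paper only states the filter counts, the 9×9 size of the first layer, and the 3×3 size of the last layer, so the kernel sizes of the intermediate layers are assumed to be 3×3.

```python
# Hypothetical sketch of the 8-layer CAR-DRN layout.
# Intermediate kernel sizes (3x3) are assumptions, not from the paper.
c = 3  # input channels: 3 for color images, 1 for grayscale

layers = [
    ("feature_extraction", c,  64, 9),  # 64 filters of 9x9xc
    ("enhancement_map",    64, 64, 3),  # assumed 3x3
    ("residual_conv_1",    64, 64, 3),  # residual block: 3 convolutions
    ("residual_conv_2",    64, 64, 3),
    ("residual_conv_3",    64, 64, 3),
    ("enhancement_out",    64, 16, 3),  # assumed 3x3
    ("nonlinear_mapping",  16, 16, 3),  # assumed 3x3
    ("reconstruction",     16, c,  3),  # c filters of 3x3x16
]

assert len(layers) == 8  # matches the "8 convolutional layers" count

# Rough weight count (no biases), just to show the model stays small
params = sum(cin * cout * k * k for _, cin, cout, k in layers)
print(len(layers), params)
```

Even under these assumptions the weight count stays well below 200K, which is consistent with the small-model-size claim in the complexity study below.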

1.1. Feature Extraction

  • For the first layer, 64 filters of size 9×9×c, where c stands for the number of channels of the input images (3 for color images and 1 for grayscale ones), are used to generate feature maps, followed by a Leaky ReLU activation with r set to 5.
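
A minimal NumPy sketch of this layer is shown below. The filter weights are random placeholders (not trained values), 'same' padding is assumed, and the Leaky ReLU slope used here is an arbitrary illustrative value.

```python
import numpy as np

# Sketch of the feature-extraction layer: 64 filters of 9x9xc,
# followed by Leaky ReLU. Weights are random placeholders.
rng = np.random.default_rng(0)

def conv2d_same(image, kernels):
    """Naive 2D convolution with 'same' padding; one map per kernel."""
    H, W, C = image.shape
    n, k, _, _ = kernels.shape                 # (num_filters, k, k, C)
    p = k // 2
    padded = np.pad(image, ((p, p), (p, p), (0, 0)))
    out = np.empty((H, W, n))
    for f in range(n):
        for i in range(H):
            for j in range(W):
                out[i, j, f] = np.sum(padded[i:i + k, j:j + k, :] * kernels[f])
    return out

def leaky_relu(x, slope=0.01):                 # slope value is assumed
    return np.where(x > 0, x, slope * x)

c = 3                                          # color input
image = rng.standard_normal((16, 16, c))
filters = rng.standard_normal((64, 9, 9, c))   # 64 filters of 9x9xc
features = leaky_relu(conv2d_same(image, filters))
print(features.shape)                          # 64 feature maps
```

In practice this layer would be a single framework call (e.g. a 2D convolution with 64 output channels and kernel size 9); the explicit loops are only to make the operation visible.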

1.2. Feature Enhancement

Residual Block
  • Firstly, a convolutional layer with 64 filters is used to map the extracted features, followed by a residual block, to increase the nonlinearity.
  • The residual block consists of three convolutional layers with batch normalization and Leaky ReLU activation.
  • Then, a convolutional layer with 16 filters is used to generate the enhanced features.
  • Dilated convolution, which originated in DeepLab and DilatedNet, is applied with rate 2 to enlarge the receptive field of the network. This is important because, with a larger receptive field, the network can take larger regions into consideration at once. Dilated convolution expands the receptive field without introducing excessive parameters.
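
The "larger receptive field without extra parameters" point can be made concrete with a small calculation. A 3×3 kernel size is assumed here; the paper specifies only the dilation rate of 2.

```python
# Why rate-2 dilation enlarges the receptive field "for free":
# a dilated kernel covers a wider window with the same number of weights.

def effective_kernel(k, rate):
    """Spatial extent covered by a k x k kernel at the given dilation rate."""
    return k + (k - 1) * (rate - 1)

# A 3x3 kernel with rate 2 covers a 5x5 window but still has only 9 weights.
print(effective_kernel(3, 1), effective_kernel(3, 2))
```

So stacking rate-2 dilated layers grows the receptive field faster than ordinary convolutions, at no cost in parameters.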

1.3. Nonlinear mapping

  • A convolutional layer with 16 filters is taken to transform representations.

1.4. Reconstruction

  • As the last layer of the network, c filters of size 3×3×16 are used to reconstruct the output image patches (color or grayscale) from the high-dimensional representations.

1.5. Residual Learning

  • By adding the original inputs to the final outputs, the network learns the residual between the inputs and the labels rather than directly learning the entire image.
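
A toy illustration of this skip connection is given below; `fake_network` is a hypothetical stand-in for the convolutional layers above, assumed here to output the residual perfectly.

```python
import numpy as np

# Toy residual learning: the network predicts only the residual
# (label - input), and the input is added back at the end.
rng = np.random.default_rng(0)

compressed = rng.random((8, 8))      # JPEG-degraded input patch
clean = compressed + 0.05            # ground-truth label (toy offset)

def fake_network(x):
    # A perfect network would output exactly the residual.
    return np.full_like(x, 0.05)

restored = compressed + fake_network(compressed)   # skip connection
print(np.allclose(restored, clean))
```

Because the residual is mostly close to zero, it is an easier target to regress than the full clean image.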

1.6. Loss Function

  • Mean squared error (MSE) is used as the loss function.
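
The loss is the standard pixel-wise MSE between the restored patch and the clean label:

```python
import numpy as np

# MSE loss between the network output and the ground-truth label.
def mse_loss(pred, target):
    return np.mean((pred - target) ** 2)

pred = np.array([0.0, 1.0, 2.0])
target = np.array([0.0, 1.0, 0.0])
print(mse_loss(pred, target))  # (0 + 0 + 4) / 3
```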

2. Experimental Results

2.1. Complexity

Complexity study of the networks
  • CAR-DRN has a slightly larger model size than ARCNN and VRCNN, but is much smaller than OTORRR.


Comparison results with QF = 5
Comparison results with QF = 10
Comparison results with QF = 15
  • The BSDS500 database is used for training.
  • As for testing, 31 images are selected from the commonly used Set5, Set12 and Set14 datasets.
  • PSNR, SSIM and PSNR including Blocking effects (PSNR-B) are evaluated.
  • PSNR-B is a block-sensitive image quality index designed to measure blocking artifacts, which takes the gray-level discontinuities around block boundaries into consideration.
  • CAR-DRN ranks second only to OTORRR in PSNR and SSIM, while it achieves the highest scores in PSNR-B compared with all the other networks.
  • Besides, it can be inferred that CAR-DRN works better at extremely low bitrates, and its advantage over other methods decreases at larger QF.
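
For reference, standard PSNR can be computed as below. PSNR-B extends it by adding a blocking-effect factor, measured at block boundaries, to the MSE term; that factor's exact definition is in the PSNR-B paper and is not reproduced here.

```python
import numpy as np

# Standard PSNR for 8-bit images. PSNR-B would replace the MSE term
# with MSE plus a blocking-effect factor (omitted here).
def psnr(ref, test, max_val=255.0):
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return 10 * np.log10(max_val ** 2 / mse)

ref = np.zeros((8, 8))
test = np.full((8, 8), 10.0)      # every pixel off by 10 -> MSE = 100
print(round(psnr(ref, test), 2))  # 28.13
```

Because the blocking-effect factor only increases the denominator, PSNR-B is never higher than PSNR for the same image pair, and the gap widens as block-boundary discontinuities grow.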
Visual comparisons of the proposed and compared methods
  • The JPEG-decoded image is filled with blocking artifacts, and ABF and ARCNN improve PSNR, SSIM and PSNR-B only slightly.
  • VRCNN removes the blocking artifacts partially while the edges are still blurred to some extent.
  • The recently proposed OTORRR achieves best scores in PSNR and SSIM.
  • But, CAR-DRN achieves best PSNR-B among all methods.
