[Paper] DeepCNN: Deep CNN for IQA (Image Quality Assessment)

Outperforms SOTA Approaches Such As IQA-CNN

Sik-Ho Tsang
3 min read · Oct 31, 2020

In this story, No-reference Image Quality Assessment with Deep Convolutional Neural Networks (DeepCNN), by City University of Hong Kong, is briefly presented.

In IQA-CNN (2014 CVPR):

  • The CNN used contains only one convolution layer, which is too shallow.
  • A 32×32 patch is too small for training, as image quality is not homogeneous within the image.

In this paper:

  • A deeper CNN is proposed to predict the image quality score; it outperforms SOTA approaches.

This is a paper in 2016 DSP with about 40 citations. (Sik-Ho Tsang @ Medium)

Outline

  1. DeepCNN: Network Architecture
  2. Experimental Results

1. DeepCNN: Network Architecture

DeepCNN: Network Architecture

1.1. Input

  • The proposed network consists of 31 layers.
  • Given a color image, we first sample 224×224 image patches from the original image, and then perform a global contrast normalization in each channel by subtracting the mean image of ImageNet (a sketch follows this list).
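A minimal sketch of this input step is shown below. It assumes the commonly used per-channel ImageNet mean as a stand-in for the mean image the paper subtracts; the crop offsets and function name are placeholders for illustration.

```python
import numpy as np

# Commonly used per-channel ImageNet mean (RGB order); an assumption standing
# in for the ImageNet mean image subtracted in the paper.
IMAGENET_MEAN = np.array([123.68, 116.78, 103.94], dtype=np.float32)

def sample_normalized_patch(image, top, left, size=224):
    """Crop a size x size patch from an H x W x 3 uint8 image and
    subtract the per-channel mean (global contrast normalization)."""
    patch = image[top:top + size, left:left + size, :].astype(np.float32)
    return patch - IMAGENET_MEAN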

1.2. Pretrained NIN from 1st to 26th layers

  • The pre-trained NIN from the 1st layer to the 26th layer is used.
  • MLP convolution (mlpconv) layers, shown in the red boxes, are used; each consists of one traditional convolution layer followed by several 1×1 convolution layers with ReLU activation (a sketch follows this list).
  • (If interested, please feel free to read NIN.)
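A minimal PyTorch sketch of such an mlpconv block is given below; the channel widths and the number of 1×1 layers are illustrative assumptions, not the exact NIN configuration.

```python
import torch.nn as nn

def mlpconv_block(in_ch, out_ch, kernel_size, n_1x1=2):
    """NIN-style mlpconv block: one traditional k x k convolution followed by
    several 1x1 convolutions, each with ReLU activation."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2),
              nn.ReLU(inplace=True)]
    for _ in range(n_1x1):
        layers += [nn.Conv2d(out_ch, out_ch, kernel_size=1),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)
```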

1.3. New layers from 27th to 31st layers

  • Five new layers, shown in the blue box, are appended after the 26th layer.
  • Only layer 27 and layer 29 are randomly initialized, and their parameters can be easily tuned by the fine-tuning process.
  • Global average pooling (GAP) is applied, followed by a Sigmoid (see the sketch after this list).
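Below is a sketch of these five appended layers in PyTorch, assuming the two trainable layers are 1×1 convolutions; the channel widths and layer types other than GAP and Sigmoid are assumptions, not the paper's exact settings.

```python
import torch.nn as nn

class QualityHead(nn.Module):
    """Sketch of the appended layers 27-31: two trainable layers (27 and 29,
    randomly initialized), a ReLU, global average pooling, and a Sigmoid that
    maps the pooled response to a quality score."""
    def __init__(self, in_ch=1024):
        super().__init__()
        self.layer27 = nn.Conv2d(in_ch, 512, kernel_size=1)  # randomly initialized
        self.layer28 = nn.ReLU(inplace=True)
        self.layer29 = nn.Conv2d(512, 1, kernel_size=1)       # randomly initialized
        self.layer30 = nn.AdaptiveAvgPool2d(1)                # global average pooling
        self.layer31 = nn.Sigmoid()

    def forward(self, x):
        x = self.layer28(self.layer27(x))
        x = self.layer30(self.layer29(x))
        return self.layer31(x).flatten(1)  # (N, 1) predicted quality score
```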

1.4. Larger Patch Size

  • The original color image is resized to 448×448, and the network is fine-tuned on 224×224 patches sampled with a stride of 112.
  • Thus, there are about 5400 image patches in each training process from the 600 training images (see the arithmetic below).
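The patch count follows from simple arithmetic, assuming patches are sampled on a regular grid:

```python
# A 448x448 image with 224x224 patches at stride 112 gives
# (448 - 224) // 112 + 1 = 3 positions per axis, i.e. 3 x 3 = 9 patches per image.
positions_per_axis = (448 - 224) // 112 + 1   # 3
patches_per_image = positions_per_axis ** 2   # 9
print(patches_per_image * 600)                # 5400 patches from 600 training images
```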

2. Experimental Results

Median LCC and SRPCC on LIVE

  • DeepCNN achieves results competitive with IQA-CNN and outperforms numerous hand-crafted-feature-based IQA approaches (a sketch of the two metrics follows).
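The two reported metrics are the linear (Pearson) correlation coefficient and, reading SRPCC as the Spearman rank-order correlation, the rank correlation between predicted and subjective scores. A minimal sketch of how they are computed, with toy values, is below.

```python
from scipy.stats import pearsonr, spearmanr

pred = [0.81, 0.42, 0.65, 0.93, 0.30]   # toy predicted quality scores
mos  = [0.78, 0.40, 0.70, 0.90, 0.35]   # toy subjective (ground-truth) scores

lcc, _ = pearsonr(pred, mos)     # linear correlation coefficient (LCC)
srcc, _ = spearmanr(pred, mos)   # Spearman rank-order correlation
print(f"LCC={lcc:.3f}, SRCC={srcc:.3f}")
```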

Also, DeepCNN needs only 50 ms to process one image, which makes real-time application possible. (The paper does not mention whether a GPU or CPU is used.)
