[Paper] DeepCNN: Deep CNN for IQA (Image Quality Assessment)

Outperforms SOTA Approaches Such As IQA-CNN

Oct 31, 2020

In this story, No-reference Image Quality Assessment with Deep Convolutional Neural Networks (DeepCNN), by City University of Hong Kong, is briefly presented.

In IQA-CNN (2014 CVPR):

The CNN contains only one convolution layer, which is too shallow.
A 32×32 patch is too small for training, as image quality is not homogeneous within the image.

In this paper:

A deeper CNN is proposed to predict the image quality score, and it outperforms SOTA approaches.

This is a paper in 2016 DSP with about 40 citations. (Sik-Ho Tsang @ Medium)

Outline

1. DeepCNN: Network Architecture
2. Experimental Results

1. DeepCNN: Network Architecture

1.1. Input

The proposed network consists of 31 layers.
Given a color image, we first sample 224×224 image patches from the original image, and then perform global contrast normalization in each channel by subtracting the mean image of ImageNet.
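To make the input step concrete, here is a minimal NumPy sketch of cropping one 224×224 patch and subtracting a mean from each channel. The function names and the random crop position are my own assumptions, and `PLACEHOLDER_MEAN` is a made-up per-channel value standing in for the ImageNet mean image (the paper's actual mean is not given in this story).

```python
import numpy as np

def sample_patch(image, size=224, rng=None):
    """Crop one random size x size patch from an H x W x 3 image."""
    if rng is None:
        rng = np.random.default_rng()
    h, w, _ = image.shape
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return image[top:top + size, left:left + size, :]

def subtract_channel_mean(patch, channel_mean):
    """Subtract a per-channel mean (length 3) from an H x W x 3 patch."""
    mean = np.asarray(channel_mean, dtype=np.float32)
    return patch.astype(np.float32) - mean  # broadcasts over H and W

# Placeholder values only: replace with the mean that matches the pre-trained model.
PLACEHOLDER_MEAN = (120.0, 115.0, 100.0)

# A random array stands in for a real color image.
image = np.random.default_rng(0).integers(0, 256, size=(448, 448, 3), dtype=np.uint8)
patch = sample_patch(image, rng=np.random.default_rng(1))
normalized = subtract_channel_mean(patch, PLACEHOLDER_MEAN)
```

Using a per-channel mean instead of a full per-pixel mean image is a simplification on my side; a mean image of shape 224×224×3 would broadcast the same way.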

1.2. Pretrained NIN from 1st to 26th layers

Layers 1 to 26 of the pre-trained NIN are used.
MLP convolution layers (the red boxes in the figure) are used, each consisting of one traditional convolution layer followed by several convolution layers with 1×1 kernels and ReLU activation.
(If interested, please feel free to read NIN.)
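For intuition on why 1×1 convolutions act like a small MLP, here is a rough NumPy sketch: a 1×1 convolution is the same linear layer applied to the channel vector at every pixel. The channel counts and random weights are arbitrary assumptions for illustration, not the NIN configuration.

```python
import numpy as np

def conv1x1_relu(features, weights, bias):
    """1x1 convolution followed by ReLU.

    features: (C_in, H, W), weights: (C_out, C_in), bias: (C_out,).
    """
    # Mix channels independently at every spatial position.
    out = np.einsum("oc,chw->ohw", weights, features) + bias[:, None, None]
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
# Stands in for the output of the traditional (spatial) convolution layer.
features = rng.standard_normal((8, 5, 5))
w1, b1 = rng.standard_normal((16, 8)), np.zeros(16)
w2, b2 = rng.standard_normal((4, 16)), np.zeros(4)

hidden = conv1x1_relu(features, w1, b1)  # (16, 5, 5)
out = conv1x1_relu(hidden, w2, b2)       # (4, 5, 5)
```

Stacking two such layers after a spatial convolution gives a tiny per-pixel MLP, which is the idea behind the MLP convolution layer.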

1.3. New layers from 27th to 31st layers

Five new layers are appended after the 26th layer, as shown in the blue box.
Only layers 27 and 29 are randomly initialized, and their parameters can be easily tuned during fine-tuning.
Global average pooling (GAP) is applied, followed by a sigmoid.
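The last two operations can be sketched in a few lines of NumPy. This assumes the layer before GAP outputs a single-channel map, which is my guess for a scalar score and is not stated in this story; the function name is hypothetical.

```python
import numpy as np

def gap_sigmoid(feature_map):
    """Average a (C, H, W) map over H and W, then squash each channel into (0, 1)."""
    pooled = feature_map.mean(axis=(1, 2))   # global average pooling, shape (C,)
    return 1.0 / (1.0 + np.exp(-pooled))     # sigmoid

# An all-zero map pools to 0, and sigmoid(0) is 0.5.
score = gap_sigmoid(np.zeros((1, 6, 6)))
```

Because GAP has no parameters and averages over all positions, the output shape does not depend on the spatial size of the feature map.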

1.4. Larger Patch Size

The original color image is resized to 448×448, and the network is fine-tuned on 224×224 patches sampled with a stride of 112.
Thus, there are about 5400 image patches for each training run from 600 training images.
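The patch count follows from the sliding-window arithmetic: (448 − 224) / 112 + 1 = 3 positions per side, so 3 × 3 = 9 patches per image and 9 × 600 = 5400 patches. A small sketch (helper names are my own) that enumerates the patch origins:

```python
def patches_per_side(image_size, patch_size, stride):
    """Number of window positions along one side of the image."""
    return (image_size - patch_size) // stride + 1

def patch_origins(image_size=448, patch_size=224, stride=112):
    """Top-left (row, col) corner of every patch in a square image."""
    n = patches_per_side(image_size, patch_size, stride)
    return [(r * stride, c * stride) for r in range(n) for c in range(n)]

origins = patch_origins()
per_image = len(origins)    # 9 patches per 448x448 image
total = per_image * 600     # 5400 patches from 600 training images
```

With a stride of half the patch size, neighboring patches overlap by 50%.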