Review — SFA & SFGN: Simplified-Fast-GoogleNet (Blur Classification)

Blur Classification Using Ensemble of Simplified-Fast-GoogleNet (SFGN) and Simplified-Fast-AlexNet (SFA)

Sik-Ho Tsang
May 23, 2021
Sample images in blur datasets

In this story, Blur Image Identification with Ensemble Convolution Neural Networks (SFA & SFGN), by Beihang University and University of Connecticut, is reviewed.

Blur image type classification is essential to blur image recovery.

In this paper:

  • Simplified-Fast-AlexNet (SFA) and Simplified-Fast-GoogleNet (SFGN) are designed.
  • An ensemble of SFA and SFGN is used for blur classification (Gaussian blur, motion blur, defocus blur and haze blur).

This is a paper in 2019 JSP with a high impact factor of 4.662. This paper is an extension of SFA in 2017 IST.


  1. Simplified-Fast-AlexNet (SFA): Network Architecture
  2. Simplified-Fast-GoogleNet (SFGN): Network Architecture
  3. Ensemble of SFA and SFGN
  4. Overall Framework
  5. Datasets
  6. Experimental Results

1. Simplified-Fast-AlexNet (SFA): Network Architecture

Simplified-Fast-AlexNet (SFA): Network Architecture
  • The architecture is similar to the SFA in 2017 IST.
  • The exception is that ReLU is replaced by Leaky ReLU (LReLU).
  • (If interested, please feel free to read SFA in 2017 IST.)
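For reference, the difference between ReLU and Leaky ReLU can be sketched as below. The negative slope alpha = 0.01 is a commonly used default and is an assumption here; the value used in the paper is not stated in this review.

```python
def relu(x: float) -> float:
    """Standard ReLU: negative inputs are set to zero."""
    return x if x > 0 else 0.0


def leaky_relu(x: float, alpha: float = 0.01) -> float:
    """Leaky ReLU: negative inputs are scaled by a small slope `alpha`.

    alpha=0.01 is an assumed default, not a value taken from the paper.
    """
    return x if x > 0 else alpha * x


if __name__ == "__main__":
    for v in (-2.0, 0.0, 3.0):
        print(v, relu(v), leaky_relu(v))
```

Unlike ReLU, the leaky variant keeps a small non-zero gradient for negative inputs.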

2. Simplified-Fast-GoogleNet (SFGN): Network Architecture

Simplified-Fast-GoogleNet (SFGN): Network Architecture
  • In the original GoogLeNet, there are 3 losses.
  • SFGN is obtained by pruning GoogLeNet, keeping only the layers up to the first loss.
  • The number of neurons is reduced by 50%, just like SFA.
  • Batch normalization and LReLU are used.
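To get a feel for what a 50% compression means, the sketch below halves the branch widths of the Inception modules that precede GoogLeNet's first auxiliary classifier (inception 3a, 3b and 4a in the original GoogLeNet). How SFGN actually distributes the reduction across branches is not described in this review, so the uniform halving in `compress` is an assumption, and the names `GOOGLENET_INCEPTION`, `compress` and `output_channels` are illustrative only.

```python
# Branch widths of the original GoogLeNet Inception modules up to the first
# auxiliary classifier, in the order:
# (1x1, 3x3 reduce, 3x3, 5x5 reduce, 5x5, pool projection).
GOOGLENET_INCEPTION = {
    "inception_3a": (64, 96, 128, 16, 32, 32),
    "inception_3b": (128, 128, 192, 32, 96, 64),
    "inception_4a": (192, 96, 208, 16, 48, 64),
}


def compress(widths, ratio=0.5):
    """Scale every branch width by `ratio` (assumed uniform), min 1 channel."""
    return tuple(max(1, int(w * ratio)) for w in widths)


def output_channels(widths):
    """Channels after concatenating the 1x1, 3x3, 5x5 and pool branches."""
    one, _, three, _, five, pool = widths
    return one + three + five + pool


if __name__ == "__main__":
    for name, widths in GOOGLENET_INCEPTION.items():
        slim = compress(widths)
        print(name, output_channels(widths), "->", output_channels(slim))
```

Under this assumption the concatenated output of each module shrinks from 256, 480 and 512 channels to 128, 240 and 256 channels, respectively.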

3. Ensemble of SFA and SFGN

Ensemble of SFA and SFGN
  • The classification accuracies of SFA and SFGN are denoted as C1 and C2, respectively.
  • The corresponding weights of SFA and SFGN are defined as Weight1 = C1/(C1 + C2) and Weight2 = C2/(C1 + C2), respectively.
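The weighting rule above can be sketched in a few lines. The weights follow the formula in the text; how the weights are applied to the two network outputs is not spelled out in this review, so the weighted sum of class probabilities in `ensemble_predict` is an assumption, and the probability vectors in the example are made up for illustration. The accuracies 96.99% and 98.12% are the SFA and SFGN figures quoted in the results section of this review.

```python
BLUR_TYPES = ("gaussian", "motion", "defocus", "haze")


def ensemble_weights(c1: float, c2: float) -> tuple:
    """Weight1 = C1 / (C1 + C2), Weight2 = C2 / (C1 + C2)."""
    total = c1 + c2
    if total <= 0:
        raise ValueError("accuracies must sum to a positive value")
    return c1 / total, c2 / total


def ensemble_predict(p_sfa, p_sfgn, w1, w2):
    """Fuse two class-probability vectors by a weighted sum (assumed rule).

    Returns the index of the winning class and the fused scores.
    """
    if len(p_sfa) != len(p_sfgn):
        raise ValueError("both models must score the same classes")
    fused = [w1 * a + w2 * b for a, b in zip(p_sfa, p_sfgn)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


if __name__ == "__main__":
    w1, w2 = ensemble_weights(96.99, 98.12)
    # Hypothetical softmax outputs of SFA and SFGN for one patch.
    p_sfa = [0.10, 0.50, 0.35, 0.05]
    p_sfgn = [0.05, 0.30, 0.60, 0.05]
    best, fused = ensemble_predict(p_sfa, p_sfgn, w1, w2)
    print(round(w1, 4), round(w2, 4), BLUR_TYPES[best])
```

Because the two accuracies are close, the weights are both near 0.5, so the ensemble behaves almost like a plain average with a slight preference for SFGN.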

4. Overall Framework

Overall Framework
  • For a locally blurred image, a number of patches, each globally blurred, are extracted from the original image and classified by the weighted SFA and SFGN. The overall blur type of the original image is then determined from the output of the ensemble classifier.
  • The improved SLIC super-pixel segmentation method is used to extract blurred areas from the blurred images, forming a real blurred image dataset that contains only globally blurred images.
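One simple way to turn patch-level predictions into an image-level decision is a majority vote, sketched below. The review does not say which aggregation rule the paper uses, so the vote is an assumption, and `image_blur_type` is a hypothetical helper name.

```python
from collections import Counter


def image_blur_type(patch_labels):
    """Return the most frequent patch label (assumed majority-vote rule)."""
    if not patch_labels:
        raise ValueError("at least one patch label is required")
    label, _count = Counter(patch_labels).most_common(1)[0]
    return label


if __name__ == "__main__":
    # Hypothetical ensemble outputs for five patches of one image.
    patches = ["motion", "motion", "defocus", "motion", "haze"]
    print(image_blur_type(patches))
```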
Improved SLIC
  • In brief, the original SLIC uses color distance and spatial distance to form superpixels.
  • The modified SLIC method additionally uses a blur feature distance.
  • Information entropy and the SVD ratio are also used to select purely blurred image patches.
  • (If interested, please feel free to read the paper directly.)
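As a rough illustration of the idea, the sketch below first computes the original SLIC distance, D = sqrt(d_c^2 + (d_s / S)^2 * m^2), where d_c is the color distance, d_s the spatial distance, S the grid step and m the compactness. It then adds a blur feature term. The exact form of the blur term in the paper is not given in this review, so `blur_aware_distance`, its `blur_weight` parameter and the blur feature vectors are assumptions made for illustration.

```python
import math


def _euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def slic_distance(lab_a, xy_a, lab_b, xy_b, grid_step, compactness=10.0):
    """Original SLIC distance: sqrt(d_c^2 + (d_s / S)^2 * m^2)."""
    d_c = _euclid(lab_a, lab_b)   # color distance (CIELAB)
    d_s = _euclid(xy_a, xy_b)     # spatial distance (pixels)
    return math.sqrt(d_c ** 2 + (d_s / grid_step) ** 2 * compactness ** 2)


def blur_aware_distance(lab_a, xy_a, blur_a, lab_b, xy_b, blur_b,
                        grid_step, compactness=10.0, blur_weight=1.0):
    """Hypothetical extension: add a weighted blur-feature distance term."""
    d = slic_distance(lab_a, xy_a, lab_b, xy_b, grid_step, compactness)
    d_b = _euclid(blur_a, blur_b)  # distance between blur feature vectors
    return math.sqrt(d ** 2 + (blur_weight * d_b) ** 2)


if __name__ == "__main__":
    same_blur = blur_aware_distance((50, 0, 0), (0, 0), (0.2,),
                                    (50, 0, 0), (3, 4), (0.2,), grid_step=5)
    diff_blur = blur_aware_distance((50, 0, 0), (0, 0), (0.2,),
                                    (50, 0, 0), (3, 4), (6.2,), grid_step=5)
    print(same_blur, diff_blur)
```

With such a term, two pixels with similar color and position but different blur characteristics are less likely to end up in the same superpixel.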

5. Datasets

5.1. Training Dataset

  • As in SFA, Gaussian blur, motion blur and defocus blur are synthesized. In this paper, haze blur is synthesized as well.
  • 200,000 simulated global blur patches of size 128×128×3 are used for training.
  • 62,000 real/natural blur patches are obtained from online websites.
  • All four blur types are uniformly distributed.
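Synthetic blur is typically produced by convolving sharp patches with a blur kernel. The sketch below builds three standard kernels: an isotropic Gaussian, a horizontal linear motion kernel and a disk (pillbox) kernel for defocus. The kernel sizes and parameters are illustrative assumptions, the actual synthesis settings of the paper are not given in this review, and haze synthesis is not covered here.

```python
import math


def _normalize(kernel):
    """Scale a 2-D kernel so that its entries sum to 1."""
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def gaussian_kernel(size, sigma):
    """Isotropic Gaussian blur kernel of shape size x size."""
    c = (size - 1) / 2
    k = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    return _normalize(k)


def motion_kernel(length):
    """Horizontal linear motion blur: equal weights along the middle row."""
    mid = length // 2
    k = [[1.0 if y == mid else 0.0 for _ in range(length)]
         for y in range(length)]
    return _normalize(k)


def defocus_kernel(radius):
    """Disk (pillbox) kernel, a common model of out-of-focus blur."""
    size = 2 * radius + 1
    k = [[1.0 if (x - radius) ** 2 + (y - radius) ** 2 <= radius ** 2 else 0.0
          for x in range(size)] for y in range(size)]
    return _normalize(k)


if __name__ == "__main__":
    for name, k in (("gaussian", gaussian_kernel(5, 1.0)),
                    ("motion", motion_kernel(5)),
                    ("defocus", defocus_kernel(2))):
        print(name, len(k), round(sum(sum(r) for r in k), 6))
```

Each kernel sums to 1, so convolving an image with it preserves the overall brightness.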

5.2. Testing Dataset 1

  • Images from the Berkeley dataset and the Pascal VOC 2007 dataset are selected as the testing dataset.
  • In total, 21,000 global blur test patches are obtained, of which 5,560 haze blur patches come from the same sources as the training samples.

5.3. Testing Dataset 2

  • A dataset consisting of 13,810 natural global blur image patches is constructed. The samples are all collected from the same websites as the haze blur samples in the training dataset.

6. Experimental Results

6.1. The Integrated CNN Performance

Comparison of different models under several criteria.
  • P_N is the number of model parameters, L_N is the model depth, F_T is the forward propagation time, B_T is the error backward propagation time, CLF_T is the average time required to identify a single image, Tr_T is the model training time, and Error is the classification error rate on Testing Dataset 1.

P_N of AlexNet is over 1000 times that of SFA, and P_N of GoogLeNet is almost 7 times that of SFGN.

The F_T values of the different models are of the same order of magnitude.

B_T, in contrast, differs dramatically between models.

CLF_T of SFA is only about 13.5% of AlexNet’s CLF_T, and SFGN is about 6 times faster than GoogLeNet.

Moreover, the total training times of SFA and SFGN are both less than one day, while AlexNet and GoogLeNet each require about two days.

Finally, the classification error rate of SFA is 1.05% worse than that of the original AlexNet, while the gap from GoogLeNet to SFGN is only 0.11%.

The ensemble classifier easily outperforms both AlexNet and GoogLeNet in terms of classification accuracy.

6.2. SOTA Comparison

Comparison of the ensemble classifier and the state-of-the-art.
  • The classification accuracies of the two-step way [4], single-layered NN [9] and DNN [16] included in the table are the ones reported in their respective references. (The datasets are different, but this is understandable since re-implementation is difficult.)
  • The prediction accuracy (> 90%) of learned feature-based methods is generally superior to that (< 90%) of methods that use handcrafted features.

The classification accuracy of SFA on the simulated testing dataset is 96.99%, which is slightly lower than AlexNet’s 97.74%.

Nevertheless, it is still better than the DNN model’s 95.2%.

The classification accuracy of SFGN is 98.12%, which is higher than that of SFA but lower than the 98.89% of the ensemble classifier.

In addition, the classification accuracies of SFA, SFGN and the ensemble classifier on the real/natural blur datasets are 93.75%, 95.81% and 96.72%, respectively.


