Review — BDNet: Blur Detection Convolutional Neural Network (Blur Detection)
Fusing Blur Detection Results From Multiscale AlexNet-Like Networks to Obtain Better Results
8 min read · Dec 26, 2020
In this story, Multiscale Blur Detection by Learning Discriminative Deep Features (BDNet), by Tianjin University and Civil Aviation University of China, is reviewed. In this paper:
- A simple yet effective 6-layer CNN model, with 5 layers for feature extraction and 1 layer for binary classification, is proposed, which can faithfully produce patch-level blur likelihood.
- The network is applied at three coarse-to-fine scales. The multiscale blur likelihood maps are optimally fused to generate a better blur detection result.
This is a paper in 2018 JNEUCOM with over 20 citations, where JNEUCOM is an Elsevier journal with a high impact factor of 4.438. (Sik-Ho Tsang @ Medium)
Outline
- BDNet: Single Scale Deep Blur Detection
- BDNet: Multiscale Deep Blur Detection
- CBDNet: Compressed BDNet
- Experimental Results
1. BDNet: Single Scale Deep Blur Detection
1.1. BDNet: Network Architecture
- A six-layer CNN model, similar to AlexNet, is designed for single-scale blur detection.
- The first convolutional layer has 96 filters of size 5×5 to extract low level features.
- The second convolutional layer has 256 filters of size 5×5 to extract middle level features.
- The third convolutional layer has 384 filters of size 3×3, which is responsible for high-level feature extraction.
- Each convolutional layer is followed by a 2×2 max pooling layer.
- The fourth and fifth layers are fully connected layers, which have 2048 neurons for each layer.
- Dropout with a probability of 0.5 is used in layers 4 and 5 to avoid overfitting.
- The last layer is a 2-way softmax layer for binary classification.
- The details are as follows:
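Below is a minimal PyTorch sketch of this single-scale network. The padding values and activation placement are assumptions for illustration; the paper trains a separate network for each patch scale (21×21, 35×35, 49×49).

```python
import torch
import torch.nn as nn

class BDNet(nn.Module):
    """Sketch of single-scale BDNet: 3 conv layers (96, 256, 384 filters),
    each followed by 2x2 max pooling, then two 2048-d FC layers and a
    2-way classifier. Paddings/ReLUs are assumptions, not from the paper."""
    def __init__(self, patch_size=35):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, 5, padding=2), nn.ReLU(inplace=True),    # low-level
            nn.MaxPool2d(2),
            nn.Conv2d(96, 256, 5, padding=2), nn.ReLU(inplace=True),  # mid-level
            nn.MaxPool2d(2),
            nn.Conv2d(256, 384, 3, padding=1), nn.ReLU(inplace=True), # high-level
            nn.MaxPool2d(2),
        )
        s = patch_size // 8  # spatial size after three 2x2 poolings
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(384 * s * s, 2048), nn.ReLU(inplace=True), nn.Dropout(0.5),
            nn.Linear(2048, 2048), nn.ReLU(inplace=True), nn.Dropout(0.5),
            nn.Linear(2048, 2),  # 2-way softmax, via cross-entropy loss
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```

For example, `BDNet(patch_size=21)`, `BDNet(patch_size=35)` and `BDNet(patch_size=49)` would give the three single-scale models BDNet-1,2,3 trained separately below.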
1.2. Dataset Preparation
- Shi’s dataset is used, which has 296 motion blur images and 704 out-of-focus blur images. 80% of each type of image are randomly selected as the training set; the remaining 20% are used as the test set.
- For each image in the training set, training patches are collected by sampling image patches at multiple patch scales (i.e., 21×21, 35×35 and 49×49) with a stride of 5 pixels by means of a sliding window.
- A patch is labeled as positive (blurred) if more than 80% of its pixels are blurred. Otherwise, it is labeled as negative (see the sampling sketch after this list).
- To increase the diversity of the training patches, patches are also sampled from the training images resized at ratios of 0.5 and 0.25.
- The ratio of positive to negative training patches is restricted to 1. Finally, about 10, 5 and 4 million training patches are collected for patch scales of 21×21, 35×35 and 49×49, respectively.
- 80% of the samples at each scale are randomly selected to train the model; the remaining 20% are used for validation. The ratio of positive to negative samples is also fixed at 1.
- Since the input size of the model differs per scale, the networks are trained separately using stochastic gradient descent with a batch size of 128.
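A minimal sketch of this patch sampling and labeling recipe, assuming each training image comes with a per-pixel binary blur mask (as in Shi’s dataset); the function and variable names are hypothetical:

```python
import numpy as np

def sample_patches(image, blur_mask, size, stride=5, pos_thresh=0.8):
    """Sliding-window patch sampler following the paper's recipe:
    a patch is positive (blurred) if more than 80% of its pixels are
    marked blurred in the mask; otherwise it is negative (sharp)."""
    patches, labels = [], []
    h, w = blur_mask.shape
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            patch = image[y:y + size, x:x + size]
            blurred_ratio = blur_mask[y:y + size, x:x + size].mean()
            patches.append(patch)
            labels.append(1 if blurred_ratio > pos_thresh else 0)
    return np.array(patches), np.array(labels)
```

In practice this would be run at each patch scale (21, 35, 49), on the original images and on the 0.5x and 0.25x resized ones, with positives and negatives then balanced 1:1 as described above.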
2. BDNet: Multiscale Deep Blur Detection
- For a given image, we obtain the blur detection map Ds at the s-th scale.
- An optimization model is built to estimate the blur probability Bs at each scale s. By vectorizing Bs and Ds into bs and ds, respectively, the energy function is defined (a reconstruction is given after this list),
- where p is the pixel index. There are three terms.
- The first term is the data term, which keeps the estimated blur probability at each pixel p close to the detected value.
- The second term keeps the blur degree consistent within a neighborhood at the same scale.
- The third term enforces consistency of the blur degree across different scales.
- And w^s_pq is the appearance similarity of two pixels p and q, also given in the reconstruction after this list,
- where f_p is the appearance of the pixel at position p.
- Parameters α and β are set to 0.5.
- The final blur map can be obtained by reshaping the optimal b̂1 at the finest scale.
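From the three term descriptions above, the energy function and the similarity weight can be written as follows (a hedged reconstruction; the paper's exact notation may differ):

```latex
E(\{\mathbf{b}_s\}) = \sum_{s}\sum_{p}\big(b_s(p)-d_s(p)\big)^2
  + \alpha\sum_{s}\sum_{p}\sum_{q\in\mathcal{N}(p)} w^{s}_{pq}\,\big(b_s(p)-b_s(q)\big)^2
  + \beta\sum_{s=1}^{S-1}\sum_{p}\big(b_s(p)-b_{s+1}(p)\big)^2,
\qquad
w^{s}_{pq} = \exp\!\left(-\frac{\lVert f_p - f_q \rVert^2}{2\sigma^2}\right)
```

Since the energy is quadratic in b, setting its gradient to zero gives a sparse linear system. A minimal Python sketch, assuming all per-scale maps are resized to a common resolution and ordered finest first, and using uniform neighbor weights in place of w^s_pq for brevity:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def fuse_multiscale(maps, alpha=0.5, beta=0.5):
    """Fuse S blur maps (each HxW, finest first) by minimizing:
    data term + intra-scale smoothness + inter-scale consistency."""
    S = len(maps)
    H, W = maps[0].shape
    n = H * W
    d = np.concatenate([m.ravel() for m in maps])

    # 4-neighbor grid Laplacian for one scale (uniform weights here;
    # the paper weights edges by appearance similarity w^s_pq)
    idx = np.arange(n).reshape(H, W)
    edges = list(zip(idx[:, :-1].ravel(), idx[:, 1:].ravel()))  # horizontal
    edges += list(zip(idx[:-1, :].ravel(), idx[1:, :].ravel())) # vertical
    i, j = np.array(edges).T
    A = sp.coo_matrix((np.ones(len(i)), (i, j)), shape=(n, n))
    A = A + A.T
    L1 = sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A
    L = sp.block_diag([L1] * S)  # one Laplacian block per scale

    # Inter-scale coupling: path-graph Laplacian over scales, per pixel
    chain = sp.diags([-np.ones(S - 1), -np.ones(S - 1)], [-1, 1], shape=(S, S))
    chain = chain + sp.diags(np.asarray(-chain.sum(axis=1)).ravel())
    C = sp.kron(chain, sp.eye(n))

    # Minimize ||b - d||^2 + alpha b'Lb + beta b'Cb  =>  (I + aL + bC) b = d
    I = sp.eye(S * n)
    b = spsolve((I + alpha * L + beta * C).tocsc(), d)
    return b[:n].reshape(H, W)  # optimal b̂1 at the finest scale
```

For full-resolution images a direct solve is slow; an iterative solver such as conjugate gradient would be the practical choice, which also relates to the fusion time reported in Section 4.4.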
3. CBDNet: Compressed BDNet
- The feature maps in each layer can be classified into four classes, i.e., positive, negative, image-like and null.
- The classification is done by observation/manual inspection.
- The positive feature maps look like the final blur detection result: blur regions have large values and sharp regions have small values.
- The negative feature maps are opposite to the positive ones.
- The image-like feature maps have large values in both blur and sharp regions.
- The null feature maps differ from the first three types: they have very small values, or even all zeros, over the whole map.
- In brief, the image-like features are more effective in layers 1–3 and less useful in layers 4 and 5.
- The null features have little effect in almost all layers.
- Thus, the filters that respond with null maps are removed, and the remaining filters of the original BDNet-s are sampled according to their ratios.
- The filter numbers from the first to the fifth layer are set to 64, 64, 64, 512 and 256, respectively, which forms the compressed network called CBDNet-s. This new network is then fine-tuned until convergence (a simple null-filter test is sketched after this list).
- (If interested in this part, please feel free to read the paper. There is a large coverage about the feature analysis based on the feature map types.)
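The paper classifies feature maps by manual inspection; as a rough automated stand-in (purely an assumption, not the authors' procedure), null filters could be flagged by their near-zero responses over a batch of validation patches:

```python
import torch

@torch.no_grad()
def find_null_filters(activations, eps=1e-3):
    """Flag 'null' filters: feature maps with near-zero magnitude over a
    batch. activations: tensor of shape (N, C, H, W) from one conv layer."""
    per_filter = activations.abs().mean(dim=(0, 2, 3))  # mean |response| per filter
    return (per_filter < eps).nonzero().flatten().tolist()
```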
4. Experimental Results
4.1. BDNet
- From the figures, we clearly have three observations:
- The scale ambiguity does exist in blur detection.
- At a fine scale, the proposed model detects blur within a small extent. Thus, the detector misclassifies a smooth region as blurred when it appears in a non-blurred context. Conversely, a textured region within a blurred context may be misclassified as non-blurred.
- At a large scale, the blur detection results are dilated. However, since a large region contains more context information, the blur detector is more robust against the above problem.
- The proposed fused blur detection results are better than those of the single-scale BDNet-s.
- From the table, even the proposed single-scale blur detection results are better than those of the state-of-the-art approaches.
4.2. CBDNet
- The compressed networks with deliberately sampled filters are clearly superior to the compressed networks with randomly sampled filters.
- In addition, the sizes of BDNet-1,2,3 are 26M, 35M and 73.2M, respectively, while the sizes of CBDNet-1,2,3 are 1.2M, 1.6M and 3.2M, respectively.
- The total size of CBDNet-1,2,3 is 6M, which is about 4% of the total size of BDNet-1,2,3. This makes the blur detection system easier to deploy on mobile phones or FPGA devices.
4.3. Further Studies
- 10 images are randomly selected, and 7 blur kernel sizes {3, 7, 11, 15, 21, 25, 31} and 7 Gaussian noise levels in the range [0, 0.003] are separately applied for testing.
- As shown in the above figure, two main observations are obtained.
- All compared blur detectors are quite stable as the blur kernel size increases, and only exhibit an accuracy drop at the smallest blur kernel size of 3.
- Low-level blur features are quite robust to noise, while the performance of the proposed method and Shi et al. [10] drops quickly as the noise level increases. One possible way to boost the noise robustness of the proposed method is to properly include noise in the training stage (see the sketch below).
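A small sketch of how such degraded test images (and, likewise, noise-augmented training patches) could be generated; the box-blur stand-in for the blur kernel and the reading of the noise level as a variance on a [0, 1] intensity scale are assumptions:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def degrade(image, kernel_size=7, noise_var=0.001):
    """Apply a blur kernel of a given size, then add Gaussian noise.
    image: 2-D float array in [0, 1]."""
    blurred = uniform_filter(image, size=kernel_size)  # simple box blur
    noisy = blurred + np.random.normal(0.0, np.sqrt(noise_var), image.shape)
    return np.clip(noisy, 0.0, 1.0)
```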
4.4. Running Time
- BDNet-1,2,3 run on a TitanX GPU, but the fusion for BDNet-F runs on a laptop with an i7 CPU and 16 GB RAM.
- The fusion of the three scales of blur detection maps for a 640×480 image takes 31.28 s.
4.5. Limitation Discussion
- As mentioned above, the fusion of the three scales of blur detection maps for a 640×480 image takes 31.28 s, which occupies 69.25% of the total running time.
- The slow fusion limits the blur detection in fast image or video processing applications such as real-time video blur detection.
- As shown above, three blur detection networks run separately to generate the multiscale blur detections, which is not efficient. Based on the above limitations, a larger, unified network is required.
4.6. Blur-Aware Saliency Detection
- Even the state-of-the-art salient object detectors [34–37] do not take blur cues into account well.
- A dataset, BAS500, is built, containing 500 out-of-focus blur and motion blur images captured by photographers and randomly selected from Flickr and Fengniao.
- The salient objects are manually labeled in each image to facilitate quantitative comparison.
- For each detector, the proposed blur detection map is embedded as an extra background prior. The improved saliency detectors are denoted by the suffix “_BA” (a simple sketch of this idea is given after this list).
- As shown above, blur-aware saliency detection can improve the detection accuracy when the background is complex and blurred.
- The quantitative comparison on BAS500 is shown above, which verifies that saliency detection can benefit from reliable blur detection.
- As shown in the Table, a comparable performance is obtained on MSRA1000.
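The exact embedding of the blur map as a background prior is not detailed here; one plausible formulation, purely for illustration, is to down-weight saliency in likely-blurred regions:

```python
import numpy as np

def blur_aware_saliency(saliency, blur_map, gamma=1.0):
    """Use a blur map as a background prior (an assumption, not the
    paper's exact formulation): blurred regions are likely background,
    so saliency there is suppressed. Inputs: float arrays in [0, 1]."""
    prior = (1.0 - blur_map) ** gamma      # sharp regions get higher weight
    refined = saliency * prior
    return refined / (refined.max() + 1e-8)  # renormalize to [0, 1]
```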
4.7. Other Potential Applications
- An interesting application is unintentional blur removal, which removes the useless blur in an image.
- From the blur detection in a video, we can easily identify the object that the camera focuses on, which can facilitate other applications.
Reference
[2018 JNEUCOM] [BDNet]
Multiscale blur detection by learning discriminative deep features
Blur Detection / Defocus Map Estimation
2017 [Park CVPR’17 / DHCF / DHDE] 2018 [Purohit ICIP’18] [BDNet] [BTBNet] 2020 [BTBCRL (BTBNet + CRLNet)]