Review — BR²Net: Defocus Blur Detection via a Bidirectional Channel Attention Residual Refining Network (Blur Detection)

Fusing Low- and High-Level Features Using CAM; Outperforms DHDE and BTBNet

Some challenging cases for defocus blur detection
  • A bidirectional residual feature refining network with two branches is built by embedding multiple RLRMs into it, which recurrently combine and refine the residual features.
  • The outputs of the two branches are fused to obtain the final results.

Outline

  1. BR²Net: Network Architecture
  2. Residual Learning and Refining Module (RLRM)
  3. Channel Attention Module (CAM)
  4. Defocus Map Fusion
  5. Experimental Results
  6. Ablation Study

1. BR²Net: Network Architecture

BR²Net: Network Architecture
  • The low-level features help preserve fine details such as object boundaries; on the other hand, the high-level semantic features work well in locating the blurry regions and suppressing background clutter.
  • The ResNeXt structure is used as the backbone feature extraction network, and the ImageNet-pretrained ResNeXt model is deployed for network initialization; it produces five basic feature extraction layers: conv1, conv2_x, conv3_x, conv4_x, and conv5_x.
  • A bidirectional feature refining network captures the different levels of information from different layers along two directional pathways: one pathway goes from the shallow layers to the deep layers (denoted by L2H), while the other goes in the opposite direction (denoted by H2L).
  • A residual learning and refining module (RLRM) is designed, and multiple RLRMs are embedded into the two directional feature-refining pathways.
  • Suppose that the feature maps extracted from the pretrained ResNeXt model are denoted by F1, F2, F3, F4, and F5, from the shallow layers to the deep layers.
  • For the L2H pathway, let OLH1 represent the output from F1; the output of the t-th recurrent step is then obtained in residual form, where R(·) denotes the residual learned by the RLRM:

OLHt = OLHt-1 + R(OLHt-1, Ft), t = 2, …, 5

  • Similarly, the output of the t-th recurrent step in the H2L pathway, with OHL1 obtained from F5 (a code sketch of both pathways follows this list):

OHLt = OHLt-1 + R(OHLt-1, F6-t), t = 2, …, 5
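A minimal PyTorch sketch of the two refining pathways under these recurrences is given below. The RLRM interface, the factory helper, and the assumption that the five backbone features are already projected to a common channel count are illustrative choices, not the authors' exact configuration; an RLRM sketch follows in Section 2.

```python
import torch.nn as nn

class BidirectionalRefiner(nn.Module):
    """Sketch of the L2H and H2L pathways: each recurrent step refines the
    previous side output with the next level of backbone features via an
    RLRM (the RLRM itself performs the residual addition)."""
    def __init__(self, rlrm_factory, num_levels=5):
        super().__init__()
        # One RLRM per recurrent step and per direction (4 + 4 here).
        self.l2h = nn.ModuleList([rlrm_factory() for _ in range(num_levels - 1)])
        self.h2l = nn.ModuleList([rlrm_factory() for _ in range(num_levels - 1)])

    def forward(self, feats):               # feats = [F1, ..., F5], shallow to deep
        o_lh, o_hl = feats[0], feats[-1]    # step-1 outputs from F1 and F5
        for t, rlrm in enumerate(self.l2h, start=1):
            o_lh = rlrm(o_lh, feats[t])         # steps 2..5 integrate F2..F5
        for t, rlrm in enumerate(self.h2l, start=1):
            o_hl = rlrm(o_hl, feats[-1 - t])    # steps 2..5 integrate F4..F1
        return o_lh, o_hl
```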

2. Residual Learning and Refining Module (RLRM)

The detailed structure of the proposed RLRM
  • In addition, the supervision signal is imposed on each RLRM to improve residual learning at each recurrent step during the training process.
  • There are at least three advantages of proposing and embedding multiple RLRMs into BR²Net (a code sketch of the module follows this list):
  2. Second, the RLRM can easily integrate deep features extracted from different layers to refine the residual learning process step by step.
  3. Third, faster convergence can be obtained at the early stages, effectively reducing both the time cost and the training error.
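To make the residual learning idea concrete, here is a minimal PyTorch sketch of an RLRM: it learns a residual from the previous side output together with the current backbone features and adds it back through a skip connection. The channel count, kernel sizes, and bilinear resizing are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RLRM(nn.Module):
    """Residual learning and refining module (sketch): fuse the previous
    output with the current features, predict a residual, add it back."""
    def __init__(self, channels=64):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, prev_out, feat):
        # Backbone levels differ in scale, so match spatial sizes first.
        feat = F.interpolate(feat, size=prev_out.shape[2:],
                             mode='bilinear', align_corners=False)
        fused = torch.cat([prev_out, feat], dim=1)
        return prev_out + self.residual(fused)  # residual refinement
```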

3. Channel Attention Module (CAM)

Channel Attention Module (CAM)
  • The channel-wise global spatial information is first converted into channel descriptors by leveraging global average pooling.
  • Then, the descriptors are mapped to channel-wise attention weights, and the final weighted channel-wise feature maps are obtained by rescaling the input feature maps with these weights (a code sketch follows).
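The description (global average pooling into channel descriptors, then channel-wise weighting) matches the squeeze-and-excitation pattern, so a minimal sketch in that style is given below; the bottleneck fully connected layers and the reduction ratio are assumptions.

```python
import torch.nn as nn

class CAM(nn.Module):
    """Channel attention module (sketch): pool each channel to a descriptor,
    map the descriptors to per-channel gates, and rescale the input."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # channel descriptors
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # weighted channel-wise features
```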

4. Defocus Map Fusion

4.1. Final Output

  • The final defocus blur map is generated by fusing the predictions from the outputs of the two pathways (denoted by OLH and OHL).
  • Specifically, the outputs OLH and OHL are first concatenated, and then a convolution layer with a ReLU activation function is applied to the concatenated maps to obtain the final output defocus blur map B (a code sketch follows):

B = ReLU(Conv([OLH, OHL]))
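A minimal sketch of this fusion step as described (concatenation, then a convolution with ReLU); the 3×3 kernel and single-channel pathway outputs are assumptions.

```python
import torch
import torch.nn as nn

class DefocusMapFusion(nn.Module):
    """Fuse the two pathway outputs into the final defocus blur map B."""
    def __init__(self):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=3, padding=1),  # conv on concatenated maps
            nn.ReLU(inplace=True),                      # ReLU as described
        )

    def forward(self, o_lh, o_hl):                      # each: (B, 1, H, W)
        return self.fuse(torch.cat([o_lh, o_hl], dim=1))
```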

4.2. Training

  • For each intermediate output, the cross-entropy loss is used for training.
  • Specifically, for the L2H pathway at the t-th recurrent step, the pixelwise cross-entropy loss between OLHt and the ground-truth blur mask G, with i indexing pixels, is:

Loss(OLHt, G) = −Σi [ Gi·log(OLHt,i) + (1 − Gi)·log(1 − OLHt,i) ]

  • The H2L pathway is supervised in the same way at each step (a sketch of the total training loss follows this list).
  • The network is initialized with the ResNeXt network pretrained on ImageNet and then fine-tuned on part of Shi's dataset (604 images).
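A minimal sketch of the deeply supervised training loss, assuming every side output and the fused map have already been passed through a sigmoid and that all loss terms are equally weighted (the paper may weight them differently).

```python
import torch.nn.functional as F

def br2net_loss(outputs_lh, outputs_hl, fused, gt):
    """Pixelwise binary cross-entropy on each recurrent side output of both
    pathways plus the fused map; all predictions and gt lie in [0, 1]."""
    loss = F.binary_cross_entropy(fused, gt)
    for o in outputs_lh + outputs_hl:        # all intermediate side outputs
        loss = loss + F.binary_cross_entropy(o, gt)
    return loss
```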

5. Experimental Results

5.1. Quantitative Comparison

The comparison of the different methods in terms of the MAE, F-measure and AUC scores
Comparison of the precision-recall curves, F-measure curves and ROC curves of the different methods on Shi's dataset
Comparison of the precision-recall curves, F-measure curves and ROC curves of the different methods on the DUT dataset
Comparison of the precision-recall curves, F-measure curves and ROC curves of the different methods on the CTCUG dataset
  • The results again demonstrate that BR²Net consistently outperforms the other methods (the evaluation metrics are sketched below).
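For reference, minimal NumPy sketches of two of the reported metrics are given below; the β² = 0.3 weighting and the fixed binarization threshold in the F-measure follow the convention common in this literature and are assumptions here.

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error between a predicted blur map and a binary mask."""
    return np.abs(pred - gt).mean()

def f_measure(pred, gt, beta2=0.3, thresh=0.5):
    """F-measure at a fixed binarization threshold."""
    binary = pred >= thresh
    tp = np.logical_and(binary, gt > 0.5).sum()
    precision = tp / (binary.sum() + 1e-8)
    recall = tp / ((gt > 0.5).sum() + 1e-8)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8)
```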

5.2. Qualitative Comparison

Visual comparison of the detected defocus blur maps generated from the different methods
  • In addition, BR²Net can preserve the boundary information of the in-focus objects well.
  • When the background is in focus and the foreground regions are blurred, BR²Net also works well.

5.3. Running Efficiency Comparison

Running Platform and Average Running Time (seconds)
  • Once well trained, BR²Net is faster at inference than all of the other methods.
  • For BTBNet, according to its paper, training took 5 days, and approximately 25 seconds is required to generate the defocus blur map for an input image of 320×320 pixels.

6. Ablation Study

Ablation Study

6.1. Effectiveness of the RLRM

  • BR²Net_no_RLRM: all of the RLRMs are removed, and the intermediate side outputs are directly refined without residual learning.
  • BR²Net with the RLRMs (i.e., with residual learning) performs significantly better than BR²Net_no_RLRM.
The training loss of BR²Net with and without the RLRM

6.2. Effectiveness of the Final Defocus Blur Map Fusion Step

  • The final outputs of the two pathways are represented by OLH and OHL.
  • As shown in the table above, the fusion mechanism effectively improves the final results.
The intermediate outputs of the two feature-refining pathways

6.3. Effectiveness of the Different Backbone Network Architectures

  • VGG16 is also tested as the backbone (denoted BR²Net_VGG16).
  • As shown in the table above, BR²Net_VGG16 also achieves impressive performance.

6.4. Failure Cases

Failure cases generated by using the proposed method. Left: Input, Middle: GT, Right: Predicted Results
  • For the H2L pathway, some semantic information is also erased, as shown in the red boxes above.
  • As future work, the authors mention that they may add edge and segmentation loss functions to supervise the network training.
