[Paper] DropBlock: A Regularization Method for Convolutional Networks (Image Classification)
Outperforms Dropout, DropPath from FractalNet, SpatialDropout, Cutout, AutoAugment, and Label Smoothing from Inception-v3

In this story, DropBlock: A regularization method for convolutional networks (DropBlock), by Google Brain, is briefly presented. In this paper:
- DropBlock, a form of structured Dropout, is proposed where units in a contiguous region of a feature map are dropped together.
- Applying DropBlock to skip connections in addition to the convolution layers increases accuracy.
This is a paper in 2018 NeurIPS with over 200 citations. (Sik-Ho Tsang @ Medium)
Outline
- DropBlock
- Experimental Results
1. DropBlock


- DropBlock has two main parameters: block_size and γ.
1.1. block_size
- block_size is the size of the block to be dropped.
- DropBlock resembles Dropout when block_size = 1 and resembles SpatialDropout when block_size covers the full feature map.
1.2. γ
- γ controls how many activation units to drop.
- In the paper, γ is computed as:

γ = ((1 - keep_prob) / block_size²) × (feat_size² / (feat_size - block_size + 1)²)
- where keep_prob can be interpreted as the probability of keeping a unit in traditional Dropout.
- The size of the valid seed region is (feat_size - block_size + 1)², where feat_size is the size of the feature map.
- There will be some overlap among the dropped blocks, so the above equation is only an approximation.
- In the experiments, the authors first choose the keep_prob to use (between 0.75 and 0.95), and then compute γ according to the above equation, as in the sketch below.
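To make the mechanism concrete, here is a minimal PyTorch sketch of a DropBlock forward pass, assuming square NCHW feature maps and an odd block_size; the function name and defaults are illustrative, not the paper's reference implementation:

```python
import torch
import torch.nn.functional as F

def drop_block(x, keep_prob=0.9, block_size=7):
    # x: NCHW feature map; assumes H == W and odd block_size (illustrative only)
    n, c, h, w = x.shape
    # gamma from the approximation above
    gamma = ((1.0 - keep_prob) / block_size ** 2) * \
            (h ** 2 / (h - block_size + 1) ** 2)
    # sample Bernoulli(gamma) seeds only inside the valid region,
    # so every dropped block lies fully within the feature map
    valid = h - block_size + 1
    offset = block_size // 2
    seeds = torch.zeros_like(x)
    seeds[:, :, offset:offset + valid, offset:offset + valid] = (
        torch.rand(n, c, valid, valid, device=x.device) < gamma
    ).float()
    # expand each seed into a block_size x block_size square of zeros
    block_mask = 1.0 - F.max_pool2d(seeds, kernel_size=block_size,
                                    stride=1, padding=offset)
    # rescale so the expected activation magnitude stays the same
    return x * block_mask * block_mask.numel() / block_mask.sum()
```

This would only be applied during training, with the identity at inference. As a worked example of the γ equation: with feat_size = 14, block_size = 7, and keep_prob = 0.9, γ = (0.1 / 49) × (196 / 64) ≈ 0.00625.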
1.3. Scheduled DropBlock
- It is found that DropBlock with a fixed keep_prob during training does not work well.
- Applying a small value of keep_prob hurts learning at the beginning. Instead, gradually decreasing keep_prob over time from 1 to the target value is more robust (a minimal sketch of the schedule follows this list).
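The paper uses a linear scheme for this decrease; a minimal sketch (function and argument names are illustrative):

```python
def scheduled_keep_prob(step, total_steps, target_keep_prob=0.9):
    # linearly decrease keep_prob from 1.0 to the target over training
    progress = min(step / total_steps, 1.0)
    return 1.0 - progress * (1.0 - target_keep_prob)
```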
2. Experimental Results
2.1. Image Classification on ImageNet

- A residual network can be represented as building groups based on the spatial resolution of the feature activations. A building group consists of multiple building blocks. Group 4 is the last group in the residual network (i.e., all layers in conv5_x), and so on.
- Top & Bottom: Applying DropBlock to group 4 alone performs worse than applying it to both groups 3 and 4.
- Left & Middle: Applying DropBlock after both the convolution layers and the skip connections obtains higher accuracy (see the residual-block sketch after this list).
- Bottom Right: The best DropBlock configuration is to apply block_size = 7 to both groups 3 and 4.
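To make this placement concrete, here is a hypothetical PyTorch residual block that applies DropBlock after the convolution branch and on the skip connection, reusing the drop_block sketch above; the class name and layer layout are illustrative, not the exact ResNet-50 block:

```python
import torch
import torch.nn as nn

class ResidualBlockWithDropBlock(nn.Module):
    def __init__(self, channels, keep_prob=0.9, block_size=7):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.keep_prob = keep_prob
        self.block_size = block_size

    def forward(self, x):
        out = self.branch(x)
        if self.training:  # DropBlock is only active during training
            out = drop_block(out, self.keep_prob, self.block_size)
            x = drop_block(x, self.keep_prob, self.block_size)  # skip path too
        return torch.relu(out + x)
```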

- Left: DropBlock outperforms SpatialDropout and Dropout.
- Right: The scheduled keep_prob makes DropBlock more robust to the choice of keep_prob and adds improvement for most values of keep_prob.

DropBlock outperforms Dropout, DropPath from FractalNet, SpatialDropout, Cutout, AutoAugment, and label smoothing from Inception-v3.

- AmoebaNet-B uses Dropout with a keep probability of 0.5, but only on the final softmax layer.
- DropBlock is applied after all batch normalization layers and also in the skip connections of the last 50% of the cells.
- A keep_prob of 0.9 and block_size = 11, the width of the last feature map, are used.
- DropBlock improves top-1 accuracy of AmoebaNet-B from 82.25% to 82.52%.
2.2. Object Detection on COCO

- DropBlock with keep_prob = 0.9 is used with different values of block_size.
- Adding DropBlock gives an additional 1.6% AP.
- DropBlock is an effective regularization approach for object detection.
2.3. Semantic Segmentation on PASCAL VOC

Reference
[2018 NeurIPS] [DropBlock]
DropBlock: A regularization method for convolutional networks
Image Classification
1989–1998: [LeNet]
2012–2014: [AlexNet & CaffeNet] [Maxout] [NIN] [ZFNet] [SPPNet]
2015: [VGGNet] [Highway] [PReLU-Net] [STN] [DeepImage] [GoogLeNet / Inception-v1] [BN-Inception / Inception-v2]
2016: [SqueezeNet] [Inception-v3] [ResNet] [Pre-Activation ResNet] [RiR] [Stochastic Depth] [WRN] [Trimps-Soushen]
2017: [Inception-v4] [Xception] [MobileNetV1] [Shake-Shake] [Cutout] [FractalNet] [PolyNet] [ResNeXt] [DenseNet] [PyramidNet] [DRN] [DPN] [Residual Attention Network] [IGCNet / IGCV1] [Deep Roots]
2018: [RoR] [DMRNet / DFN-MR] [MSDNet] [ShuffleNet V1] [SENet] [NASNet] [MobileNetV2] [CondenseNet] [IGCV2] [IGCV3] [FishNet] [SqueezeNext] [ENAS] [PNASNet] [ShuffleNet V2] [BAM] [CBAM] [MorphNet] [NetAdapt] [mixup] [DropBlock]
2019: [ResNet-38] [AmoebaNet] [ESPNetv2] [MnasNet] [Single-Path NAS] [DARTS] [ProxylessNAS] [MobileNetV3] [FBNet]