[Paper] DropBlock: A Regularization Method for Convolutional Networks (Image Classification)
Outperforms Dropout, DropPath from FractalNet, SpatialDropout, Cutout, AutoAugment, and Label Smoothing from Inception-v3

In this story, DropBlock: A regularization method for convolutional networks (DropBlock), by Google Brain, is briefly presented. In this paper:
- DropBlock, a form of structured Dropout, is proposed where units in a contiguous region of a feature map are dropped together.
- Applying DropBlock to skip connections in addition to the convolution layers increases accuracy.
This is a paper in 2018 NeurIPS with over 200 citations. (Sik-Ho Tsang @ Medium)
Outline
- DropBlock
- Experimental Results
1. DropBlock


- DropBlock has two main parameters: block_size and γ.
1.1. block_size
- block_size is the size of the block to be dropped.
- DropBlock resembles Dropout when block_size = 1 and resembles SpatialDropout when block_size covers the full feature map.
1.2. γ
- γ controls how many activation units to drop.
- In the paper, γ is computed as:

γ = ((1 - keep_prob) / block_size²) × (feat_size² / (feat_size - block_size + 1)²)
- where keep_prob can be interpreted as the probability of keeping a unit in traditional Dropout.
- The size of the valid seed region is (feat_size - block_size + 1)², where feat_size is the size of the feature map.
- There will be some overlap among the dropped blocks, so the above equation is only an approximation.
- In the experiments, the authors first choose the keep_prob to use (between 0.75 and 0.95), and then compute γ according to the above equation, as in the sketch below.
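To make the mechanism concrete, here is a minimal PyTorch sketch of a DropBlock forward pass, assuming square NCHW feature maps and an odd block_size; the function name and defaults are illustrative, not the paper's reference implementation:

```python
import torch
import torch.nn.functional as F

def drop_block(x, keep_prob=0.9, block_size=7):
    # x: NCHW feature map; assumes H == W and odd block_size (illustrative only)
    n, c, h, w = x.shape
    # gamma from the approximation above
    gamma = ((1.0 - keep_prob) / block_size ** 2) * \
            (h ** 2 / (h - block_size + 1) ** 2)
    # sample Bernoulli(gamma) seeds only inside the valid region,
    # so every dropped block lies fully within the feature map
    valid = h - block_size + 1
    offset = block_size // 2
    seeds = torch.zeros_like(x)
    seeds[:, :, offset:offset + valid, offset:offset + valid] = (
        torch.rand(n, c, valid, valid, device=x.device) < gamma
    ).float()
    # expand each seed into a block_size x block_size square of zeros
    block_mask = 1.0 - F.max_pool2d(seeds, kernel_size=block_size,
                                    stride=1, padding=offset)
    # rescale so the expected activation magnitude stays the same
    return x * block_mask * block_mask.numel() / block_mask.sum()
```

This would only be applied during training, with the identity at inference. As a worked example of the γ equation: with feat_size = 14, block_size = 7, and keep_prob = 0.9, γ = (0.1 / 49) × (196 / 64) ≈ 0.00625.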
1.3. Scheduled DropBlock
- It is found that DropBlock with a fixed keep_prob during training does not work well.
- Applying a small value of keep_prob hurts learning at the beginning. Instead, gradually decreasing keep_prob over time from 1 to the target value is more robust (a minimal sketch of the schedule follows this list).
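The paper uses a linear scheme for this decrease; a minimal sketch (function and argument names are illustrative):

```python
def scheduled_keep_prob(step, total_steps, target_keep_prob=0.9):
    # linearly decrease keep_prob from 1.0 to the target over training
    progress = min(step / total_steps, 1.0)
    return 1.0 - progress * (1.0 - target_keep_prob)
```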
2. Experimental Results
2.1. Image Classification on ImageNet

- A residual network can be represented as building groups based on the spatial resolution of the feature activations. A building group consists of multiple building blocks. Group 4 is the last group in the residual network (i.e., all layers in conv5_x), and so on.
- Top & Bottom: Applying DropBlock to group 4 alone performs worse than applying it to both groups 3 and 4.
- Left & Middle: Applying DropBlock after both the convolution layers and the skip connections obtains higher accuracy (see the residual-block sketch after this list).
- Bottom Right: The best DropBlock configuration is to apply block_size = 7 to both groups 3 and 4.
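To make this placement concrete, here is a hypothetical PyTorch residual block that applies DropBlock after the convolution branch and on the skip connection, reusing the drop_block sketch above; the class name and layer layout are illustrative, not the exact ResNet-50 block:

```python
import torch
import torch.nn as nn

class ResidualBlockWithDropBlock(nn.Module):
    def __init__(self, channels, keep_prob=0.9, block_size=7):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.keep_prob = keep_prob
        self.block_size = block_size

    def forward(self, x):
        out = self.branch(x)
        if self.training:  # DropBlock is only active during training
            out = drop_block(out, self.keep_prob, self.block_size)
            x = drop_block(x, self.keep_prob, self.block_size)  # skip path too
        return torch.relu(out + x)
```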

- Left: DropBlock outperforms SpatialDropout and Dropout.
- Right: The scheduled keep_prob makes DropBlock more robust to the choice of keep_prob and adds improvement for most values of keep_prob.

DropBlock outperforms Dropout, DropPath from FractalNet, SpatialDropout, Cutout, AutoAugment, and label smoothing from Inception-v3.

- AmoebaNet-B uses Dropout with a keep probability of 0.5, but only on the final softmax layer.
- DropBlock is applied after all batch normalization layers and also in the skip connections of the last 50% of the cells.
- A keep_prob of 0.9 and block_size = 11, the width of the last feature map, are used.
- DropBlock improves top-1 accuracy of AmoebaNet-B from 82.25% to 82.52%.
2.2. Object Detection on COCO

- DropBlock with keep_prob = 0.9 is used with different values of block_size.
- Adding DropBlock gives an additional 1.6% AP.
- DropBlock is an effective regularization approach for object detection.
2.3. Semantic Segmentation on PASCAL VOC

Reference
[2018 NeurIPS] [DropBlock]
DropBlock: A regularization method for convolutional networks
Image Classification
1989–1998: [LeNet]
2012–2014: [AlexNet & CaffeNet] [Maxout] [NIN] [ZFNet] [SPPNet]
2015: [VGGNet] [Highway] [PReLU-Net] [STN] [DeepImage] [GoogLeNet / Inception-v1] [BN-Inception / Inception-v2]
2016: [SqueezeNet] [Inception-v3] [ResNet] [Pre-Activation ResNet] [RiR] [Stochastic Depth] [WRN] [Trimps-Soushen]
2017: [Inception-v4] [Xception] [MobileNetV1] [Shake-Shake] [Cutout] [FractalNet] [PolyNet] [ResNeXt] [DenseNet] [PyramidNet] [DRN] [DPN] [Residual Attention Network] [IGCNet / IGCV1] [Deep Roots]
2018: [RoR] [DMRNet / DFN-MR] [MSDNet] [ShuffleNet V1] [SENet] [NASNet] [MobileNetV2] [CondenseNet] [IGCV2] [IGCV3] [FishNet] [SqueezeNext] [ENAS] [PNASNet] [ShuffleNet V2] [BAM] [CBAM] [MorphNet] [NetAdapt] [mixup] [DropBlock]
2019: [ResNet-38] [AmoebaNet] [ESPNetv2] [MnasNet] [Single-Path NAS] [DARTS] [ProxylessNAS] [MobileNetV3] [FBNet]