Review: IGCNet / IGCV1 — Interleaved Group Convolutions (Image Classification)
Outperforms NIN, Highway, FractalNet, ResNet, Pre-Activation ResNet, Stochastic Depth, WRN, RiR, Xception, DenseNet, ResNeXt
In this story, Interleaved Group Convolutional Neural Networks (IGCNet / IGCV1), by Microsoft Research and University of Central Florida, is reviewed. With the novel Interleaved Group Convolutions, IGCV1 outperforms state-of-the-art approaches such as ResNet with fewer parameters and fewer FLOPs. This is a 2017 ICCV paper with more than 50 citations. Later on, the authors also published IGCV2 and IGCV3. (Sik-Ho Tsang @ Medium)
Outline
- Interleaved Group Convolution (IGC) Block
- Connections to Other Convolutions
- Evaluations
1. Interleaved Group Convolution (IGC) Block
- As shown above, an IGC block is split into primary group convolutions and secondary group convolutions.
- Permutations are applied before and after the secondary group convolutions.
1.1. Primary Group Convolutions
- Let L be the number of partitions. The input feature maps are divided into L groups as shown in the figure above.
- Standard spatial convolutions, such as 3×3, are applied to each group independently.
- Therefore, a group convolution can be viewed as a regular convolution with a sparse block-diagonal convolution kernel, where each block corresponds to a partition of channels and there are no connections across the partitions.
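Since each partition is convolved independently, the primary stage maps directly onto a grouped convolution. Below is a minimal PyTorch sketch of my own (not the authors' code; the variable names are assumptions):

```python
import torch
import torch.nn as nn

# Primary group convolution: L partitions, each with M channels,
# and an independent 3x3 convolution per partition. PyTorch's
# `groups` argument implements exactly this block-diagonal structure.
L, M = 4, 2                 # 4 primary partitions of 2 channels each
C = L * M                   # total number of channels

primary = nn.Conv2d(C, C, kernel_size=3, padding=1, groups=L, bias=False)

x = torch.randn(1, C, 32, 32)
print(primary(x).shape)     # torch.Size([1, 8, 32, 32])
```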
1.2. Secondary Group Convolutions
- Then, a permutation is performed on the output of the primary group convolutions, as in the equation above, where P is the permutation matrix: the mth secondary partition is composed of the mth output channel from each primary partition.
- Next, the secondary group convolution is performed over the M secondary partitions. Here, a point-wise 1×1 convolution is applied to each secondary partition.
- After the convolution, the channels are permuted back to the original ordering (see the sketch at the end of this section).
- In summary, an interleaved group convolution block is formulated as:

y = Pᵀ W² P W¹ x,

where W¹ and W² are the block-diagonal kernel matrices of the primary and secondary group convolutions, and P is the permutation matrix.
- It can be treated as:

y = W x, where W = Pᵀ W² P W¹,

- i.e. an IGC block is actually equivalent to a regular convolution with the convolution kernel W being the product of two sparse (block-diagonal) kernels.
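Putting the four steps together (primary group convolution, permutation, secondary group convolution, inverse permutation), one block could look as follows in PyTorch. This is my own illustration under the notation above, not the authors' released code; the class name IGCBlock and the channel-shuffle implementation are assumptions:

```python
import torch
import torch.nn as nn

class IGCBlock(nn.Module):
    """One interleaved group convolution block: y = P^T W2 P W1 x."""
    def __init__(self, L, M):
        super().__init__()
        self.L, self.M = L, M
        C = L * M
        # W1: L primary partitions, one 3x3 spatial convolution each.
        self.primary = nn.Conv2d(C, C, 3, padding=1, groups=L, bias=False)
        # W2: M secondary partitions of L channels, one 1x1 convolution each.
        self.secondary = nn.Conv2d(C, C, 1, groups=M, bias=False)

    @staticmethod
    def shuffle(x, g1, g2):
        # Reorder channels so the m-th secondary partition gathers
        # the m-th channel of each of the g1 primary partitions.
        n, c, h, w = x.shape
        return x.view(n, g1, g2, h, w).transpose(1, 2).reshape(n, c, h, w)

    def forward(self, x):
        x = self.primary(x)                   # W1
        x = self.shuffle(x, self.L, self.M)   # P
        x = self.secondary(x)                 # W2
        x = self.shuffle(x, self.M, self.L)   # P^T (permute back)
        return x

block = IGCBlock(L=24, M=2)
print(block(torch.randn(1, 48, 32, 32)).shape)  # torch.Size([1, 48, 32, 32])
```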
2. Connections to Other Convolutions
2.1. Connection to Regular Convolution
- The authors suggest that the above-right 4-branch IGC block is equivalent to a regular convolution, since the composite kernel W = Pᵀ W² P W¹ is dense in general.
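This equivalence can be checked numerically. For the 1×1 spatial case the kernels are plain C×C matrices, so we can build the block-diagonal kernels and the permutation matrix explicitly and verify that the block acts as a single dense kernel W (a small sketch of my own):

```python
import torch

L, M = 4, 2
C = L * M

# Sparse block-diagonal kernels (1x1 spatial case, so each kernel is a CxC matrix).
W1 = torch.block_diag(*[torch.randn(M, M) for _ in range(L)])  # primary: L blocks
W2 = torch.block_diag(*[torch.randn(L, L) for _ in range(M)])  # secondary: M blocks

# Permutation matrix P: input channel l*M + m goes to position m*L + l.
P = torch.zeros(C, C)
for l in range(L):
    for m in range(M):
        P[m * L + l, l * M + m] = 1.0

W = P.t() @ W2 @ P @ W1                     # composite kernel: dense in general
x = torch.randn(C)
y_igc = P.t() @ (W2 @ (P @ (W1 @ x)))       # the four steps of the IGC block
print(torch.allclose(W @ x, y_igc))         # True
```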
2.2. Connection to Summation Fusion like ResNeXt
- A ResNeXt-like summation-fusion block can also be viewed in the form of an IGC block: summing the L branch outputs is a degenerate secondary group convolution whose 1×1 kernels are fixed to all ones instead of being learned.
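My reading of this connection, sketched below: summing L branch outputs channel by channel is exactly what a 1×1 secondary group convolution computes when its weights are frozen to ones (the permutation first gathers the corresponding channels into one partition):

```python
import torch
import torch.nn as nn

L, M = 4, 2
branches = [torch.randn(1, M, 8, 8) for _ in range(L)]    # L branch outputs

fused = torch.stack(branches).sum(dim=0)                  # summation fusion

# The same result as a fixed 1x1 group convolution over the permuted channels.
x = torch.cat(branches, dim=1)                            # (1, L*M, 8, 8), partition-major
x = x.view(1, L, M, 8, 8).transpose(1, 2).reshape(1, L * M, 8, 8)  # permutation P
sum_conv = nn.Conv2d(L * M, M, 1, groups=M, bias=False)
nn.init.constant_(sum_conv.weight, 1.0)                   # fixed all-ones kernels
print(torch.allclose(fused, sum_conv(x), atol=1e-5))      # True
```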
2.3. Connection to Xception-like Network
- It is the extreme case where each primary partition contains exactly one channel, i.e. M = 1: the primary group convolution becomes a depthwise convolution, and the single secondary partition covers all channels, becoming an ordinary point-wise convolution.
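Concretely, this extreme case is just a depthwise separable convolution, as a short sketch of my own shows (C is an assumed channel count):

```python
import torch.nn as nn

C = 48
# M = 1: every primary partition holds one channel (L = C), so the primary
# group convolution degenerates to a depthwise 3x3 convolution, and the
# single secondary partition is an ordinary 1x1 pointwise convolution,
# i.e., an Xception-style depthwise separable convolution.
xception_like = nn.Sequential(
    nn.Conv2d(C, C, 3, padding=1, groups=C, bias=False),  # depthwise
    nn.Conv2d(C, C, 1, bias=False),                       # pointwise
)
```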
3. Evaluations
3.1. Comparison with SumFusion and Regular Convolution
- SumFusion: ResNeXt-like network.
- RegConv: Network using standard convolution.
- IGC-L?M?: Proposed IGCNet / IGCV1 with different L and M.
- It is found that IGCNet / IGCV1 has a smaller number of parameters as well as fewer FLOPs.
- IGC-L24M2, containing much fewer parameters, performs better than both RegConv-W16 and RegConv-W18.
- The IGC blocks increase the width, and the parameters are exploited more efficiently, as the rough count below illustrates.
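A back-of-the-envelope parameter count (my own arithmetic following the block structure above, not numbers from the paper's tables):

```python
def igc_params(L, M, S=9):
    # One IGC block, no biases: L primary 3x3 group convolutions
    # (S = 9 spatial taps per filter) plus M secondary 1x1 group convolutions.
    return L * M * M * S + M * L * L

def regular_params(C, S=9):
    # A regular 3x3 convolution with C input and C output channels.
    return C * C * S

L, M = 24, 2
print(igc_params(L, M))          # 2016 parameters for a 48-channel-wide block
print(regular_params(L * M))     # 20736: ~10x more for a regular conv of the same width
```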
3.2. Effect of Partition Number
- The performance with M = 2 secondary partitions is better than Xception-like network (M = 1).
- IGC with L = 40 and M = 2 gets 63.89% accuracy, about 0.8% better than IGC with L = 64 and M = 1, which gets 63.07% accuracy.
3.3. Combination with Identity Mapping Structure
- Identity Mapping Structure is used in Pre-Activation ResNet.
- IGC-L24M2 can be combined with the identity mapping structure, obtaining the highest accuracy.
3.4. ImageNet Classification Compared with ResNet
- IGC-L4M32+Ident. performs better than ResNet (C = 64) while containing slightly fewer parameters.
- IGC-L16M16+Ident. performs better than ResNet (C = 69), which has approximately the same number of parameters and computational complexity.
- The gains are not from regularization but from richer representation.
3.5. Comparison with State-of-the-art Approaches
- IGCNet / IGCV1 obtains the best accuracy on CIFAR-10 and the third best accuracy on SVHN, outperforming NIN, Highway, FractalNet, ResNet, Pre-Activation ResNet, Stochastic Depth, WRN, RiR, DenseNet, and ResNeXt.
- DenseNet obtains the best accuracy on CIFAR-100. However, the authors believe that the performance would be better if their network also adopted the bottleneck design as in DenseNet and ResNeXt.
Hope I can cover IGCV2 and IGCV3 in the future.
Reference
[2017 ICCV] [IGCNet / IGCV1]
Interleaved Group Convolutions for Deep Neural Networks
My Previous Reviews
Image Classification
[LeNet] [AlexNet] [Maxout] [NIN] [ZFNet] [VGGNet] [Highway] [SPPNet] [PReLU-Net] [STN] [DeepImage] [SqueezeNet] [GoogLeNet / Inception-v1] [BN-Inception / Inception-v2] [Inception-v3] [Inception-v4] [Xception] [MobileNetV1] [ResNet] [Pre-Activation ResNet] [RiR] [RoR] [Stochastic Depth] [WRN] [Shake-Shake] [FractalNet] [Trimps-Soushen] [PolyNet] [ResNeXt] [DenseNet] [PyramidNet] [DRN] [DPN] [Residual Attention Network] [MSDNet] [ShuffleNet V1] [SENet]
Object Detection
[OverFeat] [R-CNN] [Fast R-CNN] [Faster R-CNN] [MR-CNN & S-CNN] [DeepID-Net] [CRAFT] [R-FCN] [ION] [MultiPathNet] [NoC] [Hikvision] [GBD-Net / GBD-v1 & GBD-v2] [G-RMI] [TDM] [SSD] [DSSD] [YOLOv1] [YOLOv2 / YOLO9000] [YOLOv3] [FPN] [RetinaNet] [DCN]
Semantic Segmentation
[FCN] [DeconvNet] [DeepLabv1 & DeepLabv2] [CRF-RNN] [SegNet] [ParseNet] [DilatedNet] [DRN] [RefineNet] [GCN] [PSPNet] [DeepLabv3]
Biomedical Image Segmentation
[CUMedVision1] [CUMedVision2 / DCAN] [U-Net] [CFS-FCN] [U-Net+ResNet] [MultiChannel] [V-Net] [3D U-Net] [M²FCN] [SA] [QSA+QNT] [3D U-Net+ResNet]
Instance Segmentation
[SDS] [Hypercolumn] [DeepMask] [SharpMask] [MultiPathNet] [MNC] [InstanceFCN] [FCIS]
Super Resolution
[SRCNN] [FSRCNN] [VDSR] [ESPCN] [RED-Net] [DRCN] [DRRN] [LapSRN & MS-LapSRN] [SRDenseNet]
Human Pose Estimation
[DeepPose] [Tompson NIPS’14] [Tompson CVPR’15] [CPM]