Review: IGCNet / IGCV1 — Interleaved Group Convolutions (Image Classification)

IGCNet / IGCV1

In this story, Interleaved Group Convolutional Neural Networks (IGCNet / IGCV1), by Microsoft Research and University of Central Florida, is reviewed. With the novel Interleaved Group Convolutions, IGCV1 outperforms state-of-the-art approaches such as ResNet with fewer parameters and fewer FLOPs. This is a 2017 ICCV paper with more than 50 citations. Later on, the authors also published IGCV2 and IGCV3. (Sik-Ho Tsang @ Medium)

Outline

  1. Interleaved Group Convolution (IGC) Block
  2. Connections to Other Convolutions
  3. Evaluations

1. Interleaved Group Convolution (IGC) Block

Interleaved Group Convolutions
  • As shown above, the block is split into a primary group convolution and a secondary group convolution.
  • A permutation is applied before and after the secondary group convolution.

1.1. Primary Group Convolutions

  • Let L be the number of partitions. The input feature maps are divided into L groups as shown in the figure above.
  • Standard spatial convolutions, such as 3×3, are applied to each group independently.
  • Therefore, a group convolution can be viewed as a regular convolution with a sparse block-diagonal convolution kernel, where each block corresponds to a partition of channels and there are no connections across the partitions (see the sketch below).
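
As a concrete illustration (a minimal sketch, not the authors' code), a primary group convolution with L partitions can be expressed as a standard grouped convolution; the values of L, M and the input size below are arbitrary example choices:

```python
import torch
import torch.nn as nn

# Primary group convolution: L independent 3x3 convolutions,
# one per partition of the channels (no connections across partitions).
L, M = 4, 8            # example values: L partitions, M channels per partition
C = L * M              # total number of channels
primary = nn.Conv2d(C, C, kernel_size=3, padding=1, groups=L, bias=False)

x = torch.randn(1, C, 32, 32)
y = primary(x)         # each group of M channels is convolved independently
print(y.shape)         # torch.Size([1, 32, 32, 32])
```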

1.2. Secondary Group Convolutions

  • Then, a permutation is performed on the output of the primary group convolutions, as in the equation above where P is the permutation matrix, such that the m-th secondary partition is composed of the m-th output channel from each primary partition.
  • Next, the secondary group convolution is performed over the M secondary partitions. Here, a point-wise 1×1 convolution is applied to each secondary partition.
  • After this convolution, the channels are permuted back to the original order.
  • In summary, an interleaved group convolution block is formulated as x' = Pᵀ W² P W¹ x, where W¹ is the block-diagonal kernel of the primary group convolution and W² is the block-diagonal kernel of the secondary 1×1 group convolution.
  • It can be treated as a single convolution x' = W x with W = Pᵀ W² P W¹.
  • i.e. an IGC block is actually equivalent to a regular convolution with the convolution kernel being the product of two sparse (block-diagonal) kernels and two permutation matrices (a code sketch follows below).
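
Putting the pieces together, here is a minimal PyTorch sketch of an IGC block (an illustrative re-implementation, not the authors' released code); the permutation is realised as a channel shuffle so that the m-th secondary partition gathers the m-th channel of every primary partition:

```python
import torch
import torch.nn as nn

class IGCBlock(nn.Module):
    """Interleaved group convolution block (illustrative sketch).

    L: number of primary partitions (spatial 3x3 group convolution)
    M: channels per primary partition
       (= number of secondary partitions for the 1x1 group convolution)
    """
    def __init__(self, L=4, M=8):
        super().__init__()
        self.L, self.M = L, M
        C = L * M
        # primary: block-diagonal 3x3 kernel, one block per primary partition
        self.primary = nn.Conv2d(C, C, 3, padding=1, groups=L, bias=False)
        # secondary: block-diagonal 1x1 kernel over the M secondary partitions
        self.secondary = nn.Conv2d(C, C, 1, groups=M, bias=False)

    def interleave(self, x, groups):
        # channel shuffle: (groups, channels_per_group) -> (channels_per_group, groups)
        n, c, h, w = x.shape
        x = x.view(n, groups, c // groups, h, w)
        return x.transpose(1, 2).contiguous().view(n, c, h, w)

    def forward(self, x):
        x = self.primary(x)
        x = self.interleave(x, self.L)   # permutation P: build secondary partitions
        x = self.secondary(x)
        x = self.interleave(x, self.M)   # permute back (P transpose)
        return x

y = IGCBlock(L=4, M=8)(torch.randn(1, 32, 56, 56))
print(y.shape)  # torch.Size([1, 32, 56, 56])
```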

2. Connections to Other Convolutions

2.1. Connection to Regular Convolution

(a) Regular Convolution, (b) Four-branch Representation of the Regular Convolution
  • The authors show that the 4-branch IGC structure on the right can be made equivalent to the regular convolution on the left under suitable choices of the kernels (more details in the paper).

2.2. Connection to Summation Fusion like ResNeXt

  • A ResNeXt-like network with summation fusion can also be written in the form of IGC blocks (the exact correspondence is derived in the paper).

2.3. Connection to Xception-like Network

  • An Xception-like block is the extreme case where each primary partition contains a single channel (a channel-wise 3×3 convolution) and there is only M = 1 secondary partition, i.e. a dense 1×1 convolution over all channels (see the sketch below).
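
For illustration, this extreme case reduces to the familiar depthwise-separable pattern; a minimal sketch with an assumed channel count of 32:

```python
import torch
import torch.nn as nn

C = 32  # example channel count
# Xception-like block as the extreme IGC case:
#   primary: C partitions of one channel each (depthwise 3x3 convolution)
#   secondary: a single secondary partition, i.e. a dense 1x1 convolution
depthwise = nn.Conv2d(C, C, kernel_size=3, padding=1, groups=C, bias=False)
pointwise = nn.Conv2d(C, C, kernel_size=1, bias=False)

y = pointwise(depthwise(torch.randn(1, C, 56, 56)))
print(y.shape)  # torch.Size([1, 32, 56, 56])
```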

3. Evaluations

3.1. Comparison with SumFusion and Regular Convolution

Different Architectures
  • SumFusion: ResNeXt-like network.
  • RegConv: Network using standard convolution.
  • IGC-L?M?: Proposed IGCNet / IGCV1 with different L and M.
Number of Parameters (Left) and Number of FLOPs (Right)
  • It is found that IGCNet / IGCV1 has fewer parameters as well as fewer FLOPs.
Classification Accuracy on CIFAR-10 and CIFAR-100
  • IGC-L24M2, containing far fewer parameters, performs better than both RegConv-W16 and RegConv-W18.
  • The IGC blocks increase the width, and the parameters are exploited more efficiently (see the parameter-counting sketch below).
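
The parameter saving can be checked with simple counting (weights only, one block, example numbers rather than the paper's exact configurations): a regular 3×3 convolution over C = L·M channels needs 9·(LM)² weights, while an IGC block needs 9·L·M² for the primary part plus M·L² for the secondary part:

```python
# Weight counts for one block over C = L*M channels (biases ignored).
def igc_params(L, M, k=3):
    primary = L * (k * k * M * M)   # L block-diagonal k x k blocks of size M x M
    secondary = M * (L * L)         # M block-diagonal 1x1 blocks of size L x L
    return primary + secondary

def regular_params(C, k=3):
    return k * k * C * C            # dense k x k kernel over all C channels

# Example: L=24, M=2 (C=48); the IGC block is roughly 10x cheaper than a
# regular 3x3 convolution of the same width, so the network can be made wider.
print(igc_params(24, 2))    # 2016
print(regular_params(48))   # 20736
```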

3.2. Effect of Partition Number

Accuracy Using Different Sets of (L, M) on CIFAR-100
  • The performance with M = 2 secondary partitions is better than the Xception-like network (M = 1).
  • IGC with L = 40 and M = 2 gets 63.89% accuracy, about 0.8% better than IGC with L = 64 and M = 1, which gets 63.07% accuracy.

3.3. Combination with Identity Mapping Structure

Classification Accuracy on CIFAR-10 and CIFAR-100
  • The identity mapping structure is used in Pre-Activation ResNet.
  • IGC-L24M2 can be combined with the identity mapping structure and obtains the highest accuracy.

3.4. ImageNet Classification Compared with ResNet

  • IGC-L4M32+Ident. performs better than ResNet (C = 64) with slightly fewer parameters.
  • IGC-L16M16+Ident. performs better than ResNet (C = 69), which has approximately the same number of parameters and computation complexity.
  • The gains are not from regularization but from a richer representation.

3.5. Comparison with State-of-the-art Approaches

Classification Error on ImageNet

Hope I can cover IGCV2 and IGCV3 in the future.
