Review — RegNet: Designing Network Design Spaces

RegNet: Simple and Regular Networks Designed by Analyzing the Network Design Space

Designing Network Design Spaces
RegNet
, by Facebook AI Research (FAIR)
2020 CVPR, Over 500 Citations (Sik-Ho Tsang @ Medium)
Image Classification, Convolutional Neural Network, CNN, Neural Architecture Search, NAS

  • Authors design network design spaces that parametrize populations of networks. By exploring the structural aspects of network design at the design-space level, they arrive at a low-dimensional design space consisting of simple, regular networks, which they call RegNet.

Outline

  1. RegNet Concept
  2. The AnyNet Design Space
  3. The RegNet Design Space
  4. RegNetX and RegNetY
  5. Experimental Results

1. RegNet Concept

Design Space Design

1.1. Motivations

  • Manual design of convolutional networks can yield sub-optimal performance.
  • NAS requires a lot of computation to search for an optimal block.

In this work, a new network design paradigm is presented that combines the advantages of manual design and NAS.

1.2. Concept

  • Authors propose to design network design spaces, where a design space is a parametrized set of possible model architectures, elevated to the population level.

The quality of a design space is characterized by sampling models and inspecting their error distribution.

  • For example, in the figure above, we start with an initial design space A and apply two refinement steps to yield design spaces B and then C (left).
  • The error distributions strictly improve from A to B to C (right).

The hope is that design principles that apply to model populations are more likely to be robust and generalize.

1.3. Conceptual Procedures

  • Starting with a relatively unconstrained design space called AnyNet (e.g., widths and depths vary freely across stages), a human-in-the-loop methodology is applied to arrive at a low-dimensional design space consisting of simple “regular” networks, called RegNet.
  • The core of the RegNet design space is simple: stage widths and depths are determined by a quantized linear function.

Compared to AnyNet, the RegNet design space has simpler models, is easier to interpret, and has a higher concentration of good models.

1.4. Tools for Design Space Design

To obtain a distribution of models, n models are sampled from a design space and trained.

  • For efficiency, a low-compute, low-epoch training regime is used. In particular, in this section the 400 megaflop (400MF) regime is used, and each sampled model is trained for 10 epochs on ImageNet.
  • Each training run is fast: training 100 models at 400MF for 10 epochs is roughly equivalent in flops to training a single ResNet-50 model at 4GF for 100 epochs.
  • The design space quality is analyzed with the error empirical distribution function (EDF). The error EDF of n models with errors eᵢ is given by: F(e) = (1/n) · Σᵢ 1[eᵢ < e],
  • where F(e) gives the fraction of models with error less than e.
Statistics of the AnyNetX design space computed with n=500 sampled models
  • Left: shows the error EDF for n=500 sampled models from the AnyNetX design space.
  • Middle & Right: Various network properties versus network error for two examples taken from the AnyNetX design space.
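The EDF defined above is easy to compute; here is a minimal numpy sketch (the error values are made up for illustration):

```python
import numpy as np

def error_edf(errors, e):
    """Error EDF: F(e) = (1/n) * sum_i 1[e_i < e], the fraction of
    sampled models whose error is below the threshold e."""
    return float(np.mean(np.asarray(errors) < e))

# Hypothetical top-1 errors (%) of five sampled models.
errors = [45.0, 47.2, 49.5, 50.1, 52.3]
print(error_edf(errors, 50.0))  # 0.6 -> 3 of the 5 models have error below 50%
```

Plotting F(e) over a grid of thresholds e reproduces curves like the ones in the figure above.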

Once insights are obtained, the design space is refined accordingly.
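The training-cost equivalence quoted earlier in this section is simple arithmetic; the number of images per epoch is the same on both sides and cancels:

```python
# 100 models x 400MF per image x 10 epochs, vs. 1 model x 4GF per image x 100 epochs.
# Images per epoch are identical on both sides, so they cancel out.
design_space_cost = 100 * 400e6 * 10   # flops per dataset image across all runs
resnet50_cost = 1 * 4e9 * 100          # flops per dataset image for one long run

print(design_space_cost == resnet50_cost)  # True: the two regimes cost the same
```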

2. The AnyNet Design Space

  • Given an input image, a network consists of a simple stem, followed by the network body that performs the bulk of the computation, and a final network head that predicts the output classes.
  • The network body consists of 4 stages. Each stage consists of a sequence of identical blocks, parameterized by the number of blocks dᵢ, the block width wᵢ, and any other block parameters.
The X block is based on the standard residual bottleneck block with group convolution
  • X block: the standard residual bottleneck block with group convolution. The AnyNet design space built on it is called AnyNetX.
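To make the X block concrete, here is a rough parameter count of its three convolutions (a sketch that ignores BatchNorm parameters, biases, and any projection shortcut; the example width, bottleneck ratio, and group width are made up):

```python
def x_block_params(w, b, g):
    """Approximate weight count of the X block (1x1 reduce, 3x3 group conv,
    1x1 expand), ignoring BatchNorm and biases. Here g is the group WIDTH
    (channels per group), following the paper's notation."""
    wb = w // b                   # bottleneck width
    assert wb % g == 0, "group width must divide the bottleneck width"
    conv1 = w * wb                # 1x1 conv: w -> wb channels
    conv2 = 9 * wb * g            # 3x3 group conv: each output sees only g inputs
    conv3 = wb * w                # 1x1 conv: wb -> w channels
    return conv1 + conv2 + conv3

# Example: a smaller group width makes the 3x3 conv much cheaper.
print(x_block_params(w=64, b=1, g=16), x_block_params(w=64, b=1, g=64))
```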
AnyNetXA vs AnyNetXB (left), AnyNetXB vs AnyNetXC (middle), and error vs. the shared bottleneck ratio bᵢ = b (right)
  • AnyNetXA: The initial, unconstrained AnyNetX design space.
  • AnyNetXB: A shared bottleneck ratio bᵢ = b is first tested for all stages i of the AnyNetXA design space.
  • AnyNetXC: Starting with AnyNetXB, a shared group width gᵢ = g is used for all stages to obtain AnyNetXC.

The EDFs are nearly unchanged. Overall, AnyNetXC has 6 fewer degrees of freedom than AnyNetXA and reduces the design space size by nearly four orders of magnitude.

AnyNetXD (left) and AnyNetXE (right)
  • AnyNetXD: A pattern emerges: good networks have increasing widths. Constraining the widths to be non-decreasing (wᵢ₊₁ ≥ wᵢ) gives AnyNetXD.
  • AnyNetXE: The stage depths dᵢ likewise tend to increase for the best models; the corresponding constraint (dᵢ₊₁ ≥ dᵢ) gives AnyNetXE.

The constraints on wi and di each reduce the design space by 4!, with a cumulative reduction of O(10⁷) from AnyNetXA.
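The 4! factor has a simple combinatorial reading: of all orderings of four distinct per-stage widths, only the non-decreasing one survives the constraint, i.e., a 1/4! = 1/24 fraction. A quick check (the width values are illustrative):

```python
from itertools import permutations

widths = (48, 96, 192, 368)  # four distinct per-stage widths (made-up values)
orderings = list(permutations(widths))
increasing = [p for p in orderings if all(a <= b for a, b in zip(p, p[1:]))]

print(len(increasing), "of", len(orderings))  # 1 of 24 orderings is non-decreasing
```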

3. The RegNet Design Space

  • A linear parameterization is introduced for the block widths, so that a different width uⱼ is generated for each block j < d: uⱼ = w₀ + wₐ · j.
  • To quantize the uⱼ, an additional parameter wₘ > 0 is introduced that controls quantization: uⱼ = w₀ · wₘ^(sⱼ), which defines sⱼ for each block.
  • Finally, each sⱼ is rounded, and the quantized per-block widths are computed as wⱼ = w₀ · wₘ^⌊sⱼ⌉.
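Putting the three equations together, per-block widths can be generated as below (a sketch of the paper's parameterization; the snap-to-multiples-of-8 step follows common practice, and the parameter values in the example are made up, not a published RegNet configuration):

```python
import numpy as np

def regnet_widths(w0, wa, wm, d, q=8):
    """Quantized linear widths: u_j = w0 + wa*j, s_j = log(u_j/w0)/log(wm),
    w_j = w0 * wm**round(s_j), snapped to a multiple of q channels."""
    j = np.arange(d)
    u = w0 + wa * j                             # continuous linear widths u_j
    s = np.round(np.log(u / w0) / np.log(wm))   # quantization exponents s_j
    w = w0 * np.power(wm, s)                    # piecewise-constant widths
    return (np.round(w / q) * q).astype(int)    # snap to multiples of q

ws = regnet_widths(w0=48, wa=36.0, wm=2.5, d=16)
print(ws.tolist())            # non-decreasing and piecewise constant
print(len(set(ws.tolist())))  # number of distinct widths = number of stages
```

Blocks that share a width form a stage, so this one rule fixes both the stage widths and the stage depths.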
RegNetX design space
  • Left: Models in RegNetX have better average error than AnyNetX while maintaining the best models.
  • Middle: Using wₘ = 2 (doubling width between stages) slightly improves the EDF. Setting w₀ = wₐ performs even better.
Design space summary

Random search efficiency is much higher for RegNetX; searching over just 32 random models is likely to yield good models.

4. RegNetX and RegNetY

Top REGNETX models
  • Finally, RegNetX variants are formed as above.
Top REGNETY models (Y=X+SE)
  • With Squeeze-and-Excitation (SE), as introduced in SENet, RegNetY variants are formed as above.
  • Authors also tried many other settings, such as the inverted bottleneck, but they did not help. Please read the paper directly for more details.
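For reference, the SE operation that turns an X block into a Y block can be sketched in a few lines of numpy (random weights stand in for the learned FC layers; the reduction ratio of 4 is an illustrative choice):

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-Excitation: global average pool -> FC + ReLU -> FC + sigmoid,
    then rescale each channel of x by its gate in (0, 1).
    x: (C, H, W); w1: (C_mid, C); w2: (C, C_mid)."""
    z = x.mean(axis=(1, 2))                # squeeze: per-channel statistics, (C,)
    a = np.maximum(w1 @ z, 0.0)            # excitation, stage 1 (ReLU)
    s = 1.0 / (1.0 + np.exp(-(w2 @ a)))    # excitation, stage 2 (sigmoid gates)
    return x * s[:, None, None]            # channel-wise rescaling

rng = np.random.default_rng(0)
C, H, W, r = 16, 8, 8, 4
x = rng.standard_normal((C, H, W))
y = se_block(x, rng.standard_normal((C // r, C)), rng.standard_normal((C, C // r)))
print(y.shape)  # (16, 8, 8): shape preserved, channels rescaled
```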

5. Experimental Results

Mobile regime
  • Much of the recent work on network design has focused on the mobile regime (~600MF).

RegNet outperforms SOTA mobile networks such as MobileNetV1, MobileNetV2, ShuffleNet V1, ShuffleNet V2, PNASNet, and AmoebaNet.

ResNet & ResNeXt comparisons

RegNetX models outperform ResNet & ResNeXt models under fixed flops.

EfficientNet comparisons using our standard training schedule

At low flops, EfficientNet outperforms RegNetY.
At intermediate flops, RegNetY outperforms EfficientNet.
At higher flops, both RegNetX and RegNetY outperform EfficientNet.

RegNet has been used as a comparison baseline by many later papers. I’ve already shortened a lot in this story. For more details, please read the paper directly.
