Review — RegNet: Designing Network Design Spaces

RegNet: Simple, Regular Networks Designed by Analyzing the Network Design Space

  • The authors design network design spaces that parametrize populations of networks. By exploring the structural aspect of network design at the design space level, they arrive at a low-dimensional design space consisting of simple, regular networks, called RegNet.

Outline

  1. RegNet Concept
  2. The AnyNet Design Space
  3. The RegNet Design Space
  4. RegNetX and RegNetY
  5. Experimental Results

1. RegNet Concept

Design Space Design

1.1. Motivations

  • Manual design of convolutional blocks may obtain sub-optimal performance.
  • Neural architecture search (NAS) requires a lot of computation to search for an optimal block.

1.2. Concept

  • Authors propose to design network design spaces, where a design space is a parametrized set of possible model architectures, elevated to the population level.
  • For example, in the figure above, we start with an initial design space A and apply two refinement steps to yield design spaces B and then C. In this case C ⊆ B ⊆ A (left), and the error distributions strictly improve from A to B to C (right).

1.3. Conceptual Procedures

  • Starting with a relatively unconstrained design space called AnyNet (e.g., widths and depths vary freely across stages), a human-in-the-loop methodology is applied to arrive at a low-dimensional design space consisting of simple “regular” networks, called RegNet.
  • The core of the RegNet design space is simple: stage widths and depths are determined by a quantized linear function.

1.4. Tools for Design Space Design

  • For efficiency, a low-compute, low-epoch training regime is used. In particular, in this section the 400 million flop (400MF) regime is used, and each sampled model is trained for 10 epochs on ImageNet.
  • Each training run is fast: training 100 models at 400MF for 10 epochs is roughly equivalent in flops to training a single ResNet-50 model at 4GF for 100 epochs (100 × 0.4GF × 10 = 4GF × 100).
  • The quality of a design space is analyzed through its error empirical distribution function (EDF). The error EDF of n models with errors ei is given by:
    $F(e) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}[e_i < e]$
  • where F(e) gives the fraction of models with error less than e.
Statistics of the AnyNetX design space computed with n=500 sampled models
  • Left: shows the error EDF for n=500 sampled models from the AnyNetX design space.
  • Middle & Right: Various network properties versus network error for two examples taken from the AnyNetX design space.
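The error EDF described above is simple to compute from a list of model errors. The following is a minimal sketch in plain Python; the function name `error_edf` and the sample error values are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left


def error_edf(errors):
    """Return a function F where F(e) is the fraction of models with error < e."""
    sorted_errors = sorted(errors)
    n = len(sorted_errors)
    if n == 0:
        raise ValueError("need at least one model error")

    def F(e):
        # bisect_left returns the number of entries strictly less than e.
        return bisect_left(sorted_errors, e) / n

    return F


# Hypothetical top-1 errors (%) of five sampled models (made-up values).
sample_errors = [48.0, 52.5, 45.1, 60.3, 49.9]
F = error_edf(sample_errors)
fraction_below_50 = F(50.0)  # three of the five errors are below 50
```

A design space whose EDF curve lies further to the left (a larger fraction of models below any given error) is considered the better design space.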

2. The AnyNet Design Space

  • Given an input image, a network consists of a simple stem, followed by the network body that performs the bulk of the computation, and a final network head that predicts the output classes.
  • The network body consists of 4 stages. Each stage i consists of a sequence of identical blocks and is described by its number of blocks di, block width wi, and any other block parameters.
The X block is based on the standard residual bottleneck block with group convolution
  • X block: the standard residual bottleneck block with group convolution. The AnyNet design space built on this block is called AnyNetX.
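To make the X block concrete, the sketch below lists the three convolutions of a bottleneck block with group convolution (1×1, 3×3 group, 1×1) and counts their weights. It assumes, as in the paper, that the block is parametrized by a width w, a bottleneck ratio b and a group width g. The function name `x_block_convs` and the dictionary layout are hypothetical, and BatchNorm, ReLU and the residual shortcut are left out.

```python
def x_block_convs(w_in, w_out, b, g):
    """Describe the convolutions of an X block (illustrative sketch only).

    w_in, w_out: input and output widths (channels) of the block
    b: bottleneck ratio, so the inner width is w_out / b
    g: group width, i.e. channels per group in the 3x3 convolution
    """
    if w_out % b != 0:
        raise ValueError("w_out must be divisible by the bottleneck ratio b")
    w_b = w_out // b  # inner (bottleneck) width
    if w_b % g != 0:
        raise ValueError("inner width must be divisible by the group width g")
    groups = w_b // g

    convs = [
        {"name": "1x1", "in": w_in, "out": w_b, "kernel": 1, "groups": 1},
        {"name": "3x3 group", "in": w_b, "out": w_b, "kernel": 3, "groups": groups},
        {"name": "1x1", "in": w_b, "out": w_out, "kernel": 1, "groups": 1},
    ]
    for conv in convs:
        # Each output channel only sees in/groups input channels.
        conv["weights"] = conv["kernel"] ** 2 * (conv["in"] // conv["groups"]) * conv["out"]
    return convs


# Example: width 64, no bottleneck (b=1), group width 8 -> 8 groups.
example_block = x_block_convs(w_in=64, w_out=64, b=1, g=8)
```

With a fixed inner width, a smaller group width g means more groups and therefore fewer weights in the 3×3 convolution.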
AnyNetXA vs AnyNetXB (left) and AnyNetXB vs AnyNetXC (middle), error of shared bottleneck ratio bi=b
  • AnyNetXA: The initial, unconstrained AnyNetX design space.
  • AnyNetXB: Starting with AnyNetXA, a shared bottleneck ratio bi=b is used for all stages i to obtain AnyNetXB.
  • AnyNetXC: Starting with AnyNetXB, a shared group width gi=g is used for all stages to obtain AnyNetXC.
AnyNetXD (left) and AnyNetXE (right)
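  • AnyNetXD: A pattern emerges: good networks have increasing widths. AnyNetXD therefore adds the constraint that stage widths do not decrease from one stage to the next.
  • AnyNetXE: The stage depths di likewise tend to increase for the best models, so AnyNetXE adds the analogous constraint on stage depths.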
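The successive refinements from AnyNetXA to AnyNetXE can be viewed as cumulative constraints on the per-stage parameters (di, wi, bi, gi). The sketch below is a rough illustration of that idea. The sampling ranges are illustrative assumptions rather than the paper's exact settings, "increasing" is modeled as non-decreasing, compatibility between widths and group widths is ignored, and all function names are hypothetical.

```python
import random

NUM_STAGES = 4


def sample_anynetx_a(rng):
    """Sample one unconstrained configuration (ranges are assumptions)."""
    return {
        "d": [rng.randint(1, 16) for _ in range(NUM_STAGES)],      # stage depths
        "w": [8 * rng.randint(1, 128) for _ in range(NUM_STAGES)],  # stage widths
        "b": [rng.choice([1, 2, 4]) for _ in range(NUM_STAGES)],    # bottleneck ratios
        "g": [rng.choice([1, 2, 4, 8, 16, 32]) for _ in range(NUM_STAGES)],  # group widths
    }


def is_shared(values):
    return len(set(values)) == 1


def is_non_decreasing(values):
    return all(a <= b for a, b in zip(values, values[1:]))


# Each refinement step adds one constraint on top of the previous ones.
CONSTRAINTS = {
    "B": lambda c: is_shared(c["b"]),          # shared bottleneck ratio
    "C": lambda c: is_shared(c["g"]),          # shared group width
    "D": lambda c: is_non_decreasing(c["w"]),  # widths do not decrease
    "E": lambda c: is_non_decreasing(c["d"]),  # depths do not decrease
}


def design_space(config):
    """Return the most refined space (A to E) that contains the configuration."""
    label = "A"
    for name in "BCDE":
        if not CONSTRAINTS[name](config):
            break
        label = name
    return label


# Count how many random AnyNetXA samples also fall into each refined space.
rng = random.Random(0)
labels = [design_space(sample_anynetx_a(rng)) for _ in range(1000)]
counts = {name: labels.count(name) for name in "ABCDE"}
```

Rejection sampling like this only shows how quickly the space shrinks. In practice one would sample directly from the constrained space rather than filter samples from AnyNetXA.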

3. The RegNet Design Space

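  • A linear parameterization is introduced for block widths, so that a different block width uj is generated for each block j, with initial width w0 > 0 and slope wa > 0:
    $u_j = w_0 + w_a \cdot j, \quad 0 \le j < d$
  • To quantize uj, an additional parameter wm > 0 is introduced to control the quantization. First, sj is computed for each block j such that:
    $u_j = w_0 \cdot w_m^{s_j}$
  • Then sj is rounded (denoted ⌊sj⌉) to obtain the quantized per-block widths wj:
    $w_j = w_0 \cdot w_m^{\lfloor s_j \rceil}$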
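The quantized linear parameterization can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `regnet_widths` and `to_stages` are hypothetical, the example parameter values are made up, and any further snapping of widths to hardware-friendly multiples that a practical implementation might apply is omitted.

```python
import math


def regnet_widths(w0, wa, wm, d):
    """Per-block widths from u_j = w0 + wa*j, quantized to w0 * wm**round(s_j)."""
    widths = []
    for j in range(d):
        u_j = w0 + wa * j                         # linear width for block j
        s_j = math.log(u_j / w0) / math.log(wm)   # solves u_j = w0 * wm**s_j
        w_j = w0 * wm ** round(s_j)               # quantized width
        widths.append(w_j)
    return widths


def to_stages(widths):
    """Group consecutive blocks of equal width into (width, depth) stages."""
    stages = []
    for w in widths:
        if stages and stages[-1][0] == w:
            stages[-1] = (w, stages[-1][1] + 1)
        else:
            stages.append((w, 1))
    return stages


# Made-up example: w0 = wa = 24, wm = 2, depth d = 8.
widths = regnet_widths(w0=24, wa=24, wm=2, d=8)
stages = to_stages(widths)
# widths -> [24, 48, 96, 96, 96, 192, 192, 192]
# stages -> [(24, 1), (48, 1), (96, 3), (192, 3)]
```

Because the quantized widths take only a few distinct values, consecutive blocks with equal width naturally form the stages, so the stage widths and depths follow directly from (w0, wa, wm, d).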
RegNetX design space
  • Left: Models in RegNetX have better average error than AnyNetX while maintaining the best models.
  • Middle: Using wm=2 (doubling the width between stages) slightly improves the EDF. Setting w0=wa, which simplifies the linear parameterization to uj = wa·(j+1), performs even better.
Design space summary

4. RegNetX and RegNetY

Top REGNETX models
  • Finally, RegNetX variants are formed as above.
Top REGNETY models (Y=X+SE)
  • Adding the Squeeze-and-Excitation (SE) operation, which originated in SENet, to the X block gives the RegNetY variants shown above.
  • The authors also tried many other settings, such as the inverted bottleneck, but these did not perform well. Please read the paper directly for more details.

5. Experimental Results

Mobile regime
  • Much of the recent work on network design has focused on the mobile regime (~600MF).
ResNet & ResNeXt comparisons
EfficientNet comparisons using our standard training schedule

PhD, Researcher. I share what I learn. :) Reads: https://bit.ly/33TDhxG, LinkedIn: https://www.linkedin.com/in/sh-tsang/, Twitter: https://twitter.com/SHTsang3