Reading: OISR — ODE-Inspired Schemes to Super-Resolution Network Designs (Super Resolution)

Outperforms MSRN, RDN, EDSR & MDSR, CARN, SelNet, MemNet, LapSRN, DRRN, DRCN, VDSR, FSRCNN

Sik-Ho Tsang
6 min read · Jul 23, 2020

In this story, ODE-inspired Network Design for Single Image Super-Resolution (OISR), by Chinese Academy of Sciences, University of Chinese Academy of Sciences, CAS, Alibaba Group, is presented. In this paper:

  • An ordinary differential equation (ODE)-inspired design scheme is adopted for single image super-resolution (SISR). Such schemes have already brought a new understanding of ResNet in classification problems.
  • Two types of network structures are derived: LF-block and RK-block, which correspond to the Leapfrog method and Runge-Kutta method in numerical ordinary differential equations.

This is a 2019 CVPR paper with over 20 citations. (Sik-Ho Tsang @ Medium)

Outline

  1. ODE-Inspired Network Design
  2. Derivation of OISR-Blocks
  3. Overall Network Architecture and Block Design
  4. Experimental Results

1. ODE-Inspired Network Design

  • From a dynamical system perspective, the network defines a map Φ that takes the input status forward x units of time in the phase space.
  • In CNN semantics, the time horizon x corresponds to the layers, which can be adaptively chosen, while the final status is restricted by the labels.
  • Consider a dynamical system that can be described as an ODE:
  • This system gives a map:
  • with initial status y0 ∈ R^d. Let p(y0) be the distribution of the input feature y0 on a domain. If CNN-based SISR is regarded as such a dynamical system, then the goal is to minimize:
  • where Φ is the map to be learned in SISR.
  • As the system is non-linear, there is no simple formula describing the map, so numerical methods are used, such as the forward Euler method:
  • which provides an approximation of the map. It can be seen as a numerical ODE scheme that approximates the integral of y′ over an interval of width h.
  • A residual block takes a similar form:
  • This similarity suggests a relationship between the two, and the bridge is established by defining:
  • Thus, the forward Euler method maps to a residual block.
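The correspondence in this section can be checked numerically. The sketch below assumes the textbook forms yn+1 = yn + h·f(yn) for forward Euler and yn+1 = yn + G(yn) for a residual block, bridged by G(y) = h·f(y). The toy ODE y′ = −y and all function names are illustrative choices, not taken from the paper.

```python
import math

def f(y):
    """Right-hand side of the toy ODE y' = -y (an illustrative assumption)."""
    return -y

def forward_euler(y0, h, steps):
    """Integrate y' = f(y) with the forward Euler update y <- y + h * f(y)."""
    y = y0
    for _ in range(steps):
        y = y + h * f(y)
    return y

def residual_stack(y0, G, blocks):
    """Apply `blocks` residual blocks of the form y <- y + G(y)."""
    y = y0
    for _ in range(blocks):
        y = y + G(y)
    return y

h = 0.1
G = lambda y: h * f(y)  # the bridge: G absorbs the step size h

euler_result = forward_euler(1.0, h, 10)
resnet_result = residual_stack(1.0, G, 10)
exact = math.exp(-1.0)  # exact solution of y' = -y at x = 1

print(euler_result, resnet_result, exact)
```

With this choice of G, ten residual blocks and ten Euler steps perform the same arithmetic, so both results coincide and both approximate the exact value only to first order. In a real network G is learned rather than fixed.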

2. Derivation of OISR-Blocks

OISR-Block Variants
  • To learn such a map Φ, it may take many steps to reach the final status; each step corresponds to a CNN block.
  • (The CNN blocks shown above are derived below. The G block is composed of convolutions and activation functions, and is defined in the next section, after the block variants are derived.)
  • Either increasing the number of steps or refining the motion of each step helps to achieve the goal; these correspond to increasing the number of blocks and designing finer blocks, respectively.
  • Higher-order numerical methods, which refine each step, are therefore expected to help.

2.1. LF-Block

  • The Leapfrog method is a second-order linear 2-step method and a refinement of the forward Euler scheme.
  • By doubling the time interval h, the approximation of y′ can be rewritten in the form y′ ≈ (yn+1−yn−1)/2h. Thus, yn+1 = yn−1 + 2h·y′n, where y′n is the derivative evaluated at step n.
  • To retain flexibility and obtain a block architecture, every three consecutive updates of this form are grouped into one block:
  • As mentioned:
  • where G is some kind of convolutional block.
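The grouping of three leapfrog updates into one block can be sketched as follows. This is a minimal illustration, assuming the update yn+1 = yn−1 + G(yn) with the factor 2h absorbed into G. The way two states are passed between blocks, the Euler start-up step, and the toy ODE y′ = y are assumptions made for the example, not details confirmed by the paper.

```python
import math

def lf_block(y_prev, y_curr, G):
    """One LF-block: three chained leapfrog updates y_{n+1} = y_{n-1} + G(y_n)."""
    y1 = y_prev + G(y_curr)
    y2 = y_curr + G(y1)
    y3 = y1 + G(y2)
    return y2, y3  # the two most recent states feed the next block

h = 0.1
f = lambda y: y              # toy ODE y' = y, exact solution exp(x)
G = lambda y: 2 * h * f(y)   # assumption: G absorbs the factor 2h

# Leapfrog is a 2-step method, so it needs two starting values;
# here the second one comes from a single forward Euler step.
y_prev, y_curr = 1.0, 1.0 + h * f(1.0)
for _ in range(3):           # 3 blocks x 3 updates: from (y0, y1) to (y9, y10)
    y_prev, y_curr = lf_block(y_prev, y_curr, G)

# Forward Euler over the same 10 steps, for comparison.
euler = 1.0
for _ in range(10):
    euler = euler + h * f(euler)

lf_error = abs(y_curr - math.exp(1.0))
euler_error = abs(euler - math.exp(1.0))
print(lf_error, euler_error)
```

On this toy problem the leapfrog result at x = 1 is markedly closer to e than the forward Euler result with the same step size, which is the kind of refinement the LF-block is meant to carry over to the network.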

2.2. RK2-Block

  • Consider the Runge-Kutta family, which is widely used in numerical analysis. Making use of the trapezoidal formula:
  • We obtain a block structure:
  • In mathematics, these formulas are referred to as Heun’s method, which is also a two-stage second-order Runge-Kutta method.
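Heun's method in its textbook form first takes an Euler predictor step and then averages the slopes at both ends of the interval. The sketch below writes that as a block in terms of G, assuming G(y) = h·f(y). The name rk2_block and the toy ODE y′ = y are illustrative, and the exact wiring used in the paper may differ.

```python
import math

def rk2_block(y, G):
    """Two-stage, second-order Runge-Kutta (Heun) update written with G = h * f."""
    k1 = G(y)        # slope estimate at the current state
    k2 = G(y + k1)   # slope estimate at the Euler-predicted state
    return y + 0.5 * (k1 + k2)

h = 0.1
G = lambda y: h * y  # toy ODE y' = y with the step size folded into G

y_rk2 = rk2_block(1.0, G)
y_euler = 1.0 + G(1.0)   # plain residual block / forward Euler step
exact = math.exp(h)

print(abs(y_rk2 - exact), abs(y_euler - exact))
```

Note that the same G is evaluated twice inside one block, so an RK2-style block reuses its parameters across both stages.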

2.3. RK3-Block

  • Higher-order methods should obtain a smaller local truncation error.
  • Explicit iterative Runge-Kutta methods can be extended to arbitrary n stages:
  • In particular, the 3-stage, third-order Runge-Kutta method is:
  • Generally, higher-order methods tend to generate more complicated blocks.
  • (Please note that I am not an expert in ODEs. Please feel free to read the paper if interested.)
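The claim about local truncation error can be illustrated with a single step of each scheme. The three-stage update below uses Kutta's classical third-order coefficients, which is one standard choice; whether it matches the paper's RK3-block term by term is an assumption. G(y) = h·f(y) and the toy ODE y′ = y are likewise illustrative.

```python
import math

def euler_block(y, G):
    """First-order update (plain residual block)."""
    return y + G(y)

def rk2_block(y, G):
    """Second-order Heun update."""
    k1 = G(y)
    k2 = G(y + k1)
    return y + 0.5 * (k1 + k2)

def rk3_block(y, G):
    """Third-order update with Kutta's classical coefficients."""
    k1 = G(y)
    k2 = G(y + 0.5 * k1)
    k3 = G(y - k1 + 2.0 * k2)
    return y + (k1 + 4.0 * k2 + k3) / 6.0

h = 0.1
G = lambda y: h * y   # toy ODE y' = y with the step size folded into G
exact = math.exp(h)   # exact solution after one step from y0 = 1

err1 = abs(euler_block(1.0, G) - exact)
err2 = abs(rk2_block(1.0, G) - exact)
err3 = abs(rk3_block(1.0, G) - exact)
print(err1, err2, err3)
```

Each extra stage reduces the one-step error on this problem, at the price of one more evaluation of G and a more intricate block.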

3. Overall Network Architecture and Block Design

3.1. Network Architecture

Network Architecture
  • The dimension of input and output feature maps of G is kept unchanged.
  • The OISR blocks are placed in the middle of the network.
  • Residual learning is used.
  • Pixel shuffle, as in ESPCN, is used at the end to produce the super-resolved (SR) image.
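The pixel shuffle step at the end of the network rearranges channels into spatial positions. The sketch below works on plain nested lists and assumes the channel ordering used by common framework implementations, where output pixel (h·r + i, w·r + j) of channel c comes from input channel c·r² + i·r + j. The function name and the tiny input are made up for illustration.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r)."""
    channels_in, height, width = len(x), len(x[0]), len(x[0][0])
    assert channels_in % (r * r) == 0, "channel count must be divisible by r*r"
    channels_out = channels_in // (r * r)
    out = [[[0.0] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]
    for c in range(channels_out):
        for i in range(r):          # row offset inside each r x r output cell
            for j in range(r):      # column offset inside each r x r output cell
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out

# Four 1x1 feature maps become one 2x2 map for upscaling factor r = 2.
low_res = [[[0.0]], [[1.0]], [[2.0]], [[3.0]]]
high_res = pixel_shuffle(low_res, 2)
print(high_res)  # [[[0.0, 1.0], [2.0, 3.0]]]
```

Because the upscaling happens only at this last step, all OISR blocks before it operate at low resolution.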

3.2. Block Design (G)

Block Design (G)
  • Different types of G are tried, since the search space for G is large.
  • Only three different forms are chosen here, to illustrate the general effectiveness of ODE-inspired schemes.
  • Each of these designs keeps at least one activation function and one convolutional layer, thus guaranteeing nonlinearity.
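As a toy illustration of one such design, the sketch below builds a "PReLU+Conv" style G on a 1-D signal: an activation followed by a zero-padded convolution that keeps the input length unchanged. The 1-D setting, the kernel values, and the fixed slope are assumptions made for brevity; in a real network the convolution is 2-D and both the kernel and the PReLU slope are learned.

```python
def prelu(x, slope=0.25):
    """PReLU activation; the slope is learnable in practice, fixed here."""
    return [v if v >= 0 else slope * v for v in x]

def conv1d_same(x, kernel):
    """Zero-padded 1-D convolution (cross-correlation) that preserves length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]

def g_v2(x, kernel, slope=0.25):
    """Toy 'PReLU+Conv' block: one activation followed by one convolution."""
    return conv1d_same(prelu(x, slope), kernel)

features = [-4.0, 2.0, 0.0]
kernel = [1.0, 1.0, 1.0]   # hypothetical weights
out = g_v2(features, kernel)
print(out)  # [1.0, 1.0, 2.0]
```

Keeping the output the same size as the input is what lets G be added back onto the state inside the LF and RK blocks.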

4. Experimental Results

4.1. Ablation Study

  • The first 800 images in DIV2K are used for training, with L1 loss.
  • The ablation study is performed on the DIV2K validation set.
  • “PReLU+Conv”, namely G-v2, is suitable for LF-blocks.
  • RK2-blocks should be equipped with G-v3.
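For reference, the L1 loss used in training is the mean absolute difference between the predicted and ground-truth pixels. A minimal sketch on flat lists of pixel values follows; the function name and the sample numbers are made up.

```python
def l1_loss(prediction, target):
    """Mean absolute error between two equally sized lists of pixel values."""
    assert len(prediction) == len(target), "inputs must have the same length"
    return sum(abs(p - t) for p, t in zip(prediction, target)) / len(prediction)

loss = l1_loss([1.0, 2.0, 3.0], [1.0, 4.0, 0.0])
print(loss)  # (0 + 2 + 3) / 3
```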

4.2. SOTA Comparison

Quantitative Comparisons on Benchmark Datasets
  • The small-scale network designs are suffixed by “-s”.
  • The small-scale models, OISR-RK2-s and OISR-LF-s, outperform other methods such as FSRCNN, DRRN, MemNet, SelNet and CARN on different upscaling factors and datasets, except for being slightly behind on Urban100 with upscaling factor ×2.
  • In addition, the middle-scale models, OISR-RK2 and OISR-LF, surpass MSRN with only two exceptions on B100 and Urban100 SSIM when the upscaling factor is 2.
Quantitative Comparisons on Benchmark Datasets
  • Among current state-of-the-art deep residual methods, OISR-RK3 achieves the best performance in most cases, outperforming LapSRN, VDSR, DRCN, MDSR, RDN, and EDSR.
Qualitative comparisons (×2 super resolution (top) and ×4 super resolution (bottom))
  • OISRs can reconstruct more detailed images with less blurring.

This is the 20th story this month.
