Reading: OISR — ODE-Inspired Schemes to Super-Resolution Network Designs (Super Resolution)

Outperforms MSRN, RDN, EDSR & MDSR, CARN, SelNet, MemNet, LapSRN, DRRN, DRCN, VDSR, FSRCNN

Jul 23, 2020

In this story, ODE-inspired Network Design for Single Image Super-Resolution (OISR), by Chinese Academy of Sciences, University of Chinese Academy of Sciences, CAS, Alibaba Group, is presented. In this paper:

An ordinary differential equation (ODE)-inspired design scheme, which has brought a new understanding of ResNet in classification problems, is adopted for single image super-resolution.
Two types of network structures are derived: LF-block and RK-block, which correspond to the Leapfrog method and Runge-Kutta method in numerical ordinary differential equations.

This is a paper in 2019 CVPR with over 20 citations. (Sik-Ho Tsang @ Medium)

Outline

ODE-Inspired Network Design
Derivation of OISR-Blocks
Overall Network Architecture and Block Design
Experimental Results

1. ODE-Inspired Network Design

From a dynamical system perspective, a neural network defines a map that takes the input status forward x units of time in the phase space.
In CNN semantics, time horizon x corresponds to layers that can be adaptively chosen, while the final status is restricted by labels.
Consider a dynamical system that can be described by an ODE:

This system gives a map:

with initial status y₀ ∈ ℝᵈ. Suppose p(y₀) is the distribution of the input feature y₀ on a domain. If we regard CNN-based SISR as such a dynamical system, then we aim to minimize:

where Φ is the map to be learned in SISR.
As the system is non-linear, there is no simple formula describing the map, so numerical methods are used, such as the forward Euler method:

which provides the approximation. It can be seen as a numerical ODE scheme that approximates the integral of y′ over an interval of width h.
A residual block takes a similar form:

The above suggests a relationship between the two, and the bridge is established by defining:

Thus, the forward Euler method maps to a residual block.
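Written out in standard textbook notation, the correspondence between forward Euler and a residual block can be sketched as follows. The symbols f (right-hand side of the ODE), h (step size) and x_n (the n-th grid point) are assumed here, and the paper's own notation may differ.

```latex
\begin{align}
\frac{dy}{dx} &= f(x, y) && \text{(ODE)} \\
y_{n+1} &= y_n + h\, f(x_n, y_n) && \text{(forward Euler)} \\
y_{n+1} &= y_n + G(y_n) && \text{(residual block)} \\
G(y_n) &:= h\, f(x_n, y_n) && \text{(bridge between the two)}
\end{align}
```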

2. Derivation of OISR-Blocks

To learn such a map Φ, it may take many steps to reach the final status, and each step corresponds to a CNN block.
(The CNN block variants are derived below. The G block is composed of convolutions and activation functions, and is defined in the next section after the block variants are derived.)
Either increasing the number of steps or refining the motion of each step helps to achieve the goal. These correspond to increasing the number of blocks and designing finer blocks, respectively.
Higher-order methods are expected to bring some benefits.
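The "more steps" option can be illustrated numerically. The sketch below is not from the paper: it assumes a toy scalar ODE y′ = −y (exact solution exp(−x)) and hypothetical helper names, and simply shows that forward Euler gets closer to the exact final status when the same time horizon is split into more steps.

```python
import math

def f(x, y):
    # Toy right-hand side y' = -y (an assumption for illustration).
    return -y

def euler_solve(f, y0, x_end, n_steps):
    """Integrate from x = 0 to x_end with n_steps forward Euler steps."""
    h = x_end / n_steps
    x, y = 0.0, y0
    for _ in range(n_steps):
        y = y + h * f(x, y)  # same shape as a residual update y + G(y)
        x += h
    return y

exact = math.exp(-1.0)
err_coarse = abs(euler_solve(f, 1.0, 1.0, 10) - exact)   # 10 steps ("10 blocks")
err_fine = abs(euler_solve(f, 1.0, 1.0, 100) - exact)    # 100 steps ("100 blocks")
print(err_coarse, err_fine)
```

The second option, refining each step, is what the LF and RK blocks below aim at: a higher-order update per step instead of more first-order steps.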

2.1. LF-Block

The leapfrog method is a second-order linear two-step method that refines the forward Euler scheme.
By doubling the time interval h, the approximation of y′ can be rewritten in the form y′ ≈ (yₙ₊₁ − yₙ₋₁)/(2h). Thus, yₙ₊₁ = yₙ₋₁ + 2h·y′ₙ.

In order to retain flexibility and obtain a block architecture, every three consecutive update formulas are grouped into a block as:

As mentioned:

where G is some kind of convolutional block.
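The leapfrog update can be tried on a toy problem. The sketch below is an illustration under assumptions, not the paper's block: it uses the scalar ODE y′ = −y, bootstraps the two-step scheme with one forward Euler step, and compares the error at x = 1 with plain forward Euler at the same step size. All function names are hypothetical.

```python
import math

def f(x, y):
    # Toy right-hand side y' = -y (an assumption for illustration).
    return -y

def euler_solve(f, y0, x_end, n_steps):
    h = x_end / n_steps
    x, y = 0.0, y0
    for _ in range(n_steps):
        y = y + h * f(x, y)
        x += h
    return y

def leapfrog_solve(f, y0, x_end, n_steps):
    """Two-step leapfrog: y[n+1] = y[n-1] + 2h * f(x[n], y[n])."""
    h = x_end / n_steps
    y_prev = y0
    y_curr = y0 + h * f(0.0, y0)  # one Euler step supplies the second starting value
    x = h
    for _ in range(n_steps - 1):
        y_next = y_prev + 2.0 * h * f(x, y_curr)
        y_prev, y_curr = y_curr, y_next
        x += h
    return y_curr

exact = math.exp(-1.0)
err_eu = abs(euler_solve(f, 1.0, 1.0, 10) - exact)
err_lf = abs(leapfrog_solve(f, 1.0, 1.0, 10) - exact)
print(err_eu, err_lf)
```

In block form, the term 2h·f would be replaced by a learned map G, in the same way that h·f becomes G in the residual block.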

2.2. RK2-block

Consider the Runge-Kutta family, which is widely used in numerical analysis. Making use of the trapezoidal formula:

We obtain a block structure:

In mathematics, these formulas are referred to as Heun’s method, which is a two-stage second-order Runge-Kutta method.
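Heun's method can be sketched in two equivalent shapes: as a numerical ODE step, and as a block in which h·f is replaced by maps G. The code below is a minimal illustration; `rk2_block`, `G1` and `G2` are hypothetical names, and whether the two stages share weights in the actual network is not specified here.

```python
def heun_step(f, x, y, h):
    """One step of Heun's method (explicit trapezoidal rule)."""
    k1 = f(x, y)
    y_pred = y + h * k1             # forward Euler predictor
    k2 = f(x + h, y_pred)
    return y + 0.5 * h * (k1 + k2)  # trapezoidal corrector

def rk2_block(y, G1, G2):
    """Block form of the same update, with h*f replaced by callables G1, G2 (assumed)."""
    g1 = G1(y)
    g2 = G2(y + g1)
    return y + 0.5 * (g1 + g2)

# Toy check with y' = -y and h = 0.1: both shapes give the same value.
h = 0.1
y_ode = heun_step(lambda x, y: -y, 0.0, 1.0, h)
G = lambda y: -h * y  # stands in for a learned convolutional block
y_blk = rk2_block(1.0, G, G)
print(y_ode, y_blk)
```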

2.3. RK3-Block

Higher-order methods should obtain a smaller local truncation error.
Explicit iterative Runge-Kutta methods can be extended to arbitrary n stages:

In particular, the 3-stage, third-order Runge-Kutta method is:

Generally, higher-order methods tend to generate more complicated blocks.
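To see the effect of order on the local truncation error, the sketch below takes a single step of size h = 0.1 on the toy ODE y′ = −y with forward Euler, Heun's method, and a 3-stage third-order Runge-Kutta scheme. The RK3 coefficients used are those of the classical Kutta third-order method, which is an assumption here; the paper's exact coefficients may differ.

```python
import math

def f(x, y):
    # Toy right-hand side y' = -y (an assumption for illustration).
    return -y

def euler_step(f, x, y, h):
    return y + h * f(x, y)

def heun_step(f, x, y, h):
    k1 = f(x, y)
    k2 = f(x + h, y + h * k1)
    return y + 0.5 * h * (k1 + k2)

def rk3_step(f, x, y, h):
    """Classical Kutta third-order step (3 stages)."""
    k1 = f(x, y)
    k2 = f(x + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(x + h, y - h * k1 + 2.0 * h * k2)
    return y + h / 6.0 * (k1 + 4.0 * k2 + k3)

h = 0.1
exact = math.exp(-h)
err_euler = abs(euler_step(f, 0.0, 1.0, h) - exact)
err_heun = abs(heun_step(f, 0.0, 1.0, h) - exact)
err_rk3 = abs(rk3_step(f, 0.0, 1.0, h) - exact)
print(err_euler, err_heun, err_rk3)
```

Each extra stage costs one more evaluation of f (one more G in block form), which is the sense in which higher-order methods lead to more complicated blocks.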
(Please note that I am not an expert in ODEs. Please feel free to read the paper if interested.)

3. Overall Network Architecture and Block Design

3.1. Network Architecture

The dimension of input and output feature maps of G is kept unchanged.
The OISR blocks are in the middle of the network.
Residual learning is used.
Pixel shuffle, as in ESPCN, is used at the end to produce the super-resolved (SR) output.
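For readers unfamiliar with pixel shuffle, the rearrangement can be sketched in plain Python on nested lists. This is an illustration only: the function name is hypothetical, and the channel ordering is assumed to follow the common convention in which input channel c·r² + i·r + j fills sub-pixel position (i, j) of output channel c.

```python
def pixel_shuffle(feat, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r)."""
    c_in = len(feat)
    h, w = len(feat[0]), len(feat[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = feat[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        # Each low-resolution pixel spreads into an r x r patch.
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out

# Four 1x1 channels become one 2x2 channel for upscaling factor 2.
lr = [[[1]], [[2]], [[3]], [[4]]]
sr = pixel_shuffle(lr, 2)
print(sr)  # [[[1, 2], [3, 4]]]
```

The upscaling therefore happens only at the very end, so all OISR blocks operate at low resolution.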

3.2. Block Design (G)

Different types of G are tried. The search space for G is large.
Here, only three different forms are chosen to illustrate the general effectiveness of ODE-inspired schemes.
Each of these designs keeps at least one activation function and one convolutional layer, thus guaranteeing nonlinearity.

4. Experimental Results

4.1. Ablation Study

The first 800 images in DIV2K are used for training. L1 loss is used.
The ablation study is performed on the DIV2K validation set.

"PReLU+Conv", namely G-v2, is suitable for LF-blocks.
RK2-blocks should be equipped with G-v3.

4.2. SOTA Comparison

**Quantitative Comparisons on Benchmark Datasets**

The small-scale network designs are suffixed by "-s".
The small-scale models, OISR-RK2-s and OISR-LF-s, outperform other methods such as FSRCNN, DRRN, MemNet, SelNet and CARN on different upscaling factors and datasets, except for being slightly behind on Urban100 with upscaling factor ×2.
In addition, the middle-scale models, OISR-RK2 and OISR-LF, surpass MSRN with only two exceptions: SSIM on B100 and Urban100 when the upscaling factor is ×2.

Compared with current state-of-the-art deep residual methods, OISR-RK3 achieves the best performance in most cases, outperforming LapSRN, VDSR, DRCN, MDSR, RDN, and EDSR.