Reading: OISR — ODE-Inspired Schemes to Super-Resolution Network Designs (Super Resolution)
In this story, ODE-inspired Network Design for Single Image Super-Resolution (OISR), by Chinese Academy of Sciences, University of Chinese Academy of Sciences, CAS, Alibaba Group, is presented. In this paper:
- An ordinary differential equation (ODE)-inspired design scheme, which has brought a new understanding of ResNet in classification problems, is adopted for single image super-resolution (SISR).
- Two types of network structures are derived: LF-block and RK-block, which correspond to the Leapfrog method and Runge-Kutta method in numerical ordinary differential equations.
This is a paper in 2019 CVPR with over 20 citations. (Sik-Ho Tsang @ Medium)
- ODE-Inspired Network Design
- Derivation of OISR-Blocks
- Overall Network Architecture and Block Design
- Experimental Results
1. ODE-Inspired Network Design
- From a dynamical-system perspective, a network defines a map that takes the input state forward x units of time in phase space.
- In CNN semantics, the time horizon x corresponds to the layers, which can be adaptively chosen, while the final state is constrained by the labels.
- Consider a dynamical system that can be described by an ODE:
- This system gives a map:
- with initial state y_0 ∈ R^d. Suppose p(y_0) is the distribution of the input feature y_0 on a domain. If we regard CNN-based SISR as such a dynamical system, then we are supposed to minimize:
- where Φ is the map to be learned in SISR.
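- As the system is non-linear, there is no simple formula describing the map, so numerical methods are used. For an ODE written as y′ = f(x, y), the simplest is the forward Euler method with step size h: y_{n+1} = y_n + h·f(x_n, y_n).
- This provides the approximation. It can be seen as approximating the integral of y′ over an interval of width h by h times the value of y′ at the left endpoint.
- A residual block takes a similar form: y_{n+1} = y_n + G(y_n), where G is the residual branch.
- This suggests a relationship between the two, and the bridge is established by defining G(y_n) = h·f(x_n, y_n).
- Thus, a forward Euler step maps to a residual block.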
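The correspondence between a forward Euler step and a residual block can be checked numerically. The following is a minimal sketch, not code from the paper: it assumes a toy scalar ODE y′ = −y, and the names `f`, `make_G`, `euler_step`, `residual_block`, and `integrate` are purely illustrative. With G(y) = h·f(x, y), one residual update gives the same result as one forward Euler step.

```python
import math


def f(x, y):
    """Right-hand side of the toy ODE y' = -y (an assumption for illustration)."""
    return -y


def euler_step(x, y, h):
    """One forward Euler step: y_{n+1} = y_n + h * f(x_n, y_n)."""
    return y + h * f(x, y)


def make_G(h):
    """Build a residual branch G with G(y_n) = h * f(x_n, y_n).

    The toy f does not depend on x, so x is fixed to 0.0 inside G.
    """
    return lambda y: h * f(0.0, y)


def residual_block(y, G):
    """One residual update: y_{n+1} = y_n + G(y_n)."""
    return y + G(y)


def integrate(y0, h, steps):
    """Stack `steps` residual blocks, which amounts to `steps` Euler steps."""
    G = make_G(h)
    y = y0
    for _ in range(steps):
        y = residual_block(y, G)
    return y


if __name__ == "__main__":
    h, steps = 0.01, 100
    # Approximate y(1) for y(0) = 1 and compare with the exact value exp(-1).
    print("stacked residual blocks:", integrate(1.0, h, steps))
    print("exact solution:         ", math.exp(-1.0))
```

In this picture, stacking more blocks corresponds to taking more (smaller) steps toward the final state. In a real network G would be a learned convolutional branch rather than a fixed function.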
2. Derivation of OISR-Blocks
- To learn such a map Φ, it may take many steps to reach the final state, and each step corresponds to a CNN block.
- (The CNN blocks shown above are derived below. The G block is composed of convolutions and activation functions, and it is defined in the next section, after the block variants are derived.)
- Either increasing the number of steps or refining the motion of each step helps to achieve the goal. These correspond to increasing the number of blocks and designing finer blocks, respectively.
- Higher-order methods are therefore expected to bring some benefits.
- The leapfrog method is a second-order linear two-step method, a refinement of the forward Euler scheme.
- By doubling the time interval to 2h, the approximation of y′ can be rewritten as y′ ≈ (y_{n+1} − y_{n−1})/(2h). Thus, y_{n+1} = y_{n−1} + 2h·f(x_n, y_n), where f is the right-hand side of the ODE y′ = f(x, y).
- In order to retain flexibility and obtain a block architecture, every three of the formulas above are grouped into one block:
- As mentioned:
- where G is some kind of convolutional block.
- Consider the Runge-Kutta family, which is widely used in numerical analysis. Making use of the trapezoidal formula:
- We obtain a block structure:
- In mathematics, these formulas are referred to as Heun’s method, which is also a two-stage second-order Runge-Kutta method.
- Higher-order methods should obtain a smaller local truncation error.
- Explicit iterative Runge-Kutta methods can be extended to arbitrary n stages:
- In particular, the 3-stage Runge-Kutta method of third order is:
- Generally, higher-order methods tend to generate more complicated blocks.
- (Please note that I am not an expert in ODEs. Please feel free to read the paper if interested.)
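The block forms discussed in this section can be sketched as plain update rules. This is a minimal illustration under stated assumptions, not the paper's implementation: G is an ordinary Python function standing in for the convolutional block, the same G is reused for every stage (a simplification), the RK3 coefficients are those of Kutta's classical third-order scheme (assumed here to be the variant meant), and the names `lf_block`, `rk2_block`, `rk3_block`, `run_rk`, and `run_lf` are hypothetical. The toy ODE y′ = −y is used only to compare the accuracy of the schemes.

```python
import math


def lf_block(y_prev, y_curr, G):
    """Leapfrog-style block: three chained updates y_{n+1} = y_{n-1} + G(y_n).

    G plays the role of 2h * f. The last two states are returned so that
    blocks can be chained.
    """
    for _ in range(3):
        y_prev, y_curr = y_curr, y_prev + G(y_curr)
    return y_prev, y_curr


def rk2_block(y, G):
    """Two-stage second-order block (Heun's method), with G in the role of h * f."""
    g1 = G(y)
    g2 = G(y + g1)
    return y + 0.5 * (g1 + g2)


def rk3_block(y, G):
    """Three-stage third-order block (Kutta's scheme, assumed), G in the role of h * f."""
    g1 = G(y)
    g2 = G(y + 0.5 * g1)
    g3 = G(y - g1 + 2.0 * g2)
    return y + (g1 + 4.0 * g2 + g3) / 6.0


def run_rk(block, G, y0, n_blocks):
    """Apply a one-step block n_blocks times, starting from y0."""
    y = y0
    for _ in range(n_blocks):
        y = block(y, G)
    return y


def run_lf(G, y0, y1, n_blocks):
    """Chain n_blocks leapfrog blocks; a two-step method needs two start values."""
    y_prev, y_curr = y0, y1
    for _ in range(n_blocks):
        y_prev, y_curr = lf_block(y_prev, y_curr, G)
    return y_curr


if __name__ == "__main__":
    h = 0.01
    exact = math.exp(-1.0)                # y(1) for y' = -y, y(0) = 1
    G = lambda y: h * (-y)                # G = h * f for the RK blocks
    G_lf = lambda y: 2.0 * h * (-y)       # G = 2h * f for the leapfrog block
    print("RK2 error:", abs(run_rk(rk2_block, G, 1.0, 100) - exact))
    print("RK3 error:", abs(run_rk(rk3_block, G, 1.0, 100) - exact))
    # 33 blocks of 3 updates, starting from y_0 and y_1, end at y_100 (x = 1).
    print("LF error: ", abs(run_lf(G_lf, 1.0, math.exp(-h), 33) - exact))
```

On this toy problem the third-order block gives a smaller error than the second-order one at the same step size, which is the motivation for the higher-order designs. The price is more evaluations of G per block.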
3. Overall Network Architecture and Block Design
3.1. Network Architecture
- The dimension of input and output feature maps of G is kept unchanged.
- The OISR blocks are in the middle of the network.
- Residual learning is used.
- Pixel shuffle, as in ESPCN, is used at the end to produce the super-resolved (SR) output.
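The pixel-shuffle step at the end of the network rearranges channels into spatial positions. The following is a minimal pure-Python sketch on nested lists, assuming the channel ordering used by common deep-learning libraries (input channel c·r² + i·r + j feeds output pixel offset (i, j) of output channel c). The function name `pixel_shuffle` and the nested-list layout are illustrative choices, not taken from the paper.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list.

    Input channel c*r*r + i*r + j supplies the output pixel at
    (h*r + i, w*r + j) of output channel c (assumed ordering convention).
    """
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    # Allocate the upscaled output, filled in below.
    out = [[[0.0] * (width * r) for _ in range(height * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


if __name__ == "__main__":
    # Four 1x1 feature maps become one 2x2 map for an upscaling factor of 2.
    features = [[[0]], [[1]], [[2]], [[3]]]
    print(pixel_shuffle(features, 2))
```

The convolutions before this step therefore run at low resolution, and the upscaling itself is a pure rearrangement with no arithmetic.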
3.2. Block Design (G)
- The search space for G is large, and different types of G were tried.
- Here, only three different forms are chosen to illustrate the general effectiveness of ODE-inspired schemes.
- Each of these designs keeps at least one activation function and one convolutional layer, thus guaranteeing nonlinearity.
4. Experimental Results
4.1. Ablation Study
- The first 800 images in DIV2K are used for training. L1 loss is used.
- The ablation study is performed on the DIV2K validation set.
- “PReLU+Conv”, namely G-v2, is suitable for LF-blocks.
- RK2-blocks should be equipped with G-v3.
4.2. SOTA Comparison
- The small-scale network designs are suffixed by “-s”.
- The small-scale models, OISR-RK2-s and OISR-LF-s, outperform other methods such as FSRCNN, DRRN, MemNet, SelNet and CARN on different upscaling factors and datasets, except that they fall slightly behind on Urban100 at upscaling factor ×2.
- In addition, the middle-scale models, OISR-RK2 and OISR-LF, surpass MSRN with only two exceptions on B100 and Urban100 SSIM when the upscaling factor is 2.
- Among current state-of-the-art deep residual methods, OISR-RK3 achieves the best performance in most cases, outperforming LapSRN, VDSR, DRCN, MDSR, RDN, and EDSR.
- OISRs can reconstruct more detailed images with less blurring.
[2019 CVPR] [OISR]
ODE-inspired Network Design for Single Image Super-Resolution