Review: WDRN / WavResNet — Wavelet-based Deep Residual Learning Network (Image Denoising & Super Resolution)
Wavelet-Transformed Image as Input, Outperforms VDSR and DnCNN, Ranked Third in the NTIRE Competition
In this story, “Beyond Deep Residual Learning for Image Restoration: Persistent Homology-Guided Manifold Simplification”, by Korea Advanced Institute of Science and Technology (KAIST), is briefly reviewed. The short form WDRN is used here because the paper is cited under this name in a 2018 JEI paper. Another short form, WavResNet, is used in a 2018 CVPRW paper, and also in a paper called “Wavelet Domain Residual Network (WavResNet) for Low-Dose X-ray CT Reconstruction” (which shares the same last author), whose network is very similar.
Instead of feeding the original image directly into the CNN, the image is first wavelet-transformed. Because of this, the complexity is low and the inference time is short. The method ranked 3rd in the NTIRE challenge, which made it a 2017 CVPRW paper with over 70 citations. (Sik-Ho Tsang @ Medium)
Outline
- Network Architecture
- Experimental Results
1. Network Architecture
- There are two architectures: the denoising architecture and the NTIRE SISR competition architecture.
1.1. Denoising Architecture
- The input and the clean label images are first decomposed into four subbands (i.e. LL, LH, HL, and HH) using the wavelet transform.
- The wavelet residual images, which are used as the new labels, are obtained as the difference between the input and the clean label images in the wavelet domain.
- Then, the network is trained to learn the multi-input, multi-output functional relationship between these newly processed inputs and labels.
- Four patches at the same location, one from each wavelet subband, are extracted and used for training. (A code sketch of this wavelet-domain data preparation and of the network layout is given after this list.)
- The network consists of five modules between the first and the last stages.
- Each module has one bypass connection, three convolution layers, three batch normalizations, and three ReLU layers.
- The first stage contains two layers: a convolution layer with ReLU, followed by another convolution layer with batch normalization and ReLU.
- The last stage is composed of three layers: two layers with convolution, batch normalization, and ReLU, and a final convolution layer.
- The total number of convolution layers is 20, and the convolution filters are 3×3 with 320 input and 320 output channels (3×3×320×320).
- (Please read about batch normalization in Inception-v2 and the bypass (skip) connection in ResNet.)
- Three advantages of using the wavelet transform:
1. The input feature space is mapped to another feature space that is easier to learn, which helps to reduce the network depth and thus the computational complexity.
2. The patch size can be halved. Since the outputs of all layers are also halved in size, the runtime of the network is reduced.
3. The minimum required receptive field size can be reduced.
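Below is a minimal sketch (not the authors' code) of the wavelet-domain data preparation and the denoising network layout described above, in Python with PyWavelets and PyTorch. The 320 channels, five modules, and 20 convolution layers come from the description above; the Haar wavelet, grayscale input, and the 4-channel first/last layers are assumptions, and names such as WDRNDenoise are hypothetical.

```python
# Minimal sketch of the wavelet-domain residual denoising setup (not the authors' code).
# Assumptions: Haar wavelet, grayscale input, 4-channel input/output (one per subband).
import numpy as np
import pywt
import torch
import torch.nn as nn

def wavelet_subbands(img):
    """Decompose a 2-D image into its four wavelet subbands (LL, LH, HL, HH)."""
    LL, (LH, HL, HH) = pywt.dwt2(img, 'haar')   # the Haar wavelet is an assumption
    return np.stack([LL, LH, HL, HH], axis=0)   # shape: (4, H/2, W/2)

def make_training_pair(noisy, clean):
    """Input: noisy subbands; label: wavelet-domain residual (input minus clean)."""
    x = wavelet_subbands(noisy)
    y = x - wavelet_subbands(clean)
    return torch.from_numpy(x).float(), torch.from_numpy(y).float()

def conv_bn_relu(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out),
                         nn.ReLU(inplace=True))

class Module3(nn.Module):
    """Basic module: three conv-BN-ReLU layers plus one bypass connection."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(*[conv_bn_relu(ch, ch) for _ in range(3)])

    def forward(self, x):
        return x + self.body(x)   # bypass (residual) connection

class WDRNDenoise(nn.Module):
    """First stage (2 convs) + 5 modules (15 convs) + last stage (3 convs) = 20 convs."""
    def __init__(self, ch=320, n_modules=5):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(4, ch, 3, padding=1), nn.ReLU(inplace=True),
                                  conv_bn_relu(ch, ch))
        self.body = nn.Sequential(*[Module3(ch) for _ in range(n_modules)])
        self.tail = nn.Sequential(conv_bn_relu(ch, ch), conv_bn_relu(ch, ch),
                                  nn.Conv2d(ch, 4, 3, padding=1))

    def forward(self, x):                          # x: (N, 4, H/2, W/2) noisy subbands
        return self.tail(self.body(self.head(x)))  # predicted wavelet residual
```

At test time, since the residual is defined as input minus clean, the clean subbands are recovered by subtracting the predicted residual from the noisy subbands, and the image is then reconstructed with the inverse wavelet transform (pywt.idwt2).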
1.2. NTIRE SISR (Single Image Super Resolution) Competition Architecture
- These architectures are extended from the primary denoising architecture. Depending on the decimation scheme of the low-resolution dataset (bicubic ×2, ×3, ×4, and unknown ×2, ×3, ×4), three different architectures are implemented.
- All three SISR architectures have 41 convolution layers.
- To reconstruct the bicubic ×2 downsampled dataset, two long bypass connections are used across six basic modules in the network, and the number of channels is 256. (A code sketch of this long bypass connection is given after this list.)
- For the other datasets, the long bypass connection is not used, and the number of channels is 320.
- With the long bypass connections, the average PSNR/SSIM is improved.
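Below is a minimal sketch of the long bypass idea for the bicubic ×2 architecture. It reuses the basic module from the denoising sketch above (redefined here so the snippet is self-contained). The first/last stage layout, the 4-channel input, and the count of 12 basic modules are assumptions; the module count is inferred from 41 = 2 + 12×3 + 3 convolution layers and is not stated explicitly above.

```python
# Minimal sketch of the bicubic x2 SISR architecture with long bypass connections
# (not the authors' code). Assumptions: same first/last stages and basic module as
# the denoising sketch; 12 basic modules inferred from 41 = 2 + 12*3 + 3 conv layers.
import torch.nn as nn

def conv_bn_relu(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out),
                         nn.ReLU(inplace=True))

class Module3(nn.Module):
    """Basic module: three conv-BN-ReLU layers with a bypass connection."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(*[conv_bn_relu(ch, ch) for _ in range(3)])

    def forward(self, x):
        return x + self.body(x)

class LongBypassGroup(nn.Module):
    """Six basic modules wrapped by one long bypass connection."""
    def __init__(self, ch, n_modules=6):
        super().__init__()
        self.group = nn.Sequential(*[Module3(ch) for _ in range(n_modules)])

    def forward(self, x):
        return x + self.group(x)   # long bypass over the whole group of modules

def build_sisr_bicubic_x2(ch=256):
    """256 channels and two long bypass groups for the bicubic x2 track;
    the other tracks drop the long bypasses and use 320 channels instead."""
    return nn.Sequential(
        nn.Conv2d(4, ch, 3, padding=1), nn.ReLU(inplace=True),  # first stage:
        conv_bn_relu(ch, ch),                                   # 2 conv layers
        LongBypassGroup(ch), LongBypassGroup(ch),               # 2 x 6 modules = 36 conv layers
        conv_bn_relu(ch, ch), conv_bn_relu(ch, ch),             # last stage:
        nn.Conv2d(ch, 4, 3, padding=1),                         # 3 conv layers (41 in total)
    )
```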
2. Experimental Results
2.1. Denoising
- The proposed network outperforms state-of-the-art denoising methods such as DnCNN in terms of PSNR and SSIM on all Set12 images and on the BSD68 dataset.
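As a side note, the PSNR and SSIM metrics used in this comparison can be computed with scikit-image as in the generic sketch below; this is not the authors' evaluation code, and the dummy images are only for illustration.

```python
# Generic PSNR/SSIM evaluation sketch on 8-bit grayscale images (not the authors' code).
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(denoised, clean):
    psnr = peak_signal_noise_ratio(clean, denoised, data_range=255)
    ssim = structural_similarity(clean, denoised, data_range=255)
    return psnr, ssim

# Example with dummy data:
clean = np.random.randint(0, 256, (256, 256)).astype(np.uint8)
noisy = np.clip(clean + np.random.normal(0, 25, clean.shape), 0, 255).astype(np.uint8)
print(evaluate(noisy, clean))
```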
2.2. NTIRE SISR (Single Image Super Resolution) Competition
- As mentioned above, the proposed networks ranked third in the NTIRE SISR competition.
Reference
[2017 CVPRW] [WDRN / WavResNet]
Beyond Deep Residual Learning for Image Restoration: Persistent Homology-Guided Manifold Simplification
Super Resolution
[SRCNN] [FSRCNN] [VDSR] [ESPCN] [RED-Net] [DnCNN] [DRCN] [DRRN] [LapSRN & MS-LapSRN] [MemNet] [IRCNN] [WDRN / WavResNet] [SRDenseNet] [SRGAN & SRResNet] [EDSR & MDSR] [SR+STN]
Image Restoration
[RED-Net] [DnCNN] [MemNet] [IRCNN] [WDRN / WavResNet]