Reading: CSFN & CSFN-M — Channel Splitting and Fusion Network (Super Resolution)
CSFN Outperforms CARN, IDN, MemNet & VDSR; CSFN-M Outperforms CARN-M, SRFBN-S & DRRN
In this story, Low Complexity Single Image Super-Resolution with Channel Splitting and Fusion Network (CSFN), by Nanjing University, is presented. In this paper:
- A low complexity solution based on channel splitting and fusion network (CSFN) is proposed.
- Channel splitting and channel fusion are used to enhance feature maps and make full use of valuable information.
- Multiple residual channel splitting and fusion blocks (CSFB) are cascaded to continuously extract more important information for reconstruction.
- To further reduce redundant parameters and improve efficiency, group convolution and recursive convolution strategies are adopted in CSFB to form a lightweight block called CSFB-M (M stands for Mobile).
This is a paper in 2020 ICASSP. (Sik-Ho Tsang @ Medium)
Outline
- CSFN: Network Architecture
- Residual Channel Splitting and Fusion Block (CSFB)
- Lightweight Version: CSFB-M
- Experimental Results
1. CSFN: Network Architecture
- CSFN is divided into three parts: shallow feature extraction block (FEBlock), residual channel splitting and fusion blocks (CSFBlocks) and upscale block (UPBlock).
- FEBlock: one convolutional layer is used to extract features from the LR image. F0 denotes the shallow features extracted by FEBlock.
- CSFBlocks: consist of multiple CSFBs and a bottom convolution layer, and the output of CSFBlocks can be expressed as:
f_t(H_CSF,n(… H_CSF,2(H_CSF,1(F_0)) …))
- where f_t is the bottom convolution function and H_CSF,i represents the operation of the i-th CSFB.
- UPBlock: performs the LR-to-HR transformation and reconstructs the HR image.
- The ESPCN pixel-shuffle structure is used in the upscale module. For scales ×2 and ×3, one Conv-PixelShuffle structure is used; for scale ×4, two Conv-PixelShuffle modules are cascaded (a sketch follows below).
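To make the upscale step concrete, here is a minimal PyTorch sketch of an ESPCN-style UPBlock, assuming 64 feature channels (as set in Section 4.1) and a 3-channel RGB output; the module name, the trailing reconstruction convolution, and the exact layer arrangement are assumptions, not the authors' exact code.

```python
import torch.nn as nn

class UPBlock(nn.Sequential):
    """ESPCN-style upscaling: Conv expands channels, PixelShuffle rearranges
    them into spatial resolution. Scale 4 uses two cascaded x2 modules."""
    def __init__(self, scale, channels=64, out_channels=3):
        layers = []
        steps = [2, 2] if scale == 4 else [scale]
        for s in steps:
            layers += [
                nn.Conv2d(channels, channels * s * s, 3, padding=1),
                nn.PixelShuffle(s),  # (C*s^2, H, W) -> (C, H*s, W*s)
            ]
        # assumed final conv mapping features back to an RGB image
        layers.append(nn.Conv2d(channels, out_channels, 3, padding=1))
        super().__init__(*layers)
```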
2. Residual Channel Splitting and Fusion Block (CSFB)
- The proposed CSFB can be roughly divided into two pipelines: the left pipeline is a feature fusion module based on channel split (CSFF), and the right one is a global feature extraction (GFE) module.
- The remaining part of CSFB is a skip connection for residual learning.
2.1. Feature Fusion module based on Channel Split (CSFF)
- CSFF module mainly focuses on the imbalance of channel information.
- An asymmetric channel split is adopted to divide the feature into two parts.
- Fi-1 is split into two parts which contain s and c0-s channels respectively, where s is less than c0/2 in the network.
- After convolution, the dimensions of the left and right branches in CSFF are denoted as ca and cb, subject to ca+cb=c0. The restriction ca>s guides the network to extract more information.
- Then the two feature maps are fused by concatenation and a 1×1 convolution.
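A minimal sketch of the CSFF pipeline, assuming PyTorch. The split sizes follow Section 4.1 (c0=64, s=16, ca=cb=32) and satisfy the constraints ca+cb=c0 and ca>s; using a single 3×3 Conv-ReLU per branch is an assumption.

```python
import torch
import torch.nn as nn

class CSFF(nn.Module):
    """Channel-split feature fusion: asymmetric split into s and c0-s
    channels, per-branch convolution, then concat + 1x1 fusion."""
    def __init__(self, c0=64, s=16, ca=32, cb=32):
        super().__init__()
        assert ca + cb == c0 and ca > s  # constraints stated in the paper
        self.s = s
        self.branch_a = nn.Sequential(nn.Conv2d(s, ca, 3, padding=1),
                                      nn.ReLU(inplace=True))
        self.branch_b = nn.Sequential(nn.Conv2d(c0 - s, cb, 3, padding=1),
                                      nn.ReLU(inplace=True))
        self.fuse = nn.Conv2d(ca + cb, c0, 1)  # 1x1 conv fuses the branches

    def forward(self, x):
        xa, xb = torch.split(x, [self.s, x.size(1) - self.s], dim=1)
        return self.fuse(torch.cat([self.branch_a(xa),
                                    self.branch_b(xb)], dim=1))
```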
2.2. Global Feature Extraction (GFE)
- The main purpose of the GFE module (the right pipeline) is to compensate for the CSFF module: since CSFF uses only partial channel information independently, its global information is incomplete.
- A channel compression and expansion unit is used to extract features and promote channel information fusion while reducing the number of parameters.
- The feature-map dimensions of the Conv-ReLU outputs are denoted as cn, cn, and c0 respectively (cn < c0).
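Putting the two pipelines together, here is a hedged sketch of a full CSFB, reusing the CSFF class from the sketch above. GFE is written as three Conv-ReLU layers with output widths cn, cn, c0 as described; merging the two pipelines with concatenation plus a 1×1 convolution is an assumption.

```python
import torch
import torch.nn as nn

class CSFB(nn.Module):
    """Residual channel splitting and fusion block: CSFF (left pipeline) +
    GFE compress-expand unit (right pipeline) + residual skip connection."""
    def __init__(self, c0=64, s=16, ca=32, cb=32, cn=16):
        super().__init__()
        self.csff = CSFF(c0, s, ca, cb)  # from the Section 2.1 sketch
        self.gfe = nn.Sequential(
            nn.Conv2d(c0, cn, 3, padding=1), nn.ReLU(inplace=True),  # compress
            nn.Conv2d(cn, cn, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(cn, c0, 3, padding=1), nn.ReLU(inplace=True))  # expand
        self.merge = nn.Conv2d(2 * c0, c0, 1)  # assumed pipeline fusion

    def forward(self, x):
        out = self.merge(torch.cat([self.csff(x), self.gfe(x)], dim=1))
        return out + x  # residual learning
```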
3. Lightweight Version: CSFB-M
- To further reduce parameters and enhance feature maps, a more lightweight network (CSFN-M) is proposed, which replaces CSFB with CSFB-M.
- Group convolution, which originated in AlexNet and ResNeXt, is adopted in CSFF and GFE.
- The CSFB block is applied recursively so that the feature maps are enhanced t times with shared weights (see the sketch below).
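To illustrate the two parameter-saving tricks, here is a minimal sketch of a grouped, weight-shared recursive unit, assuming PyTorch and the Section 4.1 settings (groups=2, t=3); where exactly the groups are placed inside CSFF and GFE is an assumption.

```python
import torch.nn as nn

class RecursiveGroupUnit(nn.Module):
    """One grouped Conv-ReLU whose weights are reused t times, as in
    recursive convolution; group conv cuts parameters roughly by `groups`."""
    def __init__(self, channels=64, groups=2, t=3):
        super().__init__()
        self.t = t
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=groups),
            nn.ReLU(inplace=True))

    def forward(self, x):
        for _ in range(self.t):
            x = self.body(x)  # same weights applied t times -> fewer parameters
        return x
```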
4. Experimental Results
4.1. Settings
- 3×3 convolutional layers are used except for the specified 1×1 convolutional layers. The number of channels in Fi is set to 64.
- 10 CSFBs are used in the proposed CSFN and 5 for CSFN-M.
- In each CSFB, c0, s, ca, cb and cn are set to 64, 16, 32, 32 and 16 respectively.
- For the proposed CSFB-M, the same settings as CSFB are used, but the number of groups is set to 2 and t is set to 3 for each CSFB-M.
- L1 loss is used.
- DIV2K 800 training images are used for training.
- (There are experiments for network analysis, but I think the results are quite close…)
4.2. SOTA Comparison
- For a fair comparison, heavy networks such as EDSR, DBPN, RDN and RCAN are excluded.
- CSFN achieves better performance with fewer parameters and FLOPs than CARN; note that CARN uses larger patch sizes and a multi-scale training strategy at training time to improve its final result.
- Of course, CSFN also outperforms IDN, MemNet and VDSR.
- As a more lightweight model, CSFN-M achieves higher performance than DRRN and CARN-M, and similar or better performance than SRFBN-S, while SRFBN-S has larger FLOPs.
- img076: The proposed CSFN and CSFN-M rebuild the glass edge more accurately, while other models only smooth the area.
- barbara: Other models generate the wrong texture, while CSFN predicts the texture of this spotted cloth correctly, and CSFN-M correctly predicts partial structures with fewer parameters.
- img092: All other methods infer the wrong black line, but CSFN makes full use of the information in the low-resolution image to accurately estimate the direction of the line.
This is the 24th story of this month.
Reference
[2020 ICASSP] [CSFN & CSFN-M]
Low Complexity Single Image Super-Resolution with Channel Splitting and Fusion Network
Super Resolution
[SRCNN] [FSRCNN] [VDSR] [ESPCN] [RED-Net] [DnCNN] [DRCN] [DRRN] [LapSRN & MS-LapSRN] [MemNet] [IRCNN] [WDRN / WavResNet] [MWCNN] [SRDenseNet] [SRGAN & SRResNet] [SelNet] [CNF] [BT-SRN] [EDSR & MDSR] [EnhanceNet] [MDesNet] [RDN] [SRMD & SRMDNF] [DBPN & D-DBPN] [RCAN] [ESRGAN] [CARN] [IDN] [ZSSR] [MSRN] [SR+STN] [IDBP-CNN-IA] [SRFBN] [OISR] [PRLSR] [CSFN & CSFN-M]