Review: DS-CNN — Decoder-side Scalable CNN (Codec Filtering)

A Scalable Quality Enhancement Approach That Outperforms ARCNN and VRCNN

Sik-Ho Tsang
4 min read · Aug 5, 2019
An example application scenario of DS-CNN

In this story, Decoder-side Scalable CNN (DS-CNN), by Beihang University and the Collaborative Innovation Center of Geospatial Technology, is briefly reviewed. Depending on the decoder-side computational capability, different complexity scales within a single CNN model (as shown above) are used to improve image/video quality, thereby increasing coding efficiency. The paper was published at 2017 ICME with tens of citations. (Sik-Ho Tsang @ Medium)

Outline

  1. From ARCNN to AR-CNN-1 and AR-CNN-2
  2. DS-CNN-I and DS-CNN-B Network Architecture
  3. Experimental Results

1. From ARCNN to AR-CNN-1 and AR-CNN-2

AR-CNN
  • The original ARCNN has 4 convolutional layers, as shown above.
  • (Please feel free to read my ARCNN review if interested.)
  • AR-CNN-1: The authors improve ARCNN by increasing the number of filters.
  • AR-CNN-2: One more layer is added on top of AR-CNN-1. PSNR is improved in both cases.
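To make the widen-then-deepen idea concrete, here is a minimal PyTorch sketch. The original ARCNN layout below follows its commonly cited 9-7-1-5 configuration with 64/32/16 filters; the widened filter counts and the extra 3×3 layer in the AR-CNN-2-style variant are illustrative assumptions, not numbers taken from this paper.

```python
import torch
import torch.nn as nn

class ARCNN(nn.Module):
    """Sketch of the original 4-layer ARCNN (9-7-1-5, 64/32/16 filters)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 64, 9, padding=4), nn.ReLU(inplace=True),   # feature extraction
            nn.Conv2d(64, 32, 7, padding=3), nn.ReLU(inplace=True),  # feature enhancement
            nn.Conv2d(32, 16, 1), nn.ReLU(inplace=True),             # non-linear mapping
            nn.Conv2d(16, 1, 5, padding=2),                          # reconstruction
        )

    def forward(self, x):
        return self.body(x)

class ARCNN2(nn.Module):
    """AR-CNN-1/2-style variant: wider filters plus one extra layer
    (the exact widened widths here are assumptions)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 128, 9, padding=4), nn.ReLU(inplace=True),  # more filters
            nn.Conv2d(128, 64, 7, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),  # added layer
            nn.Conv2d(32, 16, 1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 5, padding=2),
        )

    def forward(self, x):
        return self.body(x)
```

Both networks keep the spatial resolution unchanged (via padding), so the output is a restored frame of the same size as the decoded input.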

2. DS-CNN-I and DS-CNN-B Network Architecture

DS-CNN-I & DS-CNN-B Network Architecture
  • DS-CNN-I: A network consisting of 5 convolutional layers (green), as shown above. It is used for intra-frame coding.
  • DS-CNN-B: It is used for inter-frame coding. Recall that Conv 1 extracts intra-coding features. The outputs of Conv 1 and Conv 6 are concatenated and then convolved together by Conv 7.
  • Thus, Conv 7 denoises features from both intra and inter coding.
  • Conv 8–10 in DS-CNN-B are designed in a similar way.
Scalable Structure of DS-CNN
  • When the computational resources are insufficient, the switches {S0 to S4} are turned off, and only DS-CNN-I runs at the decoder.
  • When the computational resources are sufficient, {S0 to S4} are turned on, and DS-CNN-B works on top of the outputs from layers Conv 1–4 of DS-CNN-I.
  • Because inter-coding distortion is reduced, the quality of B/P frames can be further enhanced by DS-CNN-B, at the cost of higher computational complexity.
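The scalable structure above can be sketched as a single PyTorch module with a flag playing the role of the switches: the intra branch (DS-CNN-I) always runs, and when the flag is on, its intermediate features are concatenated into the inter branch (DS-CNN-B). Channel widths, kernel sizes, and the per-stage wiring here are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DSCNN(nn.Module):
    """Sketch of the scalable DS-CNN: DS-CNN-I always runs; switches
    (modeled by `use_b`) feed its features into DS-CNN-B when on."""
    def __init__(self, ch=32):
        super().__init__()
        # DS-CNN-I branch: Conv 1-4 produce features, Conv 5 reconstructs.
        self.i_convs = nn.ModuleList([
            nn.Conv2d(1, ch, 5, padding=2),   # Conv 1
            nn.Conv2d(ch, ch, 3, padding=1),  # Conv 2
            nn.Conv2d(ch, ch, 3, padding=1),  # Conv 3
            nn.Conv2d(ch, ch, 3, padding=1),  # Conv 4
        ])
        self.i_out = nn.Conv2d(ch, 1, 3, padding=1)  # Conv 5
        # DS-CNN-B branch: Conv 6 extracts inter features; Conv 7-10 each
        # consume its own features concatenated with matching I-branch features.
        self.b_in = nn.Conv2d(1, ch, 5, padding=2)   # Conv 6
        self.b_convs = nn.ModuleList([
            nn.Conv2d(2 * ch, ch, 3, padding=1)      # Conv 7-10
            for _ in range(4)
        ])
        self.b_out = nn.Conv2d(ch, 1, 3, padding=1)  # reconstruction

    def forward(self, x, use_b=False):
        feats = []
        h = x
        for conv in self.i_convs:
            h = torch.relu(conv(h))
            feats.append(h)            # features tapped by the switches
        if not use_b:                  # switches off: DS-CNN-I only
            return self.i_out(h)
        g = torch.relu(self.b_in(x))   # switches on: run DS-CNN-B
        for conv, f in zip(self.b_convs, feats):
            g = torch.relu(conv(torch.cat([g, f], dim=1)))
        return self.b_out(g)
```

A decoder could then call `net(frame)` for I-frames or under tight resources, and `net(frame, use_b=True)` for B/P frames when resources allow, reusing the already-computed intra features.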

3. Experimental Results

  • DS-CNN-I outperforms ARCNN and VRCNN for I-frames.
  • DS-CNN-B outperforms DS-CNN-I for B-frames, as it has more feature maps for convolution.
Performance Evaluation of DS-CNN

Reference

[2017 ICME] [DS-CNN]
Decoder-side HEVC Quality Enhancement with Scalable Convolutional Neural Network

My Previous Reviews

Image Classification [LeNet] [AlexNet] [Maxout] [NIN] [ZFNet] [VGGNet] [Highway] [SPPNet] [PReLU-Net] [STN] [DeepImage] [SqueezeNet] [GoogLeNet / Inception-v1] [BN-Inception / Inception-v2] [Inception-v3] [Inception-v4] [Xception] [MobileNetV1] [ResNet] [Pre-Activation ResNet] [RiR] [RoR] [Stochastic Depth] [WRN] [Shake-Shake] [FractalNet] [Trimps-Soushen] [PolyNet] [ResNeXt] [DenseNet] [PyramidNet] [DRN] [DPN] [Residual Attention Network] [DMRNet / DFN-MR] [IGCNet / IGCV1] [MSDNet] [ShuffleNet V1] [SENet] [NASNet] [MobileNetV2]

Object Detection [OverFeat] [R-CNN] [Fast R-CNN] [Faster R-CNN] [MR-CNN & S-CNN] [DeepID-Net] [CRAFT] [R-FCN] [ION] [MultiPathNet] [NoC] [Hikvision] [GBD-Net / GBD-v1 & GBD-v2] [G-RMI] [TDM] [SSD] [DSSD] [YOLOv1] [YOLOv2 / YOLO9000] [YOLOv3] [FPN] [RetinaNet] [DCN]

Semantic Segmentation [FCN] [DeconvNet] [DeepLabv1 & DeepLabv2] [CRF-RNN] [SegNet] [ParseNet] [DilatedNet] [DRN] [RefineNet] [GCN] [PSPNet] [DeepLabv3] [LC] [FC-DenseNet] [IDW-CNN] [SDN]

Biomedical Image Segmentation [CUMedVision1] [CUMedVision2 / DCAN] [U-Net] [CFS-FCN] [U-Net+ResNet] [MultiChannel] [V-Net] [3D U-Net] [M²FCN] [SA] [QSA+QNT] [3D U-Net+ResNet]

Instance Segmentation [SDS] [Hypercolumn] [DeepMask] [SharpMask] [MultiPathNet] [MNC] [InstanceFCN] [FCIS]

Super Resolution [SRCNN] [FSRCNN] [VDSR] [ESPCN] [RED-Net] [DRCN] [DRRN] [LapSRN & MS-LapSRN] [SRDenseNet]

Human Pose Estimation [DeepPose] [Tompson NIPS’14] [Tompson CVPR’15] [CPM]

Codec Post-Processing [ARCNN] [Lin DCC’16] [IFCNN] [Li ICME’17] [VRCNN] [DCAD] [DS-CNN]
