Review: DS-CNN — Decoder-side Scalable CNN (Codec Filtering)
In this story, Decoder-side Scalable CNN (DS-CNN), by Beihang University and Collaborative Innovation Center of Geospatial Technology, is briefly reviewed. DS-CNN offers different complexity scales within a single CNN model, so the decoder can choose the scale that matches its computational ability. The network improves the decoded image/video quality and thereby increases the coding efficiency. It was published in 2017 ICME and has tens of citations. (Sik-Ho Tsang @ Medium)
Outline
- From AR-CNN to AR-CNN-1 and AR-CNN-2
- DS-CNN-I and DS-CNN-B Network Architecture
- Experimental Results
2. DS-CNN-I and DS-CNN-B Network Architecture
- DS-CNN-I: It is a network consisting of 5 convolutional layers (shown in green in the figure). It is used for intra frames.
- DS-CNN-B: It is used for inter frames. Recall that Conv 1 extracts intra coding features. The outputs of Conv 1 and Conv 6 are concatenated and then convolved together by Conv 7.
- Thus, Conv 7 denoises the features of both intra and inter coding.
- Conv 8–10 in DS-CNN-B are designed in a similar way.
- When the computational resources are insufficient, the switches {S0 to S4} are turned off, and only DS-CNN-I is in use at the decoder.
- When the computational resources are sufficient, {S0 to S4} are turned on, and DS-CNN-B starts to work based on the output from the layers Conv 1–4 of DS-CNN-I.
- Because of the reduction of inter coding distortion, the quality of B/P frames can be further enhanced by DS-CNN-B, at the cost of higher computational complexity.
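The switching behaviour described above can be sketched as a symbolic dataflow. This is only an illustration under stated assumptions, not the authors' implementation: the layers are recorded by name rather than computed, the helper names (`conv`, `ds_cnn`, `layers_used`) are hypothetical, and the exact wiring of Conv 8–10 and of each switch is inferred from "designed in a similar way" rather than stated in the text. It is also assumed that Conv 5 is skipped when DS-CNN-B produces the output.

```python
# Symbolic dataflow sketch of DS-CNN: layers are recorded, not computed.

def conv(name, *inputs):
    """Stand-in for a convolutional layer: records its name and its inputs."""
    return (name,) + inputs


def ds_cnn(frame, switches_on):
    """Return the symbolic output of DS-CNN for one decoded frame.

    switches_on=False: only DS-CNN-I (Conv 1-5) runs.
    switches_on=True:  DS-CNN-B (Conv 6-10) runs on top of Conv 1-4.
    """
    # DS-CNN-I: a plain chain of convolutional layers.
    c1 = conv("Conv1", frame)
    c2 = conv("Conv2", c1)
    c3 = conv("Conv3", c2)
    c4 = conv("Conv4", c3)
    if not switches_on:
        return conv("Conv5", c4)

    # DS-CNN-B. Assumption: S0 feeds the frame to Conv 6, and S1-S4 feed
    # the outputs of Conv 1-4 to Conv 7-10, where each one is concatenated
    # with the output of the previous DS-CNN-B layer.
    c6 = conv("Conv6", frame)
    c7 = conv("Conv7", c1, c6)
    c8 = conv("Conv8", c2, c7)
    c9 = conv("Conv9", c3, c8)
    return conv("Conv10", c4, c9)


def layers_used(node):
    """Collect the names of all layers that a symbolic output depends on."""
    if isinstance(node, str):  # the raw input frame, not a layer
        return set()
    name, *inputs = node
    used = {name}
    for item in inputs:
        used |= layers_used(item)
    return used


if __name__ == "__main__":
    print(sorted(layers_used(ds_cnn("frame", switches_on=False))))
    print(sorted(layers_used(ds_cnn("frame", switches_on=True))))
```

With the switches off, the output depends on 5 layers only. With the switches on, it depends on 9 layers in this sketch, which illustrates where the extra computational cost of DS-CNN-B comes from.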
Reference
[2017 ICME] [DS-CNN]
Decoder-side HEVC Quality Enhancement with Scalable Convolutional Neural Network