Reading: RRCNN — Recursive Residual Convolutional Neural Network (Coded Filtering)

Outperforms VRCNN, RHCNN & CNNF. 8.7% Average BD-Rate Reduction for Luma. More Than 20% Average BD-Rate Reductions for Chroma.

In this story, Recursive Residual Convolutional Neural Network (RRCNN), by Tianjin University and Santa Clara University, is presented. I read this paper because I work on video coding research. In this paper:

  • Recursive: The same set of weights is used recursively, so fewer parameters are needed.
  • A single model is trained for various bitrate settings.

Outline

  1. Residual Learning & Recursive Learning
  2. RRCNN: Multi-QP Network Architecture
  3. HEVC Implementation
  4. Experimental Results

1. Residual Learning & Recursive Learning

(a) VDSR (b) ResNet (c) RecNet (d) RRCNN
  • Also, an identity skip connection is attached across every few stacked layers, termed Internal Residual Learning (IRL).
  • Within the green dashed-line boxes, the convolutional layers in yellow and in green share the same weights, respectively, which keeps the number of model parameters from increasing with depth (see the sketch after this list).
Pre-Activation ResNet is Used
Average BD-PSNR (dB) & Number of Parameters
  • Better performance is achieved as the network goes deeper, since a deeper network has stronger learning ability.
  • Furthermore, RRCNN achieves the best performance at all depths and outperforms the second-best VDSR by 0.05dB (BD-PSNR) at a depth of 22 while utilizing 10 times fewer parameters, which verifies the effectiveness of the multi-path structure and recursive learning.
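To make the recursive weight sharing and the internal residual learning concrete, below is a minimal PyTorch-style sketch; the layer width, kernel size, number of recursions, and the pre-activation ordering are my own assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class RecursiveResidualBlock(nn.Module):
    """Minimal sketch of recursive learning with internal residual learning (IRL).

    The same pair of convolutional layers (conv_a, conv_b) is applied
    num_recursions times, so the parameter count does not grow with the
    effective depth; an identity skip connection wraps each recursion.
    Widths, kernel sizes and pre-activation ordering are illustrative
    assumptions, not the paper's exact configuration.
    """
    def __init__(self, channels=64, num_recursions=3):
        super().__init__()
        self.num_recursions = num_recursions
        # Shared weights: defined once, reused at every recursion.
        self.conv_a = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv_b = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = x
        for _ in range(self.num_recursions):
            # Pre-activation residual unit with an identity skip (IRL).
            res = self.conv_a(self.relu(out))
            res = self.conv_b(self.relu(res))
            out = out + res
        return out
```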

2. RRCNN: Multi-QP Network Architecture

RRCNN: Multi-QP Network Architecture for Luma
  • The luma patch and the QP map are first normalized to [0, 1] by min-max normalization (see the formulas after this list).
  • As the depth increases, the performance improves, but the gains become small and grow very slowly as the network goes even deeper.
  • The standard loss function is used (also written out after this list).
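For reference, min-max normalization of an input over its value range, and a mean-squared-error loss (my assumption for what "the standard loss function" refers to), can be written as:

```latex
% Min-max normalization of an input x (luma patch or QP map) to [0, 1]:
\[
  \hat{x} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}
\]

% Assuming the "standard loss" is the mean squared error between the network
% output F(y_i; \theta) for a compressed patch y_i and the original patch x_i:
\[
  L(\theta) = \frac{1}{N} \sum_{i=1}^{N} \big\| F(y_i; \theta) - x_i \big\|_2^2
\]
```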
RRCNN: Multi-QP Network Architecture for Chroma
  • Similar to the network for luma, but the luma component is downsampled to the same size as the chroma components before being input (a minimal input-assembly sketch follows this list).
  • The outputs are the U and V components.
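A minimal sketch of how the chroma-network input could be assembled is shown below; the 4:2:0 format, the average-pooling downsampler, the channel ordering, and the QP normalization by the maximum HEVC QP of 51 are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def build_chroma_input(y, u, v, qp_map):
    """Assemble the chroma-network input for one 4:2:0 patch.

    y:      luma patch,     shape (1, 1, H, W),     normalized to [0, 1]
    u, v:   chroma patches, shape (1, 1, H/2, W/2), normalized to [0, 1]
    qp_map: QP map at chroma resolution, shape (1, 1, H/2, W/2), normalized

    The downsampling filter (2x2 average pooling) and the channel order are
    assumptions; the paper only states that luma is downsampled to chroma
    size before being fed to the network.
    """
    y_ds = F.avg_pool2d(y, kernel_size=2)          # luma at chroma resolution
    return torch.cat([y_ds, u, v, qp_map], dim=1)  # 4-channel network input

# Example with random tensors standing in for a 64x64 luma patch (4:2:0):
y = torch.rand(1, 1, 64, 64)
u, v = torch.rand(1, 1, 32, 32), torch.rand(1, 1, 32, 32)
qp_map = torch.full((1, 1, 32, 32), 37 / 51.0)     # QP 37, normalized (assumption)
x = build_chroma_input(y, u, v, qp_map)            # -> shape (1, 4, 32, 32)
```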

3. HEVC Implementation

RRCNN Variants
  • (a) RRCNNF-I: For the first position, RRCNN is placed before DF and replaces both DF and SAO.
  • (b) RRCNNF-II: For the second position, RRCNN is placed after DF and replaces SAO.
  • (c) RRCNNF-III: For the third position, RRCNN is placed after SAO and is employed as an additional filter.
  • (d): A CTU-level control flag is added to let the encoder choose the better of DF+SAO or RRCNN for filtering each CTU (a minimal decision sketch follows this list).
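A minimal sketch of the CTU-level decision in variant (d) is given below; selecting purely by sum of squared errors is an assumption, since a real encoder would use a rate-distortion cost that also accounts for signaling the flag.

```python
def choose_ctu_filter(ctu_reconstructed, ctu_original, df_sao_filter, rrcnn_filter):
    """Hypothetical CTU-level choice between DF+SAO and RRCNN (variant (d)).

    The paper states that a CTU-level flag lets the encoder pick the better
    of the two filters; comparing sum-of-squared-error against the original
    CTU is an assumption made here for illustration.
    """
    cand_df_sao = df_sao_filter(ctu_reconstructed)
    cand_rrcnn = rrcnn_filter(ctu_reconstructed)

    sse_df_sao = ((cand_df_sao - ctu_original) ** 2).sum()
    sse_rrcnn = ((cand_rrcnn - ctu_original) ** 2).sum()

    use_rrcnn = sse_rrcnn < sse_df_sao          # flag signaled in the bitstream
    return (cand_rrcnn if use_rrcnn else cand_df_sao), use_rrcnn
```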

4. Experimental Results

4.1. Training

  • Uncompressed Colour Image Database (UCID), which consists of 1338 natural images, is used to generate the training data.
  • They are compressed by HM-16.16 using different QPs, with DF and SAO turned off (a hypothetical data-assembly sketch follows this list).
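Below is a hypothetical sketch of how multi-QP training samples could be assembled from such data; the patch size, stride, flat per-patch QP map, and normalization by the maximum HEVC QP of 51 are my assumptions, since the paper only states that UCID images are compressed at different QPs with DF and SAO disabled.

```python
import numpy as np

def make_training_triples(original, decoded_by_qp, patch=64, stride=64):
    """Hypothetical assembly of multi-QP training samples for the luma network.

    original:       H x W luma image (uncompressed UCID image), values in [0, 255]
    decoded_by_qp:  dict {qp: H x W luma image decoded from HM-16.16, DF/SAO off}
    """
    samples = []
    h, w = original.shape
    for qp, decoded in decoded_by_qp.items():
        qp_norm = qp / 51.0                       # min-max over the HEVC QP range [0, 51]
        for y in range(0, h - patch + 1, stride):
            for x in range(0, w - patch + 1, stride):
                inp = decoded[y:y + patch, x:x + patch] / 255.0
                tgt = original[y:y + patch, x:x + patch] / 255.0
                qp_map = np.full((patch, patch), qp_norm, dtype=np.float32)
                samples.append((inp.astype(np.float32), qp_map, tgt.astype(np.float32)))
    return samples
```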

4.2. BD-Rate

BD-Rate (%) on HEVC Test Sequences Under AI Configuration
BD-Rate (%) on HEVC Test Sequences Under RA, LDP, LDB Configurations
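As a refresher on the metric reported in these tables, below is a minimal NumPy sketch of the standard Bjøntegaard delta-rate computation (this is the common definition, not code from the paper); a negative value means the test codec needs fewer bits for the same quality.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Standard Bjøntegaard delta rate (%) of a test codec versus an anchor.

    Fit a cubic polynomial of log-rate as a function of PSNR for each RD
    curve, integrate both over the overlapping PSNR range, and convert the
    average log-rate difference into a percentage.
    """
    lr_anchor = np.log10(rate_anchor)
    lr_test = np.log10(rate_test)

    p_anchor = np.polyfit(psnr_anchor, lr_anchor, 3)
    p_test = np.polyfit(psnr_test, lr_test, 3)

    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))

    int_anchor = np.polyval(np.polyint(p_anchor), hi) - np.polyval(np.polyint(p_anchor), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)

    avg_diff = (int_test - int_anchor) / (hi - lo)
    return (10 ** avg_diff - 1) * 100.0
```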

4.3. Visual Quality

Left to Right: Ground-truth, No DF & SAO, DF & SAO, and RRCNN
  • RRCNN not only effectively removes blocking and ringing artifacts but also recovers the details, which leads to clearer images.

4.4. SOTA Comparison

BD-Rate (%) on HEVC Test Sequences Under AI Configuration

4.5. RD Curves

RD Curves

4.6. Different Positions for RRCNN

BD-Rate (%) on HEVC Test Sequences Under AI Configuration

4.7. QP Adaptivity

BD-Rate (%) on HEVC Test Sequences Under AI Configuration
BD-Rate (%) on HEVC Test Sequences Under AI Configuration
  • RRCNN-M: M stands for multi, i.e., one RRCNN trained for multiple QPs.
  • RRCNN-M suffers only a small loss compared with RRCNN-S (a separate model trained for each QP), while avoiding the need to save multiple models.

4.8. BD-Rate for Chroma

Average PSNR (dB) and BD-Rate (%) on HEVC Test Sequences Under AI Configuration
  • With the chroma model trained and tested on chroma, 20.5% and 21.3% BD-rate reductions are obtained for Cb and Cr, respectively.
BD-Rate (%) on HEVC Test Sequences Under AI Configuration

4.9. Computational Complexity

Computational Complexity
  • When running on CPU only, both the encoding time and the decoding time increase significantly.
