Reading: MGNLF — Multi-Gradient Convolutional Neural Network Based In-Loop Filter (VVC Filtering)

3.29% BD-Rate Reduction Compared With Conventional VVC, While VRCNN and CACNN-S Cannot Obtain Any BD-Rate Reduction

Sik-Ho Tsang
4 min read · Aug 2, 2020

In this story, Multi-Gradient Convolutional Neural Network Based In-Loop Filter For VVC (MGNLF), by Peking University, is presented. I read this because I work on video coding research. In this paper:

  • A multi-gradient convolutional neural network based in-loop filter (MGNLF) for VVC is proposed.
  • Divergence and second derivatives of the frame are utilized.

This is a paper in 2020 ICME. (Sik-Ho Tsang @ Medium)

Outline

  1. MGNLF: Network Architecture
  2. MGNLF: Loss Function
  3. Some Training Details
  4. Experimental Results

1. MGNLF: Network Architecture

MGNLF: Network Architecture
  • (a): The divergence reconstruction branch.
  • (b): The image reconstruction branch.
  • (c): The second derivative reconstruction branch.
  • First, the divergence D_I and the second derivative L_I of the input frame are obtained using the Sobel operator and the Laplace operator, which can be formulated as D_I = K_S * I and L_I = K_L * I, where I is the input frame, K_S and K_L denote the Sobel and Laplace kernels, and * denotes the convolution operation. (A minimal code sketch of the full network is given after this list.)
  • Afterwards, D_I and L_I serve as the inputs to the two separate residual-learning networks (a) and (c).
  • The structures of the three networks are the same.
  • Convolutional layers with 3×3 kernels and 64 feature maps are used. Each convolutional layer is followed by a LeakyReLU activation except the last layer.
  • Batch normalization is not used.
  • The outputs of (a) and (c) are denoted as D′_I and L′_I.
  • D′_I and L′_I are then each transformed by a convolutional layer with a 1×1 kernel and concatenated with the input-image feature map to form a feature map with 64 channels.
  • By doing so, the feature map preserves more of the detailed information present in the original image, which helps the image reconstruction.
  • Finally, the reconstruction network maps this fused input to the residual between the frame I and the ground truth.
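To make the data flow concrete, here is a minimal PyTorch sketch of the three-branch design for a single-channel (luma) input. The branch depth, the channel split at the fusion point, and the way the two Sobel responses are combined into one divergence map are my assumptions; those details live in the paper's figure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Fixed 3x3 kernels for the two gradient operators. The exact sign and
# normalization conventions are assumptions; the paper only names the
# Sobel and Laplace operators.
SOBEL_X = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3)
SOBEL_Y = SOBEL_X.transpose(2, 3)
LAPLACE = torch.tensor([[0.,  1., 0.],
                        [1., -4., 1.],
                        [0.,  1., 0.]]).view(1, 1, 3, 3)

def divergence(x):
    # D_I: summing the horizontal and vertical Sobel responses into one
    # map is an assumption.
    return (F.conv2d(x, SOBEL_X.to(x.device), padding=1)
            + F.conv2d(x, SOBEL_Y.to(x.device), padding=1))

def second_derivative(x):
    # L_I = Laplace kernel convolved with the frame.
    return F.conv2d(x, LAPLACE.to(x.device), padding=1)

class Branch(nn.Module):
    """One residual-learning branch: 3x3 convs with 64 feature maps,
    LeakyReLU after every layer except the last, no batch norm.
    The depth (8 layers here) is an assumption."""
    def __init__(self, depth=8):
        super().__init__()
        layers = [nn.Conv2d(1, 64, 3, padding=1), nn.LeakyReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(64, 64, 3, padding=1),
                       nn.LeakyReLU(inplace=True)]
        layers += [nn.Conv2d(64, 1, 3, padding=1)]  # last layer: no activation
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)  # residual learning

class MGNLF(nn.Module):
    def __init__(self):
        super().__init__()
        self.div_branch = Branch()  # (a) divergence reconstruction
        self.lap_branch = Branch()  # (c) second-derivative reconstruction
        # 1x1 transforms before fusion; the 32/16/16 channel split that
        # concatenates to 64 channels is an assumption.
        self.img_feat = nn.Conv2d(1, 32, 3, padding=1)
        self.div_1x1 = nn.Conv2d(1, 16, 1)
        self.lap_1x1 = nn.Conv2d(1, 16, 1)
        # (b) image reconstruction network: maps the fused 64-channel
        # features to the residual between the frame and the ground truth.
        self.recon = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.LeakyReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.LeakyReLU(inplace=True),
            nn.Conv2d(64, 1, 3, padding=1))

    def forward(self, x):  # x: (N, 1, H, W) reconstructed luma frame
        d = self.div_branch(divergence(x))         # D'_I
        l = self.lap_branch(second_derivative(x))  # L'_I
        feats = torch.cat([self.img_feat(x),
                           self.div_1x1(d),
                           self.lap_1x1(l)], dim=1)
        return x + self.recon(feats)               # filtered frame
```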

2. MGNLF: Loss Function

  • The loss function is L = L_R + λ·L_E, where L_R is the MSE loss of the reconstructed image and L_E is the enhancement loss for the divergence and the second derivative (see the sketch below).
  • λ is a weighting factor tuned by experiments.
PSNR Against Training Steps for Different λ Values
  • It is found that λ = 0.1 gives the best results.
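A hedged sketch of this objective, reusing the divergence and second_derivative helpers from the architecture sketch above. The exact form of L_E is an assumption; here it is taken as the MSE between the gradient maps of the prediction and of the ground truth.

```python
import torch.nn.functional as F

def mgnlf_loss(pred, target, lam=0.1):
    # L_R: MSE between the filtered frame and the ground truth.
    l_r = F.mse_loss(pred, target)
    # L_E: enhancement loss on the divergence and second derivative.
    # Computing it as MSE between gradient maps of prediction and
    # ground truth is an assumption about its exact form.
    l_e = (F.mse_loss(divergence(pred), divergence(target))
           + F.mse_loss(second_derivative(pred), second_derivative(target)))
    return l_r + lam * l_e  # lambda = 0.1 performed best in the paper
```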

3. Some Training Details

  • DIV2K 800 images are used for training.
  • VTM 3.0 under the All-Intra (AI) configuration is used to compress the images and generate training pairs, with QPs 22, 27, 32 and 37.
  • Filters DBF, SAO and ALF are disabled when compressing these sequences.
  • The frames are cropped into 64×64 patches, yielding 120K blocks for each QP; blocks whose PSNR already exceeds 50 dB are removed, and 50,000 blocks are randomly selected for training and 1,000 for validation (a small data-preparation sketch follows this list).
  • The model for QP = 37 is trained first and is then used to initialize the training of the networks for the smaller QPs.
  • In the coding loop, the trained model replaces DBF and SAO.
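A small NumPy sketch of the block-selection step described above; the function names and the 8-bit peak value are illustrative assumptions.

```python
import numpy as np

def psnr(a, b, peak=255.0):
    # PSNR in dB between a reconstructed block and its ground truth.
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def select_blocks(rec_blocks, gt_blocks, n_train=50_000, n_val=1_000, seed=0):
    # Drop 64x64 blocks that are already above 50 dB (little left to
    # restore), then randomly split the remainder into train/validation.
    keep = np.array([i for i, (r, g) in enumerate(zip(rec_blocks, gt_blocks))
                     if psnr(r, g) <= 50.0])
    rng = np.random.default_rng(seed)
    rng.shuffle(keep)
    return keep[:n_train], keep[n_train:n_train + n_val]
```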

4. Experimental Results

4.1. Prior Arts

BD-Rates (%)
  • It is found that MGNLF achieves a 3.29% BD-rate reduction over conventional VVC, while VRCNN and CACNN-S cannot obtain any BD-rate reduction.
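For readers unfamiliar with the metric: BD-rate reports the average bitrate difference (in %) between two RD curves at equal quality, so a negative value means a bitrate saving. A minimal NumPy version of the standard Bjøntegaard computation (cubic fit of log-rate over PSNR):

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    # Fit log10(rate) as a cubic polynomial in PSNR for both codecs.
    p_a = np.polyfit(psnr_anchor, np.log10(rate_anchor), 3)
    p_t = np.polyfit(psnr_test, np.log10(rate_test), 3)
    # Integrate both fits over the overlapping PSNR interval.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    # Average log-rate gap -> percentage bitrate difference.
    avg_diff = (int_t - int_a) / (hi - lo)
    return (10.0 ** avg_diff - 1.0) * 100.0  # negative = bitrate saving
```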

4.2. Ablation Study

Model Variants
  • The multi-gradient network is compared with a single-gradient variant using the Sobel operator, a single-gradient variant using the Laplace operator, and a no-gradient variant.
  • The multi-gradient network shows the best restoration ability, demonstrating that multiple gradients capture finer details and lead to a performance improvement.

4.3. SOTA Comparisons

BD-Rates (%)
  • MGNLF obtains the largest BD-rate reduction compared with the other three approaches submitted for standardization.

4.4. RD Curves

RD Curves
  • The proposed approach performs better under low-bit-rate conditions.

4.5. Subjective Quality

Subjective Quality (a) GT, (b) DRNLF, (c) Proposed MGNLF
  • From the enlarged region of the sequence BasketballDrill, it can be observed that the floor texture and the straight lines remain severely blurred when compressed with DRNLF. In contrast, they become much clearer after being enhanced by MGNLF.

This is the 5th story in this month.
