Reading: CNNF — Convolutional Neural Network Filter (Codec Filtering)

3.14%, 5.21% and 6.28% BD-Rate Savings for Luma and the Two Chroma Components Respectively Under the All Intra (AI) Configuration

In this paper, Convolutional Neural Network Filter (CNNF), by Hikvision Research Institute, is presented. I read this because I work on video coding research. Prior CNN-based approaches to in-loop filtering have a few problems:

  • Floating-point operations are used, which leads to inconsistency between encoding and decoding across different platforms.
  • Redundancy within the CNN model consumes precious computational resources.

CNNF addresses these issues:

  • The obtained model is compressed to reduce redundancy.
  • To ensure consistency across platforms, dynamic fixed point (DFP) arithmetic is adopted when testing the CNN.

Outline

  1. CNNF: Network Architecture
  2. Model Compression
  3. Dynamic Fixed Point (DFP) Inference
  4. Experimental Results

1. CNNF: Network Architecture

CNNF: Network Architecture
  • Both inputs (the reconstruction and the QP map) are normalized to [0,1] for better convergence.
  • A simple CNN with 8 convolution layers and residual learning is used, where the number of filters KL is set to 64.
  • The network is similar to VDSR, but with an additional QP input, batch normalization (BN), and a shallower depth.
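The structure described above can be sketched in plain NumPy. This is my own reconstruction for illustration, not the authors' code: BN is omitted, the parameters are placeholders, and only the layer shapes (two input channels, 8 conv layers, residual output) follow the bullets.

```python
import numpy as np

def conv2d_same(x, w, b):
    """3x3 'same' convolution. x: (C_in, H, W); w: (C_out, C_in, 3, 3); b: (C_out,)."""
    c_out, c_in, kh, kw = w.shape
    pad = kh // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, H, W = x.shape
    y = np.zeros((c_out, H, W))
    for o in range(c_out):
        for i in range(c_in):
            for dy in range(kh):
                for dx in range(kw):
                    y[o] += w[o, i, dy, dx] * xp[i, dy:dy + H, dx:dx + W]
        y[o] += b[o]
    return y

def cnnf_forward(recon, qp_map, layers):
    """recon, qp_map: (H, W) arrays normalized to [0, 1].
    layers: list of (w, b) for the 8 conv layers; the last layer outputs 1 channel.
    Residual learning: the network predicts a correction added to the input."""
    x = np.stack([recon, qp_map])          # two input channels
    for idx, (w, b) in enumerate(layers):
        x = conv2d_same(x, w, b)
        if idx < len(layers) - 1:          # no ReLU after the last layer
            x = np.maximum(x, 0.0)         # (BN omitted in this sketch)
    return recon + x[0]                    # residual connection

def zero_layers(K=64, depth=8):
    """Placeholder (all-zero) parameters with the CNNF layer shapes."""
    shapes = [(K, 2)] + [(K, K)] * (depth - 2) + [(1, K)]
    return [(np.zeros((co, ci, 3, 3)), np.zeros(co)) for co, ci in shapes]
```

With all-zero weights the predicted residual is zero and the filter returns its input unchanged, which makes the residual structure easy to verify.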

2. Model Compression

  • For efficient compression, the loss function, Loss, includes two additional regularizers:
  • λw, λs and λlda are set to 1e-5, 5e-8 and 3e-6, respectively.
  • S denotes the scale parameters in the BN layers. With the first additional regularizer, the learned scale parameters in the BN layers tend towards zero, so the corresponding filters can be pruned.
  • The second additional regularizer, i.e. the linear discriminant analysis (LDA) term, makes the learned parameters friendly to the subsequent low-rank approximation.
  • Then singular value decomposition (SVD) is applied for low-rank approximation. After that, filters are reconstructed using a much smaller basis.
Compressed filter number for each convolution layer
  • Experimental results report that performance only changes by about -0.08%, -0.19% and 0.25% on average for the Y, U and V components of classes B, C, D and E on JEM 7.0.
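The SVD step can be sketched as follows. This is a generic low-rank factorization of a conv layer, assuming (as is standard) that each layer's filters are flattened into a matrix before decomposition; the paper's exact reconstruction details may differ.

```python
import numpy as np

def svd_compress(w, rank):
    """Low-rank approximation of conv filters w: (C_out, C_in, k, k).
    Returns `rank` basis filters plus a (C_out, rank) mixing matrix,
    i.e. the original layer split into a small conv followed by a 1x1 conv."""
    c_out, c_in, kh, kw = w.shape
    m = w.reshape(c_out, -1)                        # (C_out, C_in*k*k)
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    basis = vt[:rank].reshape(rank, c_in, kh, kw)   # rank "basis" filters
    mix = u[:, :rank] * s[:rank]                    # combination weights
    return basis, mix

def svd_reconstruct(basis, mix):
    """Rebuild the (approximate) original filters from the factorization."""
    rank = basis.shape[0]
    return (mix @ basis.reshape(rank, -1)).reshape(mix.shape[0], *basis.shape[1:])
```

The parameter count drops from C_out·C_in·k² to rank·(C_in·k² + C_out), which is where the computational savings come from when the retained rank is small.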

3. Dynamic Fixed Point (DFP) Inference

  • A value V in dynamic fixed point is described by: V = (-1)^s · 2^(-FL) · Σᵢ 2^i · xᵢ
  • Here s is the sign bit, FL is the fractional length and xᵢ are the mantissa binary bits.
  • Each floating-point value among the model parameters and outputs is quantized and clipped to convert it to DFP.
  • First, the bit widths for weights, Bw, and biases, Bb, are set to 8 and 32, respectively.
  • For layer outputs, the bit width is set to 16.
  • Each group in the same layer shares one common FL, which is estimated from available training data and layer parameters.
Estimated FL for each convolution layer
  • Since CPUs and GPUs do not natively support DFP, it is simulated with floating point, similar to [10].
  • With shorter fractional length, computation can be saved.

4. Experimental Results

4.1. Training

  • Training data: Visual Genome (VG) [17], DIV2K [18] and ILSVRC2012 [19].
  • Each image is intra-encoded at QPs 22, 27, 32 and 37 on JEM 7.0 with BF, DF, SAO and ALF turned off, and 35×35 patches are extracted.
  • Batch size M is set to 64.
  • 3.6 million training samples are generated, including 600 thousand luma patches and 300 thousand chroma patches for each QP.
  • Training is stopped after 32 epochs.
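The training-pair generation can be sketched as below. The helper is hypothetical: in the paper the degraded input comes from JEM 7.0 intra coding, which is outside the scope of this snippet, so `recon` is simply assumed to be given.

```python
import numpy as np

def extract_patch_pairs(orig, recon, patch=35, stride=35):
    """Cut aligned (reconstructed input, pristine label) patch pairs,
    following the 35x35 patch size from the paper."""
    H, W = orig.shape
    pairs = []
    for y in range(0, H - patch + 1, stride):
        for x in range(0, W - patch + 1, stride):
            pairs.append((recon[y:y + patch, x:x + patch],
                          orig[y:y + patch, x:x + patch]))
    return pairs
```

The network then learns to map each reconstructed patch back towards its pristine original.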

4.2. QP-Independent vs QP-Dependent

BD-Rate (%) on Test Sequences
  • CNNF obtains a 3.99% BD-rate reduction, which is close to ‘Best’.
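For readers unfamiliar with the metric used throughout these tables: BD-rate is the average bitrate change at equal quality, conventionally computed by fitting log-rate as a cubic polynomial in PSNR for both codecs and integrating over the common PSNR range. A standard NumPy sketch (a common formulation, not taken from the paper):

```python
import numpy as np

def bd_rate(psnr_ref, rate_ref, psnr_test, rate_test):
    """Bjontegaard delta rate: average % bitrate change vs the reference
    at equal PSNR. Negative means the test codec saves bitrate."""
    p_ref = np.polyfit(psnr_ref, np.log10(rate_ref), 3)
    p_test = np.polyfit(psnr_test, np.log10(rate_test), 3)
    lo = max(min(psnr_ref), min(psnr_test))    # common PSNR range
    hi = min(max(psnr_ref), max(psnr_test))
    P_ref, P_test = np.polyint(p_ref), np.polyint(p_test)
    int_ref = np.polyval(P_ref, hi) - np.polyval(P_ref, lo)
    int_test = np.polyval(P_test, hi) - np.polyval(P_test, lo)
    avg_log_diff = (int_test - int_ref) / (hi - lo)
    return (10 ** avg_log_diff - 1) * 100
```

With four rate points (e.g. QPs 22, 27, 32, 37) the cubic fit interpolates the data exactly; halving every bitrate at identical PSNR yields exactly -50%.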

4.3. BD-Rate Under AI Configuration

BD-Rate (%) on Test Sequences
Visual quality comparison on test sequences (figures in this sub-section)

4.4. BD-Rate When ALF On

BD-Rate (%) on Test Sequences Under AI Configuration
BD-Rate (%) on Test Sequences Under RA Configuration
  • 3.57%, 6.17% and 7.06% average gains are observed with AI configuration.
  • Though only applied to intra frames, CNNF achieves 1.23%, 3.65% and 3.88% gains with RA configuration.

4.5. Complexity

  • With GPU, the EncT decreases and DecT increases a little.
  • Even when testing with CPU, the EncT only increases a little.
  • Though DecT is extremely high on CPU, we do believe that with the development of deep-learning-specific hardware it will not be a problem.

PhD, Researcher. I share what I've learnt and done. :) My LinkedIn: https://www.linkedin.com/in/sh-tsang/, My Paper Reading List: https://bit.ly/33TDhxG