Review: CNNAC TCSVT’19 —Convolutional Neural Network-Based Arithmetic Coding (HEVC Intra)

Using DenseNet Blocks, 4.7% Average BD-Rate Reduction, Up to 6.7% BD-Rate Reduction

Sik-Ho Tsang
3 min readJun 21, 2020

In this story, Convolutional Neural Network-Based Arithmetic Coding (CNNAC), by University of Science and Technology of China, Microsoft Research Asia, and University of Missouri-Kansas City, is briefly presented. I read this because I work on video coding research. In

  • CNNAC is firstly published in 2018 ICIP to encode DC coefficients.
  • In this paper, CNNAC is extended to encode AC coefficients as well.

This is an early access article in 2019 TCSVT where TCSVT has a high impact factor of 4.046. (Sik-Ho Tsang @ Medium)

Outline

  1. CNNAC: Overall Scheme
  2. CNNAC: Network Architecture
  3. Experimental Results

1. CNNAC: Overall Scheme

CNNAC: Overall Scheme
  • Three neighboring reconstructed blocks (left, above, and left-above) that are the same size as the current TU are cropped and used as input.
  • For LastXY syntax element (SE), no other inputs since it is the first SE to be coded.
  • For DC and AC, the current reconstructed block generated with the encoded coefficients are also used as input.
  • Only SEs whose percentages of bits exceeds 3.0% are considered to use CNNAC. This includes LastXY, DC, AC1, AC2, AC3, AC4 and AC5.
  • Other SEs only use the conventional CABAC in HEVC.

2. CNNAC: Network Architecture

  • Standard DenseNet is used as shown in the above figure.
  • But different SE has different DenseNet structure.
  • For TU sizes 4×4, 8×8, 16×16, and 32×32, the layer number in DenseNet is 27, 40, 53, and 66, including 2, 3, 4, and 5 dense blocks, respectively.
  • In each dense block, 12 layers are included.

3. Experimental Results

3.1. BD-Rate

BD-Rate (%) on HEVC Test Sequences
  • 4.7% average BD-rate reduction is achieved by CNNAC.
BD-Rate (%) on HEVC Test Sequences
  • 1.7%, 1.0% and 0.9% average BD-rate reductions are achieved by CNNAC for DC, AC1 and LastXY respectively.

3.2. RD Curves

RD Curves
  • CNNAC works more efficiently at high bitrate than at low bitrate.

3.3. Computational Complexity

Computational Complexity
  • The complexity is extremely high even using GPU. Large amount of CNN inferences are needed for each block/CU.
Computational Complexity
  • Authors try to use FCN to reduce the complexity. But the performance is also degraded.

There are large amount of experiments to talk about the bit savings for coding syntax elements. If interested, please read the paper.

This is the 31st story in this month!

--

--

Sik-Ho Tsang
Sik-Ho Tsang

Written by Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.

No responses yet