Review: CNNAC — CNN-based Arithmetic Coding for DC Coefficients (HEVC Intra Coding)

DenseNet-like Network, Average 22.47% Bits Saving for DC Coefficients, Average 1.6% BD-Rate Reduction

Outline

  1. DC Coefficients
  2. Network Architecture
  3. Experimental Results

1. DC Coefficients

1.1. DC & AC Coefficients

The DC coefficient is at the top-left corner after the DCT
  • In HEVC, DC coefficients of intra-predicted residues are encoded as part of the overall transform coefficient coding scheme.
  • After the DCT, the DC coefficient sits at the top-left corner of the block and usually has a large value to be coded. The remaining entries are AC coefficients, which usually have relatively small values and represent higher-frequency components of the image block toward the bottom-right corner.
  • Specifically, for a transform unit (TU), a flag is first coded to indicate whether the quantized TU contains any non-zero coefficient.
  • If the flag is true, the last non-zero coefficient position, the locations of the non-zero coefficients, and the coefficient levels and signs are encoded in succession.
  • Otherwise, the encoding of the TU is finished, since there are no non-zero coefficients.
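
To make the DC/AC split concrete, here is a minimal sketch of a floating-point orthonormal 2-D DCT in plain Python. It is only an illustration: HEVC itself uses integer approximations of the DCT, and the 8×8 block below is a made-up example, not data from the paper.

```python
import math

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # 1-D DCT-II basis: basis[k][i] = scale(k) * cos(pi * (2i + 1) * k / (2n))
    basis = [[scale(k) * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
              for i in range(n)] for k in range(n)]

    # Separable transform: coeffs = basis * block * basis^T
    rows = [[sum(basis[k][i] * block[i][j] for i in range(n)) for j in range(n)]
            for k in range(n)]
    return [[sum(rows[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
            for u in range(n)]

# Hypothetical 8x8 block: a flat offset of 10 plus a gentle horizontal ramp.
block = [[10 + j for j in range(8)] for _ in range(8)]
coeffs = dct_2d(block)

dc = coeffs[0][0]            # top-left entry: block sum divided by 8
first_ac = coeffs[0][1]      # lowest horizontal-frequency AC coefficient
```

For an orthonormal 8×8 DCT the DC term equals the sum of all samples divided by 8, which is why it tends to dominate the smaller AC terms.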

1.2. Encoding of DC Coefficients

  • The syntax elements for recording the DC coefficient are:
  • significant_coeff_flag: whether the DC coefficient is non-zero,
  • coeff_abs_level_greater1_flag: whether the absolute DC coefficient value is larger than 1,
  • coeff_abs_level_greater2_flag: whether the absolute DC coefficient value is larger than 2,
  • coeff_abs_level_remaining: the absolute value minus 2, and
  • coeff_sign_flag: the sign of the DC coefficient.
  • significant_coeff_flag, coeff_abs_level_greater1_flag and coeff_abs_level_greater2_flag are encoded in regular mode.
  • coeff_abs_level_remaining and coeff_sign_flag are encoded in bypass mode.
  • (If interested, please read Section 1 in Song VCIP’17 for regular mode and bypass mode.)
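
As a rough illustration of the list above, the sketch below splits a quantized DC level into these syntax elements and rebuilds it. It is a simplification, not the HM implementation: it assumes all flags are signalled for the DC coefficient, in which case the HEVC residual coding syntax takes the remaining value relative to a base level of 1 + greater1 + greater2. When the greater2 flag is not signalled, the base level is 2, which matches the "minus 2" in the list. The function names are hypothetical.

```python
def dc_syntax_elements(level):
    """Split a quantized DC level into the flags listed above (simplified)."""
    magnitude = abs(level)
    elements = {"significant_coeff_flag": int(magnitude != 0)}
    if magnitude == 0:
        return elements  # nothing else is coded for a zero coefficient

    gt1 = int(magnitude > 1)
    elements["coeff_abs_level_greater1_flag"] = gt1
    gt2 = 0
    if gt1:
        gt2 = int(magnitude > 2)
        elements["coeff_abs_level_greater2_flag"] = gt2
    if gt2:
        # Assumption: both greater-than flags were signalled, so base level is 3.
        base_level = 1 + gt1 + gt2
        elements["coeff_abs_level_remaining"] = magnitude - base_level
    elements["coeff_sign_flag"] = int(level < 0)
    return elements

def dc_level_from_elements(elements):
    """Inverse of dc_syntax_elements."""
    if not elements["significant_coeff_flag"]:
        return 0
    base_level = (1 + elements.get("coeff_abs_level_greater1_flag", 0)
                  + elements.get("coeff_abs_level_greater2_flag", 0))
    magnitude = base_level + elements.get("coeff_abs_level_remaining", 0)
    return -magnitude if elements["coeff_sign_flag"] else magnitude

example = dc_syntax_elements(-7)
```

Here `example` holds all five elements with a remaining value of 4 and a sign flag of 1, while a level of 1 needs only the significance flag, the greater1 flag and the sign.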

2. Network Architecture

2.1. Flowchart

Flowchart of CNNAC
  • Instead of coding the many syntax elements mentioned in Section 1, a CNN is used to predict the probability distribution of the DC coefficient.
  • Then, the DC coefficient value, together with the estimated probability distribution, is fed into a multi-level arithmetic codec to perform entropy coding.
  • This approach is similar to Song VCIP’17. However, Song VCIP’17 encodes the intra prediction mode, whereas CNNAC encodes the DC coefficient of each TU.
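
Why a better probability estimate saves bits can be seen from the ideal cost of arithmetic coding, which is about -log2(p) bits for a symbol of probability p. The sketch below compares a uniform guess with a peaked one. Both distributions are invented for illustration and are not outputs of the CNNAC network.

```python
import math

def ideal_bits(probabilities, symbol):
    """Ideal arithmetic-coding cost, in bits, of `symbol` under `probabilities`."""
    return -math.log2(probabilities[symbol])

# Hypothetical distributions over five candidate DC values.
uniform = {v: 0.2 for v in (-2, -1, 0, 1, 2)}
peaked = {-2: 0.05, -1: 0.10, 0: 0.10, 1: 0.70, 2: 0.05}

bits_uniform = ideal_bits(uniform, 1)   # log2(5), about 2.32 bits
bits_peaked = ideal_bits(peaked, 1)     # -log2(0.7), about 0.51 bits
```

The more probability the predictor puts on the value that actually occurs, the fewer bits the arithmetic codec spends on it.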

2.2. Network

CNN Structure
  • A DenseNet-like network built from dense blocks is used.
  • Between dense blocks, a transition layer down-samples the feature maps.
  • At the end of the last dense block, a softmax layer is attached to predict the probability of every candidate DC value.
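
The shape bookkeeping of such a network can be sketched without any deep learning library. All sizes below (input channels, growth rate, layers per block, channel compression) are hypothetical, since this review does not list the paper's exact configuration. The transition follows the common DenseNet pattern of halving the spatial size.

```python
import math

def dense_block_channels(in_channels, num_layers, growth_rate):
    """Channels after a dense block: every layer's output is concatenated
    onto its input, so the width grows by `growth_rate` per layer."""
    return in_channels + num_layers * growth_rate

def transition(channels, height, width, compression=0.5):
    """Transition layer: compress channels (assumed factor) and halve the spatial size."""
    return int(channels * compression), height // 2, width // 2

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical configuration: 16 input channels on an 8x8 feature map.
channels, height, width = 16, 8, 8
channels = dense_block_channels(channels, num_layers=4, growth_rate=12)   # 16 + 48 = 64
channels, height, width = transition(channels, height, width)             # 32 channels, 4x4
channels = dense_block_channels(channels, num_layers=4, growth_rate=12)   # 32 + 48 = 80

# The final scores (one per candidate DC value) become probabilities.
probs = softmax([0.5, 2.0, -1.0])
```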

2.3. Synthesized Images

Two synthetic images used to calculate the minimal and maximal values of DC coefficients in HEVC intra coding
  • In video coding, a quantization parameter (QP) controls the bitrate: a higher QP gives a lower bitrate, and vice versa.
  • Different QPs give different ranges of DC coefficient values. Before actually coding DC coefficients with CNNAC, two synthetic images are coded at different QPs to calculate the minimal and maximal possible DC values. The softmax layer in the CNN is then sized to match the range of possible DC values.
  • The two synthetic images are composed of white and black values, as shown above.
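
One simple way to tie the softmax size to the DC range is to give every integer in [min, max] its own class. The sketch below does this. The range of -60 to 60 is a placeholder, not a value from the paper, and the helper name is hypothetical.

```python
def make_dc_index_mapping(dc_min, dc_max):
    """Map DC values in [dc_min, dc_max] to softmax class indices and back."""
    num_classes = dc_max - dc_min + 1

    def to_index(dc_value):
        if not dc_min <= dc_value <= dc_max:
            raise ValueError("DC value outside the range measured for this QP")
        return dc_value - dc_min

    def to_value(index):
        return index + dc_min

    return num_classes, to_index, to_value

# Hypothetical range for one QP.
num_classes, to_index, to_value = make_dc_index_mapping(-60, 60)
```

With this placeholder range the softmax layer would need 121 outputs, and a separate range (and output size) would be measured for each QP.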

2.4. Training Data

  • Uncompressed Color Image Database (UCID) and DIV2K are used to prepare the training data.
  • Specifically, 40 DIV2K images and 120 UCID images are compressed to generate training data.
  • Then 1,000,000 8×8 blocks are used as training data and 50,000 blocks as validation data; both sets are randomly selected for different QPs.
  • The above network is only used for 8×8 TUs.
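
The random selection of blocks can be sketched as follows. This is an assumption-heavy toy: it places 8×8 blocks on a regular grid of a single hypothetical 64×96 image, whereas the paper draws its blocks from HEVC-compressed DIV2K and UCID images. The counts are scaled down but keep the same 20:1 train/validation ratio.

```python
import random

def block_positions(height, width, size=8):
    """Top-left corners of all non-overlapping size x size blocks in an image."""
    return [(y, x)
            for y in range(0, height - size + 1, size)
            for x in range(0, width - size + 1, size)]

def split_blocks(positions, num_train, num_val, seed=0):
    """Randomly pick disjoint training and validation blocks."""
    rng = random.Random(seed)
    chosen = rng.sample(positions, num_train + num_val)
    return chosen[:num_train], chosen[num_train:]

positions = block_positions(64, 96)                       # 8 * 12 = 96 blocks
train, val = split_blocks(positions, num_train=80, num_val=4)
```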

3. Experimental Results

Bits saving for DC coefficients using CNNAC
  • CNNAC obtains an average 22.47% bits saving for DC coefficients compared to the conventional HEVC HM-12.0.
BD-rate (%) using CNNAC
  • An average BD-rate reduction of 1.6% is achieved.
  • R-D performance is better at lower bit rates.
  • At lower bit rates, DC coefficients account for a larger share of the bits among all syntax elements, so a saving on DC bits translates into a larger overall gain.
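
A back-of-the-envelope sketch of this argument: if only the DC bits shrink, the overall saving is roughly the DC share of the bitstream times the DC saving. The 22.47% figure is the average reported above. The two DC shares are invented for illustration, and raw bits saving is not the same measure as BD-rate.

```python
def overall_bits_saving(dc_share, dc_saving=0.2247):
    """Fraction of total bits saved when only the DC bits shrink by `dc_saving`."""
    return dc_share * dc_saving

# Hypothetical DC shares of the total bitstream.
low_rate = overall_bits_saving(dc_share=0.10)    # high QP: DC is a larger share
high_rate = overall_bits_saving(dc_share=0.03)   # low QP: DC is a smaller share
```

Under these assumed shares the overall saving is about 2.2% at the low rate and about 0.7% at the high rate, which is the trend described above.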
