Reading: Lin DCC’20 — CNN Based Fast Intra Mode Prediction for H.266/FVC Video Coding (Fast VVC)

To Reduce the Computational Complexity of Intra Coding

Jul 31, 2020

In this story, Convolutional Neural Network Based Fast Intra Mode Prediction for H.266/FVC Video Coding (Lin DCC’20), by National Taipei University of Technology and National Central University, is briefly presented. I read this because I work on video coding research.

The next-generation video compression standard H.266/Future Video Coding (FVC) provides high compression efficiency, but at the cost of high complexity in computing the optimal intra mode from 67 candidate modes. Studies have therefore focused on the intra mode decision to reduce the computational complexity of intra coding.

VVC was later developed based on FVC.

This paper was published in 2020 DCC. (Sik-Ho Tsang @ Medium)

(Since this is only a 1-page paper, few details are available. The network architecture and the detailed results are taken from the authors’ poster.)

Outline

Network Architecture
Overall Approach
Experimental Results

1. Network Architecture

The CNN architecture comprises two convolutional layers and a fully connected layer.
The input, which includes the neighboring pixels, is fed into the network.
The output is a 67-dimensional vector for the 67 intra prediction modes.
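
As a rough illustration of the structure described above, the sketch below runs a forward pass through two convolutional layers and one fully connected layer in plain Python. Only the layer count and the 67-way output come from the text. The 17 × 17 input (a 16 × 16 block plus one row and one column of neighboring pixels), the 3 × 3 kernels, the channel counts, the ReLU and softmax, the random weights, and all function names are assumptions made for illustration, not details from the paper or the poster.

```python
import math
import random

NUM_MODES = 67  # number of intra prediction modes (from the text)

def conv2d_relu(x, kernels, bias):
    """Valid 2-D convolution followed by ReLU.
    x: [C_in][H][W], kernels: [C_out][C_in][k][k], bias: [C_out]."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0][0])
    out = []
    for oc, kern in enumerate(kernels):
        plane = []
        for i in range(h - k + 1):
            row = []
            for j in range(w - k + 1):
                s = bias[oc]
                for ic in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            s += x[ic][i + di][j + dj] * kern[ic][di][dj]
                row.append(max(s, 0.0))
            plane.append(row)
        out.append(plane)
    return out

def dense(vec, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def softmax(z):
    m = max(z)
    e = [math.exp(t - m) for t in z]
    total = sum(e)
    return [t / total for t in e]

def rand_kernels(rng, c_out, c_in, k):
    return [[[[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)]
             for _ in range(c_in)] for _ in range(c_out)]

def build_model(seed=0, size=17, c1=4, c2=8, k=3):
    """Random, untrained weights; every hyperparameter here is assumed."""
    rng = random.Random(seed)
    side = size - 2 * (k - 1)      # spatial size after two valid convolutions
    flat = c2 * side * side
    return {
        "conv1": (rand_kernels(rng, c1, 1, k), [0.0] * c1),
        "conv2": (rand_kernels(rng, c2, c1, k), [0.0] * c2),
        "fc": ([[rng.uniform(-0.1, 0.1) for _ in range(flat)]
                for _ in range(NUM_MODES)], [0.0] * NUM_MODES),
    }

def predict_mode_probs(model, block):
    """block: 2-D list of luma samples (block plus neighbors), one channel."""
    x = conv2d_relu([block], *model["conv1"])
    x = conv2d_relu(x, *model["conv2"])
    flat = [v for plane in x for row in plane for v in row]
    return softmax(dense(flat, *model["fc"]))
```

Calling `predict_mode_probs(build_model(), block)` with a 17 × 17 list of samples returns 67 scores that sum to 1, one per intra mode. With trained weights, these scores would be what the mode selection step consumes.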

2. Overall Approach

**Overview flowchart of proposed method.**

The deep learning approach was tested only on 16 × 16 blocks, using JEM 7.0.
Based on the CNN output, the top 5 modes are selected to go through the full rate-distortion optimization (RDO) process.
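
The selection step can be sketched as follows: keep the 5 highest-scoring modes from the 67-dimensional CNN output, then evaluate only those with the full RDO. The names `top_k_modes` and `choose_intra_mode` are hypothetical, and `rd_cost` is a placeholder callback standing in for the encoder's real rate-distortion cost, which lives inside JEM and is not reproduced here.

```python
import heapq

def top_k_modes(scores, k=5):
    """Indices of the k highest-scoring modes, best first."""
    return heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)

def choose_intra_mode(scores, rd_cost, k=5):
    """Run the (placeholder) full RDO on the top-k candidates only
    and return the candidate with the lowest cost."""
    candidates = top_k_modes(scores, k)
    return min(candidates, key=rd_cost)
```

The point of the design is that the expensive cost function is called 5 times per block instead of once for each of the 67 modes.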

3. Experimental Results

**BD-PSNR, BDBR and Time Difference Compared to JEM-7.0 Without Fast Search**

doFastSearch: The default SATD-based fast search.
Compared with the default fast search method doFastSearch in JEM 7.0, the proposed method achieves an average 0.033% decrease in Bjøntegaard delta bit rate (BDBR) with only a slight increase in time.
Furthermore, the gain is larger for videos of moderate frame size, such as classes B, C, D and E, where the proposed method achieves a 0.097% decrease in BDBR over the default method.

This is the 28th story in this month.

References

[2020 DCC] [Lin DCC’20]
Convolutional Neural Network Based Fast Intra Mode Prediction for H.266/FVC Video Coding

And Corresponding Poster

Codec Fast Prediction

H.264 to HEVC [Wei VCIP’17] [H-LSTM]
HEVC [Yu ICIP’15 / Liu ISCAS’16 / Liu TIP’16] [Laude PCS’16] [Li ICME’17] [Katayama ICICT’18] [Chang DCC’18] [ETH-CNN & ETH-LSTM] [Zhang RCAR’19] [Kim TCSVT’19] [LFHI & LFSD & LFMD Using AK-CNN] [Yang AICAS’20] [H-FCN]
3D-HEVC [AQ-CNN] [CNN-SENet]
VP9 [H-FCN]
VVC [Jin VCIP’17] [Jin PCM’17] [Jin ACCESS’18] [Wang ICIP’18] [Galpin DCC’19] [Pooling-Variable CNN] [Lin DCC’20] [Amna JRTIP’20] [DeepQTMT]