Reading: Lin DCC’20 — CNN Based Fast Intra Mode Prediction for H.266/FVC Video Coding (Fast VVC)

To Reduce the Computational Complexity of Intra Coding

Sik-Ho Tsang
3 min read · Jul 31, 2020

In this story, Convolutional Neural Network based Fast Intra Mode Prediction for H.266/FVC Video Coding (Lin DCC’20), by National Taipei University of Technology, and National Central University, is briefly presented. I read this because I work on video coding research.

The next-generation video compression standard H.266/Future Video Coding (FVC) provides high compression efficiency, but at the cost of high complexity: the encoder has to find the optimal intra mode among 67 modes. This has motivated studies that speed up the intra mode decision to reduce the computational complexity of intra coding.

Versatile Video Coding (VVC) was developed based on FVC.

This is a 2020 DCC paper. (Sik-Ho Tsang @ Medium)

(Since this is only a 1-page paper, not many details are given. The network architecture and the detailed results are taken from the authors’ poster.)

Outline

  1. Network Architecture
  2. Overall Approach
  3. Experimental Results

1. Network Architecture

Network Architecture
  • The CNN architecture comprises two convolutional layers and a fully connected layer.
  • The input, which includes the neighboring pixels, is fed into the network.
  • The output is a 67-dimensional vector for the 67 intra prediction modes.
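
To make the data flow concrete, below is a minimal pure-Python sketch of a network with the same overall shape: two convolutional layers, one fully connected layer, and a 67-dimensional output. Only that layer count and the output size come from the paper. The 17 × 17 input (a 16 × 16 block plus one row and one column of neighbors), the 3 × 3 kernels, the 4 channels, the ReLU activations, the softmax, and the random weights are all assumptions made purely for illustration, and every function name is hypothetical.

```python
import math
import random

NUM_MODES = 67  # from the paper: one output per intra prediction mode


def conv2d_relu(planes, kernels, biases):
    """Valid 2-D convolution followed by ReLU.

    planes:  list of C_in feature maps, each an H x W list of lists
    kernels: nested list with shape [C_out][C_in][k][k]
    """
    h, w = len(planes[0]), len(planes[0][0])
    k = len(kernels[0][0])
    out = []
    for kern, bias in zip(kernels, biases):
        fmap = []
        for i in range(h - k + 1):
            row = []
            for j in range(w - k + 1):
                s = bias
                for c, plane in enumerate(planes):
                    for di in range(k):
                        for dj in range(k):
                            s += plane[i + di][j + dj] * kern[c][di][dj]
                row.append(max(0.0, s))
            fmap.append(row)
        out.append(fmap)
    return out


def fully_connected(vec, weights, biases):
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    m = max(z)
    e = [math.exp(t - m) for t in z]
    total = sum(e)
    return [t / total for t in e]


def rand_tensor(rng, *shape):
    """Nested list of small random weights (stand-in for trained weights)."""
    if len(shape) == 1:
        return [rng.uniform(-0.1, 0.1) for _ in range(shape[0])]
    return [rand_tensor(rng, *shape[1:]) for _ in range(shape[0])]


def build_params(rng, size=17, channels=4, k=3):
    # Assumed hyperparameters; the paper does not state them here.
    after = size - 2 * (k - 1)  # spatial size after two valid convolutions
    return {
        "k1": rand_tensor(rng, channels, 1, k, k), "b1": [0.0] * channels,
        "k2": rand_tensor(rng, channels, channels, k, k), "b2": [0.0] * channels,
        "w": rand_tensor(rng, NUM_MODES, channels * after * after),
        "b": [0.0] * NUM_MODES,
    }


def predict_mode_scores(block, p):
    x = conv2d_relu([block], p["k1"], p["b1"])   # conv layer 1
    x = conv2d_relu(x, p["k2"], p["b2"])         # conv layer 2
    flat = [v for fmap in x for row in fmap for v in row]
    return softmax(fully_connected(flat, p["w"], p["b"]))  # 67 scores


rng = random.Random(0)
params = build_params(rng)
block = [[rng.random() for _ in range(17)] for _ in range(17)]
scores = predict_mode_scores(block, params)
print(len(scores))
```

With untrained random weights the scores are meaningless; the point is only that one block of pixels goes in and one score per intra mode comes out.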

2. Overall Approach

Overview flowchart of proposed method.
  • The deep learning method was tested only on 16 × 16 blocks, using JEM 7.0.
  • According to the CNN output, the top 5 modes are selected to go through the full rate-distortion optimization (RDO) process.
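
The candidate pruning described above can be sketched in a few lines of Python. This is an illustrative sketch, not the JEM implementation: `top_k_modes`, `choose_intra_mode`, and the toy cost function are hypothetical names, and the real RD cost would come from the encoder actually coding the block with each candidate mode.

```python
import heapq


def top_k_modes(scores, k=5):
    """Indices of the k highest-scoring modes, best first."""
    return heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)


def choose_intra_mode(scores, rd_cost, k=5):
    """Evaluate the expensive RD cost only on the CNN's top-k candidates."""
    candidates = top_k_modes(scores, k)
    return min(candidates, key=rd_cost)


# Toy CNN output over 67 modes (made-up values for illustration).
scores = [0.0] * 67
for mode, s in {0: 0.30, 1: 0.25, 18: 0.20, 50: 0.15, 34: 0.06, 66: 0.04}.items():
    scores[mode] = s


def toy_cost(mode):
    # Made-up stand-in for the encoder's RD cost; lower is better.
    return abs(mode - 20)


print(top_k_modes(scores))                  # [0, 1, 18, 50, 34]
print(choose_intra_mode(scores, toy_cost))  # 18
```

Full RDO then runs on 5 candidates instead of all 67, which is where any complexity saving would come from. The trade-off is that the best mode is missed whenever the CNN does not rank it in its top 5.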

3. Experimental Results

BD-PSNR, BDBR and Time Difference Compared to JEM-7.0 Without Fast Search
  • doFastSearch: The default SATD-based fast search.
  • Compared with the default fast search method doFastSearch in JEM 7.0, the proposed method achieves a 0.033% decrease in Bjøntegaard delta bit rate (BDBR) on average, with only a slight increase in time.
  • The gain is larger, a 0.097% decrease in BDBR over the default method, when the tested videos have a moderate frame size, as in classes B, C, D, and E.

This is the 28th story in this month.
