Review: Laude PCS’16 — Deep Learning-Based Intra Prediction Mode Decision for HEVC (Fast HEVC Prediction)
Only 0.52% Increase in BD-Rate Compared with HM-16.6+SCM-5.2
In this story, a deep Convolutional Neural Network (CNN) classifier for fast HEVC intra coding is briefly reviewed. I read this paper because I work on video coding research. It was published in 2016 PCS. (Sik-Ho Tsang @ Medium)
Outline
- Conventional Intra Coding
- Network Architecture
- Experimental Results
1. Conventional Intra Coding
- A frame is divided into non-overlapping blocks of different sizes for encoding/compression.
- These blocks are called Coding Units (CUs), with sizes ranging from 64×64 through 32×32 and 16×16 down to 8×8.
- For each CU in intra prediction, there are 35 prediction modes, as shown above.
- Neighbor reference samples are used to predict the current CU.
- 0: planar, to predict smooth gradual change within the CU.
- 1: DC, using the average value to fill in the CU as prediction.
- 2–34: Angular, using different angles to predict the current CU.
- Some examples are shown at the right of the figure.
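As a rough illustration of the mode numbering and of the simplest mode, here is a minimal Python sketch. The function names `mode_name` and `dc_predict` and the sample values are made up for this example, and the sketch leaves out the boundary smoothing that a real HEVC encoder applies to DC-predicted blocks.

```python
def mode_name(mode):
    """Map an HEVC intra mode index (0-34) to its family name."""
    if mode == 0:
        return "planar"
    if mode == 1:
        return "DC"
    if 2 <= mode <= 34:
        return "angular"
    raise ValueError("invalid intra mode: %d" % mode)


def dc_predict(top, left):
    """Fill an N x N block with the rounded mean of the 2N neighbor samples.

    Simplified: no edge filtering, and all neighbors are assumed available.
    """
    n = len(top)
    if len(left) != n:
        raise ValueError("top and left must have the same length")
    dc = (sum(top) + sum(left) + n) // (2 * n)  # integer mean with rounding
    return [[dc] * n for _ in range(n)]


# Made-up reference samples for a 4x4 block (illustration only).
top = [100, 102, 104, 106]   # reconstructed row above the block
left = [98, 99, 101, 103]    # reconstructed column to the left of the block
block = dc_predict(top, left)

print(mode_name(0), mode_name(1), mode_name(26))
print(block[0])  # every row is identical for DC prediction
```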
However, it is time-consuming to find the best prediction, because we need to estimate the cost of each prediction, which involves both the coding rate (bitrate) and the distortion (PSNR) of that prediction. This complicated process is called Rate Distortion Optimization (RDO).
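The cost comparison described above is commonly written as a Lagrangian cost J = D + λ·R, where D is the distortion, R the rate in bits, and λ a trade-off weight. The sketch below is only a toy version of that search: the names `sse`, `rd_cost` and `best_mode`, the candidate predictions, the bit counts and the value of λ are all assumptions for illustration, and distortion is measured as a sum of squared errors on the prediction itself rather than on a reconstructed block as in a real encoder.

```python
def sse(block, pred):
    """Sum of squared errors between two equally sized 2-D blocks."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block, pred)
               for a, b in zip(row_a, row_b))


def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def best_mode(block, candidates, lam):
    """Exhaustively evaluate every candidate and keep the cheapest one.

    candidates maps mode index -> (prediction block, estimated bits).
    """
    best = None
    for mode, (pred, bits) in candidates.items():
        cost = rd_cost(sse(block, pred), bits, lam)
        if best is None or cost < best[1]:
            best = (mode, cost)
    return best


# Toy 2x2 block with made-up predictions and bit counts (not encoder output).
block = [[10, 12], [14, 16]]
candidates = {
    0: ([[10, 12], [14, 16]], 20),   # perfect prediction, but expensive to signal
    1: ([[13, 13], [13, 13]], 4),    # flat prediction, cheap to signal
    26: ([[10, 12], [10, 12]], 6),   # copies the row above downwards
}
mode, cost = best_mode(block, candidates, lam=2.0)
print(mode, cost)
```

With these made-up numbers the flat prediction wins even though it is not the most accurate, which is the rate versus distortion trade-off in miniature. The loop also shows why the search is slow: every candidate has to be fully evaluated before the best one is known.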
2. Network Architecture
- To reduce the complexity, a deep convolutional neural network (CNN) classifier replaces the conventional RD optimization for the intra prediction mode decision.
- The above is the model for 32×32 blocks.
- The architecture is similar to AlexNet.
- Each block is fed through two convolutional, one max pooling, and two fully-connected layers. In the final layer, a classification into 35 classes (i.e. intra prediction modes) is carried out.
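To make the layer order concrete, here is a small shape-tracing sketch in plain Python. Only the order of the layers (two convolutional, one max pooling, two fully-connected, 35 outputs) and the 32×32 input come from the description above. Every kernel size, channel count, the hidden width and the single-channel input are assumptions chosen for illustration, not values taken from the paper.

```python
# Layer order follows the review; all numeric hyperparameters are ASSUMED.
LAYERS = [
    ("conv", {"out_channels": 32, "kernel": 3}),
    ("conv", {"out_channels": 64, "kernel": 3}),
    ("maxpool", {"kernel": 2}),
    ("fc", {"out_features": 1024}),
    ("fc", {"out_features": 35}),   # one output per intra prediction mode
]


def trace_shapes(block_size=32):
    """Return the output shape after each layer for a square input block."""
    c, h, w = 1, block_size, block_size   # single input channel (assumed)
    shapes = []
    for kind, params in LAYERS:
        if kind == "conv":       # unpadded convolution with stride 1
            h = h - params["kernel"] + 1
            w = w - params["kernel"] + 1
            c = params["out_channels"]
            shapes.append((kind, (c, h, w)))
        elif kind == "maxpool":  # non-overlapping pooling windows
            h //= params["kernel"]
            w //= params["kernel"]
            shapes.append((kind, (c, h, w)))
        else:                    # fully-connected layer on flattened features
            shapes.append((kind, (params["out_features"],)))
    return shapes


def decide_mode(scores):
    """Pick the intra mode whose class score is highest."""
    if len(scores) != 35:
        raise ValueError("expected one score per intra mode (35)")
    return max(range(35), key=scores.__getitem__)


for kind, shape in trace_shapes(32):
    print(kind, shape)

scores = [0.0] * 35
scores[26] = 3.5             # hypothetical network output favoring mode 26
print(decide_mode(scores))
```

The point of the sketch is the final step: instead of evaluating a cost for every mode, the encoder takes the single mode with the highest class score.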
3. Experimental Results
- HM-16.6+SCM-5.2 reference software is used.
- Only 0.52% increase in BD-Rate.
- However, the time reduction is not reported. The goal of this paper is to replace RDO with a CNN.
- This is one of the early papers applying CNNs to video coding.
Reference
[2016 PCS] [Laude PCS’16]
Deep learning-based intra prediction mode decision for HEVC