Reading: AQ-CNN — Adaptive QP Convolutional Neural Network (Fast 3D-HEVC Prediction)
LeNet-Like Architecture, 69.4% Average Time Reduction With Negligible Bitrate Increase on Synthesized Views
In this story, Adaptive QP Convolutional Neural Network (AQ-CNN), by Huazhong University of Science and Technology, is briefly presented. Since this is a 1-page conference paper, there are not many details. I read this because I work on video coding research.
3D-HEVC is one of the extensions of HEVC to support 3D video. With the color/texture videos and depth maps of both left and right views, any intermediate virtual view in between can be synthesized, such that autostereoscopic or free viewpoint 3D videos can be supported. This representation is called multiview video plus depth (MVD). Recently, 3D video technology has also been involved in the MPEG-I project to support 3DoF and 6DoF videos (of course, enhancements and changes are needed), which will be completed in the coming years.
This is a paper in 2019 DCC. (Sik-Ho Tsang @ Medium)
Outline
- AQ-CNN: Network Architecture
- Experimental Results
1. AQ-CNN: Network Architecture
- A LeNet-like binary classifier is used.
- The output label is either SPLIT or NOT SPLIT.
- AQ-CNN consists of two convolutional layers, two pooling layers, two fully connected layers and one softmax layer.
- Three separate models are trained for different CU sizes, with the same structure but different parameters.
- Because QP has a great influence on CU partitioning, QP is concatenated to the two fully connected layers to better predict the CU splitting label.
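To make the QP concatenation idea more concrete, below is a minimal pure-Python sketch of only the classifier head (two fully connected layers plus softmax). It is not the authors' implementation: the paper gives no layer sizes, so the feature length, hidden width, QP normalization by 51, the exact concatenation point (the input of each fully connected layer), the CU sizes 64/32/16, and the names `QPAwareHead`, `dense` and `init_dense` are all assumptions for illustration. The convolutional and pooling layers are skipped, and their flattened output is replaced by a placeholder feature vector with random, untrained weights.

```python
import math
import random

def dense(x, weights, bias):
    """Fully connected layer: one output per (weight row, bias) pair."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def init_dense(n_in, n_out, rng):
    """Random placeholder weights; a trained model would load learned values."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias

class QPAwareHead:
    """Two fully connected layers + softmax, with QP appended to each FC input."""

    def __init__(self, n_features, n_hidden=16, seed=0):
        rng = random.Random(seed)
        # +1 on each input width makes room for the concatenated QP value.
        self.fc1 = init_dense(n_features + 1, n_hidden, rng)
        self.fc2 = init_dense(n_hidden + 1, 2, rng)

    def predict_proba(self, features, qp):
        qp_norm = qp / 51.0  # assumption: scale QP by the HEVC maximum of 51
        h = relu(dense(features + [qp_norm], *self.fc1))
        logits = dense(h + [qp_norm], *self.fc2)
        return softmax(logits)  # [P(NOT SPLIT), P(SPLIT)]

    def predict(self, features, qp):
        p_not_split, p_split = self.predict_proba(features, qp)
        return "SPLIT" if p_split > p_not_split else "NOT SPLIT"

# One model per CU size (sizes assumed): same structure, separate parameters.
N_FEATURES = 32  # assumed length of the flattened conv/pool output
models = {size: QPAwareHead(N_FEATURES, seed=size) for size in (64, 32, 16)}

features = [0.5] * N_FEATURES  # placeholder for real conv/pool features
probs = models[32].predict_proba(features, qp=34)
label = models[32].predict(features, qp=34)
print(label, probs)
```

In an encoder, such a label would let the rate-distortion search for the other option be skipped, which is where the time reduction comes from. With the random weights used here the printed label is meaningless; only the data flow is illustrated.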
Reference
[2019 DCC] [AQ-CNN]
Fast CU Size Decision based on AQ-CNN for Depth Intra Coding in 3D-HEVC
Codec Fast Prediction
H.264 to HEVC [Wei VCIP’17] [H-LSTM]
HEVC [Yu ICIP’15 / Liu ISCAS’16 / Liu TIP’16] [Laude PCS’16] [Li ICME’17] [Katayama ICICT’18] [Chang DCC’18] [ETH-CNN & ETH-LSTM] [Zhang RCAR’19] [Kuanar JCSSP’19]
3D-HEVC [AQ-CNN]
VVC [Jin VCIP’17] [Jin PCM’17] [Wang ICIP’18] [Pooling-Variable CNN]