# Reading: QE-CNN — Quality Enhancement Convolutional Neural Network (Codec Filtering)

In this story, **Quality Enhancement Convolutional Neural Network (QE-CNN)**, by Beihang University, is briefly reviewed. In this paper:

- **QE-CNN-I**: QE-CNN for I frames is proposed.
- **QE-CNN-P**: QE-CNN for P/B frames is also proposed.
- **Time-constraint QE-CNN** is also proposed for real-time scenarios.

This is an extension of the DS-CNN paper in 2018 ICME, and it is published in **2019 TCSVT**, where TCSVT has a **high impact factor of 4.046**. (Sik-Ho Tsang @ Medium)

# Outline

1. **Introduction to ARCNN**
2. **QE-CNN-I: Network Architecture**
3. **QE-CNN-P: Network Architecture**
4. **Experimental Results**
5. **Results for Time Constraint (Real-Time) Scenario**

# 1. Introduction to ARCNN

# 2. QE-CNN-I: Network Architecture

- As shown above, one more convolution layer is used in QE-CNN-I.
- Thus, this is **QE-CNN-I (9–7–3–1–5)**.

- It is found in the tests that AR-CNN-3, i.e. **QE-CNN-I (9–7–3–1–5)**, with **PReLU** used, has the largest PSNR gain. Therefore, this architecture is adopted.
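One way to see what the extra layer buys is the receptive field. For a stack of stride-1 convolutions, the receptive field is 1 plus the sum of (kernel size − 1) over layers. The sketch below compares ARCNN's 9–7–1–5 baseline with QE-CNN-I's 9–7–3–1–5 (a standard receptive-field calculation, not code from the paper):

```python
# Receptive field of a stack of stride-1 convolutions:
# rf = 1 + sum(k - 1) over all kernel sizes k.
def receptive_field(kernel_sizes):
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

arcnn = [9, 7, 1, 5]        # ARCNN baseline kernel sizes
qecnn_i = [9, 7, 3, 1, 5]   # QE-CNN-I adds one 3x3 layer

print(receptive_field(arcnn))    # 19
print(receptive_field(qecnn_i))  # 21
```

So the added 3×3 layer deepens the network and slightly enlarges the receptive field (19 → 21 pixels).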

# 3. QE-CNN-P: Network Architecture

- Similar to QE-CNN-I, it uses the 9–7–3–1–5 network architecture.
- Different from QE-CNN-I, it has an additional path, shown in green.
- At the end of the network, **the outputs of Conv 4 and Conv 8 are concatenated, and are both convolved by Conv 9**.
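The concatenate-then-convolve step can be sketched in plain Python. Channel counts are made up here and Conv 9 is simplified to a 1×1 fusion; both are illustrative assumptions, not the paper's exact configuration:

```python
# Feature maps represented as lists of 2-D channels (illustrative only).
def concat_channels(a, b):
    """Concatenate two feature maps along the channel axis."""
    return a + b

def conv1x1(channels, weights):
    """A 1x1 'fusion' convolution: per-pixel weighted sum of input channels."""
    h, w = len(channels[0]), len(channels[0][0])
    return [[sum(wgt * ch[i][j] for wgt, ch in zip(weights, channels))
             for j in range(w)] for i in range(h)]

# two hypothetical feature maps of size 2x2 (2 channels and 1 channel)
conv4_out = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
conv8_out = [[[2.0, 2.0], [2.0, 2.0]]]

merged = concat_channels(conv4_out, conv8_out)  # 3 channels total
fused = conv1x1(merged, [0.5, 0.5, 1.0])        # single output channel
```

The point is simply that Conv 9 sees both paths at once, so it can weight the I-frame-style features against the additional P-frame path.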

# 4. Experimental Results

## 4.1. Objective Quality

- LDP (Low Delay P): except for I frames, all other frames are P frames, which are encoded using information from previously coded frames.
- As shown above, QE-CNN-I obtains the highest PSNR gain for I frames compared to ARCNN, VRCNN, and DCAD.
- Likewise, QE-CNN-P obtains the highest PSNR gain for P frames compared to ARCNN, VRCNN, and DCAD.

## 4.2. Subjective Quality

- 12 non-expert subjects are involved in the test.
- During the test, sequences were displayed in random order. After viewing each sequence, the subjects were asked to give a subjective score.
- The rating scale includes excellent (100–81), good (80–61), fair (60–41), poor (40–21), and bad (20–1).
- Again, QE-CNN outperforms ARCNN and DCAD in terms of DMOS.

## 4.3. Time Analysis

- The running time of the **AR-CNN** method is **0.70 ms per Coding Tree Unit (CTU)** and that of **DCAD** is **0.64 ms** per CTU. In contrast, the **QE-CNN-I** model requires approximately **1.53 ms** per CTU, and **QE-CNN-P** consumes **3.90 ms** per CTU.
- Thus, the performance improvement of the QE-CNN method comes at the expense of computational time.
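To put these per-CTU timings in per-frame terms: HEVC's default CTU size is 64×64, so a 1080p frame holds about 510 CTUs (the frame size is my assumption, for illustration):

```python
import math

# Per-CTU running times reported in the paper (ms)
T_ARCNN, T_DCAD, T_QECNN_I, T_QECNN_P = 0.70, 0.64, 1.53, 3.90

# Assumption for illustration: 64x64 CTUs on a 1920x1080 frame
ctus = math.ceil(1920 / 64) * math.ceil(1080 / 64)  # 30 * 17 = 510 CTUs

frame_ms_p = ctus * T_QECNN_P  # ~1989 ms per frame with QE-CNN-P
frame_ms_i = ctus * T_QECNN_I  # ~780 ms per frame with QE-CNN-I
```

Even QE-CNN-I is orders of magnitude above the 16.67 ms budget of 60 fps playback, which motivates the time-constrained variant in the next section.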

# 5. Results for Time Constraint (Real-Time) Scenario

- Under the time-constraint scenario, three options can be chosen for the *k*-th CTU:
  - *n* = 2: QE-CNN-P (highest complexity)
  - *n* = 1: QE-CNN-I (moderate complexity)
  - *n* = 0: no filtering (lowest complexity)

- Hence, the time constraint equation is formulated:

- where *k* indexes the CTU, *n* = 0, 1, 2 indicates which filter is used as shown above, *t* is the time needed for that CTU, and *N* is the total number of CTUs within one frame.
- The main idea is that **the time consumed by all CTUs should be smaller than or equal to the time constraint *T***, while the MSE reduction (ΔMSE) over all CTUs is maximized. (I don't go into details about this because I just want to focus on the CNN. If interested, please read the original paper; the hyperlink is at the bottom.)
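In symbols, the optimization described above can be written as follows (a reconstruction from the bullet's definitions, not copied from the paper):

```latex
\max_{\{n_k\}} \; \sum_{k=1}^{N} \Delta \mathrm{MSE}_{n_k,\,k}
\quad \text{s.t.} \quad \sum_{k=1}^{N} t_{n_k,\,k} \le T,
\qquad n_k \in \{0, 1, 2\}
```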

- For 60 fps, *T* = 16.67 ms per frame. With 600 frames, 10 seconds of video are encoded.
- Under this scenario, **time-constraint QE-CNN can still obtain 1.41% to 6.83% BD-rate reduction under the time constraint of 60 fps (10 seconds in total)**.
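The per-CTU selection is essentially a budgeted assignment problem. Below is a simple greedy sketch that upgrades CTUs by best ΔMSE-per-millisecond until the budget is spent; the per-CTU times come from the paper, but the ΔMSE values and the greedy solver itself are my illustrative assumptions, not the paper's actual method:

```python
# Options per CTU: (time in ms, delta-MSE reduction).
# Index 0 = no filter, 1 = QE-CNN-I, 2 = QE-CNN-P.
# Times from the paper; delta-MSE values are hypothetical.
OPTIONS = [(0.0, 0.0), (1.53, 1.0), (3.90, 1.8)]

def select_filters(num_ctus, budget_ms):
    """Greedily upgrade CTUs one level at a time by best delta-MSE per ms."""
    choice = [0] * num_ctus   # start with no filtering everywhere
    spent = 0.0
    while True:
        best, best_gain, best_dt = None, 0.0, 0.0
        for k in range(num_ctus):
            n = choice[k]
            if n + 1 < len(OPTIONS):
                dt = OPTIONS[n + 1][0] - OPTIONS[n][0]
                dm = OPTIONS[n + 1][1] - OPTIONS[n][1]
                if spent + dt <= budget_ms and dm / dt > best_gain:
                    best, best_gain, best_dt = k, dm / dt, dt
        if best is None:      # no affordable upgrade left
            break
        choice[best] += 1
        spent += best_dt
    return choice, spent

# e.g. 4 CTUs under the 60 fps per-frame budget of 16.67 ms
choice, spent = select_filters(num_ctus=4, budget_ms=16.67)
```

With a looser budget all CTUs get QE-CNN-P; as the budget tightens, the greedy falls back to QE-CNN-I or no filtering for some CTUs, which is the trade-off the time-constrained formulation captures.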

During the days of coronavirus, let me take on a challenge of writing 30 stories again for this month. Is it good? This is the 15th story in this month. 50% progress!! Thanks for visiting my story..

## Reference

[2019 TCSVT] [QE-CNN] Enhancing Quality for HEVC Compressed Videos

## Codec Filtering

**JPEG**: [ARCNN] [RED-Net] [DnCNN] [Li ICME’17] [MemNet] [MWCNN]

**HEVC**: [Lin DCC’16] [IFCNN] [VRCNN] [DCAD] [MMS-net] [DRN] [Lee ICCE’18] [DS-CNN] [RHCNN] [VRCNN-ext] [S-CNN & C-CNN] [MLSDRN] [Liu PCS’19] [QE-CNN]

**VVC**: [Lu CVPRW’19] [Wang APSIPA ASC’19]