Review: Lee ICCE’18 — CNN for Visual Quality Improvement on HEVC (Codec Filtering)

Network Similar to ARCNN, Up to 0.24dB Image Quality Improvement is Achieved

Sik-Ho Tsang
3 min read · Apr 19, 2020

In this story, CNN-based Approach for Visual Quality Improvement on HEVC (Lee ICCE’18), by the Graduate School of SunMoon University and Sookmyung Women’s University, is briefly reviewed. I read this paper because I am working on video coding research. It was published in ICCE 2018. (For what video coding is, please feel free to visit Sections 1 & 2 in IPCNN.) (For what an in-loop filter is, please visit Section 1 in DRN.)

Every year, ICCE (International Conference on Consumer Electronics) is held together with CES in Las Vegas in January. I think many people know about CES (Consumer Electronics Show). CES is one of the biggest exhibitions for companies such as Apple, Sony and Samsung to show their innovative CE products. As I remember (I am not sure whether this still holds), a registered author of a paper published in ICCE can also walk around CES, which is quite attractive. (This is not an advertisement.) (Sik-Ho Tsang @ Medium)

Outline

  1. Network Architecture
  2. Experimental Results

1. Network Architecture

Left: The Decoder with Proposed CNN, Right: CNN Architecture
  • Left: The CNN is placed after the adaptive loop filter and before the output to the screen and the decoded picture buffer.
  • Hence, the CNN model not only improves the quality of the intra-coded video but also may improve the quality of the reference frame for inter-coding.
  • Right: The convolutional layers use kernel sizes of 9×9, 7×7, 1×1, and 5×5, in that order. The sum of squared errors (SSE) is used as the loss function.
  • Input and label sizes are 50×50 and 32×32, respectively.
  • In the paper's notation, F^c and F^a denote the convolution operation and the ReLU activation function, respectively.
  • The network is quite similar to ARCNN.
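
The 50×50 input and 32×32 label sizes are consistent with applying the four kernels without padding: a k×k convolution then trims k−1 pixels from each spatial dimension. The sketch below checks that arithmetic and also shows an SSE loss on small patches. It assumes stride 1 and no padding, which this review does not state explicitly, and the function names are only illustrative.

```python
def valid_conv_output_sizes(input_size, kernel_sizes):
    """Spatial size after each layer of a chain of convolutions.

    Assumption (not stated in the review): stride 1 and no padding,
    so a k x k kernel shrinks each dimension by k - 1.
    """
    sizes = [input_size]
    for k in kernel_sizes:
        next_size = sizes[-1] - (k - 1)
        if next_size <= 0:
            raise ValueError("kernel chain is too large for the input")
        sizes.append(next_size)
    return sizes


def sse_loss(prediction, label):
    """Sum of squared errors between two equally sized 2-D patches."""
    if len(prediction) != len(label):
        raise ValueError("patches must have the same number of rows")
    total = 0.0
    for row_p, row_t in zip(prediction, label):
        if len(row_p) != len(row_t):
            raise ValueError("patches must have the same number of columns")
        total += sum((p - t) ** 2 for p, t in zip(row_p, row_t))
    return total


if __name__ == "__main__":
    # Kernel sizes from the review: 9x9, 7x7, 1x1, 5x5 on a 50x50 input.
    print(valid_conv_output_sizes(50, [9, 7, 1, 5]))  # [50, 42, 36, 36, 32]
```

Under these assumptions the final size is 32, matching the 32×32 label, so the label would be the central region of the corresponding 50×50 input patch.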

2. Experimental Results

Dataset
  • Training uses 520,484 patches taken from the first frame of each of sequences 1∼6, and validation uses 30,333 patches taken from the first frame of each of sequences 7∼10.
  • The test sequences were encoded with QP=51 using intra-main profile in HEVC reference software 16.10.
  • In the reported results, the quality of the 500 frames of the BasketballDrill test sequence is improved, with PSNR gains ranging from 0.07 dB to 0.24 dB.
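
To put these gains in context, PSNR is 10·log10(peak²/MSE), so a gain of Δ dB corresponds to multiplying the MSE by 10^(−Δ/10). The following is a minimal sketch of that relation, assuming 8-bit samples (peak value 255), which the review does not state; the function names are illustrative.

```python
import math


def psnr(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (assumes 8-bit peak by default)."""
    if mse <= 0:
        raise ValueError("PSNR is undefined for non-positive MSE")
    return 10.0 * math.log10(peak * peak / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE is multiplied when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # The two ends of the PSNR gain range quoted in the review.
    for gain in (0.07, 0.24):
        ratio = mse_ratio_for_gain(gain)
        print(f"{gain:.2f} dB gain: MSE reduced by about {100 * (1 - ratio):.1f}%")
```

By this relation, gains of 0.07 dB and 0.24 dB correspond to MSE reductions of roughly 1.6% and 5.4%, respectively.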
