Reading: Fischer QoMEX’20 — Coding Chain with Spatial Up and Down-Scaling (VVC Inter)
In this story, On Versatile Video Coding at UHD with Machine-Learning-Based Super-Resolution (Fischer QoMEX’20), by Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), is briefly presented. I read this because I work on video coding research. In this paper:
- The input frame is first downsampled, then encoded.
- The encoded frame is then decoded and upsampled by a super-resolution network, as shown in the top branch of the figure above.
This is a paper in 2020 QoMEX. (Sik-Ho Tsang @ Medium)
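The downscale, encode, decode, upscale chain described above can be sketched in a few lines. This is only a toy illustration under loud assumptions: `toy_codec` is a plain uniform quantiser standing in for VVC encoding and decoding, `downscale_2x` is a 2x2 block average, and `upscale_2x` is nearest-neighbour upsampling standing in for the super-resolution network. All function names and the synthetic frame are hypothetical and not taken from the paper.

```python
import numpy as np

def downscale_2x(frame):
    """Halve both dimensions by averaging each 2x2 block (assumed downscaling filter)."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def toy_codec(frame, step):
    """Stand-in for VVC encode + decode: uniform quantisation with the given step."""
    return np.round(frame / step) * step

def upscale_2x(frame):
    """Stand-in for the super-resolution network: nearest-neighbour 2x upsampling."""
    return frame.repeat(2, axis=0).repeat(2, axis=1)

def psnr(ref, rec, peak=255.0):
    """PSNR in dB between the original and the reconstructed frame."""
    mse = np.mean((ref - rec) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))

def coding_chain(frame, step):
    """Downscale, 'encode and decode' at low resolution, then upscale back."""
    low_res = downscale_2x(frame)
    decoded = toy_codec(low_res, step)
    return upscale_2x(decoded)

# Synthetic, smooth 64x64 luma frame (illustrative data only).
y, x = np.mgrid[0:64, 0:64]
frame = 128.0 + 60.0 * np.sin(x / 10.0) + 40.0 * np.cos(y / 14.0)

reconstructed = coding_chain(frame, step=8.0)
print(reconstructed.shape, round(psnr(frame, reconstructed), 2))
```

The point of the sketch is the order of operations: the codec only ever sees the low-resolution frame, and quality is measured at the original resolution after upscaling.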
3. Experimental Results
3.1. BD-Rate
- VTM-5.0 is used with Random Access Configuration.
- L-SEABI is the Gaussian filter upsampling approach.
- At very low video quality (QPconv = {42, 44, 46, 48}), the investigated coding chain with spatial downscaling achieves a PSNR-based BD-Rate reduction above 9 %.
- At best, the investigated coding chain with RDN saves 39.5 % for the FoodMarket4 sequence.
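For readers unfamiliar with the metric, BD-Rate is the average bitrate difference between two codecs at equal quality. A common way to compute it, sketched below, is to fit a cubic polynomial to log-rate as a function of PSNR for each codec and integrate both fits over the overlapping PSNR range. The function name `bd_rate` and the rate/PSNR points are made up for illustration; they are not results from the paper, and the paper's exact calculation may differ.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta rate in percent; negative means the test codec saves bitrate."""
    log_ra = np.log10(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log10(np.asarray(rate_test, dtype=float))
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)

    # Fit log10(rate) as a cubic polynomial of PSNR for each rate-distortion curve.
    poly_a = np.polyfit(pa, log_ra, 3)
    poly_t = np.polyfit(pt, log_rt, 3)

    # Integrate only over the PSNR interval covered by both curves.
    lo = max(pa.min(), pt.min())
    hi = min(pa.max(), pt.max())
    int_a = np.polyint(poly_a)
    int_t = np.polyint(poly_t)
    area_a = np.polyval(int_a, hi) - np.polyval(int_a, lo)
    area_t = np.polyval(int_t, hi) - np.polyval(int_t, lo)

    # Average log-rate difference, converted back to a percentage.
    avg_log_diff = (area_t - area_a) / (hi - lo)
    return float((10.0 ** avg_log_diff - 1.0) * 100.0)

# Hypothetical RD points (kbit/s, dB) for four QP values.
anchor_rate = [1000.0, 1600.0, 2500.0, 4000.0]
anchor_psnr = [30.0, 32.0, 34.0, 36.0]
test_rate = [r * 0.8 for r in anchor_rate]  # test codec needs 20 % less rate
test_psnr = anchor_psnr

print(round(bd_rate(anchor_rate, anchor_psnr, test_rate, test_psnr), 2))
```

With these made-up points the test codec uses exactly 20 % less bitrate at every PSNR value, so the function returns approximately -20, which is how a "BD-Rate reduction" of 20 % would be reported.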
3.2. Time Complexity
- VDSR takes around 1 second to upscale the Y channel of a full HD frame to 4K resolution on an NVIDIA GeForce RTX 2080 Ti.
- On the same GPU, RDN takes between 6 and 8 seconds.
- L-SEABI takes around 1 second to upscale the Y channel in the proposed coding chain on an Intel Xeon E3-1275 v6 @ 3.8 GHz.
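To put these per-frame times in perspective, the small calculation below converts them into total upscaling time for a sequence and into a slowdown factor relative to real-time playback. The per-frame times come from the list above; the sequence length of 600 frames and the frame rate of 60 fps are assumptions chosen for illustration, and the helper names are hypothetical.

```python
def total_upscaling_seconds(seconds_per_frame, num_frames):
    """Total time needed to upscale every frame of a sequence."""
    return seconds_per_frame * num_frames

def slowdown_vs_realtime(seconds_per_frame, fps):
    """How many times slower than real time: the per-frame budget is 1 / fps seconds."""
    return seconds_per_frame * fps

NUM_FRAMES = 600  # assumed sequence length
FPS = 60          # assumed frame rate

# Per-frame Y-channel upscaling times as listed above.
for name, spf in [("VDSR", 1.0), ("RDN (lower bound)", 6.0), ("RDN (upper bound)", 8.0)]:
    total = total_upscaling_seconds(spf, NUM_FRAMES)
    factor = slowdown_vs_realtime(spf, FPS)
    print(f"{name}: {total:.0f} s in total, {factor:.0f}x slower than real time")
```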
Reference
[2020 QoMEX] [Fischer QoMEX’20]
On Versatile Video Coding at UHD with Machine-Learning-Based Super-Resolution
Codec Inter Prediction
H.264 [DRNFRUC & DRNWCMC]
HEVC [CNNIF] [Zhang VCIP’17] [NNIP] [GVTCNN] [Ibrahim ISM’18] [VC-LAPGAN] [VI-CNN] [CNNMCR] [FRUC+DVRF] [FRUC+DVRF+VECNN] [RSR] [Zhao ISCAS’18 & TCSVT’19] [Ma ISCAS’19] [Xia ISCAS’19] [Zhang ICIP’19] [ES] [GVCNN] [FRCNN] [Pham ACCESS’19] [CNNInvIF / InvIF] [CNN-SR & CNN-UniSR & CNN-BiSR] [DeepFrame] [U+DVPN] [Multi-Scale CNN] [Klopp TIP’20]
AVS3 [Zhang ICMEW’20]
VVC [FRUC+DVRF+VECNN] [ScratchCNN] [Fischer QoMEX’20]