Review: MIP — Multiple Linear Regression Intra Prediction (HEVC Prediction)

0.4% BD-Rate Reduction Compared to the Conventional HEVC HM-16.0

In this story, Multiple linear regression Intra Prediction (MIP), by University of Missouri Kansas City, University of Science and Technology of China, and Tencent America, is reviewed. I read this because I work on video coding research.

Linear regression is one of the most basic topics to learn before deep neural networks. In this paper, the authors show how they formulate a video coding problem, specifically intra prediction, as a linear regression problem. The paper was published in 2019 ICASSP. (Sik-Ho Tsang @ Medium)

  1. HEVC Intra Prediction
  2. Proposed Approach
  3. Experimental Results

1. HEVC Intra Prediction

35 HEVC Intra Predictions
  • HEVC inherits the block-based scheme of previous video coding frameworks; the main difference is the Coding Tree Unit (CTU) concept.
  • HEVC intra coding is based on spatial interpolation of samples from previously decoded image blocks.
  • It supports up to 35 intra prediction modes: planar, DC, and 33 angular prediction modes, as shown above.
  • (To know more about video coding and intra prediction, please feel free to read Sections 1 & 2 in IPCNN.)

2. Proposed Approach

Framework of proposed Multiple linear regression Intra Prediction (MIP) scheme
  • The reference pixels and the best intra prediction are flattened and concatenated into the input X, which is fed into the Multiple Linear Regression (MLR) model.
  • The ground truth Y is the N×N block.
  • Therefore the loss function is:
  • A separate model is trained for each of the Planar and DC modes due to their special texture characteristics.
  • Since the neighboring angular modes share a lot of similarities, 3 adjacent angular modes are combined into a single MIP angular mode to avoid the singularity:
  • where n is the HEVC intra prediction mode index and m is the MIP mode index. The 33 angular modes thus collapse into 11 MIP angular modes, which together with Planar and DC give 13 MIP modes in total.
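
The steps in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the grouping formula in hevc_to_mip_mode is one plausible way to merge 3 adjacent angular modes (the exact equation from the paper is not reproduced here), the model is a plain weight matrix without a bias term, and the loss is assumed to be a sum of squared errors. All function names are hypothetical.

```python
from typing import List, Sequence

# HEVC intra mode indices: 0 = Planar, 1 = DC, 2..34 = angular.
FIRST_ANGULAR, LAST_ANGULAR = 2, 34


def hevc_to_mip_mode(n: int) -> int:
    """Map HEVC mode n (0..34) to a MIP mode m (0..12).

    ASSUMPTION: consecutive triples of angular modes share one MIP mode.
    """
    if not 0 <= n <= LAST_ANGULAR:
        raise ValueError(f"HEVC intra mode out of range: {n}")
    if n < FIRST_ANGULAR:
        return n  # Planar and DC keep their own models
    return FIRST_ANGULAR + (n - FIRST_ANGULAR) // 3


def build_input(reference: Sequence[float],
                best_pred: Sequence[Sequence[float]]) -> List[float]:
    """Flatten the N x N best intra prediction and append it to the reference pixels."""
    flat_pred = [p for row in best_pred for p in row]
    return list(reference) + flat_pred


def mlr_predict(x: Sequence[float],
                weights: Sequence[Sequence[float]]) -> List[float]:
    """Single linear layer: one weight row per output pixel (N*N rows, len(x) columns)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """ASSUMED loss: sum of squared differences against the flattened ground truth Y."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


# Number of distinct MIP modes produced by the assumed grouping.
num_mip_modes = len({hevc_to_mip_mode(n) for n in range(LAST_ANGULAR + 1)})
```

With this grouping, the 35 HEVC modes map onto 13 distinct MIP modes, matching the count given above. Training would then fit one weight matrix per MIP mode by minimizing the loss over all training blocks assigned to that mode.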

(The network is quite similar to IPFCN. Here, however, only a single layer is used, with multiple models. IPFCN uses one model with more than one hidden layer.)

Proposed MIP in HEVC Encoder
  • An additional bit flag is encoded to indicate whether MIP is selected.
  • Therefore, for each block, either a conventional intra prediction mode or the proposed MIP is selected according to the Rate Distortion (RD) cost.
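
The per-block decision above can be sketched as follows. This is a simplified illustration, not the HM encoder logic: it assumes the usual Lagrangian cost J = D + λ·R, and it charges the flag as exactly one bit, whereas a real encoder would entropy-code it. The Candidate class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    distortion: float  # e.g. squared error between original and reconstructed block
    rate_bits: float   # bits for mode and residual, excluding the MIP flag


def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def choose_prediction(conventional: Candidate, mip: Candidate,
                      lam: float, flag_bits: float = 1.0) -> Candidate:
    """Pick the candidate with the lower RD cost.

    ASSUMPTION: the flag is signaled for every block, so both candidates pay for it.
    """
    cost_conv = rd_cost(conventional.distortion, conventional.rate_bits + flag_bits, lam)
    cost_mip = rd_cost(mip.distortion, mip.rate_bits + flag_bits, lam)
    return mip if cost_mip < cost_conv else conventional
```

Because both candidates carry the flag, MIP only wins when its distortion reduction outweighs any extra bits it needs, and the flag overhead is paid even in blocks where the conventional mode is kept.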

3. Experimental Results

3.1. Settings

  • HM-16.0 is used.
  • All-intra configuration is adopted.
  • Quantization parameters (QPs): 22, 27, 32 and 37. A higher QP gives a lower bitrate and lower quality, and vice versa.
  • The DIV2K dataset of high-quality 2K-resolution images is used to generate the training dataset. DIV2K contains 800 training images, 100 validation images and 100 testing images covering a wide range of contents.
  • A separate set of MIP models is trained for each combination of QP and block size.
  • Each set contains 13 models, one per MIP mode.
  • Therefore, there are in total 4 (QPs) × 4 (block sizes) × 13 (MIP modes) = 208 MIP models.
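
To make the model count concrete, the sketch below enumerates one key per trained model. The four QPs come from the settings above; the four block sizes (4, 8, 16, 32) are an assumption, since the text only implies that there are four of them.

```python
from itertools import product

QPS = (22, 27, 32, 37)
BLOCK_SIZES = (4, 8, 16, 32)  # ASSUMED: the text only implies four block sizes
NUM_MIP_MODES = 13

# One (qp, block_size, mip_mode) key per separately trained MLR model.
model_keys = list(product(QPS, BLOCK_SIZES, range(NUM_MIP_MODES)))

print(len(model_keys))  # 4 * 4 * 13 = 208
```

At encoding time, the model for a block would be looked up by its QP, its size, and the MIP mode derived from the best HEVC intra mode.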

3.2. Results

BD-rate (%) of the proposed approach compared with the conventional HEVC HM-16.0
  • On average, BD-rate changes of -0.4%, -0.6%, and -0.8% (i.e., bitrate savings) are achieved by the proposed method for the luma and the two chroma components, respectively.
  • The largest gain, a -0.9% BD-rate, is observed on Traffic from Class A, while Class C sees only a -0.2% BD-rate.

During the days of coronavirus, I hope to write 30 stories in this month to give myself a small challenge. This is the 27th story in this month. Thanks for visiting my story…
