# Review — Fast Intra Coding Algorithm for Depth Map with End-to-End Edge Detection Network (3D-HEVC Intra)

## The Encoding Time is Reduced by 39.56% on Average

In this story, **Fast Intra Coding Algorithm for Depth Map with End-to-End Edge Detection Network** (Liu VCIP’20) is briefly reviewed, since I need to review a manuscript for an IEEE Transactions journal. In this paper:

- First, the **Holistically Nested Edge Detection (HED) network** is used for **edge detection**.
- Then, the **Otsu method** is used to divide the output of the HED into a **foreground** region and a **background** region.
- Finally, **the CU size** and the **candidate list** of intra modes are **determined** according to the region of the coding tree unit (CTU).

This is a paper in **2020 VCIP**. (Sik-Ho Tsang @ Medium)

# Outline

1. **Brief Introduction in HEVC Depth Coding**
2. **Proposed Fast Approach**
3. **Experimental Results**

# 1. **Brief Introduction in HEVC Depth Coding**

- In HEVC depth coding, besides the **35 intra prediction modes** of HEVC (0–34), there are also **2 DMM modes** (35–36).
- **The coding time of the depth map is about 4 times that of the texture map**, accounting for more than **80% of the total coding time**.
- In addition, the probability of the DMM modes being selected as the best mode is only 0.85%.

# 2. **Proposed Fast Approach**

## 2.1. HED Network for Edge Detection

- First, the **Holistically Nested Edge Detection (HED) network (2015 ICCV)** is used for edge detection.
- The **input** is the **depth map**.
- In this network, convolutional neural network (CNN) features at **different scales** are used, from scale 1 to scale 5.
- Then, a **fusion module** is used to fuse the multi-scale features.
- Finally, the **probabilistic edge map** is generated as the **output**.

A coding unit (CU) with edges tends to use more complicated modes and a smaller CU size, whereas a CU without edges tends to use a smoother mode, such as DC or planar, and a larger CU size.

- (In this paper, the HED network is used to detect edges. HED is a 2015 ICCV paper with over 200 citations. There are also other edge detection approaches proposed afterwards.)
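The fusion step above can be sketched in NumPy: each side output is upsampled to the input resolution and combined with fusion weights. This is a minimal illustration only — in HED the fusion weights are learned and the side outputs come from sigmoid-activated convolutional layers; the function names and the fixed weights here are hypothetical.

```python
import numpy as np

def upsample_nearest(m, target_h, target_w):
    """Nearest-neighbour upsampling of a 2-D map to the target size."""
    rows = np.arange(target_h) * m.shape[0] // target_h
    cols = np.arange(target_w) * m.shape[1] // target_w
    return m[np.ix_(rows, cols)]

def fuse_side_outputs(side_outputs, weights, out_h, out_w):
    """Fuse multi-scale side outputs into one probabilistic edge map.

    side_outputs: list of 2-D arrays in [0, 1], one per scale.
    weights: fusion weights (learned in HED; fixed here for illustration).
    """
    fused = np.zeros((out_h, out_w))
    for m, w in zip(side_outputs, weights):
        fused += w * upsample_nearest(m, out_h, out_w)
    return np.clip(fused, 0.0, 1.0)

# Toy example: three side outputs at scales 8x8, 4x4, 2x2 for one depth block
sides = [np.random.rand(8, 8), np.random.rand(4, 4), np.random.rand(2, 2)]
edge_map = fuse_side_outputs(sides, weights=[0.5, 0.3, 0.2], out_h=8, out_w=8)
```

The result is a single probabilistic edge map at the depth-map resolution, which is what the next stage (Otsu thresholding) consumes.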

## 2.2. Otsu Method for Region Division

- The **Otsu method** is used to divide the output of the HED into a **foreground** region and a **background** region.
- Otsu is a common method to determine an adaptive threshold.
- It is used to binarize the edge detection map into a more easily processed binary edge map.

- The inter-class variance is *σ*² = *W*₀(*U*₀ − *U*)² + *W*₁(*U*₁ − *U*)², where *W*₀ represents the proportion of non-edge pixels in the whole image and *U*₀ is their average gray scale, *W*₁ represents the proportion of edge pixels and *U*₁ is their average gray scale, and *U* is the average gray scale of the whole image.
- Each candidate pixel value *k* in the entire pixel range of the image is tried in turn to find the corresponding inter-class variance. The *k* value corresponding to the **maximum inter-class variance** is the **optimal threshold** *T*.
- When the pixel value of a pixel is **greater than or equal to the threshold** *T*, it is judged as **foreground** region; **otherwise**, it is judged as **background** region.
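The exhaustive search over *k* can be sketched as follows, assuming an 8-bit edge map. This is a textbook Otsu implementation for illustration, not the paper's code; it uses the equivalent form of the inter-class variance, *W*₀*W*₁(*U*₀ − *U*₁)².

```python
import numpy as np

def otsu_threshold(gray):
    """Find the Otsu threshold of an 8-bit image by maximizing the
    inter-class variance W0*W1*(U0 - U1)^2 over all candidate values k."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()                     # gray-level probabilities
    best_k, best_var = 0, -1.0
    for k in range(1, 256):
        w0, w1 = prob[:k].sum(), prob[k:].sum()  # class proportions
        if w0 == 0 or w1 == 0:
            continue                             # one class empty: skip
        u0 = (np.arange(k) * prob[:k]).sum() / w0          # background mean
        u1 = (np.arange(k, 256) * prob[k:]).sum() / w1     # foreground mean
        var = w0 * w1 * (u0 - u1) ** 2           # inter-class variance
        if var > best_var:
            best_var, best_k = var, k
    return best_k

# Binarize a probabilistic edge map (scaled to 0..255):
edge_map = (np.random.rand(16, 16) * 255).astype(np.uint8)
T = otsu_threshold(edge_map)
foreground = edge_map >= T   # edge (foreground) region, per the rule above
```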

## 2.3. CU Size and Mode Decision

- **Step 1**: When the PU is **64×64** and located in the **edge region**, **only the Angular modes** are traversed to find the optimal mode. **Otherwise**, perform **Step 2**.
- **Step 2**: Perform **rough mode** and **most probable mode** decision, and **only the non-angular modes** are traversed to find the optimal mode.
- **Step 3**: When the PU is **not 64×64** and located in the **edge region**, the 35 HEVC modes are skipped and **only DMM modes** are traversed to find the best partition mode as the optimal mode. **Otherwise**, perform **Step 4**.
- **Step 4**: Perform **rough mode** and **most probable mode** decision, and **only the non-angular modes** and **non-DMM modes** are traversed to find the optimal mode.
- **Step 5**: The total RD cost of all candidate modes in the candidate mode list is calculated to find the optimal mode.
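The branching in Steps 1–4 can be sketched as a small selector that maps the PU size and edge/non-edge classification to the mode family to traverse; Step 5 (RD-cost comparison) then runs over whatever list is returned. The mode-set names below are illustrative, not identifiers from the HTM codebase.

```python
def candidate_modes(pu_size, in_edge_region):
    """Sketch of the Step 1-4 decision: return the candidate mode family
    to traverse before the final RD-cost comparison (Step 5)."""
    ANGULAR = [f"angular_{m}" for m in range(2, 35)]   # HEVC angular modes 2-34
    NON_ANGULAR = ["planar", "dc"]                     # HEVC modes 0-1
    DMM = ["dmm_1", "dmm_4"]                           # depth modelling modes 35-36

    if pu_size == 64:
        # Step 1: edge region -> angular only; Step 2: otherwise non-angular only
        return ANGULAR if in_edge_region else NON_ANGULAR
    # Step 3: edge region -> DMM only; Step 4: otherwise non-angular, non-DMM
    return DMM if in_edge_region else NON_ANGULAR

print(candidate_modes(64, True)[:2])   # prints ['angular_2', 'angular_3']
print(candidate_modes(32, False))      # prints ['planar', 'dc']
```

Either branch shrinks the candidate list drastically compared with traversing all 37 modes, which is where the encoding-time saving comes from.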

# 3. Experimental Results

- **HTM-16.0** with the **intra main configuration** is used.
- GPU is disabled during testing.
- **V/T**: BD-Rate of coded texture views over total bitrate.
- **S/T**: BD-Rate of synthesized views over total bitrate.
- **ΔT**: The time saving.
- **Compared with the original HTM, there is on average 0.31% and 1.22% BD-Rate loss for coded views and synthesized views, respectively**, while the encoding time is reduced by **39.56% on average**.
- The proposed method **outperforms [11]**, which is a 2015 ICASSP paper.

## Reference

[2020 VCIP] [Liu VCIP’20] Fast Intra Coding Algorithm for Depth Map with End-to-End Edge Detection Network

## Codec Intra Prediction

**JPEG** [MS-ROI] [Baig JVICU’17]**JPEG-HDR** [Han VCIP’20]**HEVC **[Xu VCIP’17] [Song VCIP’17] [Li VCIP’17] [Puri EUSIPCO’17] [IPCNN] [IPFCN] [HybridNN, Li ICIP’18] [Liu MMM’18] [CNNAC] [Li TCSVT’18] [Spatial RNN] [PS-RNN] [AP-CNN] [MIP] [Wang VCIP’19] [IntraNN] [CNNAC TCSVT’19] [CNN-CR] [CNNMC Yokoyama ICCE’20] [PNNS] [CNNCP] [Zhu TMM’20] [Sun VCIP’20] [DLT] [Zhong ELECGJ’21]**3D-HEVC** [Liu VCIP’20]**VVC** [CNNIF & CNNMC] [Brand PCS’19] [Bonnineau ICASSP’20] [Santamaria ICMEW’20] [Zhu TMM’20]