Brief Review — Channel-UNet: A Spatial Channel-Wise Convolutional Neural Network for Liver and Tumors Segmentation

Channel-UNet, U-Net With Spatial Channel-Wise Convolution

Sik-Ho Tsang
4 min read · Jan 3, 2023
Example of over-segmentation and under-segmentation in liver segmentation. Channel-UNet gives visually better results.

Channel-UNet: A Spatial Channel-Wise Convolutional Neural Network for Liver and Tumors Segmentation,
Channel-UNet, by Chinese Academy of Sciences, Peng Cheng Laboratory, Shenzhen Institutes of Advanced Technology, Wuhan University, and The Chinese University of Hong Kong
2019 J. Frontiers in Genetics, Over 60 Citations (Sik-Ho Tsang @ Medium)
Medical Imaging, Medical Image Analysis, Image Segmentation, U-Net

  • Channel-UNet is proposed. It keeps U-Net as the main structure of the network and adds spatial channel-wise convolution.

Outline

  1. Channel-UNet
  2. Results

1. Channel-UNet

1.1. Overall Architecture

Network architecture of Channel-UNet.
  • Top: The backbone of the proposed Channel-UNet is U-Net, with spatial channel-wise convolution added.
  • Bottom: Each sub-module consists of two branch channels.
  • Branch 1 is composed of multiple convolutional layers in series.
  • Branch 2 is composed of multiple convolutional layers followed by a spatial channel-wise convolution layer, in series. The stacked convolutional layers extend the receptive field of the spatial channel-wise convolution layer.
  • The two branches are eventually concatenated.
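The two-branch sub-module can be sketched in a few lines of NumPy. This is only a minimal illustration under my own assumptions, not the authors' implementation: the 3×3 kernels, the ReLU activations, the two layers per branch, the channel counts, and the helper names (`conv3x3`, `spatial_mix`, `sub_module`) are all hypothetical, and the spatial channel-wise layer is modeled as a linear map over the H·W pixel positions of each feature map.

```python
import numpy as np

def conv3x3(x, weight):
    """Zero-padded 3x3 convolution (cross-correlation), stride 1.
    x: (C_in, H, W); weight: (C_out, C_in, 3, 3). Kernel size is an assumption."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            window = padded[:, dy:dy + h, dx:dx + w]
            out += np.einsum("oc,chw->ohw", weight[:, :, dy, dx], window)
    return out

def spatial_mix(x, kernels):
    """Assumed stand-in for the spatial channel-wise layer: every output
    position is a weighted sum of all H*W input positions of the same map."""
    c, h, w = x.shape
    return (x.reshape(c, h * w) @ kernels.T).reshape(c, h, w)

def sub_module(x, branch1_weights, branch2_weights, spatial_kernels):
    # Branch 1: convolutional layers in series (ReLU is an assumption).
    b1 = x
    for wt in branch1_weights:
        b1 = np.maximum(conv3x3(b1, wt), 0.0)
    # Branch 2: convolutional layers, then the spatial channel-wise layer.
    b2 = x
    for wt in branch2_weights:
        b2 = np.maximum(conv3x3(b2, wt), 0.0)
    b2 = spatial_mix(b2, spatial_kernels)
    # The two branches are concatenated along the channel axis.
    return np.concatenate([b1, b2], axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))  # 3 channels, 8x8 feature maps
branch1 = [rng.standard_normal((4, 3, 3, 3)), rng.standard_normal((4, 4, 3, 3))]
branch2 = [rng.standard_normal((4, 3, 3, 3)), rng.standard_normal((4, 4, 3, 3))]
spatial_kernels = rng.standard_normal((64, 64))  # 8*8 = 64 positions
y = sub_module(x, branch1, branch2, spatial_kernels)
print(y.shape)  # (8, 8, 8): 4 channels from each branch
```

Concatenating along the channel axis keeps both the purely convolutional features and the spatially mixed features available to the next layer.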

1.2. Spatial Channel-Wise Convolution

The difference between convolution and spatial channel-wise convolution. (A) Convolution (B) Spatial channel-wise convolution.
  • (A) Traditional convolution uses 1×1×N convolution kernels, where N is the number of kernels and equals the number of output images.
  • (B) Spatial channel-wise convolution uses 1×1×32² convolutional kernels, applied here to three 32×32 input images.
Example of learning mapping relationship between two pixels
  • Example: Nine different 1×1 convolution kernels are applied to a 3×3 image. The values at the upper-left and lower-right corners are 1 and the values at all other locations are 0, so the mapping relationship between the upper-left and lower-right pixels can be learned.
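The 3×3 example can be reproduced numerically. The sketch below is one possible reading of the description, not code from the paper. It assumes that each pixel position is treated as a channel, that the nine 1×1 kernels form a 9×9 weight matrix, and that the 1s sit in the kernel that produces the lower-right output pixel. The function name `spatial_channel_wise_conv` and the identity weights used for the other eight kernels are illustrative choices.

```python
import numpy as np

def spatial_channel_wise_conv(x, kernels):
    """x: (C, H, W) feature maps; kernels: (H*W, H*W), row k is the k-th
    1x1 kernel spanning all H*W pixel positions (assumed layout)."""
    c, h, w = x.shape
    positions_as_channels = x.reshape(c, h * w)  # each pixel becomes a channel
    mixed = positions_as_channels @ kernels.T    # 1x1 conv over those channels
    return mixed.reshape(c, h, w)

# A single 3x3 image with pixel values 1..9.
image = np.arange(1.0, 10.0).reshape(1, 3, 3)

# Nine 1x1 kernels. Eight of them copy their own pixel (identity rows).
kernels = np.eye(9)
# The kernel for the lower-right output pixel (index 8) has weight 1 on the
# upper-left pixel (index 0) and on the lower-right pixel (index 8), and 0
# everywhere else.
kernels[8] = 0.0
kernels[8, 0] = 1.0
kernels[8, 8] = 1.0

out = spatial_channel_wise_conv(image, kernels)
print(out[0])
# The lower-right output is 1 + 9 = 10: it now depends on the upper-left pixel.
```

An ordinary 3×3 convolution could not link these two pixels in a single layer, because they are never covered by the same small window at the same output location. For the 32×32 maps mentioned above, the same layout would give 32² = 1024 position channels.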

2. Results

  • The 3Dircadb dataset is used.

2.1. Ablation Study

The performance of Channel-UNet with different number of convolutional layers.
  • As the number of convolutional layers stacked in front of the spatial channel-wise convolution layer increases, the Dice value first increases and then decreases.

When the convolutional layer number is 3, the Dice value is the highest.
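For reference, the Dice value measures the overlap between a predicted mask and the ground-truth mask as 2|A∩B| / (|A| + |B|). Below is a minimal sketch for binary masks. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are my assumptions, and the paper's exact evaluation protocol (for example per-case versus global averaging) is not covered here.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total

pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
dice = dice_coefficient(pred, target)
print(round(dice, 4))  # 2 overlapping pixels, 3 + 3 foreground pixels -> 0.6667
```

A value of 1 means perfect overlap and 0 means no overlap.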

Segmentation results by ablation study of our method on the test dataset.

The improvement of segmentation accuracy indicates that the information between the pixels on (x, y)-plane extracted by spatial channel-wise convolution is helpful to the recognition of tumors and liver.

2.2. SOTA Comparisons

Comparison of liver segmentation results.
Comparison of tumor segmentation results.

The Dice values of liver and tumor segmentation by Channel-UNet are 0.984 and 0.940, respectively, outperforming the current best method, H-DenseUNet (Li et al., 2018b).

2.3. Qualitative Results

Liver segmentation results by ablation study on validation dataset.
Tumor segmentation results by ablation study on validation dataset.

Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.