Review — Half-UNet: A Simplified U-Net Architecture for Medical Image Segmentation

Half-UNet, Using the Ghost Module From GhostNet

Sik-Ho Tsang
4 min read · Mar 23, 2023

Half-UNet: A Simplified U-Net Architecture for Medical Image Segmentation,
Half-UNet, by South-Central Minzu University, and Hubei Provincial Engineering Research Center for Intelligent Management of Manufacturing Enterprises,
2022 Front. Neuroinform., Over 5 Citations (Sik-Ho Tsang @ Medium)
Medical Imaging, Medical Image Analysis, Image Segmentation, U-Net

Biomedical Image Segmentation
2015 … 2021
[Expanded U-Net] [3-D RU-Net] [nnU-Net] [TransUNet] [CoTr] [TransBTS] [Swin-Unet] [Swin UNETR] [RCU-Net] [IBA-U-Net] [PRDNet] [Up-Net] 2022 [UNETR]
==== My Other Paper Readings Are Also Over Here ====

  • Yesterday, I talked about GhostNet. Today, Half-UNet is reviewed, in which both the encoder and decoder are simplified, also using the Ghost module from GhostNet.
  • The redesigned architecture includes the unification of channel numbers, full-scale feature fusion, and Ghost modules (GhostNet).

Outline

  1. Motivations
  2. Half-UNet
  3. Results

1. Motivations

Illustrations of different types of encoders: the structures of encoders (A)–(C) are derived from U-Net’s encoder, decoder, and full structure, respectively.
Experimental results of different kinds of encoders.
  • U-Net’s encoder and its decoder are each treated as an encoder. The features from C1 to C16 are then aggregated by a single decoder, whose structure is the same as the full-scale feature aggregation in UNet 3+.
  • Encoder (A) achieves performance comparable to encoder (C), while performance drops noticeably with encoder (B).

The U-Net’s decoder can be simplified to reduce the complexity.

2. Half-UNet

2.1. Unify the Channel Numbers

Illustration of how to construct the full-scale aggregated feature map in the third decoder layer of UNet 3+.
  • In each downsampling step of U-Net and UNet 3+, the number of feature channels is doubled, which enhances the diversity of feature expression. However, this increases the complexity of the model, especially in UNet 3+.

In Half-UNet, on the other hand, the channel numbers of all feature maps are unified, which reduces the number of filters in the convolution operation.
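To make this concrete, here is a minimal PyTorch sketch (my own illustration, not the authors’ code; the single-channel input and the unified width of 64 follow the paper’s description): the only difference between the two encoders is whether the per-stage channel count doubles or stays fixed.

```python
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 conv + BN + ReLU, the usual U-Net-style block
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

# U-Net-style encoder stages: channels double at every scale
unet_stages = [double_conv(c_in, c_out)
               for c_in, c_out in [(1, 64), (64, 128), (128, 256)]]

# Half-UNet-style encoder stages: every scale keeps the same 64 channels,
# so each stage needs far fewer convolution filters
half_unet_stages = [double_conv(1 if i == 0 else 64, 64) for i in range(3)]
```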

2.2. Full-Scale Feature Fusion

The architecture of Half-UNet.
  • Both U-Net and UNet 3+ use concatenation for feature fusion, which incurs more memory overhead and computation.
  • The addition operation, in contrast, requires no extra parameters and adds negligible computation.

Feature maps from different scales are first upsampled to the size of the original image, and then feature fusion is performed through the addition operation.
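A minimal sketch of this fusion step (the helper name full_scale_fusion is mine; bilinear upsampling is assumed, consistent with the further-study discussion below):

```python
import torch
import torch.nn.functional as F

def full_scale_fusion(features):
    # Upsample every scale's feature map to the finest (original) resolution,
    # then fuse by element-wise addition: the channel count stays unchanged
    # and no parameters are introduced, unlike concatenation.
    full_size = features[0].shape[-2:]
    upsampled = [F.interpolate(f, size=full_size, mode='bilinear',
                               align_corners=False) for f in features]
    return sum(upsampled)

# e.g. five scales, all with the unified 64 channels
feats = [torch.randn(1, 64, 256 // 2**i, 256 // 2**i) for i in range(5)]
fused = full_scale_fusion(feats)  # -> shape (1, 64, 256, 256)
```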

2.3. Ghost Module

Ghost module.
  • The Ghost module, as in GhostNet, is used to generate more feature maps with cheap operations.
  • s = 2 is used, where s is the reciprocal of the proportion of intrinsic feature maps.
  • Half of the feature maps are generated by an ordinary convolution, and the other half by a cheap depthwise convolution applied to the intrinsic maps.
  • Finally, the two halves of the feature maps are concatenated to form the output.

The Ghost module is used in Half-UNet to reduce the required parameters and FLOPs compared to standard convolution.
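Below is a small PyTorch re-implementation of a Ghost module with s = 2, following the GhostNet design (this is my own sketch, not the authors’ code; details such as BatchNorm + ReLU after each convolution are assumptions):

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Ghost module with s = 2: half of the output channels come from an
    ordinary convolution (the intrinsic maps), the other half from a cheap
    depthwise convolution applied to those intrinsic maps."""
    def __init__(self, in_ch, out_ch, kernel_size=3, dw_size=3):
        super().__init__()
        init_ch = out_ch // 2  # s = 2 -> half the maps are intrinsic
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, init_ch, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(init_ch),
            nn.ReLU(inplace=True),
        )
        self.cheap = nn.Sequential(
            nn.Conv2d(init_ch, init_ch, dw_size, padding=dw_size // 2,
                      groups=init_ch, bias=False),  # depthwise: cheap
            nn.BatchNorm2d(init_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        intrinsic = self.primary(x)
        ghost = self.cheap(intrinsic)
        return torch.cat([intrinsic, ghost], dim=1)  # concat back to out_ch

y = GhostModule(64, 64)(torch.randn(1, 64, 128, 128))  # (1, 64, 128, 128)
```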

3. Results

3.1. Datasets

The medical image segmentation datasets.
  • Three datasets are used for the experiments: mammography, lung nodule, and left ventricular MRI images.

3.2. Quantitative Results

Comparison of U-Net and its variants and the proposed Half-UNet on three datasets.
  • Half-UNet†: Half-UNet with the Ghost modules removed.
  • Half-UNet† outperforms U-Net and its variants on mammography images, and comes closer to them than Half-UNet does on lung nodule images. Yet, Half-UNet† performs worse than Half-UNet on left ventricular MRI images.

Half-UNet (with and without Ghost modules) achieves segmentation accuracy similar to U-Net and its variants, while the parameters and FLOPs are reduced by 98.6% and 81.8%, respectively.

  • The channel numbers of Half-UNet∗†_u and Half-UNet∗†_d are doubled after downsampling.
  • There are two strategies for feature fusion in the decoder, both sketched below: (1) Upsampling2D + 3×3 convolution, as in Half-UNet∗†_u and UNet 3+; (2) deconvolution, as in Half-UNet∗†_d and U-Net.

Half-UNet∗†_u and Half-UNet∗†_d increase the required FLOPs and parameters, respectively, compared with Half-UNet†.
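The two strategies correspond roughly to the following PyTorch layers (a sketch at the unified 64-channel width; in the actual ∗† variants the channel counts also double after downsampling):

```python
import torch.nn as nn

# Strategy (1): parameter-free upsampling followed by a 3x3 convolution,
# as in Half-UNet∗†_u and UNet 3+ — the full-resolution conv raises FLOPs
up_conv = nn.Sequential(
    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
    nn.Conv2d(64, 64, 3, padding=1),
)

# Strategy (2): learned deconvolution, as in Half-UNet∗†_d and U-Net —
# the transposed convolution carries its own weights, raising parameters
deconv = nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2)
```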

3.3. Qualitative Results

Qualitative comparison between Half-UNet, U-Net, and UNet 3+ in left ventricular MRI.

Half-UNet can segment endocardial and epicardial boundaries more completely.

3.4. Further Study

The architecture of the same part of UNet 3+ and U-Net.
Three sub-networks.
  • In the left part of the Half-UNet sub-network, since bilinear upsampling and addition are both linear operations, almost no parameters or computation are introduced.
  • In the right part of the Half-UNet sub-network, due to the lower number of input channels (only 64) and the use of the Ghost module, the cost of convolution is significantly smaller than in other structures.

Half-UNet avoids the problems of the above three networks, significantly reducing the required parameters and FLOPs.
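A back-of-the-envelope count at the 64-channel width illustrates the saving from the Ghost module (my own arithmetic, ignoring BatchNorm and biases, not a figure from the paper):

```python
# Standard 3x3 convolution, 64 -> 64 channels
conv_params = 64 * 64 * 3 * 3                    # 36,864 weights

# Ghost module with s = 2: 3x3 conv to 32 intrinsic maps,
# plus a depthwise 3x3 on those 32 maps
ghost_params = 64 * 32 * 3 * 3 + 32 * 3 * 3      # 18,720 weights

print(conv_params, ghost_params)                 # roughly half the weights
```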
