Review — SD-UNet: Stripping down U-Net for Segmentation of Biomedical Images on Platforms with Low Computational Budgets

SD-UNet, Modified U-Net Using WS & GN

Feb 5, 2023

SD-UNet: Stripping down U-Net for Segmentation of Biomedical Images on Platforms with Low Computational Budgets,
SD-UNet, by Beijing University of Technology, and Guilin University of Electronic Technology,
2020 MDPI J. Diagnostics, Over 60 Citations (Sik-Ho Tsang @ Medium)
Medical Imaging, Medical Image Analysis, Image Segmentation, U-Net

An extremely fast, small, and computationally efficient model, Stripped-Down UNet (SD-UNet), is proposed. It uses depthwise separable convolution, as in MobileNetV1, together with Weight Standardization (WS) and Group Normalization (GN).

Outline

1. SD-UNet
2. Results

1. SD-UNet

The SD-UNet block is the basic unit used to construct the SD-UNet model.

SD-UNet follows a similar architecture to U-Net, with a few modifications. Except for the first convolution layer, which is a standard convolution, all other convolution layers are depthwise separable convolution layers, as in MobileNetV1.
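To see why swapping standard convolutions for depthwise separable ones shrinks the model, the weight counts of the two layer types can be compared directly. The following is a minimal sketch; the channel sizes (64 in, 128 out) are hypothetical values chosen for illustration, not taken from the paper, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k=3):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out = 64, 128
std = standard_conv_params(c_in, c_out)
sep = separable_conv_params(c_in, c_out)
print(std, sep, round(std / sep, 2))  # 73728 8768 8.41
```

For a 3 × 3 kernel the per-layer saving approaches 9× as the number of output channels grows, which is in the same range as the overall reduction reported in the results.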

1.1. Encoding Path

The encoding path is made up of 5 blocks:
Block1: A standard convolution layer, a ReLU activation function, and a GN layer.
Block2 and Block3: One SD-UNet block and a max-pooling layer. An SD-UNet block is made up of two weight-standardized (WS) depthwise separable convolution layers, two ReLU activation layers, and one GN layer.
Block4: One SD-UNet block, a dropout layer to introduce regularization, and a max-pooling layer.
All depthwise (3 × 3) convolution layers are weight standardized (WS).
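The two normalization ideas used in the blocks above can be sketched in plain Python. WS standardizes the weights of each filter, while GN normalizes the activations of one sample over groups of channels. This is an illustrative sketch only: the flat-list data layout, the `eps` value, and the function names are assumptions, and the learnable scale and shift of a full GN layer are left out.

```python
import math


def weight_standardize(filters, eps=1e-5):
    """Standardize each filter (a flat list of weights) to zero mean, unit variance."""
    out = []
    for w in filters:
        mean = sum(w) / len(w)
        var = sum((x - mean) ** 2 for x in w) / len(w)
        out.append([(x - mean) / math.sqrt(var + eps) for x in w])
    return out


def group_norm(channels, num_groups, eps=1e-5):
    """Normalize one sample's activations over groups of channels.

    `channels` is a list of channels, each a flat list of activations.
    """
    if len(channels) % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    size = len(channels) // num_groups
    out = []
    for g in range(num_groups):
        group = channels[g * size:(g + 1) * size]
        values = [x for ch in group for x in ch]
        mean = sum(values) / len(values)
        var = sum((x - mean) ** 2 for x in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        out.extend([[(x - mean) * scale for x in ch] for ch in group])
    return out


# Toy inputs, chosen only for illustration.
ws = weight_standardize([[1.0, 2.0, 3.0, 4.0]])
gn = group_norm([[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]], num_groups=2)
```

Because GN computes its statistics per sample rather than per batch, it does not depend on the batch size, which suits the small batches typical of low-budget platforms.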

1.2. Decoding Path

The decoding path of SD-UNet is made up of a mixture of depthwise separable convolutions and SD-UNet blocks. Upsampling by a factor of 2 is performed on the decoding path in order to recover the size of the segmentation map.
It also consists of 5 blocks:
Block1: A depthwise separable convolution layer with its features concatenated with the dropout layer from Block4 of the encoding path.
Block2, Block3, and Block4: An SD-UNet block and a depthwise separable layer concatenated with corresponding blocks from the encoding path.
Block5: Two SD-UNet blocks and two depthwise separable layers, with the last one as the final prediction layer.
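The two operations that tie the decoding path to the encoding path, upsampling by 2 and concatenation with encoder features, can be illustrated on tiny feature maps. This is a sketch under assumptions: nearest-neighbour interpolation is used here for simplicity (the interpolation mode is not stated above), and `upsample2x` and `concat_channels` are hypothetical helper names.

```python
def upsample2x(fmap):
    """Nearest-neighbour upsampling of a 2D feature map (list of rows) by 2."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def concat_channels(decoder_channels, encoder_channels):
    """Skip connection: stack decoder and encoder channels of equal spatial size."""
    h, w = len(decoder_channels[0]), len(decoder_channels[0][0])
    for ch in encoder_channels:
        if len(ch) != h or len(ch[0]) != w:
            raise ValueError("spatial sizes must match before concatenation")
    return decoder_channels + encoder_channels


low_res = [[1, 2], [3, 4]]
up = upsample2x(low_res)
# up == [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]

encoder_feature = [[0] * 4 for _ in range(4)]     # 4 x 4 map from the encoder
merged = concat_channels([up], [encoder_feature])
print(len(merged))  # 2 channels after concatenation
```

Each max-pooling step in the encoder halves the spatial size, so each upsampling step in the decoder has to double it before the matching encoder features can be concatenated.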

1.3. Dataset

**Sample magnetic resonance imaging (MRI) images and their ground truth labels from the Brain Tumor Segmentation (BRATs) dataset.**

The datasets used to evaluate the performance of SD-UNet are the ISBI challenge dataset for the segmentation of neuronal structures in electron microscopic (EM) stacks and the MSD challenge brain tumor segmentation (BRATs) dataset.
The loss used in training on the EM stacks dataset was binary cross-entropy. On the BRATs dataset, the loss was a weighted sum of negative Dice loss and binary cross-entropy loss.
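A weighted sum of negative Dice and binary cross-entropy might look like the sketch below. The weights `w_dice` and `w_bce`, the `smooth` term, and the clipping value `eps` are assumptions made for illustration; the values actually used for training are not given above.

```python
import math


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flattened pixels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


def dice_coefficient(pred, target, smooth=1.0):
    """Soft Dice coefficient; `smooth` avoids division by zero."""
    inter = sum(p * t for p, t in zip(pred, target))
    return (2.0 * inter + smooth) / (sum(pred) + sum(target) + smooth)


def combined_loss(pred, target, w_dice=1.0, w_bce=1.0):
    """Weighted sum of negative Dice and BCE (weights are hypothetical)."""
    return (w_dice * -dice_coefficient(pred, target)
            + w_bce * binary_cross_entropy(pred, target))


target = [1, 0, 1, 0]
good = combined_loss([1.0, 0.0, 1.0, 0.0], target)
bad = combined_loss([0.1, 0.9, 0.1, 0.9], target)
print(good < bad)  # True
```

With equal weights, a perfect prediction drives the loss toward -1, since the Dice term reaches 1 and the cross-entropy term approaches 0.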

2. Results

2.1. Computational Comparison

**Computational comparison of SD-UNet and other models.**

SD-UNet requires approximately 8× fewer FLOPs than U-Net. Its inference is approximately 81 milliseconds faster than U-Net's, and the model is 23× smaller.

U-Net (depthwise + GN=32) achieves the fastest inference on a single test image, at 87 milliseconds, but is still 3× the size of SD-UNet.

2.2. ISBI

**Sample Segmentation on electron microscopy dataset.**

**Comparison of results on the ISBI challenge dataset.**

SD-UNet achieves comparable performance in terms of accuracy, mean IoU, and Dice coefficient, while being more computationally efficient than the original U-Net model.
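For reference, the three metrics named above can be computed for binary masks as in the following sketch. It assumes flattened 0/1 masks and takes mean IoU as the average of the foreground and background IoU; the exact definitions used in the paper's evaluation are not given here, so treat these as common conventions rather than the authors' code.

```python
def accuracy(pred, target):
    """Fraction of pixels labeled correctly."""
    return sum(1 for p, t in zip(pred, target) if p == t) / len(pred)


def dice(pred, target):
    """Dice coefficient for binary 0/1 masks."""
    inter = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    total = sum(pred) + sum(target)
    return 2.0 * inter / total if total else 1.0


def iou(pred, target, cls):
    """Intersection over union for one class label."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    return inter / union if union else 1.0


def mean_iou(pred, target):
    """Average IoU over background (0) and foreground (1)."""
    return (iou(pred, target, 0) + iou(pred, target, 1)) / 2.0


# Toy masks, chosen only for illustration.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(round(accuracy(pred, target), 3),
      round(mean_iou(pred, target), 3),
      round(dice(pred, target), 3))  # 0.667 0.5 0.667
```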

2.3. BRATs

**SD-UNet shows a faster convergence and improved loss during training.**

WS & GN significantly improve the training loss and yield a smoother curve.

**Sample Segmentation results on sample images from our test split.**

**Performance comparison between U-Net and SD-UNet.**

SD-UNet achieves comparable performance with U-Net on large tumor segmentations, and it significantly outperforms U-Net on smaller tumor segmentations.