Review — SD-UNet: Stripping down U-Net for Segmentation of Biomedical Images on Platforms with Low Computational Budgets
SD-UNet: Stripping down U-Net for Segmentation of Biomedical Images on Platforms with Low Computational Budgets, SD-UNet, by Beijing University of Technology and Guilin University of Electronic Technology, 2020 MDPI J. Diagnostics, Over 60 Citations (Sik-Ho Tsang @ Medium)
Medical Imaging, Medical Image Analysis, Image Segmentation, U-Net
- The SD-UNet block, described in Section 1.1, is the building unit of the SD-UNet model.
- SD-UNet follows an architecture similar to that of U-Net, with a few modifications. Except for the first convolution layer, which is a standard convolution, all convolution layers are depthwise separable convolutions, as in MobileNetV1.
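To see why swapping standard convolutions for depthwise separable ones shrinks the model, the weight counts of the two layer types can be compared directly. The sketch below is only an illustration: the layer sizes (3 × 3 kernel, 64 input channels, 128 output channels) are assumed for the example and are not taken from the paper, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes, not taken from the paper.
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
print(std, sep, round(std / sep, 2))  # 73728 8768 8.41
```

For this assumed layer, the separable version needs roughly 8 times fewer weights, which is the kind of saving MobileNetV1-style layers are designed to give.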
1.1. Encoding Path
- The encoding path is made up of 5 blocks:
- Block1: A standard convolution layer, a ReLU activation function, and a group normalization (GN) layer.
- Block2 and Block3: One SD-UNet block and a max-pooling layer. An SD-UNet block is made up of two weight-standardized (WS) depthwise separable convolution layers, two ReLU activation layers, and one GN layer.
- Block4: One SD-UNet block, a dropout layer to introduce regularization, and a max-pooling layer.
- All depthwise (3 × 3) convolution layers are weight standardized (WS).
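The two normalization ingredients of the SD-UNet block, weight standardization and group normalization, can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function names, the `(filters, ...)` weight layout, the `(N, C, H, W)` activation layout, and the omission of GN's learnable scale and shift are all assumptions made for the example.

```python
import numpy as np


def weight_standardize(w, eps=1e-5):
    """Standardize each filter to zero mean and unit variance.

    Assumes w has shape (filters, ...); statistics are computed over
    every axis except the first.
    """
    axes = tuple(range(1, w.ndim))
    mean = w.mean(axis=axes, keepdims=True)
    std = w.std(axis=axes, keepdims=True)
    return (w - mean) / (std + eps)


def group_norm(x, num_groups, eps=1e-5):
    """Group normalization for x of shape (N, C, H, W).

    The learnable scale and shift are left out for brevity.
    """
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    g = (g - mean) / np.sqrt(var + eps)
    return g.reshape(n, c, h, w)


rng = np.random.default_rng(0)
dw = rng.normal(2.0, 3.0, size=(64, 3, 3))  # 64 hypothetical depthwise 3 x 3 filters
dw_ws = weight_standardize(dw)              # applied to the weights, not the activations
x = rng.normal(size=(2, 64, 16, 16))        # hypothetical feature map
y = group_norm(x, num_groups=32)
print(dw_ws.shape, y.shape)  # (64, 3, 3) (2, 64, 16, 16)
```

Weight standardization acts on the convolution kernels while group normalization acts on the activations, so the two can be combined in the same block without depending on batch size.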
1.2. Decoding Path
- The decoding path of SD-UNet is made up of a mixture of depthwise separable convolutions and SD-UNet blocks. Upsampling by a factor of 2 is performed along the decoding path to recover the size of the segmentation map.
- It also consists of 5 blocks:
- Block1: A depthwise separable convolution layer whose features are concatenated with the output of the dropout layer from Block4 of the encoding path.
- Blocks 2, 3, 4: An SD-UNet block and a depthwise separable layer whose features are concatenated with those of the corresponding blocks from the encoding path.
- Block 5: Two SD-UNet blocks and two depthwise separable layers with the last one as the final prediction layer.
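The upsample-then-concatenate step used along the decoding path can be traced with plain NumPy arrays. The sketch below assumes nearest-neighbour upsampling and a channels-first `(N, C, H, W)` layout, and it skips the depthwise separable convolution that sits between the two operations; the helper names and tensor sizes are hypothetical.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour upsampling by 2 for x of shape (N, C, H, W)."""
    return x.repeat(2, axis=2).repeat(2, axis=3)


def merge_skip(decoder_feat, encoder_feat):
    """Upsample decoder features and concatenate the encoder skip features."""
    up = upsample2x(decoder_feat)
    assert up.shape[2:] == encoder_feat.shape[2:], "spatial sizes must match"
    return np.concatenate([up, encoder_feat], axis=1)  # stack along channels


deep = np.zeros((1, 256, 32, 32))  # hypothetical decoder features
skip = np.zeros((1, 128, 64, 64))  # hypothetical encoder features
merged = merge_skip(deep, skip)
print(merged.shape)  # (1, 384, 64, 64)
```

Each such step doubles the spatial resolution, so repeating it along the decoding path undoes the max-pooling steps of the encoding path.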
- The datasets used to evaluate the performance of SD-UNet are the ISBI challenge dataset for the segmentation of neuronal structures in electron microscopy (EM) stacks and the Medical Segmentation Decathlon (MSD) challenge brain tumor segmentation (BRATs) dataset.
- The loss used in training on the EM stacks dataset was based on binary cross-entropy. On the BRATs dataset, the loss was a weighted sum of negative Dice loss and binary cross-entropy loss.
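A weighted sum of negative Dice and binary cross-entropy can be written down as below. This is a sketch under stated assumptions: the weights `w_dice` and `w_bce`, the smoothing term, and the clipping constant are placeholders chosen for the example, since the review does not give the values used in the paper.

```python
import numpy as np


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    p = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))


def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Soft Dice coefficient on probabilities; smooth avoids division by zero."""
    inter = np.sum(y_true * y_pred)
    return float((2.0 * inter + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth))


def combined_loss(y_true, y_pred, w_dice=1.0, w_bce=1.0):
    """Weighted sum of negative Dice and BCE (weights are assumed values)."""
    neg_dice = -dice_coefficient(y_true, y_pred)
    return w_dice * neg_dice + w_bce * binary_cross_entropy(y_true, y_pred)


mask = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 0.0]])
good = np.array([[0.9, 0.8, 0.1], [0.2, 0.7, 0.1]])
bad = np.array([[0.2, 0.3, 0.8], [0.9, 0.1, 0.7]])
print(combined_loss(mask, good) < combined_loss(mask, bad))  # True
```

With equal weights, a perfect prediction drives this loss towards -1, because the Dice term reaches -1 while the cross-entropy term approaches 0.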
2.1. Computational Comparison
- U-Net (depthwise + GN=32) achieves the fastest inference on a single test image, at 87 milliseconds, but is still 3× the size of SD-UNet.
SD-UNet achieves comparable performance in terms of accuracy, mean IoU, and Dice coefficient, while being more computationally efficient than the original U-Net model.
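For reference, the two overlap metrics can be computed on a thresholded prediction as in the sketch below. It is an illustration only: it scores the foreground class alone (mean IoU usually averages over classes), and the 0.5 threshold and the toy arrays are assumptions made for the example.

```python
import numpy as np


def iou_and_dice(y_true, y_pred, threshold=0.5):
    """Foreground IoU and Dice for a binary mask and a probability map."""
    t = y_true.astype(bool)
    p = y_pred >= threshold
    inter = np.logical_and(t, p).sum()
    union = np.logical_or(t, p).sum()
    total = t.sum() + p.sum()
    iou = inter / union if union else 1.0      # both masks empty: perfect match
    dice = 2 * inter / total if total else 1.0
    return float(iou), float(dice)


truth = np.array([[1, 1, 0, 0], [1, 1, 0, 0]])
pred = np.array([[0.9, 0.8, 0.7, 0.1], [0.6, 0.2, 0.1, 0.0]])
iou, dice = iou_and_dice(truth, pred)
print(iou, dice)  # 0.6 0.75
```

The two scores are linked by Dice = 2 · IoU / (1 + IoU), so Dice is never lower than IoU for the same prediction.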
SD-UNet performs significantly better than U-Net on smaller tumors.
Cases where the Dice score on test images falls below 75.0 are also shown.