Review: UNet++ — A Nested U-Net Architecture (Biomedical Image Segmentation)
UNet++ improves on U-Net in three ways:
- 1) Convolution layers on the skip pathways, which bridge the semantic gap between encoder and decoder feature maps.
- 2) Dense skip connections on the skip pathways, which improve gradient flow.
- 3) Deep supervision, which enables model pruning and improves performance or, in the worst case, matches a model that uses only one loss layer.
This is a 2018 DLMIA paper with more than 40 citations. (Sik-Ho Tsang @ Medium)
- UNet++ Architecture
- Re-designed Skip Pathways
- Deep Supervision
- Experimental Results
1. UNet++ Architecture
- UNet++ starts with an encoder sub-network or backbone followed by a decoder sub-network.
- There are re-designed skip pathways (green and blue) that connect the two sub-networks and the use of deep supervision (red).
2. Re-designed Skip Pathways
- The figure above shows an example of how the feature maps travel through the top skip pathway of UNet++.
- As another example, consider the skip pathway between nodes X^{0,0} and X^{1,3}, as shown in the first figure. This skip pathway consists of a dense convolution block with three convolution layers.
- Each convolution layer is preceded by a concatenation layer that fuses the output from the previous convolution layer of the same dense block with the corresponding up-sampled output of the lower dense block.
- Formally, the feature map x^{i,j} produced by node X^{i,j} is computed as:

x^{i,j} = H( x^{i-1,j} ),  if j = 0
x^{i,j} = H( [ x^{i,0}, ..., x^{i,j-1}, U(x^{i+1,j-1}) ] ),  if j > 0

- where i indexes the down-sampling layer along the encoder, j indexes the convolution layer along the skip pathway, H(·) is a convolution operation followed by an activation function, U(·) denotes an up-sampling layer, and [ ] denotes the concatenation layer.
- This is the idea from DenseNet.
The main idea is to bridge the semantic gap between the feature maps of the encoder and decoder prior to fusion.
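The fusion at one skip-pathway node, e.g. x^{1,1} = H([x^{1,0}, U(x^{2,0})]), can be sketched in NumPy. This is a minimal toy sketch, not the paper's implementation: `upsample2x` stands in for U(·) as nearest-neighbor up-sampling, and `conv1x1_relu` stands in for H(·) as a 1×1 convolution plus ReLU; all shapes and weights are illustrative.

```python
import numpy as np

def upsample2x(x):
    """Stand-in for U(.): nearest-neighbor up-sampling, (C, H, W) -> (C, 2H, 2W)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv1x1_relu(x, w):
    """Stand-in for H(.): a 1x1 convolution followed by ReLU.
    x: (C_in, H, W), w: (C_out, C_in)."""
    out = np.tensordot(w, x, axes=([1], [0]))  # mixes channels -> (C_out, H, W)
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
x10 = rng.standard_normal((8, 16, 16))  # output of node X^{1,0} (same level)
x20 = rng.standard_normal((8, 8, 8))    # output of node X^{2,0} (one level lower)
w = rng.standard_normal((8, 16))        # illustrative 1x1 conv weights

# x^{1,1} = H([x^{1,0}, U(x^{2,0})]): concatenate along channels, then convolve.
fused = np.concatenate([x10, upsample2x(x20)], axis=0)  # (16, 16, 16)
x11 = conv1x1_relu(fused, w)                            # (8, 16, 16)
```

Later nodes on the same pathway simply concatenate more inputs: x^{1,2} would fuse x^{1,0}, x^{1,1}, and U(x^{2,1}), which is the DenseNet-style connectivity.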
3. Deep Supervision
- With deep supervision, the model can operate in two modes:
- Accurate mode, wherein the outputs from all segmentation branches are averaged.
- Fast mode, wherein the final segmentation map is selected from only one of the segmentation branches; this choice determines the extent of model pruning and the speed gain.
- Owing to the nested skip pathways, UNet++ generates full-resolution feature maps at multiple semantic levels. Thus, the loss is estimated at 4 semantic levels.
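The two inference modes can be illustrated with a toy sketch; the branch outputs here are random placeholders standing in for the four full-resolution probability maps, not real network outputs.

```python
import numpy as np

# Hypothetical outputs of the 4 segmentation branches: each is a
# full-resolution probability map of shape (H, W).
H, W = 4, 4
rng = np.random.default_rng(0)
branch_outputs = [rng.random((H, W)) for _ in range(4)]

# Accurate mode: average the outputs of all segmentation branches.
accurate = np.mean(branch_outputs, axis=0)

# Fast mode: keep a single branch (here branch 3, i.e. UNet++ L3).
# A shallower choice prunes more of the network and runs faster.
fast = branch_outputs[2]
```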
- Also, a combination of binary cross-entropy and the Dice coefficient is used as the loss function:

L(Y, Ŷ) = -(1/N) · Σ_{b=1..N} [ (1/2) · Y_b · log Ŷ_b + (2 · Y_b · Ŷ_b) / (Y_b + Ŷ_b) ]

- where Ŷ_b denotes the flattened predicted probabilities and Y_b the flattened ground truths of the b-th image, and N is the batch size.
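A direct NumPy transcription of this hybrid loss might look as follows. `bce_dice_loss` is an illustrative name, and taking the element-wise mean of the cross-entropy term over each flattened map is an assumption about how that term is reduced.

```python
import numpy as np

def bce_dice_loss(y_true, y_pred, eps=1e-7):
    """Hybrid loss sketch: -(1/N) * sum_b [ (1/2)*BCE term + soft Dice term ].

    y_true: (N, H, W) binary ground-truth masks.
    y_pred: (N, H, W) predicted probabilities in [0, 1].
    """
    N = y_true.shape[0]
    total = 0.0
    for b in range(N):
        Y = y_true[b].ravel()
        Yhat = np.clip(y_pred[b].ravel(), eps, 1.0 - eps)  # avoid log(0)
        bce = 0.5 * np.mean(Y * np.log(Yhat))              # (1/2) * Y_b * log Yhat_b
        dice = 2.0 * (Y * Yhat).sum() / (Y.sum() + Yhat.sum() + eps)
        total += bce + dice
    return -total / N  # lower is better; a perfect prediction approaches -1
```

For a perfect prediction the Dice term approaches 1 and the cross-entropy term approaches 0, so the loss approaches -1; worse predictions yield larger values.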
4. Experimental Results
4.1. Datasets
- Four medical imaging datasets are used for model evaluation, covering lesions/organs from different medical imaging modalities.
4.2. Baseline Models
- Original U-Net and Wide U-Net are compared.
- Wide U-Net is a modified U-Net with more kernels, so that it has a similar number of parameters to UNet++.
4.3. Results
- UNet++ without deep supervision achieves a significant performance gain over both U-Net and Wide U-Net, yielding an average improvement of 2.8 and 3.3 IoU points, respectively.
- UNet++ with deep supervision exhibits an average improvement of 0.6 IoU points over UNet++ without deep supervision.
4.4. Model Pruning
- UNet++ L3 achieves, on average, a 32.2% reduction in inference time while degrading IoU by only 0.6 points.
- More aggressive pruning further reduces the inference time but at the cost of significant accuracy degradation.
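Why pruning saves compute can be sketched with a small node count. Assuming, per the architecture figure, that the branch output x^{0,d} depends on exactly the nodes X^{i,j} with i + j ≤ d (this dependency rule is an inference from the diagram, not a quote from the paper):

```python
def retained_nodes(d, depth=4):
    """Nodes X^{i,j} that must be computed to evaluate pruned model UNet++ L<d>.

    Assumption: producing the branch output x^{0,d} requires exactly the
    nodes X^{i,j} with i + j <= d, where i indexes encoder depth and j the
    position along the skip pathway (full UNet++ has depth 4, i.e. i in 0..4).
    """
    return [(i, j)
            for i in range(depth + 1)
            for j in range(depth + 1 - i)
            if i + j <= d]

for d in range(1, 5):
    print(f"UNet++ L{d}: {len(retained_nodes(d))} nodes retained")
# -> 3, 6, 10, 15 nodes for L1..L4
```

Dropping from L4 to L3 removes the five deepest nodes (15 down to 10), which is consistent with a sizeable inference-time saving at a small IoU cost.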
4.5. Qualitative Results
[2018 DLMIA] [UNet++]
UNet++: A Nested U-Net Architecture for Medical Image Segmentation