Review — Dilated Dense U-Net for Infant Hippocampus Subfield Segmentation
Inserting Dilated Convolution & Dense Connection into U-Net
Dilated Dense U-Net for Infant Hippocampus Subfield Segmentation,
DUnet & ResDUnet, by University of North Carolina at Chapel Hill, Shaoxing University, Chang Gung University College of Medicine, and Korea University
2019 J Front. Neuroinform., Over 30 Citations (Sik-Ho Tsang @ Medium)
Medical Imaging, Medical Image Analysis, Image Segmentation, U-Net
- A new fully convolutional network (FCN), named DUnet, is proposed for infant hippocampal subfield segmentation by embedding a dilated dense network in U-Net. The embedded dilated dense network generates multi-scale features while keeping high spatial resolution.
- Every pair of convolutional layers in DUnet is further grouped with one residual connection, giving the Residual DUnet (ResDUnet).
1. Data
- Two datasets are used: the Baby Connectome Project (BCP) dataset and a publicly available dataset (https://www.nitrc.org/projects/mni-hisub25).
- The first one consists of MRI scans (including T1- and T2-weighted structural MRI, DTI, and rs-fMRI) of 500 typically developing children, ages 0–5 years, acquired over the course of 4 years. In the experiments, 10 infant subjects (6 females, 4 males) are used.
- The imaging protocol for acquiring the T1w and T2w MR images is listed in the table above. Five hippocampal subfields were manually labeled for each subject by the consensus of two neuroradiologists.
- The second one contains 25 adult subjects (31 ± 7 years, 12 males). Each subject has an isotropic 3D-MPRAGE T1-weighted image.
- All T1w and T2w images underwent automated correction for intensity non-uniformity and intensity standardization.
2. Dilated Dense U-Net (DUnet)
2.1. Dilated Convolution
- Left: Small convolutional kernels of size 3 × 3 × 3 reduce the number of parameters, but they also reduce the receptive field compared with larger kernels.
- Middle & Right: Using the dilated convolutions, the feature maps can be computed with a high spatial resolution, and the size of the receptive field can be enlarged arbitrarily, as shown above.
- A dilated convolution samples the input with a spacing of l voxels between neighboring kernel taps, where l is the dilation rate.
- When l=1, the dilated convolution becomes the normal convolution.
- (Please feel free to read DeepLab & DilatedNet for more information about atrous/dilated convolution.)
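To make the role of the dilation rate l concrete, here is a minimal 1D sketch in NumPy. It is only an illustration, not the paper's 3D implementation: the function name `dilated_conv1d`, the toy signal, and the all-ones kernel are my own assumptions. It computes out[x] = sum_i f[x + l*i] * w[i] and shows that l=1 reproduces an ordinary (valid) convolution, written here as a correlation, as deep learning frameworks do.

```python
import numpy as np

def dilated_conv1d(f, w, l=1):
    """Valid 1D dilated convolution: out[x] = sum_i f[x + l*i] * w[i]."""
    f = np.asarray(f, dtype=float)
    w = np.asarray(w, dtype=float)
    span = l * (len(w) - 1) + 1      # extent covered by the dilated kernel
    n_out = len(f) - span + 1        # number of valid output positions
    if n_out <= 0:
        raise ValueError("input is shorter than the dilated kernel")
    out = np.zeros(n_out)
    for i, w_i in enumerate(w):
        # Tap i reads the input shifted by i*l samples.
        out += w_i * f[i * l : i * l + n_out]
    return out

signal = np.arange(10.0)             # toy input (assumption)
kernel = np.ones(3)                  # toy 3-tap kernel (assumption)
dense = dilated_conv1d(signal, kernel, l=1)    # ordinary convolution
dilated = dilated_conv1d(signal, kernel, l=2)  # same 3 weights, wider reach
```

With 3 taps, l=2 covers 5 input samples instead of 3 while still using only 3 weights, which is the trade-off described in the figure.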
2.2. Dilated Dense U-Net (DUnet)
- Dense block, as in DenseNet, is used.
- Dilated convolution, as in DeepLab & DilatedNet, is used with different rates for different layers.
- The dilated dense network is embedded in U-Net to obtain a new network, DUnet.
- The feature maps before the second pooling layer are first input into the dilated dense network.
- Then, the output features of the dilated dense network are concatenated to the corresponding feature maps in the expanding path.
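As a rough illustration of why stacking dilated layers yields multi-scale features, the sketch below computes the per-axis receptive field after each stride-1 layer. The helper name `receptive_fields` and the dilation rates (1, 2, 4) are assumptions for illustration only; the rates actually used in DUnet are not listed in this review.

```python
def receptive_fields(kernel_size, dilation_rates):
    """Per-axis receptive field after each stacked stride-1 conv layer."""
    fields, rf = [], 1
    for rate in dilation_rates:
        # Each layer widens the field by (kernel_size - 1) * rate voxels.
        rf += (kernel_size - 1) * rate
        fields.append(rf)
    return fields

plain = receptive_fields(3, [1, 1, 1])    # three ordinary 3x3x3 layers
dilated = receptive_fields(3, [1, 2, 4])  # hypothetical dilation rates
```

Because a dense block concatenates the outputs of all of its layers, features with several receptive field sizes are available together, and no pooling is needed inside the block to get them.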
3. Residual Dilated Dense U-Net (ResDUnet)
- To further improve the performance, residual connections, as in ResNet, are used in DUnet to promote the information flow within the network.
- Every pair of convolutional layers along the contracting path and the expanding path of DUnet is grouped with one residual connection, giving the Residual DUnet (ResDUnet).
- For all networks, softmax loss is used.
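For reference, the softmax loss is commonly implemented as softmax followed by cross-entropy. The sketch below is a generic NumPy version under that assumption, not the authors' code. The function name and the choice of 6 classes (five subfields plus background) in the toy check are my own assumptions.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy; logits: (N, C), labels: (N,) class ids."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for every voxel.
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy check with 6 classes (assumed: 5 subfields + background).
uniform_loss = softmax_cross_entropy(np.zeros((4, 6)), [0, 1, 2, 3])
confident = np.zeros((4, 6))
confident[np.arange(4), [0, 1, 2, 3]] = 20.0
confident_loss = softmax_cross_entropy(confident, [0, 1, 2, 3])
```

In a segmentation setting, N would be the number of voxels in a patch (or batch), with the logits flattened to one row per voxel.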
4. Ablation Studies
4.1. Metrics
- The Dice coefficient (Dice) and the Average Symmetric Surface Distance (ASSD) are used.
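The Dice coefficient for one label is 2|P ∩ T| / (|P| + |T|), where P and T are the predicted and ground-truth voxel sets. A minimal per-label sketch follows. The function name, the toy arrays, and the convention of returning 1.0 when both masks are empty are assumptions. ASSD is left out because it needs surface extraction and distance transforms.

```python
import numpy as np

def dice_coefficient(pred, truth, label):
    """Dice overlap for one label between two integer label maps."""
    p = np.asarray(pred) == label
    t = np.asarray(truth) == label
    denom = p.sum() + t.sum()
    if denom == 0:
        return 1.0  # assumed convention: both masks empty counts as perfect
    return 2.0 * np.logical_and(p, t).sum() / denom

# Toy 2D label maps (assumption); real inputs would be 3D volumes.
pred = np.array([[1, 1, 0], [2, 2, 0]])
truth = np.array([[1, 0, 0], [2, 2, 2]])
dice_1 = dice_coefficient(pred, truth, 1)
dice_2 = dice_coefficient(pred, truth, 2)
```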
4.2. Patch Size
- Five-fold cross-validation is used for the BCP dataset. In each fold, 7 subjects are used for training, 1 for validation, and 2 for testing. Experiments were performed on an NVIDIA Titan Xp with 12 GB of memory.
- 1,300 patches are extracted from each subject.
The patch size was optimally set to 24 × 24 × 24 by comparing the results obtained by the baseline 3D U-Net method with different patch sizes.
- A stride of 8 × 8 × 8 is used. A majority voting strategy is used for the overlapping regions.
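The patch-based inference with majority voting can be sketched as below. This is a simplified illustration, not the authors' pipeline. `predict_patch` is a hypothetical stand-in for the trained network, ties are resolved toward the lowest label by `argmax`, and shifting the last patch to touch the volume border is my own assumption.

```python
import numpy as np
from itertools import product

def patch_starts(size, patch, stride):
    """Patch start indices along one axis; the last patch touches the border."""
    if size < patch:
        raise ValueError("volume is smaller than the patch")
    starts = list(range(0, size - patch + 1, stride))
    if starts[-1] != size - patch:
        starts.append(size - patch)
    return starts

def majority_vote_segmentation(volume, predict_patch, n_classes,
                               patch=24, stride=8):
    """Label every voxel by majority vote over all patches covering it."""
    votes = np.zeros((n_classes,) + volume.shape, dtype=np.int32)
    axes = [patch_starts(s, patch, stride) for s in volume.shape]
    for x, y, z in product(*axes):
        sl = (slice(x, x + patch), slice(y, y + patch), slice(z, z + patch))
        labels = predict_patch(volume[sl])   # one integer label per voxel
        for c in range(n_classes):
            votes[c][sl] += (labels == c)    # one vote per covering patch
    return votes.argmax(axis=0)

# Toy volume and a voxel-wise threshold standing in for the network.
volume = np.zeros((40, 32, 24))
volume[10:20, 8:16, 4:12] = 1.0
seg = majority_vote_segmentation(
    volume, lambda p: (p > 0.5).astype(int), n_classes=2)
```

With a 24-voxel patch and a stride of 8, interior voxels are covered by up to 3 patches per axis, so each gets several votes.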
4.3. Post-Processing
- Because the network classifies patches, false positives can appear outside the hippocampus. For example, a patch in the caudate (denoted by the pink circle) may look similar to patches in the hippocampus and will be classified as hippocampal subfields in the testing stage.
- To remove these artifacts automatically, post-processing searches the non-zero neighbors of each voxel in the automated segmentation to obtain the connected regions. Then, the two regions with the largest volumes are kept as the final left and right hippocampal subfields.
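The post-processing step above can be sketched as a connected-component search that keeps the two largest regions. The sketch below uses a plain breadth-first search with 6-connectivity. The connectivity, the function name, and the toy volume are my own assumptions, since the review does not state which neighborhood is used.

```python
import numpy as np
from collections import deque

def keep_two_largest_regions(seg):
    """Zero out all but the two largest connected non-zero regions."""
    seg = np.asarray(seg)
    foreground = seg > 0
    region_id = np.zeros(seg.shape, dtype=np.int32)
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
               (0, -1, 0), (0, 0, 1), (0, 0, -1)]  # 6-connectivity (assumed)
    sizes, next_id = {}, 0
    for seed in zip(*np.nonzero(foreground)):
        if region_id[seed]:
            continue                         # voxel already assigned
        next_id += 1
        region_id[seed] = next_id
        queue, count = deque([seed]), 0
        while queue:                         # flood-fill one region
            x, y, z = queue.popleft()
            count += 1
            for dx, dy, dz in offsets:
                n = (x + dx, y + dy, z + dz)
                inside = all(0 <= n[i] < seg.shape[i] for i in range(3))
                if inside and foreground[n] and not region_id[n]:
                    region_id[n] = next_id
                    queue.append(n)
        sizes[next_id] = count
    keep = sorted(sizes, key=sizes.get, reverse=True)[:2]
    return np.where(np.isin(region_id, keep), seg, 0)

# Toy volume: two 8-voxel blobs plus one isolated false-positive voxel.
seg = np.zeros((6, 6, 6), dtype=int)
seg[0:2, 0:2, 0:2] = 1
seg[4:6, 4:6, 4:6] = 2
seg[0, 5, 5] = 3
cleaned = keep_two_largest_regions(seg)
```

The pure-Python loop is only meant to show the idea. For full-size volumes, a library routine such as `scipy.ndimage.label` would be the more practical choice.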
4.4. Multi-Modality
- The efficacy of multi-modality is tested by comparing the segmentation results obtained using only single modality images (i.e., T1w or T2w) and multi-modality images (T1w+T2w), respectively.
The network trained with multi-modality images can generate more discriminative features, which improves the performance of hippocampal subfield segmentation.
5. Experimental Results
5.1. BCP Dataset
DUnet outperforms 3D U-Net in segmenting CA1, SUB, CA4/DG and Uncus, and ResDUnet outperforms 3D U-Net in segmenting CA1, CA2/3, SUB, and Uncus, according to the Wilcoxon signed rank tests with p<0.05.
The proposed ResDUnet achieves the highest Dice coefficient for the average of subfields.
- 3D U-Net+ResNet is also competitive in terms of ASSD.
5.2. KULAGA-YOSKOVITZ Dataset
DUnet outperforms 3D U-Net and 3D U-Net+ResNet in segmenting CA1–3 and SUB, and ResDUnet outperforms 3D U-Net and 3D U-Net+ResNet in segmenting all subfields, according to the Wilcoxon signed rank tests with p<0.05.
- DUnet and ResDUnet also outperform the HIPS method, especially in segmenting the CA4/DG subfield, which is the most difficult task.
Reference
[2019 J Front. Neuroinform.] [DUnet & ResDUnet]
Dilated Dense U-Net for Infant Hippocampus Subfield Segmentation