[Paper] Dense-Gated U-Net (DGNet): Brain Lesion Segmentation (Biomedical Image Segmentation)

Outperforms UNet++ & DeepLabv3

Dec 5, 2020

In this story, A Dense-Gated U-Net for Brain Lesion Segmentation (DGNet), by Peking University Shenzhen Graduate School, Peking University, and Macau University of Science and Technology, is briefly presented. In this paper:

A Dense-Gated UNet (DGNet), which is a hybrid of Dense-gated blocks and U-Net, is proposed.
DGNet can achieve weighted concatenation and suppress useless features.

This is a paper in 2020 VCIP, a conference that concluded just this month. (Sik-Ho Tsang @ Medium)

Outline

Dense-Gated U-Net (DGNet)
Experimental Results

1. Dense-Gated U-Net (DGNet)

1.1. Gated Fusion

The original dense connections are changed by reweighting each set of feature maps before concatenation, and a gating module is designed to make the network focus on the more informative feature maps.
More precisely, feature compression is performed, turning each set of concatenated feature maps into a layer descriptor.
This layer descriptor has a global receptive field to a certain degree, and its output dimension matches the number of concatenated input feature maps.

The idea is very similar to the SE module in SENet.
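For reference, the squeeze-and-excitation step in SENet can be written as below for a feature tensor U with C channels of spatial size H × W, where δ is ReLU and σ is sigmoid. The symbols (u_c, W_1, W_2) follow SENet rather than DGNet, so how they map onto the layer descriptors here is an assumption.

```latex
z_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j), \qquad
\mathbf{s} = \sigma\!\left( \mathbf{W}_2 \, \delta\!\left( \mathbf{W}_1 \mathbf{z} \right) \right), \qquad
\tilde{\mathbf{x}}_c = s_c \cdot \mathbf{u}_c
```

The three parts are the squeeze (global average pooling), the excitation (two learned mappings with a bottleneck), and the channel-wise rescaling.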

For the (l+1)-th layer, a statistic z ∈ ℝ^l is generated by squeezing the feature maps X, and z_c, the c-th element of z, can be expressed as:

After the global average pooling above, two 1×1 convolutional layers, Wa and Wb, are used to explicitly model the correlation between different layers.

where 𝛿 is ReLU and σ is sigmoid.
Thus, the final output is a reweighting operation:

In this way, feature maps X are converted into new feature maps X̃, which contain more valid information and less redundancy.
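The squeeze, gate, and reweight steps can be sketched in NumPy as follows. This is only a minimal illustration, not the authors' implementation: it assumes each preceding layer is squeezed to a single scalar, and the function name `gated_fusion` and the weight shapes of `w_a` and `w_b` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(feature_maps, w_a, w_b):
    """Reweight each layer's feature maps, then concatenate along channels.

    feature_maps: list of l arrays, each shaped (C_i, D, H, W)
    w_a: (hidden, l) weights, w_b: (l, hidden) weights (hypothetical shapes)
    """
    # Squeeze: one scalar descriptor per preceding layer (global average).
    z = np.array([fm.mean() for fm in feature_maps])
    # Gate: a 1x1 convolution on a pooled vector reduces to a matrix product.
    s = sigmoid(w_b @ np.maximum(w_a @ z, 0.0))
    # Reweight each layer's maps by its gate value, then concatenate.
    gated = [s_c * fm for s_c, fm in zip(s, feature_maps)]
    return np.concatenate(gated, axis=0), s

rng = np.random.default_rng(0)
maps = [rng.standard_normal((4, 8, 8, 8)) for _ in range(3)]
w_a = rng.standard_normal((2, 3))
w_b = rng.standard_normal((3, 2))
fused, gates = gated_fusion(maps, w_a, w_b)
```

Because the gates pass through a sigmoid, each lies strictly between 0 and 1, so a layer's contribution can be attenuated but is never amplified.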

1.2. Dense-Gated Blocks

The feature maps are densely connected in a left-to-right manner within a block. Each dense-gated block has 5 convolutional layers:

This idea is similar to the one in DenseNet.
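To make the data flow concrete, here is a rough NumPy sketch of a 5-layer block in which every layer receives the gated concatenation of all earlier outputs. Several things are assumptions for illustration only: the convolution is replaced by a random 1×1×1 channel mixing (`conv_stub`), the gate is simplified to a single linear map with random weights, the growth rate of 4 channels is hypothetical, and the block output is taken to be the concatenation of all features, as in DenseNet.

```python
import numpy as np

def gate(features, rng):
    # Hypothetical gate: one sigmoid weight per preceding layer, computed
    # from that layer's globally averaged descriptor.
    z = np.array([f.mean() for f in features])
    w = rng.standard_normal((len(z), len(z)))
    return 1.0 / (1.0 + np.exp(-(w @ z)))

def conv_stub(x, out_channels, rng):
    # Stand-in for a 3D convolution: random 1x1x1 channel mixing plus ReLU.
    w = rng.standard_normal((out_channels, x.shape[0]))
    return np.maximum(np.einsum("oc,cdhw->odhw", w, x), 0.0)

def dense_gated_block(x, num_layers=5, growth=4, seed=0):
    rng = np.random.default_rng(seed)
    features = [x]
    for _ in range(num_layers):
        # Gate and concatenate everything produced so far ...
        s = gate(features, rng)
        fused = np.concatenate([w * f for w, f in zip(s, features)], axis=0)
        # ... then feed it to the next layer and keep its output for later layers.
        features.append(conv_stub(fused, growth, rng))
    return np.concatenate(features, axis=0)

out = dense_gated_block(np.ones((2, 4, 4, 4)))
```

With 2 input channels and 5 layers of 4 channels each, the output here has 2 + 5 × 4 = 22 channels.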

1.3. Network Architecture

3D U-Net, with skip connections, is used.

The network consists of 4 encoder levels in the downward path, 4 decoder levels in the upward path, and a base level.
In the encoder path, each encoder level has a dense-gated block (DGB), which aims at semantic feature extraction.
Each layer in the dense block can use the feature maps of all preceding layers as inputs, and passes its own feature maps as input to all subsequent gates.
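As a quick sanity check on the 4 + 1 + 4 layout, the spatial size at each level can be traced as below. The downsampling factor of 2 per level is an assumption (it is the usual U-Net choice, but it is not stated here), and `level_sizes` is a hypothetical helper.

```python
def level_sizes(input_size=128, encoder_levels=4, factor=2):
    """Spatial size at each encoder level, the base level, and each decoder level."""
    sizes = [input_size // factor**i for i in range(encoder_levels + 1)]
    return {
        "encoder": sizes[:-1],     # top to bottom of the downward path
        "base": sizes[-1],         # bottleneck
        "decoder": sizes[-2::-1],  # bottom to top of the upward path
    }

sizes = level_sizes()
```

Under these assumptions, a 128-voxel input side shrinks to 8 voxels at the base level, and each decoder level matches the size of the encoder level it is skip-connected to.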

2. Experimental Results

2.1. BraTS 2018 Dataset

The MICCAI BraTS 2018 training and validation sets consist of ample multi-institutional, clinically acquired, multi-modal MRI scans of glioblastoma (GBM/HGG) and lower-grade glioma (LGG).
The training set includes a total of 210 HGG patients and 75 LGG patients. Annotations include the Gd-enhancing tumor (ET, label 4), the peritumoral edema (ED, label 2), and the necrotic and non-enhancing tumor core (NCR/NET, label 1). All other voxels, outside these labels (1, 2, 4), are labeled as 0.
Inputs of 128 × 128 × 128 voxels are used, and the batch size is set to 2.
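BraTS results are usually reported on three nested regions derived from these labels: whole tumor (labels 1, 2, 4), tumor core (labels 1, 4), and enhancing tumor (label 4). A minimal sketch of that mapping is below; the helper name `brats_regions` is hypothetical, and the toy 1D array stands in for a real label volume.

```python
import numpy as np

# Label values from the BraTS annotation scheme described above.
NCR_NET, ED, ET = 1, 2, 4

def brats_regions(label_volume):
    """Binary masks for the three nested regions usually scored on BraTS."""
    return {
        "whole_tumor": np.isin(label_volume, [NCR_NET, ED, ET]),
        "tumor_core": np.isin(label_volume, [NCR_NET, ET]),
        "enhancing_tumor": label_volume == ET,
    }

labels = np.array([0, 1, 2, 4, 4, 0])
regions = brats_regions(labels)
```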

A qualitative example is shown above. The segmentation results for the tumor core and enhancing tumor are the most impressive.

**Top: Validation Set, Middle: Training Set, Bottom: Ablation Study on Validation Set**

The Dice score and Hausdorff distance are used to evaluate segmentation performance.
Top: The proposed method performs on par with the best results on the validation dataset.
Middle: The proposed method clearly outperforms competing methods on the training dataset. (Comparison on training set is strange, I don’t know why.)
Bottom: For the non-dense structure (Base), the original 3D U-Net framework without any dense blocks is used. For the non-gated dense structure (Base+DB), all gates are removed from the dense-gated blocks, but the dense connections are kept for feature reuse.
Improvements are shown when adding the proposed strategies gradually.
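For readers unfamiliar with the two metrics, here is a small NumPy sketch. It is a plain brute-force version under simple assumptions: masks are binary and non-empty for the Hausdorff distance, and distances are measured in voxel units. BraTS leaderboards typically report a 95th-percentile variant of the Hausdorff distance, which this sketch does not implement.

```python
import numpy as np

def dice(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, target).sum() / denom

def hausdorff(pred, target):
    """Symmetric Hausdorff distance between foreground voxel coordinates."""
    a, b = np.argwhere(pred), np.argwhere(target)
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Largest nearest-neighbor distance, taken in both directions.
    return max(d.min(axis=1).max(), d.min(axis=0).max())

pred = np.zeros((5, 5), dtype=bool)
pred[1:3, 1:3] = True      # 4 foreground pixels
target = np.zeros((5, 5), dtype=bool)
target[1:4, 1:3] = True    # 6 foreground pixels, containing all of pred

score = dice(pred, target)
dist = hausdorff(pred, target)
```

Higher Dice is better, while lower Hausdorff distance is better, so the two columns in such tables move in opposite directions.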

The example above shows the gradual improvement.

2.2. Hemorrhage Dataset Collected by Authors

It is made up of intracranial hemorrhage CT images of 500 patients collected from hospitals. The annotations include the intracranial hemorrhage area (lesion, labeled as 1), while all other pixels are labeled as 0.
The 5000 slices in total are divided into a training set (4000 slices) and a test set (1000 slices).

The proposed method achieves better performance than the other compared methods, such as UNet++.
It is also found that DeepLabv3 produces even worse segmentation results than the authors' plain baseline method.