Review: DenseVoxNet for Volumetric Cardiac Segmentation (Biomedical Image Segmentation)

Outperforms 3D U-Net and VoxResNet with Fewer Parameters

5 min read · Sep 30, 2019

In this story, DenseVoxNet, by The Chinese University of Hong Kong, Chang Gung University, The Hong Kong Polytechnic University, and Chinese Academy of Sciences, is briefly reviewed. A novel densely-connected volumetric convolutional neural network is proposed:

It preserves the maximum information flow between layers by a densely-connected mechanism.
It avoids learning redundant feature maps by encouraging feature reuse and hence requires fewer parameters to achieve high performance.
Auxiliary side paths are added to strengthen the gradient propagation and stabilize the learning process.

This is a 2017 MICCAI paper with more than 50 citations. (Sik-Ho Tsang @ Medium)

Outline

Dense Connection
DenseVoxNet Architecture
Experimental Results

1. Dense Connection

1.1. ConvNet

In a plain ConvNet, the output of layer l, $x_l$, is computed by applying a transformation $H_l$ to the output of the previous layer, $x_{l-1}$:

$$x_l = H_l(x_{l-1})$$

1.2. ResNet

ResNet introduces a skip connection that integrates the response of $H_l$ with the identity mapping of the features from the previous layer, to augment the information propagation:

$$x_l = H_l(x_{l-1}) + x_{l-1}$$

The identity function and the output of $H_l$ are combined by summation.

1.3. DenseNet

The dense connectivity, introduced by DenseNet, takes the idea of skip connections to the extreme by connecting each layer to all of its subsequent layers:

$$x_l = H_l([x_0, x_1, \ldots, x_{l-1}])$$

where [...] refers to the concatenation operation.
This makes all layers receive a direct supervision signal.
More importantly, such a mechanism can encourage the reuse of features among all these connected layers.
If each layer outputs k feature maps, then k is referred to as the growth rate.
(If interested, please read my review on DenseNet for more details.)
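
To make the three connectivity patterns concrete, here is a minimal toy sketch, not the paper's code. A "feature map" is just a list of floats (one per channel), and `make_layer` is a hypothetical stand-in for $H_l$ that always emits k channels. It only illustrates how the outputs are combined, and how the concatenated width grows with the growth rate k.

```python
# Toy illustration of plain, residual and dense connectivity.
# make_layer is a hypothetical stand-in for H_l, not a real convolution.

def make_layer(k):
    """Return a toy H_l that maps any input to k output channels."""
    def h(x):
        mean = sum(x) / len(x)
        return [max(0.0, mean + i) for i in range(k)]  # ReLU-like, k channels
    return h

def plain_forward(x0, layers):
    x = x0
    for h in layers:
        x = h(x)                                  # x_l = H_l(x_{l-1})
    return x

def residual_forward(x0, layers):
    x = x0
    for h in layers:
        x = [a + b for a, b in zip(h(x), x)]      # x_l = H_l(x_{l-1}) + x_{l-1}
    return x

def dense_forward(x0, layers):
    features = [x0]
    for h in layers:
        concat = [v for f in features for v in f]  # [x_0, ..., x_{l-1}]
        features.append(h(concat))                 # x_l = H_l([x_0, ..., x_{l-1}])
    return [v for f in features for v in f]        # all feature maps, concatenated

k = 12
x0 = [0.5] * k  # k input channels, so the residual sum is shape-compatible
layers = [make_layer(k) for _ in range(3)]

print(len(plain_forward(x0, layers)))     # 12
print(len(residual_forward(x0, layers)))  # 12
print(len(dense_forward(x0, layers)))     # 12 + 3 * 12 = 48
```

With summation the channel count stays fixed, while with concatenation it grows by k per layer, which is why k is called the growth rate.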

2. DenseVoxNet Architecture

2.1. Down-sampling

The down-sampling components are divided into two densely-connected blocks.
Each DenseBlock is comprised of 12 transformation layers with dense connections.
Each transformation layer is sequentially composed of a BN, a ReLU, and a 3×3×3 Conv, and the growth rate k of DenseVoxNet is 12.
The first DenseBlock is prefixed with a Conv with 16 output channels and stride of 2 to learn primitive features.
Between the two DenseBlocks is the transition block, which consists of a BN, a ReLU, a 1×1×1 Conv and a 2×2×2 max pooling layer.
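
The channel bookkeeping implied by these numbers can be sketched as follows. The 16-channel first Conv, the 12 layers per DenseBlock and the growth rate of 12 come from the description above. That the transition block's 1×1×1 Conv keeps the channel count unchanged is an assumption made here for illustration, and `dense_block_channels` is a hypothetical helper name.

```python
def dense_block_channels(in_channels, num_layers=12, growth_rate=12):
    """Return the input width seen by each layer and the block's output width."""
    layer_inputs = [in_channels + growth_rate * i for i in range(num_layers)]
    out_channels = in_channels + growth_rate * num_layers
    return layer_inputs, out_channels

stem_channels = 16                       # first Conv, 16 output channels
inputs1, out1 = dense_block_channels(stem_channels)

# ASSUMPTION: the 1x1x1 Conv in the transition block preserves the channel count.
transition_out = out1
inputs2, out2 = dense_block_channels(transition_out)

print(inputs1[0], inputs1[-1], out1)     # 16 148 160
print(inputs2[0], inputs2[-1], out2)     # 160 292 304
```

Under that assumption, the first DenseBlock ends with 16 + 12 × 12 = 160 feature maps and the second with 160 + 12 × 12 = 304.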

2.2. Up-sampling

The up-sampling component is composed of a BN, a ReLU, a 1×1×1 Conv and two 2×2×2 deconvolutional (Deconv) layers, so that the size of the segmentation prediction map is consistent with the size of the input image.
The up-sampling component is then followed by a 1×1×1 Conv layer and a softmax layer to generate the final label map of the segmentation.
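
A quick way to see why two Deconv layers are needed is to trace the spatial size along one axis. The sketch below uses the standard output-size formulas for convolution and transposed convolution. The 64-voxel input, the 3×3×3 kernel with padding 1 for the first Conv, and a stride of 2 for the pooling and Deconv layers are assumptions for illustration, since the text above only gives the first Conv's stride and the 2×2×2 kernel sizes.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel, stride, padding=0):
    """Output length of a transposed convolution along one axis."""
    return (size - 1) * stride - 2 * padding + kernel

def trace_sizes(size):
    sizes = [size]
    size = conv_out(size, kernel=3, stride=2, padding=1)   # first Conv, stride 2
    sizes.append(size)
    size = conv_out(size, kernel=2, stride=2, padding=0)   # 2x2x2 max pooling
    sizes.append(size)
    size = deconv_out(size, kernel=2, stride=2)            # first 2x2x2 Deconv
    sizes.append(size)
    size = deconv_out(size, kernel=2, stride=2)            # second 2x2x2 Deconv
    sizes.append(size)
    return sizes

print(trace_sizes(64))  # [64, 32, 16, 32, 64]
```

The stride-2 Conv and the max pooling each halve the resolution, so two doubling steps are required to return to the input size.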

2.3. Long Skip Connection

A long skip connection links the transition layer to the output layer through a 2×2×2 Deconv layer.
This skip connection shares a similar idea with deep supervision: it strengthens the gradient propagation and stabilizes the learning process.
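
Deep supervision is commonly realized by adding an auxiliary loss on the side path to the main loss. The sketch below shows that general pattern only. The weight `aux_weight=0.3`, the function names, and the use of per-voxel softmax cross-entropy are assumptions for illustration, not values taken from the paper.

```python
import math

def softmax(logits):
    m = max(logits)                               # subtract max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_cross_entropy(logits_per_voxel, labels):
    """Average softmax cross-entropy over voxels."""
    losses = [-math.log(softmax(l)[y]) for l, y in zip(logits_per_voxel, labels)]
    return sum(losses) / len(losses)

def deeply_supervised_loss(main_logits, aux_logits, labels, aux_weight=0.3):
    """Main-path loss plus a weighted auxiliary side-path loss."""
    main = mean_cross_entropy(main_logits, labels)
    aux = mean_cross_entropy(aux_logits, labels)
    return main + aux_weight * aux

# Two voxels, three classes (e.g. background, blood pool, myocardium).
main_logits = [[2.0, 0.1, 0.1], [0.2, 0.1, 3.0]]
aux_logits = [[1.0, 0.5, 0.2], [0.3, 0.2, 1.5]]
labels = [0, 2]
print(deeply_supervised_loss(main_logits, aux_logits, labels))
```

Because the auxiliary term is computed from a shallower point in the network, its gradient reaches the early layers through a shorter path.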

3. Experimental Results

3.1. Participating teams in this challenge on HVSMR2016 dataset

DenseVoxNet obtains the highest Dice scores.
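
For reference, the Dice score measures the overlap between a predicted mask P and a ground-truth mask T as 2|P ∩ T| / (|P| + |T|). A minimal sketch on flattened binary masks follows. The example masks are made up, and returning 1.0 when both masks are empty is a convention chosen here.

```python
def dice_score(pred, target):
    """Dice = 2|P ∩ T| / (|P| + |T|) for flat binary masks of equal length."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # convention: two empty masks count as a perfect match
    return 2.0 * intersection / total

pred   = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(dice_score(pred, target))  # 2 * 2 / (3 + 3) = 0.666...
```

In a multi-class setting such as this one, the score is typically computed per structure by binarizing the label map for that class.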

3.2. More comparisons on HVSMR2016 dataset

DenseVoxNet has about 1.8M parameters in total, far fewer than 3D U-Net (19.0M parameters) and VoxResNet (4.0M parameters).
DenseVoxNet also obtains the highest Dice scores in this comparison.
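
As a rough sanity check on the parameter count, the convolution weights of the two DenseBlocks can be tallied from the architecture described in Section 2. This is a partial count under stated assumptions: it ignores biases, BN parameters, the first Conv, the transition block, the up-sampling layers and the side path, and it assumes the transition block keeps the channel count unchanged. `dense_block_conv_params` is a hypothetical helper name.

```python
def dense_block_conv_params(in_channels, num_layers=12, growth_rate=12, kernel=3):
    """Sum the 3x3x3 Conv weights in one DenseBlock (no biases, no BN)."""
    total = 0
    channels = in_channels
    for _ in range(num_layers):
        total += kernel ** 3 * channels * growth_rate  # k^3 * C_in * C_out
        channels += growth_rate                        # concatenation widens input
    return total, channels

block1, c1 = dense_block_conv_params(16)
# ASSUMPTION: the transition block preserves the channel count c1.
block2, c2 = dense_block_conv_params(c1)

print(block1, block2, block1 + block2)  # 318816 878688 1197504
```

Under these assumptions the DenseBlocks account for roughly 1.2M weights, which is the same order of magnitude as the reported total of about 1.8M.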

3.3. Qualitative Results

Four typical segmentation results on training images (the first two samples, via cross validation) and testing images (the last two samples).
Blue and purple denote the segmentation results for blood pool and myocardium, respectively.