Review — Uncertainty-Aware Self-ensembling Model for Semi-supervised 3D Left Atrium Segmentation

UA+MT, Semi-Supervised Segmentation Using Teacher-Student Paradigm

  • A novel uncertainty-aware semi-supervised framework is proposed for left atrium segmentation from 3D MR images.
  • The framework consists of a student model and a teacher model, and the student model learns from the teacher model by minimizing a segmentation loss and a consistency loss with respect to the targets of the teacher model.


  1. Semi-Supervised Segmentation
  2. Uncertainty-Aware Mean Teacher Framework (UA-MT)
  3. Experimental Results

1. Semi-Supervised Segmentation

This section introduces the uncertainty-aware self-ensembling mean teacher framework (UA-MT) for semi-supervised LA segmentation.

1.1. Definitions

  • Suppose we have 3D training data consisting of N labeled scans and M unlabeled scans, denoted DL = {(xi, yi) : i = 1, …, N} and DU = {xi : i = N+1, …, N+M}, respectively,
  • where xi of size H×W×D is the input volume and yi ∈ {0, 1}^(H×W×D) is the ground-truth annotation.

1.2. Loss Functions

  • The goal of the semi-supervised segmentation framework is to minimize the following combined objective function: min over θ of Σ(i=1..N) Ls(f(xi; θ), yi) + λ Σ(i=1..N+M) Lc(f(xi; θ’, ξ’), f(xi; θ, ξ)),
  • where Ls denotes the supervised loss (e.g., cross-entropy loss) to evaluate the quality of the network output on labeled inputs, and
  • Lc represents the unsupervised consistency loss for measuring the consistency between the prediction of the teacher model and the student model for the same input xi under different perturbations.
  • Here, f(·) denotes the segmentation neural network; (θ’, ξ’) and (θ, ξ) represent the weights and the different perturbation operations (e.g., adding noise to the input and network Dropout) of the teacher and student models, respectively.
  • λ is a ramp-up weighting coefficient that controls the trade-off between the supervised and unsupervised losses, following the common Gaussian ramp-up λ(t) = 0.1·exp(−5(1 − t/tmax)²):
  • At the beginning, when the model is not well trained, λ is small such that the above loss function mainly depends on supervised loss.
  • As training continues (where t is the training step), λ becomes larger, such that the loss function is a combination of the supervised and consistency losses.
  • The teacher’s weights θ’ are updated as an exponential moving average (EMA) of the student’s weights θ, ensembling the information from different training steps: θ’t = α·θ’(t−1) + (1 − α)·θt,
  • where α is the EMA decay.
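The ramp-up schedule and the EMA update above can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's code; the function names and the default values lambda_max=0.1 and alpha=0.99 are assumptions based on common mean-teacher settings.

```python
import math

def rampup_weight(t, t_max, lambda_max=0.1):
    """Gaussian ramp-up for the consistency weight lambda(t):
    small at the start of training, approaching lambda_max as t -> t_max."""
    if t >= t_max:
        return lambda_max
    phase = 1.0 - t / t_max
    return lambda_max * math.exp(-5.0 * phase * phase)

def ema_update(teacher_params, student_params, alpha=0.99):
    """theta'_t = alpha * theta'_(t-1) + (1 - alpha) * theta_t,
    applied element-wise to each parameter."""
    return [alpha * tp + (1.0 - alpha) * sp
            for tp, sp in zip(teacher_params, student_params)]
```

With alpha close to 1, the teacher changes slowly and averages the student over many steps, which is what makes its targets more stable than any single student snapshot.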

2. Uncertainty-Aware Mean Teacher Framework (UA-MT)

2.1. Uncertainty Estimation

  • T stochastic forward passes are performed on the teacher model under random Dropout and input Gaussian noise for each input volume. Therefore, for each voxel in the input, we obtain a set of softmax probability vectors {p1, …, pT}.
  • The predictive entropy is used as the uncertainty metric: u = −Σc μc·log μc, with μc = (1/T)·Σt ptᶜ,
  • where ptᶜ is the probability of the c-th class in the t-th prediction.
  • The uncertainty is estimated at the voxel level, and the uncertainty of the whole volume is U = {u}.
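The Monte Carlo uncertainty estimate above can be sketched with NumPy. This is an illustrative sketch under assumed array shapes (T passes, C classes, then spatial dimensions), not the authors' implementation; the small epsilon guards log(0).

```python
import numpy as np

def predictive_entropy(probs):
    """probs: array of shape (T, C, ...) holding T stochastic softmax
    predictions over C classes for each voxel.
    Returns the per-voxel entropy u = -sum_c mu_c * log(mu_c),
    where mu_c is the mean probability of class c over the T passes."""
    mu = probs.mean(axis=0)                        # (C, ...)
    return -(mu * np.log(mu + 1e-12)).sum(axis=0)  # (...)
```

A voxel where all T passes agree on one class gets entropy near 0 (certain); a voxel with a uniform averaged distribution gets entropy log C (maximally uncertain).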

2.2. Uncertainty-Aware Consistency Loss

  • The uncertainty-aware consistency loss Lc is defined as the voxel-level mean squared error (MSE) between the teacher and student predictions, computed only over the most certain predictions: Lc = Σv I(uv < H)·||f’v − fv||² / Σv I(uv < H),
  • where I(·) is the indicator function; f’v and fv are the predictions of the teacher and student models at the v-th voxel, respectively.
  • uv is the estimated uncertainty U at the v-th voxel; and H is a threshold to select the most certain targets.
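The masked consistency loss above can be sketched as follows, assuming per-voxel scalar predictions (e.g., the foreground probability); the function name and array layout are illustrative, not from the paper.

```python
import numpy as np

def uncertainty_aware_mse(teacher_pred, student_pred, uncertainty, H):
    """Voxel-wise MSE between teacher and student predictions, averaged
    only over voxels whose estimated uncertainty u_v is below the
    threshold H (the indicator function becomes a binary mask)."""
    mask = (uncertainty < H).astype(np.float64)   # I(u_v < H)
    sq = (teacher_pred - student_pred) ** 2
    denom = mask.sum()
    if denom == 0:                                # no voxel is certain enough
        return 0.0
    return float((mask * sq).sum() / denom)
```

Filtering by H means the student is only pushed toward teacher targets the teacher is confident about, so noisy predictions near ambiguous boundaries do not dominate the consistency signal.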

2.3. Model Architecture

  • V-Net is used as the network backbone. The short residual connection in each convolution block is removed, and a joint cross-entropy and Dice loss is used.
  • To adapt the V-Net as a Bayesian network to estimate the uncertainty, two Dropout layers with Dropout rate 0.5 are added after the L-Stage 5 layer and R-Stage 1 layer of the V-Net.

3. Experimental Results

3.1. Dataset

  • The Atrial Segmentation Challenge dataset is used. It provides 100 3D gadolinium-enhanced MR imaging scans (GE-MRIs) with LA segmentation masks for training and validation.
  • These scans have an isotropic resolution of 0.625×0.625×0.625mm³. The 100 scans are split into 80 scans for training and 20 scans for evaluation. All the scans were cropped centering at the heart region for better comparison of the segmentation performance of different methods.

3.2. SOTA Comparisons

Comparison between the proposed method and various methods
  • The above table shows the segmentation performance of V-Net trained with only the labeled data (the first two rows) and the proposed semi-supervised method (UA-MT) on the testing dataset.
  • The fully supervised V-Net trained with all 80 labeled scans is evaluated as the upper bound (3rd and 4th rows).
  • Compared with the Vanilla V-Net, adding Dropout (Bayesian V-Net) improves the segmentation performance, and achieves an average Dice of 86.03% and Jaccard of 76.06% with only the labeled training data.
  • Compared with the self-training method, the DAN and ASDNet improve by 0.60% and 0.98% Dice, respectively, showing the effect of adversarial learning in semi-supervised learning. The ASDNet is better than DAN, since it selects the trustworthy region of unlabeled data for training the segmentation network.
  • The self-ensembling-based method TCSE achieves slightly better performance than ASDNet, demonstrating that a perturbation-based consistency loss is helpful for the semi-supervised segmentation problem.

3.3. Analyses

Quantitative analysis of the proposed method.
Visualization of the segmentations by different methods and the uncertainty.
  • Compared with the supervised method, the proposed results have a higher overlap ratio with the ground truth (the second row) and produce fewer false positives (the first row).
  • As shown in (d), the network estimates high uncertainty near the boundary and ambiguous regions of great vessels.


[2019 MICCAI] [UA+MT]
Uncertainty-Aware Self-ensembling Model for Semi-supervised 3D Left Atrium Segmentation

Biomedical Image Semi-Supervised Learning



