Brief Review — Improving the Generalizability of Convolutional Neural Network-Based Segmentation on CMR Images

U-Net for CMR Image Segmentation

  • By carefully designing data normalization and augmentation strategies, a neural network trained on a single-site single-scanner dataset from the UK Biobank can be successfully applied to segmenting cardiac MR images across different sites and different scanners without substantial loss of accuracy.
  • This is the first work to explore the generalizability of CNN-based methods for cardiac MR image multi-structure segmentation.


  1. Proposed Approach
  2. Results

1. Proposed Approach

1.1. Datasets

General descriptions of the three datasets.
  • Three datasets are used, as listed above:
  1. UK Biobank (UKBB): a study of over half a million voluntary participants aged between 40 and 69 from across the UK, of whom nearly 100,000 are included in the MR imaging study (brain, cardiac, and whole-body). Pixel-wise segmentations of three essential structures, the left ventricle (LV), myocardium (MYO), and right ventricle (RV), are provided as ground truth for both end-diastolic (ED) and end-systolic (ES) frames.
  2. Automated Cardiac Diagnosis Challenge (ACDC): part of the MICCAI 2017 benchmark dataset for CMR image segmentation, composed of 100 CMR images. The LV, MYO, and RV have been manually segmented for both ED and ES frames.
  3. British Society of Cardiovascular Magnetic Resonance Aortic Stenosis (BSCMR-AS): consists of CMR images of 599 patients with severe aortic stenosis (AS) who had been listed for surgery. Only the left ventricle in ED and ES frames, and the myocardium in ED frames, have been annotated manually.
  • The UKBB dataset is used for training and intra-domain testing, while the ACDC and BSCMR-AS datasets are used for cross-domain testing.
  • In UKBB, 3,975 subjects were used to train the neural network, while 300 validation subjects were used to track training progress and avoid over-fitting. The remaining 600 subjects were used to evaluate the models' performance in the intra-domain setting.
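As a rough illustration of the subject-level split above, here is a minimal sketch. The review does not say how subjects were assigned to each subset, so the seeded random shuffle, the function name `split_subjects`, and the subject IDs are all assumptions made for illustration.

```python
import random

def split_subjects(subject_ids, n_train=3975, n_val=300, n_test=600, seed=0):
    """Shuffle subject IDs and cut them into disjoint train/val/test lists.

    The random shuffle is an assumption; the review only gives the subset sizes.
    """
    ids = list(subject_ids)
    if len(ids) < n_train + n_val + n_test:
        raise ValueError("not enough subjects for the requested split")
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:n_train + n_val + n_test]
    return train, val, test

# Hypothetical IDs; real UKBB identifiers look different.
subjects = [f"subj_{i:04d}" for i in range(3975 + 300 + 600)]
train_ids, val_ids, test_ids = split_subjects(subjects)
```

Splitting by subject rather than by image keeps all slices and frames of one person in a single subset, so the intra-domain test set does not leak into training.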

1.2. Model Architecture

Overview of the U-Net network structure
  • A 2D U-Net is used, with two main modifications: (1) batch normalization (BN) is applied after each hidden convolutional layer to stabilize training; (2) dropout regularization is applied after each concatenation operation to avoid overfitting and encourage generalization.
  • The cross-entropy loss function is used.
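To make the loss concrete, the following is a minimal pure-Python sketch of pixel-wise cross-entropy for a multi-class segmentation map. The four-class layout (0 = background, 1 = LV, 2 = MYO, 3 = RV) and the function names are assumptions for illustration; the paper's actual implementation is not shown in this review.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pixelwise_cross_entropy(logit_map, label_map):
    """Mean cross-entropy over all pixels.

    logit_map: H x W x C nested lists of raw network scores.
    label_map: H x W nested lists of integer class indices.
    """
    total, count = 0.0, 0
    for logit_row, label_row in zip(logit_map, label_map):
        for logits, label in zip(logit_row, label_row):
            probs = softmax(logits)
            total += -math.log(probs[label])
            count += 1
    return total / count

# Assumed class order: 0 = background, 1 = LV, 2 = MYO, 3 = RV.
uniform_loss = pixelwise_cross_entropy([[[0.0, 0.0, 0.0, 0.0]]], [[2]])
confident_loss = pixelwise_cross_entropy([[[0.0, 0.0, 5.0, 0.0]]], [[2]])
```

With uniform scores over four classes the loss equals ln 4, and it drops as the score of the correct class rises.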

1.3. Preprocessing

Image pre-processing during training and testing.
  • Image resampling and intensity normalization are employed to normalize images in both the training and testing stages.
  • During training, a wide range of geometrical variations in heart pose and size is applied: random horizontal and vertical flips, random rotation, random image scaling, and random image cropping.
  • During testing, only center cropping is used.
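The difference between the training and testing pipelines can be sketched as below. This is a simplified illustration, not the authors' code: it covers only normalization, flips, and cropping (resampling, rotation, and scaling are omitted), the min-max normalization is an assumption since the review does not name the normalization method, and all function names are hypothetical.

```python
import random

def normalize_intensity(image):
    """Min-max rescale to [0, 1] (assumed; the exact method is not given here)."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    scale = (hi - lo) or 1.0  # avoid division by zero on constant images
    return [[(v - lo) / scale for v in row] for row in image]

def center_crop(image, size):
    """Deterministic crop around the image center, used at test time."""
    h, w = len(image), len(image[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]

def random_crop(image, size, rng):
    """Crop a size x size window at a random position, used at training time."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - size)
    left = rng.randint(0, w - size)
    return [row[left:left + size] for row in image[top:top + size]]

def random_flip(image, rng):
    """Apply horizontal and vertical flips, each with probability 0.5."""
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]  # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1]                   # vertical flip
    return image

def preprocess(image, size, training, rng=None):
    """Normalize, then augment (training) or center-crop (testing)."""
    image = normalize_intensity(image)
    if training:
        rng = rng or random.Random()
        return random_crop(random_flip(image, rng), size, rng)
    return center_crop(image, size)

# Toy 6x6 "image" with intensities 0..35.
toy = [[r * 6 + c for c in range(6)] for r in range(6)]
test_patch = preprocess(toy, 4, training=False)
train_patch = preprocess(toy, 4, training=True, rng=random.Random(0))
```

In a real segmentation pipeline the same flips and crop window must also be applied to the label map, which this image-only sketch leaves out.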

2. Results

Boxplots of the average Dice scores
Comparison results of segmentation performance between a baseline method and the proposed method across three test sets.
Cross-dataset segmentation performances of four different network architectures.
  • UNet-16: a smaller version of U-Net in which the number of filters in each convolutional layer is reduced by a factor of four.
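A quick sketch of what the fourfold reduction means for layer widths. The baseline widths below are those of the standard U-Net encoder and are an assumption, since this review does not list the filter counts used in the paper.

```python
def scale_filters(filters, factor=4):
    """Divide every layer's filter count by the given factor."""
    return [f // factor for f in filters]

# Assumed encoder widths of a standard U-Net (not stated in this review).
unet_filters = [64, 128, 256, 512, 1024]
unet16_filters = scale_filters(unet_filters)
```

Under this assumption the first layer has 16 filters, which would match the name UNet-16.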
Visualization of good segmentation examples selected from three patient groups. Row 1: Ground truth, Row 2: Predictions.
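For reference, the Dice score used in the comparisons above measures the overlap between a predicted and a ground-truth mask. Below is a minimal sketch on integer label maps; the label indices (1 = LV, 2 = MYO, 3 = RV) and the convention of returning 1.0 when a structure is absent from both maps are assumptions for illustration.

```python
def dice_score(pred, target, label):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for one structure label."""
    inter = pred_count = target_count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == label:
                pred_count += 1
            if t == label:
                target_count += 1
            if p == label and t == label:
                inter += 1
    denom = pred_count + target_count
    if denom == 0:
        return 1.0  # assumed convention: structure absent in both maps
    return 2.0 * inter / denom

def mean_dice(pred, target, labels=(1, 2, 3)):
    """Average Dice over the given structure labels."""
    return sum(dice_score(pred, target, lab) for lab in labels) / len(labels)

# Toy label maps; assumed labels: 1 = LV, 2 = MYO, 3 = RV.
pred = [[0, 1, 1], [2, 2, 3]]
target = [[0, 1, 0], [2, 2, 3]]
lv_dice = dice_score(pred, target, 1)
avg_dice = mean_dice(pred, target)
```

For a dataset such as BSCMR-AS, where only some structures are annotated, `labels` would be restricted to the annotated ones.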


