
Review — Semi-Supervised Learning for Fetal Brain MRI Quality Assessment with ROI consistency

Mean Teacher with ROI Consistency Loss

5 min read · Jul 30, 2025


Semi-Supervised Learning for Fetal Brain MRI Quality Assessment with ROI consistency
Mean Teacher + ROI Consistency, by MIT, Boston Children's Hospital, Harvard Medical School
2020 MICCAI, Over 30 Citations (Sik-Ho Tsang @ Medium)

Quality Assessment: 2022 [Swin-MIQA]
==== My Healthcare and Medical Related Paper Readings ====
==== My Other Paper Readings Are Also Over Here ====

  • MRI is vulnerable to motion artifacts because data acquisition is slow. (There is a 1–2s delay between the acquisition of two consecutive slices in the stack.)
  • Manual annotation for fetal MR image quality assessment is usually time-consuming.
  • A semi-supervised deep learning approach is proposed based on Mean Teacher with ROI Consistency Loss.

Outline

  1. Mean Teacher + ROI Consistency
  2. Results

1. Mean Teacher + ROI Consistency

Mean Teacher + ROI Consistency

1.1. Mean Teacher

  • In semi-supervised learning, let {x1, x2, …, xNl} be the labeled dataset with labels {y1, y2, …, yNl}, and let {xNl+1, xNl+2, …, xN} be the unlabeled dataset.
  • The Mean Teacher model consists of two networks with the same architecture, i.e., student network and teacher network. During training, the student network is updated by minimizing the following loss function:
  • The first term is the classification loss for labeled data, which is the cross entropy between student network prediction fθ and label yi.
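  • The second term is the consistency loss between the predictions of the student and teacher networks. Kullback-Leibler (KL) divergence, instead of MSE, is used to measure the distance between the two predictions.
  • η and η′ denote the noise perturbations applied to the student and teacher networks, respectively.
  • The teacher network is not trained by gradient descent. In the standard Mean Teacher formulation, its weights θ′ are an exponential moving average of the student weights θ: θ′(t) = α·θ′(t−1) + (1 − α)·θ(t),
  • where t is the training step and α is the smoothing coefficient.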
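The two loss terms and the teacher update described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names, the weight `lam`, the smoothing coefficient `alpha`, the use of batch means, and the direction of the KL divergence (teacher relative to student) are all assumptions made for the example.

```python
import numpy as np

EPS = 1e-12  # numerical guard for log(0)


def softmax(logits):
    """Row-wise softmax over class logits of shape (batch, classes)."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(student_probs, labels):
    """Classification term: mean cross entropy on the labeled samples."""
    picked = student_probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + EPS).mean())


def kl_consistency(student_probs, teacher_probs):
    """Consistency term: mean KL(teacher || student). Direction is an assumption."""
    log_ratio = np.log(teacher_probs + EPS) - np.log(student_probs + EPS)
    return float((teacher_probs * log_ratio).sum(axis=1).mean())


def mean_teacher_loss(student_logits, teacher_logits, labels, n_labeled, lam=1.0):
    """Classification loss on the first n_labeled rows plus weighted consistency on all rows.

    `lam` is a hypothetical weighting hyperparameter.
    """
    s, t = softmax(student_logits), softmax(teacher_logits)
    return cross_entropy(s[:n_labeled], labels) + lam * kl_consistency(s, t)


def ema_update(teacher_params, student_params, alpha=0.99):
    """Teacher weights as an exponential moving average of student weights."""
    return {
        name: alpha * teacher_params[name] + (1.0 - alpha) * student_params[name]
        for name in teacher_params
    }
```

In a training loop, `mean_teacher_loss` would be minimized with respect to the student only, and `ema_update` would be called once after each student step.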

1.2. Brain ROI Consistency

  • However, the fetal brain is the ROI relevant for fetal brain MRI IQA since only the artifacts occurring in the brain affect diagnostic quality of the image. Therefore, it is essential to train the model to focus on features within the brain ROI.
  • First, an ROI extraction module is introduced (Fig. 1A).
  • For each image x, it produces a brain ROI mask R. xR = x ⊙ R is the masked image, where ⊙ is the Hadamard product.
  • A trained U-Net in [11] is used to segment fetal brains from MR slices.
  • To improve robustness, the masks of images belonging to the same scan are aggregated to generate a single ROI mask for the whole stack of images (Fig. 1C).
  • Masks with an area smaller than a threshold Amin are excluded.
  • The area-weighted mean q and variance of the mask centers over the retained masks B are then computed.
  • The final ROI mask R is defined as the circle centered at q with radius r based on mean and variance.
  • The ROI consistency loss is defined as the MSE between two features, namely those extracted from the original image x and from the masked image xR:
  • To guide the teacher network to learn meaningful features from the masked images, the classification loss for masked images in the labeled dataset is used as a regularization which is denoted as Lcls-roi:
  • Conditional entropy is adopted as an additional loss:
  • Therefore, the total loss of the proposed method is as follows:
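The stack-level ROI aggregation and the two extra loss terms can be sketched as follows. This is a hedged illustration under stated assumptions: the source only says the radius is "based on mean and variance", so the radius rule used here (equivalent radius of the largest retained mask plus `k` standard deviations of the centers) is hypothetical, as are the parameter names `a_min` and `k` and their default values.

```python
import numpy as np


def aggregate_roi_mask(masks, a_min=50, k=3.0):
    """Combine per-slice binary brain masks of shape (S, H, W) into one circular ROI."""
    masks = np.asarray(masks, dtype=bool)
    _, h, w = masks.shape
    areas, centers = [], []
    for m in masks:
        area = int(m.sum())
        if area < a_min:  # drop masks smaller than the threshold
            continue
        ys, xs = np.nonzero(m)
        areas.append(area)
        centers.append((ys.mean(), xs.mean()))
    if not areas:  # no usable mask in this stack
        return np.zeros((h, w), dtype=bool)
    areas = np.asarray(areas, dtype=float)
    centers = np.asarray(centers)
    weights = areas / areas.sum()
    # Area-weighted mean and (total) variance of the mask centers.
    q = (weights[:, None] * centers).sum(axis=0)
    var = (weights[:, None] * (centers - q) ** 2).sum()
    # ASSUMPTION: hypothetical radius rule, not taken from the paper.
    r = np.sqrt(areas.max() / np.pi) + k * np.sqrt(var)
    yy, xx = np.mgrid[:h, :w]
    return (yy - q[0]) ** 2 + (xx - q[1]) ** 2 <= r ** 2


def apply_roi(image, roi_mask):
    """Masked image as the Hadamard product of image and mask."""
    return image * roi_mask


def roi_consistency(feat_full, feat_masked):
    """MSE between features of the original and the masked image."""
    return float(((feat_full - feat_masked) ** 2).mean())


def conditional_entropy(probs):
    """Mean entropy of predicted class distributions of shape (batch, classes)."""
    return float(-(probs * np.log(probs + 1e-12)).sum(axis=1).mean())
```

Using one mask per stack means a single poorly segmented slice has little influence on the ROI, which is the robustness argument made above.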

2. Results

2.1. Dataset

  • A total of 217129 images were obtained from 644 previously acquired research and clinical scans of mothers with singleton pregnancies and no pathologies, with gestational ages ranging from 19 to 37 weeks.
  • A set of 11223 images from 42 subjects is selected as the labeled set and classified into three categories: diagnostic (D), non-diagnostic (N), and images without brain region of interest (W).
  • The labeled dataset is divided into training (7717 images), validation (1782 images), and test (1724 images) sets.

2.2. Setup

  • ResNet-34 is used as the backbone of student and teacher.
  • For each method, the model is trained using 1000, 2000, 4000 and all labeled data in training set (7717).
  • For the semi-supervised methods, all unlabeled data are used for training.
  • A batch size of 384 is used. To balance the number of labeled and unlabeled data seen by the model, 96 images in each batch are drawn from the labeled dataset and the remaining 288 are unlabeled.
  • Four NVIDIA TITAN X GPUs are used.
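The batch composition above (96 labeled plus 288 unlabeled images per batch of 384) can be sketched as an index sampler. This is only an illustration: the function name `mixed_batches`, the choice to walk through the unlabeled pool once per epoch, and resampling the labeled indices for every batch are assumptions, not details given in the source.

```python
import numpy as np


def mixed_batches(n_labeled, n_unlabeled, batch_size=384, labeled_per_batch=96, seed=0):
    """Yield (labeled_idx, unlabeled_idx) pairs with a fixed labeled/unlabeled split."""
    rng = np.random.default_rng(seed)
    unlabeled_per_batch = batch_size - labeled_per_batch  # 288 with the defaults
    order = rng.permutation(n_unlabeled)
    # One pass over the unlabeled pool; an incomplete final batch is dropped.
    for start in range(0, n_unlabeled - unlabeled_per_batch + 1, unlabeled_per_batch):
        unlabeled_idx = order[start:start + unlabeled_per_batch]
        # ASSUMPTION: labeled indices are redrawn for every batch, so the small
        # labeled set is revisited many times per pass over the unlabeled data.
        labeled_idx = rng.choice(n_labeled, size=labeled_per_batch, replace=False)
        yield labeled_idx, unlabeled_idx
```

With 1000 labeled images, for example, each labeled image is seen far more often than each unlabeled one, which is why a fixed per-batch ratio is used rather than sampling from the pooled data.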

2.3. Results

Results
  • The proposed method outperforms other state-of-the-art semi-supervised learning methods in terms of both accuracy and AUC for non-diagnostic images.

2.4. Online Implementation

Online Implementation
  • A pipeline is implemented that runs the IQA CNN during fetal MR scans to assign an IQA score to each slice. Slices with low IQA scores are reacquired.
  • The trained CNN is deployed on a GPU (NVIDIA 1050Ti).
  • The proposed method outperforms the supervised baseline and misses only one non-diagnostic slice on average when q = 50%.


Written by Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.
