Brief Review — BT‑Unet: A self‑supervised learning framework for biomedical image segmentation using barlow twins with U‑net models

BT‑Unet, Pretrain U-Net with Barlow Twins

Sik-Ho Tsang
Nov 25, 2022

BT‑Unet: A self‑supervised learning framework for biomedical image segmentation using barlow twins with U‑net models,
BT-Unet, by Indian Institute of Information Technology Allahabad,
2022 JML (Sik-Ho Tsang @ Medium)

  • BT-Unet is proposed, which uses the Barlow Twins approach to pre-train the encoder of a U-Net model via redundancy reduction, in an unsupervised manner, to learn data representations.
  • The complete network is then fine-tuned to perform the actual segmentation.

Outline

  1. BT-Unet
  2. Results

1. BT-Unet

BT-Unet framework: (a) pre-training the U-Net encoder network, and (b) fine-tuning the U-Net model that is initialized with the pre-trained encoder weights
  • The BT-Unet framework is divided into two phases: 1) Pre-training, and 2) Fine-tuning.
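
The hand-off between the two phases can be pictured with a small, framework-agnostic sketch. Plain dictionaries stand in for model weights here; the `encoder.` / `decoder.` key prefixes, the function name `init_from_pretrained_encoder` and all numeric values are hypothetical and not taken from the paper.

```python
# Sketch only: dictionaries of floats stand in for real weight tensors.
# The "encoder." / "decoder." naming scheme is an assumption for illustration.

def init_from_pretrained_encoder(model_weights, pretrained_encoder, prefix="encoder."):
    """Return a copy of model_weights whose encoder entries come from pre-training."""
    merged = dict(model_weights)  # non-encoder entries keep their default initialization
    for name, value in pretrained_encoder.items():
        key = prefix + name
        if key not in merged:
            raise KeyError(f"pre-trained weight {name!r} has no match in the model")
        merged[key] = value
    return merged


# Phase 1 output (hypothetical values): encoder weights learned during pre-training.
pretrained = {"conv1": 0.8, "conv2": -0.3}

# Freshly built U-Net with default initialization everywhere.
unet = {"encoder.conv1": 0.0, "encoder.conv2": 0.0, "decoder.up1": 0.1}

# Phase 2 starting point: pre-trained encoder, default decoder.
unet = init_from_pretrained_encoder(unet, pretrained)
print(unet)
```

In a real deep learning framework the same idea would usually be expressed by loading a saved encoder checkpoint into the encoder sub-module before fine-tuning starts.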

1.1. Pre-Training

  • In pre-training, the aim is to learn complex feature representations from unannotated data samples.
  • The encoder of each U-Net model is pre-trained with the Barlow Twins (BT) strategy and later fine-tuned to perform the actual segmentation.
  • (Please feel free to read Barlow Twins if interested.)
  • The BT-Unet framework is applied to several state-of-the-art U-Net models: vanilla U-Net, Attention U-Net (A-Unet), inception U-Net (I-Unet) and residual cross-spatial attention guided inception U-Net (RCA-IUnet).
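
For readers who want a concrete picture of the redundancy-reduction objective, below is a minimal plain-Python sketch of the Barlow Twins loss as described in the original Barlow Twins paper. It is not the BT-Unet authors' code. The function names, the toy embeddings and the weight `lam=5e-3` (the default reported in the Barlow Twins paper, which may differ from what BT-Unet uses) are assumptions for illustration.

```python
import math


def standardize(batch):
    """Standardize each embedding dimension to zero mean and unit std over the batch."""
    n, dim = len(batch), len(batch[0])
    out = [[0.0] * dim for _ in range(n)]
    for d in range(dim):
        col = [row[d] for row in batch]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        std = std if std > 0 else 1.0  # guard against a constant dimension
        for i in range(n):
            out[i][d] = (col[i] - mean) / std
    return out


def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """z_a, z_b: embeddings (batch x dim) of two distorted views of the same images."""
    n, dim = len(z_a), len(z_a[0])
    za, zb = standardize(z_a), standardize(z_b)
    on_diag = off_diag = 0.0
    for i in range(dim):
        for j in range(dim):
            # Entry (i, j) of the cross-correlation matrix between the two views.
            c_ij = sum(za[k][i] * zb[k][j] for k in range(n)) / n
            if i == j:
                on_diag += (1.0 - c_ij) ** 2   # invariance term: pull diagonal to 1
            else:
                off_diag += c_ij ** 2          # redundancy term: push off-diagonal to 0
    return on_diag + lam * off_diag


# Toy embeddings (hypothetical): two decorrelated dimensions vs. two identical ones.
decorrelated = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
redundant = [[1, 1], [1, 1], [-1, -1], [-1, -1]]

print(barlow_twins_loss(decorrelated, decorrelated))
print(barlow_twins_loss(redundant, redundant))
```

The first call sees identical views with uncorrelated dimensions, so both terms vanish. The second call is penalized only through the off-diagonal term, because its two dimensions carry the same information.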

1.2. Fine-Tuning

  • The weights of the encoder network in the U-Net model are initialized with pre-trained weights (from the first phase), whereas the rest of the network is initialized with default weights.
  • Finally, the U-Net model is fine-tuned with limited annotated samples for biomedical image segmentation.
  • U-Net models are fine-tuned with a segmentation loss L, defined as the average of the binary cross-entropy loss L_BC and the Dice coefficient loss L_DC, i.e. L = (L_BC + L_DC) / 2.
  • In L_BC and L_DC, y is the ground-truth label of a pixel, p(y) is the predicted label of that pixel, and N is the total number of pixels.
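
As a rough illustration of this combined loss, the sketch below averages a standard binary cross-entropy and a standard soft Dice loss over a flattened list of pixels. The exact form used in the paper may differ: the clipping constant `eps`, the smoothing constant `smooth` and the function names are assumptions added for numerical stability and readability.

```python
import math


def bce_loss(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy averaged over N pixels (eps clipping is an assumption)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


def dice_loss(y_true, y_pred, smooth=1.0):
    """Soft Dice loss; the smoothing term is an assumption, not taken from the paper."""
    intersection = sum(y * p for y, p in zip(y_true, y_pred))
    denominator = sum(y_true) + sum(y_pred)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def segmentation_loss(y_true, y_pred):
    """L = (L_BC + L_DC) / 2, the average described in the text."""
    return 0.5 * (bce_loss(y_true, y_pred) + dice_loss(y_true, y_pred))


# Hypothetical 4-pixel example: ground truth and three candidate predictions.
truth = [1, 0, 1, 0]
perfect = [1.0, 0.0, 1.0, 0.0]
decent = [0.9, 0.1, 0.8, 0.2]
inverted = [0.1, 0.9, 0.2, 0.8]

for name, pred in [("perfect", perfect), ("decent", decent), ("inverted", inverted)]:
    print(name, segmentation_loss(truth, pred))
```

The loss grows as the prediction moves away from the ground truth, with the cross-entropy term acting per pixel and the Dice term acting on the overlap of the whole mask.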

2. Results

2.1. Datasets

Summary of biomedical datasets used in the paper

2.2. Performance Using Fixed Small Training Set

Impact of BT pre-training on segmentation performance of the U-Net models
  • KDSB18: The BT-enabled U-Net models outperform their counterparts without BT.
  • BUSIS: Without pre-training, the U-Net and A-Unet models fail to learn feature maps for the tumor regions (0 precision, DC and mIoU); with pre-training, these models improve noticeably. For the I-Unet and RCA-IUnet models, considerable improvements are also observed with pre-training.
  • ISIC18: The I-Unet and RCA-IUnet models benefit the most, with 5.1% and 2.2% increases in precision, respectively. However, vanilla U-Net and A-Unet show a slight decline in performance with BT pre-training.
  • BraTS18: The I-Unet and RCA-IUnet models achieve significant gains in segmentation performance with the BT-Unet framework, whereas vanilla U-Net and A-Unet do not show the same behavior.
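
To make the reported metrics concrete, here is a small sketch of precision, Dice coefficient (DC) and IoU for a single binary mask. It is an illustration, not the paper's evaluation code: the function name, the toy masks and the convention of returning 0 when a denominator is empty are assumptions, and the paper's mIoU may additionally average over classes or images.

```python
def binary_metrics(y_true, y_pred):
    """Precision, Dice and IoU of the foreground class for flattened binary masks."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Assumed convention: an undefined ratio (empty denominator) counts as 0.
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }


# Hypothetical 6-pixel masks.
truth = [0, 1, 1, 0, 1, 0]
collapsed = [0, 0, 0, 0, 0, 0]   # model predicts background everywhere
partial = [0, 1, 1, 1, 0, 0]     # two hits, one false alarm, one miss

print(binary_metrics(truth, collapsed))
print(binary_metrics(truth, partial))
```

A model that predicts only background, as in the `collapsed` case, scores 0 on all three metrics, which is the kind of failure reported for U-Net and A-Unet on BUSIS without pre-training.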

2.3. Performance Using Different-Sized Small Training Set

Performance analysis of U-Net variants with and without pre-training using Barlow Twins (BT) over different fractions of training datasets (DS)
  • For all datasets, with training fractions below 50%, a similar change in performance is observed across the models.
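
An experiment over different training fractions could be set up along the following lines. This is only a sketch: the fractions, the seed, the function name and the choice to make smaller subsets nested inside larger ones are assumptions, since the review does not state how the paper drew its subsets.

```python
import random


def nested_fractions(sample_ids, fractions, seed=0):
    """Return {fraction: subset}; each smaller subset is a prefix of the larger ones."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the subsets reproducible
    return {f: ids[: max(1, round(f * len(ids)))] for f in sorted(fractions)}


# Hypothetical training set of 100 annotated samples and three fractions.
subsets = nested_fractions(range(100), fractions=[0.1, 0.2, 0.5])
for fraction, ids in subsets.items():
    print(fraction, len(ids))
```

Each model variant, with and without BT pre-training, would then be fine-tuned on every subset and evaluated on the same held-out test set.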

2.4. Qualitative Results

Qualitative comparative analysis of the segmentation performance
  • RCA-IUnet with BT produces very good segmentation results.

I hope to review Inception U-Net and RCA-IUnet in the future.

Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.