Brief Review — Medical Image Segmentation Based on Self-Supervised Hybrid Fusion Network

Multi-Modal ResUNet+ASPP+HAFB

Sik-Ho Tsang
3 min read · Jun 25, 2023
Multi-Modal ResUNet+ASPP+HAFB

Medical Image Segmentation Based on Self-Supervised Hybrid Fusion Network,
Multi-Modal ResUNet+ASPP+HAFB, by Dalian University of Technology, The First Affiliated Hospital of Dalian Medical University, and The Affiliated Central Hospital, Dalian University of Technology
2023 Frontiers in Oncology
(Sik-Ho Tsang @ Medium)

Biomedical Image Self-Supervised Learning
2018 … 2022 [BT-Unet] [Taleb JDiagnostics’22] [Self-Supervised Swin UNETR] [Self-Supervised Multi-Modal]
==== My Other Paper Readings Are Also Over Here ====

Outline

  1. Brief Review of Multi-Modal ResUNet+ASPP+HAFB
  2. Results

1. Brief Review of Multi-Modal ResUNet+ASPP+HAFB

  • Indeed, the model architecture and loss function are almost the same as those in Self-Supervised Multi-Modal (2022 JBHI). (So, I won't repeat too much here. Please feel free to read that story directly.)

A self-supervised multi-modal encoder-decoder network based on ResNet is proposed, as shown at the top, where HAFB enables feature learning using images from multiple modalities.

Hybrid Attentional Fusion Block (HAFB)

The network introduces a multi-modal Hybrid Attentional Fusion Block (HAFB) to fully extract the unique features of each modality and reduce the complexity of the whole framework.

  • One input is features from other modalities. The other input is features from the same modality but at a higher level (see the sketch below).
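
The paper gives the exact HAFB design; below is only a minimal PyTorch-style sketch of a hybrid attentional fusion block, assuming a common channel-attention plus spatial-attention fusion of the two inputs. The module name HAFBSketch, the channel-reduction ratio, and the 7×7 spatial-attention kernel are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class HAFBSketch(nn.Module):
    """Illustrative hybrid attentional fusion block (not the paper's exact design).

    Fuses features from other modalities with same-modality features
    from a higher level, using channel attention then spatial attention.
    """
    def __init__(self, channels):
        super().__init__()
        reduced = max(channels // 4, 1)
        # Channel attention (squeeze-and-excitation style) on the concatenated features
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, reduced, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(reduced, 2 * channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention over the channel-refined features
        self.spatial_att = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )
        # Project the fused features back to the original channel count
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, cross_modal, same_modal_higher):
        # Both inputs: (B, C, H, W); concatenate along channels
        x = torch.cat([cross_modal, same_modal_higher], dim=1)
        x = x * self.channel_att(x)                     # channel re-weighting
        avg_map = x.mean(dim=1, keepdim=True)           # (B, 1, H, W)
        max_map = x.max(dim=1, keepdim=True).values     # (B, 1, H, W)
        x = x * self.spatial_att(torch.cat([avg_map, max_map], dim=1))
        return self.proj(x)                             # back to (B, C, H, W)

# Example: fuse two 64-channel feature maps
# hafb = HAFBSketch(channels=64)
# out = hafb(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
```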
Similarity Loss and Image Masking

In addition, to better learn multi-modal complementary features and improve the robustness of the model, a pretext task is designed based on image masking.

  • By masking, the input x becomes a different input x′. A similarity loss is added to enforce their features to be the same even when one of them is masked (a sketch follows below).
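A minimal sketch of this pretext task, assuming a randomly placed square mask (20×20, matching the best-performing strategy reported below) and a cosine form of the similarity loss; the paper's exact masking and loss formulation may differ, and the encoder name is just a placeholder.

```python
import torch
import torch.nn.functional as F

def random_square_mask(x, size=20):
    """Zero out one randomly placed size x size square per image (illustrative)."""
    x_masked = x.clone()
    _, _, h, w = x.shape
    for i in range(x.shape[0]):
        top = torch.randint(0, h - size + 1, (1,)).item()
        left = torch.randint(0, w - size + 1, (1,)).item()
        x_masked[i, :, top:top + size, left:left + size] = 0.0
    return x_masked

def similarity_loss(feat, feat_masked):
    """Pull features of x and masked x' together (cosine form, assumed)."""
    feat = F.normalize(feat.flatten(1), dim=1)
    feat_masked = F.normalize(feat_masked.flatten(1), dim=1)
    return (1.0 - (feat * feat_masked).sum(dim=1)).mean()

# Usage with any encoder producing feature maps:
# x_prime = random_square_mask(x)                 # x -> x'
# loss = similarity_loss(encoder(x), encoder(x_prime))
```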

2. Results

  • Besides the BraTS 2019 dataset, which is evaluated in the 2022 JBHI paper (with different result values), BraTS 2020 is also evaluated.

2.1. BraTS 2020

BraTS 2020

The proposed model obtains the best results on ET (enhancing tumor) and WT (whole tumor).

2.2. Ablation Study

Ablation Study of Each Component

With the ASPP module (last row), although there is no significant improvement in the Dice coefficient, the ASPP module performs multi-scale feature extraction on the feature map at the end of the encoder, which makes edge information extraction more accurate.
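
For reference, a standard Atrous Spatial Pyramid Pooling (ASPP) module applies parallel dilated convolutions at several rates to the encoder output to capture multi-scale context. The dilation rates (1, 6, 12, 18) in the sketch below follow the common DeepLab-style configuration and are an assumption; the paper may use different rates and channel widths.

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Standard ASPP sketch: parallel dilated convolutions at multiple rates
    over the encoder's final feature map (rates are assumed, DeepLab-style)."""
    def __init__(self, in_channels, out_channels, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3,
                          padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        # Image-level pooling branch for global context
        self.global_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
        )
        self.project = nn.Conv2d(out_channels * (len(rates) + 1),
                                 out_channels, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        pooled = nn.functional.interpolate(self.global_pool(x), size=(h, w),
                                           mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))
```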

Masking Strategies

All these masking strategies make the self-supervised pretraining effective and improve the model accuracy. Overall, the best performer is the 20×20 square mask.


Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.