Brief Review — FocalMix: Semi-Supervised Learning for 3D Medical Image Detection

4 min read · Jul 19, 2023

FocalMix: Semi-Supervised Learning for 3D Medical Image Detection
FocalMix, by Peking University and Yizhun Medical AI Co., Ltd,
2020 CVPR, Over 100 Citations (Sik-Ho Tsang @ Medium)
Biomedical Image Semi-Supervised Learning
2019 [UA+MT] 2020 [SASSNet]
==== My Other Paper Readings Are Also Over Here ====

FocalMix is proposed as the first method to leverage recent advances in semi-supervised learning (SSL) for 3D medical image detection.

Outline

1. Preliminaries
2. FocalMix
3. Results

1. Preliminaries

1.1. Example of Medical Image Detector

Example of Medical Image Detection

1.2. Focal Loss for Imbalanced Dataset

The original Focal Loss (FL) used in RetinaNet is:

(Please feel free to read about RetinaNet if interested.)
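
For reference, the focal loss from RetinaNet is commonly written as FL(p_t) = −α_t (1 − p_t)^γ log(p_t), where p_t is the probability the model assigns to the true class. Below is a minimal Python sketch for a single binary prediction. The function name is hypothetical, and α = 0.25, γ = 2 are the RetinaNet defaults, used here only as illustrative values rather than the settings of FocalMix.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction.

    p: predicted foreground probability, strictly between 0 and 1
    y: hard label, 1 (foreground) or 0 (background)
    alpha, gamma: illustrative values (RetinaNet defaults)
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# A well-classified background anchor contributes far less than a misclassified one.
easy = focal_loss(0.01, 0)
hard = focal_loss(0.90, 0)
print(easy, hard)
```

The (1 − p_t)^γ factor is what down-weights the many easy background anchors, which is why this loss suits the heavy foreground/background imbalance in detection.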

1.3. MixMatch for Semi-Supervised Learning

MixMatch consists of two major components, target prediction for unlabeled data and mixup augmentation. MixMatch uses the average ensemble of predictions by the current model parameterized by θ on K augmented instances:

Then, these guessed labels are further transformed by a sharpening operator before being used as training targets:

The sharpening operation implicitly encourages the model to output low-entropy predictions on unlabeled data.
(Please feel free to read about MixMatch if interested.)
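
The two steps above can be sketched in a few lines of Python. In MixMatch, the guessed label is the mean of the model's predictions over K augmentations, and sharpening raises each probability to the power 1/T and renormalises. The function names and the toy predictions below are hypothetical, and T = 0.5 is only an illustrative temperature.

```python
def guess_label(predictions):
    """Average the class distributions predicted for K augmentations."""
    k = len(predictions)
    num_classes = len(predictions[0])
    return [sum(p[c] for p in predictions) / k for c in range(num_classes)]

def sharpen(probs, temperature=0.5):
    """Raise each probability to 1/T and renormalise; T < 1 lowers entropy."""
    powered = [q ** (1.0 / temperature) for q in probs]
    total = sum(powered)
    return [q / total for q in powered]

# Hypothetical predictions on K=2 augmentations of one unlabeled example.
preds = [[0.6, 0.4], [0.8, 0.2]]
avg = guess_label(preds)     # roughly [0.7, 0.3]
target = sharpen(avg, 0.5)   # squares each entry, then renormalises
print(avg, target)
```

With T = 0.5 the dominant class moves from about 0.7 to about 0.84, which is the low-entropy effect described above.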

1.4. mixup for Augmentation

mixup augmentation produces a stochastic linear interpolation with another training example (x′, y′), either labeled or unlabeled:

With mixup, two images x and x′ are mixed together as x̂. The corresponding image labels y and y′ are mixed as ŷ. (Please read mixup if interested.)
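
As a rough illustration, mixup draws a coefficient λ from a Beta(α, α) distribution and forms x̂ = λx + (1 − λ)x′ and ŷ = λy + (1 − λ)y′. The sketch below assumes inputs are flat lists of numbers. The function name is hypothetical, α = 0.75 is just an illustrative value, and the max(λ, 1 − λ) step is the MixMatch variant that keeps the mixed example closer to the first input.

```python
import random

def mixup(x, y, x2, y2, alpha=0.75, rng=random):
    """Mix two examples with a coefficient drawn from Beta(alpha, alpha).

    x, x2: flat lists of pixel/voxel values of equal length
    y, y2: label vectors of equal length (hard or soft)
    rng: the random module or a random.Random instance
    """
    lam = rng.betavariate(alpha, alpha)
    lam = max(lam, 1.0 - lam)  # MixMatch variant: stay closer to (x, y)
    x_hat = [lam * a + (1.0 - lam) * b for a, b in zip(x, x2)]
    y_hat = [lam * a + (1.0 - lam) * b for a, b in zip(y, y2)]
    return x_hat, y_hat, lam

# Toy example: two 2-pixel "images" with one-hot labels.
x_hat, y_hat, lam = mixup([0.0, 1.0], [1.0, 0.0],
                          [1.0, 0.0], [0.0, 1.0],
                          rng=random.Random(0))
print(lam, x_hat, y_hat)
```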

2. FocalMix

FocalMix Overview

Following the recommendations in Oliver NeurIPS’18, the exact same model, a 3D variant of FPN, is used as both the fully-supervised baseline and the base model for FocalMix.
Two essential components in the MixMatch framework are tailored specifically for lesion detection tasks: target prediction and mixup augmentation.

2.1. Soft-Target Focal Loss

MixMatch produces soft training targets (from label guessing and mixup), whereas the original focal loss is defined only for hard 0/1 labels, so it cannot be applied directly.
The proposed soft-target focal loss (SFL) for SSL is defined as:

where CE loss is:

As we can see, focal loss is a special case of the proposed soft-target focal loss.
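
A minimal sketch of this idea, assuming the form SFL(p) = [α0 + y(α1 − α0)] · |y − p|^γ · CE(y, p) with CE(y, p) = −y log p − (1 − y) log(1 − p), where y is a soft target in [0, 1]. This form is an assumption chosen to be consistent with the statement that focal loss is a special case. The function names and the values α0 = 0.75, α1 = 0.25, γ = 2 are illustrative, not the paper's settings.

```python
import math

def cross_entropy(y, p):
    """Binary cross-entropy between a soft target y and a prediction p."""
    return -y * math.log(p) - (1.0 - y) * math.log(1.0 - p)

def soft_target_focal_loss(p, y, alpha0=0.75, alpha1=0.25, gamma=2.0):
    """Assumed SFL form; p must lie strictly between 0 and 1."""
    weight = alpha0 + y * (alpha1 - alpha0)  # interpolates between alpha0 and alpha1
    modulator = abs(y - p) ** gamma          # small when p is close to the target
    return weight * modulator * cross_entropy(y, p)

# With hard labels the expression collapses to the usual focal loss terms.
p = 0.3
fl_pos = -0.25 * (1.0 - p) ** 2 * math.log(p)    # focal loss for y = 1
fl_neg = -0.75 * p ** 2 * math.log(1.0 - p)      # focal loss for y = 0
print(soft_target_focal_loss(p, 1.0), fl_pos)
print(soft_target_focal_loss(p, 0.0), fl_neg)
print(soft_target_focal_loss(p, 0.6))            # a soft target also works
```

Setting y = 1 gives α1 (1 − p)^γ (−log p) and setting y = 0 gives α0 p^γ (−log(1 − p)), which are the two branches of the hard-label focal loss.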

2.2. mixup Augmentation for Detection

mixup cannot be applied directly to bounding boxes.
Image-level mixup is used instead, so that the mixup training signals are at the anchor level. Anchor-to-anchor mixup requires the model to detect lesions that are mixed with stronger background noise than usual, analogous to the idea of “altitude training”.

Object-level mixup is also applied to generate extra object instances by mixing up different lesion patterns within each training batch.
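
To make the anchor-level view of image-level mixup concrete, here is a toy sketch. It assumes both patches have the same size and the same anchor layout, so that anchors at the same position can be blended with the same coefficient as the voxels. The function name and the data are hypothetical, and object-level mixup is not covered.

```python
def mix_image_and_anchor_targets(voxels_a, anchors_a, voxels_b, anchors_b, lam):
    """Blend two volumes and their per-anchor targets with one coefficient.

    voxels_*: flat lists of intensities for two same-sized CT patches
    anchors_*: flat lists of per-anchor foreground targets in [0, 1],
               listed in the same anchor order for both patches
    lam: mixing coefficient in [0, 1]
    """
    mixed_voxels = [lam * a + (1.0 - lam) * b for a, b in zip(voxels_a, voxels_b)]
    mixed_anchors = [lam * a + (1.0 - lam) * b for a, b in zip(anchors_a, anchors_b)]
    return mixed_voxels, mixed_anchors

# Hypothetical toy data: patch A has a lesion at anchor 1, patch B is all background.
mixed, soft_targets = mix_image_and_anchor_targets(
    [0.2, 0.9, 0.1], [0.0, 1.0, 0.0],
    [0.3, 0.3, 0.3], [0.0, 0.0, 0.0],
    lam=0.7,
)
print(soft_targets)  # the lesion anchor now carries a soft target of about 0.7
```

The lesion anchor ends up with a fractional target, which is exactly the kind of soft label that a hard-label focal loss cannot handle.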

3. Results

3.1. LUNA16

LUNA16

When 25 labeled images are used, the fully-supervised model can only obtain a CPM score of 66.6%, whereas FocalMix boosts it to 78.1% with a 17.3% relative improvement.
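
The relative improvement quoted above can be checked directly from the two CPM scores. The helper name is hypothetical:

```python
def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0

gain = relative_improvement(66.6, 78.1)
print(round(gain, 1))  # 17.3
```

So the 11.5-point absolute gain in CPM corresponds to roughly 17.3% of the baseline score.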

LUNA16

The CPM score grows consistently as the amount of unlabeled data increases, which supports the effectiveness of using unlabeled data in FocalMix.

3.2. Ablation Study

Ablation Study

Among the loss functions compared, SFL performs best. K=4 augmentations gives the best result. Using both image-level and object-level mixup yields the highest CPM.

Examples of mixup

Intuitively, the goal of image-level mixup is to encourage models to behave linearly between foreground and background, while object-level mixup encourages models to detect lesions with richer patterns.

All models are trained for 400 epochs. When using all 533 annotated CT scans, the proposed mixup strategies (i.e., anchor-level and object-level mixup) alone improve the CPM score of the fully-supervised learning approach from 89.2% to 90.0%.

FocalMix further improves this result to 90.7% by leveraging around 3,000 images without annotation.

Written by Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.