Brief Review — ResUNet++, CRF and TTA for Colorectal Polyp Segmentation

ResUNet++, CRF and TTA

Apr 5, 2023

A Comprehensive Study on Colorectal Polyp Segmentation With ResUNet++, Conditional Random Field and Test-Time Augmentation,
ResUNet++, by SimulaMet, UiT The Arctic University of Norway, University of Oslo, Sahlgrenska University Hospital, Bærum Hospital, University of Gothenburg, and Oslo Metropolitan University,
2021 J. Biomedical and Health Informatics, Over 90 Citations, and
2019 ISM, Over 400 Citations (Sik-Ho Tsang @ Medium)
Biomedical Image Segmentation
2015 … 2022 [UNETR] [Half-UNet] [BUSIS] [RCA-IUNet] 2023 [DCSAU-Net]

ResUNet++ was proposed in 2019 ISM, in “ResUNet++: An Advanced Architecture for Medical Image Segmentation”.
In this paper, ResUNet++ is extended with the use of Conditional Random Field (CRF) and Test-Time Augmentation (TTA).

Outline

ResUNet++
CRF and TTA
Results

1. ResUNet++

The backbone of the ResUNet++ architecture is ResUNet, an encoder-decoder network based on U-Net that uses residual blocks.

Besides residual blocks (ResNet), the proposed architecture also takes advantage of squeeze-and-excitation blocks (SENet, dark gray), atrous spatial pyramid pooling (ASPP, dark red) (DeepLabv3), and attention blocks (Transformer, green).

(The paper does not describe the above modules in detail either. If interested, please feel free to read the stories that I wrote for them.)
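As one illustration of the modules listed above, here is a minimal NumPy sketch of a squeeze-and-excitation block. It is not taken from the paper: the function name `squeeze_excite`, the weight matrices `w1` and `w2`, and the omission of bias terms are simplifying assumptions made for illustration only.

```python
import numpy as np


def squeeze_excite(features, w1, w2):
    """Re-weight the channels of a feature map (illustrative sketch).

    features: array of shape (H, W, C)
    w1:       array of shape (C, C // r), the reduction layer (hypothetical weights)
    w2:       array of shape (C // r, C), the expansion layer (hypothetical weights)
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(0, 1))            # shape (C,)
    # Excite: bottleneck with ReLU, then sigmoid gates in (0, 1).
    hidden = np.maximum(squeezed @ w1, 0.0)          # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))     # shape (C,)
    # Scale: each channel is multiplied by its gate (broadcast over H, W).
    return features * gates


# Usage with random weights and a reduction ratio r = 4 (assumed values).
rng = np.random.default_rng(0)
feature_map = rng.random((8, 8, 16))
reweighted = squeeze_excite(feature_map,
                            rng.normal(size=(16, 4)),
                            rng.normal(size=(4, 16)))
```

The output has the same shape as the input; only the relative strength of the channels changes.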

2. CRF and TTA

2.1. Conditional Random Field (CRF)

Conditional Random Field (CRF) is a popular statistical modeling method used when the class labels for different inputs are not independent (e.g., image segmentation tasks).
CRF can model useful geometric characteristics like shape, region connectivity, and contextual information.
The CRF concept is also used in DeepLabv1 and CRF-RNN.

Here, CRF acts as a post-processing step to refine the predicted segmentation map.
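To make the post-processing idea concrete, here is a minimal sketch of mean-field inference on a simple grid CRF. It rests on assumptions not stated in this review: a binary foreground/background problem, a 4-connected neighbourhood, and a Potts pairwise term with a hypothetical weight `pairwise_weight`. The paper's actual CRF configuration is not described here, and this simplified model ignores image colour entirely.

```python
import numpy as np


def neighbor_sum(q):
    """Sum each pixel's 4-connected neighbours (zero padding at borders)."""
    padded = np.pad(q, ((0, 0), (1, 1), (1, 1)))
    return (padded[:, :-2, 1:-1] + padded[:, 2:, 1:-1]
            + padded[:, 1:-1, :-2] + padded[:, 1:-1, 2:])


def refine_with_grid_crf(prob_fg, pairwise_weight=1.0, n_iters=5, eps=1e-6):
    """Smooth a foreground probability map of shape (H, W).

    Illustrative only: the unary term comes from the network output and
    the Potts pairwise term rewards agreement with neighbouring pixels.
    """
    probs = np.stack([1.0 - prob_fg, prob_fg]).clip(eps, 1.0)  # (2, H, W)
    unary = -np.log(probs)
    q = probs.copy()
    for _ in range(n_iters):
        # Mean-field update for a Potts model:
        # q_i(l) is proportional to exp(-unary_i(l) + w * sum_j q_j(l)).
        logits = -unary + pairwise_weight * neighbor_sum(q)
        logits -= logits.max(axis=0, keepdims=True)   # numerical stability
        q = np.exp(logits)
        q /= q.sum(axis=0, keepdims=True)
    return q[1]  # refined foreground probability


# Usage: a confident blob plus one isolated, weakly positive pixel.
prob = np.full((12, 12), 0.1)
prob[2:7, 2:7] = 0.9
prob[9, 9] = 0.6
refined = refine_with_grid_crf(prob)
mask = refined > 0.5
```

In this toy setting the isolated pixel is pulled toward the background label of its neighbours while the interior of the blob stays foreground, which is the kind of region-connectivity cleanup described above.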

2.2. Test-Time Augmentation (TTA)

Test-Time Augmentation (TTA) is commonly used in image classification models.
In TTA, augmentation is applied to each test image to create multiple augmented copies. Predictions are then made on these augmented images, and the average of those predictions is taken as the final output prediction.
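The procedure above can be sketched in a few lines of NumPy. This is a minimal sketch, not the paper's implementation: the choice of horizontal and vertical flips, the function name `predict_with_tta`, and the `model` callable are assumptions made for illustration. For segmentation, each predicted map has to be mapped back to the original orientation before averaging, which is why every augmentation is paired with its inverse.

```python
import numpy as np


def predict_with_tta(model, image):
    """Average predictions over flipped copies of `image`.

    model: any callable mapping an (H, W, C) array to an (H, W)
           probability map (a placeholder, not the paper's network).
    """
    # Each augmentation is paired with the transform that undoes it.
    # Flips are their own inverse.
    transforms = [
        (lambda x: x,          lambda y: y),           # identity
        (lambda x: x[:, ::-1], lambda y: y[:, ::-1]),  # horizontal flip
        (lambda x: x[::-1, :], lambda y: y[::-1, :]),  # vertical flip
    ]
    predictions = []
    for augment, undo in transforms:
        pred = model(augment(image))
        predictions.append(undo(pred))  # back to the original orientation
    return np.mean(predictions, axis=0)


def toy_model(image):
    """Hypothetical stand-in for a segmentation network."""
    return image.mean(axis=-1)


# Usage with a random image.
test_image = np.random.default_rng(0).random((4, 6, 3))
averaged = predict_with_tta(toy_model, test_image)
```

Because the toy model is flip-equivariant, the averaged output here equals a single direct prediction; with a real network the three predictions generally differ, and averaging them is what smooths the result.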