Brief Review — PraNet: Parallel Reverse Attention Network for Polyp Segmentation

PraNet, Using Res2Net as Backbone

Sik-Ho Tsang
4 min read · Apr 16, 2023

PraNet: Parallel Reverse Attention Network for Polyp Segmentation,
PraNet, by Inception Institute of Artificial Intelligence, Wuhan University, and Mohamed bin Zayed University of Artificial Intelligence,
2020 MICCAI, Over 440 Citations (Sik-Ho Tsang @ Medium)

Biomedical Image Segmentation
2015 … 2022 [UNETR] [Half-UNet] [BUSIS] [RCA-IUNet] 2023 [DCSAU-Net]
==== My Other Paper Readings Are Also Over Here ====

  • A parallel reverse attention network (PraNet) is proposed, which aggregates the features in high-level layers using a parallel partial decoder (PPD).
  • Then, a global map is generated as the initial guidance area.
  • The boundary cues are mined using the reverse attention (RA) module, establishing the relationship between areas and boundary cues.

Outline

  1. PraNet
  2. Results

1. PraNet

Overview of the proposed PraNet, which consists of three reverse attention modules with a parallel partial decoder connection.
  • PraNet utilizes a parallel partial decoder to generate the high-level semantic global map and a set of reverse attention modules for accurate polyp segmentation from the colonoscopy images.

1.1. Feature Aggregating via Parallel Partial Decoder

  • Res2Net-based backbone is used to extract 5 levels of features from f1 to f5.
  • The partial decoder feature in [29] is computed as PD = pd(f3, f4, f5) to obtain a global map Sg. (No details about PD are given in the paper.)
  • In [29], each feature is updated by element-wise multiplying it with all features of the deeper layers, after up-sampling (Up) and convolution (Conv):

    fi ← fi ⊗ Conv(Up(f(i+1))) ⊗ … ⊗ Conv(Up(f5))

  • This Sg only captures a relatively rough location of the polyp tissues, without structural details. Thus, the RA module is introduced. (A minimal code sketch of the aggregation is given right after this list.)
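
To make the aggregation concrete, here is a minimal PyTorch sketch of this partial-decoder-style fusion. The channel width (32), the 3×3 refinement convolutions, and the final 1×1 fusion into a one-channel map are illustrative assumptions, not the exact configuration of [29] or PraNet.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartialDecoderSketch(nn.Module):
    # Sketch of partial-decoder aggregation: each feature is element-wise
    # multiplied with the up-sampled-and-convolved features of all deeper
    # layers, then everything is fused into the global map Sg.
    def __init__(self, channels=32):
        super().__init__()
        self.conv4_to3 = nn.Conv2d(channels, channels, 3, padding=1)  # refines Up(f4) for f3
        self.conv5_to3 = nn.Conv2d(channels, channels, 3, padding=1)  # refines Up(f5) for f3
        self.conv5_to4 = nn.Conv2d(channels, channels, 3, padding=1)  # refines Up(f5) for f4
        self.fuse = nn.Conv2d(3 * channels, 1, kernel_size=1)         # assumed fusion to 1-channel Sg

    @staticmethod
    def _up(x, ref):
        # Up(): bilinear up-sampling to the reference feature's spatial size
        return F.interpolate(x, size=ref.shape[2:], mode='bilinear', align_corners=False)

    def forward(self, f3, f4, f5):
        # f3, f4, f5: high-level backbone features (strides 8, 16, 32), all `channels` wide
        f4_r = f4 * self.conv5_to4(self._up(f5, f4))      # f4 <- f4 * Conv(Up(f5))
        f3_r = (f3 * self.conv4_to3(self._up(f4, f3))
                   * self.conv5_to3(self._up(f5, f3)))    # f3 <- f3 * Conv(Up(f4)) * Conv(Up(f5))
        cat = torch.cat([f3_r, self._up(f4_r, f3), self._up(f5, f3)], dim=1)
        return self.fuse(cat)                             # Sg: rough location map of the polyp

# Usage with a 352x352 input (f3/f4/f5 at 44x44, 22x22, 11x11):
pd = PartialDecoderSketch(channels=32)
f3, f4, f5 = (torch.randn(1, 32, s, s) for s in (44, 22, 11))
Sg = pd(f3, f4, f5)   # shape (1, 1, 44, 44)
```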

1.2. Reverse Attention (RA) Module

  • Multiple cascaded RA modules are used to progressively increase the resolution.
  • Specifically, the output reverse attention feature Ri is obtained by element-wise multiplying the high-level side-output feature {fi; i=3, 4, 5} with a reverse attention weight Ai, as below:

    Ri = fi ⊙ Ai

  • where Ai is:

    Ai = ⊖(σ(P(S(i+1))))

  • where P() denotes an up-sampling operation, σ() is the Sigmoid function, and ⊖() is a reverse operation that subtracts the input from a matrix E, in which all the elements are 1. (A minimal code sketch follows this list.)
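
Since Ai only involves up-sampling, a Sigmoid, and a subtraction from the all-ones matrix E, the module is cheap. Below is a minimal PyTorch sketch of this step; the refinement head that turns Ri into the next side-output Si is only hinted at in a comment, since its extra convolutions are not detailed in this review.

```python
import torch
import torch.nn.functional as F

def reverse_attention(f_i, S_next):
    # Sketch of the RA module: Ri = fi ⊙ Ai, with Ai = E - sigmoid(P(S(i+1))),
    # E being the all-ones matrix.
    #   f_i:    side-output feature {fi; i=3, 4, 5}, shape (B, C, H, W)
    #   S_next: coarser prediction map S(i+1) (or Sg at the first stage), shape (B, 1, h, w)
    s_up = F.interpolate(S_next, size=f_i.shape[2:], mode='bilinear',
                         align_corners=False)   # P(): up-sampling to fi's size
    A_i = 1.0 - torch.sigmoid(s_up)             # reverse weight: attend to the not-yet-predicted region
    R_i = f_i * A_i                             # broadcast over the C channels
    # In the full network, Ri is further convolved and added to the up-sampled
    # S(i+1) to produce the refined side-output Si (omitted in this sketch).
    return R_i
```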

1.3. Loss Function

  • The loss function is defined as:

    L = L_IoU^w + L_BCE^w

  • i.e., the weighted IoU loss and the weighted binary cross entropy (BCE) loss, serving as the global (region-level) restriction and the local (pixel-level) restriction, respectively.
  • Deep supervision is also used for the three side-outputs (i.e., S3, S4, and S5) and the global map Sg. Each map is up-sampled to the same size as the ground-truth map G.
  • Thus, the total loss for the proposed PraNet is (sketched in code below):

    L_total = L(G, Sg↑) + Σ(i=3…5) L(G, Si↑)
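
A hedged PyTorch sketch of this training objective is below. The boundary-aware pixel weights (a 31×31 local average and a factor of 5) follow widely circulated public implementations of this loss and are assumptions here; the review above does not specify them.

```python
import torch
import torch.nn.functional as F

def structure_loss(pred, mask):
    # Sketch of L = L_IoU^w + L_BCE^w for one prediction map.
    #   pred: logits, shape (B, 1, H, W); mask: binary ground truth G, same shape.
    # Pixel weights emphasize hard pixels where the mask differs from its local
    # mean (i.e., near boundaries); the 31x31 window and factor 5 are assumptions.
    weit = 1 + 5 * torch.abs(F.avg_pool2d(mask, 31, stride=1, padding=15) - mask)
    wbce = F.binary_cross_entropy_with_logits(pred, mask, reduction='none')
    wbce = (weit * wbce).sum(dim=(2, 3)) / weit.sum(dim=(2, 3))   # weighted BCE (local, pixel-level)
    prob = torch.sigmoid(pred)
    inter = (prob * mask * weit).sum(dim=(2, 3))
    union = ((prob + mask) * weit).sum(dim=(2, 3))
    wiou = 1 - (inter + 1) / (union - inter + 1)                  # weighted IoU (global, region-level)
    return (wbce + wiou).mean()

def total_loss(Sg, S3, S4, S5, gt):
    # Deep supervision over the global map and the three side-outputs;
    # all maps are assumed already up-sampled to the ground truth's size.
    return sum(structure_loss(s, gt) for s in (Sg, S3, S4, S5))
```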

2. Results

2.1. Kvasir and CVC-612

Quantitative results on Kvasir and CVC-612 datasets.

PraNet outperforms all SOTAs by a large margin (mean Dice improvement of roughly 7% or more), across both datasets, in all metrics.

2.2. CVC-ColonDB, ETIS, and test set (CVC-T)

Quantitative results on CVC-ColonDB, ETIS, and test set (CVC-T) of EndoScene datasets.
  • All images here are used as the unseen testing set.
  • PraNet again outperforms existing classical medical segmentation baselines (i.e., U-Net, UNet++), as well as SFA, with significant improvements on all three unseen datasets.

2.3. Training and Inference Analysis

Training and inference analysis (same platform) on CVC-ClinicDB dataset.
  • The proposed model achieves convergence with only 20 epochs (0.5 hours) of training.

PraNet runs at a real-time speed of ~50fps for a 352×352 input, which means the method can be applied to colonoscopy video.

2.4. Ablation Study

Ablation study for PraNet on the CVC-612 and CVC-300 datasets.

Each item contributes to the gain.

2.5. Qualitative Results

Qualitative results of different methods.

PraNet can precisely locate and segment polyp tissues in many challenging cases, such as varied sizes, homogeneous regions, and different kinds of textures.


Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.