Brief Review — Selective Feature Aggregation Network with Area-Boundary Constraints for Polyp Segmentation
Selective Feature Aggregation Network with Area-Boundary Constraints for Polyp Segmentation,
UNet+Up+SKM, by The Chinese University of Hong Kong and City University of Hong Kong,
2019 MICCAI, Over 140 Citations (Sik-Ho Tsang @ Medium)
- The network contains a shared encoder and two mutually constrained decoders for predicting polyp areas and boundaries, respectively.
- Selective Feature Aggregation is proposed by (1) introducing three up-concatenations between the encoder and decoders and (2) embedding Selective Kernel Modules (SKMs) into the convolutional layers, which can adaptively extract features from kernels of different sizes.
- Furthermore, a new boundary-sensitive loss function is used.
- Selective Feature Aggregation Network
- Boundary-Sensitive Loss Function
1. Selective Feature Aggregation Network
1.1. Overall Architecture
- The network is composed of a shared encoder, an area branch, and a boundary branch.
- Each branch contains four convolutional modules. Each module contains three layers integrated with the SKMs.
- On top of the area branch, a 2-layer light-weight U-Net is adopted to help detect boundaries of the predicted areas.
- Besides the standard skip connections (dashed lines), three extra up-concatenations (red arrow lines) are added to both the area and boundary branches, enriching the feature representations.
1.2. Selective Kernel Module (SKM)
- SKM can dynamically aggregate features obtained from kernels of different sizes.
- An input feature map X is first filtered by three kernels of different sizes simultaneously, each followed by Batch Normalization and a ReLU activation, producing three distinct feature maps X3, X5, X7.
- To regress the weight vectors, element-wise summation of the three feature maps is performed to obtain ~X. Global average pooling (GAP) and a fully connected (FC) layer are then applied, followed by a softmax to obtain mk.
- The obtained mk are used as weights for X3, X5, X7 to obtain ^X3, ^X5, and ^X7. Finally, these are aggregated to form ^X.
- (Please feel free to read SKNet for more details.)
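The selective weighting described above can be sketched in NumPy. This is a minimal illustration of the fusion step only (summation → GAP → FC → softmax → weighted aggregation); the FC weight shape `fc_w` and the absence of the reduction layer used in SKNet are simplifying assumptions, not the paper's exact configuration:

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def selective_aggregate(x3, x5, x7, fc_w):
    """Fuse three feature maps of shape (C, H, W) produced by the
    3x3, 5x5, and 7x7 convolution paths.

    fc_w: (3*C, C) weights of a single FC layer (hypothetical shape).
    """
    x_sum = x3 + x5 + x7                      # element-wise summation -> ~X
    s = x_sum.mean(axis=(1, 2))               # GAP -> channel descriptor (C,)
    z = fc_w @ s                              # FC -> logits (3*C,)
    m = softmax(z.reshape(3, -1), axis=0)     # per-channel weights m_k, sum to 1
    # Weighted feature maps ^X3, ^X5, ^X7, aggregated into ^X.
    return (m[0][:, None, None] * x3 +
            m[1][:, None, None] * x5 +
            m[2][:, None, None] * x7)
```

With all-zero FC weights the softmax is uniform, so the module degenerates to a plain average of the three branches, which is a handy sanity check.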
2. Boundary-Sensitive Loss Function
- The loss function is composed of three parts: an area loss La, a boundary loss Lb, and the area-boundary constraint loss, i.e., LC1 and LC2.
2.1. Area Loss
- La consists of a binary cross-entropy loss and a dice loss:
- where mi indicates the probability of pixel i being categorized into the polyp class and zi is the ground-truth label.
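The equation itself appears as an image in the original post; a minimal NumPy sketch of the area loss, assuming the standard binary cross-entropy and Dice formulations:

```python
import numpy as np

def area_loss(m, z, eps=1e-7):
    """La = BCE + Dice loss.

    m: predicted polyp probabilities per pixel, z: binary ground truth.
    eps clips probabilities away from 0/1 for numerical stability.
    """
    m = np.clip(m, eps, 1 - eps)
    # Binary cross-entropy averaged over pixels.
    bce = -np.mean(z * np.log(m) + (1 - z) * np.log(1 - m))
    # Soft Dice loss: 1 - 2|M∩Z| / (|M| + |Z|).
    dice = 1 - (2 * np.sum(m * z) + eps) / (np.sum(m) + np.sum(z) + eps)
    return bce + dice
```

A perfect prediction drives both terms toward zero, while the Dice term keeps the loss informative under the heavy foreground/background imbalance typical of polyp masks.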
2.2. Boundary Loss
- Lb measures the difference between outputs of boundary branch and boundary ground truth labels:
2.3. Area-Boundary Constraint Loss
- The area-boundary constraint loss is composed of two parts.
- The first part LC1 is to minimize the difference between edge detector results and boundary ground truth.
- The second part LC2 aims to minimize the difference between edge detector results and outputs of boundary branch.
- where qi is the result predicted by the edge detector, i.e., the light-weight U-Net, yi denotes the boundary ground truth, and pi indicates the output of the boundary branch.
- DKL denotes the Kullback-Leibler divergence. Minimizing DKL is equivalent to making the final outputs of the area and boundary branches closer.
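Since the constraint-loss equations are images in the original post, here is a NumPy sketch under stated assumptions: LC1 is taken as binary cross-entropy between the edge-detector output q and the boundary ground truth y, and LC2 as a pixel-wise Bernoulli KL divergence between the boundary-branch output p and q (the exact per-term forms in the paper may differ):

```python
import numpy as np

def kl_div(p, q, eps=1e-7):
    """Pixel-wise Bernoulli KL divergence D_KL(p || q), averaged over pixels."""
    p = np.clip(p, eps, 1 - eps)
    q = np.clip(q, eps, 1 - eps)
    return np.mean(p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q)))

def constraint_losses(q_pred, y_gt, p_bnd, eps=1e-7):
    """LC1: edge-detector output q vs boundary GT y (BCE, assumed form).
       LC2: boundary-branch output p vs edge-detector output q (KL)."""
    qc = np.clip(q_pred, eps, 1 - eps)
    lc1 = -np.mean(y_gt * np.log(qc) + (1 - y_gt) * np.log(1 - qc))
    lc2 = kl_div(p_bnd, q_pred)
    return lc1, lc2
</n```

When the two branches agree (p = q), LC2 vanishes, which is exactly the "make the outputs closer" behaviour described above.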
2.4. Total Loss
- where wa, wb, and wC1 are set to 1, and wC2 is set to 0.5.
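Putting the pieces together, the total objective is a weighted sum of the four terms with the weights quoted above (a sketch; the individual loss values would come from the area, boundary, and constraint losses):

```python
def total_loss(la, lb, lc1, lc2, wa=1.0, wb=1.0, wc1=1.0, wc2=0.5):
    """L = wa*La + wb*Lb + wC1*LC1 + wC2*LC2, with the paper's weights
    as defaults (wa = wb = wC1 = 1, wC2 = 0.5)."""
    return wa * la + wb * lb + wc1 * lc1 + wc2 * lc2
```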
3. Results
- UNet+Up: UNet with up-concatenations achieves better performance than UNet alone.
- UNet+Up+SKM: The SKM component is also verified to be effective in improving the segmentation performance, especially Precision and IoUp, which increase by more than 1.5%.
- UNet+Up+SKM+bd: The integration of the boundary branch substantially improves segmentation accuracy.
- UNet+Up+SKM+bd+LC1: The area-boundary constraint loss functions also play an important role in improving the segmentation performance.
- The proposed method obtains much better segmentation results compared with other methods.