Brief Review — Multi-task Thyroid Tumor Segmentation Based on the Joint Loss Function
FFANet As Backbone for Multi-Task Learning (MTL)
Multi-task Thyroid Tumor Segmentation Based on the Joint Loss Function,
FFANet+MTL, by Hebei University of Technology
2023 Biomed. Signal Process. Control (Sik-Ho Tsang @ Medium)
Biomedical Image Multi-Task Learning
2018 … 2020 [BUSI] [Song JBHI’20] [cGAN JESWA’20] 2021 [Ciga JMEDIA’21] [CMSVNetIter]
==== My Other Paper Readings Are Also Over Here ====
- FFANet is used as the backbone, with a classification branch added to expand it into a multi-task image segmentation framework.
- A joint loss function is designed for the classification and segmentation tasks.
Outline
- FFANet+MTL
- Results
1. FFANet+MTL
1.1. Brief Review of FFANet for Segmentation Branch
- FFANet is used as the basic segmentation network.
- FFANet redesigns and optimizes VoVNet as its backbone, adding residual connections so that VoVNet can learn more features (a minimal sketch of this residual idea follows after this list).
- A special feature fusion mechanism is designed to effectively aggregate multi-scale features.
- Finally, in the up-sampling stage, a mixed-domain attention mechanism is inserted to refine the segmentation results.
- (Please feel free to read FFANet for more details.)
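The review does not reproduce FFANet's modules, but the residual idea in the bullets above can be sketched. Below is a minimal PyTorch sketch of a VoVNet-style one-shot aggregation (OSA) block with an added residual connection; all class and parameter names are hypothetical assumptions, not FFANet's actual code.

```python
# Minimal sketch: a VoVNet-style OSA block with an added residual
# connection, as described above. Hypothetical names, not FFANet's code.
import torch
import torch.nn as nn

class OSABlockWithResidual(nn.Module):
    def __init__(self, in_ch, stage_ch, out_ch, num_convs=5):
        super().__init__()
        self.convs = nn.ModuleList()
        ch = in_ch
        for _ in range(num_convs):
            self.convs.append(nn.Sequential(
                nn.Conv2d(ch, stage_ch, 3, padding=1, bias=False),
                nn.BatchNorm2d(stage_ch),
                nn.ReLU(inplace=True)))
            ch = stage_ch
        # 1x1 conv aggregates all concatenated features (one-shot aggregation)
        self.concat_conv = nn.Sequential(
            nn.Conv2d(in_ch + num_convs * stage_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True))
        # project the input so it can be added back as a residual
        self.skip = (nn.Identity() if in_ch == out_ch
                     else nn.Conv2d(in_ch, out_ch, 1, bias=False))

    def forward(self, x):
        feats, out = [x], x
        for conv in self.convs:
            out = conv(out)
            feats.append(out)
        aggregated = self.concat_conv(torch.cat(feats, dim=1))
        return aggregated + self.skip(x)  # the added residual connection
```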
1.2. Classification Branch
- The classification branch consists of a global average pooling (GAP) layer and a fully connected (FC) layer.
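A minimal PyTorch sketch of such a head, taking a feature map from the shared encoder; the channel count and number of classes below are illustrative assumptions:

```python
# Minimal sketch of the classification branch: GAP followed by an FC layer.
import torch.nn as nn

class ClassificationBranch(nn.Module):
    def __init__(self, in_ch=512, num_classes=2):  # assumed values
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)  # global average pooling
        self.fc = nn.Linear(in_ch, num_classes)

    def forward(self, features):                 # features: (N, C, H, W)
        pooled = self.gap(features).flatten(1)   # (N, C)
        return self.fc(pooled)                   # class logits
```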
1.3. Joint Loss Function
- For the classification task, the cross-entropy (CE) loss function is used: $L_{CE} = -\sum_{i} y_i \log(p_i)$, where $y_i$ is the ground-truth label and $p_i$ is the predicted probability.
- For the segmentation task, the Dice loss function is used: $L_{Dice} = 1 - \frac{2|X \cap Y|}{|X| + |Y|}$, where $X$ and $Y$ are the predicted and ground-truth masks.
- The final loss is the weighted combination of the CE and Dice losses: $L = \alpha L_{CE} + (1 - \alpha) L_{Dice}$,
- where α=0.2 is the final choice (a sketch of this joint loss follows this list).
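A minimal PyTorch sketch of this joint loss; the soft (sigmoid-based) Dice formulation and all names are my assumptions:

```python
# Minimal sketch of the joint loss L = alpha * L_CE + (1 - alpha) * L_Dice.
import torch
import torch.nn as nn

def dice_loss(seg_logits, target_mask, eps=1e-6):
    """Soft Dice loss; target_mask is a float binary mask."""
    pred = torch.sigmoid(seg_logits).flatten(1)
    target = target_mask.flatten(1)
    inter = (pred * target).sum(dim=1)
    union = pred.sum(dim=1) + target.sum(dim=1)
    return (1 - (2 * inter + eps) / (union + eps)).mean()

class JointLoss(nn.Module):
    def __init__(self, alpha=0.2):  # alpha = 0.2 as in the paper
        super().__init__()
        self.alpha = alpha
        self.ce = nn.CrossEntropyLoss()

    def forward(self, cls_logits, cls_labels, seg_logits, seg_masks):
        l_ce = self.ce(cls_logits, cls_labels)     # classification loss
        l_dice = dice_loss(seg_logits, seg_masks)  # segmentation loss
        return self.alpha * l_ce + (1 - self.alpha) * l_dice
```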
2. Results
2.1. Dataset and Metrics
- The dataset used was provided by the MICCAI 2020 open challenge on Thyroid Nodule Segmentation and Classification (TN-SC 2020).
- The dataset includes 3644 images from 3644 patients, annotated by professional doctors. All images are single-channel gray-scale ultrasound images.
- The training, validation, and test sets are split in a ratio of 7:1:2.
- Each image is resized to 512×512 (a small sketch of the split and resize follows this list).
- 4 metrics are used: DICE, Accuracy (ACC), Sensitivity (SE), and Specificity (SP). (Here, ACC means pixel-wise accuracy.)
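A small sketch of the 7:1:2 split and 512×512 resize, assuming a list of grayscale image paths; the use of random and OpenCV is my choice, not the paper's:

```python
# Minimal sketch: 7:1:2 train/val/test split and 512x512 grayscale resize.
import random
import cv2

def split_paths(image_paths, seed=0):
    """Shuffle and split image paths in a 7:1:2 ratio."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    n = len(paths)
    n_train, n_val = int(0.7 * n), int(0.1 * n)
    return (paths[:n_train],                  # train
            paths[n_train:n_train + n_val],   # val
            paths[n_train + n_val:])          # test

def load_image(path, size=(512, 512)):
    """Load a single-channel gray-scale ultrasound image, resized to 512x512."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return cv2.resize(img, size)
```

And a sketch of the 4 metrics, computed pixel-wise from binary masks; the smoothing term eps is my addition:

```python
# Minimal sketch of DICE, pixel-wise ACC, SE, and SP from binary masks.
import numpy as np

def segmentation_metrics(pred, target, eps=1e-6):
    """pred, target: binary {0, 1} arrays of the same shape."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.sum(pred & target)    # true positives
    tn = np.sum(~pred & ~target)  # true negatives
    fp = np.sum(pred & ~target)   # false positives
    fn = np.sum(~pred & target)   # false negatives
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    acc = (tp + tn) / (tp + tn + fp + fn + eps)  # pixel-wise accuracy
    se = tp / (tp + fn + eps)                    # sensitivity (recall)
    sp = tn / (tn + fp + eps)                    # specificity
    return {"DICE": dice, "ACC": acc, "SE": se, "SP": sp}
```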
2.2. Weight Selection
- α=0.2 obtains the highest performance and is used for the later experiments.
- α=0.2 obtains slightly worse classification performance.
2.3. SOTA Comparisons
- The multi-task model (α=0.2) achieves the best score of 0.935 on the key index, DICE. Its ACC and SP rank third, and its SE is also very good.
- The proposed multi-task model (α=0.2) reaches a classification accuracy of 0.79, ranking second and surpassing ResNet-50, ResNet-50 with ECA-Net channel attention, and VoVNet-39.
- The proposed multi-task model localizes and predicts the target area very accurately, and delineates the lesion area more precisely.