Review — A Multi‑Task Convolutional Neural Network for Classification and Segmentation of Chronic Venous Disorders
VENet for Chronic Venous Disorders (CVD) Via Multi-Task Learning (MTL)
A Multi‑Task Convolutional Neural Network for Classification and Segmentation of Chronic Venous Disorders,
VENet, by University of Minho, ICVS/3B's—PT Government Associate Laboratory, 2Ai — School of Technology, IPCA, and LASI — Associate Laboratory of Intelligent Systems,
2023 Nature Sci. Rep. (Sik-Ho Tsang @ Medium)
- Chronic Venous Disorders (CVD) of the lower limbs are one of the most prevalent medical conditions, affecting 35% of adults in Europe and North America.
- VENet is proposed to solve the segmentation and classification tasks simultaneously, exploiting the information from both tasks to increase learning efficiency and ultimately improve the performance of each.
1. CVD Dataset
- The signs of CVD are typically evaluated in terms of a structured clinical classification protocol named CEAP (Clinical, Etiologic, Anatomic, Pathophysiologic).
This protocol incorporates a wide range of signs and symptoms of CVDs to describe their severity, ranging from C0 (no visible signs of venous disease) to C6 (active venous ulcer).
The clinical database contains 1376 photographs of patients with CVD in the lower limbs, of which 522 images were obtained from two public datasets: 217 images from ULCER [47] and 305 from SD-198 [48].
- The included CVD images contain pixel-level lesion annotations, except for the images of severity level 3, where lesions may cover the entire leg.
- The above figure illustrates examples of the different image sources and corresponding segmentations.
- The VENet architecture uses U-Net as its backbone and is divided into four parts: the encoding path, the classification head, the decoding path, and the segmentation head.
2.1. Encoding Path
- The VENet encoding path is composed of downsampling blocks that extract high-level features.
- Each block of the encoding path is composed of two convolutional layers, with each layer consisting of a convolution, followed by batch normalization and a leaky rectified linear unit (Leaky ReLU).
- Downsampling is implemented using a strided convolution. The initial number of feature maps is 32, and the number of feature maps is capped at 480.
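- The feature-map schedule above can be sketched in a few lines. This is only an illustration, not the authors' code: it assumes the number of feature maps doubles at every stage (the text only gives the starting value of 32 and the cap of 480) and that each strided convolution uses a 3x3 kernel with stride 2 and padding 1. The helper names `encoder_channels` and `strided_conv_size` and the 512x512 input are hypothetical.

```python
def encoder_channels(num_stages, base=32, cap=480):
    """Feature maps per encoder stage: start at `base`, double per stage (assumed), clip at `cap`."""
    return [min(base * 2 ** stage, cap) for stage in range(num_stages)]


def strided_conv_size(size, kernel=3, stride=2, padding=1):
    """Spatial size after one convolution, using the standard output-size formula."""
    return (size + 2 * padding - kernel) // stride + 1


if __name__ == "__main__":
    size = 512  # hypothetical input resolution, not taken from the paper
    for stage, channels in enumerate(encoder_channels(6)):
        print(f"stage {stage}: {channels} feature maps at {size}x{size}")
        size = strided_conv_size(size)  # strided convolution halves the resolution
```

- Under these assumptions, the cap only becomes active from the fifth stage onward, where plain doubling would give 512 feature maps.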
2.2. Classification Head
- A classification head is attached to the bottom of VENet, i.e. at the end of the encoding path.
- The high-level features are fed into the classification head composed of a convolution block, followed by an adaptive average pooling layer to allow VENet to deal with different input sizes.
- Finally, the features are fed into the final block composed of a fully connected layer, a Leaky ReLU activation function, another fully connected layer, and a softmax layer to get the diagnostic probability of each CVD severity level.
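- To make the role of the pooling layer concrete, here is a minimal pure-Python sketch of the tail of such a head. It is not the authors' implementation: the convolution block is omitted, the adaptive pooling is assumed to reduce every channel to a single value, the Leaky ReLU slope of 0.01 is an assumed default, and all function names, weights, and the three-class output are hypothetical toy choices.

```python
import math


def global_average_pool(feature_maps):
    """Reduce each channel (a 2D list of any size) to its mean value."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def leaky_relu(values, slope=0.01):
    """Leaky ReLU; the slope is an assumed default, not a value from the paper."""
    return [v if v > 0 else slope * v for v in values]


def linear(values, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_head(feature_maps, w1, b1, w2, b2):
    """Pool -> fully connected -> Leaky ReLU -> fully connected -> softmax."""
    pooled = global_average_pool(feature_maps)
    hidden = leaky_relu(linear(pooled, w1, b1))
    return softmax(linear(hidden, w2, b2))
```

- Because the pooling step collapses each channel to one number, a 2x2 and a 3x3 feature map with the same channel count both produce a probability vector of the same length, which is what lets the head accept different input sizes.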
2.3. Decoding Path
- Each block of the decoding path is composed of two convolutional layers, a batch normalization, and a Leaky ReLU activation function.
- Skip connections are established between the encoding path and the decoding path at the same level using concatenation.
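- Concatenation stacks the encoder and decoder channels instead of summing them, so the next convolution sees both sets of features. The sketch below illustrates this with nested lists; the function name `concat_skip` and the toy shapes are hypothetical, and it assumes both inputs already share the same spatial size, which is why the connection is made at the same level.

```python
def concat_skip(encoder_features, decoder_features):
    """Concatenate two feature tensors along the channel axis.

    Each tensor is a list of channels; each channel is a 2D list (H x W).
    """
    enc_shape = (len(encoder_features[0]), len(encoder_features[0][0]))
    dec_shape = (len(decoder_features[0]), len(decoder_features[0][0]))
    if enc_shape != dec_shape:
        raise ValueError(f"spatial sizes differ: {enc_shape} vs {dec_shape}")
    return encoder_features + decoder_features


if __name__ == "__main__":
    encoder = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]  # 2 channels, 4x4
    decoder = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]  # 3 channels, 4x4
    print(len(concat_skip(encoder, decoder)))  # channel counts add up
```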
2.4. Segmentation Head
- Three consecutive convolution layers are used, with each layer consisting of a convolution, followed by batch normalization and a Leaky ReLU. A softmax layer is used to obtain the final instance-level probability maps.
- Deep supervision is also used, where probability maps from each decoding block are generated using a convolution layer followed by a softmax layer.
2.5. Loss Function
- Multi-class cross-entropy loss is used for the classification task.
- For the segmentation task, the cross-entropy loss is combined with the Dice loss.
- Additionally, to enable deep supervision during training, the final segmentation loss is the weighted sum of the losses from all resolution outputs of VENet, with the weights giving less importance to segmentation predictions at lower resolutions.
- The final loss is the weighted sum of the classification and segmentation losses, with λclass=0.5 and λseg=1.
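- The way these loss terms fit together can be sketched as follows. This is a simplified illustration under stated assumptions rather than the paper's exact formulation: segmentation is reduced to a single foreground class, cross-entropy and Dice are summed without extra weights, the Dice term is written as one minus the overlap, and the deep-supervision weights are assumed to halve at each lower resolution and then be normalised to sum to 1. Only λclass=0.5 and λseg=1 come from the text; all function names are hypothetical.

```python
import math

LAMBDA_CLASS = 0.5  # from the text
LAMBDA_SEG = 1.0    # from the text


def cross_entropy(probs, target_index, eps=1e-12):
    """Multi-class cross-entropy for one sample, given softmax probabilities."""
    return -math.log(probs[target_index] + eps)


def pixel_cross_entropy(pred, target, eps=1e-12):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    losses = [-(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
              for p, t in zip(pred, target)]
    return sum(losses) / len(losses)


def soft_dice_loss(pred, target, eps=1e-6):
    """One minus the soft Dice overlap (assumed form of the Dice loss)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def deep_supervision_weights(num_outputs):
    """Halve the weight per lower resolution, then normalise (assumed scheme)."""
    raw = [1.0 / 2 ** i for i in range(num_outputs)]
    total = sum(raw)
    return [w / total for w in raw]


def segmentation_loss(outputs):
    """`outputs`: (pred, target) pairs ordered from full to lowest resolution."""
    weights = deep_supervision_weights(len(outputs))
    return sum(w * (pixel_cross_entropy(p, t) + soft_dice_loss(p, t))
               for w, (p, t) in zip(weights, outputs))


def total_loss(class_probs, class_target, seg_outputs):
    """Weighted sum of the classification and segmentation losses."""
    return (LAMBDA_CLASS * cross_entropy(class_probs, class_target)
            + LAMBDA_SEG * segmentation_loss(seg_outputs))
```

- With three supervised outputs, the assumed scheme gives the full-resolution prediction twice the weight of the next one and four times the weight of the coarsest, so errors in the final map dominate the segmentation term.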
3.1. Ablation Study
Overall, the proposed strategy achieved the best performance on all metrics except precision (PRE).
- When comparing against VENetC (Only Classification), an improvement in the overall accuracy was shown.
VENet presents the best average results against all the other networks.
The confusion matrix of VENet shows that most of the classification errors originate from images with severity level 3.
Overall, VENet showed the best performance in comparison with all the other evaluated strategies.
- Example results for all the evaluated DCNNs are shown.
3.4. Multi-Task Learning
Overall, the VENet architecture presents the best performance for the segmentation of CVD lesions, with an improvement of 4% over DSI-Net.
3.5. Loss Curves
VENet obtains lower losses for both training and validation sets.