Brief Review — RA-UNet: A Hybrid Deep Attention-Aware Network to Extract Liver and Tumor in CT Scans
RA-UNet, the First Work to Use an Attention Residual Mechanism for Tumor Segmentation from 3D Medical Volumetric Images.
RA-UNet: A Hybrid Deep Attention-Aware Network to Extract Liver and Tumor in CT Scans,
RA-UNet, by Tianjin University, CSIRO Data61, Tianjin University of Traditional Chinese Medicine, and La Trobe University
2020 J. Front. Bioeng. Biotechnol., Over 220 Citations (Sik-Ho Tsang @ Medium)
Medical Imaging, Medical Image Analysis, Image Segmentation, U-Net
- A 3D hybrid residual attention-aware segmentation method, RA-UNet, is proposed where attention residual modules are integrated into U-Net so that the attention-aware features change adaptively.
- This is the first work in which an attention residual mechanism is used to segment tumors from 3D medical volumetric images.
- Residual Attention-aware U-Net (RA-UNet)
- Pipeline Details
1. Residual Attention-aware U-Net (RA-UNet)
1.1. Overall Pipeline
- The pipeline has three steps:
- RA-UNet-I: A 2D residual attention-aware U-Net (RA-UNet), named RA-UNet-I, first obtains a coarse liver boundary box.
- The first RA-UNet-II: Next, a 3D RA-UNet, called RA-UNet-II, is trained to obtain a precise liver volume of interest (VOI).
- The second RA-UNet-II: Finally, the obtained liver VOI is sent to a second RA-UNet-II to extract the tumor region.
1.2. Datasets and Materials
- The public Liver Tumor Segmentation Challenge (LiTS) dataset is used, which has a total of 200 CT scans containing 130 scans as training data and 70 scans as test data.
- Another dataset named 3DIRCADb is used as an external test dataset, which includes 20 enhanced CT scans.
- Both datasets have the same 512×512 in-plane resolution but different numbers of axial slices per scan.
1.3. RA-UNet Model Architecture
- In a traditional residual block, the output is the sum of the input and a learned residual mapping:
- where x denotes the first input of the residual block and OR denotes its output.
- The residual block consists of three sets of combinations of a batch normalization (BN) layer, an activation (ReLU) layer, and a convolutional layer, as above.
In this paper, the attention residual learning proposed by Residual Attention Network is used, as above.
- The attention residual mechanism divides the attention module into a trunk branch and a soft mask branch, where the trunk branch is used to process the original features and the soft-mask branch is used to construct the identity mapping.
- The output OA of the attention module under attention residual learning can be formulated as OA(x) = (1 + S(x)) · F(x), where F denotes the features from the trunk branch.
In brief, the output of the soft-mask branch, S, passes through a sigmoid and therefore lies in the range [0,1]. For correlated features, S will be close to 1; for uncorrelated features, S will be close to 0.
Through multiplication with F, features are magnified where S is close to 1 and diminished where S is close to 0.
- The "1+" term provides the skip connection.
- Therefore, this mechanism enhances good features and reduces the noise from the trunk branch.
- (Please feel free to read Residual Attention Network if interested.)
- The overall architecture of RA-UNet-II is shown above.
- Sigmoid is used at the output to generate the final probability map of liver segmentation.
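The attention residual learning described in this section can be illustrated with a minimal sketch in plain Python. It assumes the combination rule (1 + S) · F, with S the sigmoid of the soft-mask branch output and F the trunk features; the function names and the flat-list feature layout are hypothetical and chosen only for illustration, not taken from the paper's implementation.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def attention_residual(trunk, mask_logits):
    """Combine trunk features F with soft-mask logits as (1 + S) * F.

    trunk       -- flat list of trunk-branch features F (hypothetical layout)
    mask_logits -- flat list of soft-mask pre-activations, same length
    """
    if len(trunk) != len(mask_logits):
        raise ValueError("trunk and mask_logits must have the same length")
    return [(1.0 + sigmoid(m)) * f for f, m in zip(trunk, mask_logits)]


features = [2.0, 2.0, 2.0]
logits = [-20.0, 0.0, 20.0]  # mask close to 0, exactly 0.5, close to 1
out = attention_residual(features, logits)
```

Because of the "1+" term, each output lies between F and 2F: a feature with a mask near 0 passes through almost unchanged instead of being zeroed, while a feature with a mask near 1 is roughly doubled.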
1.4. Loss Function
- Standard Dice loss is used:
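As a minimal sketch of a Dice loss, assuming the common soft formulation 1 − 2·Σ(p·g) / (Σp + Σg) over predicted probabilities p and binary ground truth g: the exact form used in the paper (for example, whether the denominator uses squared terms) may differ, and the smoothing constant `eps` is an assumption added here to avoid division by zero.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss over flat lists of probabilities and binary labels.

    Returns a value near 0 for a perfect overlap and near 1 for no overlap.
    eps is a hypothetical smoothing constant, not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * g for p, g in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


perfect = dice_loss([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
disjoint = dice_loss([1.0, 0.0], [0, 1])
```

Here `perfect` is approximately 0 and `disjoint` is approximately 1, which is why minimizing this loss maximizes the overlap between prediction and ground truth.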
2. Pipeline Details
2.1. Liver Localization Using RA-UNet-I
- The first stage aims to locate the 3D liver boundary box. A 2D RA-UNet-I is introduced here to segment a coarse liver region, which reduces the computational cost of the subsequent RA-UNet-II.
- The slices are downsampled to 256×256 and fed into the trained RA-UNet-I. All the slices are stacked in their original sequence.
- Afterwards, a 3D connected-component labeling is used for assigning a unique label to each connected component in an image.
- Finally, the liver region is interpolated back to its original volume size, with 512×512 slices.
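Once a coarse binary liver mask is available, the 3D boundary box can be derived from the extent of the foreground voxels. The sketch below is one possible way to do this in plain Python on a nested-list volume indexed as `mask[z][y][x]`; the function name and the optional `margin` parameter are hypothetical and not described in the paper.

```python
def bounding_box(mask, margin=0):
    """Return inclusive ((z0, z1), (y0, y1), (x0, x1)) bounds of nonzero voxels.

    mask   -- nested lists indexed as mask[z][y][x] (assumed layout)
    margin -- hypothetical number of voxels to pad on each side, clipped
              to the volume; returns None if the mask is empty
    """
    coords = [(z, y, x)
              for z, plane in enumerate(mask)
              for y, row in enumerate(plane)
              for x, value in enumerate(row) if value]
    if not coords:
        return None
    shape = (len(mask), len(mask[0]), len(mask[0][0]))
    box = []
    for axis in range(3):
        values = [c[axis] for c in coords]
        low = max(min(values) - margin, 0)
        high = min(max(values) + margin, shape[axis] - 1)
        box.append((low, high))
    return tuple(box)


# A 3x4x4 volume with two foreground voxels.
volume = [[[0] * 4 for _ in range(4)] for _ in range(3)]
volume[1][1][2] = 1
volume[2][2][3] = 1
tight = bounding_box(volume)             # ((1, 2), (1, 2), (2, 3))
padded = bounding_box(volume, margin=1)  # ((0, 2), (0, 3), (1, 3))
```

Cropping the CT volume to such a box is what lets the subsequent 3D network work on a much smaller region than the full scan.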
The attention mechanism has successfully constrained the liver region.
2.2. Liver Segmentation Using RA-UNet-II
- The RA-UNet-II is employed on each CT patch to generate 3D liver probability patches in sequence. Then, those probability patches are interpolated and stacked to be restored to the original size of the boundary box.
- A voting strategy is used to generate the final liver probability of the VOI from overlapped sub-patches.
- A 3D connected-component labeling is applied to the merged VOI, and the largest component is chosen to yield the final liver region.
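The largest-component selection can be sketched as a breadth-first search over a binary volume. This is a plain-Python illustration under stated assumptions: 6-connectivity (face neighbors only) and a nested-list layout `mask[z][y][x]` are choices made here, not details given in the paper. In practice a library routine such as `scipy.ndimage.label` would typically be used instead.

```python
from collections import deque


def largest_component(mask):
    """Keep only the largest 6-connected foreground component of a 3D mask."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    neighbors = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                 (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    best = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                # Flood-fill one component starting from this voxel.
                seen.add((z, y, x))
                queue = deque([(z, y, x)])
                component = []
                while queue:
                    cz, cy, cx = queue.popleft()
                    component.append((cz, cy, cx))
                    for dz, dy, dx in neighbors:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = (0 <= n[0] < nz and 0 <= n[1] < ny
                                  and 0 <= n[2] < nx)
                        if inside and mask[n[0]][n[1]][n[2]] and n not in seen:
                            seen.add(n)
                            queue.append(n)
                if len(component) > len(best):
                    best = component
    out = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    for z, y, x in best:
        out[z][y][x] = 1
    return out


# One 3-voxel blob plus one isolated voxel at (1, 2, 2).
mask = [[[1, 1, 0], [0, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, 0, 0], [0, 0, 1]]]
kept = largest_component(mask)
kept_count = sum(v for plane in kept for row in plane for v in row)
```

In this toy volume the isolated voxel is discarded and the three connected voxels are kept, mirroring how small spurious liver predictions are removed.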
The liver region was precisely extracted by selecting the largest region.
2.3. Extraction of Tumors Based on RA-UNet-II
- Tumor region extraction is similar to liver segmentation, but no interpolation or resizing is performed.
- To address the data imbalance issue and learn more effective tumor features, patches covering both the tumor and its surrounding non-tumor regions are picked for training.
- A voting strategy is used again on the merged VOI to yield the final tumor segmentation. Finally, voxels that are not in the liver region are filtered out.
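The merging of overlapped patches and the final liver-based filtering can be sketched as follows. The exact voting rule is not spelled out here, so this example assumes one plausible choice: averaging the probabilities of all patches that cover a voxel and thresholding at 0.5. The function names, the patch representation, and the threshold are hypothetical.

```python
def merge_patches(shape, patches):
    """Average overlapping probability patches into one volume.

    shape   -- (nz, ny, nx) of the output volume
    patches -- list of ((z0, y0, x0), patch), patch indexed as patch[z][y][x]
    Voxels covered by no patch get probability 0.0.
    """
    nz, ny, nx = shape
    total = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    count = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    for (z0, y0, x0), patch in patches:
        for dz, plane in enumerate(patch):
            for dy, row in enumerate(plane):
                for dx, prob in enumerate(row):
                    total[z0 + dz][y0 + dy][x0 + dx] += prob
                    count[z0 + dz][y0 + dy][x0 + dx] += 1
    return [[[total[z][y][x] / count[z][y][x] if count[z][y][x] else 0.0
              for x in range(nx)] for y in range(ny)] for z in range(nz)]


def tumor_mask(prob, liver, threshold=0.5):
    """Threshold tumor probabilities and drop voxels outside the liver mask."""
    return [[[1 if p >= threshold and inside else 0
              for p, inside in zip(prob_row, liver_row)]
             for prob_row, liver_row in zip(prob_plane, liver_plane)]
            for prob_plane, liver_plane in zip(prob, liver)]


# Two patches overlapping on x = 1 and x = 2 of a 1x1x4 volume.
patch_a = ((0, 0, 0), [[[0.9, 0.8, 0.2]]])
patch_b = ((0, 0, 1), [[[0.6, 0.6, 0.9]]])
merged = merge_patches((1, 1, 4), [patch_a, patch_b])
liver = [[[1, 1, 1, 0]]]
tumor = tumor_mask(merged, liver)  # [[[1, 1, 0, 0]]]
```

The last voxel has a high tumor probability but lies outside the liver mask, so it is removed, which is the effect of the final filtering step.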
3.1. Ablation on Loss Functions
- DC is the Dice coefficient score.
Liver (Left): DC reached up to 0.961 and 0.977 on the LiTS test dataset and the 3DIRCADb dataset, respectively.
Tumor (Right): DC reached 0.595 and 0.830 on the LiTS test dataset and the 3DIRCADb dataset, respectively.
3.2. Qualitative Results
The results show that large liver regions are successfully segmented, and that tiny, hard-to-detect tumors can be identified by the proposed method as well.
Due to the low contrast with the surrounding liver tissue and the extremely small size of some tumors, the proposed method still produces some false positives and false negatives in tumor extraction.
3.3. Quantitative Results
The proposed method obtains precise segmentation of liver and tumor, and outperforms two SOTA approaches.
3.4. Generalization of the Proposed RA-UNet
- To show the generalization of the proposed method, the weights trained on LiTS are used for testing on the 3DIRCADb dataset.
The proposed method reached a mean Dice score of 0.830 on livers with tumors, compared to a mean Dice score of 0.56 for the method by Christ et al. (2017a).