Brief Review — Joint Segmentation and Fine-Grained Classification of Nuclei in Histopathology Images
Joint Segmentation and Fine-Grained Classification of Nuclei in Histopathology Images,
Qu ISBI’19, by Rutgers University and Cancer Institute of New Jersey
2019 ISBI, Over 30 Citations (Sik-Ho Tsang @ Medium)
Medical Image Analysis, Multi-Task Learning, Image Segmentation, Image Classification
- A unified framework is proposed for nuclei segmentation and classification, which segments individual nuclei and classifies them as tumor, lymphocyte, or stroma nuclei.
- (Yesterday, I reviewed Perceptual Loss.) In this paper, perceptual loss is utilized to enhance the segmentation, and transfer learning is used.
Outline
- Dataset & Preprocessing
- Proposed Unified Framework
- Results
1. Dataset & Preprocessing
- A dataset is annotated that consists of 40 H&E stained tissue images from 8 different lung adenocarcinoma or lung squamous cell carcinoma cases, and each case has 5 images of size about 900×900.
- There are around 24000 annotated nuclei in the dataset, and each nucleus is marked as one of the following three types: tumor nucleus, lymphocyte nucleus, or stroma (fibroblasts, macrophages, neutrophils, endothelial cells, etc.) nucleus.
- For each image, one label image encodes both the segmentation mask and the class of each nucleus. In a ground-truth label, pixels of value 0 are background, and pixels that share the same positive integer id belong to one individual nucleus.
- The integer id also indicates the class of the nucleus: (1) tumor nucleus if mod(id, 3) = 0, (2) lymphocyte nucleus if mod(id, 3) = 1, (3) stroma nucleus if mod(id, 3) = 2, where mod is the modulo operation.
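The encoding above can be decoded with a few lines of numpy. This is a minimal sketch (the toy label array and the helper name `nucleus_class` are my own, not from the paper):

```python
import numpy as np

def nucleus_class(nucleus_id: int) -> str:
    """Map a positive nucleus id to its class via mod(id, 3), as defined above."""
    return {0: "tumor", 1: "lymphocyte", 2: "stroma"}[nucleus_id % 3]

# Toy label image: 0 = background, each positive id is one nucleus.
label = np.array([[0, 1, 1],
                  [0, 2, 3]])

for nid in np.unique(label):
    if nid == 0:
        continue  # skip background pixels
    mask = (label == nid)  # binary mask of this single nucleus
    print(nid, nucleus_class(int(nid)), int(mask.sum()))
```

So id 3 is a tumor nucleus (3 mod 3 = 0), id 1 a lymphocyte nucleus, and id 2 a stroma nucleus.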
2. Proposed Unified Framework
- The proposed framework consists of two parts: the prediction network that generates the segmentation mask of each type of nuclei, and the perceptual loss network that computes the perceptual loss between the predicted label and ground-truth label.
2.1. Prediction Network
- The prediction network is a standard encoder-decoder structure based on U-Net.
- The encoder is from ResNet-34, without the average pooling and fully connected layers, and is initialized with the pretrained parameters from image classification tasks.
- There are skip connections between the encoder and decoder, which help to recover high-resolution feature maps.
- The network outputs five probability maps: background, inner part of tumor nuclei, inner part of lymphocyte nuclei, inner part of stroma nuclei, and contours of all nuclei.
- The contour map mainly aims to capture the contours of crowded and touching nuclei. With the contour pixels removed, the predicted inner parts of neighboring nuclei are disconnected from each other. The final nuclei masks are then recovered by a simple morphological dilation operation.
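The post-processing step can be sketched as follows. This is a pure-numpy stand-in for one step of morphological dilation (the toy `inner` array and the 3×3 structuring element are assumptions for illustration; in practice a library routine and possibly several iterations would be used):

```python
import numpy as np

def dilate(mask: np.ndarray) -> np.ndarray:
    """One step of binary dilation with a 3x3 square structuring element."""
    p = np.pad(mask, 1)  # zero-pad so shifts stay in bounds
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= p[1 + dy: 1 + dy + h, 1 + dx: 1 + dx + w]
    return out

# Predicted inner parts of two touching nuclei, separated by the contour pixels.
inner = np.array([[1, 1, 0, 0, 0],
                  [1, 1, 0, 1, 1],
                  [0, 0, 0, 1, 1]], dtype=bool)

full = dilate(inner)  # grow each inner region back toward its full extent
```

Dilating too far can merge neighboring nuclei again, which is why only a small, fixed dilation is applied after the separated inner parts have been labeled individually.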
2.2. Perceptual Loss Network
- The perceptual loss network is utilized to improve the segmentation accuracy of details in the image. It originates from Johnson et al.’s work [17], in which the authors compute loss between high-level features of the transformed image and the original image.
- The pretrained VGG-16 model serves as a feature extractor and is fixed during training and testing. Four levels of features are extracted for the output of the prediction network and for the ground-truth label, i.e., the feature maps after the last ReLU layer of the first, second, third and fourth blocks of the VGG-16 model, denoted relu1_2, relu2_2, relu3_3 and relu4_3.
- The mean squared error is then computed between the two sets of features.
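The loss computation itself is just a sum of per-level MSEs. A hedged numpy sketch, where the random arrays below are stand-ins for real VGG-16 feature maps (the shapes assume a 32×32 input and are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

def perceptual_loss(feats_pred, feats_gt):
    """Sum of per-level mean squared errors between two lists of feature maps."""
    return sum(float(np.mean((fp - fg) ** 2))
               for fp, fg in zip(feats_pred, feats_gt))

# Four levels (relu1_2, relu2_2, relu3_3, relu4_3): channels grow, resolution shrinks.
shapes = [(64, 32, 32), (128, 16, 16), (256, 8, 8), (512, 4, 4)]
feats_pred = [rng.normal(size=s) for s in shapes]
feats_gt = [rng.normal(size=s) for s in shapes]

loss = perceptual_loss(feats_pred, feats_gt)
```

Because the VGG-16 extractor is frozen, only the prediction network receives gradients from this loss.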
2.3. Loss Function
- The loss function of the method consists of two parts.
- The first part is the weighted cross-entropy loss over the five classes: L_ce = −Σ_x w(x) · log p(x, ℓ(x)), where p(x, ℓ(x)) is the predicted probability of pixel x for its ground-truth class ℓ(x).
- Larger weights are assigned to pixels of low-frequency classes, plus an extra term emphasizing pixels that lie between close nuclei: w(x) = w_class(x) + ω0 · exp(−(d1(x) + d2(x))² / (2σ²)),
- where d1, d2 are the distances to the nearest and the second nearest nucleus, σ = 5 and ω0 = 10.
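The weight map can be sketched directly from that definition. In this minimal numpy example the `w_class`, `d1` and `d2` arrays are toy inputs (in practice the distances come from a distance transform of the nuclei masks):

```python
import numpy as np

def weight_map(w_class, d1, d2, w0=10.0, sigma=5.0):
    """w(x) = w_class(x) + w0 * exp(-(d1 + d2)^2 / (2 * sigma^2))."""
    return w_class + w0 * np.exp(-((d1 + d2) ** 2) / (2.0 * sigma ** 2))

w_class = np.ones((2, 2))                  # toy per-pixel class-frequency weights
d1 = np.array([[0.0, 3.0], [5.0, 9.0]])    # distance to the nearest nucleus
d2 = np.array([[1.0, 4.0], [6.0, 9.0]])    # distance to the second nearest nucleus
w = weight_map(w_class, d1, d2)
```

Pixels squeezed between two nearby nuclei (small d1 + d2) get weights close to w_class + ω0, while pixels far from any border fall back to the plain class weight.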
- The second part is the perceptual loss. Denoting the pretrained VGG-16 model as a function f, the loss sums the mean squared feature differences over the four levels: L_p = Σ_j ||f_j(ŷ) − f_j(y_gt)||² / (C_j · H_j · W_j),
- where ŷ = argmax(y) is the prediction map obtained from the output probability map y, and y_gt is the ground-truth label.
- The final loss function is the weighted sum of the two parts: L = L_ce + β · L_p,
- where β = 0.1.
- For fine-grained classification, the authors only consider the accuracy over true positives instead of all ground-truth nuclei, because not all ground-truth nuclei have corresponding predicted ones.
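That evaluation choice can be made concrete with a small sketch. The helper name and the toy match list below are my own illustration, not from the paper: accuracy is computed only over nuclei that were actually detected (true positives), so undetected nuclei do not enter the denominator.

```python
def fine_grained_accuracy(matches):
    """matches: list of (gt_class, pred_class) pairs for true-positive nuclei only."""
    if not matches:
        return 0.0
    correct = sum(g == p for g, p in matches)
    return correct / len(matches)

# Suppose 20 ground-truth nuclei exist but 3 went undetected:
# only the 17 matched ones count toward fine-grained accuracy.
matches = [("tumor", "tumor")] * 10 + [("stroma", "lymphocyte")] * 7
acc = fine_grained_accuracy(matches)  # 10 correct out of 17 matched
```
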
3. Results
- All three model variants achieve relatively good segmentation and fine-grained classification results, showing that the idea of combining the two tasks is feasible.
- Compared to FCN-8s and U-Net, the proposed method has improvements on the segmentation of all types of nuclei, especially on lymphocytes.
- The proposed method also outperforms FCN-8s and U-Net on the fine-grained classification.
- Both transfer learning and the perceptual loss improve the performance of segmentation and classification.
Reference
[2019 ISBI] [Qu ISBI’19]
Joint Segmentation and Fine-Grained Classification of Nuclei in Histopathology Images
Biomedical Multi-Task Learning
2018 [ResNet+Mask R-CNN] [cU-Net+PE] [Multi-Task Deep U-Net] [cGAN-AutoEnc & cGAN-Unet] 2019 [cGAN+AC+CAW] [Qu ISBI’19]