Brief Review — Joint Segmentation and Fine-Grained Classification of Nuclei in Histopathology Images

U-Net with ResNet-34 as Encoder, VGG Loss is used

5 min read · Nov 11, 2022

Joint Segmentation and Fine-Grained Classification of Nuclei in Histopathology Images,
Qu ISBI’19, by Rutgers University, and Cancer Institute of New Jersey
2019 ISBI, Over 30 Citations (Sik-Ho Tsang @ Medium)
Medical Image Analysis, Multi-Task Learning, Image Segmentation, Image Classification

A unified framework is proposed for nuclei segmentation and classification. It segments individual nuclei and classifies each one as a tumor, lymphocyte or stroma nucleus.
(Yesterday, I reviewed Perceptual loss.) In this paper, perceptual loss is used to improve the segmentation. Transfer learning is also used.

Outline

Dataset & Preprocessing
Proposed Unified Framework
Results

1. Dataset & Preprocessing

Example of an image and its labels. (a) Original image, (b) Ground-truth label, (c) Classification label, where red, green and blue represent tumor, lymphocyte and stroma nuclei, respectively. (d) Segmentation label, where distinct colors are different nuclei.

The authors annotate a dataset of 40 H&E-stained tissue images from 8 different lung adenocarcinoma or lung squamous cell carcinoma cases. Each case has 5 images of size about 900×900.
There are around 24000 annotated nuclei in the dataset, and each nucleus is marked as one of three types: tumor nucleus, lymphocyte nucleus, or stroma nucleus (fibroblasts, macrophages, neutrophils, endothelial cells, etc.).
For each image, a single label image encodes both the segmentation mask and the class of each nucleus. In a ground-truth label, pixels of value 0 are background. Pixels that share the same positive integer belong to one individual nucleus.
The integer value id also indicates the class of the nucleus: (1) tumor nucleus if mod(id, 3) = 0, (2) lymphocyte nucleus if mod(id, 3) = 1, (3) stroma nucleus if mod(id, 3) = 2, where mod is the modulo operation.
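The id encoding above can be unpacked with a few lines of NumPy. This is a minimal sketch, not the authors' code: the function name `decode_label` and the output convention (0 = background, 1 = tumor, 2 = lymphocyte, 3 = stroma) are assumptions made for illustration.

```python
import numpy as np

def decode_label(label):
    """Split an instance-id label image into a foreground mask and a class map.

    Assumed output convention (not from the paper):
    0 = background, 1 = tumor, 2 = lymphocyte, 3 = stroma.
    """
    label = np.asarray(label)
    foreground = label > 0  # value 0 is background
    class_map = np.zeros(label.shape, dtype=np.uint8)
    # mod(id, 3) is 0, 1, 2 for tumor, lymphocyte, stroma; shift by one so
    # that 0 stays reserved for background.
    class_map[foreground] = (label[foreground] % 3) + 1
    return foreground, class_map

# Toy label image with three nuclei whose ids are 3, 4 and 5.
label = np.array([
    [0, 3, 3, 0],
    [0, 0, 4, 4],
    [5, 5, 0, 0],
])
foreground, class_map = decode_label(label)
print(class_map)
```

Here id 3 decodes to tumor, id 4 to lymphocyte and id 5 to stroma, while the ids themselves still separate the individual nuclei.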

2. Proposed Unified Framework

The proposed framework consists of two parts: the prediction network that generates the segmentation mask of each type of nuclei, and the perceptual loss network that computes the perceptual loss between the predicted label and ground-truth label.

2.1. Prediction Network

The prediction network is a standard encoder-decoder structure based on U-Net.
The encoder is taken from ResNet-34, without the average pooling and fully connected layers, and is initialized with parameters pretrained on image classification.
There are skip connections between the encoder and decoder, which help to recover high-resolution feature maps.
The network outputs five probability maps: background, inner part of tumor nuclei, inner part of lymphocyte nuclei, inner part of stroma nuclei, and contours of all nuclei.
The contour map mainly aims to capture the contours of crowded and touching nuclei. As a result, the predicted inner parts of neighboring nuclei are not connected to each other. The final nuclei mask is generated by a simple morphological dilation operation.
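The post-processing step can be sketched as follows. The review only states that a simple morphological dilation is applied, so everything else here is an assumption for illustration: the channel order of the five maps, the connected-component labelling step, 4-connectivity, and the number of dilation iterations. The helper names are hypothetical.

```python
import numpy as np

def label_components(mask):
    """Label 4-connected components of a boolean mask with a simple flood fill."""
    labels = np.zeros(mask.shape, dtype=np.int32)
    next_id = 0
    for r, c in zip(*np.nonzero(mask)):
        if labels[r, c]:
            continue  # already reached from an earlier seed
        next_id += 1
        labels[r, c] = next_id
        stack = [(r, c)]
        while stack:
            y, x = stack.pop()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = next_id
                    stack.append((ny, nx))
    return labels

def dilate_labels(labels, iterations=1):
    """Grow each labelled region into unlabelled pixels, one pixel per iteration."""
    out = labels.copy()
    for _ in range(iterations):
        padded = np.pad(out, 1)
        grown = out.copy()
        # Offsets select the up, down, left and right neighbour of every pixel.
        for dy, dx in ((0, 1), (2, 1), (1, 0), (1, 2)):
            neighbour = padded[dy:dy + out.shape[0], dx:dx + out.shape[1]]
            fill = (grown == 0) & (neighbour > 0)  # never overwrite a label
            grown[fill] = neighbour[fill]
        out = grown
    return out

def postprocess(prob_maps, dilation_iters=1):
    """prob_maps: (5, H, W); assumed order is background, tumor, lymphocyte,
    stroma, contour."""
    pred = prob_maps.argmax(axis=0)
    inner = (pred >= 1) & (pred <= 3)  # inner parts of any nucleus type
    instances = dilate_labels(label_components(inner), dilation_iters)
    return pred, instances

# Toy example: a tumor nucleus and a lymphocyte nucleus separated by a contour column.
classes = np.tile(np.array([0, 1, 1, 4, 2, 2, 0]), (3, 1))
prob_maps = np.eye(5)[classes].transpose(2, 0, 1)  # one-hot "probabilities"
pred, instances = postprocess(prob_maps)
print(instances)
```

Because the contour pixels are excluded before labelling, the two touching nuclei receive different instance ids, and the dilation then grows each instance back over the pixels that the contour class had claimed.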

2.2. Perceptual Loss Network

The perceptual loss network is used to improve the segmentation accuracy of fine details in the image. It originates from Johnson et al.’s work [17], in which the authors compute a loss between high-level features of the transformed image and the original image.
The pretrained VGG-16 model serves as a feature extractor and is kept fixed during training and testing. Four levels of features are extracted with this network for both the output of the prediction network and the ground-truth label, i.e., the feature maps after the last ReLU layer of the first, second, third and fourth blocks of VGG-16, denoted as relu1_2, relu2_2, relu3_3, relu4_3.
The mean squared error is then computed between the feature sets of the two inputs.
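The structure of this loss can be illustrated with a small NumPy sketch. To keep it self-contained, VGG-16 is replaced by a hypothetical stand-in extractor (`toy_features`, repeated 2×2 average pooling), which is not what the paper uses. Summing the per-level errors with equal weight is also an assumption.

```python
import numpy as np

def toy_features(x):
    """Hypothetical stand-in for a fixed feature extractor (NOT VGG-16).

    Returns four feature maps at decreasing resolution, obtained by
    repeated 2x2 average pooling.
    """
    feats = []
    for _ in range(4):
        h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
        x = x[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        feats.append(x)
    return feats

def perceptual_loss(pred, target, extract=toy_features):
    """Sum of per-level mean squared errors between the two feature sets."""
    return sum(np.mean((fp - ft) ** 2)
               for fp, ft in zip(extract(pred), extract(target)))

# Toy example: a square "nucleus" and the same square shifted by two pixels.
target = np.zeros((16, 16))
target[4:12, 4:12] = 1.0
pred = np.roll(target, 2, axis=1)

print(perceptual_loss(target, target))  # identical inputs give zero loss
print(perceptual_loss(pred, target))    # a shifted prediction is penalized
```

In a real setup the extractor would be the frozen VGG-16 and the four levels would be the relu1_2, relu2_2, relu3_3 and relu4_3 activations. Only the prediction network receives gradient updates.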

2.3. Loss Function

The loss function of the method consists of two parts.
The first part is the cross entropy loss for five classes:

Larger weights are assigned to pixels of low-frequency classes:

where d1, d2 are the distances to the nearest and the second nearest nuclei. σ=5 and ω0=10.
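The two equations are not reproduced in this review. A plausible form, assuming the weight map follows the original U-Net formulation (which uses the same d1, d2, σ and ω0), is sketched below. The symbols N (number of pixels), t (one-hot ground truth) and w_c (class-balancing weight) are assumptions, and the paper's exact base term may differ.

```latex
% Hedged reconstruction, not copied from the paper.
L_{ce} = -\frac{1}{N} \sum_{i=1}^{N} w_i \sum_{k=1}^{5} t_{i,k} \log y_{i,k},
\qquad
w_i = w_c(i) + \omega_0 \exp\!\left( -\frac{\left(d_1(i) + d_2(i)\right)^2}{2\sigma^2} \right)
```

With this form, a pixel squeezed between two nuclei has small d1 + d2, so its weight approaches w_c + ω0, while pixels far from any nucleus keep roughly the base weight.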
The second part is the perceptual loss. Let’s denote the pretrained VGG-16 model as a function f.

where ŷ = arg max y is the prediction map obtained from the output probability map y.
The final loss function is:

where β=0.1.
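Since the equations themselves are missing here, the following is a hedged sketch of how the two parts probably combine, based only on the description above. Here f_j denotes the j-th of the four VGG-16 feature levels, t the ground-truth label, and MSE the mean squared error. These symbols, and the placement of β on the perceptual term, are assumptions.

```latex
% Hedged reconstruction, not copied from the paper.
L_{p} = \sum_{j=1}^{4} \mathrm{MSE}\!\left( f_j(\hat{y}),\, f_j(t) \right),
\qquad
L = L_{ce} + \beta\, L_{p}
```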
For fine-grained classification, the authors only consider the accuracy over true-positive detections instead of over all ground-truth nuclei, because not every ground-truth nucleus has a corresponding predicted one.
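A minimal sketch of this metric is shown below. The matching step that pairs predicted nuclei with ground-truth nuclei is not described in this review, so the function simply takes the already matched pairs as input. The function name and the input format are hypothetical.

```python
def tp_classification_accuracy(matches):
    """Classification accuracy over matched (true-positive) nuclei only.

    matches: list of (gt_class, pred_class) pairs, one per ground-truth
    nucleus that has a corresponding predicted nucleus. Ground-truth
    nuclei without a match are simply absent from the list.
    """
    if not matches:
        return float("nan")
    correct = sum(1 for gt, pred in matches if gt == pred)
    return correct / len(matches)

# Four matched nuclei, one of which is assigned the wrong class.
matches = [
    ("tumor", "tumor"),
    ("tumor", "stroma"),
    ("lymphocyte", "lymphocyte"),
    ("stroma", "stroma"),
]
accuracy = tp_classification_accuracy(matches)
print(accuracy)
```

Missed nuclei therefore lower the segmentation scores but do not count against the classification accuracy.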

3. Results

**Example results: ground-truth labels, FCN-8s, U-Net and the proposed method.**

**Nuclei segmentation results on the test set**

**Nuclei fine-grained classification accuracies (%) on the test set.**

All three model variants achieve relatively good segmentation and fine-grained classification results, showing that combining the two tasks is feasible.

Compared to FCN-8s and U-Net, the proposed method has improvements on the segmentation of all types of nuclei, especially on lymphocytes.
The proposed method also outperforms FCN-8s and U-Net on the fine-grained classification.

Both transfer learning and perceptual loss improve the performance of segmentation and classification.

Reference

[2019 ISBI] [Qu ISBI’19]
Joint Segmentation and Fine-Grained Classification of Nuclei in Histopathology Images

Biomedical Multi-Task Learning

2018 [ResNet+Mask R-CNN] [cU-Net+PE] [Multi-Task Deep U-Net] [cGAN-AutoEnc & cGAN-Unet] 2019 [cGAN+AC+CAW] [Qu ISBI’19]

Written by Sik-Ho Tsang
