Review — Breast Tumor Segmentation and Shape Classification in Mammograms Using Generative Adversarial and Convolutional Neural Network
cGAN for Segmentation, CNN for Classification
Breast Tumor Segmentation and Shape Classification in Mammograms Using Generative Adversarial and Convolutional Neural Network,
cGAN JESWA’20, by Universitat Rovira i Virgili, A*STAR, and Hospital Universitari Sant Joan, 2020 JESWA, Over 130 Citations (Sik-Ho Tsang @ Medium) Medical Imaging, Medical Image Analysis, Multi-Task Learning, Image Segmentation, Image Classification
- A conditional Generative Adversarial Network (cGAN) is used to segment a breast tumor within a region of interest (ROI) in a mammogram.
- The generative network learns to recognize the tumor area and to create the binary mask that outlines it. In turn, the adversarial network learns to distinguish between real (ground-truth) and synthetic segmentations, thus forcing the generative network to create binary masks that are as realistic as possible.
- Overall Framework
- Conditional GAN (cGAN) for Image Segmentation
- Shape classification model (CNN) for Image Classification
- Image Segmentation Results
- Image Classification Results
1. Overall Framework
- The proposed CAD system is divided into two stages: breast tumor segmentation and shape classification.
- Before feeding into the first stage, SSD is used to locate the tumor position and fit a bounding box around it.
- SSD is found to be the best among the methods in the above table.
1.3. Loose Frame
- A so-called “loose frame” is used to expand the original bounding box by adding extra space around it.
- The loose frame provides a suitable proportion of healthy to tumorous pixels.
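A minimal sketch of such a bounding-box expansion, clamped to the image borders (the 30% relative margin used here is an assumption for illustration, not a value from the paper):

```python
def loose_frame(box, image_shape, margin=0.3):
    """Expand an (x1, y1, x2, y2) bounding box by a relative margin on
    each side, clamped to the image borders.  The 30% margin is an
    assumed value, not taken from the paper."""
    x1, y1, x2, y2 = box
    h, w = image_shape[:2]
    dx = (x2 - x1) * margin  # extra space in x
    dy = (y2 - y1) * margin  # extra space in y
    return (max(0, int(x1 - dx)), max(0, int(y1 - dy)),
            min(w, int(x2 + dx)), min(h, int(y2 + dy)))
```

For example, a 10 ×10 box at (10, 10) inside a 100 ×100 image expands to (7, 7, 23, 23), while a box touching the image border is clamped instead of spilling outside.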
- ROI images are scaled to 256 ×256 pixels, which is the optimal cGAN input size found experimentally.
- After scaling, they are pre-processed for noise removal, and then the contrast is enhanced using histogram equalization. Finally, the pixel values are normalized into the range [0, 1].
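The last two of these steps can be sketched with numpy alone (a rough illustration of histogram equalization followed by [0, 1] normalization; the paper's noise-removal filtering is not reproduced here):

```python
import numpy as np

def equalize_and_normalize(img):
    """Histogram-equalize an 8-bit grayscale ROI, then scale it to [0, 1].
    A numpy-only sketch of the pre-processing described above."""
    hist = np.bincount(img.ravel(), minlength=256)   # per-intensity counts
    cdf = hist.cumsum().astype(np.float64)           # cumulative distribution
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # map CDF to [0, 1]
    return cdf[img]  # equalized image, already normalized to [0, 1]
```

Because the equalized intensities are taken directly from the normalized CDF, the output lands in [0, 1] without a separate rescaling step.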
1.5. Segmentation Then Classification
- The prepared data is then fed to the cGAN to obtain a binary mask of the breast tumor, which is post-processed using morphological operations (filter sizes of 3 ×3 for closing, 2 ×2 for erosion, and 3 ×3 for dilation) to remove small speckles, as above.
- The output binary mask is downsampled to 64 ×64 pixels and then fed to a multi-class CNN shape descriptor to categorize it into four classes: irregular, lobular, oval and round.
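The morphological clean-up can be sketched with numpy alone, using the filter sizes quoted above (border handling is simplified here, so this is an illustration rather than the paper's exact implementation):

```python
import numpy as np

def _morph(mask, k, erode):
    """k x k box erosion (erode=True) or dilation (erode=False), numpy only."""
    h, w = mask.shape
    p = k // 2
    padded = np.pad(mask.astype(bool), ((p, k - 1 - p), (p, k - 1 - p)))
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(k) for dx in range(k)])
    # erosion: every pixel in the window must be on; dilation: any pixel
    return windows.all(axis=0) if erode else windows.any(axis=0)

def postprocess(mask):
    """3x3 closing, 2x2 erosion, then 3x3 dilation, as described above,
    to remove small speckles from the predicted binary mask."""
    m = _morph(_morph(mask, 3, False), 3, True)  # 3x3 closing = dilate, erode
    m = _morph(m, 2, True)                       # 2x2 erosion kills speckles
    m = _morph(m, 3, False)                      # 3x3 dilation restores bulk
    return m.astype(np.uint8)
```

A single-pixel speckle does not survive the 2 ×2 erosion, while a solid tumor blob is preserved. Downsampling a 256 ×256 mask to 64 ×64 can then be done with simple striding, e.g. `mask[::4, ::4]` (nearest-neighbor; the paper does not specify the interpolation).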
2. Conditional GAN (cGAN) for Image Segmentation
- The Generator G network of the cGAN is an FCN composed of encoding and decoding layers, which learn the intrinsic features of healthy and unhealthy (tumor) breast tissue, and generate a binary mask according to these features.
- The Discriminative D network of the cGAN assesses if a given binary mask is likely to be a realistic segmentation or not.
- (For architecture details, please feel free to read the above figure or paper.)
2.2. Loss Functions
- Let x be a tumor ROI, y the ground truth mask, z a random variable, λ an empirical weighting factor, G(x, z) and D(x, G(x, z)) the outputs of G and D, respectively.
- Then, the loss function of G is defined as:
- where z is introduced as Dropout in the decoding layers Dn1, Dn2 and Dn3, and lDice(y, G(x, z)) is the Dice loss of the predicted mask with respect to the ground truth, which is defined as:
- where ◦ is the pixel-wise multiplication of the two images and |·| is the total sum of pixel values of a given image.
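From that definition, the Dice loss is straightforward to implement (the small eps stabilizer is an implementation detail added here, not something stated in the paper):

```python
import numpy as np

def dice_loss(y, g, eps=1e-7):
    """l_Dice(y, g) = 1 - 2|y o g| / (|y| + |g|), where o is the pixel-wise
    multiplication of the two masks and |.| sums all pixel values.
    eps avoids division by zero for empty masks (an assumption, not from
    the paper)."""
    inter = np.sum(y * g)  # |y o g|
    return 1.0 - (2.0 * inter + eps) / (np.sum(y) + np.sum(g) + eps)
```

A perfect prediction gives a loss of 0, and a completely missed mask gives a loss of (almost exactly) 1.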
- The loss function of D is:
- Its two terms compute the BCE loss for the real (ground-truth) mask and the generated mask, respectively.
- The optimization of G and D is done concurrently, i.e., one optimization step for each network at each iteration.
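Since the equation images are not reproduced here, the two objectives described above can be written out (reconstructed in pix2pix-style notation; the exact expectation subscripts are assumptions) as:

```latex
% Generator objective: adversarial BCE term plus weighted Dice term
\mathcal{L}_{G} = \mathbb{E}_{x,y,z}\!\left[-\log D\big(x, G(x,z)\big)\right]
                + \lambda\,\ell_{\mathrm{Dice}}\big(y, G(x,z)\big)

% Dice loss between ground truth y and predicted mask \hat{y}
\ell_{\mathrm{Dice}}(y, \hat{y}) = 1
  - \frac{2\,\lvert y \circ \hat{y} \rvert}{\lvert y \rvert + \lvert \hat{y} \rvert}

% Discriminator objective: BCE on the real and on the generated mask
\mathcal{L}_{D} = \mathbb{E}_{x,y}\!\left[-\log D(x, y)\right]
                + \mathbb{E}_{x,z}\!\left[-\log\big(1 - D(x, G(x,z))\big)\right]
```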
- The above figure shows that training with the Dice loss achieves lower (better) loss values than training with the L1-norm loss.
3. Shape classification model (CNN) for Image Classification
- The CNN attempts to use only shape context to classify the tumor shapes. (For architecture details, please feel free to read the above figure or paper.)
- A weighted categorical cross-entropy loss is used to mitigate the problem of the unbalanced dataset.
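A minimal sketch of such a loss (weighting each class inversely to its frequency is a common choice assumed here; the paper only states that a weighted categorical cross-entropy is used):

```python
import numpy as np

def weighted_cce(probs, labels, class_counts):
    """Weighted categorical cross-entropy over a batch.
    probs: (N, C) softmax outputs; labels: (N,) integer class indices;
    class_counts: per-class sample counts in the training set.
    Inverse-frequency weighting is an assumed scheme, not from the paper."""
    w = 1.0 / np.asarray(class_counts, dtype=np.float64)
    w *= len(class_counts) / w.sum()                # normalize to mean weight 1
    picked = probs[np.arange(len(labels)), labels]  # probability of true class
    return float(np.mean(-w[np.asarray(labels)] * np.log(picked + 1e-12)))
```

With equal class counts the weights all become 1 and the loss reduces to the plain categorical cross-entropy; rare classes (e.g. round masses) otherwise contribute more to the gradient.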
4. Image Segmentation Results
According to the results, the proposed method outperforms the compared state-of-the-art methods in all cases except for the IoU computed on tight crops of the private dataset. The SLSDeep approach yielded the best IoU (79.93%), whereas the proposed method yielded the second best result (79.87%) with a very small difference of 0.06%.
- The post-processing improved the results of the proposed model by 1% for all three framing inputs.
The proposed method clearly outperforms the rest for all tumors except for the second one.
5. Image Classification Results
The proposed method yielded around 73% classification accuracy for the irregular and lobular classes.
The proposed classifier, based only on binary masks, yields an overall accuracy of 80%, outperforming the second-best method.
The above figure shows the ROC curves, illustrating that the proposed model attained an AUC of about 0.8.
Most Luminal-A and Luminal-B samples (96/123 and 82/107, respectively) are assigned to the irregular and lobular shape classes. In turn, oval and round tumor shapes are indicative of the Her-2 and Basal-like samples.
Three samples are found to be mis-segmented because they contain two tumors: one in the center, which is properly segmented, and another shown partially at the lower-left border of the image, which is wrongly ignored as a non-tumor region (false negative).