Brief Review — Conditional Generative Adversarial and Convolutional Networks for X-ray Breast Mass Segmentation and Shape Classification

cGAN+Autoencoder, or cGAN+U-Net

Sik-Ho Tsang
3 min read · Nov 6, 2022


Conditional Generative Adversarial and Convolutional Networks for X-ray Breast Mass Segmentation and Shape Classification,
cGAN-AutoEnc & cGAN-Unet
, by Bioinformatics Institute, Kayakalp Hospital, and Hospital Universitari Sant Joan de Reus
2018 MICCAI, Over 40 Citations (Sik-Ho Tsang @ Medium)
Medical Image Analysis, Image Segmentation, Image Classification

  • Conditional Generative Adversarial Networks (cGAN) are used for breast mass segmentation in mammography. An Autoencoder or a U-Net can be used as the backbone of the generator.
  • After segmentation, another CNN is used for shape classification.


  1. Breast Mass Segmentation Using cGAN
  2. Shape Classification Using CNN
  3. Results

1. Breast Mass Segmentation Using cGAN

Proposed framework for breast mass segmentation and shape classification
  • The generator network G of the cGAN is a fully convolutional network composed of an encoder and a decoder.
  • The backbone can be either an Autoencoder or a U-Net. It outputs a binary mask in which each pixel belongs to one of two classes (mass/normal).
  • The discriminator network D classifies whether a binary mask is generated by G or is the real ground truth.
  • Let x represent a mass ROI image, y the corresponding ground-truth segmentation, z a random variable, G(x, z) the predicted mask, ||y − G(x, z)||₁ the L1 distance between the ground-truth and predicted masks, λ an empirical weighting factor, and D(x, G(x, z)) the output score of the discriminator. The generator loss is defined as:

ℓ_Gen(G, D) = E[−log D(x, G(x, z))] + λ E[||y − G(x, z)||₁]

  • The discriminator loss is:

ℓ_Dis(G, D) = E[−log D(x, y)] + E[−log(1 − D(x, G(x, z)))]
  • This combination of generator/discriminator networks allows robust learning with very few training samples.
  • A post-processing morphological filtering (i.e., erosion and dilation) is used to remove the artifacts and small white regions from the binary masks.
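The two losses and the post-processing step can be sketched as follows. This is a minimal numpy illustration, not the authors' code: the λ default and the 3×3 structuring element are assumptions, and the L1 term is mean-normalized for readability.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def generator_loss(d_score, y, g_mask, lam=10.0):
    """Adversarial term plus lambda-weighted L1 distance to the ground truth.
    `lam` stands in for the paper's empirical weighting factor (value assumed)."""
    adversarial = -np.log(d_score + 1e-12)   # -log D(x, G(x, z))
    l1 = np.abs(y - g_mask).mean()           # ||y - G(x, z)||_1, mean-normalized
    return adversarial + lam * l1

def discriminator_loss(d_real, d_fake):
    """Score real masks high and generated masks low."""
    return -np.log(d_real + 1e-12) - np.log(1.0 - d_fake + 1e-12)

def morphological_open(mask):
    """Erosion followed by dilation (3x3) to remove small white artifacts
    from the predicted binary mask."""
    pad = np.pad(mask, 1, constant_values=0)
    eroded = sliding_window_view(pad, (3, 3)).min(axis=(2, 3))
    pad = np.pad(eroded, 1, constant_values=0)
    return sliding_window_view(pad, (3, 3)).max(axis=(2, 3))
```

For example, an isolated white pixel in the predicted mask is removed by `morphological_open`, while a solid mass region survives the erosion/dilation pair.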

2. Shape Classification Using CNN

  • Since the input images for this stage (binary masks) contain no complex distribution of pixel values, only morphological structure, a simple CNN (i.e., two convolutional layers plus two fully connected layers) is sufficient to learn a generalization of the four mass shapes (irregular, lobular, oval, and round).
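The two-conv-plus-two-FC design can be sketched as a toy forward pass. The 32×32 input size, filter counts, and hidden width are illustrative assumptions (the paper's exact hyperparameters are not reproduced here), and random weights stand in for trained ones:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)

def conv2d(x, w):
    """Valid 2-D convolution; x: (H, W, Cin), w: (k, k, Cin, Cout)."""
    k = w.shape[0]
    win = sliding_window_view(x, (k, k), axis=(0, 1))  # (H-k+1, W-k+1, Cin, k, k)
    return np.einsum('hwcij,ijcd->hwd', win, w)

def maxpool2(x):
    """2x2 max pooling, dropping any odd trailing row/column."""
    h, w, c = x.shape
    return x[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def shape_classifier(mask):
    """Toy forward pass: two conv layers + two fully connected layers -> 4 shape logits
    (irregular, lobular, oval, round). Weights are random placeholders."""
    x = mask[..., None].astype(float)                               # (32, 32, 1)
    x = np.maximum(conv2d(x, rng.standard_normal((3, 3, 1, 8))), 0)   # (30, 30, 8)
    x = maxpool2(x)                                                   # (15, 15, 8)
    x = np.maximum(conv2d(x, rng.standard_normal((3, 3, 8, 16))), 0)  # (13, 13, 16)
    x = maxpool2(x)                                                   # (6, 6, 16)
    x = x.ravel()
    x = np.maximum(x @ rng.standard_normal((x.size, 64)), 0)
    return x @ rng.standard_normal((64, 4))                           # 4 logits
```

The point of the sketch is the small capacity: a binary mask carries only shape information, so a few layers suffice.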

3. Results

3.1. Segmentation

Accuracy, Dice Coefficient, Jaccard Index, Sensitivity and Specificity
  • The cGAN-Unet provides the best results across all computed metrics on the DDSM test samples, with remarkable Accuracy, Dice and Jaccard scores (around 97%, 94% and 89%, respectively).
  • On the in-house private hospital dataset, however, the cGAN-AutoEnc yields better results than the cGAN-Unet in terms of Dice, Jaccard and Sensitivity (+2%, +4% and +12%, respectively). This indicates that the cGAN-AutoEnc has learned a more generalized representation of tumor features, since it performs better on the dataset not used for training.
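As a sanity check, the five reported metrics can all be derived from the pixel-wise confusion matrix between the predicted and ground-truth binary masks. A minimal numpy sketch (not the authors' evaluation code):

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Pixel-wise Accuracy, Dice, Jaccard, Sensitivity and Specificity
    from two binary masks (assumes the ground truth is neither empty nor full)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)    # mass pixels correctly predicted
    tn = np.sum(~pred & ~gt)  # normal pixels correctly predicted
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    return {
        'accuracy':    (tp + tn) / (tp + tn + fp + fn),
        'dice':        2 * tp / (2 * tp + fp + fn),
        'jaccard':     tp / (tp + fp + fn),
        'sensitivity': tp / (tp + fn),
        'specificity': tn / (tn + fp),
    }
```

A perfect prediction scores 1.0 on every metric; Dice weights the overlap more generously than Jaccard, which is why the reported Dice (~94%) exceeds the Jaccard (~89%) on the same masks.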
Qualitative Examples

3.2. Classification

  • For the shape classification, the proposed CNN obtains an overall accuracy of around 72%.

3.3. Relationship Between Shapes and Subtypes

Distribution of breast cancer molecular subtype samples from the hospital dataset with respect to their predicted mask shapes.
  • Tumor shape could play an important role in predicting the breast cancer molecular subtypes [18].
  • The Luminal-A and -B groups are mostly assigned to the irregular and lobular shape classes.
  • In turn, oval and round masses are indicative of the Her-2 and Basal-like groups.
