Review — Breast Cancer Classification in Automated Breast Ultrasound Using Multiview Convolutional Neural Network with Transfer Learning

Multiview CNN, Using Multiview Inputs

Sik-Ho Tsang
Jan 1, 2023
Happy New Year 2023 (Image from Pixabay)
  • Happy New Year 2023 !!!

Breast Cancer Classification in Automated Breast Ultrasound Using Multiview Convolutional Neural Network with Transfer Learning,
2020 J. UltraMedBio, by University of Saskatchewan, and Jeonbuk National University Medical School
Multiview CNN, Over 50 Citations (Sik-Ho Tsang @ Medium)
Medical Imaging, Medical Image Analysis, Image Classification

  • A modified Inception-v3 architecture is proposed to classify breast lesions as benign or malignant.
  • Because automated breast ultrasound (ABUS) images can be visualized in both transverse and coronal views, a multiview CNN is further proposed using the modified Inception-v3 backbone.


  1. Modified Inception-v3
  2. Multiview CNN
  3. Results

1. Modified Inception-v3

1.1. Dataset

Distribution of the number of lesions by size
  • A total of 316 breast lesions in 263 patients were included in the dataset, which consists of 135 malignant and 181 benign lesions.
  • Mean lesion size was 13.23 mm with a standard deviation of 4.29 mm. A detailed distribution of the number of lesions by size is provided above.
(a) Two lesion patches obtained from two slices of the same benign lesion in transverse view. (b) Two lesion patches obtained from two slices of the same benign lesion in coronal view.
  • Multiple lesion patches were cropped from different slices of each lesion.
  • In total, 743 malignant patches (359 and 384 patches from coronal and transverse views, respectively) and 419 benign patches (233 and 186 patches from coronal and transverse views, respectively) were obtained.
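As a quick sanity check on the patch counts above, the per-class and per-view totals can be tallied in a few lines. This is only a sketch: the numbers are copied from the bullet above, while the dictionary layout and the helper name `total_by` are my own choices for illustration.

```python
# Patch counts as reported above, keyed by (class, view).
patch_counts = {
    ("malignant", "coronal"): 359,
    ("malignant", "transverse"): 384,
    ("benign", "coronal"): 233,
    ("benign", "transverse"): 186,
}


def total_by(index, counts):
    """Sum the counts grouped by one element of the (class, view) key."""
    totals = {}
    for key, n in counts.items():
        totals[key[index]] = totals.get(key[index], 0) + n
    return totals


per_class = total_by(0, patch_counts)  # malignant vs. benign
per_view = total_by(1, patch_counts)   # coronal vs. transverse
total = sum(patch_counts.values())
malignant_fraction = per_class["malignant"] / total

print(per_class, per_view, total, round(malignant_fraction, 3))
```

Note that the patch-level set leans malignant (743 vs. 419), whereas the lesion-level set leans benign (135 vs. 181), so more patches per lesion were cropped from malignant lesions on average.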

1.2. Inception-v3 Module

Architectures of (a) Inception module A, (b) Inception module B and (c) Inception module C.
  • (a) Inception A: Equivalent to the inception module used in GoogLeNet, except that the 5×5 convolution is factorized into two 3×3 convolutions.
  • (b) Inception B: The 7×7 convolution is factorized into two asymmetric convolutions with kernel sizes of 1×7 and 7×1.
  • (c) Inception C: The kernel sizes are 1×1, 1×3, 3×1 and 3×3.
  • ImageNet pretrained weights are used.
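To see why these factorizations are attractive, it helps to count the weights per input/output channel pair. The sketch below is a back-of-the-envelope calculation of my own, assuming the channel count stays the same across the stacked convolutions and ignoring biases; `kernel_weights` is a hypothetical helper, not something from the paper.

```python
def kernel_weights(kernels):
    """Weights per input/output channel pair for a stack of (height, width) kernels."""
    return sum(h * w for h, w in kernels)


# Inception A: one 5x5 convolution vs. two stacked 3x3 convolutions.
five_by_five = kernel_weights([(5, 5)])
two_three_by_three = kernel_weights([(3, 3), (3, 3)])

# Inception B: one 7x7 convolution vs. a 1x7 followed by a 7x1 convolution.
seven_by_seven = kernel_weights([(7, 7)])
asymmetric_pair = kernel_weights([(1, 7), (7, 1)])

saving_a = 1 - two_three_by_three / five_by_five
saving_b = 1 - asymmetric_pair / seven_by_seven

print(five_by_five, two_three_by_three, round(saving_a, 2))
print(seven_by_seven, asymmetric_pair, round(saving_b, 2))
```

Under these assumptions, both factorized forms cover the same receptive field (5×5 and 7×7, respectively) with noticeably fewer weights.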

1.3. Inception-v3 Backbone

Architectures of (a) Inception-v3 backbone and (b) modified Inception-v3 convolutional neural network (CNN).
  • The input size of the backbone is 299×299×3.
  • The first several layers of the backbone consist of six convolutional layers with kernel sizes of 3×3 and an average pooling layer with a kernel size of 3×3, followed by five Inception A, four Inception B and two Inception C modules.
  • The backbone outputs 2048 feature maps, and each feature map has a size of 8×8.
  • A global average pooling layer is then added.
  • Three fully connected (FC) layers are appended at the end, with 256, 128 and 2 neurons, respectively. Dropout is applied after the first and the second FC layers to reduce overfitting.
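The size of the classification head described above can be estimated with simple arithmetic. This is a rough sketch, assuming each FC layer has a bias term (the review does not say) and that dropout adds no parameters; `dense_params` is a hypothetical helper. It also shows what the first FC layer would cost if the 8×8 maps were flattened instead of globally averaged.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer (bias is an assumption)."""
    return n_in * n_out + n_out


# Backbone output: 2048 feature maps of size 8x8.
channels, height, width = 2048, 8, 8

# Global average pooling collapses each 8x8 map to a single value.
pooled_features = channels

# FC sizes from the text: 256 -> 128 -> 2.
layer_sizes = [pooled_features, 256, 128, 2]
head_params = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_head_params = sum(head_params)

# For comparison: first FC layer if the maps were flattened instead of pooled.
flattened_first_fc = dense_params(channels * height * width, 256)

print(head_params, total_head_params, flattened_first_fc)
```

Under these assumptions the head has roughly 0.56 M parameters, while flattening the feature maps would make the first FC layer alone about 60 times larger.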

2. Multiview CNN

Architectures of proposed multiview convolutional neural networks (CNNs).
  • (a) Multiview CNN A: Two lesion patches are cropped. The multiview lesion patch is generated by concatenating the cropped lesion patches, e.g., one lesion patch from the coronal view (CA) and two lesion patches from the transverse view (TA and TB).
  • By using different combinations during training, i.e., CA-TA-TA, CA-TA-TB, CA-TB-TA and CA-TB-TB, 3085 multiview patches (1767 malignant and 1318 benign) are obtained to train the network.
  • (b) Multiview CNN B: Two Inception-v3 backbones are adopted, and the features extracted by the two backbones are concatenated on top of the first FC layer.
  • 1525 lesion samples (549 malignant and 468 benign) can be obtained to train the multiview CNN B.
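The CA/TA/TB combinations listed for multiview CNN A can be enumerated mechanically. The sketch below is my reading of the four examples (one coronal patch followed by every ordered pair of transverse patches, repeats allowed); the paper's exact sampling procedure may differ, and `multiview_combinations` is a hypothetical helper.

```python
from itertools import product


def multiview_combinations(coronal, transverse):
    """Pair each coronal patch with every ordered pair of transverse patches."""
    return [
        (c, t1, t2)
        for c in coronal
        for t1, t2 in product(transverse, repeat=2)
    ]


# One coronal patch (CA) and two transverse patches (TA, TB), as in the example.
combos = multiview_combinations(["CA"], ["TA", "TB"])
labels = ["-".join(combo) for combo in combos]

print(labels)
```

With this rule, a lesion with n coronal and m transverse patches yields n × m² multiview patches, which is how a modest number of single-view patches can grow into a much larger training set.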

3. Results

3.1. Single view vs Multiview

Comparison of multiview and single-view CNNs.

Multiview CNN A achieved the best performance, with a sensitivity of 0.886, a specificity of 0.876 and a mean AUC of 0.9468 (standard deviation 0.0164).

  • Inference with multiview CNN A was faster than with multiview CNN B (34.34 ms/lesion vs. 73.79 ms/lesion).
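For readers less familiar with the metrics above, here is a minimal sketch of how sensitivity and specificity follow from confusion-matrix counts, plus the inference-time ratio implied by the two timings. I assume malignant is the positive class. The confusion counts are made-up placeholders for illustration only, not results from the paper; only the two timings come from the text.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of positive (here: malignant) cases that are correctly detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of negative (here: benign) cases that are correctly cleared."""
    return true_neg / (true_neg + false_pos)


# Hypothetical confusion counts, NOT taken from the paper.
tp, fn, tn, fp = 90, 10, 80, 20
example_sensitivity = sensitivity(tp, fn)
example_specificity = specificity(tn, fp)

# Inference times quoted above (ms per lesion).
time_cnn_a, time_cnn_b = 34.34, 73.79
speed_ratio = time_cnn_b / time_cnn_a

print(example_sensitivity, example_specificity, round(speed_ratio, 2))
```

The timing ratio works out to a little over 2, which is consistent with multiview CNN B running two backbones per lesion while multiview CNN A runs one.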

3.2. SOTA Comparison (CNN)

Classification performance of multiview CNN A using different backbones

The multiview CNN A with the Inception-v3 backbone outperformed those with ResNet, DenseNet, Inception-v4 or Inception-ResNetv2 backbones.

3.3. SOTA Comparison (ML)

Comparison of the multiview CNN A with conventional machine learning approaches.

HOG outperformed PCA but still could not reach the classification performance of the multiview CNN.

3.4. Observers’ Performance

Results of observer performance test

With the aid of the multiview CNN A, all human reviewers improved in diagnostic accuracy. The reduced AUC value for the special radiologist was not statistically significant.

I reviewed 34 papers in November and 34 papers in December, about one paper per day. This was a task I set for myself before the arrival of 2023. It was quite exhausting, so I will slow down a bit starting today. : )

Wish you all Happy Learning & Happy New Year 2023 !! : )


