Brief Review — Classification of Heart Sounds Using Chaogram Transform and Deep Convolutional Neural Network Transfer Learning

Chaogram + Inception-v3

Sik-Ho Tsang
Apr 20, 2024
Flow Diagram

Classification of Heart Sounds Using Chaogram Transform and Deep Convolutional Neural Network Transfer Learning
Chaogram + Inception-v3, by Islamic Azad University, University of Southern Queensland (USQ), Universidade do Porto
2022 MDPI Sensors (Sik-Ho Tsang @ Medium)

Phonocardiogram (PCG)/Heart Sound Classification
2013–2023 [2LSTM+3FC, 3CONV+2FC] [NRC-Net] [Log-MelSpectrum+Modified VGGNet] [CNN+BiGRU] [CWT+MFCC+DWT+CNN+MLP] [LSTM U-Net (LU-Net)] [DL Overview] [MFCC + k-NN / RF / ANN / SVM + Grid Search] [Long-Short Term Features (LSTF)] [WST+1D-CNN and CST+2D-CNN Ensemble] [CTENN] [Bispectrum + ViT]

  • The reconstructed phase space (RPS) representation of the phonocardiogram (PCG) signal is projected onto the three coordinate planes to form a chaogram image.
  • Pretrained deep convolutional neural networks, e.g. Inception-v3, are then fine-tuned on these images for heart sound classification.


  1. Chaogram + Inception-v3
  2. Results

1. Chaogram + Inception-v3

1.1. Preprocessing

A two-stage noise cancellation technique proposed by Whitaker et al. is used. First, a third-order Butterworth band-pass filter with a passband of 15 to 800 Hz is applied. Then, spectral subtraction denoising is employed.
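As a minimal sketch of the first denoising stage only, the band-pass filter could look like the following with SciPy. The sampling rate of 2000 Hz, the zero-phase (forward-backward) filtering, and the function name `bandpass_pcg` are assumptions made for illustration and are not stated in the review; the spectral subtraction stage is not shown.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass_pcg(signal, fs=2000.0, low_hz=15.0, high_hz=800.0, order=3):
    """Band-pass a 1-D PCG signal with a Butterworth filter (hypothetical helper)."""
    # fs = 2000 Hz is an assumed sampling rate; high_hz must stay below fs / 2.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    # Forward-backward filtering gives zero phase shift (an assumption, not from the paper).
    return sosfiltfilt(sos, signal)

# Synthetic check: a 100 Hz tone (in band) plus a slow 2 Hz drift (out of band).
t = np.arange(0, 2.0, 1 / 2000.0)
tone = np.sin(2 * np.pi * 100 * t)
drift = 2.0 * np.sin(2 * np.pi * 2 * t)
filtered = bandpass_pcg(tone + drift)
```

Away from the signal edges, `filtered` should follow `tone` closely, since the 2 Hz drift lies well below the 15 Hz lower cut-off.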

1.2. Transformation of One-Dimensional PCG Signal into a Chaogram Image

The chaogram image is built from the reconstructed phase space (RPS), a useful tool for analysing a system’s nonlinear and chaotic behaviour [29]. A phase space comprises the collection of all possible states of a system.

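  • The RPS can be built by defining the delay vectors, which in the standard time-delay embedding form are: v_n = [S_n, S_{n+τ}, ..., S_{n+(d−1)τ}]
  • where S_n, with n = 1, 2, 3, ..., N, is the nth sample of the PCG signal; d and τ denote the embedding dimension and time delay, respectively.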

The authors use the skedm library to implement the phase space reconstruction, with d = 3 and τ = 18 chosen as the optimal values.
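To make the reconstruction concrete, here is a minimal NumPy sketch of a standard time-delay embedding with d = 3 and τ = 18. It does not use the skedm library, and the function name `delay_embed` is a hypothetical helper, so treat it as an illustration of the idea rather than the authors' implementation.

```python
import numpy as np

def delay_embed(signal, d=3, tau=18):
    """Return the matrix of delay vectors, shape (N - (d - 1) * tau, d)."""
    signal = np.asarray(signal, dtype=float)
    n_vectors = signal.size - (d - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal is too short for this d and tau")
    # Column k holds the signal shifted by k * tau samples.
    return np.column_stack(
        [signal[k * tau : k * tau + n_vectors] for k in range(d)]
    )

# Toy input: a ramp of 100 samples, so each row is easy to check by eye.
rps = delay_embed(np.arange(100))
```

Each row of `rps` is one point in the 3D phase space; with the ramp input, the first row contains samples 0, 18, and 36.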


Six RPS plots for normal and abnormal signals are shown above. Some differences between the RPS of normal and abnormal signals can be observed.

Chaogram Image

After that, the projections of this 3D RPS tensor onto the XY, XZ, and YZ planes are computed as three images, Ixy, Ixz, and Iyz.
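The projection step can be sketched as three 2D histograms of the RPS points, one per coordinate plane. The image size of 64 × 64, the joint min-max normalisation, the per-channel scaling, and the stacking of Ixy, Ixz, and Iyz as three image channels are all assumptions made for this illustration; `chaogram_from_rps` is a hypothetical name.

```python
import numpy as np

def chaogram_from_rps(rps, bins=64):
    """Project 3-D phase-space points onto the XY, XZ and YZ planes."""
    rps = np.asarray(rps, dtype=float)
    # Normalise all coordinates jointly to [0, 1] (assumed scheme).
    lo, hi = rps.min(), rps.max()
    span = hi - lo if hi > lo else 1.0
    unit = (rps - lo) / span
    channels = []
    for a, b in [(0, 1), (0, 2), (1, 2)]:  # XY, XZ, YZ
        counts, _, _ = np.histogram2d(
            unit[:, a], unit[:, b], bins=bins, range=[[0, 1], [0, 1]]
        )
        peak = counts.max()
        # Scale each projection so its densest cell equals 1.
        channels.append(counts / peak if peak > 0 else counts)
    # Stack Ixy, Ixz, Iyz as the three channels of one image (assumption).
    return np.stack(channels, axis=-1)

# Random points stand in for a real RPS here.
rng = np.random.default_rng(0)
image = chaogram_from_rps(rng.normal(size=(500, 3)))
```

Stacking the projections as channels gives an array shaped like an RGB image, which is the input format that ImageNet-pretrained CNNs expect.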

1.3. Pretrained CNN Models

  • Pretrained AlexNet, VGG-16, GoogleNet, Inception-v3, and ResNet-50 are evaluated.
  • The PhysioNet dataset is used for fine-tuning.
  • Only the last two layers of the network are allowed to fine-tune on this dataset.
  • Data augmentation is applied to the chaogram images, with rotation, scaling, width shift, and height shift.
  • A Dropout layer with p = 0.5 is used.

2. Results


Among the evaluated pretrained models, Inception-v3 is found to perform best.

The proposed method achieves better accuracy and recall than all other methods. Its precision is slightly lower than that of the other methods, and its F1 score is lower only than that of [44].
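For readers less familiar with these four metrics, the sketch below computes them from the counts of a binary confusion matrix. The counts are made up purely for illustration and are not results from the paper; `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts, NOT taken from the paper.
scores = binary_metrics(tp=80, fp=20, fn=10, tn=90)
```

With these made-up counts, recall (about 0.889) is higher than precision (0.80). That is the same kind of trade-off described above: a classifier that flags more abnormal cases tends to raise recall while lowering precision.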


