Brief Review — Fundamental Heart Sound Classification using the Continuous Wavelet Transform and Convolutional Neural Networks

CWT Scalogram+CNN

Sik-Ho Tsang
2 min read · Jan 3, 2024

Fundamental Heart Sound Classification using the Continuous Wavelet Transform and Convolutional Neural Networks
CWT Scalogram+CNN, EMBC’18, by AUT Institute of Biomedical Technologies, University of Auckland
2018 EMBC, Over 50 Citations (Sik-Ho Tsang @ Medium)

Heart Sound Classification
2013 … 2023 [2LSTM+3FC, 3CONV+2FC] [NRC-Net] [Log-MelSpectrum+Modified VGGNet] [CNN+BiGRU]

  • Continuous wavelet transform (CWT) scalograms and convolutional neural networks (CNNs) are used for normal and abnormal heart sound classification.

Outline

  1. CWT Scalograms+CNN
  2. Results

1. CWT Scalograms+CNN

1.1. CWT Scalograms

[Figure: CWT Scalograms]
  • Scalograms were created for all of the extracted heart sounds (Figure 1). A scalogram is a visual representation of the CWT of a signal, similar to a spectrogram created using a short-time Fourier transform (STFT).
  • All CWTs were calculated using the Morse analytic wavelet with 48 voices per octave, which resulted in 207 frequency windows between 23 and 423 Hz.

Each scalogram was stored as a 207×150 pixel grayscale image.
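To make the scalogram step concrete, here is a minimal NumPy sketch of a CWT with a generalized Morse wavelet, followed by conversion to a 207×150 grayscale image. It is not the authors' implementation. Several values are assumptions that do not come from the paper review: the wavelet parameters (gamma = 3, beta = 20), the 1000 Hz sampling rate, the 207 log-spaced bands from 23 to 423 Hz (which works out to roughly 49 voices per octave, close to the reported 48), and the linear interpolation used to reach 150 columns. The function names `morse_cwt` and `scalogram_image` are hypothetical.

```python
import numpy as np

def morse_cwt(signal, fs, freqs, gamma=3.0, beta=20.0):
    """CWT with a generalized Morse wavelet, applied in the frequency domain.

    gamma and beta are assumed values, not taken from the reviewed paper.
    Returns complex coefficients of shape (len(freqs), len(signal)).
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    omega = 2.0 * np.pi * np.fft.fftfreq(n)        # angular frequency, rad/sample
    positive = omega > 0                           # analytic wavelet: positive side only
    signal_hat = np.fft.fft(signal)
    peak = (beta / gamma) ** (1.0 / gamma)         # peak frequency of the mother wavelet
    norm = 2.0 * (np.e * gamma / beta) ** (beta / gamma)  # makes the peak value 2
    coeffs = np.empty((len(freqs), n), dtype=complex)
    for i, f in enumerate(freqs):
        scale = peak * fs / (2.0 * np.pi * f)      # scale whose peak sits at f Hz
        w = scale * omega[positive]
        wavelet_hat = np.zeros(n)
        wavelet_hat[positive] = norm * w**beta * np.exp(-(w**gamma))
        coeffs[i] = np.fft.ifft(signal_hat * wavelet_hat)
    return coeffs

def scalogram_image(coeffs, width=150):
    """Turn CWT coefficients into an 8-bit grayscale image with `width` columns."""
    mag = np.abs(coeffs)
    cols = np.arange(mag.shape[1])
    target = np.linspace(0, mag.shape[1] - 1, width)   # resample the time axis (assumption)
    resized = np.stack([np.interp(target, cols, row) for row in mag])
    span = resized.max() - resized.min()
    scaled = (resized - resized.min()) / (span if span > 0 else 1.0)
    return np.round(scaled * 255).astype(np.uint8)

# Usage on a synthetic 100 Hz tone (stand-in for an extracted heart sound).
fs = 1000.0                                   # hypothetical sampling rate in Hz
t = np.arange(1000) / fs
tone = np.sin(2.0 * np.pi * 100.0 * t)
freqs = np.geomspace(23.0, 423.0, 207)        # assumed log-spaced frequency bands
coeffs = morse_cwt(tone, fs, freqs)
img = scalogram_image(coeffs)                 # rows run from low to high frequency
```

For the synthetic tone, the brightest row of `img` falls at the band closest to 100 Hz, which is a quick sanity check that the scales are mapped to frequencies correctly.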

1.2. CNN

[Figure: CNN]
  • The architecture for the CNN used in this research is shown in Figure 2.

In this figure, the convolutional, batch normalization, ReLU, and pooling layers are visually combined into convolutional units for a clearer representation.
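The idea of a "convolutional unit" can be sketched in plain NumPy as convolution, then batch normalization, then ReLU, then pooling. This is only an illustration of the layer ordering: the review does not give kernel sizes, channel counts, or the pooling type, so the single 3×3 kernel, the 2×2 max pooling, and the function name `conv_unit` are all assumptions.

```python
import numpy as np

def conv_unit(x, kernel, eps=1e-5):
    """One single-channel convolutional unit: conv -> batch norm -> ReLU -> max pool.

    x: batch of grayscale images, shape (batch, height, width).
    kernel: 2-D filter (assumed 3x3 here; the real sizes are not stated).
    """
    b, h, w = x.shape
    kh, kw = kernel.shape
    oh, ow = h - kh + 1, w - kw + 1
    # "Valid" cross-correlation, accumulated one kernel tap at a time.
    conv = np.zeros((b, oh, ow))
    for i in range(kh):
        for j in range(kw):
            conv += kernel[i, j] * x[:, i:i + oh, j:j + ow]
    # Batch normalization over the batch and spatial axes (no learned scale/shift).
    normed = (conv - conv.mean()) / np.sqrt(conv.var() + eps)
    # ReLU non-linearity.
    act = np.maximum(normed, 0.0)
    # 2x2 max pooling (assumed pooling type); odd trailing rows/columns are cropped.
    ph, pw = oh // 2, ow // 2
    return act[:, :2 * ph, :2 * pw].reshape(b, ph, 2, pw, 2).max(axis=(2, 4))

rng = np.random.default_rng(0)
batch = rng.random((8, 207, 150))        # stand-in for 207x150 scalogram images
kernel = rng.normal(size=(3, 3))         # hypothetical filter weights
unit_out = conv_unit(batch, kernel)      # shape (8, 102, 74)
```

Stacking several such units shrinks the spatial size step by step before the fully connected layers take over.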

Two fully connected layers, of 20 and 2 neurons respectively, are used for classification.

A Dropout layer with a neuron retention rate of 50% is placed between the two fully connected layers for regularization.
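The classification head described above (20 neurons, dropout with 50% retention, 2 neurons) can be sketched as follows. The input feature size, the random weights, the softmax on the output, and the name `head_forward` are assumptions for illustration; the review does not state them.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 64                       # hypothetical size of the flattened conv output

# Randomly initialized weights, standing in for trained parameters.
W1, b1 = rng.normal(0.0, 0.1, (n_features, 20)), np.zeros(20)
W2, b2 = rng.normal(0.0, 0.1, (20, 2)), np.zeros(2)

def head_forward(x, training=False, keep_prob=0.5):
    """FC(20) -> dropout (training only) -> FC(2) -> softmax (assumed output)."""
    hidden = x @ W1 + b1
    if training:
        # Inverted dropout: keep each neuron with probability keep_prob and
        # rescale so the expected activation matches inference time.
        mask = rng.random(hidden.shape) < keep_prob
        hidden = hidden * mask / keep_prob
    logits = hidden @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

features = rng.normal(size=(4, n_features))   # stand-in for four feature vectors
probs = head_forward(features)                # inference: dropout is disabled
train_probs = head_forward(features, training=True)
```

Dropout is active only during training, so repeated inference calls on the same input give the same two-class probabilities.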

2. Results

[Figure: Results]
  • Support vector machine (SVM) and k-Nearest Neighbours (kNN) classifiers are used for comparison.
  • Local binary pattern (LBP) features are used as the comparison feature set.
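For readers unfamiliar with LBP features, a minimal sketch of the basic 3×3 local binary pattern histogram is given here. The review does not say which LBP variant or parameters were used, so the 8-neighbour, 256-bin form and the function name `lbp_histogram` are assumptions.

```python
import numpy as np

def lbp_histogram(image):
    """Basic 3x3 LBP: compare each pixel's 8 neighbours with the centre pixel,
    pack the comparisons into an 8-bit code, and return the normalized
    256-bin histogram of codes (border pixels are skipped)."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    center = image[1:-1, 1:-1]
    # Neighbour offsets in clockwise order, starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.uint8) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
scalogram = rng.integers(0, 256, size=(207, 150))   # stand-in grayscale scalogram
lbp_features = lbp_histogram(scalogram)             # 256-dimensional feature vector
flat_features = lbp_histogram(np.full((10, 10), 7)) # constant image: every code is 255
```

A feature vector like `lbp_features` is the kind of hand-crafted input that an SVM or kNN classifier could be trained on, in contrast to features learned by the CNN.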

The complete deep CNN, including the final fully connected layer, had the best overall performance (accuracy = 86.0%), but the difference from the SVM classifier trained with CNN features (accuracy = 85.9%) was not statistically significant (p = 0.391).
