Brief Review — Classifying Heart-Sound Signals Based on CNN Trained on MelSpectrum and Log-MelSpectrum Features

Modified VGGNet Using Log-MelSpectrum As Input Features

Dec 14, 2023

Classifying Heart-Sound Signals Based on CNN Trained on MelSpectrum and Log-MelSpectrum Features
Log-MelSpectrum+Modified VGGNet, by Nantong University
2023 MDPI Bioengineering (Sik-Ho Tsang @ Medium)
Heart Sound Classification
2013 … 2022 [CirCor Dataset] [CNN-LSTM] [DsaNet] [Modified Xception] [Improved MFCC+Modified ResNet] [Learnable Features + VGGNet/EfficientNet] [DWT + SVM] [MFCC+LSTM] [DWT+ 1D-CNN] [CNN+Attention] 2023 [2LSTM+3FC, 3CONV+2FC] [NRC-Net]

MelSpectrum and Log-MelSpectrum features of heart-sound signals combined with a mathematical model of cardiac-sound acquisition were analysed theoretically.
Results demonstrated that the Log-MelSpectrum features can reduce the classification difference between domains and improve the performance of CNNs.

Outline

MelSpectrum & Log-MelSpectrum Feature Extraction
Results

1. MelSpectrum & Log-MelSpectrum Feature Extraction

The heart-sound signals, sampled at 2000 Hz, are band-pass filtered from 25 Hz to 950 Hz using a Butterworth filter.
The signals are then passed through a Savitzky–Golay filter to improve the smoothness of the time-frequency feature graph and reduce noise interference.
The filtered signals are split into frames of a fixed, selected length, and each frame is windowed with a Hanning window function.
Frames are transformed into the periodogram estimate of the power spectrum using STFT.
Each periodogram estimate is mapped onto the Mel-scale using Mel filters, which consist of several triangular filters. The output of the Mel filter is called the MelSpectrum.

Logarithmic transformation is applied to the MelSpectrum features to obtain the Log-MelSpectrum.
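The steps above can be sketched in a few lines of NumPy/SciPy. This is only an illustrative sketch, not the authors' implementation: the filter order, Savitzky–Golay window, FFT size, hop size, and number of Mel filters (`order=4`, `window_length=11`, `n_fft=256`, `hop=64`, `n_mels=32`) are assumptions chosen for the example, since the paper's parameter table is not reproduced in this review. The helper names are hypothetical as well.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, savgol_filter, stft

FS = 2000  # sampling frequency in Hz (from the text)

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, fs, fmin, fmax):
    """Triangular filters whose centres are evenly spaced on the Mel scale."""
    freqs = np.linspace(0.0, fs / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2))
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_spectrum(x, fs=FS, n_fft=256, hop=64, n_mels=32, eps=1e-10):
    # 1) Butterworth band-pass, 25-950 Hz (order 4 is an assumption)
    sos = butter(4, [25, 950], btype="bandpass", fs=fs, output="sos")
    x = sosfiltfilt(sos, x)
    # 2) Savitzky-Golay smoothing (window and polynomial order are assumptions)
    x = savgol_filter(x, window_length=11, polyorder=3)
    # 3) Framing + Hann window + STFT, then power spectrum per frame
    _, _, Z = stft(x, fs=fs, window="hann", nperseg=n_fft, noverlap=n_fft - hop)
    power = np.abs(Z) ** 2
    # 4) Triangular Mel filters -> MelSpectrum
    mel = mel_filterbank(n_mels, n_fft, fs, 25.0, 950.0) @ power
    # 5) Logarithm -> Log-MelSpectrum (eps avoids log(0))
    return mel, np.log(mel + eps)

# Synthetic 2-second signal standing in for a heart-sound fragment
rng = np.random.default_rng(0)
x = rng.standard_normal(2 * FS)
mel, log_mel = log_mel_spectrum(x)
print(mel.shape, log_mel.shape)
```

Both outputs have shape (number of Mel filters, number of frames); in practice they would still need to be resized or cropped to the network's input size.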

**Detailed Parameters for MelSpectrum & Log-MelSpectrum Feature Extraction**

The detailed parameters are shown above.

**MelSpectrum & Log-MelSpectrum Feature Visualization**

Examples of MelSpectrum and Log-MelSpectrum feature maps from a normal heart-sound fragment are shown above.

Heart-sound signals are easily disturbed by additive and multiplicative noise during the acquisition process.

The stethoscope-induced multiplicative component can be converted into an additive term in the Log-MelSpectrum domain. Therefore, Log-MelSpectrum feature maps make it easier for the CNN to achieve better classification performance.
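A short derivation makes the argument concrete. It assumes a simplified acquisition model that is not quoted from the paper: the recorded signal $y$ is the heart sound $x$ convolved with a stethoscope response $h$, additive noise is neglected, and $|H|$ is roughly constant (value $|H_m|$) within each Mel band $m$ with triangular weights $W_m[k]$. All symbols here are introduced for illustration.

```latex
\begin{aligned}
y(t) &= x(t) * h(t) \\
|Y[k]|^2 &= |X[k]|^2 \, |H[k]|^2 \\
M_Y[m] &= \sum_k W_m[k]\, |Y[k]|^2 \;\approx\; |H_m|^2 \sum_k W_m[k]\, |X[k]|^2 = |H_m|^2 \, M_X[m] \\
\log M_Y[m] &\approx \log M_X[m] + \log |H_m|^2
\end{aligned}
```

Under these assumptions, a change of stethoscope scales the MelSpectrum multiplicatively, while in the Log-MelSpectrum it only adds a per-band offset, which is a simpler shift for a CNN to absorb.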

2. Results

2.1. PhysioNet Dataset

The PhysioNet dataset is used.

2.2. Modified VGGNet Model

The input feature vector size was modified to 128×128.
The output layer corresponds to 2 classes: normal and abnormal.
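To see why changing the input size affects the classifier head, the sketch below tracks the spatial size of the feature map through the pooling stages. It assumes a standard VGG16-style layout (3×3 convolutions with padding 1 that preserve size, five 2×2 max-pooling stages with stride 2, and 512 channels in the last block); the review does not state which VGG variant or which exact modifications the paper uses, so these numbers are illustrative only.

```python
def vgg_feature_map_size(input_size=128, n_pool_stages=5):
    """Spatial side length after the pooling stages of a VGG-style network."""
    size = input_size
    for _ in range(n_pool_stages):
        # 3x3 convs with padding 1 keep the size; each 2x2 max pool halves it
        size //= 2
    return size

side = vgg_feature_map_size(128)   # 128x128 input, as in the text
flattened = 512 * side * side      # 512 channels assumed (VGG16 last block)
print(side, flattened)
```

Under these assumptions, the flattened feature vector is smaller than with the usual 224×224 ImageNet input, so the first fully connected layer has to be resized, and the final layer is replaced by a 2-unit output for normal vs. abnormal.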

2.3. Training Hyperparameters

2.4. Performance

The accuracies obtained with the Log-MelSpectrum and MelSpectrum time-frequency feature maps are 91.74% ± 3.72% and 87.42% ± 3.99%, respectively.

The model trained on the Log-MelSpectrum feature maps has higher average sensitivity (Se), specificity (Sp), and mean accuracy (MAcc) than the model trained on the MelSpectrum feature maps.
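For reference, these metrics can be computed from a binary confusion matrix as sketched below. The definition MAcc = (Se + Sp) / 2 is the one commonly used with the PhysioNet heart-sound challenge and is assumed here rather than quoted from the paper; the counts are made-up numbers for illustration, not results from the paper.

```python
def se_sp_macc(tp, fn, tn, fp):
    """Sensitivity, specificity, and their mean (abnormal = positive class)."""
    se = tp / (tp + fn)        # fraction of abnormal recordings detected
    sp = tn / (tn + fp)        # fraction of normal recordings kept as normal
    macc = (se + sp) / 2.0     # assumed definition of MAcc
    return se, sp, macc

# Hypothetical confusion-matrix counts, for illustration only
se, sp, macc = se_sp_macc(tp=90, fn=10, tn=160, fp=40)
print(f"Se={se:.3f}  Sp={sp:.3f}  MAcc={macc:.3f}")
```

Because MAcc averages the two per-class rates, it is less sensitive to class imbalance than plain accuracy.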

Written by Sik-Ho Tsang

No responses yet