Brief Review — Classifying Heart Sound Recordings using Deep Convolutional Neural Networks and Mel-Frequency Cepstral Coefficients


Sik-Ho Tsang
Dec 28, 2023

Classifying Heart Sound Recordings using Deep Convolutional Neural Networks and Mel-Frequency Cepstral Coefficients, by Palo Alto Research Center
2016 CinC, Over 150 Citations (Sik-Ho Tsang @ Medium)

Heart Sound Classification
2013 … 2023 [2LSTM+3FC, 3CONV+2FC] [NRC-Net] [Log-MelSpectrum+Modified VGGNet] [CNN+BiGRU] [CWT+MFCC+DWT+CNN+MLP]

  • Heart sounds are converted to MFCC heat maps and fed to a CNN for classification.


  2. Results


1.1. Segmentation


Each PCG (phonocardiogram) waveform is first segmented into the fundamental heart sounds (S1, systole, S2 and diastole) using Springer’s segmentation algorithm [2].

  • Segmentation was used to ensure that each 3-second heart sound segment began at S1.
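As a rough illustration of the cutting step only, here is a minimal sketch that takes S1 onset times as given and slices out 3-second segments starting at each one. The segmentation itself is not implemented; the function name, the 1000 Hz sample rate, and the onset times are all hypothetical, and whether segments overlapped in the paper is not stated in this review.

```python
def extract_segments(signal, fs, s1_onsets_sec, seg_seconds=3.0):
    """Cut fixed-length segments that each start at an S1 onset.

    signal: sequence of samples; fs: sample rate in Hz.
    s1_onsets_sec: S1 onset times in seconds, assumed to come from a
    separate segmentation step that is not implemented here.
    """
    seg_len = int(round(seg_seconds * fs))
    segments = []
    for onset in s1_onsets_sec:
        start = int(round(onset * fs))
        end = start + seg_len
        if start < 0 or end > len(signal):
            continue  # skip onsets that lack a full segment of audio after them
        segments.append(signal[start:end])
    return segments


# Toy usage: 10 s of a placeholder signal at an assumed 1000 Hz sample rate.
fs = 1000
signal = [0.0] * (10 * fs)
s1_onsets = [0.4, 1.2, 2.0, 7.5]  # hypothetical S1 times in seconds
segments = extract_segments(signal, fs, s1_onsets)
print(len(segments), len(segments[0]))  # 3 3000
```

The last onset (7.5 s) is dropped because fewer than 3 seconds of signal remain after it.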

1.2. MFCC

  • 13 MFCC feature values are extracted for each sliding window.

In total, each heat map consists of 300 time frames on the x-axis and 13 MFCC filterbanks on the y-axis.
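The 300-frame width is consistent with a 10 ms hop over a 3-second segment (3 s / 0.01 s = 300). The sketch below shows only the sliding-window framing and the resulting heat map shape; the MFCC computation itself is omitted. The 25 ms window, 10 ms hop, 1000 Hz sample rate, and tail zero-padding are assumptions, not values stated in this review.

```python
import math


def frame_signal(segment, fs, win_sec=0.025, hop_sec=0.010):
    """Split a segment into overlapping frames, zero-padding the tail.

    Window and hop lengths are assumed values for illustration.
    """
    win = int(round(win_sec * fs))
    hop = int(round(hop_sec * fs))
    n_frames = math.ceil(len(segment) / hop)
    needed = (n_frames - 1) * hop + win  # samples required by the last frame
    padded = list(segment) + [0.0] * max(0, needed - len(segment))
    return [padded[i * hop: i * hop + win] for i in range(n_frames)]


fs = 1000                      # assumed sample rate in Hz
segment = [0.0] * (3 * fs)     # placeholder 3-second segment
frames = frame_signal(segment, fs)

n_mfcc = 13                    # MFCC values per sliding window, as in the text
heat_map_shape = (n_mfcc, len(frames))
print(heat_map_shape)          # (13, 300)
```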

1.3. CNN

  • A single-channel 6×300 MFCC heat map is the input, and the output is a binary classification.

A standard architecture is used consisting of two convolutional layers, each followed by a max-pooling layer, followed by two fully connected layers before final classification.

  • (Please read the paper for the detailed model architecture.)
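To make the layer order concrete, here is a small shape tracer for a conv, pool, conv, pool, FC, FC, output stack on a 6×300 input. This is only a sketch: the kernel size, "same" padding, pooling size, channel counts, and fully connected widths are hypothetical placeholders, not the paper's hyperparameters.

```python
def conv_out(size, kernel, pad=0, stride=1):
    """Output length of a convolution along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, pool):
    """Output length of non-overlapping max-pooling along one axis."""
    return size // pool


def trace_shapes(height=6, width=300, conv_channels=(16, 32), kernel=3,
                 pool=2, fc_units=(128, 64), n_classes=2):
    """Return (layer name, output shape) pairs for the layer order in the text.

    All default hyperparameters are illustrative assumptions.
    """
    trace = [("input", (1, height, width))]
    h, w = height, width
    for i, ch in enumerate(conv_channels, start=1):
        pad = kernel // 2  # "same" padding (an assumption)
        h, w = conv_out(h, kernel, pad), conv_out(w, kernel, pad)
        trace.append((f"conv{i}", (ch, h, w)))
        h, w = pool_out(h, pool), pool_out(w, pool)
        trace.append((f"pool{i}", (ch, h, w)))
    trace.append(("flatten", (conv_channels[-1] * h * w,)))
    for i, units in enumerate(fc_units, start=1):
        trace.append((f"fc{i}", (units,)))
    trace.append(("output", (n_classes,)))
    return trace


for name, shape in trace_shapes():
    print(f"{name:8s} {shape}")
```

With these placeholder settings the two pooling stages reduce the 6×300 map to 1×75 before flattening, which shows how quickly the short MFCC axis collapses under 2×2 pooling.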

2. PhysioNet Results


Results for the proposed top scoring submissions made to the PhysioNet challenge server for both Phase I and Phase II are depicted in Table 2.


