Brief Review — Classifying Heart Sound Recordings using Deep Convolutional Neural Networks and Mel-Frequency Cepstral Coefficients

MFCC+CNN

Sik-Ho Tsang
2 min read · Dec 28, 2023

Classifying Heart Sound Recordings using Deep Convolutional Neural Networks and Mel-Frequency Cepstral Coefficients
MFCC+CNN CinC’16, by Palo Alto Research Center
2016 CinC, Over 150 Citations (Sik-Ho Tsang @ Medium)

Heart Sound Classification
2013 … 2023 [2LSTM+3FC, 3CONV+2FC] [NRC-Net] [Log-MelSpectrum+Modified VGGNet] [CNN+BiGRU] [CWT+MFCC+DWT+CNN+MLP]
==== My Other Paper Readings Are Also Over Here ====

  • Heart sounds are converted to MFCC heat maps and input to a CNN for classification.

Outline

  1. MFCC+CNN
  2. Results

1. MFCC+CNN

1.1. Segmentation

Segmentation

Each PCG waveform is first segmented into the fundamental heart sound states (S1, systole, S2, and diastole) using Springer’s segmentation algorithm [2].

  • Segmentation was used to ensure that each 3-second heart sound segment began at S1.
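The S1-aligned slicing step can be sketched as below. This is a minimal sketch, not the paper’s code: it assumes the S1 onset sample indices have already been obtained from a segmenter such as Springer’s HSMM-based algorithm, and that the PhysioNet PCG sampling rate of 2,000 Hz is used.

```python
import numpy as np

def extract_segments(pcg, s1_onsets, sr=2000, seg_len=3.0):
    """Cut fixed-length 3 s segments from a PCG, each beginning at an S1 onset.

    `s1_onsets` (sample indices where S1 begins) is assumed to come from a
    segmenter such as Springer's algorithm; onsets too close to the end of
    the recording to yield a full 3 s segment are dropped.
    """
    n = int(seg_len * sr)
    segs = [pcg[i:i + n] for i in s1_onsets if i + n <= len(pcg)]
    return np.stack(segs) if segs else np.empty((0, n))
```

Each returned row is a 3-second, S1-aligned window ready for MFCC extraction.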

1.2. MFCC

MFCC
  • 13 MFCC feature values are extracted for each sliding window.

In total, each heat map consists of 300 time frames represented on the x-axis and 13 MFCC coefficients represented on the y-axis.
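The MFCC pipeline (framing, windowing, mel filterbank, log, DCT) can be sketched from scratch as below. The window length, hop, and filter count are assumptions chosen so that a 3 s recording at the PhysioNet sampling rate of 2 kHz yields roughly 300 frames; the paper may use different parameters.

```python
import numpy as np
from scipy.fft import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=2000, win_len=0.025, hop_len=0.010, n_filters=20, n_ceps=13):
    """13 MFCCs per sliding window (window/hop/filter counts are assumptions)."""
    n_win, n_hop = int(win_len * sr), int(hop_len * sr)
    n_fft = 1 << (n_win - 1).bit_length()              # next power of two
    n_frames = 1 + (len(signal) - n_win) // n_hop
    idx = np.arange(n_win) + n_hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_win)           # framed + windowed
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel-spaced filterbank
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # Log filterbank energies -> DCT -> keep the first n_ceps coefficients
    log_e = np.log(np.maximum(power @ fbank.T, 1e-10))
    return dct(log_e, type=2, norm="ortho", axis=-1)[:, :n_ceps]
```

With a 10 ms hop, a 3 s segment at 2 kHz gives 298 frames of 13 coefficients each, close to the 300×13 heat map described above.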

1.3. CNN

CNN
  • A single channel 6×300 MFCC heat map is used as input and a binary classification is the output.

A standard architecture is used, consisting of two convolutional layers, each followed by a max-pooling layer, then two fully connected layers before the final classification.

  • (Please read the paper for the detailed model architecture.)
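The conv–pool–conv–pool–FC–FC flow above can be sketched as a NumPy forward pass. The filter counts, kernel sizes, and hidden width below are placeholder assumptions, not the paper’s exact configuration; only the overall layer layout and the 6×300 single-channel input follow the description.

```python
import numpy as np

def conv_relu(x, w, b):
    """Valid 2-D cross-correlation + ReLU: x (C,H,W), w (F,C,kh,kw), b (F,)."""
    F, C, kh, kw = w.shape
    H, W = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    out = np.empty((F, H, W))
    for f in range(F):
        for i in range(H):
            for j in range(W):
                out[f, i, j] = np.sum(x[:, i:i + kh, j:j + kw] * w[f]) + b[f]
    return np.maximum(out, 0.0)

def maxpool(x, ph, pw):
    """Non-overlapping max-pooling over ph x pw blocks."""
    C, H, W = x.shape
    return x[:, :H // ph * ph, :W // pw * pw] \
        .reshape(C, H // ph, ph, W // pw, pw).max(axis=(2, 4))

def forward(x, p):
    h = maxpool(conv_relu(x, p["w1"], p["b1"]), 1, 2)   # -> (8, 5, 148)
    h = maxpool(conv_relu(h, p["w2"], p["b2"]), 2, 2)   # -> (16, 2, 72)
    h = np.maximum(p["w3"] @ h.reshape(-1) + p["b3"], 0.0)
    z = p["w4"] @ h + p["b4"]
    return 1.0 / (1.0 + np.exp(-z))                     # sigmoid: P(abnormal)

rng = np.random.default_rng(0)
params = {
    "w1": 0.1 * rng.standard_normal((8, 1, 2, 5)),      "b1": np.zeros(8),
    "w2": 0.1 * rng.standard_normal((16, 8, 2, 5)),     "b2": np.zeros(16),
    "w3": 0.1 * rng.standard_normal((64, 16 * 2 * 72)), "b3": np.zeros(64),
    "w4": 0.1 * rng.standard_normal((1, 64)),           "b4": np.zeros(1),
}
prob = forward(rng.standard_normal((1, 6, 300)), params)  # one 6x300 heat map
```

In practice this would be built with a deep learning framework; the sketch only makes the tensor shapes at each stage concrete.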

2. PhysioNet Results

PhysioNet Results

Results for the top-scoring submissions made to the PhysioNet challenge server in both Phase I and Phase II are shown in Table 2.

