Brief Review — Heart Sound Anomaly and Quality Detection using Ensemble of Neural Networks without Segmentation

An Ensemble of 20 FFNNs That Performs Good/Bad Quality Classification in Addition to Normal/Abnormal Classification

Sik-Ho Tsang
Nov 15, 2023

Heart Sound Anomaly and Quality Detection using Ensemble of Neural Networks without Segmentation
20 Ensemble FFNN, by Tampere University of Technology, University of Stavanger, Qatar University, Northwestern University
2016 CinC, Over 170 Citations (Sik-Ho Tsang @ Medium)


  • An automatic classification method is proposed for anomaly (normal vs. abnormal) and quality (good vs. bad) detection of phonocardiogram (PCG) recordings without segmentation.
  • A subset of 18 features is selected from 40 candidate features using a wrapper feature selection scheme.
  • The features are extracted from the time, frequency, and time-frequency domains without any segmentation. The selected features are fed into an ensemble of 20 feedforward neural networks (FFNNs).
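A wrapper scheme scores candidate feature subsets by training and evaluating a classifier on them. This review does not say which search strategy the paper uses, so the sketch below assumes greedy forward selection, with a hypothetical nearest-centroid scorer (`centroid_accuracy`) standing in for the real classifier. All function names are illustrative.

```python
import numpy as np

def centroid_accuracy(X, y):
    """Hypothetical scorer: training accuracy of a nearest-centroid classifier.

    A real wrapper would score on held-out data; this keeps the sketch short.
    """
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    # Distance from every sample to every class centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    predicted = classes[dists.argmin(axis=1)]
    return float(np.mean(predicted == y))

def forward_select(X, y, n_keep, score=centroid_accuracy):
    """Greedy forward selection (assumed strategy): repeatedly add the
    feature whose inclusion gives the highest score."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < n_keep:
        best = max(remaining, key=lambda j: score(X[:, selected + [j]], y))
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a 40-column feature matrix, `forward_select(X, y, n_keep=18)` would return 18 column indices, mirroring the 18-of-40 selection described above.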


  1. Feature Set
  2. Ensemble of FFNN
  3. Results

1. Feature Set

  • The PhysioNet/Computing in Cardiology Challenge 2016 [9] dataset is used.
  • (1) Linear Predictive Coefficients (LPC): the first, third, sixth, eighth, ninth, and tenth coefficients of a 10th-order linear predictor are used as features.
  • (2) Entropy-based features: the natural and Tsallis entropies of the PCG signals are calculated.
  • (3) Three features based on Mel Frequency Cepstral Coefficients (MFCCs).
  • (4) Wavelet transform based features: the approximation coefficients of level 5 (𝑎5) and the detail coefficients of levels 3 to 5 (𝑑3, 𝑑4, and 𝑑5) are used for feature extraction.
  • (5) Features extracted from the power spectral density.
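To make the entropy features in item (2) concrete, here is a minimal sketch using the standard definitions of Shannon entropy (with the natural logarithm) and Tsallis entropy. Two things are assumptions: that "natural entropy" means natural-log Shannon entropy, and that the probability distribution is estimated with a histogram of sample amplitudes (`amplitude_distribution`, `n_bins`, and `q` are illustrative choices, not values from the paper).

```python
import numpy as np

def amplitude_distribution(signal, n_bins=64):
    """Estimate a discrete distribution from sample amplitudes (assumed estimator)."""
    counts, _ = np.histogram(signal, bins=n_bins)
    p = counts / counts.sum()
    return p[p > 0]  # drop empty bins so that log() is defined

def shannon_entropy(p):
    """Shannon entropy with the natural logarithm: -sum(p * ln p)."""
    return float(-np.sum(p * np.log(p)))

def tsallis_entropy(p, q=2.0):
    """Tsallis entropy: (1 - sum(p**q)) / (q - 1), defined for q != 1."""
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))
```

For a PCG recording stored as a 1-D array `pcg`, the two features would be `shannon_entropy(amplitude_distribution(pcg))` and `tsallis_entropy(amplitude_distribution(pcg))`.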

2. Ensemble of FFNN

Overall Flowchart

2.1. Overall Flow

In the first step, the bad quality recordings (class 0) are detected.

In the second step, among the good quality signals, the normal (class -1) and abnormal (class 1) PCGs are classified.

This process involves two different classifiers: one for good/bad quality recordings and the other for normal/abnormal recordings.
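The two-step flow can be sketched as a small decision function. The class codes (0, -1, 1) come from the text above; the two predicate arguments are hypothetical stand-ins for the trained quality and anomaly classifiers.

```python
BAD_QUALITY, NORMAL, ABNORMAL = 0, -1, 1

def classify_recording(features, is_bad_quality, is_abnormal):
    """Two-step decision: quality check first, anomaly check second.

    `is_bad_quality` and `is_abnormal` are hypothetical callables that
    take a feature vector and return True or False.
    """
    # Step 1: bad quality recordings are labeled 0 and go no further.
    if is_bad_quality(features):
        return BAD_QUALITY
    # Step 2: only good quality recordings are labeled normal or abnormal.
    return ABNORMAL if is_abnormal(features) else NORMAL
```

Note that the anomaly classifier is never consulted for a recording already flagged as bad quality.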

2.2. Ensemble of FFNN

  • An ensemble of 20 feedforward artificial neural networks (ANNs) is used, each with 2 hidden layers of 25 neurons.
  • The output layer has 4 neurons.
  • Tanh is used as the activation function.
  • 20-fold cross-validation is used for training.
  • To deal with the data imbalance issue, the abnormal signals are randomly sampled with replacement so that the size of the selected set equals the size of the normal set.
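The network shape and the balancing step above can be sketched in NumPy as follows. The layer sizes (18 selected input features, two hidden layers of 25 neurons, 4 outputs) come from the text; the weight initialization, the use of tanh on the output layer, and all function names are assumptions made for illustration. Training is omitted.

```python
import numpy as np

def init_ffnn(n_in=18, n_hidden=25, n_out=4, rng=None):
    """Create weights for a network with two hidden layers (random init is assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    sizes = [n_in, n_hidden, n_hidden, n_out]
    return [(rng.normal(0.0, 0.1, size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    """Forward pass with tanh after every layer (output activation is assumed)."""
    h = X
    for W, b in params:
        h = np.tanh(h @ W + b)
    return h

def oversample_to_match(X_minority, n_target, rng=None):
    """Draw rows with replacement until the set has n_target rows."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.integers(0, len(X_minority), size=n_target)
    return X_minority[idx]
```

An ensemble would hold 20 such parameter lists, for example `[init_ffnn(rng=rng) for _ in range(20)]`, each trained on its own resampled data.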

2.3. Combination Rules

  • Two combination rules are used: 1) a non-trainable rule and 2) a trainable rule.
  • In the first approach, the unweighted average of the class-specific outputs [16] of the ANNs is used.
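A minimal sketch of the non-trainable rule: average each class-specific output over the networks with equal weights, then pick the class with the largest mean. The array layout and the final argmax step are assumptions; the review only states that an unweighted average is taken.

```python
import numpy as np

def average_rule(outputs):
    """Combine ensemble outputs by unweighted averaging.

    outputs: array of shape (n_networks, n_samples, n_classes).
    Returns the mean outputs and the index of the winning class per sample.
    """
    mean_outputs = np.asarray(outputs).mean(axis=0)
    return mean_outputs, mean_outputs.argmax(axis=1)
```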

In the second approach, the combination is based on a voting system over the class labels, whose thresholds are learned during a 10-fold cross-validation scheme. If at least 17 out of 20 classifiers recognize a signal as bad quality, the algorithm labels it as bad quality (0). For the remaining signals, which are recognized as good quality, the algorithm decides between normal and abnormal: if at least 7 out of 20 classifiers recognize a signal as abnormal, it is labeled abnormal (1); otherwise it is labeled normal (-1).
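The voting rule can be written directly from the thresholds quoted above (17 of 20 for bad quality, 7 of 20 for abnormal). The function name and array layout below are illustrative.

```python
import numpy as np

def voting_rule(labels, bad_votes_needed=17, abnormal_votes_needed=7):
    """Combine per-network labels by thresholded voting.

    labels: array of shape (n_networks, n_samples) with values
    0 (bad quality), 1 (abnormal), or -1 (normal).
    """
    labels = np.asarray(labels)
    bad_votes = (labels == 0).sum(axis=0)
    abnormal_votes = (labels == 1).sum(axis=0)
    # Default decision for good quality signals: abnormal if enough votes.
    final = np.where(abnormal_votes >= abnormal_votes_needed, 1, -1)
    # The bad quality decision takes precedence over normal/abnormal.
    final[bad_votes >= bad_votes_needed] = 0
    return final
```

The low abnormal threshold (7 of 20) biases the ensemble toward flagging a recording as abnormal, while the high bad quality threshold (17 of 20) makes it reluctant to discard a recording.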

3. Results

Results on PhysioNet/Computing in Cardiology Challenge 2016 [9] dataset
  • 10-fold cross-validation is used.
  • Although the performance of the two rules is fairly close (91.17% vs. 91.50%), the second rule is chosen for use on the unseen test data.

The proposed solution achieved an overall score of 85.90% (86.91% Se and 84.90% Sp) on the unseen test dataset, which is the second-best score in the competition.
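The overall score is the mean of sensitivity (Se) and specificity (Sp). The sketch below uses the plain definitions of Se and Sp as an assumption; the official Challenge scoring may weight recordings differently, so treat this only as a check on the arithmetic.

```python
def sensitivity(tp, fn):
    """Fraction of abnormal recordings correctly detected: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of normal recordings correctly detected: TN / (TN + FP)."""
    return tn / (tn + fp)

def overall_score(se, sp):
    """Mean of sensitivity and specificity."""
    return (se + sp) / 2

# Reported test-set values from the text above.
score = overall_score(0.8691, 0.8490)
```

Here `score` comes out at about 0.859, in line with the reported 85.90%.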


