Review — Classification of heart sound signals using a novel deep WaveNet model

WaveNet for 5-Class Heart Sound Classification

Sik-Ho Tsang
Nov 11, 2023
Future Work: Authors Believe the Proposed WaveNet Can Be Deployed in the Cloud to Aid Clinicians in Rapid Diagnosis

Classification of heart sound signals using a novel deep WaveNet model
WaveNet, by Ngee Ann Polytechnic, Singapore University of Social Sciences, National Heart Centre, Columbia University, Kumamoto University, Asia University
2020 Elsevier J. CMPB, Over 110 Citations (Sik-Ho Tsang @ Medium)

Heart Sound Classification
2013 [PASCAL] 2018 [RNN Variants] 2023 [2LSTM+3FC, 3CONV+2FC] [NRC-Net]

  • A novel, in-house developed deep WaveNet model is proposed for automated classification of five types of heart sounds. The model is developed using a total of 1000 phonocardiogram (PCG) recordings belonging to five classes, with 200 recordings in each class.


  1. Heart Sound Preliminaries
  2. WaveNet Model
  3. Results

1. Heart Sound Preliminaries

1.1. PCG Signals for Different Classes

  • Different types of heart valve diseases (HVDs), such as aortic stenosis (AS), mitral stenosis (MS), mitral regurgitation (MR) and mitral valve prolapse (MVP), can be diagnosed using PCG signals.
  • Yet visual screening of the PCG signal is time-consuming and prone to error.

Thus, a WaveNet deep learning model is proposed to categorize heart sounds in HVD from PCG signals.

  • Fig. 2a–2b show the normal and pathological images for aortic stenosis (AS).
  • Fig. 3a–3d show the normal and pathological images for mitral valve disease: mitral stenosis (MS), mitral regurgitation (MR) and mitral valve prolapse (MVP).

1.2. Summary of Prior Arts

  • Table 1 presents a summary of studies that employ deep learning models for automated categorization of heart sounds in HVD.

2. WaveNet Model

2.1. Dataset & Preprocessing

  • The PCG signals used in this study were obtained from a public database [7]. A total of 1000 PCG recordings were obtained from five different classes, with 200 recordings each. The five classes were normal (N), AS, MS, MR and MVP.
  • Each recording was sampled at a frequency of 8000 Hz. Each audio sound wave was normalized to the range −1 to 1 so that all data shared a common scale for easier analysis.
  • As the recordings had varying lengths, they were zero-padded to a common length of 31,943 samples for consistency.
  • Dataset Link:
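
The two preprocessing steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' code: the function name `preprocess`, the choice of peak scaling for the normalization, and padding at the end of the signal are all assumptions, and the demo signal is synthetic noise rather than a real PCG recording.

```python
import numpy as np

TARGET_LEN = 31_943  # padded length reported in the review


def preprocess(signal, target_len=TARGET_LEN):
    """Scale a 1-D signal into [-1, 1] and zero-pad it to target_len samples."""
    x = np.asarray(signal, dtype=np.float32)
    if x.size > target_len:
        raise ValueError("signal is longer than the target length")
    peak = np.max(np.abs(x)) if x.size else 0.0
    if peak > 0:
        # Assumption: peak scaling. The paper may use a different scheme
        # (e.g. min-max scaling) to reach the same [-1, 1] range.
        x = x / peak
    # Assumption: zeros are appended at the end of the recording.
    return np.pad(x, (0, target_len - x.size))


# Synthetic stand-in for a recording: 2 seconds at 8000 Hz.
rng = np.random.default_rng(0)
demo = 3.0 * rng.standard_normal(16_000)
out = preprocess(demo)
print(out.shape, float(np.abs(out).max()))
```

Padding every recording to one fixed length lets the recordings be stacked into a single array and fed to the network in batches.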

2.2. Model Architecture

WaveNet Model Architecture
  • The proposed WaveNet model consists of 6 residual blocks.
  • The residual block differs from the one in ResNet, as shown above. It uses two parallel dilated 1D convolutions (dilated convolution as in DeepLab or DilatedNet).
  • Tanh and sigmoid are applied to the two branches respectively, and the two outputs are then multiplied element-wise.
  • This is followed by another 1×1 convolution.
  • After the 6 residual blocks, the signals from the different residual blocks are added together and passed through two 1×1 convolutions and then two fully connected layers, with ReLU and Dropout.
  • Finally, softmax is used at output layer.
  • It was trained using 3 epochs, with a batch size of 3.
  • It took less than a millisecond to classify each sample.
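
To make the gated residual block concrete, here is a minimal NumPy forward-pass sketch. It is not the authors' implementation: the kernel size, channel count, symmetric padding, the doubling dilation schedule (1, 2, 4, ...) and the random weights are all assumptions chosen for illustration, and the helper names are hypothetical. Only the overall pattern (two parallel dilated convolutions, tanh times sigmoid, a 1×1 convolution, and summing the outputs of 6 blocks) follows the description above.

```python
import numpy as np


def dilated_conv1d(x, w, dilation):
    """Same-length dilated 1D convolution.

    x: (T, C_in) signal, w: (K, C_in, C_out) kernel.
    Assumption: symmetric zero padding (the original WaveNet is causal).
    """
    T, K = x.shape[0], w.shape[0]
    pad = dilation * (K - 1)
    left = pad // 2
    xp = np.pad(x, ((left, pad - left), (0, 0)))
    y = np.zeros((T, w.shape[2]))
    for k in range(K):
        # Kernel tap k reads the input shifted by k * dilation samples.
        y += xp[k * dilation : k * dilation + T] @ w[k]
    return y


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def gated_residual_block(x, w_filter, w_gate, w_out, dilation):
    f = np.tanh(dilated_conv1d(x, w_filter, dilation))  # tanh branch
    g = sigmoid(dilated_conv1d(x, w_gate, dilation))    # sigmoid branch
    z = f * g                                           # element-wise gating
    skip = z @ w_out  # a 1x1 convolution is a per-time-step matrix multiply
    return x + skip, skip  # residual output, and the signal summed later


# Hypothetical sizes: 64 time steps, 4 channels, kernel size 2.
rng = np.random.default_rng(0)
T, C, K = 64, 4, 2
h = rng.standard_normal((T, C))
skip_sum = np.zeros((T, C))
for d in (1, 2, 4, 8, 16, 32):  # 6 blocks; doubling dilation is an assumption
    wf, wg = rng.standard_normal((2, K, C, C)) * 0.1
    wo = rng.standard_normal((C, C)) * 0.1
    h, skip = gated_residual_block(h, wf, wg, wo, d)
    skip_sum += skip
print(h.shape, skip_sum.shape)
```

In this sketch, `skip_sum` plays the role of the added-together block outputs that would then go through the two 1×1 convolutions, the fully connected layers and the softmax.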

3. Results

Classification Results

The 5 classes were each classified with accuracies above 95%.

  • The authors claim to be, to the best of their knowledge, the first to report on the 5-class problem for heart sound classification using a deep WaveNet model.

The validation accuracy increased steeply from epoch 1 to epoch 5, followed by minimal changes and gradual improvement from epoch 5 onwards.

It is noteworthy that the misclassification rates of N, MVP, MS, MR and AS are 6%, 11%, 4%, 11%, and 6%, respectively.
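
Per-class misclassification rates like these are typically read off a confusion matrix. The sketch below assumes the rate for a class is the fraction of that class's recordings assigned to any other class (one minus the per-class recall); the paper may define it differently. The function name and the 3-class matrix are made up purely for illustration and are not the paper's results.

```python
import numpy as np


def misclassification_rates(cm):
    """Per-class misclassification rate from a confusion matrix.

    cm[i, j] = number of class-i recordings predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    correct = np.diag(cm)      # correctly classified recordings per class
    totals = cm.sum(axis=1)    # all recordings of each true class
    return 1.0 - correct / totals


# Hypothetical toy matrix with 20 recordings per class (NOT from the paper).
toy_cm = [[18, 1, 1],
          [2, 17, 1],
          [0, 1, 19]]
rates = misclassification_rates(toy_cm)
print(np.round(rates, 2))
```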


