Brief Review — Artificial intelligence for heart sound classification: A review

Another Review Paper for Heart Sound Classification, With Detailed Review on Dataset, ML, DL

7 min read · Jul 20, 2024

Artificial intelligence for heart sound classification: A review
Review on AI for HSC, by Dalian University of Technology, Northeastern University, Incheon National University, Universidad Politécnica de Madrid
2024 Wiley Expert Systems (Sik-Ho Tsang @ Medium)
Phonocardiogram (PCG)/Heart Sound Classification
2013 … 2023 … [CTENN] [Bispectrum + ViT] 2024 [MWRS-BFSC + CNN2D]
Summary: My Healthcare and Medical Related Paper Readings and Tutorials
==== My Other Paper Readings Are Also Over Here ====

This article gives a thorough overview of the various heart sound analysis subtasks and examines the improvements made in each subtask by both machine learning (ML) techniques and deep learning (DL) algorithms. It aims to highlight the potential of AI to revolutionize cardiovascular healthcare.
This paper helps to organize a set of papers for researchers entering this field to read.

Outline

Databases
Performance Metrics
ML for Heart Sound Classification
DL for Heart Sound Classification
Discussions & Limitations

1. Databases

**The most famous heart sound database:** **PhysioNet**.

Currently, the most famous heart sound database is PhysioNet. The public training set contains 3240 records.
HSCT-11 contains information of 206 people.
The PASCAL database contains 656 recordings for heart sound classification and 176 recordings for heart sound segmentation. However, because a low-pass filter was applied during recording, the frequency band of the heartbeat signal is limited to below 195 Hz, which may remove some components that are valuable for clinical diagnosis.
An open-source heart sound database is also available on GitHub, with 1000 audio files (0.679 h in total) covering five different types of heart sounds.
The Open Michigan Heart Sound and Murmur Library: A total of 23 records were provided with a total duration of 0.413 h.
DigiScope: A paediatric dataset containing samples from 29 individuals with varying health conditions.
The CirCor DigiScope database: A total of 5272 PCG records taken at one or more auscultation locations from 1568 young patients in rural areas of Brazil.
The Shenzhen University General Hospital gathered the Heart Sounds Shenzhen (HSS) corpus. It contained 845 records from 170 individuals.
Fetal: Cesarelli et al. (2012) recorded fetal heartbeats from 35 pregnant women.
In a recent study (Kazemnejad et al., 2021), a dataset of 69 simultaneous electrocardiograms (ECG) and PCG recordings was provided.

In addition to the above, there are many other heart sound data sets, such as the heart murmurs database (eGeneralMedical) [7], the heart sound library of Thinklabs [8], Texas [9], Biosciences Database of heart Sound signals [10]. Furthermore, D. Mason’s book (Mason, 2000) includes some heart sound recordings.

2. Performance Metrics

Confusion matrix.
Binary Classification: Sensitivity, specificity and accuracy.
Multi-Class Classification: Precision and Recall.
Also, ROC curve and Area Under Curve (AUC).
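The metrics listed above all derive from the confusion matrix. The following is a minimal sketch using only the Python standard library; the function names are illustrative and do not come from the paper, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than by integrating an explicit ROC curve.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def binary_metrics(cm):
    """Sensitivity, specificity, accuracy for cm = [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = cm
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    return sensitivity, specificity, accuracy


def per_class_precision_recall(cm):
    """One (precision, recall) pair per class of a square confusion matrix."""
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        results.append((precision, recall))
    return results


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For a binary task, `binary_metrics(confusion_matrix(y_true, y_pred, labels=[0, 1]))` assumes label 0 is normal and label 1 is abnormal; that labelling is an assumption of this sketch.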

3. ML for Heart Sound Classification

3.1. Pre-Processing — Denoising

Signal filters are a common method of removing noise.
Band-pass filters can simultaneously remove high-frequency and low-frequency noise and retain signal data within their cut-off frequency range, often used for denoising heart sounds (Gao et al., 2023; Ghosh et al., 2019; Yadav et al., 2018).
In several heart sound categorization investigations, Butterworth band-pass filters with various orders and cut-off frequencies have been employed (Ahmad et al., 2021; Noman et al., 2019; Singh et al., 2020). For instance, a 4th-order Butterworth filter with a 25–400 Hz cut-off frequency was applied in Singh et al. (2019), while a 5th-order Butterworth filter with a 25–500 Hz cut-off frequency was used in Li et al. (2021).
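As a rough illustration of the Butterworth band-pass denoising described above, here is a minimal sketch using SciPy. The defaults mirror the 4th-order, 25–400 Hz setting quoted for Singh et al. (2019), but the function name, the zero-phase (forward-backward) filtering, and the second-order-section form are assumptions of this sketch, not details confirmed by the cited papers.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass_filter(x, fs, low_hz=25.0, high_hz=400.0, order=4):
    """Zero-phase Butterworth band-pass filter for a 1D heart sound signal.

    x       : 1D array of samples
    fs      : sampling rate in Hz (must exceed 2 * high_hz)
    order   : order passed to scipy.signal.butter; note that SciPy's
              band-pass design yields a filter of twice this order, and
              forward-backward filtering squares the magnitude response
    """
    if not 0 < low_hz < high_hz < fs / 2:
        raise ValueError("cut-offs must satisfy 0 < low_hz < high_hz < fs / 2")
    # Second-order sections are numerically more robust than (b, a) coefficients.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    # Filtering forwards and backwards cancels the phase distortion.
    return sosfiltfilt(sos, np.asarray(x, dtype=float))
```

Whether zero-phase filtering is appropriate depends on the application: it needs the whole recording up front, so a real-time system would use a causal filter instead.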

3.2. Pre-Processing — Segmentation

Milani et al. (2019) combined peak spacing patterns with Hilbert envelopes, while Ceyhan et al. (2017) combined the Teager energy operator with the wavelet transform.
Other methods include hidden semi-Markov model (HSMM) based CNN methods (HSMM-CNN) and GMM (Gaussian mixture model) based HSMM (HSMM-GMM).
The paper focuses on four popular segmentation algorithms: the wavelet transform, the fractal decomposition, the Hilbert envelope, and the Shannon energy envelope.
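To make the Shannon energy envelope concrete, here is a minimal NumPy sketch. It uses the common formulation E = -x² log(x²) on an amplitude-normalized signal, averaged over short overlapping frames. The 20 ms window, 10 ms hop, and final zero-mean, unit-variance normalization are assumptions chosen for illustration, not values taken from the paper.

```python
import numpy as np


def shannon_energy_envelope(x, fs, win_s=0.02, hop_s=0.01):
    """Frame-averaged Shannon energy of a 1D signal, standardized per recording."""
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak                      # normalize amplitude to [-1, 1]
    sq = x ** 2
    energy = -sq * np.log(sq + 1e-12)     # small constant avoids log(0)

    win = max(1, int(round(win_s * fs)))  # frame length in samples
    hop = max(1, int(round(hop_s * fs)))  # frame shift in samples
    starts = range(0, len(energy) - win + 1, hop)
    env = np.array([energy[s:s + win].mean() for s in starts])

    std = env.std()
    return (env - env.mean()) / std if std > 0 else env - env.mean()
```

Shannon energy weights medium-intensity samples more than very loud or very quiet ones, which is why it is often preferred over plain squared energy when low-amplitude sounds should remain visible in the envelope. Peaks in the returned envelope are candidate S1/S2 locations; picking and labelling them is a separate step not shown here.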

3.3. Feature Engineering

After denoising and segmentation, features are usually extracted manually from the raw heart sound data. They fall into temporal features, spectral features, and time-frequency features that contain both temporal and frequency-domain information.
Temporal features are commonly used because they are easy to extract and quantify. Statistical indicators such as signal energy (Ibrahim et al., 2021; Khan et al., 2018), amplitude (Eslamizadeh & Barati, 2017), envelope (Singh et al., 2019; Varghees & Ramachandran, 2017), and kurtosis (Xu et al., 2022) can be calculated to analyse heart sound signals.
The most commonly used time-domain features include the locations of S1 and S2, the systolic and diastolic intervals, amplitude information, and summary statistics such as the mean and standard deviation of the interval durations.
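A few of the temporal features mentioned above can be sketched in NumPy as follows. The function names and the dictionary layout are illustrative only, and `interval_stats` assumes the S1 and S2 onset times (in seconds) are already available from a segmentation step and strictly alternate S1, S2, S1, S2, and so on.

```python
import numpy as np


def temporal_features(x):
    """Signal energy, peak amplitude and (Pearson) kurtosis of a 1D signal."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    var = centred.var()
    kurtosis = (centred ** 4).mean() / var ** 2 if var > 0 else 0.0
    return {
        "energy": float(np.sum(x ** 2)),
        "peak_amplitude": float(np.max(np.abs(x))),
        "kurtosis": float(kurtosis),
    }


def interval_stats(s1_onsets, s2_onsets):
    """Mean and standard deviation of systolic and diastolic durations.

    Systole is taken as S1 -> S2 within a cycle, diastole as S2 -> next S1.
    """
    s1 = np.asarray(s1_onsets, dtype=float)
    s2 = np.asarray(s2_onsets, dtype=float)
    if len(s1) != len(s2) or len(s1) < 2:
        raise ValueError("need at least two cycles with matching S1/S2 onsets")
    systole = s2 - s1
    diastole = s1[1:] - s2[:-1]
    return {
        "systole_mean": float(systole.mean()),
        "systole_std": float(systole.std()),
        "diastole_mean": float(diastole.mean()),
        "diastole_std": float(diastole.std()),
    }
```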

**Mel-frequency cepstral coefficients (MFCCs)**

For spectral analysis, there is power spectral density (Ibrahim et al., 2021). Mel-frequency cepstral coefficients (MFCCs) have also been widely used.
Time-frequency representations include the S-transform (Moukadem et al., 2013), the short-time Fourier transform (STFT) (Chen, Guo, et al., 2023), and the wavelet transform.
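As an illustration of a time-frequency representation, the sketch below computes an STFT magnitude spectrogram with plain NumPy. The frame length, hop size, and Hann window are assumptions of this sketch. MFCCs build on this kind of short-time spectrum by applying a mel filterbank, a logarithm, and a discrete cosine transform; those steps are not shown here.

```python
import numpy as np


def stft_magnitude(x, fs, frame_len=256, hop=128):
    """Magnitude spectrogram of a 1D signal.

    Returns (freqs, times, mag) where mag has shape (n_frames, frame_len // 2 + 1).
    """
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)                 # tapers frame edges to reduce leakage
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([
        x[i * hop:i * hop + frame_len] * window    # windowed, overlapping frames
        for i in range(n_frames)
    ])
    mag = np.abs(np.fft.rfft(frames, axis=1))      # one-sided spectrum per frame
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs) # bin centre frequencies in Hz
    times = (np.arange(n_frames) * hop + frame_len / 2) / fs  # frame centres in s
    return freqs, times, mag
```

A shorter frame gives better time resolution and coarser frequency resolution, so the choice of `frame_len` is a trade-off that depends on the sampling rate and on which heart sound components matter.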

3.4. ML Models for Classification

ANNs, GMMs, random forests, SVMs, and HMMs are examples of conventional ML techniques, which are typically combined with hand-crafted features to perform heart sound classification.

4. DL for Heart Sound Classification

4.1. Pre-Processing

The authors found only two studies on pre-processing with DL. One study evaluated the effectiveness of a 2D U-Net and a denoising convolutional neural network (DnCNN) in denoising heart sound signals (Sharan et al., 2020), while another proposed an end-to-end DL model for real-time denoising of heart sounds (Ali et al., 2023).

4.2. Segmentation

Renna et al. (2019) were among the first to attempt DL-based segmentation, developing a temporal modelling solution in which the estimated segmentation sequence is produced by combining the CNN outputs corresponding to different parts of the PCG.
Recurrent neural networks (RNNs) have also been used to localize heart sound states. For example, bidirectional gated recurrent units (GRU-RNNs) were developed in Fan et al. (2018) to directly segment heart sounds.
Chen, Lv, et al. (2020) proposed a duration long short-term memory (LSTM) network to address the poor utilization of cardiac cycle duration information, which arises because envelope features cannot effectively model intrinsic sequence features.
To overcome the difficulties caused by erratic and noisy PCG recordings, Fernando et al. (2019) demonstrated the usefulness of combining RNN-based temporal modelling with attention-based salient feature extraction.
Lastly, Chen, Sun, Lv, et al. (2021) proposed an end-to-end approach that employs a convolutional LSTM (CLSTM) architecture.

4.3. DL Models for Classification

Commonly used 2D representations of heartbeat signals include Mel spectrograms and short-time Fourier transform time-frequency maps.
Several 1D CNN architectures have also been proposed to identify abnormal heart sounds directly from the signal.
Various integration methods for heart sound classification have been proposed. One such method is the fusion of CNN and RNN models.

5. Discussions & Limitations

5.1. ECG vs Heart Sound Analysis

The authors note that both ECG analysis and heart sound analysis are valuable techniques for predicting CVDs, but that they provide different types of information about the heart’s functioning.

Different information: The ECG primarily measures the electrical activity of the heart. On the other hand, heart sound analysis focuses on capturing the acoustic signals produced by the heart.
Complementary insights: While ECG is well-established and widely used, it may not always capture certain structural abnormalities or subtle changes in cardiac function. By incorporating heart sound analysis, clinicians can gain additional information about the heart’s mechanical function, valvular abnormalities, and potential structural defects.
Early Detection of Certain Conditions: Heart sound analysis has shown promise in the early detection of specific cardiac conditions.
Non-invasive and portable: Heart sound analysis is non-invasive and portable, making it accessible and convenient for screening purposes. This advantage is particularly relevant in resource-constrained settings.

5.2. Limitations

However, this research field still faces many technical challenges.

It is difficult to obtain clean data because the recording environment may be disturbed by environmental noise and speech.
Moreover, evaluating the effectiveness of existing algorithms is challenging due to the diverse range of test datasets utilized, making it difficult to draw direct comparisons between them.
Algorithm design should take into account both classification performance and clinical practicality.


Written by Sik-Ho Tsang
