Brief Review — A lightweight hybrid deep learning system for cardiac valvular disease classification

Augmented Sound Dataset + CNN-LSTM

Nov 20, 2023

A lightweight hybrid deep learning system for cardiac valvular disease classification
CNN-LSTM, by Yarmouk University
2022 Nature Sci. Rep., Over 20 Citations (Sik-Ho Tsang @ Medium)

A combined CNN and LSTM model is proposed for 5-class phonocardiogram (PCG) signal classification, and it is evaluated on both augmented and non-augmented datasets.

Outline

Datasets, Preprocessing & Data Augmentation
Proposed CNN-LSTM & FFT-CNN-LSTM
Results

1. Datasets, Preprocessing & Data Augmentation

1.1. Datasets

**GitHub Dataset** & **PhysioNet/CinC Challenge 2016**

The model was trained on the publicly available open heart sound GitHub Dataset, which contains 1000 recordings in 5 classes, with 200 recordings per class.
PhysioNet/CinC Challenge 2016 was the second dataset used in this research to further examine the suggested model. This dataset contains normal and abnormal classes only.

Example recordings from the 2 datasets are shown above.

1.2. Preprocessing

The Fourier transform of the PCG signals was clipped to keep only the band up to 350 Hz of the 4000 Hz spectrum.
Each PCG record in the first dataset is downsampled by a factor of 8, and each PCG record in the second dataset is downsampled by a factor of 2.
After downsampling, the highest frequency that can be represented is 500 Hz for all heart conditions, which still covers the band shown above.
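
As a quick check of the downsampling arithmetic, the sketch below computes the Nyquist frequency after decimation. The original sampling rates (8000 Hz for the GitHub Dataset, 2000 Hz for PhysioNet/CinC 2016) are assumptions that are not stated in this review; they are chosen because they are consistent with the 4000 Hz spectrum and the 500 Hz limit. The function name is hypothetical.

```python
def nyquist_after_downsampling(sample_rate_hz: float, factor: int) -> float:
    """Highest representable frequency after keeping every `factor`-th sample."""
    new_rate_hz = sample_rate_hz / factor  # sampling rate after decimation
    return new_rate_hz / 2.0               # Nyquist frequency is half of that


# Assumed original sampling rates (not given in the review).
github_nyquist_hz = nyquist_after_downsampling(8000, 8)     # 500.0
physionet_nyquist_hz = nyquist_after_downsampling(2000, 2)  # 500.0
```

Under these assumptions both datasets end up at the same 1000 Hz sampling rate, so one model input format can serve both. In practice a low-pass filter is normally applied before decimation to avoid aliasing.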

1.3. Data Augmentation

**GitHub Dataset** **After Data Augmentation**

Similar to images, there are several techniques to augment audio signals, and these techniques are usually applied to the raw audio signals.

Time stretch: randomly slow down or speed up the sound.
Time shift: shift audio to the left or the right by a random amount.
Add noise: add some random values to the sound.
Control volume: randomly increase or decrease the volume of the audio.
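
The four augmentations listed above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the parameter ranges, the function names, and the use of plain linear-interpolation resampling for the time stretch (which also shifts pitch) are all assumptions.

```python
import numpy as np


def time_stretch(signal: np.ndarray, rate: float) -> np.ndarray:
    """Resample by linear interpolation; rate > 1 speeds up (shorter output).

    Note: simple resampling also changes the pitch.
    """
    n_out = max(1, int(round(len(signal) / rate)))
    old_idx = np.arange(len(signal))
    new_idx = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(new_idx, old_idx, signal)


def time_shift(signal: np.ndarray, shift: int) -> np.ndarray:
    """Circularly shift the signal; positive shifts move it to the right."""
    return np.roll(signal, shift)


def add_noise(signal: np.ndarray, noise_std: float, rng: np.random.Generator) -> np.ndarray:
    """Add zero-mean Gaussian noise with the given standard deviation."""
    return signal + rng.normal(0.0, noise_std, size=signal.shape)


def change_volume(signal: np.ndarray, gain: float) -> np.ndarray:
    """Scale the amplitude by a constant gain."""
    return signal * gain


def augment(signal: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply all four augmentations with randomly drawn (assumed) parameters."""
    max_shift = len(signal) // 10
    out = time_stretch(signal, rate=rng.uniform(0.9, 1.1))
    out = time_shift(out, shift=int(rng.integers(-max_shift, max_shift + 1)))
    out = add_noise(out, noise_std=0.005, rng=rng)
    return change_volume(out, gain=rng.uniform(0.8, 1.2))
```

Calling `augment(x, np.random.default_rng(0))` on a 1D array `x` returns a new array whose length differs slightly from the input because of the stretch, so a fixed-length model input would still need padding or cropping afterwards.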

2. Proposed CNN-LSTM & FFT-CNN-LSTM

2.1. CNN-LSTM

In brief, deep feature extraction and selection from the PCG signals are handled by CNN blocks, particularly the 1D convolutional layers, the batch normalization layers, the ReLU layers, and the max-pooling layers.
Utilizing the LSTM component produces a richer and more concentrated model compared to pure CNN models, resulting in higher performance with fewer parameters.
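
To make the layer ordering concrete, here is a minimal PyTorch sketch of a 1D CNN followed by an LSTM and a 5-class output layer. The number of blocks, kernel sizes, channel widths, hidden size, and the class name `CnnLstmSketch` are all assumptions for illustration; this review does not give the exact configuration, and the choice of PyTorch is mine.

```python
import torch
from torch import nn


class CnnLstmSketch(nn.Module):
    """Hypothetical CNN-LSTM: Conv1d -> BatchNorm -> ReLU -> MaxPool blocks, then LSTM."""

    def __init__(self, n_classes: int = 5, channels: int = 16, hidden: int = 32):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=9, padding=4),
            nn.BatchNorm1d(channels),
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(channels, channels * 2, kernel_size=9, padding=4),
            nn.BatchNorm1d(channels * 2),
            nn.ReLU(),
            nn.MaxPool1d(4),
        )
        self.lstm = nn.LSTM(input_size=channels * 2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, samples) raw or FFT-domain PCG input
        feats = self.cnn(x)             # (batch, features, time)
        feats = feats.transpose(1, 2)   # (batch, time, features) for the LSTM
        _, (h_n, _) = self.lstm(feats)  # h_n: (num_layers, batch, hidden)
        return self.head(h_n[-1])       # class logits: (batch, n_classes)
```

With these assumed sizes, a batch of two 1000-sample recordings, `torch.randn(2, 1, 1000)`, yields logits of shape `(2, 5)`. The CNN shortens the sequence before the LSTM sees it, which is one way such a hybrid can stay small.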

2.2. FFT-CNN-LSTM

When the FFT of the PCG signal is used as the input instead, the model becomes an FFT-CNN-LSTM model.
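
One way to prepare such an FFT input is sketched below with NumPy. It is an assumption that the magnitude spectrum is used and that it is optionally clipped to a maximum frequency; the function name and parameters are hypothetical.

```python
import numpy as np


def fft_magnitude(signal: np.ndarray, sample_rate_hz: float, max_freq_hz=None):
    """Return (frequencies, magnitude spectrum) of a real-valued 1D signal."""
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    if max_freq_hz is not None:
        keep = freqs <= max_freq_hz  # drop bins above the cut-off
        freqs, magnitude = freqs[keep], magnitude[keep]
    return freqs, magnitude
```

For a one-second 50 Hz sine sampled at 1000 Hz, the largest magnitude falls in the 50 Hz bin, and passing `max_freq_hz=350` keeps only the bins up to 350 Hz.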

3. Results

3.1. Non-Augmented Data vs Augmented Data

10-fold cross-validation is used.
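
The 10-fold protocol can be sketched in plain Python as follows. Whether the authors shuffled or stratified the folds by class is not stated here, so the shuffling, the seed, and the function name are assumptions.

```python
import random


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # assumed: shuffle once before splitting
    folds = [indices[i::k] for i in range(k)]  # k disjoint, near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

For the 1000-recording dataset this gives 10 splits of 900 training and 100 test recordings, and every recording is tested exactly once; the reported accuracy would then be the average over the folds.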

For the non-augmented data, the accuracy was 98.5%.
For the augmented data, the accuracy was 99.87%.
For the binary dataset (PhysioNet/CinC Challenge 2016), the accuracy was 93.77%.

(Please read the paper directly for more experimental results.)

3.2. SOTA Comparisons

**SOTA Comparisons on** **GitHub Dataset**

The proposed architecture outperforms all compared models on all important performance metrics. The accuracy of the new model is 99.87%, which is 0.27 percentage points higher than the accuracy of the second-best model, built by Shuvo et al. in 2021.

**SOTA Comparisons on** **PhysioNet/CinC Challenge 2016**

The new system outperformed the previous state-of-the-art models on all performance metrics. The obtained accuracy of 93.77% is about 6.45 percentage points higher than the 87.31% accuracy reported by Alkhodari et al. in 2021.

3.3. Time Measurement

The timing results show that the model is lightweight enough to be implemented on embedded systems.

Written by Sik-Ho Tsang
