Brief Review — An End-to-End Deep Learning Framework for Real-Time Denoising of Heart Sounds for Cardiac Disease Detection in Unseen Noise
LSTM U-Net (LU-Net)
LSTM U-Net (LU-Net), by Bangladesh University of Engineering and Technology (BUET), National Heart Foundation Hospital and Research Institute, Qatar University, Johns Hopkins University
2023 ACCESS (Sik-Ho Tsang @ Medium)
Heart Sound Classification
2013 … 2023 [2LSTM+3FC, 3CONV+2FC] [NRC-Net] [Log-MelSpectrum+Modified VGGNet] [CNN+BiGRU] [CWT+MFCC+DWT+CNN+MLP]
- A novel deep encoder-decoder-based denoising architecture (LSTM U-Net, LU-Net) to suppress ambient and internal lung sound noises.
- Training is done using a large benchmark PCG dataset mixed with physiological noise, i.e., breathing sounds.
- Two different noisy datasets were prepared for experimental evaluation by mixing unseen lung sounds and hospital ambient noises with the clean heart sound recordings.
- Authors also used the inherently noisy portion of the PASCAL heart sound dataset for evaluation.
Outline
- LSTM U-Net (LU-Net)
- Dataset Preparation
- Results
1. LSTM U-Net (LU-Net)
1.1. Problem Formulation
A noise-free PCG signal x is corrupted by irrelevant components n, coming from the environment or the recording system, to form a noisy PCG signal y = x + n.
- LU-Net, F(·), is used to denoise y and obtain an estimate x̂ = F(y), which should be close to x.
- Thus, the Mean Squared Error (MSE) between x̂ and x is used to train the network.
With denoised heart sounds, classification performance should also improve.
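To make the formulation concrete, here is a minimal NumPy sketch of the noisy-signal model and the MSE objective. Everything in it is an illustrative assumption rather than the paper's setup: the sine wave is only a stand-in for a clean PCG signal, the noise is Gaussian, and a simple moving average plays the role of the denoiser F (it is not LU-Net).

```python
import numpy as np

def mse(x_hat, x):
    """Mean squared error between an estimate x_hat and the clean signal x."""
    return float(np.mean((x_hat - x) ** 2))

def moving_average(y, k=9):
    """Placeholder denoiser F: a k-tap moving average (NOT LU-Net)."""
    return np.convolve(y, np.ones(k) / k, mode="same")

rng = np.random.default_rng(0)
t = np.arange(2000) / 2000.0

x = np.sin(2 * np.pi * 5 * t)            # stand-in for a clean PCG signal
n = 0.3 * rng.standard_normal(t.shape)   # stand-in for additive noise
y = x + n                                # noisy observation

x_hat = moving_average(y)                # x_hat = F(y)

loss_noisy = mse(y, x)          # error if we do nothing
loss_denoised = mse(x_hat, x)   # error after the placeholder denoiser
print(loss_noisy, loss_denoised)
```

In training, the same MSE would be computed between the network output and the clean recording, and minimized with respect to the network weights.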
1.2. Model Architecture
The proposed network is a convolutional encoder-decoder-based architecture with bi-directional long short term memory (Bi-LSTM) modules in the skip connections.
- Encoder Path: 1D convolution layers with ReLU are used.
- Encoder_i, i = 2–5, each contain a convolution layer with a stride of 2, so they successively create lower-dimensional representations.
- Decoder Path: Each Decoder_i consists of a 1D convolution layer followed by a ReLU non-linearity and an UpSampling1D layer.
- Finally, the output from Decoder_1 is passed through a convolution layer with Cout = 1, which provides the corresponding denoised output sequence, ŷ_t.
- In the skip connections, a Bi-LSTM module is used: it concatenates the forward and backward vectors into a single vector, learning long-term dependencies with fewer parameters.
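The shape bookkeeping of such an architecture can be sketched in plain NumPy. This is only a rough illustration under assumptions of mine: the input length (256), kernel size (3), channel count (4), and LSTM hidden size (4) are hypothetical, the weights are random and untrained, and four upsampling steps are assumed so that they undo the four stride-2 encoders. How the Bi-LSTM output is merged back into the decoder is not covered in this review, so the sketch only shows its shape.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_same(x, w, stride=1):
    """1D convolution, zero-padded so the output length is ceil(T / stride).
    x: (T, C_in), w: (K, C_in, C_out) -> (ceil(T / stride), C_out)."""
    k = w.shape[0]
    t_out = -(-x.shape[0] // stride)  # ceiling division
    pad = max((t_out - 1) * stride + k - x.shape[0], 0)
    xp = np.pad(x, ((pad // 2, pad - pad // 2), (0, 0)))
    out = np.empty((t_out, w.shape[2]))
    for t in range(t_out):
        window = xp[t * stride : t * stride + k]          # (K, C_in)
        out[t] = np.tensordot(window, w, axes=([0, 1], [0, 1]))
    return out

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_pass(x, W, U, b):
    """Single-direction LSTM. x: (T, D) -> hidden states (T, H)."""
    H = U.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    out = np.empty((x.shape[0], H))
    for t in range(x.shape[0]):
        i, f, g, o = np.split(x[t] @ W + h @ U + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out[t] = h
    return out

def bilstm(x, fwd, bwd):
    """Run forward and time-reversed passes, concatenate per time step."""
    h_f = lstm_pass(x, *fwd)
    h_b = lstm_pass(x[::-1], *bwd)[::-1]
    return np.concatenate([h_f, h_b], axis=1)             # (T, 2H)

def lstm_params(d, hdim):
    return (rng.standard_normal((d, 4 * hdim)) * 0.1,
            rng.standard_normal((hdim, 4 * hdim)) * 0.1,
            np.zeros(4 * hdim))

T, C, K, H = 256, 4, 3, 4        # hypothetical sizes
h = rng.standard_normal((T, 1))  # stand-in for a noisy PCG segment

# Encoder path: Encoder_1 keeps the length, Encoder_2..5 use stride 2.
feats = []
for i in range(1, 6):
    w = rng.standard_normal((K, h.shape[1], C)) * 0.1
    h = relu(conv1d_same(h, w, stride=1 if i == 1 else 2))
    feats.append(h)
enc_lengths = [f.shape[0] for f in feats]

# Bi-LSTM applied to one skip connection (here: the Encoder_2 output).
skip = bilstm(feats[1], lstm_params(C, H), lstm_params(C, H))

# Decoder path: conv + ReLU + upsampling by 2 (sample repetition).
d = feats[-1]
for _ in range(4):
    w = rng.standard_normal((K, d.shape[1], C)) * 0.1
    d = np.repeat(relu(conv1d_same(d, w)), 2, axis=0)

# Final convolution with a single output channel (Cout = 1).
x_hat = conv1d_same(d, rng.standard_normal((K, C, 1)) * 0.1)

print(enc_lengths, skip.shape, x_hat.shape)
```

The point of the sketch is the shapes: each stride-2 encoder halves the temporal length, each upsampling step doubles it again, and the Bi-LSTM doubles the feature dimension of the skip path because forward and backward states are concatenated.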
2. Dataset Preparation
- (Please read the paper directly for the detailed dataset preparation and experimental setup. It covers many pages for this section.)
2.1. PhysioNet
- This dataset provides signals contaminated by several noise sources (e.g., breathing, stethoscope movement, intestinal activity, and peripheral talking).
2.2. PASCAL
- In the training set of Dataset-B, there are sub-directories containing noisy data of normal (120) and murmur (29).
2.3. Open-Access Heart Sound (OAHS) Dataset (Yaseen GitHub Dataset)
- It is a publicly available, noise-free PCG dataset containing a total of 1000 recordings.
2.4. ICBHI 2017 Dataset
- The largest publicly available respiratory sound database [48].
2.5. Hospital Ambient Noise (HAN) Dataset
- A non-copyrighted YouTube video of 68 minutes where the audio occurrences were recorded from different places (corridor, waiting room, etc.) of a busy hospital.
2.6. Training Data Preparation
- The PhysioNet dataset is used.
- Lung sounds from the ICBHI 2017 dataset are used as the noise source to create synthetic noisy PCG recordings.
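One common way to create such synthetic mixtures is to scale the noise so that the mixture reaches a chosen signal-to-noise ratio. The sketch below assumes that approach; the review does not state the paper's exact mixing procedure or SNR levels, so the function names, the 5 dB target, and the random stand-in signals are all hypothetical.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the mixture has the target SNR in dB.
    The noise is repeated or truncated to match the length of `clean`."""
    noise = np.resize(noise, clean.shape)
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise

def measured_snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    return 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))

rng = np.random.default_rng(0)
clean_pcg = rng.standard_normal(4000)    # stand-in for a clean heart sound
lung_sound = rng.standard_normal(1500)   # stand-in for a shorter noise clip

noisy_pcg = mix_at_snr(clean_pcg, lung_sound, snr_db=5.0)
print(measured_snr_db(clean_pcg, noisy_pcg))
```

The same helper could be reused with a different noise source (for example, hospital ambient noise) to build a second noisy set from the same clean recordings.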
2.7. Test Data Preparation
- The relatively clean OAHS dataset recordings are mixed with lung sound and hospital ambient noise to generate two synthetic noisy test sets, OAHS-LS and OAHS-HAN.
- To represent the real-world test scenario, the noisy recordings of the PASCAL dataset are used.
- For classification, the OAHS dataset is split into 3 distinct sets: training, validation, and test, with a ratio of 70 : 10 : 20. The test portion is mixed with lung sound and hospital ambient noise to generate the OAHS-LS and OAHS-HAN test sets, respectively.
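As a small illustration of the 70 : 10 : 20 split, the sketch below shuffles recording indices and cuts them into three disjoint sets. It assumes a plain random split with a fixed seed; whether the authors split per class or per subject is not stated in this review, and the function name is hypothetical.

```python
import random

def split_indices(n, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle indices 0..n-1 and split them into train/val/test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(round(n * ratios[0]))
    n_val = int(round(n * ratios[1]))
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]   # remainder goes to the test set
    return train, val, test

# 1000 recordings, as in the OAHS dataset described above.
train, val, test = split_indices(1000)
print(len(train), len(val), len(test))
```

Only the indices in `test` would then be mixed with noise, so that the noisy test recordings never overlap with the training or validation data.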
3. Results
3.1. Denoising Performance
LU-Net consistently outperforms FCN and U-Net across all evaluated metrics.
3.2. Classification Performance
- CardioXNet is used as the classification model.
The proposed LU-Net improves the estimated SNR by 6.517 dB, which is 26.175% and 2.725% superior relative to U-Net and FCN, respectively.
3.3. Visualization
The superiority of LU-Net over the baselines can be visually observed in Fig. 7 and 8 above.