Brief Review — Classification of heart sounds based on the combination of the modified frequency wavelet transform and convolutional neural network

MFSWT + CNN

4 min read · Mar 2, 2024

Classification of heart sounds based on the combination of the modified frequency wavelet transform and convolutional neural network
MFSWT + CNN, by Shandong University
2020 J. MBEC, Over 40 Citations (Sik-Ho Tsang @ Medium)
Heart Sound Classification
2013 … 2023 [2LSTM+3FC, 3CONV+2FC] [NRC-Net] [Log-MelSpectrum + Modified VGGNet] [CNN+BiGRU] [CWT+MFCC+DWT+CNN+MLP] [LSTM U-Net (LU-Net)]

The modified frequency slice wavelet transform (MFSWT) and a convolutional neural network (CNN) are combined to classify heart sounds as normal or abnormal.

Outline

1. MFSWT + CNN
2. Results

1. MFSWT + CNN

1.1. Segmentation

A hidden Markov model (HMM) is used to locate each cardiac cycle in the heart sound signal and to determine the boundaries of its four states: S1, systole, S2, and diastole.
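As a rough illustration of how an HMM can assign the four states, here is a minimal Viterbi sketch. It is not the segmentation model used in the paper: the binary "loud/quiet" frame flags, the `P_STAY` and `P_LOUD` values, and the function name `viterbi_segment` are all assumptions made for this example. The only structure taken from the text is the cyclic order S1, systole, S2, diastole.

```python
import math

STATES = ["S1", "systole", "S2", "diastole"]

# Hypothetical parameters chosen for illustration, not taken from the paper.
P_STAY = [0.7, 0.8, 0.7, 0.9]  # chance of remaining in each state for one more frame
P_LOUD = [0.9, 0.1, 0.9, 0.1]  # chance that a frame in each state is flagged "loud"


def viterbi_segment(loud_flags):
    """Return the most likely state label for every frame.

    loud_flags: sequence of 0/1 values, e.g. a thresholded envelope (assumed input).
    Each state may only stay put or advance to the next state in the cycle.
    """
    if not loud_flags:
        return []
    n = len(STATES)

    def log_emit(state, is_loud):
        p = P_LOUD[state] if is_loud else 1.0 - P_LOUD[state]
        return math.log(p)

    # Uniform prior over the starting state.
    score = [math.log(1.0 / n) + log_emit(s, loud_flags[0]) for s in range(n)]
    backpointers = []

    for is_loud in loud_flags[1:]:
        new_score, pointers = [], []
        for j in range(n):
            prev = (j - 1) % n
            stay = score[j] + math.log(P_STAY[j])
            advance = score[prev] + math.log(1.0 - P_STAY[prev])
            if stay >= advance:
                best, arg = stay, j
            else:
                best, arg = advance, prev
            new_score.append(best + log_emit(j, is_loud))
            pointers.append(arg)
        score = new_score
        backpointers.append(pointers)

    # Backtrack from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]
```

In this toy model, S1 and S2 look identical to the emissions, so the only thing that separates systole from diastole is the longer expected duration assigned to diastole through `P_STAY`. Practical heart sound segmenters model state durations explicitly and use richer features than a single loudness flag.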

1.2. Modified Frequency Slice Wavelet Transform (MFSWT)

The MFSWT builds on the frequency slice wavelet transform (FSWT) and was adapted for low-frequency biosignals such as ECG [18].
A signal-adaptive frequency slice function (FSF) is used to accurately locate each component of the biosignal in the time-frequency plane, and ECG classification with this method achieved good results [18].
In the FSWT, the scale σ is a constant or a function of ω, t, and u. In the MFSWT, the scale σ is instead a function of f̂(x), defined as follows:

(Please kindly read [18] for FSWT.)

1.3. Sample Entropy (SampEn)

Sample entropy is denoted by SampEn(N, r, m), where N is the length of the time signal, m is the embedding dimension, and r is the similarity tolerance.

Suppose that a signal of length N is written as follows:

The m-dimensional vector Xm(i) is defined as follows:

The distance Di,j between two m-dimensional vectors Xm(i) and Xm(j) (i ≠ j) is defined as follows:

For the vector Xm(i), the number of vectors Xm(j) whose distance from Xm(i) is less than the tolerance r is counted and denoted as Bmi(r).

SampEn is a measure of signal complexity as defined below:
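For reference, the commonly used definition of sample entropy can be written with the symbols above as follows. This is the textbook formulation, so the notation and normalization may differ from the paper's. The averaged quantity B^m(r) is introduced here for the example, and B_i^m(r) is treated as a raw count, as in the text.

```latex
\begin{aligned}
x &= \{x(1), x(2), \ldots, x(N)\} \\
X_m(i) &= [\,x(i), x(i+1), \ldots, x(i+m-1)\,], \quad 1 \le i \le N-m \\
D_{i,j} &= \max_{0 \le k \le m-1} \lvert x(i+k) - x(j+k) \rvert \\
B^m(r) &= \frac{1}{N-m} \sum_{i=1}^{N-m} \frac{B_i^m(r)}{N-m-1} \\
\mathrm{SampEn}(N, r, m) &= -\ln \frac{B^{m+1}(r)}{B^m(r)}
\end{aligned}
```

Here B^{m+1}(r) is obtained in the same way as B^m(r), using vectors of dimension m + 1.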

The lower the SampEn, the more regular the signal.
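The quantities described above can be sketched in a few lines of plain Python. This follows the standard sample entropy definition rather than code from the paper. The function name `sample_entropy`, the defaults `m=2` and `r=0.2`, and the use of an absolute tolerance are assumptions for illustration (in practice r is often scaled by the signal's standard deviation).

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of sequence x with embedding dimension m and tolerance r.

    Sketch of the standard definition; r is an absolute tolerance here.
    Returns inf when no template matches are found.
    """
    n = len(x)
    if n <= m + 1:
        raise ValueError("signal too short for the chosen m")

    def count_matches(length):
        # Use the same N - m starting points for both template lengths
        # so that the two counts are directly comparable.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance: largest element-wise absolute difference.
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist < r:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs of length-m templates
    a = count_matches(m + 1)  # matching pairs of length-(m+1) templates
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)
```

A perfectly periodic sequence such as `[0, 1] * 20` gives a value of zero, while an irregular sequence gives a larger value, which matches the interpretation that lower SampEn means a more regular signal. The double loop is quadratic in the signal length, so this version is only meant for short examples.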

Two CNN models are used; which one classifies a given signal depends on that signal's SampEn.

1.4. CNN

The two models have similar structures: both are 12-layer neural networks consisting of two convolutional layers, two ReLU layers, two max-pooling layers, three fully connected layers, and input and output layers.
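To make the layer ordering concrete, the sketch below traces tensor shapes through a stack with two convolution, two ReLU, two max-pooling, and three fully connected layers. Only the layer types and counts come from the description above. The input size (1×64×64), kernel sizes, channel counts, and fully connected widths in `ASSUMED_LAYERS` are hypothetical placeholders, not the paper's hyperparameters, and the conv-ReLU-pool ordering is also an assumption.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after sliding a kernel window over `size` pixels."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical hyperparameters: none of these numbers come from the paper.
ASSUMED_LAYERS = [
    ("conv", {"kernel": 5, "out_channels": 8}),
    ("relu", {}),
    ("pool", {"kernel": 2}),
    ("conv", {"kernel": 5, "out_channels": 16}),
    ("relu", {}),
    ("pool", {"kernel": 2}),
    ("fc", {"out_features": 120}),
    ("fc", {"out_features": 84}),
    ("fc", {"out_features": 2}),  # two classes: normal and abnormal
]


def trace_shapes(input_shape, layers):
    """Return a list of (layer kind, output shape) pairs, starting with the input."""
    shape = tuple(input_shape)
    trace = [("input", shape)]
    for kind, cfg in layers:
        if kind == "conv":
            _, h, w = shape
            k = cfg["kernel"]
            shape = (cfg["out_channels"], conv_output_size(h, k), conv_output_size(w, k))
        elif kind == "pool":
            c, h, w = shape
            k = cfg["kernel"]
            # Non-overlapping pooling: the stride equals the window size.
            shape = (c, conv_output_size(h, k, stride=k), conv_output_size(w, k, stride=k))
        elif kind == "fc":
            # A fully connected layer flattens whatever precedes it.
            shape = (cfg["out_features"],)
        elif kind != "relu":  # ReLU leaves the shape unchanged
            raise ValueError("unknown layer kind: " + kind)
        trace.append((kind, shape))
    return trace


for kind, shape in trace_shapes((1, 64, 64), ASSUMED_LAYERS):
    print(kind, shape)
```

With these placeholder settings, the feature map entering the first fully connected layer has 16 × 13 × 13 = 2704 values. Changing the assumed input size or kernels changes that number, which is the main thing this kind of bookkeeping helps to check.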