Brief Review — Heart sounds classification: Application of a new CyTex inspired method and deep convolutional neural network with transfer learning

CyTex + ResNet

Sik-Ho Tsang
4 min read · Jun 1, 2024
Heart Sound Signal is Converted into CyTex Image Then Fed Into DCNN for Classification

Heart sounds classification: Application of a new CyTex inspired method and deep convolutional neural network with transfer learning
CyTex + ResNet, by Islamic Azad University, Bellarmine University, The University of Louisiana at Lafayette
2023 Elsevier J. Smart Health (Sik-Ho Tsang @ Medium)

Phonocardiogram (PCG)/Heart Sound Classification
2013 …
2023 … [CTENN] [Bispectrum + ViT] 2024 [MWRS-BFSC + CNN2D]

  • A new CyTex-inspired transform is used to convert heart sound signals to textured images where the neighboring pixels have meaningful relationships that result in semi-periodic patterns in the output image.
  • The image is then fed into a deep convolutional neural network (DCNN), e.g. ResNet, for heart sound classification.


  1. CyTex + ResNet
  2. Results

1. CyTex + ResNet

1.1. Preprocessing

  • A two-stage noise cancellation scheme is used.
  • Firstly, a band-pass filter removes the frequency components lower than 15 Hz and above 800 Hz.
  • Next, a spectral subtraction de-noising scheme, recommended for biological signals such as speech and EEG, is applied.
  • Springer's improved version of Schmidt's segmentation algorithm, which is based on a logistic regression hidden semi-Markov model (HSMM), is used to segment the PCG.
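The first noise-cancellation stage can be illustrated with a minimal sketch. The review does not say which filter design the authors used, so the FFT-masking approach below is an assumption chosen for brevity, and the function name `bandpass_fft`, the 2000 Hz sampling rate, and the test tones are all hypothetical. Only the 15 Hz and 800 Hz cut-offs come from the text.

```python
import numpy as np

def bandpass_fft(signal, fs, low_hz=15.0, high_hz=800.0):
    """Zero every frequency bin outside [low_hz, high_hz] (ideal band-pass)."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[~keep] = 0.0
    return np.fft.irfft(spectrum, n=signal.size)

# Hypothetical one-second recording sampled at 2000 Hz.
fs = 2000
t = np.arange(fs) / fs
drift = np.sin(2 * np.pi * 5 * t)         # below 15 Hz, should be removed
heart_band = np.sin(2 * np.pi * 100 * t)  # inside the pass band, should be kept
hiss = np.sin(2 * np.pi * 900 * t)        # above 800 Hz, should be removed

filtered = bandpass_fft(drift + heart_band + hiss, fs)
```

An ideal frequency-domain mask like this can ring on real recordings. A Butterworth design applied with zero-phase filtering (for example `scipy.signal.butter` with `scipy.signal.sosfiltfilt`) is a more conventional choice, but whether the paper used one is not stated here.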

1.2. CyTex Image

  • To convert the 1-dimensional PCG signals into images in a meaningful way, each cardiac cycle of the signal is placed in one image column, followed by its state index (S1, S2, S3, S4 index) in the next column. The process continues until all cardiac cycles have been placed in the image.
  • Then, the pixel values are normalized to the range 0 to 1:
  • where x̂ and ŝ are the normalized versions of x = (x1, x2, …, xk) and s = (s1, s2, …, sk), the original PCG signal and the corresponding segmentation output index, respectively.
  • where m is the number of cardiac cycles.
CyTex Image Examples
  • The number of image rows is set by the longest cardiac cycle. The number of columns is twice the number of cycles.

Vertically adjacent pixels show how sample values change from one sample to the next within a cardiac cycle, while horizontally neighboring pixels show how samples at a similar position within the cardiac cycle change from one cycle to the next.
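The column layout and normalization described above can be sketched as follows. Several details are assumptions, since the review does not give them: min-max scaling is used for the normalization, shorter cycles are zero-padded at the bottom, and cycle boundaries are passed in as start indices. The names `minmax`, `cytex_image`, and `cycle_starts`, as well as the toy signal, are hypothetical.

```python
import numpy as np

def minmax(values):
    """Scale values to [0, 1] (assumed normalization; constant input maps to 0)."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span

def cytex_image(x, s, cycle_starts):
    """Place each cardiac cycle in one column and its state index in the next."""
    x_hat, s_hat = minmax(x), minmax(s)
    bounds = list(cycle_starts) + [len(x)]
    spans = list(zip(bounds[:-1], bounds[1:]))
    m = len(spans)                                 # number of cardiac cycles
    n_rows = max(end - start for start, end in spans)  # longest cycle sets the height
    image = np.zeros((n_rows, 2 * m))              # zero padding is an assumption
    for j, (start, end) in enumerate(spans):
        image[: end - start, 2 * j] = x_hat[start:end]      # signal column
        image[: end - start, 2 * j + 1] = s_hat[start:end]  # state-index column
    return image

# Toy example: two cycles of length 4 and 5 with state labels 1..4.
x = [0, 2, 4, 1, 3, -4, 4, 0, 2]
s = [1, 2, 3, 4, 1, 2, 3, 4, 4]
image = cytex_image(x, s, cycle_starts=[0, 4])
```

Here the image has 5 rows (the longer cycle) and 4 columns (twice the 2 cycles), matching the sizing rule in the text. A real image would still need resizing to the input size expected by the chosen network.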

1.3. DCNN

  • The input image is first fed to a pre-trained DCNN such as AlexNet, ResNet-50, Inception-v3, or VGG-16.
  • The final two layers of the original network are removed.
  • Four layers are added in their place: a flatten layer and three dense layers. The dense layers are a 512-node layer followed by a 128-node layer with ReLU activation, and a final one-node layer with a sigmoid activation function.
  • Data augmentation, hyperparameter tuning and Dropout are also used.
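To make the shape flow of the added layers concrete, here is a framework-agnostic NumPy sketch of the forward pass only: flatten, 512 nodes, 128 nodes, then one sigmoid node. It is not the authors' implementation. The weights are random, ReLU on both hidden layers is an assumption, and the 7×7×16 feature map is a hypothetical stand-in for the output of a truncated pre-trained backbone (real feature maps have many more channels). Dropout and training are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    """Randomly initialized weights and zero biases for one dense layer."""
    weights = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
    return weights, np.zeros(n_out)

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def head_forward(feature_maps, params):
    """Flatten -> Dense(512, ReLU) -> Dense(128, ReLU) -> Dense(1, sigmoid)."""
    flat = feature_maps.reshape(feature_maps.shape[0], -1)  # flatten layer
    (w1, b1), (w2, b2), (w3, b3) = params
    hidden1 = relu(flat @ w1 + b1)
    hidden2 = relu(hidden1 @ w2 + b2)
    return sigmoid(hidden2 @ w3 + b3)  # one probability per image

# Hypothetical backbone output for a batch of 4 CyTex images.
features = rng.normal(size=(4, 7, 7, 16))
n_flat = 7 * 7 * 16
params = [dense(n_flat, 512), dense(512, 128), dense(128, 1)]
probs = head_forward(features, params)
```

The single sigmoid output fits a two-class decision, with each value in `probs` read as the probability of one class. In practice the same head would be built in a deep learning framework on top of the pre-trained network so that it can be trained.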

2. Results

2.1. Performance

  • ResNet-50 obtains the best score among all networks.

2.2. Limitations

  • The authors note that noise is one of the main challenges in accurately classifying PCG signals. Recording a clean heart sound in a real-world setting is nearly impossible, and the recorded signal is often contaminated by a high level of noise.
  • This remains an open problem that significantly affects classification results. Because a high noise level can degrade the segmentation outcome, the quality of the CyTex image depends directly on it.


