Brief Review — Murmur Identification Using Supervised Contrastive Learning

CeZIS Team, 8th and 4th Ranks for Murmur and Outcome Predictions Respectively

Dec 21, 2024

Murmur Identification Using Supervised Contrastive Learning
CeZIS, by VSL Software, and Pavol Jozef Šafárik University in Košice,
2022 CinC (Sik-Ho Tsang @ Medium)
Phonocardiogram (PCG) / Heart Sound Classification

Supervised contrastive learning with a deep convolutional neural network is proposed to embed each phonocardiogram slice onto a unit hypersphere in a low-dimensional space.
The resulting latent factors are then fed to a Random Forest model that classifies patients, evaluated for murmur detection and by the Challenge cost score for outcome prediction.

Outline

CeZIS Team
Results

1. CeZIS Team

1.1. Dataset

The training set contains 3163 PCGs from 942 patients in the CirCor dataset.
Basic demographic data (gender, age group, height, weight, pregnancy status) and at least one recording from at least one prominent auscultation location are available for each patient. There are four standard locations.
The original murmur labels are Absent (695 / 73.78%), Present (179 / 19.00%), and Unknown (68 / 7.22%).

The murmur label is redefined per PCG rather than per patient, with two new labels, P1 and P2, introduced.

According to clinical outcome diagnosed by a medical expert, the patients were divided into two classes: Abnormal (456 / 48.41%) and Normal (486 / 51.59%).
However, 29 patients in the training set have an observed murmur but an Outcome label of Normal. Conversely, the number of patients with the Outcome label Abnormal (456) is much higher than the number of patients with an observed murmur (179). (To me, inconsistent labels are problematic for deep neural network training, since the labels are supposed to be the ground truth that supervises the model.)

1.2. Contrastive Learning

Based on the labels, the SupCon loss is used so that, for each selected anchor, other samples from the same class in the batch are pulled close together in the embedding space, while samples from other classes are pushed much further away.
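To make the pull/push behaviour concrete, here is a minimal pure-Python sketch of a supervised contrastive loss for one batch, following the general SupCon formulation. It is not the authors' code: the temperature of 0.1, the function names, and the toy 2-D batch are all assumptions for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss for one batch.

    temperature=0.1 is an assumed value, not taken from the paper.
    Anchors without any same-class sample in the batch are skipped.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, n_anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Scaled cosine similarity of the anchor to every other sample.
        sims = {a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
                for a in range(n) if a != i}
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability of the positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        n_anchors += 1
    return total / n_anchors if n_anchors else 0.0

# Toy batch: two tight clusters in 2-D (illustrative values only).
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_clustered = supcon_loss(batch, ["A", "A", "P1", "P1"])  # labels follow the clusters
loss_mixed = supcon_loss(batch, ["A", "P1", "A", "P1"])      # labels cut across the clusters
```

When the labels agree with the geometry, the loss is close to zero; when same-class samples sit far apart, the loss is large, which is the signal that drives same-class embeddings together during training.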

The deep learning model maps a short PCG slice (approximately 8 seconds) to an embedding on the unit hypersphere in a low-dimensional space (dim = 16).
The obtained embedding can be used both for the original classification task and for downstream tasks.

The backbone CNN is a one-dimensional variant of ResNet50 with a reduced number of virtual channels (width = 1/4); L2 normalization is applied to its output.
Thus, each PCG slice is first embedded onto the unit hypersphere in 512-dimensional space. A projection head consisting of a single linear layer then maps the 512-dimensional space to the 16-dimensional space.
L2 normalization is applied again, so that the PCG slice ends up on the unit hypersphere in 16-dimensional space.
For contrastive learning, only PCGs with murmur labels A (Absent) and P1 (Present/Murmur location) are used. Samples with labels P2 and U are omitted.
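The normalize, project, normalize chain described above can be sketched as follows. This is only an illustration of the data flow: the ResNet backbone is replaced by a random 512-dimensional vector, and the projection weights are random rather than trained. The names `embed`, `linear`, `weights`, and `bias` are hypothetical.

```python
import math
import random

BACKBONE_DIM = 512  # backbone output size, from the text
EMBED_DIM = 16      # final embedding size, from the text

def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def linear(weights, bias, v):
    """A single linear layer: weights @ v + bias."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def embed(backbone_features, weights, bias):
    h = l2_normalize(backbone_features)  # point on the 512-d unit hypersphere
    z = linear(weights, bias, h)         # projection head: one linear layer
    return l2_normalize(z)               # point on the 16-d unit hypersphere

rng = random.Random(0)
# Stand-in for the 1-D ResNet50 output of one PCG slice (random, for illustration).
features = [rng.gauss(0, 1) for _ in range(BACKBONE_DIM)]
# Untrained, randomly initialised projection head (assumption).
weights = [[rng.gauss(0, 0.05) for _ in range(BACKBONE_DIM)]
           for _ in range(EMBED_DIM)]
bias = [0.0] * EMBED_DIM

z = embed(features, weights, bias)
```

Because the final vector has unit length, the dot product of two embeddings is their cosine similarity, which is the quantity a contrastive loss compares.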

1.3. Patient Classification

PCGs in the training set range in length from approximately 5 seconds to more than 64 seconds. For each PCG, 10 different slices (offsets) are defined with a length of approximately 8 seconds.
If the PCG is shorter than 8 seconds, the signal is padded with varying numbers of zeros on the left and right.
For each patient, the 10 views corresponding to the offsets are used for data augmentation, as shown above.
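One possible way to generate the 10 views is sketched below. The post does not say how the offsets are chosen or how the zeros are split, so this sketch assumes evenly spaced offsets for long signals and an evenly varying left/right padding split for short ones. Lengths are given in samples, since the sampling rate is not stated here; `make_views` is a hypothetical helper.

```python
def make_views(signal, slice_len, n_views=10):
    """Return n_views slices of length slice_len from one PCG.

    Assumptions (not from the paper): offsets are evenly spaced, and
    short signals are zero-padded with a left/right split that shifts
    from all-right padding to all-left padding across the views.
    """
    n = len(signal)
    views = []
    for k in range(n_views):
        frac = k / (n_views - 1) if n_views > 1 else 0.0
        if n >= slice_len:
            # Slide the window from the start to the end of the recording.
            start = round(frac * (n - slice_len))
            views.append(list(signal[start:start + slice_len]))
        else:
            # Pad to slice_len, moving the signal from left to right.
            pad_total = slice_len - n
            left = round(frac * pad_total)
            views.append([0.0] * left + list(signal)
                         + [0.0] * (pad_total - left))
    return views

# Toy signals (lengths in samples, illustrative only).
long_signal = [float(i) for i in range(20)]
short_signal = [1.0, 2.0, 3.0, 4.0, 5.0]
long_views = make_views(long_signal, slice_len=8)
short_views = make_views(short_signal, slice_len=8)
```

Every view has the same length, so views from recordings of very different durations can be batched together for the CNN.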

The latent factors from the CNN are supplemented with the patient's demographic variables as classifier inputs.
The patient-level label is predicted from the arithmetic mean of the predicted probabilities over all of that patient's views.
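The averaging step can be sketched as below. The patient IDs and probabilities are made-up illustrative values, and `aggregate_by_patient` is a hypothetical helper name.

```python
from collections import defaultdict

def aggregate_by_patient(view_predictions):
    """Average per-view probabilities into one probability per patient.

    view_predictions: iterable of (patient_id, probability) pairs,
    one pair per view.
    """
    grouped = defaultdict(list)
    for patient_id, prob in view_predictions:
        grouped[patient_id].append(prob)
    return {pid: sum(probs) / len(probs) for pid, probs in grouped.items()}

# Illustrative per-view predictions, not real model outputs.
view_predictions = [("p1", 0.9), ("p1", 0.7), ("p1", 0.8),
                    ("p2", 0.1), ("p2", 0.3)]
patient_probs = aggregate_by_patient(view_predictions)
```

Averaging over views smooths out slices that happen to miss the murmur or contain mostly padding.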

The task is divided into two consecutive binary classifications (Figure 4):

Present vs. others (Absent, Unknown) — to separate the patients with the original Murmur label of Present.
Unknown vs. Absent — the additional classification for other values of the original Murmur labels of the patients.
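The two-stage decision can be sketched as a small cascade over patient-level probabilities. The 0.5 thresholds and the function name `classify_murmur` are assumptions for illustration; the post does not state how the decision thresholds were chosen.

```python
def classify_murmur(p_present, p_unknown, t_present=0.5, t_unknown=0.5):
    """Cascade of two binary decisions.

    p_present: probability from the 'Present vs. others' classifier.
    p_unknown: probability from the 'Unknown vs. Absent' classifier.
    Both thresholds are assumed values, not taken from the paper.
    """
    # Stage 1: separate patients with a Present murmur.
    if p_present >= t_present:
        return "Present"
    # Stage 2: only reached by the remaining patients.
    return "Unknown" if p_unknown >= t_unknown else "Absent"

# Illustrative probabilities, not real model outputs.
examples = {
    "murmur clearly audible": classify_murmur(0.8, 0.2),
    "poor quality recording": classify_murmur(0.3, 0.7),
    "clean, no murmur": classify_murmur(0.1, 0.1),
}
```

Note that the second classifier's output is ignored whenever the first stage already predicts Present.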

A Random Forest is used for each classification.
Outcome prediction is handled similarly.

2. Results

10-fold cross-validation (CV) is used on the training set.
For the training set, the mean and standard deviation obtained from CV are shown, and for the test set, the ranking among all teams is also displayed.
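A minimal sketch of a 10-fold split and the mean/standard deviation summary is given below. Splitting at the patient level (so that all views of one patient stay in the same fold) and using the sample standard deviation are assumptions; the post does not specify either. Both helper names are hypothetical.

```python
import random
import statistics

def k_fold_patient_splits(patient_ids, k=10, seed=0):
    """Yield (train_ids, val_ids) pairs, splitting by patient.

    Splitting by patient keeps all recordings of a patient in one fold;
    this grouping is an assumption, not stated in the post.
    """
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, val

def summarize(fold_scores):
    """Mean and sample standard deviation of per-fold scores."""
    return statistics.mean(fold_scores), statistics.stdev(fold_scores)

# 942 patients, as in the training set; the IDs here are just 0..941.
splits = list(k_fold_patient_splits(range(942)))
```

Each patient appears in exactly one validation fold, so every fold score is computed on patients the model has not seen during that fold's training.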

Due to the inaccuracy of the model and the high cost of late treatment, the model predicts a very large proportion of patients as Abnormal and refers them to experts for examination.

Written by Sik-Ho Tsang
