Brief Review — Hierarchical Multi-Scale Convolutional Network for Murmurs Detection on PCG Signals

HMS-Net, proposed by the team HearTech+

Sik-Ho Tsang
5 min read · Dec 14, 2024

Hierarchical Multi-Scale Convolutional Network for Murmurs Detection on PCG Signals
HMS-Net, by King’s College London
2022 CinC (Sik-Ho Tsang @ Medium)

Phonocardiogram (PCG) / Heart Sound Classification
2016 … 2024 [MWRS-BFSC + CNN2D] [ML & DL Model Study on HSS] [Audio Data Analysis Tool]
Summary: My Healthcare and Medical Related Paper Readings and Tutorials
==== My Other Paper Readings Are Also Over Here ====

  • The team, HearTech+, proposes a hierarchical multi-scale convolutional neural network (HMS-Net) to perform both murmur (T1) and clinical outcome (T2) classification.
  • HMS-Net extracts convolutional features from spectrograms at multiple scales and fuses them through its hierarchical architecture. The network builds long- and short-term dependencies between the multi-scale features, which improves classification performance.
  • Finally, the prediction for a patient is based on ensembled segment predictions obtained by a sliding window.

Outline

  1. Database & Preprocessing
  2. HMS-Net
  3. Results

1. Database & Preprocessing

1.1. Database

The database used in this study is the publicly released data of the PhysioNet Challenge 2022, containing 3163 PCG recordings from 942 patients (murmurs: 695 absent, 68 unknown, and 179 present; outcomes: 456 abnormal and 486 normal).

1.2. Pre-processing

  • The recordings are downsampled from 4000 Hz to 2000 Hz to speed up data loading.
  • Afterwards, the signal is normalised by z-score normalisation.

While the effect of PCG duration on CNN performance has been shown to be minor, the recordings are cropped into 3s segments as CNN inputs, considering both the shortest signal length (5s) and the CNN receptive field.
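A minimal sketch of this preprocessing pipeline, assuming NumPy/SciPy; the function name and the non-overlapping cropping are illustrative, not taken from the paper:

```python
import numpy as np
from scipy.signal import decimate

def preprocess(pcg, fs=4000, target_fs=2000, seg_sec=3):
    """Downsample, z-score normalise, and crop a PCG recording into 3s segments."""
    # Downsample 4000 Hz -> 2000 Hz (factor 2, with anti-aliasing)
    pcg = decimate(pcg, q=fs // target_fs)
    # Z-score normalisation
    pcg = (pcg - pcg.mean()) / (pcg.std() + 1e-8)
    # Crop into non-overlapping 3s segments (2000 Hz x 3 s = 6000 samples each)
    seg_len = target_fs * seg_sec
    n_segs = len(pcg) // seg_len
    return pcg[:n_segs * seg_len].reshape(n_segs, seg_len)
```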

1.3. Quality Assessment & Label Correction Strategy

  • The PCG recordings contain many low-quality segments caused by ambient noise, artefacts, body friction, etc., which can mislead CNN optimisation. Since the frequency of normal heart sounds lies between 20 and 200 Hz, and the energy of most murmurs is much lower than that of the heart sounds, the share of energy in this band indicates segment quality.

The selected assessment criterion is the ratio of spectral density between 20 and 200 Hz to the full band (0 to 1000 Hz), named the quality ratio.

Quality Assessment
  • Fig. 1a and 1b are spectrograms of two segments from one ‘absent’ recording; Fig. 1c is of a segment from an ‘unknown’ recording. There are visible differences between 1a and 1b, especially in the higher frequency bands. In contrast, the high-frequency noise in 1a is similar to that in 1c.

Thus, the label correction strategy is: if the quality ratio is larger than 30%, the segment’s murmur label follows the recording label; otherwise, the segment is relabelled as ‘unknown’. Note that this correction applies only to murmur labels; outcome labels remain unchanged.
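A possible implementation of the quality ratio and the relabelling rule; the 30% threshold and the 20−200 Hz band come from the paper, while the Welch PSD settings and function names are assumptions for illustration:

```python
import numpy as np
from scipy.signal import welch

def quality_ratio(segment, fs=2000):
    """Ratio of spectral power in 20-200 Hz to the full band (0-1000 Hz)."""
    freqs, psd = welch(segment, fs=fs, nperseg=512)
    band = (freqs >= 20) & (freqs <= 200)
    return psd[band].sum() / psd.sum()

def correct_murmur_label(segment, recording_label, fs=2000, threshold=0.3):
    """Keep the recording label only for segments of sufficient quality."""
    return recording_label if quality_ratio(segment, fs) > threshold else "unknown"
```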

2. HMS-Net

2.1. Model Inputs

Multi-scale spectrograms

The CNN inputs are the multi-scale spectrograms of 3s segments. Three scales (×1.0, ×0.5, ×0.25) are selected.
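The post does not spell out how the smaller scales are produced; a plausible reading is that the ×0.5 and ×0.25 inputs are downsampled copies of the full-resolution spectrogram. A sketch under that assumption (STFT settings are also assumed):

```python
import numpy as np
from scipy.signal import spectrogram

def multi_scale_spectrograms(segment, fs=2000):
    """Full-resolution log-spectrogram plus x0.5 and x0.25 downsampled copies.

    Downsampling by strided slicing is an assumption for illustration; the
    paper may derive the scales differently (e.g. different STFT windows).
    """
    _, _, spec = spectrogram(segment, fs=fs, nperseg=128, noverlap=64)
    spec = np.log(spec + 1e-10)   # log compression
    return spec, spec[::2, ::2], spec[::4, ::4]
```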

2.2. Multi-Scale Model

HMS-Net
  • A hierarchical multi-scale convolutional neural network (HMS-Net) is proposed. The three-scale spectrograms of a segment are input to HMS-Net, which outputs the 3-class murmur prediction. For outcome prediction, the network additionally combines patient information and outputs a binary result.

In brief, a larger-scale spectrogram requires deeper layers; thus, HMS-Net has four phases containing layers of incremental depth for extracting features from the different scales.

  • For example, in Phase 1, two sub-networks are employed to convolve the features from Scale 1 and Scale 2. The two-scale features are then concatenated along the channel dimension and passed to the next phase.
  • Phase 4 summarises the multi-scale convolutional features by global average pooling and classifies the segment with a linear layer.
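A simplified PyTorch sketch of this phase structure: stride-2 blocks shrink the running features so each smaller scale can be concatenated channel-wise. Channel widths, strides, and block depths are illustrative assumptions, not the paper’s exact configuration:

```python
import torch
import torch.nn as nn

def conv_block(cin, cout, stride):
    """3x3 conv -> BatchNorm -> ReLU."""
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=3, stride=stride, padding=1),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class HMSNetSketch(nn.Module):
    def __init__(self, n_classes=3, c=32):
        super().__init__()
        # Phase 1: sub-networks for Scale 1 (stride 2) and Scale 2 (stride 1)
        self.p1_s1 = conv_block(1, c, stride=2)
        self.p1_s2 = conv_block(1, c, stride=1)
        # Phase 2: convolve fused features (stride 2), bring in Scale 3
        self.p2 = conv_block(2 * c, 2 * c, stride=2)
        self.p2_s3 = conv_block(1, c, stride=1)
        # Phase 3: deeper convolution on the fused multi-scale features
        self.p3 = conv_block(3 * c, 4 * c, stride=2)
        # Phase 4: global average pooling + linear classifier
        self.head = nn.Linear(4 * c, n_classes)

    def forward(self, x1, x2, x3):
        f = torch.cat([self.p1_s1(x1), self.p1_s2(x2)], dim=1)  # Phase 1
        f = torch.cat([self.p2(f), self.p2_s3(x3)], dim=1)      # Phase 2
        f = self.p3(f)                                          # Phase 3
        f = f.mean(dim=(2, 3))                                  # Phase 4: GAP
        return self.head(f)
```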

Regarding the outcome classifier, patient information, including age, gender (one-hot), pregnancy status, height, and weight, is added as extra information to distinguish patients with abnormal clinical outcomes.

  • 256 patient features are extracted from this information via a 4-layer multi-layer perceptron (MLP). The final outcome prediction is obtained from both the convolutional features and the patient features.
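A minimal sketch of such a patient-information branch; only the 4 layers and the 256 output features are stated above, so the hidden widths, metadata dimension, and concatenation-based fusion are assumptions:

```python
import torch
import torch.nn as nn

class OutcomeHead(nn.Module):
    """4-layer MLP on patient metadata, fused with convolutional features."""
    def __init__(self, n_meta=6, conv_dim=128, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(n_meta, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 256),                      # 256 patient features
        )
        self.classifier = nn.Linear(conv_dim + 256, 2)   # binary outcome

    def forward(self, conv_features, meta):
        patient_features = self.mlp(meta)
        return self.classifier(torch.cat([conv_features, patient_features], dim=1))
```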

2.3. Murmur Classification Post-processing

  • For recording classification, a sliding window with 3s width and 1s step is applied to classify the whole recording continuously. For a frame covered by multiple windows, its label is calculated from the averaged class probabilities of those windows.
  • If the predicted ‘unknown’ frames account for over 80%, the recording is classified as ‘unknown’. Otherwise, the recording is classified with the majority of the serial labels.
  • For patient classification, if any location recording is classified as ‘present’, the patient is labelled ‘present’. For ‘absent’ and ‘unknown’, the patient is classified by the majority of the location recording labels.
  • When a patient has the same number of ‘absent’ and ‘unknown’ recordings, ‘absent’ takes priority.
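A literal reading of these rules as code; the frame/window bookkeeping at the recording edges is an assumption:

```python
import numpy as np
from collections import Counter

def classify_recording(window_probs, classes=("absent", "unknown", "present")):
    """window_probs: (n_windows, 3) probabilities from 3s windows, 1s step."""
    n_frames = len(window_probs) + 2                 # each window spans 3 frames
    frame_probs = np.zeros((n_frames, len(classes)))
    counts = np.zeros(n_frames)
    for i, p in enumerate(window_probs):             # window i covers frames i..i+2
        frame_probs[i:i + 3] += p
        counts[i:i + 3] += 1
    frame_labels = [classes[j] for j in (frame_probs / counts[:, None]).argmax(1)]
    if frame_labels.count("unknown") / n_frames > 0.8:
        return "unknown"
    return Counter(frame_labels).most_common(1)[0][0]

def classify_patient_murmur(recording_labels):
    """'present' dominates; otherwise majority vote, ties favour 'absent'."""
    if "present" in recording_labels:
        return "present"
    votes = Counter(recording_labels)
    return "absent" if votes["absent"] >= votes["unknown"] else "unknown"
```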

2.4. Outcome Classification Post-processing

  • The serial labels (‘abnormal’ or ‘normal’) per second are obtained by sliding window as well. If over 1/3 of the frames are predicted as ‘abnormal’, the recording is predicted as ‘abnormal’.
  • When a patient has at least one recording predicted ‘abnormal’, the strategy diagnoses the patient as ‘abnormal’; otherwise, ‘normal’.
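These two thresholds translate directly into a short decision rule (a sketch; names are illustrative):

```python
def classify_patient_outcome(recordings):
    """recordings: list of per-second label lists, one list per recording."""
    def recording_abnormal(frames):
        # A recording is 'abnormal' if over 1/3 of its frames are 'abnormal'
        return frames.count("abnormal") / len(frames) > 1 / 3
    # A patient is 'abnormal' if any recording is 'abnormal'
    return "abnormal" if any(recording_abnormal(f) for f in recordings) else "normal"
```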

3. Results

  • Using 5-fold cross-validation, HMS-Net achieved an average murmur classification accuracy of 91.37% (best 92.85%) on segments.
  • It achieved an average murmur classification accuracy of 83.78% on patients and an average weighted score of 0.81.
  • The weighted murmur accuracy was 0.853.
  • Regarding patient outcome classification, the proposed method achieved an average accuracy of 56.83% (best 62.96%) and an average outcome challenge score of 9808 (best 9242; lower is better).

For the challenge, on the blind validation set, the algorithm achieved a murmur weighted accuracy of 0.806 and an outcome challenge score of 9120; on the blind test set, the scores were 0.776 and 12069, respectively.
