Brief Review — Dual Bayesian ResNet: A Deep Learning Approach to Heart Murmur Detection
Dual Bayesian ResNet (DBRes)
Dual Bayesian ResNet: A Deep Learning Approach to Heart Murmur Detection
DBRes, by University of Oxford, University of Surrey
2022 CinC, Over 10 Citations (Sik-Ho Tsang @ Medium)
Phonocardiogram (PCG) / Heart Sound Classification
- Two models are designed and implemented.
- The first model is a Dual Bayesian ResNet (DBRes), where each patient’s heart sound recording is segmented into overlapping log mel spectrograms. These spectrograms undergo two binary classifications: present versus unknown or absent, and unknown versus present or absent. These classifications are aggregated to give a patient’s final classification.
- The second model is the output of DBRes integrated with demographic data and signal features using XGBoost.
Outline
- 2022 George B. Moody PhysioNet Challenge Dataset
- Dual Bayesian ResNet (DBRes)
- Results
1. 2022 George B. Moody PhysioNet Challenge Dataset
1.1. Dataset
The dataset is the CirCor dataset, which contains heart sound recordings of length 5 to 45 seconds, together with demographic data, which consists of age categories, sex, height, weight, and pregnancy status.
- There are 1568 patients in the dataset, of which 60% (942) were given to the participants for training. For each patient, there were up to six heart sound recordings available, with 5272 recordings in the full dataset and 3163 in the training set.
- Each patient was recorded at one or more locations.
- Each patient has a heart murmur label, which can be present, unknown, or absent.
The murmur challenge score used to evaluate classifiers was the weighted accuracy.
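As a rough illustration of a weighted accuracy, the sketch below counts every patient with the weight of their true class, so a mistake on a rare, clinically important class costs more than a mistake on a common one. The default weights of 5, 3 and 1 for present, unknown and absent are an assumption recalled from the public Challenge scoring description; they are not stated in this review, and the function and variable names are illustrative.

```python
# Class order used throughout this sketch.
CLASSES = ("Present", "Unknown", "Absent")


def weighted_accuracy(y_true, y_pred, weights=(5, 3, 1)):
    """Weighted accuracy where each patient counts with the weight of the TRUE class.

    The default weights are an assumption (see the paragraph above), so they
    are exposed as a parameter rather than hard-coded.
    """
    weight_of = dict(zip(CLASSES, weights))
    # Numerator: total weight of correctly classified patients.
    correct = sum(weight_of[t] for t, p in zip(y_true, y_pred) if t == p)
    # Denominator: total weight of all patients.
    total = sum(weight_of[t] for t in y_true)
    return correct / total if total else float("nan")


y_true = ["Present", "Unknown", "Absent", "Absent"]
y_pred = ["Present", "Absent", "Absent", "Present"]
score = weighted_accuracy(y_true, y_pred)
```

With uniform weights `(1, 1, 1)` the same function reduces to the ordinary accuracy, which makes the two metrics easy to compare on the same predictions.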
1.2. Data Preparation
The heart sound audio is transformed into log mel spectrograms. The mel scale aims to preserve the distances humans perceive between pitches. The spectrograms’ minimum and maximum frequencies were 10 and 2000 Hz, respectively.
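To make the mel scale concrete, here is a minimal sketch that converts between Hz and mel and places band edges evenly on the mel scale between 10 and 2000 Hz. It assumes the common HTK-style formula (other variants exist), and the number of bands `n_mels=8` is a hypothetical value chosen only for illustration; the review does not state which formula or band count was used.

```python
import math


def hz_to_mel(f_hz):
    # HTK-style mel formula (an assumption; e.g. the Slaney variant differs).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min=10.0, f_max=2000.0, n_mels=8):
    """Band edges in Hz, equally spaced on the mel scale.

    n_mels is hypothetical; n_mels bands need n_mels + 2 edges.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_mels + 2)]


edges = mel_band_edges()
# Gaps between neighbouring edges, in Hz.
widths = [b - a for a, b in zip(edges, edges[1:])]
```

The edges are evenly spaced in mel but the gaps in Hz grow with frequency, which is how the mel scale gives low frequencies finer resolution than high ones.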
Signal features are also extracted. These include summary features in the time and frequency domains, as well as summary features of the spectral centroid, rolloff and bandwidth.
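The spectral descriptors named above can be sketched as follows. This is an illustration under assumptions, not the authors' feature code: it takes precomputed magnitude spectra (one list per frame), uses a hypothetical rolloff fraction of 0.85, and summarises each descriptor with a mean and standard deviation across frames.

```python
import math
import statistics


def frame_descriptors(freqs_hz, magnitudes, rolloff_fraction=0.85):
    """Spectral centroid, bandwidth and rolloff of ONE magnitude spectrum."""
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("spectrum has no energy")
    # Centroid: magnitude-weighted mean frequency.
    centroid = sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total
    # Bandwidth: magnitude-weighted standard deviation around the centroid.
    bandwidth = math.sqrt(
        sum(m * (f - centroid) ** 2 for f, m in zip(freqs_hz, magnitudes)) / total
    )
    # Rolloff: lowest frequency below which rolloff_fraction of the magnitude lies.
    cumulative, rolloff = 0.0, freqs_hz[-1]
    for f, m in zip(freqs_hz, magnitudes):
        cumulative += m
        if cumulative >= rolloff_fraction * total:
            rolloff = f
            break
    return centroid, bandwidth, rolloff


def summarise(frames, freqs_hz):
    """Mean and standard deviation of each descriptor across frames."""
    per_frame = [frame_descriptors(freqs_hz, m) for m in frames]
    summary = {}
    for i, name in enumerate(("centroid", "bandwidth", "rolloff")):
        values = [d[i] for d in per_frame]
        summary[name + "_mean"] = statistics.fmean(values)
        summary[name + "_std"] = statistics.pstdev(values)
    return summary


freqs = [100.0, 200.0, 300.0, 400.0]
frames = [[1, 1, 1, 1], [0, 0, 0, 4]]
features = summarise(frames, freqs)
```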
The demographic data was preprocessed using the example code provided by the organisers of the 2022 Challenge. This consisted of converting age labels into approximate ages in months, one-hot encoding the sex label, and converting the pregnancy status into a binary variable. Missing values were handled using mean imputation.
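The preprocessing steps just described can be sketched in a few lines. The age-to-months mapping, the label strings and the column order below are assumptions made for illustration (the values follow what the Challenge example code is believed to use), and `encode_patient` and `mean_impute` are hypothetical helper names.

```python
import math

# Approximate age in months per age category (assumed values and labels).
AGE_MONTHS = {
    "neonate": 0.5,
    "infant": 6.0,
    "child": 72.0,
    "adolescent": 180.0,
    "young adult": 240.0,
}


def encode_patient(age_label, sex, height, weight, pregnant):
    """Return [age_months, is_female, is_male, height, weight, is_pregnant]."""
    age = AGE_MONTHS.get(str(age_label).lower(), math.nan)
    # One-hot encoding of the sex label.
    sex_onehot = [1.0 if sex == "Female" else 0.0, 1.0 if sex == "Male" else 0.0]
    # Missing numeric values are marked as NaN and filled in later.
    h = math.nan if height is None else float(height)
    w = math.nan if weight is None else float(weight)
    return [age] + sex_onehot + [h, w, 1.0 if pregnant else 0.0]


def mean_impute(rows):
    """Replace NaNs in each column with the mean of that column's observed values."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(r)] for r in rows]


rows = [
    encode_patient("Child", "Female", 110.0, 20.0, False),
    encode_patient("nan", "Male", None, 30.0, False),  # unknown age and height
]
imputed = mean_impute(rows)
```

In practice the column means would be computed on the training set only and reused at test time, so that no test information leaks into the imputation.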
2. Dual Bayesian ResNet (DBRes)
2.1. Bayesian Neural Networks
- The core of the audio-based inference is performed using two Monte Carlo Dropout ResNet50 Bayesian neural networks.
In order to approximate the model posterior at test time, Dropout layers are added to the modules BasicBlock() and Bottleneck() as well as the overall model construction.
- The model was pre-trained on ImageNet and its layers were left trainable.
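The idea behind Monte Carlo Dropout can be shown with a toy model: dropout stays active at test time, the same input is passed through the network several times, and the mean and spread of the outputs serve as the prediction and its uncertainty. The sketch below uses a tiny logistic model in place of ResNet50, and the dropout rate, number of samples and all names are hypothetical.

```python
import math
import random
import statistics


def dropout(values, p, rng):
    """Inverted dropout: zero each value with probability p, rescale the rest."""
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def stochastic_forward(features, weights, bias, p, rng):
    """One forward pass of a toy logistic model with dropout left ON."""
    dropped = dropout(features, p, rng)
    logit = sum(w * x for w, x in zip(weights, dropped)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def mc_dropout_predict(features, weights, bias, p=0.5, n_samples=100, seed=0):
    """Average n_samples stochastic passes; return (mean, standard deviation)."""
    rng = random.Random(seed)
    samples = [
        stochastic_forward(features, weights, bias, p, rng) for _ in range(n_samples)
    ]
    return statistics.fmean(samples), statistics.pstdev(samples)


x = [1.0, 2.0, -1.0]
w = [0.5, -0.25, 1.0]
mean_prob, spread = mc_dropout_predict(x, w, bias=0.1)
```

With `p=0.0` every pass is identical, so the spread collapses to zero and the mean equals the ordinary deterministic prediction.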
2.2. Dual Bayesian ResNet (DBRes) and XGBoost Integration
- There are three major components to the models: classifying the individual spectrograms, aggregating these classifications, and integrating the demographic data and signal features via XGBoost.
2.2.1. The First Model: DBRes
The ternary murmur classification is split into two binary classifications: present versus unknown or absent; and unknown versus present or absent.
- Separate Bayesian ResNet50 networks are trained on the individual spectrograms for each of these tasks. During testing, a patient’s individual spectrograms are simultaneously classified using both networks.
- The outputs of the present versus unknown or absent ResNet50 are averaged (arithmetic mean) over the patient’s spectrograms. If this averaged output classifies the patient’s murmur as present, it is classified as present. If not, then the arithmetic mean of the output from the unknown versus present or absent ResNet50 is taken.
- If this second averaged output classifies the patient’s murmur as unknown, it is classified as unknown; otherwise the patient’s murmur is classified as absent.
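The aggregation rule above amounts to a two-stage decision, sketched here. The inputs are assumed to be per-spectrogram probabilities from the two networks, and the decision threshold of 0.5 is a hypothetical value, since the review does not state how the averaged output is turned into a class.

```python
from statistics import fmean


def classify_patient(present_probs, unknown_probs, threshold=0.5):
    """Aggregate per-spectrogram outputs into one murmur label per patient.

    present_probs: outputs of the present-vs-(unknown or absent) network.
    unknown_probs: outputs of the unknown-vs-(present or absent) network.
    threshold is an assumed decision boundary.
    """
    # Stage 1: the present-vs-rest network decides first.
    if fmean(present_probs) >= threshold:
        return "Present"
    # Stage 2: only consulted when stage 1 did not say "Present".
    if fmean(unknown_probs) >= threshold:
        return "Unknown"
    return "Absent"


label_a = classify_patient([0.9, 0.7, 0.4], [0.1, 0.2, 0.1])
label_b = classify_patient([0.2, 0.3], [0.8, 0.6])
label_c = classify_patient([0.2, 0.3], [0.1, 0.4])
```

Because the present check runs first, a patient whose two averages both exceed the threshold is labelled present, which matches the order of the steps described above.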
2.2.2. The Second Model: DBRes + XGBoost
- The second model integrates the output from the DBRes with the patient’s demographic data and extracted signal features using XGBoost.
3. Results
- The data come mainly from children and are highly unbalanced, with murmurs being absent in 74% of patients, present in 19% of patients, and unknown in 7% of patients.
Splitting the task into two binary classifications (present versus unknown or absent, and unknown versus present or absent) helps to reduce the imbalance issue somewhat.
DBRes scored 0.768 when evaluating the murmur challenge score using the PhysioNet hidden test set.
The similarity between the murmur challenge scores on the local test set and the PhysioNet hidden test set demonstrates that the local test set was constructed in a sound way to promote model generalisation across datasets.
- The integration of demographic data and signal features improves the accuracy but decreases the weighted accuracy. This might be because the current implementation of the XGBoost integration is not optimised for the weighted accuracy.