# Brief Review — Towards Domain Invariant Heart Sound Abnormality Detection using Learnable Filterbanks

## tConv-CNN

Towards Domain Invariant Heart Sound Abnormality Detection using Learnable Filterbanks, by Bangladesh University of Engineering and Technology (BUET), and Robert Bosch Research and Technology Center (RTC)

tConv-CNN, 2020 JBHI, Over 50 Citations (Sik-Ho Tsang @ Medium)

Heart Sound Classification…

2013 … 2023: [2LSTM+3FC, 3CONV+2FC] [NRC-Net] [Log-MelSpectrum + Modified VGGNet] [CNN+BiGRU] [CWT+MFCC+DWT+CNN+MLP] [LSTM U-Net (LU-Net)]

==== My Other Paper Readings Are Also Over Here ====

- A novel Convolutional Neural Network (CNN) layer is proposed, consisting of **time-convolutional (tConv) units that emulate Finite Impulse Response (FIR) filters**, for heart sound classification.
- The filter coefficients **can be updated via backpropagation** and be **stacked** in the front-end of the network as a learnable filterbank.

# Outline

1. **tConv Variants**
2. **tConv-CNN Model Architecture**
3. **Results**

# 1. tConv Variants

- To be brief, the authors argue that **CNNs are analogous to FIR filters**. Also, a **symmetry condition** holds for a causal generalized linear phase FIR filter.
- Thus, the proposed **tConv** units in the **front-end enable the pre-processing steps, e.g. spectral decomposition or filterbank analysis**, to be performed by the first layer of an end-to-end CNN.
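The analogy between a 1-D CNN layer and an FIR filter can be made concrete with a minimal numpy sketch (illustrative only; note that deep-learning "convolution" is actually cross-correlation, i.e. the same sliding dot product with the taps reversed):

```python
import numpy as np

# A 1-D CNN kernel applied to a signal is an FIR filter:
# y(n) = sum_k h(k) x(n - k) for learned taps h.
rng = np.random.default_rng(42)
x = rng.standard_normal(64)        # input signal
h = rng.standard_normal(5)         # "CNN kernel" = FIR taps

# FIR filtering = convolution of x with h
y = np.convolve(x, h, mode="valid")

# The same result as a sliding-window dot product with the taps
# reversed (a conv layer computes this window product directly).
y2 = np.array([np.dot(h[::-1], x[i:i + 5]) for i in range(x.size - 4)])
assert np.allclose(y, y2)
```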

## 1.1. Linear Phase tConv (LP-tConv)

**Linear Phase tConv** is proposed, which can have **4 types**, as above.

- The LP-tConv is implemented as CNN kernels with **weight sharing between the coefficients on the two sides of the symmetry**. This results in **half of the kernel weights** being learned and shared.
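The weight-sharing idea can be sketched as follows: only half of the taps are free parameters, and the other half is a mirrored copy, which guarantees a symmetric (generalized linear phase) kernel. This is a hypothetical illustration of an even-length (Type II) kernel, not the authors' exact implementation:

```python
import numpy as np

def make_linear_phase_kernel(half_weights):
    """Build an even-length symmetric FIR kernel from its free half."""
    w = np.asarray(half_weights, dtype=float)
    return np.concatenate([w, w[::-1]])   # enforces h(n) = h(M - 1 - n)

rng = np.random.default_rng(0)
half = rng.standard_normal(8)             # 8 learnable coefficients
h = make_linear_phase_kernel(half)        # 16-tap symmetric kernel

# Symmetry => generalized linear phase
assert np.allclose(h, h[::-1])

x = rng.standard_normal(100)              # toy input signal
y = np.convolve(x, h, mode="same")        # the tConv forward pass
```

In a trainable setting, gradients for mirrored positions would simply be accumulated onto the shared half, which is what weight sharing means in practice.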

## 1.2. Zero Phase tConv (ZP-tConv)

- A zero phase tConv layer is proposed that has **no phase effect on the input signal.** If *x*(*n*) is the input signal, *h*(*n*) is the impulse response of the kernel, and *y*(*n*) is the output, we have in the frequency domain:

*Y*(*ω*) = *X*(*ω*) · *H*(*ω*) · *H**(*ω*) = *X*(*ω*) · |*H*(*ω*)|²

- The flip operation in the time domain is equivalent to taking the complex conjugate in the frequency domain.

- In the implementation of the **ZP-tConv** unit, **two consecutive convolution operations with the same kernel** are performed; **during the second convolution, the kernel is flipped** to equalize the phase response of the first convolution.
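The zero-phase property of this two-pass scheme can be verified numerically: convolving with *h* and then with the time-reversed *h* yields an effective response |*H*(*ω*)|², which is real and non-negative. A minimal sketch (illustration only, not the authors' exact code):

```python
import numpy as np

rng = np.random.default_rng(1)
h = rng.standard_normal(9)               # arbitrary learnable kernel

# Effective impulse response of the two consecutive convolutions:
h_eff = np.convolve(h, h[::-1])          # = autocorrelation of h

# Zero-pad, then circularly shift so the symmetric h_eff is centred
# at n = 0; this removes the bookkeeping delay of linear indexing.
padded = np.zeros(256)
padded[:h_eff.size] = h_eff
Hc = np.fft.fft(np.roll(padded, -(h.size - 1)))

assert np.allclose(Hc.imag, 0.0, atol=1e-8)   # zero phase
assert np.all(Hc.real >= -1e-8)               # response is |H(w)|^2 >= 0
```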

## 1.3. Gammatone tConv

- The gammatone auditory filterbank is implemented in practice as **a series of parallel band-pass filters.** It models the tuning frequency at different points of the human basilar membrane [28]. A **tConv** unit is proposed that **approximates a gammatone function.**
- The gammatone impulse response is given by:

*g*(*n*) = *α* *n*^(*η*−1) e^(−2π*βn*) cos(2π*fn* + *φ*)

- where *g*(*n*), *α*, *η*, *β*, *f* and *φ* denote the *n*-th gammatone coefficient, amplitude, filter order, bandwidth, center frequency and phase of the gammatone wavelet (in radians), respectively.
- With *φ* set to 0, a gammatone tConv has **only 4 learnable parameters (*α*, *η*, *β*, *f*)**, which can be learnt by backpropagation.
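A gammatone kernel with *φ* = 0 can be generated directly from the four parameters. The numeric values below (amplitude, order, bandwidth, center frequency, tap count) are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def gammatone_kernel(alpha, eta, beta, f, fs=1000.0, taps=41):
    """Gammatone impulse response with phase fixed to 0:
    g(n) = alpha * n^(eta-1) * exp(-2*pi*beta*n) * cos(2*pi*f*n)."""
    n = np.arange(1, taps + 1) / fs     # sample times in seconds (1 kHz PCG)
    return (alpha * n ** (eta - 1)
            * np.exp(-2 * np.pi * beta * n)
            * np.cos(2 * np.pi * f * n))

# Only these 4 values would be updated by backpropagation.
h = gammatone_kernel(alpha=1.0, eta=4, beta=100.0, f=150.0)
```

Because the kernel is a closed-form function of (*α*, *η*, *β*, *f*), the gradient of the loss with respect to each parameter is obtained by the chain rule through this expression.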

# 2. tConv-CNN Model Architecture

## 2.1. Model Architecture

- **The frontend of the model is a learnable filterbank, built with four tConv units.**
- **Each of the spectral bands decomposed by the learnable filterbank is passed through a separate branch of the CNN architecture.**
- **Each branch** has **two convolutional layers** of kernel size 5, each followed by a **Rectified Linear Unit (ReLU)** activation and a **max-pooling** of 2. Activations are normalized for each training mini-batch prior to ReLU, and **Dropout** with a probability of 0.5 is applied.
- The outputs of the four branches are fed to an **MLP** network after being concatenated along the channels and flattened.
- Cardiac cycles are extracted from each PCG resampled to 1 kHz using the method presented in [29] and zero-padded to 2.5 s in length.
- **The posterior predictions** for all of the cardiac cycles are **fused for each recording.** **Cross-entropy loss** is optimized.
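As a rough shape walk-through of one branch on a 2.5 s cardiac cycle at 1 kHz (2500 samples): kernel size 5 and pooling of 2 come from the text, while 'valid' convolution and the per-branch channel count are assumptions for illustration:

```python
T = 2500                  # 2.5 s cardiac cycle sampled at 1 kHz
t = T
for _ in range(2):        # two conv layers per branch
    t = t - 5 + 1         # kernel size 5, 'valid' convolution (assumed)
    t = t // 2            # max-pooling of 2

n_branches, channels = 4, 8          # channels per branch: assumed
flat = n_branches * channels * t     # size after concat + flatten
print(t, flat)                       # 622 19904
```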

## 2.2. Domain Balanced Training

- Each mini-batch of size *B* is balanced with an equal number of classes from each PHSDB (PhysioNet) subset.
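The balanced sampling can be sketched as follows; subset names and recordings below are dummy placeholders, and for brevity the sketch balances only across subsets (the paper additionally balances classes within each share):

```python
import random

# Dummy PhysioNet-style subsets: 6 subsets of 20 recordings each.
subsets = {s: [f"{s}{i:04d}" for i in range(20)] for s in "abcdef"}
B = 12                                  # batch size, divisible by 6 subsets

def balanced_batch(subsets, B, rng):
    """Draw an equal share of recordings from every subset."""
    per = B // len(subsets)             # equal share per subset
    batch = []
    for recs in subsets.values():
        batch.extend(rng.sample(recs, per))
    rng.shuffle(batch)
    return batch

batch = balanced_batch(subsets, B, random.Random(0))
```

This guarantees every gradient step sees all recording domains equally often, which is the mechanism behind the "domain invariant" claim.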

# 3. Results

## 3.1. PHSDB (PhysioNet) Dataset

- The proposed methods show superior performance on all of the metrics compared to the baselines, with a **significant improvement in average subset-wise accuracy and Macc**.

- The proposed CNN with a learnable filterbank front-end with linear phase Type IV tConvs achieves relative improvements of **8% and 11.84% in Macc** compared to the Potes-CNN and Gabor-BoAW-SVC (Upsamp.) baselines, respectively.

## 3.2. HSS Dataset

- The proposed **Gammatone tConv-CNN** and **Type IV tConv-CNN** models provide the **best** performances in terms of Macc and F1 metrics, respectively.