Brief Review — OSASUD: A dataset of stroke unit recordings for the detection of Obstructive Sleep Apnea Syndrome

A Dataset for Obstructive Sleep Apnea (OSA)

Apr 21, 2025

OSASUD: A dataset of stroke unit recordings for the detection of Obstructive Sleep Apnea Syndrome
OSASUD, by Udine University Hospital and University of Udine
2022 Nature Scientific Data (Sik-Ho Tsang @ Medium)


  • Polysomnography (PSG) is a fundamental diagnostic method for detecting Obstructive Sleep Apnea Syndrome (OSAS). However, a trained physician must manually identify OSAS episodes in the PSG recordings, which is a time-consuming task.
  • A dataset, the Obstructive Sleep Apnea Stroke Unit Dataset (OSASUD), was collected at the stroke unit of the Udine University Hospital, Italy, to support research on OSA detection.

Outline

  1. OSASUD Dataset Collection
  2. OSASUD Dataset Characteristics

1. OSASUD Dataset Collection

OSASUD Dataset Collection Workflow

The dataset is composed of overnight recordings of 30 patients who were admitted to the stroke unit of the Udine University Hospital, Italy, from August 2019 to July 2020.

  • Each patient underwent simultaneous overnight PSG and vital signs recording. Recordings were performed during the first days after clinical onset (average 1.3 ± 1.1 days, range 0–5).
  • For each patient, recordings of multi-channel ECG and photoplethysmography (PPG) are reported, together with derived data including heart rate, oxygen saturation, pulsatility index, respiratory rate, and premature ventricular contractions.
  • The collected PSG data were then annotated by a trained sleep physician for the presence of apnea and hypopnea events, at one-second granularity.
  • The PSG data and annotations were then temporally aligned with and matched against the recorded vital signs.
  • The final dataset was assembled considering the physician’s annotations and a relevant subset of the collected data.
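
To make the idea of one-second annotation granularity concrete, here is a minimal sketch of how a list of scored events could be expanded into one label per second. The function name `events_to_per_second`, the `(onset_s, duration_s, label)` event format, and the toy events are all assumptions for illustration. The paper's actual export and processing pipeline may differ.

```python
from collections import Counter

import numpy as np


def events_to_per_second(events, n_seconds):
    """Expand (onset_s, duration_s, label) events into one label per second.

    Hypothetical helper: seconds not covered by any event are labelled 'NONE'.
    """
    labels = np.full(n_seconds, "NONE", dtype=object)
    for onset_s, duration_s, label in events:
        # Slice assignment marks every second from onset to onset + duration - 1.
        labels[onset_s:onset_s + duration_s] = label
    return labels


# Toy one-minute excerpt with two made-up events (not real data).
events = [
    (5, 12, "HYPOPNEA"),
    (30, 20, "APNEA-OBSTRUCTIVE"),
]
labels = events_to_per_second(events, n_seconds=60)
print(Counter(labels))
```

A per-second label vector like this is easy to join with vital signs sampled once per second, which is roughly what the alignment step needs.
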
OSASUD Dataset Characteristics
  • A level 3 PSG without video recording was performed using an Embletta MPR polysomnograph (Natus Medical Inc., Pleasanton, CA, USA), keeping track of the following channels: nasal airflow, blood oxygen saturation, snoring, body position, thoracic and abdominal movements, and ECG.
  • Vital signs were collected by means of a Mindray iMec15 monitor connected to a Mindray Benevision CMS II central monitoring system (Mindray Bio-Medical Electronics Co., Ltd., Shenzhen).
Partial recording with its annotations
  • Figure 2 shows a partial recording with its annotations, opened in Embla RemLogic.
  • Because the clocks of the two devices differ, the Embletta and Mindray recordings are temporally aligned.
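
The review does not describe how the alignment is computed, so the following is only a sketch of one common approach, not the authors' method: estimate the clock offset by cross-correlating a signal that both devices record (for example oxygen saturation), assuming both are sampled once per second. The function `estimate_offset_seconds` and the synthetic signals are hypothetical.

```python
import numpy as np


def estimate_offset_seconds(reference, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `reference`.

    A positive result means `other` is delayed relative to `reference`.
    """
    ref = (reference - reference.mean()) / reference.std()
    oth = (other - other.mean()) / other.std()
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = ref[: len(ref) - lag], oth[lag:]
        else:
            a, b = ref[-lag:], oth[: len(oth) + lag]
        n = min(len(a), len(b))
        # Mean product of the standardized overlapping samples.
        score = np.dot(a[:n], b[:n]) / n
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic example: the same smoothed noise seen by two devices,
# with the second device's clock running 17 seconds behind.
rng = np.random.default_rng(0)
base = np.convolve(rng.normal(size=700), np.ones(5) / 5, mode="same")
device_a = base[50:650]
device_b = base[33:633] + rng.normal(scale=0.05, size=600)

offset = estimate_offset_seconds(device_a, device_b, max_lag=60)
print("estimated offset (s):", offset)
```

Once an offset is known, the timestamps of one device can be shifted by that amount before the two recordings are matched.
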

2. OSASUD Dataset Characteristics

OSASUD Dataset Characteristics

The dataset OSASUD consists of a Pandas DataFrame with 18 columns and 961357 rows, saved in Pickle format.

  • Only a subset of the originally recorded data is included, as some signals are redundant.
  • There is a physician’s annotation that distinguishes between regular breathing behaviour (string ‘NONE’), hypopnea (string ‘HYPOPNEA’), and different kinds of apnea (strings ‘APNEA-CENTRAL’, ‘APNEA-OBSTRUCTIVE’, ‘APNEA-MIXED’).
  • There is a boolean attribute that coarsely distinguishes between regular and anomalous breathing behaviour (it equals True if and only if the annotation is not ‘NONE’).
  • A final validation of the dataset comes from the successful development of a deep learning model for OSAS event prediction based on the considered data, recently presented in the literature in [26]. (Hope I can read it later.)
  • (There are many details for the dataset collection, characteristics, and further processing. Please feel free to read the paper directly.)
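
As a rough illustration of the two annotation attributes described above, the sketch below builds a tiny stand-in DataFrame and derives the boolean flag from the string annotation. The column names `patient`, `event`, and `anomaly` are assumptions made for this example, so please check the real column names in the dataset. The real file would typically be loaded with `pd.read_pickle` on the downloaded Pickle file.

```python
import pandas as pd

# Toy stand-in for the real dataset; column names are hypothetical.
df = pd.DataFrame(
    {
        "patient": [1, 1, 1, 2, 2, 2],
        "event": [
            "NONE",
            "HYPOPNEA",
            "NONE",
            "APNEA-OBSTRUCTIVE",
            "APNEA-CENTRAL",
            "NONE",
        ],
    }
)

# The boolean attribute is True if and only if the annotation is not 'NONE'.
df["anomaly"] = df["event"] != "NONE"

# Fraction of annotated seconds with anomalous breathing, per patient.
per_patient = df.groupby("patient")["anomaly"].mean()
print(df["event"].value_counts())
print(per_patient)
```

With one row per second, the per-patient mean of the boolean column gives the share of recorded time spent in apnea or hypopnea, which is a quick sanity check before training any model.
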

Written by Sik-Ho Tsang