Brief Review — MedGPT: Medical Concept Prediction from Clinical Narratives
MedGPT, Predicting Future Events
4 min read · Oct 12, 2023
MedGPT, by King's College London and King's College Hospital NHS Foundation Trust,
2021 arXiv v1 (Sik-Ho Tsang @ Medium)
Medical LLM
2020 [BioBERT] [BEHRT] 2023 [Med-PaLM]
==== My Other Paper Readings Are Also Over Here ====
- Temporal modelling of a patient’s medical history, which takes into account the sequence of past events, can be used to predict future events such as a diagnosis of a new disorder or complication of a previous or existing disorder.
- MedGPT, a novel Transformer-based pipeline, is proposed that uses Named Entity Recognition (NER) and Linking tools (i.e. MedCAT) to structure and organize the free text portion of EHRs and anticipate a range of future medical events (initially disorders).
- (Be aware when searching for MedGPT on the Internet: many products and services are dubbed MedGPT nowadays.)
Outline
- MedGPT
- Datasets
- Results
1. MedGPT
1.1. Model
- MedGPT is built on top of GPT-2, which uses causal language modeling (CLM).
- Given a corpus of patients U = {u1, u2, u3, …}, where each patient is defined as a sequence of tokens ui = {w1, w2, w3, …} and each token is a medically relevant and temporally defined piece of patient data, the objective is the standard language modeling objective:
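Concretely, the standard GPT-style causal language modeling objective maximizes the log-likelihood of each token given its predecessors (here k is the context window and Θ the model parameters):

```latex
L(u_i) = \sum_{t} \log P\left( w_t \mid w_{t-k}, \ldots, w_{t-1}; \Theta \right)
```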
1.2. 8 Model Variants
- 8 different approaches are tried on top of the base GPT-2 model.
- 1) Memory Transformers [4]; 2) Residual Attention [7]; 3) ReZero [2]; 4) Talking Heads Attention [18]; 5) Sparse Transformers [23]; 6) Rotary embeddings [21]; 7) GLU [17]; and 8) Word2Vec word embedding initialization.
- (Please read the paper for the details.)
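Two of these extensions, rotary embeddings and GLU, end up in the final MedGPT configuration (see Section 3.1). Below is a minimal NumPy sketch of both, assuming the "rotate-half" dimension pairing common in rotary implementations and the original sigmoid-gated GLU; the function names and shapes are illustrative, not from the paper:

```python
import numpy as np

def rotary_embed(x, base=10000.0):
    # x: (seq_len, d) with d even. Rotate paired dimensions by a
    # position-dependent angle so attention scores become a function
    # of relative position (RoFormer-style rotary embedding).
    seq_len, d = x.shape
    half = d // 2
    inv_freq = base ** (-np.arange(half) / half)       # (half,)
    angles = np.outer(np.arange(seq_len), inv_freq)    # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

def glu(x, W, V, b, c):
    # Gated Linear Unit: a linear path elementwise-gated by a sigmoid.
    return (x @ W + b) * (1.0 / (1.0 + np.exp(-(x @ V + c))))
```

Because each dimension pair is rotated by a pure rotation, rotary embedding preserves per-token norms, and position 0 (angle 0) is left unchanged.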
2. Datasets
- Two EHR datasets were used: King’s College Hospital (KCH) NHS Foundation Trust, UK and MIMIC-III [10].
- No preprocessing or filtering was done on the MIMIC-III clinical notes; all 2,083,179 free-text documents were used directly.
- At KCH, a total of 18,436,789 documents was collected. After the filtering step, 13,084,498 documents remained.
- In brief, the Medical Concept Annotation Toolkit (MedCAT [11]) was used to extract disorder concepts from free text and link them to the SNOMED-CT concept database.
- (Please read the paper for the sophisticated disorder extraction.)
- The concepts were then grouped by patient and only the first occurrence of a concept was kept.
- Without any filtering, there were 1,121,218 patients at KCH and 42,339 in MIMIC-III. After removal of all disorders with frequency < 100 and all patients with < 5 tokens, 582,548 and 33,975 patients remained, respectively. The length of each sample/patient is limited to 50 tokens.
- The resulting dataset was then split into a train/test set with an 80/20 ratio. The train set was further split into a train/validation set with a 90/10 ratio.
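The dataset construction steps above (first occurrence per patient, frequency and length filtering, truncation to 50 tokens, then the 80/20 and 90/10 splits) can be sketched as follows; the record format and function names here are assumptions for illustration, not the paper's actual code:

```python
from collections import Counter
import random

def build_patient_sequences(records, min_freq=100, min_len=5, max_len=50):
    """records: iterable of (patient_id, timestamp, concept_id) tuples,
    e.g. SNOMED-CT disorder concepts extracted by an NER+L tool."""
    # Keep only the first occurrence of each concept per patient, in time order.
    per_patient = {}
    for pid, ts, concept in sorted(records, key=lambda r: r[1]):
        seq = per_patient.setdefault(pid, [])
        if concept not in seq:
            seq.append(concept)
    # Drop disorders occurring fewer than min_freq times overall,
    # then truncate each patient to max_len tokens.
    freq = Counter(c for seq in per_patient.values() for c in seq)
    per_patient = {
        pid: [c for c in seq if freq[c] >= min_freq][:max_len]
        for pid, seq in per_patient.items()
    }
    # Drop patients with fewer than min_len tokens.
    return {pid: s for pid, s in per_patient.items() if len(s) >= min_len}

def split(patients, seed=0):
    # 80/20 train/test split, then 90/10 train/validation on the train part.
    pids = sorted(patients)
    random.Random(seed).shuffle(pids)
    n_test = int(0.2 * len(pids))
    test, rest = pids[:n_test], pids[n_test:]
    n_val = int(0.1 * len(rest))
    return rest[n_val:], rest[:n_val], test  # train, val, test
```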
3. Results
3.1. Model Variants
- Finally, the MedGPT model, which consists of the GPT-2 base model with the GLU+Rotary extension, is tested on the two datasets, KCH and MIMIC-III.
3.2. Performance Comparison
- MedGPT outperforms the BoC SVM and LSTM baselines.
3.3. Qualitative Analysis
- Example 1: This is a simple binary task, which MedGPT performed well on, consistent with the medical literature.
- Example 2: The background (cerebral aneurysm) provided the contextual cue for the rarer diagnosis which MedGPT successfully discerned.
- Example 3 is used to test longer-range attention. MedGPT also successfully handled the necessary indirect inference.
- Example 4: Similar to Example 3, attention in the presence of distractors was tested by intermixing historical diseases.