Brief Review — A guide to deep learning in healthcare

DL in Healthcare Overview

Sik-Ho Tsang
Jan 9, 2024
Figure: Deep learning (DL) model that can accept multi-modal inputs

A guide to deep learning in healthcare (DL in Healthcare), by Stanford University and Google Research, 2019 Nature Medicine, Over 2500 Citations (Sik-Ho Tsang @ Medium)

  • Deep-learning techniques are presented for healthcare, centering the discussion on deep learning in computer vision, natural language processing, reinforcement learning, and generalized methods.


  1. Computer Vision (CV) for Medical Imaging
  2. Natural Language Processing (NLP) for Unstructured Data, e.g.: Electronic Health Records (EHRs)
  3. Reinforcement Learning (RL) for Surgical Tasks
  4. Generalized Deep Learning (DL) for Genomics

1. Computer Vision for Medical Imaging

Figure: Computer Vision for Medical Imaging

1.1. Usages

  • Image-level diagnostics have been quite successful at employing CNN-based methods, as above.
  • Remarkably, deep-learning models have achieved physician-level accuracy on a broad variety of diagnostic tasks, including distinguishing moles from melanomas [9,10]; detecting diabetic retinopathy, cardiovascular risk, and referrals from fundus [15,16] and optical coherence tomography (OCT) [17] images of the eye; breast lesion detection in mammograms [13]; and spinal analysis with magnetic resonance imaging [23].
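
Comparisons between a model and physicians on a binary diagnostic task are commonly summarized with sensitivity and specificity. The following is a minimal sketch of that computation, not the evaluation protocol of any cited study; the function name and the toy labels are hypothetical and made up for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = disease present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: ground truth, model calls, and one physician's calls.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
model = [1, 1, 0, 0, 0, 1, 0, 1]
physician = [1, 0, 1, 0, 0, 0, 0, 1]

print("model:    ", sensitivity_specificity(labels, model))
print("physician:", sensitivity_specificity(labels, physician))
```

Reporting both numbers matters because a reader can trade one for the other by moving the decision threshold, so a single accuracy figure can hide the difference between the two readers.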

1.2. Challenges

However, a key limitation across studies that compare human to algorithmic performance has been a lack of clinical context — they constrain the diagnosis to be performed using just the images at hand.

  • This often increases the difficulty of the diagnostic task for the human reader, who in real-world clinical settings has access to both the medical imagery and supplemental data, including the patient history and health record, additional tests, patient testimony, etc.

Another limitation to building a supervised deep-learning system for a new medical imaging task is access to a sufficiently large, labeled dataset.

2. Natural Language Processing (NLP) for Unstructured Data, e.g.: Electronic Health Records (EHRs)

Figure: Natural Language Processing (NLP) on Electronic Health Records (EHRs)

2.1. Usages

  • Recent uses of deep learning model the temporal sequence of structured events in a patient’s record with convolutional and recurrent neural networks in order to predict future medical incidents [35–38]. Much of this work focuses on the Medical Information Mart for Intensive Care (MIMIC) dataset [39] (e.g., for the prediction of sepsis [40]), which contains intensive care unit (ICU) patients from a single center. It is still uncertain how well techniques derived from this data will generalize to broader populations.
  • The next generation of automatic speech recognition [32] and information extraction models will likely enable clinical voice assistants that accurately transcribe patient visits.
  • Doctors easily spend 6 hours of an 11-hour workday on documentation in the EHR, which leads to burnout and reduces time with patients [31]. Automated transcription could alleviate this and make scribing services more affordable.
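
Before a recurrent or convolutional network can model a patient's record, the structured events have to be turned into a fixed-length sequence of integer IDs. The sketch below shows one plausible way to do that; the event codes, the `<pad>`/`<unk>` tokens, and the function names are assumptions made for illustration and do not reflect the MIMIC schema or the cited papers' preprocessing.

```python
from datetime import date

# Hypothetical structured events for one patient: (date, event code).
events = [
    (date(2020, 3, 2), "LAB:lactate_high"),
    (date(2020, 3, 1), "DX:pneumonia"),
    (date(2020, 3, 2), "MED:antibiotic"),
]


def build_vocab(patients):
    """Map every event code seen in the cohort to an integer ID."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for patient_events in patients:
        for _, code in patient_events:
            vocab.setdefault(code, len(vocab))
    return vocab


def encode(patient_events, vocab, max_len):
    """Sort events by date, map codes to IDs, keep the most recent max_len, left-pad."""
    ordered = sorted(patient_events, key=lambda e: e[0])  # stable sort by date
    ids = [vocab.get(code, vocab["<unk>"]) for _, code in ordered]
    ids = ids[-max_len:]
    return [vocab["<pad>"]] * (max_len - len(ids)) + ids


vocab = build_vocab([events])
print(encode(events, vocab, max_len=5))
```

Left-padding keeps the most recent events at the end of the sequence, which is a common choice when the final recurrent state is used for prediction.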

2.2. Challenges

The key challenge lies in classifying the attributes and status of each medical entity from the conversation while accurately summarizing the dialogue.

3. Reinforcement Learning (RL) for Surgical Tasks

3.1. Usage

  • One healthcare domain that can benefit from deep RL is robotic-assisted surgery (RAS). Deep learning can enhance the robustness and adaptability of RAS by using computer vision models (e.g., CNNs) to perceive surgical environments and RL methods to learn from a surgeon’s physical motions [41,42].
  • These techniques support the automation and speed of highly repetitive and time-sensitive surgical tasks, such as suturing and knot-tying [7].
  • Computer vision techniques (e.g., CNNs for object detection/segmentation and stereovision) can reconstruct the landscape of an open wound from image data, and a suturing or knot-tying trajectory can be generated by solving a path optimization problem.
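
To make "solving a path optimization problem" concrete, here is a toy sketch that smooths a 2D trajectory through noisy waypoints by gradient descent. It is only an illustration under strong assumptions: the cost function (distance to waypoints plus a smoothness penalty), the weight `lam`, and the waypoints themselves are hypothetical, and a real surgical planner would work in 3D with kinematic and tissue constraints.

```python
def path_cost(points, waypoints, lam):
    """Squared distance to the waypoints plus lam times squared segment lengths."""
    fit = sum((px - wx) ** 2 + (py - wy) ** 2
              for (px, py), (wx, wy) in zip(points, waypoints))
    smooth = sum((x1 - x0) ** 2 + (y1 - y0) ** 2
                 for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return fit + lam * smooth


def optimize_path(waypoints, lam=1.0, step=0.05, iters=500):
    """Gradient descent on path_cost with the two endpoints held fixed."""
    pts = [tuple(p) for p in waypoints]
    for _ in range(iters):
        new = list(pts)
        for i in range(1, len(pts) - 1):  # interior points only
            updated = []
            for k in range(2):  # x and y coordinates
                grad = (2 * (pts[i][k] - waypoints[i][k])
                        + 2 * lam * (2 * pts[i][k] - pts[i - 1][k] - pts[i + 1][k]))
                updated.append(pts[i][k] - step * grad)
            new[i] = tuple(updated)
        pts = new
    return pts


# Hypothetical noisy waypoints, e.g. points estimated along a wound edge.
waypoints = [(0.0, 0.0), (1.0, 0.8), (2.0, -0.6), (3.0, 0.9), (4.0, 0.0)]
path = optimize_path(waypoints)
print(path)
```

Raising `lam` yields a straighter path that follows the waypoints less closely; lowering it does the opposite.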

3.2. Challenges

These techniques are particularly advantageous for fully autonomous robotic surgery or minimally invasive surgery. In modern laparoscopic surgery (MLS), for example, one of the main challenges during semiautonomous teleoperation is correctly localizing an instrument’s position and orientation in the vicinity of surgical scenes.

Another challenge for the progression of deep learning in surgical robotics is data collection. Deep imitation learning requires large training datasets with many examples per surgical action. Given that many surgeries are nuanced and unique, it remains difficult to collect sufficient data for more general surgical tasks.

4. Generalized Deep Learning (DL) for Genomics

Figure: Generalized Deep Learning (DL) for Genomics

4.1. Usage

  • Modern genomic technologies collect a wide variety of measurements, from an individual’s DNA sequence to the quantity of various proteins in their blood, which ultimately help clinicians provide more accurate treatments and diagnoses.
  • The general DL process involves taking raw data (e.g., gene expression data), converting this raw data into input data tensors, and feeding these tensors through neural networks which then power specific biomedical applications, as above.
  • One set of opportunities centers on genome-wide association (GWA) studies — large case-control studies that seek to discover causal genetic mutations affecting specific traits.
  • Genomic data can directly serve as a biomarker for the onset and progression of disease. For example, blood contains small fragments of cell-free DNA released from cells present elsewhere in the body. These fragments are noninvasive indicators of organ rejection (i.e., the immune system attacking graft cells [57]), bacterial infection [58], and early-stage cancer [59].
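
The step of converting raw data into input tensors can be illustrated with DNA sequence. A common encoding is one-hot: each base becomes a length-4 vector. The sketch below is a minimal version using plain Python lists; the function name and the choice to map unknown bases such as `N` to all zeros are assumptions, not something the paper prescribes.

```python
BASES = "ACGT"


def one_hot_dna(seq):
    """Encode a DNA string as a len(seq) x 4 matrix (rows follow A, C, G, T order).

    Bases outside ACGT (e.g. N) are encoded as an all-zero row.
    """
    index = {base: i for i, base in enumerate(BASES)}
    matrix = []
    for base in seq.upper():
        row = [0.0] * len(BASES)
        if base in index:
            row[index[base]] = 1.0
        matrix.append(row)
    return matrix


for row in one_hot_dna("acgN"):
    print(row)
```

The resulting matrix can be passed to a deep-learning framework as a tensor of shape (sequence length, 4).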

4.2. Challenges

Analyzing GWA studies requires algorithms that scale to very large patient cohorts and that deal with latent confounders. These challenges can be addressed via optimization tools and techniques developed for deep learning, including stochastic optimization and other modern algorithms [47] combined with software frameworks for scaling computation in parallel [48], as well as through modeling techniques that handle unseen confounders [49].

In the near future, models may integrate external modalities and additional sources of biological data into GWA studies, e.g., medical images or measurements of splicing and other intermediary molecular phenotypes.

Biomarker data are often noisy and require sophisticated analysis.
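
Stochastic optimization scales to large cohorts because each update touches only a small batch of individuals. As a rough sketch of the idea, the code below fits a logistic regression of case/control status on genotypes with minibatch SGD. Everything here is an assumption for illustration: the cohort is simulated, the effect sizes are made up, and the model does not address latent confounders at all.

```python
import math
import random


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sgd_logistic(X, y, lr=0.1, epochs=20, batch_size=32, seed=0):
    """Minibatch SGD for logistic regression; returns (weights, bias)."""
    shuffler = random.Random(seed)
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    order = list(range(n))
    for _ in range(epochs):
        shuffler.shuffle(order)
        for start in range(0, n, batch_size):
            batch = order[start:start + batch_size]
            gw, gb = [0.0] * d, 0.0
            for i in batch:
                err = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b) - y[i]
                for j in range(d):
                    gw[j] += err * X[i][j]
                gb += err
            for j in range(d):
                w[j] -= lr * gw[j] / len(batch)
            b -= lr * gb / len(batch)
    return w, b


# Simulated toy cohort: variant 0 raises disease risk, variant 1 has no effect.
sim = random.Random(42)
X, y = [], []
for _ in range(2000):
    genotype = [sim.choice([0, 1, 2]), sim.choice([0, 1, 2])]  # minor-allele counts
    risk = sigmoid(1.5 * genotype[0] - 1.5)
    X.append(genotype)
    y.append(1 if sim.random() < risk else 0)

w, b = sgd_logistic(X, y)
print("estimated effects:", w, "bias:", b)
```

On this simulated data the weight for the first variant should come out clearly positive and the second near zero; with real cohorts, population structure and other confounders would have to be modeled before reading such weights as associations.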


