Brief Review — Health information for all: do large language models bridge or widen the digital divide?

The Need to Enhance Low-Resource LLMs

Sik-Ho Tsang
4 min read · Nov 3, 2024

Health information for all: do large language models bridge or widen the digital divide?
Low Resource LLM on Health Info, by RMIT University, University of Melbourne, Oxford University Clinical Research Unit, Chinese University of Hong Kong, and National University of Singapore
2024 BMJ (Sik-Ho Tsang @ Medium)

Healthcare/Medical LLM Evaluation
2023
[GPT-4 in Radiology] [ChatGPT & GPT-4 on USMLE] 2024 [ChatGPT & GPT-4 on Dental Exam] [ChatGPT-3.5 on Radiation Oncology] [LLM on Clinical Text Summarization] [Extract COVID-19 Symptoms Using ChatGPT & GPT-4] [ChatGPT on Patients Medication] [AI Chatbot Study on Drug Info for Patients]
My Healthcare and Medical Related Paper Readings and Tutorials
==== My Other Paper Readings Are Also Over Here ====

  • (Due to project needs, my leader suggested reading this paper to see the impact of LLMs on low resource languages.)
  • Large language models (LLMs) like ChatGPT could have a role in narrowing the health information digital divide. But evidence indicates that LLMs might exacerbate the digital disparity in health information access in low and middle income countries.
  • Most LLMs perform poorly in low resource languages such as Vietnamese, resulting in the dissemination of inaccurate health information and posing potential public health risks.

Outline

  1. Hallucination in LLMs
  2. Vietnamese Case Study
  3. Six Pillars of Advancing AI Language Inclusivity

1. Hallucination in LLMs

1.1. LLM in Medical Communications

  • By providing access to health information around the clock, LLMs could reduce the workload of healthcare professionals, enabling them to focus on complex cases, improving services, and reducing costs.
  • But a major concern is information accuracy. AI hallucination, where AI generates plausible but factually incorrect information, poses the risk of disseminating misleading information.
  • These inaccuracies include not only factual errors but also inappropriate and stereotypical responses.

1.2. AI Hallucination in Low Resource LLMs

  • GPT-4 is reported to currently support over 80 languages. But data indicate that ChatGPT produces weaker representations for non-English languages than for English, owing to limited training data.
  • As shown above, most of the more than 7000 languages used today still lack adequate digital resources for model development. The eight most used languages online comprise 80% of all digital content and account for 62% of global gross domestic product (GDP).
  • There are multiple causes of LLM-generated misinformation, as shown above.
  • LLMs learn from training datasets of mixed quality, leading to biases and inaccuracies.
  • LLM design often emphasises linguistic fluency over factual accuracy.

AI hallucinations are more prevalent and more severe in low resource languages owing to the limited digital training content.

The training dataset for Meta’s Llama 2, for example, is predominantly English, accounting for 89.7% of the dataset, with Vietnamese making up 0.08%.
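As a rough illustration (the article cites only the percentages; the calculation below is not from the paper), the imbalance implied by those two figures can be computed directly:

```python
# Rough illustration using the language shares cited above for Llama 2's
# pretraining data; only the ratio is derived, no other figures are assumed.
english_share = 0.897      # 89.7% of the corpus is English
vietnamese_share = 0.0008  # 0.08% of the corpus is Vietnamese

ratio = english_share / vietnamese_share
print(f"English text outweighs Vietnamese by roughly {ratio:,.0f}x")
# -> English text outweighs Vietnamese by roughly 1,121x
```

In other words, for every Vietnamese token seen during pretraining, the model sees on the order of a thousand English tokens.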

  • LLMs trained on insufficient data are more prone to AI hallucination.
  • The quality of training data for low resource languages is often poorer, containing more inaccuracies and biases.
  • Another challenge is the lack of thorough digital documentation of nuanced medical terminology and the latest practices in these languages.

2. Vietnamese Case Study

2.1. Challenges in Achieving Equitable Access to Health Information

Vietnamese responses generated by GPT-3.5 often lack linguistic fidelity and informational accuracy, leading many bilingual Vietnamese people to switch to English when using these models.

  • But this approach is limited to those proficient in English, who tend to be more economically advantaged.

The more advanced and accurate GPT-4 requires a $20 (£15; €18) monthly subscription to access. With Vietnam’s average monthly income below $300, that is around 7% of a typical month’s earnings, which is prohibitively expensive for most people and prevents less economically advantaged people from benefiting from LLM technology.

  • Eight inquiries related to cardiological health were made through three LLMs: GPT-3.5, GPT-4 (both through ChatGPT), and Gemini Pro (through Gemini); a minimal sketch of how such queries could be scripted follows this list.
  • A striking error was identified with prompts related to atrial fibrillation, where GPT-3.5 incorrectly identified atrial fibrillation as Parkinson’s disease.
  • Other major inaccuracies included misclassifying dementia as a complication of atrial fibrillation.
  • Other problems included incorrect information, terminology inaccuracies, irrelevant or broken links, and unsubstantiated claims.
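The article does not publish its query scripts, so the following is only a minimal sketch of how the same Vietnamese cardiology question might be sent to GPT-3.5 and GPT-4 via the OpenAI Python SDK; the prompt wording and the `ask` helper are illustrative assumptions, and querying Gemini Pro would need Google's separate SDK.

```python
# Minimal sketch (not the authors' code): pose one Vietnamese cardiology
# question to GPT-3.5 and GPT-4 and print the answers for manual review.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative prompt, roughly: "What is atrial fibrillation and what are its complications?"
prompt = "Rung nhĩ là gì và những biến chứng của nó là gì?"

def ask(model: str, question: str) -> str:
    """Return one chat completion from the given model."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

for model in ("gpt-3.5-turbo", "gpt-4"):
    print(f"--- {model} ---")
    print(ask(model, prompt))
```

Responses gathered this way would still need review by bilingual clinicians, which is exactly the step at which the case study found the inaccuracies described above.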

Therefore, it can be said that for speakers of low resource languages, inaccuracies and misinformation are prevalent, and the digital divide is further widened by the lack of quality medical information in their native languages.

3. Six Pillars of Advancing AI Language Inclusivity

Multisectoral initiatives from policy makers, research funding agencies, big technology corporations, research communities, healthcare practitioners, and linguistically under-represented communities are crucial to improving AI language inclusivity.

  1. The United Nations has released a report on governing AI for humanity, which highlights the principle of global inclusivity and equitable access to AI tools.
  2. The European Union has introduced the EU AI Act.
  3. Funding agencies are pivotal in expanding support for AI language inclusivity.
  4. Big technology corporations also have a role in fostering technology for language inclusivity.
  5. The research community should lead initiatives in open source linguistic data, models, and tools.
  6. Healthcare practitioners also have a vital role in providing critical feedback for model development.


Written by Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.