Brief Review — COVID-QA: A Question Answering Dataset for COVID-19

RoBERTa on 2,019 COVID-19 Questions, COVID-QA Dataset

2 min read · May 7, 2024

COVID-QA: A Question Answering Dataset for COVID-19
COVID-QA, by deepset GmbH, Intel Corporation, and Lawrence Livermore National Laboratory
2020 ACL Workshop NLP-COVID19, Over 80 Citations (Sik-Ho Tsang @ Medium)

COVID-QA is a Question Answering dataset of 2,019 question/answer pairs, annotated by volunteer biomedical experts on scientific articles related to COVID-19.
RoBERTa-base is used to benchmark the dataset.

Outline

COVID-QA
Benchmarking Results

1. COVID-QA

1.1. Dataset

147 scientific articles, mostly related to COVID-19, are selected from the CORD-19 collection (The White House Office of Science and Technology Policy, 2020; accessed May 9, 2020) to be annotated by 15 experts.
The annotations are created in SQuAD-style fashion: annotators mark a span of text as the answer and then formulate the corresponding question.
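To make the SQuAD-style format concrete, here is a minimal sketch of what one annotated record could look like. The context, question, and id below are made up for illustration and are not taken from COVID-QA; the field names follow the public SQuAD JSON layout, and `span_is_consistent` is a hypothetical helper.

```python
# Hypothetical SQuAD-style record. The text is invented for illustration
# and does not come from the COVID-QA dataset.
record = {
    "context": "Coronaviruses are enveloped RNA viruses. "
               "They can infect humans and animals.",
    "qas": [
        {
            "id": "example-0",
            "question": "What kind of viruses are coronaviruses?",
            # The answer is a span of the context: its text plus the
            # character offset at which it starts.
            "answers": [{"text": "enveloped RNA viruses", "answer_start": 18}],
        }
    ],
}


def span_is_consistent(context, answer):
    """Return True if the answer text is found at the stated character offset."""
    start = answer["answer_start"]
    end = start + len(answer["text"])
    return context[start:end] == answer["text"]


for qa in record["qas"]:
    for answer in qa["answers"]:
        print(qa["question"], "->", answer["text"],
              "| span ok:", span_is_consistent(record["context"], answer))
```

A check like this is handy when loading span-annotated data, since an offset that does not match its text usually means the context was altered after annotation.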
COVID-QA differs from SQuAD in three ways: answers come from much longer texts (6118.5 vs. 153.2 tokens), answers are generally longer (13.9 vs. 3.2 words), and it contains neither n-way annotated development sets nor test sets.
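Length statistics of this kind can be approximated on any SQuAD-style file with a few lines of code. The sketch below assumes whitespace splitting as the tokenizer, which may differ from the tokenization behind the numbers above, and it runs on toy records rather than on the real dataset; `length_stats` is a hypothetical helper name.

```python
from statistics import mean


def length_stats(records):
    """Mean context length and mean answer length, in whitespace-separated tokens.

    Assumption: whitespace splitting stands in for whatever tokenizer
    the paper used, so absolute values are not directly comparable.
    """
    context_lengths = [len(r["context"].split()) for r in records]
    answer_lengths = [
        len(answer["text"].split())
        for r in records
        for qa in r["qas"]
        for answer in qa["answers"]
    ]
    return mean(context_lengths), mean(answer_lengths)


# Toy records for illustration only.
toy_records = [
    {"context": "a b c d",
     "qas": [{"answers": [{"text": "b c"}]}]},
    {"context": "e f g h i j",
     "qas": [{"answers": [{"text": "f"}]}, {"answers": [{"text": "g h i"}]}]},
]

avg_context, avg_answer = length_stats(toy_records)
print(avg_context, avg_answer)
```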

1.2. Model

Two variants of RoBERTa-base are compared: the baseline model and the same model finetuned on COVID-QA.

2. Results

In the reported results, finetuning the model on COVID-QA brings a significant improvement on both metrics, though the overall scores remain low compared to SQuAD.
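The review does not name the two metrics, so the sketch below assumes they are the usual SQuAD-style pair: exact match (EM) and token-level F1. The normalization steps (lowercasing, stripping punctuation and English articles) follow the common SQuAD evaluation convention; the function names and example strings are my own.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The spike protein", "spike protein."))
print(f1_score("spike protein of the virus", "spike protein"))
```

EM is all-or-nothing, so with long answers a prediction that is off by a single word earns no EM credit while still getting partial credit from F1.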

The authors hypothesize that the low scores stem from more complex question/answer pairs on much longer documents and from the lack of multiple annotations per question.
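Why would the lack of multiple annotations lower the scores? In SQuAD-style evaluation, a prediction is typically scored against every reference answer and the best score is kept, so each extra reference can only help. The toy sketch below illustrates this with a simplified token F1 (lowercasing and whitespace splitting only); the strings are invented and `best_f1` is a hypothetical helper.

```python
from collections import Counter


def token_f1(prediction, reference):
    """Simplified token-level F1: lowercase and split on whitespace."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction, references):
    """Score against every reference and keep the maximum."""
    return max(token_f1(prediction, ref) for ref in references)


prediction = "fever and dry cough"

# One annotator marked a slightly longer span than the model predicted.
single_reference = ["fever dry cough and fatigue"]
# A second, hypothetical annotator marked a shorter span.
multiple_references = single_reference + ["fever and dry cough"]

print(best_f1(prediction, single_reference))
print(best_f1(prediction, multiple_references))
```

With a single reference, a reasonable prediction is penalized for every boundary disagreement with that one annotator, which is consistent with the hypothesis above.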


Written by Sik-Ho Tsang