Brief Review — COVID-QA: A Question Answering Dataset for COVID-19
RoBERTa on 2,019 COVID-19 Question/Answer Pairs From the COVID-QA Dataset
2 min read · May 7, 2024
COVID-QA: A Question Answering Dataset for COVID-19
COVID-QA, by deepset GmbH, Intel Corporation, and Lawrence Livermore National Laboratory
2020 ACL Workshop NLP-COVID19, Over 80 Citations (Sik-Ho Tsang @ Medium)
Medical/Clinical/Healthcare NLP/LLM
- COVID-QA is proposed, which is a Question Answering dataset consisting of 2,019 question/answer pairs annotated by volunteer biomedical experts on scientific articles related to COVID-19.
- RoBERTa-base is used for benchmarking.
Outline
- COVID-QA
- Benchmarking Results
1. COVID-QA
1.1. Dataset
- 147 scientific articles, mostly related to COVID-19, are selected from the CORD-19 collection (The White House Office of Science and Technology Policy, 2020; accessed May 9, 2020) to be annotated by 15 experts.
- The annotations are created in SQuAD-style fashion, where annotators mark text spans as answers and formulate the corresponding questions (a parsing sketch of this format follows after this list).
- COVID-QA differs from SQuAD in that answers come from much longer documents (6,118.5 vs. 153.2 tokens on average), the answers themselves are generally longer (13.9 vs. 3.2 words), and it contains no n-way annotated development or test sets.
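Because the annotations follow the SQuAD JSON layout, the released file can be inspected with plain Python. A minimal parsing sketch; the file name COVID-QA.json is an assumption based on the dataset's public release, not something stated in this review:

```python
import json

# File name is an assumption based on the COVID-QA dataset release.
with open("COVID-QA.json") as f:
    dataset = json.load(f)

# SQuAD-style layout: data -> paragraphs -> qas -> answers
for article in dataset["data"]:
    for paragraph in article["paragraphs"]:
        context = paragraph["context"]  # full scientific article text
        for qa in paragraph["qas"]:
            question = qa["question"]
            for answer in qa["answers"]:
                start = answer["answer_start"]  # character offset into context
                span = answer["text"]           # expert-marked answer span
                assert context[start:start + len(span)] == span
```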
1.2. Model
- RoBERTa-base is used for benchmarking: the baseline model is compared against the same model fine-tuned on COVID-QA.
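A minimal inference sketch with the Hugging Face transformers library; the model identifier is an assumption (deepset publishes a COVID-adapted RoBERTa-base QA checkpoint on the Hub), not necessarily the exact checkpoint from the paper:

```python
from transformers import pipeline

# Model id is an assumption: deepset hosts a COVID-adapted RoBERTa-base
# QA checkpoint on the Hugging Face Hub, which may differ from the paper's.
qa = pipeline("question-answering", model="deepset/roberta-base-squad2-covid")

long_article_text = "..."  # a CORD-19 scientific article used as context

result = qa(
    question="What is the incubation period of the virus?",
    context=long_article_text,
    max_seq_len=384,  # RoBERTa reads at most 512 tokens at once,
    doc_stride=128,   # so long documents are chunked with overlapping windows
)
print(result["answer"], result["score"])
```

The sliding-window chunking matters here: the average COVID-QA document (6,118.5 tokens) far exceeds RoBERTa's 512-token input limit, so each question is scored against overlapping passages of the article.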
2. Results
As shown above, fine-tuning the model on COVID-QA yields a significant improvement on both metrics (Exact Match and F1; both are sketched below), though the overall scores remain low compared to SQuAD.
- It is hypothesized that the low scores stem from more complex question/answer pairs over much longer documents, and from the lack of multiple annotations per question.
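The standard SQuAD metrics behind these numbers are Exact Match (EM) and token-level F1. A minimal sketch of both, which also illustrates why the long expert answers (13.9 words on average) depress EM in particular:

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace (SQuAD convention)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold: str) -> float:
    # 1.0 only if the normalized strings are identical
    return float(normalize(prediction) == normalize(gold))

def f1(prediction: str, gold: str) -> float:
    # Token-level overlap between prediction and gold answer
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# A long gold answer can still earn partial F1 credit while EM drops to 0:
print(exact_match("14 days", "about 14 days"))  # 0.0
print(f1("14 days", "about 14 days"))           # 0.8
```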