Brief Review — MedicationQA: Bridging the Gap Between Consumers’ Medication Questions and Trusted Answers
MedicationQA: 674 Question-Answer Pairs
Oct 23, 2023
Bridging the Gap Between Consumers’ Medication Questions and Trusted Answers
MedicationQA, by National Library of Medicine
2019 SHTI (Sik-Ho Tsang @ Medium) Medical Dataset
- This paper addresses the task of answering consumer health questions about medications.
- A gold standard corpus for Medication Question Answering is created using real consumer questions. The gold standard consists of 674 question-answer pairs with annotations of the question focus and type and the answer source.
- Recurrent and convolutional neural networks are used in question type identification and focus recognition.
- (This is a dataset evaluated by Med-PaLM. In Med-PaLM, the dataset proposed by this paper is named “MedicationQA”.)
Outline
- MedicationQA Dataset
- Results
1. MedicationQA Dataset
- Each question is manually annotated with:
- Question focus (always a drug name in this dataset),
- Question type (e.g. Dose, Interaction, Side effects).
- The ground-truth answer is retrieved from the first of the following sources in which an answer is available:
- MedlinePlus and DailyMed.
- Other NIH or U.S. government websites.
- Other trustworthy websites (e.g., the Mayo Clinic) or academic institutions’ websites.
- Other websites returned by a Google search.
- The final gold standard contains 674 question-answer pairs with their associated annotations. These annotations include 25 question types, reported with examples in Table 1.
- The answer sources are summarized in Figure 4.
- Table 2 shows the token-and-sentence-level statistics about the questions and the answers in the dataset.
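The annotation scheme and the source priority described above can be sketched as a small data structure plus a fallback lookup. This is only an illustrative sketch: the class name `MedicationQAPair`, the function `pick_answer`, the tier labels, and the placeholder record are all assumptions made here, not the dataset's actual file format or the authors' tooling.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Source tiers in the priority order listed above.
# The label strings are illustrative, not official dataset values.
SOURCE_PRIORITY = [
    "MedlinePlus/DailyMed",
    "Other NIH or U.S. government website",
    "Other trustworthy or academic website",
    "Other website from a Google search",
]


@dataclass
class MedicationQAPair:
    """One annotated question-answer pair (hypothetical layout)."""
    question: str
    focus: str          # question focus: a drug name
    question_type: str  # e.g. "Dose", "Interaction", "Side effects"
    answer: str
    answer_source: str  # which tier the answer came from


def pick_answer(candidates: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (source, answer) from the highest-priority tier that has an answer."""
    for source in SOURCE_PRIORITY:
        answer = candidates.get(source)
        if answer:  # skip tiers with no answer (missing or empty)
            return source, answer
    return None


# Placeholder example; "DrugX" and the texts are made up for illustration.
found = pick_answer({
    "Other website from a Google search": "placeholder answer B",
    "Other NIH or U.S. government website": "placeholder answer A",
})
if found is not None:
    source, answer = found
    pair = MedicationQAPair(
        question="how much DrugX should I take?",
        focus="DrugX",
        question_type="Dose",
        answer=answer,
        answer_source=source,
    )
```

Here the NIH/government tier wins over the Google search tier because it comes earlier in the priority list, even though it appears later in the input dictionary.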
2. Results
- Focus Recognition: A Bi-LSTM-CRF network is trained on 80% of the data by minimizing a CRF-based loss function.
It obtains an F1 score of 74% for exact span matching and 90% for partial span matching in question focus recognition.
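To make the difference between exact and partial span matching concrete, here is a minimal sketch of one common way to score them. The paper's exact scoring script is not described in this review, so the definitions below (exact means identical boundaries, partial means any token overlap) and the function names `overlaps` and `span_f1` are assumptions.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    """True if the two spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def span_f1(gold: List[Span], pred: List[Span], partial: bool = False) -> float:
    """F1 over spans; exact boundary match by default, any overlap if partial."""
    match = overlaps if partial else (lambda a, b: a == b)
    # Predicted spans that match some gold span (numerator for precision).
    matched_pred = sum(any(match(p, g) for g in gold) for p in pred)
    # Gold spans that are matched by some prediction (numerator for recall).
    matched_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Gold focus covers tokens 3-4; the model predicts only token 4.
gold = [(3, 5)]
pred = [(4, 5)]
exact_score = span_f1(gold, pred)                  # 0.0: boundaries differ
partial_score = span_f1(gold, pred, partial=True)  # 1.0: the spans overlap
```

A prediction that captures only part of a multi-word drug name counts as a miss under exact matching but as a hit under partial matching, which is consistent with the partial score being the higher of the two reported numbers.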
- Question Type Identification: A CNN is trained by minimizing a softmax-based loss function.
The CNN achieves an average accuracy of 75.7% over 5 runs, with a variation in the [0, 2.5%] range.
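As a rough illustration of what a softmax-based loss looks like for question type classification, the sketch below computes softmax probabilities and the cross-entropy loss for one example. It assumes the loss is standard softmax cross-entropy, which this review does not confirm. The logits are made-up values over three classes rather than the dataset's 25 question types, and no CNN is modeled here.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[target])


# Hypothetical scores for three question types: Dose, Interaction, Side effects.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
loss_if_dose = cross_entropy(logits, target=0)          # smaller loss
loss_if_side_effects = cross_entropy(logits, target=2)  # larger loss
```

Minimizing this loss pushes the network to assign a higher score to the annotated question type than to the other types.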