This paper addresses the task of answering consumer health questions about medications.
A gold standard corpus for Medication Question Answering is created using real consumer questions. The gold standard consists of 674 question-answer pairs with annotations of the question focus and type and the answer source.
Recurrent and convolutional neural networks are used in question type identification and focus recognition.
(Med-PaLM is evaluated on this dataset, where it is named “MedicationQA”.)
Outline
MedicationQA Dataset
Results
1. MedicationQA Dataset
Each question is manually annotated with a:
Question focus (always a Drug name in this dataset),
Question type (e.g. Dose, Interaction, Side effects).
The ground-truth answer is taken from the first source, in the following priority order, in which an answer is available:
MedlinePlus and DailyMed.
Other NIH or U.S. government websites.
Other trustworthy websites (e.g., the Mayo Clinic) or academic institutions’ websites.
Other websites returned by a Google search.
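The priority rule above can be sketched as a simple fallback lookup. This is only an illustration, not the authors' tooling: the tier labels, the `pick_answer` function, and the shape of the `candidates` dictionary are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

# Source tiers in the priority order listed above (labels are illustrative).
SOURCE_PRIORITY = [
    "MedlinePlus/DailyMed",
    "Other NIH or U.S. government website",
    "Other trustworthy or academic website",
    "Other website from a Google search",
]


def pick_answer(candidates: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (tier, answer) for the highest-priority tier that has an answer.

    `candidates` is a hypothetical mapping from tier label to the answer
    text found in that tier; tiers with no answer are missing or empty.
    """
    for tier in SOURCE_PRIORITY:
        answer = candidates.get(tier)
        if answer:  # skip tiers where nothing was found
            return tier, answer
    return None  # no tier produced an answer
```

A lower-priority source is only consulted when every tier above it comes back empty, which is how the fallback order in the list reads.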
The final gold standard contains 674 question-answer pairs with their associated annotations. These annotations cover 25 question types, listed with examples in Table 1.
The answer sources are summarized in Figure 4.
Table 2 shows the token-and-sentence-level statistics about the questions and the answers in the dataset.
2. Results
Focus Recognition: The Bi-LSTM-CRF network is trained on an 80% training split by minimizing the CRF-based loss function.
It obtains an F1 score of 74% for exact span matching and 90% for partial span matching in question focus recognition.
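To make the gap between the exact and partial scores concrete, here is a minimal sketch of span-level F1 under both criteria. It assumes that "partial" means a predicted span shares at least one token with a gold span in the same question; the paper's exact matching rule may differ, and the `Span` layout and function names are hypothetical.

```python
from typing import List, Tuple

# (question_id, start, end) with token offsets and an exclusive end.
Span = Tuple[int, int, int]


def overlaps(a: Span, b: Span) -> bool:
    """True if both spans belong to the same question and share a token."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def span_f1(gold: List[Span], pred: List[Span], partial: bool = False) -> float:
    """F1 over spans, using exact equality or (assumed) overlap as the match."""
    match = overlaps if partial else (lambda a, b: a == b)
    # A prediction is correct if it matches some gold span;
    # a gold span is found if some prediction matches it.
    pred_hits = sum(any(match(p, g) for g in gold) for p in pred)
    gold_hits = sum(any(match(p, g) for p in pred) for g in gold)
    precision = pred_hits / len(pred) if pred else 0.0
    recall = gold_hits / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: the second prediction is one token too long.
gold = [(0, 2, 4), (1, 0, 1)]
pred = [(0, 2, 4), (1, 0, 2)]
exact_f1 = span_f1(gold, pred)                  # 0.5
partial_f1 = span_f1(gold, pred, partial=True)  # 1.0
```

In the toy example, a boundary error of a single token counts as a miss under exact matching but as a hit under partial matching, which is the kind of case that separates the two reported scores.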
Question Type Identification: The CNN network is trained by minimizing the softmax-based loss function.
The CNN network achieved an average accuracy of 75.7% over 5 runs, with a variation in the [0, 2.5%] range.