Question answering from documents has become a popular research topic in recent years. Machine reading comprehension (MRC) is a core component of document-based question answering systems; its goal is to find the answer to a question in texts related to that question. To simulate the conditions under which MRC models are integrated into question answering systems, many works have adopted a multi-document reading comprehension setting. The task of multi-document reading comprehension is to find answers in a set of documents rather than in a single relevant paragraph that is known in advance. A common approach to multi-document reading comprehension is the pipeline approach, which first selects the paragraphs likely to contain the answer and then extracts the answer from the selected paragraphs. A problem with the pipeline approach is error propagation: mistakes made in the paragraph selection step make it hard to extract correct answers. We propose a reinforcement learning method to address the error propagation problem. Another challenge in applying MRC models to question answering systems is the lack of training data in the application domain. The gap between the domain of the training data and the domain of application prevents MRC models from predicting appropriate answers. To reduce the performance degradation of MRC models in the application domain, we propose two models for machine reading comprehension, the BERT Ranker and the BERT Reader. Based on them, we build a question answering system for the health knowledge domain. To verify our methods, we conduct experiments on the benchmark dataset DuReader and on a health knowledge machine reading comprehension dataset that we collected ourselves. The experimental results show that the BERT Reader can alleviate the performance degradation in the application domain and that our reinforcement learning method boosts the performance of the BERT Ranker.
Multi-Document Reading Comprehension Based on BERT and Reinforcement Learning – Building a Health Knowledge Question Answering System
何謙, 曹. (Author). 2020
Student thesis: Doctoral Thesis