Response selection and automatic message-response expansion in retrieval-based QA systems using semantic dependency pair model

Ming Hsiang Su, Chung Hsien Wu, Kun Yi Huang, Wu Hsuan Lin

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

This article presents an approach to response selection and message-response (MR) database expansion from the unstructured data on the psychological consultation websites for a retrieval-based question answering (QA) system in a constrained domain for emotional support and comforting. First, we manually construct an initial MR database based on the articles collected from the psychological consultation websites. The Chinese Knowledge and Information Processing probabilistic context-free grammar is adopted to obtain the semantic dependency graphs (SDGs) of all the messages and responses in the initial MR database. For each sentence in the MR database, all the semantic dependencies, each composed of two words and their semantic relation, are extracted from the SDG of the sentence to form a semantic dependency set. Finally, a matrix with the element representing the correlation between the semantic dependencies of the messages and their corresponding responses is constructed as a semantic dependency pair model (SDPM) for response selection. Moreover, as the number of MR pairs in the psychological consultation websites is increasing day by day, the MR database in the QA system should be expanded to meet the needs of the users. For MR database expansion, the unstructured data from the message board are automatically collected. For the collected data, the supervised latent Dirichlet allocation is adopted for event detection and then the event-based delta Bayesian Information Criterion is used for message and response article segmentation. Each extracted message segment is then fed to the constructed retrieval-based QA system to find the best matched response segment and the matching score is also estimated to verify if the new MR pair is suitable to be included in the expanded MR database. Fivefold cross validation was employed to evaluate the performance of the proposed retrieval-based QA system over the expanded MR database based on SDPM. Compared to the vector space model-based method, the Okapi BM25 model, and the deep learning-based sequence-to-sequence with attention model, the proposed approach achieved a more favorable performance according to a statistical significance test. The retrieval accuracy based on MR expansion was also evaluated and a satisfactory result was obtained confirming the effectiveness of the expanded MR database. In addition, the user's satisfaction score of the proposed system was evaluated using the Cronbach's alpha value and the satisfaction score of the proposed SDPM was higher than those of the methods for comparison.
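
The abstract describes the SDPM only at a high level, so the following is a minimal illustrative sketch rather than the authors' implementation. It assumes each semantic dependency is a (head, relation, dependent) triple, uses hand-written toy triples instead of parser output, and stands in a simple co-occurrence estimate for the "correlation" matrix. The function names `build_pair_model`, `score`, and `select_response` and all example data are hypothetical.

```python
from collections import Counter, defaultdict

# Assumption: a semantic dependency is a (head, relation, dependent) triple.
# The triples below are hand-written toy data, not real parser output.

def build_pair_model(mr_pairs):
    """Estimate P(response dependency | message dependency) from co-occurrence.

    This normalized count is an assumed stand-in for the correlation matrix
    mentioned in the abstract, stored sparsely as nested dictionaries.
    """
    pair_counts = defaultdict(Counter)
    for msg_deps, resp_deps in mr_pairs:
        for m in set(msg_deps):
            for r in set(resp_deps):
                pair_counts[m][r] += 1
    model = {}
    for m, counts in pair_counts.items():
        total = sum(counts.values())
        model[m] = {r: c / total for r, c in counts.items()}
    return model

def score(model, msg_deps, resp_deps):
    """Average association between a message and one candidate response."""
    msg_deps, resp_deps = set(msg_deps), set(resp_deps)
    if not msg_deps or not resp_deps:
        return 0.0
    total = sum(model.get(m, {}).get(r, 0.0)
                for m in msg_deps for r in resp_deps)
    return total / (len(msg_deps) * len(resp_deps))

def select_response(model, msg_deps, candidates):
    """Return the index of the highest-scoring candidate response."""
    return max(range(len(candidates)),
               key=lambda i: score(model, msg_deps, candidates[i]))

# Toy message-response (MR) pairs.
M1 = [("feel", "experiencer", "I"), ("feel", "theme", "lonely")]
R1 = [("talk", "agent", "you"), ("talk", "goal", "friend")]
M2 = [("fail", "agent", "I"), ("fail", "theme", "exam")]
R2 = [("try", "agent", "you"), ("try", "time", "again")]

MODEL = build_pair_model([(M1, R1), (M2, R2)])
QUERY = [("feel", "theme", "lonely")]
BEST = select_response(MODEL, QUERY, [R1, R2])
```

In this toy setup the query shares a dependency only with the first message, so the first response is selected. A thresholded version of the same score could play the role of the matching score that the abstract uses to decide whether a new MR pair enters the expanded database, though the actual criterion is not given here.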

Original language: English
Article number: 3
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Volume: 18
Issue number: 1
DOIs: 10.1145/3229184
Publication status: Published - 2018 Nov 1

Fingerprint

Semantics
Websites
Context free grammars
Statistical tests
Vector spaces

All Science Journal Classification (ASJC) codes

  • Computer Science(all)

Cite this

@article{f69f92de6de847f19aecdfe0f8c78da2,
title = "Response selection and automatic message-response expansion in retrieval-based QA systems using semantic dependency pair model",
abstract = "This article presents an approach to response selection and message-response (MR) database expansion from the unstructured data on the psychological consultation websites for a retrieval-based question answering (QA) system in a constrained domain for emotional support and comforting. First, we manually construct an initial MR database based on the articles collected from the psychological consultation websites. The Chinese Knowledge and Information Processing probabilistic context-free grammar is adopted to obtain the semantic dependency graphs (SDGs) of all the messages and responses in the initial MR database. For each sentence in the MR database, all the semantic dependencies, each composed of two words and their semantic relation, are extracted from the SDG of the sentence to form a semantic dependency set. Finally, a matrix with the element representing the correlation between the semantic dependencies of the messages and their corresponding responses is constructed as a semantic dependency pair model (SDPM) for response selection. Moreover, as the number of MR pairs in the psychological consultation websites is increasing day by day, the MR database in the QA system should be expanded to meet the needs of the users. For MR database expansion, the unstructured data from the message board are automatically collected. For the collected data, the supervised latent Dirichlet allocation is adopted for event detection and then the event-based delta Bayesian Information Criterion is used for message and response article segmentation. Each extracted message segment is then fed to the constructed retrieval-based QA system to find the best matched response segment and the matching score is also estimated to verify if the new MR pair is suitable to be included in the expanded MR database. Fivefold cross validation was employed to evaluate the performance of the proposed retrieval-based QA system over the expanded MR database based on SDPM. 
Compared to the vector space model-based method, the Okapi BM25 model, and the deep learning-based sequence-to-sequence with attention model, the proposed approach achieved a more favorable performance according to a statistical significance test. The retrieval accuracy based on MR expansion was also evaluated and a satisfactory result was obtained confirming the effectiveness of the expanded MR database. In addition, the user's satisfaction score of the proposed system was evaluated using the Cronbach's alpha value and the satisfaction score of the proposed SDPM was higher than those of the methods for comparison.",
author = "Su, {Ming Hsiang} and Wu, {Chung Hsien} and Huang, {Kun Yi} and Lin, {Wu Hsuan}",
year = "2018",
month = "11",
day = "1",
doi = "10.1145/3229184",
language = "English",
volume = "18",
journal = "ACM Transactions on Asian and Low-Resource Language Information Processing",
issn = "2375-4699",
publisher = "Association for Computing Machinery (ACM)",
number = "1",
}

TY - JOUR

T1 - Response selection and automatic message-response expansion in retrieval-based QA systems using semantic dependency pair model

AU - Su, Ming Hsiang

AU - Wu, Chung Hsien

AU - Huang, Kun Yi

AU - Lin, Wu Hsuan

PY - 2018/11/1

Y1 - 2018/11/1

N2 - This article presents an approach to response selection and message-response (MR) database expansion from the unstructured data on the psychological consultation websites for a retrieval-based question answering (QA) system in a constrained domain for emotional support and comforting. First, we manually construct an initial MR database based on the articles collected from the psychological consultation websites. The Chinese Knowledge and Information Processing probabilistic context-free grammar is adopted to obtain the semantic dependency graphs (SDGs) of all the messages and responses in the initial MR database. For each sentence in the MR database, all the semantic dependencies, each composed of two words and their semantic relation, are extracted from the SDG of the sentence to form a semantic dependency set. Finally, a matrix with the element representing the correlation between the semantic dependencies of the messages and their corresponding responses is constructed as a semantic dependency pair model (SDPM) for response selection. Moreover, as the number of MR pairs in the psychological consultation websites is increasing day by day, the MR database in the QA system should be expanded to meet the needs of the users. For MR database expansion, the unstructured data from the message board are automatically collected. For the collected data, the supervised latent Dirichlet allocation is adopted for event detection and then the event-based delta Bayesian Information Criterion is used for message and response article segmentation. Each extracted message segment is then fed to the constructed retrieval-based QA system to find the best matched response segment and the matching score is also estimated to verify if the new MR pair is suitable to be included in the expanded MR database. Fivefold cross validation was employed to evaluate the performance of the proposed retrieval-based QA system over the expanded MR database based on SDPM. Compared to the vector space model-based method, the Okapi BM25 model, and the deep learning-based sequence-to-sequence with attention model, the proposed approach achieved a more favorable performance according to a statistical significance test. The retrieval accuracy based on MR expansion was also evaluated and a satisfactory result was obtained confirming the effectiveness of the expanded MR database. In addition, the user's satisfaction score of the proposed system was evaluated using the Cronbach's alpha value and the satisfaction score of the proposed SDPM was higher than those of the methods for comparison.

UR - http://www.scopus.com/inward/record.url?scp=85056766752&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85056766752&partnerID=8YFLogxK

U2 - 10.1145/3229184

DO - 10.1145/3229184

M3 - Article

AN - SCOPUS:85056766752

VL - 18

JO - ACM Transactions on Asian and Low-Resource Language Information Processing

JF - ACM Transactions on Asian and Low-Resource Language Information Processing

SN - 2375-4699

IS - 1

M1 - 3

ER -