Sentence extraction with topic modeling for question–answer pair generation

Chung-Hsien Wu, Chao Hong Liu, Po Hsun Su

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

Automatic QA pair generation has recently become an essential technique for reducing human involvement in the construction of QA systems. In the era of big data, a huge amount of information is produced every day, so it is important for QA systems to respond to users with up-to-date information, e.g., to answer questions about recent blog posts. The major problem in building such systems is capturing relevant text sources for specific QA domains efficiently. In this study, topic modeling is used to determine efficiently whether an article belongs to the same topic as a specific domain of interest, e.g., the health domain used as the example in this paper. QA pairs are then generated from the selected articles using the proposed sentence extraction method. Experimental results show that the proposed method with topic modeling achieved a 7.3% improvement in the acceptance rate of the generated questions.
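
The abstract does not say which topic model or decision rule the authors use, so the following is only a minimal sketch of the article-filtering idea under stated assumptions. It assumes each article and the target domain are already represented as topic distributions (for example, produced by a topic model trained elsewhere), and it keeps an article when its Jensen-Shannon divergence from the domain distribution is below a threshold. The function names, the example distributions, and the threshold value are all hypothetical and are not taken from the paper.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same number of topics")
    # Mixture distribution used by the Jensen-Shannon definition.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; zero-probability terms contribute nothing.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def is_in_domain(article_topics, domain_topics, threshold=0.2):
    """Return True if the article's topic mix is close to the domain's (assumed rule)."""
    return js_divergence(article_topics, domain_topics) <= threshold


def select_articles(articles, domain_topics, threshold=0.2):
    """Keep the names of articles whose topic distribution matches the domain."""
    return [
        name
        for name, topics in articles.items()
        if is_in_domain(topics, domain_topics, threshold)
    ]


# Hypothetical three-topic distributions: (health, lifestyle, sports).
health_domain = [0.7, 0.2, 0.1]
articles = {
    "flu_post": [0.6, 0.3, 0.1],
    "football_post": [0.05, 0.15, 0.8],
}

selected = select_articles(articles, health_domain)
print(selected)
```

With these made-up numbers, only the article whose topic mix resembles the health domain passes the filter; sentence extraction and QA pair generation would then run on the selected articles only. In practice the threshold would have to be tuned on held-out data.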

Original language: English
Pages (from-to): 39-46
Number of pages: 8
Journal: Soft Computing
Volume: 19
Issue number: 1
DOIs: 10.1007/s00500-014-1386-6
Publication status: Published - 2014 Jan 1

Fingerprint

Blogs
Health
Modeling
Experimental Results
Big data

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Geometry and Topology

Cite this

Wu, Chung-Hsien ; Liu, Chao Hong ; Su, Po Hsun. / Sentence extraction with topic modeling for question–answer pair generation. In: Soft Computing. 2014 ; Vol. 19, No. 1. pp. 39-46.
@article{869e434edf914725ae42a3963f1ca463,
title = "Sentence extraction with topic modeling for question–answer pair generation",
abstract = "Recently, automatic QA pair generation has been an essential technique to reduce human involvement in the construction of QA systems. In a big data era, huge information is produced every day. Therefore, it is an important issue for QA systems to be able to respond to users with up-to-date information, e.g., to answer questions regarding recent posts on blogs. The major problem in building such systems is the efficiency to capture relevant text sources for specific QA domains. In this study, topic modeling is used as a means to help determine efficiently if an article is of the same topic as a specific domain of interest, e.g., health domain as exemplified in this paper. QA pairs are then generated from these selected articles using the proposed sentence extraction method. Experimental results show that, using the proposed method with topic modeling, a 7.3 {\%} acceptance rate improvement on the generated questions was achieved.",
author = "Chung-Hsien Wu and Liu, {Chao Hong} and Su, {Po Hsun}",
year = "2014",
month = "1",
day = "1",
doi = "10.1007/s00500-014-1386-6",
language = "English",
volume = "19",
pages = "39--46",
journal = "Soft Computing",
issn = "1432-7643",
publisher = "Springer Verlag",
number = "1",

}

TY - JOUR

T1 - Sentence extraction with topic modeling for question–answer pair generation

AU - Wu, Chung-Hsien

AU - Liu, Chao Hong

AU - Su, Po Hsun

PY - 2014/1/1

Y1 - 2014/1/1

AB - Recently, automatic QA pair generation has been an essential technique to reduce human involvement in the construction of QA systems. In a big data era, huge information is produced every day. Therefore, it is an important issue for QA systems to be able to respond to users with up-to-date information, e.g., to answer questions regarding recent posts on blogs. The major problem in building such systems is the efficiency to capture relevant text sources for specific QA domains. In this study, topic modeling is used as a means to help determine efficiently if an article is of the same topic as a specific domain of interest, e.g., health domain as exemplified in this paper. QA pairs are then generated from these selected articles using the proposed sentence extraction method. Experimental results show that, using the proposed method with topic modeling, a 7.3 % acceptance rate improvement on the generated questions was achieved.

UR - http://www.scopus.com/inward/record.url?scp=84921698960&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84921698960&partnerID=8YFLogxK

U2 - 10.1007/s00500-014-1386-6

DO - 10.1007/s00500-014-1386-6

M3 - Article

VL - 19

SP - 39

EP - 46

JO - Soft Computing

JF - Soft Computing

SN - 1432-7643

IS - 1

ER -