Semantic segment extraction and matching for internet FAQ retrieval

Chung-Hsien Wu, Jui Feng Yeh, Yu Sheng Lai

Research output: Contribution to journal (Article)

25 Citations (Scopus)

Abstract

This investigation presents a novel approach to semantic segment extraction and matching for retrieving information from Internet FAQs with natural language queries. Two semantic segments, the question category segment (QS) and the keyword segment (KS), are extracted from the input queries and the FAQ questions with a semiautomatically derived question-semantic grammar. A semantic matching method is presented to estimate the similarity between the semantic segments of the query and the questions in the FAQ collection. Additionally, the vector space model (VSM) is adopted to measure the similarity between the query and the answers of the QA pairs. Finally, a multistage ranking strategy is adopted to determine the optimally performing combination of similarity metrics. The experimental results illustrate that the proposed method achieves an average rank of 4.52 and a top-10 recall rate of 90.89 percent. Compared with the query-expansion method, this method improves the performance by 4.82 places in the average rank of correct answers, 25.34 percent in the top-5 recall rate, and 5.21 percent in the top-10 recall rate.
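The abstract names a vector space model for query-to-answer similarity and reports results as average rank and top-N recall. The following is a minimal sketch of those generic ideas, not the authors' implementation. The TF-IDF weighting, the whitespace tokenizer, and all function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_answers`, `average_rank`, `top_k_recall`) are assumptions made for illustration; the abstract does not specify them. The QS/KS semantic matching and the multistage ranking strategy are not reproduced here.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumed weighting, chosen to avoid division by zero).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_answers(query, answers):
    """Return answer indices ordered by descending similarity to the query."""
    vectors, idf = tfidf_vectors([tokenize(a) for a in answers])
    # Query terms unseen in the answers get weight 0.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scores = [cosine(q, v) for v in vectors]
    return sorted(range(len(answers)), key=lambda i: scores[i], reverse=True)


def average_rank(ranks):
    """Mean 1-based rank of the correct answer over a set of queries."""
    return sum(ranks) / len(ranks)


def top_k_recall(ranks, k):
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)
```

Under this sketch, a lower average rank and a higher top-k recall both indicate better retrieval, which matches how the figures in the abstract are reported.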

Original language: English
Article number: 1637419
Pages (from-to): 930-940
Number of pages: 11
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 18
Issue number: 7
DOI: 10.1109/TKDE.2006.115
Publication status: Published - 2006 Jul 1

Fingerprint

Semantics
Internet
Query languages
Vector spaces

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics

Cite this

@article{5533a9a0a32a481884b0f8b9e712231f,
title = "Semantic segment extraction and matching for internet FAQ retrieval",
author = "Wu, Chung-Hsien and Yeh, {Jui Feng} and Lai, {Yu Sheng}",
year = "2006",
month = "7",
day = "1",
doi = "10.1109/TKDE.2006.115",
language = "English",
volume = "18",
pages = "930--940",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "7",

}

Semantic segment extraction and matching for internet FAQ retrieval. / Wu, Chung-Hsien; Yeh, Jui Feng; Lai, Yu Sheng.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 18, No. 7, 1637419, 01.07.2006, p. 930-940.

TY - JOUR

T1 - Semantic segment extraction and matching for internet FAQ retrieval

AU - Wu, Chung-Hsien

AU - Yeh, Jui Feng

AU - Lai, Yu Sheng

PY - 2006/7/1

Y1 - 2006/7/1

UR - http://www.scopus.com/inward/record.url?scp=33746643474&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33746643474&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2006.115

DO - 10.1109/TKDE.2006.115

M3 - Article

VL - 18

SP - 930

EP - 940

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 7

M1 - 1637419

ER -