Applying VSM and LCS to develop an integrated text retrieval mechanism

Cheng Shiun Tasi, Yong Ming Zang, Chien Hung Liu, Yueh Min Huang

Research output: Article, peer-reviewed

9 Citations (Scopus)

Abstract

Text retrieval has received considerable attention in computer science. The most widely adopted similarity technique in the field uses a vector space model (VSM) to weight terms and the Cosine, Jaccard, or Dice coefficient to measure the similarity between the query and the texts. However, these similarity measures do not consider the order in which the information appears. In this paper, we propose an integrated text retrieval (ITR) mechanism that takes advantage of both VSM and the longest common subsequence (LCS) algorithm. The key idea of the ITR mechanism is to use LCS to re-evaluate the weight of terms, so that the sequence and weight relationships between the query and the texts can be considered simultaneously. Mathematical analysis shows that the ITR mechanism can increase the Jaccard and Dice similarity scores when a sequential relationship exists between the query and the texts.
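
The building blocks named in the abstract can be illustrated with a minimal sketch. It is not the paper's ITR mechanism: the abstract does not give the LCS-based re-weighting formula, so none is attempted here. The sketch assumes raw term frequencies as VSM weights, whitespace tokenization, and the weighted (Tanimoto-style) forms of Jaccard and Dice. All function names and sample strings are hypothetical. It shows only that bag-of-words measures cannot tell an in-order text from a shuffled one, while the LCS length can.

```python
import math
from collections import Counter


def term_weights(tokens):
    """Raw term-frequency weights (an assumed stand-in for a VSM weighting scheme)."""
    return Counter(tokens)


def dot(a, b):
    """Inner product of two sparse term-weight vectors."""
    return sum(a[t] * b[t] for t in a.keys() & b.keys())


def cosine(a, b):
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / denom if denom else 0.0


def jaccard(a, b):
    # Weighted (Tanimoto) form: a.b / (|a|^2 + |b|^2 - a.b)
    denom = dot(a, a) + dot(b, b) - dot(a, b)
    return dot(a, b) / denom if denom else 0.0


def dice(a, b):
    # Weighted form: 2 a.b / (|a|^2 + |b|^2)
    denom = dot(a, a) + dot(b, b)
    return 2 * dot(a, b) / denom if denom else 0.0


def lcs_length(x, y):
    """Length of the longest common subsequence of two token lists (row-by-row DP)."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Hypothetical sample data: both texts contain the same terms, in different order.
query = "text retrieval mechanism".split()
doc_in_order = "integrated text retrieval mechanism".split()
doc_shuffled = "mechanism retrieval integrated text".split()

q = term_weights(query)
for name, doc in [("in order", doc_in_order), ("shuffled", doc_shuffled)]:
    d = term_weights(doc)
    print(name, cosine(q, d), jaccard(q, d), dice(q, d), lcs_length(query, doc))
```

Both texts receive identical Cosine, Jaccard, and Dice scores because their term-weight vectors are the same. Only the LCS length differs (3 for the in-order text, 1 for the shuffled one), which is the sequential signal the abstract says the ITR mechanism folds back into the term weights.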

Original language: English
Pages (from-to): 3974-3982
Number of pages: 9
Journal: Expert Systems With Applications
Volume: 39
Issue number: 4
DOIs
Publication status: Published - March 2012

All Science Journal Classification (ASJC) codes

  • Engineering (all)
  • Computer Science Applications
  • Artificial Intelligence
