A similarity measure for text processing

Jung Yi Jiang, Wen Hao Cheng, Yu Shu Chiou, Shie Jue Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

In this paper, we propose a novel similarity measure for document data processing. For two document vectors, the proposed measure takes three cases into account: a) the feature considered appears in both documents, b) the feature considered appears in only one document, and c) the feature considered appears in neither document. For the first case, we give a lower bound and decrease the similarity according to the difference between the feature values of the two documents. For the second case, we give a fixed value regardless of the magnitude of the feature value. For the last case, we treat it as an identity. Experimental results show that our proposed method can work more effectively than others.
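
The three cases in the abstract can be illustrated with a minimal Python sketch. This is not the authors' formula, which the abstract does not give. The function name `similarity_sketch`, the parameters `lower_bound`, `penalty`, and `scale`, the Gaussian-shaped decay, the final rescaling to [0, 1], and the reading of "identity" as "contributes nothing" are all assumptions made for illustration.

```python
import math


def similarity_sketch(x, y, lower_bound=0.5, penalty=1.0, scale=1.0):
    """Illustrative three-case similarity for two equal-length feature vectors.

    All parameters are hypothetical; the abstract does not specify them.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")

    total = 0.0
    counted = 0
    for a, b in zip(x, y):
        if a != 0 and b != 0:
            # Case a: feature present in both documents. The score starts at 1
            # for equal values and decays toward lower_bound as they differ.
            total += lower_bound + (1.0 - lower_bound) * math.exp(-((a - b) / scale) ** 2)
            counted += 1
        elif a != 0 or b != 0:
            # Case b: feature present in only one document. A fixed value is
            # applied, whatever the magnitude of the nonzero entry.
            total -= penalty
            counted += 1
        # Case c: feature absent from both documents. Assumed here to
        # contribute nothing to either the sum or the count.

    if counted == 0:
        # Assumption: two documents with no features at all count as identical.
        return 1.0

    raw = total / counted  # lies in [-penalty, 1]
    return (raw + penalty) / (1.0 + penalty)  # rescaled to [0, 1]


if __name__ == "__main__":
    print(similarity_sketch([1, 2, 0], [1, 2, 0]))
    print(similarity_sketch([1, 5, 0], [1, 0, 0]))
    print(similarity_sketch([1, 500, 0], [1, 0, 0]))
```

Under these assumptions, the second and third calls give the same score, because a feature that appears in only one document is scored by a fixed value rather than by its magnitude.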

Original language: English
Title of host publication: Proceedings of 2011 International Conference on Machine Learning and Cybernetics, ICMLC 2011
Pages: 1460-1465
Number of pages: 6
Publication status: Published - 2011 Nov 7
Event: 2011 International Conference on Machine Learning and Cybernetics, ICMLC 2011 - Guilin, Guangxi, China
Duration: 2011 Jul 10 to 2011 Jul 13

Publication series

Name: Proceedings - International Conference on Machine Learning and Cybernetics
Volume: 4
ISSN (Print): 2160-133X
ISSN (Electronic): 2160-1348

Other

Other: 2011 International Conference on Machine Learning and Cybernetics, ICMLC 2011
Country: China
City: Guilin, Guangxi
Period: 11-07-10 to 11-07-13

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Human-Computer Interaction
