TY - GEN
T1 - A similarity measure for text processing
AU - Jiang, Jung Yi
AU - Cheng, Wen Hao
AU - Chiou, Yu Shu
AU - Lee, Shie Jue
PY - 2011/11/7
Y1 - 2011/11/7
N2 - In this paper, we propose a novel similarity measure for document data processing. For two document vectors, the proposed measure takes three cases into account: a) the feature considered appears in both documents, b) the feature considered appears in only one document, and c) the feature considered appears in neither document. For the first case, we give a lower bound and decrease the similarity according to the difference between the feature values of the two documents. For the second case, we give a fixed value regardless of the magnitude of the feature value. For the last case, we treat it as an identity. Experimental results show that our proposed method can work more effectively than others.
AB - In this paper, we propose a novel similarity measure for document data processing. For two document vectors, the proposed measure takes three cases into account: a) the feature considered appears in both documents, b) the feature considered appears in only one document, and c) the feature considered appears in neither document. For the first case, we give a lower bound and decrease the similarity according to the difference between the feature values of the two documents. For the second case, we give a fixed value regardless of the magnitude of the feature value. For the last case, we treat it as an identity. Experimental results show that our proposed method can work more effectively than others.
UR - http://www.scopus.com/inward/record.url?scp=80155138569&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80155138569&partnerID=8YFLogxK
U2 - 10.1109/ICMLC.2011.6016998
DO - 10.1109/ICMLC.2011.6016998
M3 - Conference contribution
AN - SCOPUS:80155138569
SN - 9781457703065
T3 - Proceedings - International Conference on Machine Learning and Cybernetics
SP - 1460
EP - 1465
BT - Proceedings of 2011 International Conference on Machine Learning and Cybernetics, ICMLC 2011
T2 - 2011 International Conference on Machine Learning and Cybernetics, ICMLC 2011
Y2 - 10 July 2011 through 13 July 2011
ER -