TY - JOUR
T1 - A dissimilarity measure for document clustering
AU - Jiang, Jung Yi
AU - Cheng, Wen Hao
AU - Lee, Shie Jue
PY - 2012/1
Y1 - 2012/1
N2 - In this paper, we propose a novel dissimilarity measure for document data processing and apply it to document clustering. For a document vector and a cluster representation, the proposed measure takes three cases into account: a) the feature considered appears in both the document and the cluster, b) the feature considered appears in the document or the cluster, but not in both, and c) the feature considered appears neither in the document nor in the cluster. For the first case, we give a lower bound and decrease the similarity according to the difference between the two feature values. For the second case, we give a fixed value regardless of the magnitude of the feature value. For the last case, we treat it as an identity. Experimental results show that our proposed method can work more effectively than other methods.
AB - In this paper, we propose a novel dissimilarity measure for document data processing and apply it to document clustering. For a document vector and a cluster representation, the proposed measure takes three cases into account: a) the feature considered appears in both the document and the cluster, b) the feature considered appears in the document or the cluster, but not in both, and c) the feature considered appears neither in the document nor in the cluster. For the first case, we give a lower bound and decrease the similarity according to the difference between the two feature values. For the second case, we give a fixed value regardless of the magnitude of the feature value. For the last case, we treat it as an identity. Experimental results show that our proposed method can work more effectively than other methods.
UR - http://www.scopus.com/inward/record.url?scp=83755224939&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=83755224939&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:83755224939
SN - 1881-803X
VL - 6
SP - 15
EP - 21
JO - ICIC Express Letters
JF - ICIC Express Letters
IS - 1
ER -