Clustering categorical data by utilizing the correlated-force ensemble

Kun-Ta Chuang, Ming Syan Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

In this paper we explore a novel clustering algorithm for categorical data, named CORE (CORrelated-Force Ensemble). Clustering categorical data is generally more difficult than clustering numerical data because categorical values have no inherent order. Although several clustering algorithms that concentrate on categorical data have been proposed, achieving the desired clustering quality remains challenging. The correlation between attribute values carries significant information that can be exploited to aid clustering, especially when extracting clusters from high-dimensional data. By employing the concept of the correlated-force ensemble, the proposed algorithm, CORE, acquires clusters that consist of highly correlated sets of nominal attribute values. Experimental results on various real datasets show that CORE significantly outperforms prior work.

Original language: English
Title of host publication: Proceedings of the Fourth SIAM International Conference on Data Mining
Editors: M.W. Berry, U. Dayal, C. Kamath, D. Skillicorn
Pages: 269-278
Number of pages: 10
Publication status: Published - 2004
Event: Proceedings of the Fourth SIAM International Conference on Data Mining - Lake Buena Vista, FL, United States
Duration: 2004 Apr 22 to 2004 Apr 24

Other

Other: Proceedings of the Fourth SIAM International Conference on Data Mining
Country: United States
City: Lake Buena Vista, FL
Period: 04-04-22 to 04-04-24

Fingerprint

Nominal or categorical data
Ensemble
Clustering
Clustering Algorithm
Attribute
High-dimensional Data
Categorical
Categorical or nominal
Experimental Results

All Science Journal Classification (ASJC) codes

  • Mathematics(all)

Cite this

Chuang, K-T., & Chen, M. S. (2004). Clustering categorical data by utilizing the correlated-force ensemble. In M. W. Berry, U. Dayal, C. Kamath, & D. Skillicorn (Eds.), Proceedings of the Fourth SIAM International Conference on Data Mining (pp. 269-278)
Chuang, Kun-Ta ; Chen, Ming Syan. / Clustering categorical data by utilizing the correlated-force ensemble. Proceedings of the Fourth SIAM International Conference on Data Mining. editor / M.W. Berry ; U. Dayal ; C. Kamath ; D. Skillicorn. 2004. pp. 269-278
@inproceedings{a74aa8797b1f4f84985f4e6d740012f6,
title = "Clustering categorical data by utilizing the correlated-force ensemble",
author = "Kun-Ta Chuang and Chen, {Ming Syan}",
year = "2004",
language = "English",
pages = "269--278",
editor = "M.W. Berry and U. Dayal and C. Kamath and D. Skillicorn",
booktitle = "Proceedings of the Fourth SIAM International Conference on Data Mining",

}

Chuang, K-T & Chen, MS 2004, Clustering categorical data by utilizing the correlated-force ensemble. in MW Berry, U Dayal, C Kamath & D Skillicorn (eds), Proceedings of the Fourth SIAM International Conference on Data Mining. pp. 269-278, Proceedings of the Fourth SIAM International Conference on Data Mining, Lake Buena Vista, FL, United States, 04-04-22.

Clustering categorical data by utilizing the correlated-force ensemble. / Chuang, Kun-Ta; Chen, Ming Syan.

Proceedings of the Fourth SIAM International Conference on Data Mining. ed. / M.W. Berry; U. Dayal; C. Kamath; D. Skillicorn. 2004. p. 269-278.

TY - GEN

T1 - Clustering categorical data by utilizing the correlated-force ensemble

AU - Chuang, Kun-Ta

AU - Chen, Ming Syan

PY - 2004

Y1 - 2004

UR - http://www.scopus.com/inward/record.url?scp=2942618800&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=2942618800&partnerID=8YFLogxK

M3 - Conference contribution

SP - 269

EP - 278

BT - Proceedings of the Fourth SIAM International Conference on Data Mining

A2 - Berry, M.W.

A2 - Dayal, U.

A2 - Kamath, C.

A2 - Skillicorn, D.

ER -

Chuang K-T, Chen MS. Clustering categorical data by utilizing the correlated-force ensemble. In Berry MW, Dayal U, Kamath C, Skillicorn D, editors, Proceedings of the Fourth SIAM International Conference on Data Mining. 2004. p. 269-278