A time efficient pattern reduction algorithm for k-means based clustering

Chun Wei Tsai, Chu Sing Yang, Ming Chao Chiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Citations (Scopus)

Abstract

In this paper, we present an efficient algorithm, called the Pattern Reduction (PR) algorithm, to reduce the time required for data clustering based on iterative clustering algorithms. Conceptually similar to a lossy data compression scheme, this algorithm removes at each iteration those data patterns that are close to the centroid of a cluster or that remain in the same cluster for a certain number of consecutive iterations, and are thus unlikely to move from one cluster to another at later iterations; it then computes a new pattern to represent all the data patterns removed. Our simulation results, from 2 to 1,000 dimensions and 150 to 6,000,000 patterns, indicate that the proposed algorithm can reduce the computation time of k-means, Genetic k-means Algorithm (GKA), and k-means with Genetic Algorithm (KGA) by 10% up to about 80% and that, for high dimensional data sets, it can even reduce the computation time by more than 70%.
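The reduction idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no thresholds or data structures, so the function name `kmeans_with_reduction`, the parameter `stable_iters`, and the choice of a weighted mean as the representative pattern are all assumptions made for illustration. Only the "same cluster for several consecutive iterations" rule is shown; the "close to the centroid" rule is omitted.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def weighted_mean(items):
    """Weighted mean of a list of (point, weight) pairs."""
    total = sum(w for _, w in items)
    dim = len(items[0][0])
    return [sum(p[d] * w for p, w in items) / total for d in range(dim)]


def kmeans_with_reduction(points, centroids, iters=10, stable_iters=2):
    """k-means sketch that merges patterns whose label is unchanged for
    stable_iters consecutive iterations (hypothetical threshold)."""
    # Each active item tracks the original pattern indices it stands for.
    active = [
        {"point": list(p), "weight": 1.0, "members": [i], "label": None, "stable": 0}
        for i, p in enumerate(points)
    ]
    centroids = [list(c) for c in centroids]
    distance_evals = 0

    for _ in range(iters):
        # Assignment step over the (shrinking) active set.
        for item in active:
            label = nearest(item["point"], centroids)
            distance_evals += len(centroids)
            item["stable"] = item["stable"] + 1 if label == item["label"] else 0
            item["label"] = label

        # Update step: weighted mean of the active items in each cluster.
        for c in range(len(centroids)):
            own = [(it["point"], it["weight"]) for it in active if it["label"] == c]
            if own:
                centroids[c] = weighted_mean(own)

        # Reduction step: replace the static items of each cluster with one
        # representative pattern carrying their combined weight (assumption).
        reduced = [it for it in active if it["stable"] < stable_iters]
        for c in range(len(centroids)):
            static = [it for it in active
                      if it["label"] == c and it["stable"] >= stable_iters]
            if len(static) > 1:
                reduced.append({
                    "point": weighted_mean([(it["point"], it["weight"]) for it in static]),
                    "weight": sum(it["weight"] for it in static),
                    "members": [m for it in static for m in it["members"]],
                    "label": c,
                    "stable": 0,
                })
            else:
                reduced.extend(static)
        active = reduced

    # Every original pattern inherits the label of the item that represents it.
    labels = [None] * len(points)
    for it in active:
        for m in it["members"]:
            labels[m] = it["label"]
    return labels, centroids, len(active), distance_evals


# Usage: two well-separated groups of four patterns each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels, centroids, n_active, evals = kmeans_with_reduction(points, [(0, 0), (10, 10)])
print(labels, centroids, n_active, evals)
```

Because a representative carries the combined weight of the patterns it replaces, the centroid update is unchanged as long as those patterns would have stayed together; the scheme is lossy only when some of them would later have moved to another cluster. The saving comes from the assignment step, which evaluates distances for the reduced set of active patterns instead of the full data set.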

Original language: English
Title of host publication: 2007 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2007
Pages: 504-509
Number of pages: 6
Publication status: Published - 2007 Dec 1
Event: 2007 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2007 - Montreal, QC, Canada
Duration: 2007 Oct 7 to 2007 Oct 10

Publication series

Name: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
ISSN (Print): 1062-922X

Other

Other: 2007 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2007
Country: Canada
City: Montreal, QC
Period: 07-10-07 to 07-10-10

All Science Journal Classification (ASJC) codes

  • Engineering(all)
