A time-efficient pattern reduction algorithm for k-means clustering

Ming Chao Chiang, Chun Wei Tsai, Chu Sing Yang

Research output: Contribution to journal (Article, peer-reviewed)

75 Citations (Scopus)

Abstract

This paper presents an efficient algorithm, called pattern reduction (PR), for reducing the computation time of k-means and k-means-based clustering algorithms. The proposed algorithm works by compressing and removing at each iteration patterns that are unlikely to change their membership thereafter. Not only is the proposed algorithm simple and easy to implement, but it can also be applied to many other iterative clustering algorithms such as kernel-based and population-based clustering algorithms. Our experiments - from 2 to 1000 dimensions and 150 to 10,000,000 patterns - indicate that with a small loss of quality, the proposed algorithm can significantly reduce the computation time of all state-of-the-art clustering algorithms evaluated in this paper, especially for large and high-dimensional data sets.
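
The idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how PR decides that a pattern is "unlikely to change" its membership, so the rule used here is an assumption made only for illustration: a pattern is frozen once its label has stayed the same for `stable_iters` consecutive iterations. The names `kmeans_pattern_reduction`, `stable_iters`, `frozen_sum`, and `frozen_cnt` are hypothetical and not taken from the paper. Frozen patterns are "compressed" into a per-cluster running sum and count, so they still contribute to the centroid update but are skipped in the distance computations.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pattern_reduction(points, k, max_iter=50, stable_iters=2, seed=0):
    """k-means with a simple, ASSUMED pattern-reduction rule.

    A point is removed from the active set once its label has been
    unchanged for `stable_iters` consecutive iterations (an illustrative
    heuristic, not necessarily the criterion used in the paper).
    Returns (centroids, labels, number_of_still_active_points).
    """
    rng = random.Random(seed)
    dim = len(points[0])
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    streak = [0] * len(points)          # iterations with an unchanged label
    active = list(range(len(points)))   # indices still being re-assigned

    # Compressed summary of the removed points: per-cluster sum and count.
    frozen_sum = [[0.0] * dim for _ in range(k)]
    frozen_cnt = [0] * k

    for _ in range(max_iter):
        # Assignment step: distances are computed for active points only.
        changed = False
        for i in active:
            best = min(range(k), key=lambda c: sq_dist(points[i], centroids[c]))
            if best == labels[i]:
                streak[i] += 1
            else:
                labels[i] = best
                streak[i] = 0
                changed = True

        # Update step: combine the frozen summary with the active points.
        sums = [row[:] for row in frozen_sum]
        cnts = frozen_cnt[:]
        for i in active:
            c = labels[i]
            cnts[c] += 1
            for d in range(dim):
                sums[c][d] += points[i][d]
        for c in range(k):
            if cnts[c]:
                centroids[c] = [s / cnts[c] for s in sums[c]]

        if not changed:
            break

        # Reduction step: fold stable points into the summary and drop them.
        still_active = []
        for i in active:
            if streak[i] >= stable_iters:
                c = labels[i]
                frozen_cnt[c] += 1
                for d in range(dim):
                    frozen_sum[c][d] += points[i][d]
            else:
                still_active.append(i)
        active = still_active

    return centroids, labels, len(active)
```

In this sketch the cost of each assignment step shrinks as the active set shrinks, which is where a time saving would come from. Because a frozen point is never re-examined, its label may differ from what plain k-means would eventually assign, which is consistent with the "small loss of quality" the abstract mentions.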

Original language: English
Pages (from-to): 716-731
Number of pages: 16
Journal: Information Sciences
Volume: 181
Issue number: 4
Publication status: Published - 2011 Feb 15

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence
