Extended Gaussian kernel version of fuzzy c-means in the problem of data analyzing

S. Ramathilagam, Yueh Min Huang

Research output: Contribution to journalArticlepeer-review

25 Citations (Scopus)

Abstract

Fuzzy c-means clustering with spatial constraints is considered as suitable algorithm for data clustering or data analyzing. But FCM has still lacks enough robustness to employ with noise data, because of its Euclidean distance measure objective function for finding the relationship between the objects. It can only be effective in clustering 'spherical' clusters, and it may not give reasonable clustering results for "non-compactly filled" spherical data such as "annular-shaped" data. This paper realized the drawbacks of the general fuzzy c-mean algorithm and it tries to introduce an extended Gaussian version of fuzzy C-means by replacing the Euclidean distance in the original object function of FCM. Firstly, this paper proposes initial kernel version of fuzzy c-means to aim at simplifying its computation and then extended it to extended Gaussian kernel version of fuzzy c-means. It derives an effective method to construct the membership matrix for objects, and it derives a robust method for updating centers from extended Gaussian version of fuzzy C-means. Furthermore, this paper proposes a new prototypes learning method and it obtains initial cluster centers using new mathematical initialization centers for the new effective objective function of fuzzy c-means, so that this paper tries to minimize the iteration of algorithms to obtain more accurate result. Initial experiment will be done with an artificially generated data to show how effectively the new proposed Gaussian version of fuzzy C-means works in obtaining clusters, and then the proposed methods can be implemented to cluster the Wisconsin breast cancer database into two clusters for the classes benign and malignant. To show the effective performance of proposed fuzzy c-means with new initialization of centers of clusters, this work compares the results with results of recent fuzzy c-means algorithm; in addition, it uses Silhouette method to validate the obtained clusters from breast cancer datasets.

Original languageEnglish
Pages (from-to)3793-3805
Number of pages13
JournalExpert Systems With Applications
Volume38
Issue number4
DOIs
Publication statusPublished - 2011 Apr

All Science Journal Classification (ASJC) codes

  • Engineering(all)
  • Computer Science Applications
  • Artificial Intelligence

Fingerprint Dive into the research topics of 'Extended Gaussian kernel version of fuzzy c-means in the problem of data analyzing'. Together they form a unique fingerprint.

Cite this