Abstract
This paper presents a cluster validity measure with a hybrid parameter search method for the support vector clustering (SVC) algorithm to identify an optimal cluster structure for a given data set. The cluster structure obtained by the SVC is controlled by two parameters: the parameter of the kernel function, denoted as q, and the soft-margin constant of the Lagrangian function, denoted as C. Extensive trial-and-error search over these two parameters is usually necessary to reach a satisfactory clustering result. From intensive observations of the cluster-splitting behavior, we found that (1) the overall search range of q is related to the densities of the clusters; (2) each cluster structure corresponds to an interval of q, and the sizes of these intervals differ; and (3) identifying the optimal structure is equivalent to finding the largest of these intervals. Based on these findings, we developed a validity measure with an ad hoc parameter search algorithm that enables the SVC algorithm to identify optimal cluster configurations with a minimal number of executions. Computer simulations have been conducted on benchmark data sets to demonstrate the effectiveness and robustness of the proposed approach.
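Observation (3) above can be illustrated with a minimal sketch. It is not the paper's validity measure or search algorithm: it assumes that SVC has already been run over a sorted grid of q values, that a cluster structure can be identified by its cluster count alone, and that the width of a q interval is the difference between the first and last grid values sharing that count. The function name `widest_stable_interval` and the sample values are hypothetical.

```python
def widest_stable_interval(q_values, cluster_counts):
    """Return (count, q_start, q_end) for the widest run of q values
    over which the cluster count stays constant.

    q_values       -- kernel parameters in ascending order (assumed grid)
    cluster_counts -- cluster count observed at each q (assumed given)
    """
    if not q_values or len(q_values) != len(cluster_counts):
        raise ValueError("q_values and cluster_counts must be non-empty and equal in length")

    best = None   # (width, count, q_start, q_end) of the widest run so far
    start = 0     # index where the current run of equal counts began
    for i in range(1, len(q_values) + 1):
        # A run ends at the end of the grid or when the count changes.
        if i == len(q_values) or cluster_counts[i] != cluster_counts[start]:
            width = q_values[i - 1] - q_values[start]
            if best is None or width > best[0]:
                best = (width, cluster_counts[start], q_values[start], q_values[i - 1])
            start = i

    _, count, q_start, q_end = best
    return count, q_start, q_end


# Hypothetical sweep: the 3-cluster structure persists over the widest q range.
q_grid = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
counts = [1, 1, 3, 3, 3, 3, 5, 7]
print(widest_stable_interval(q_grid, counts))
```

A uniform grid like this one requires one SVC run per q value; the abstract's stated aim is to locate the largest interval with far fewer executions, which this sketch does not attempt.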
Original language | English |
---|---|
Pages (from-to) | 506-520 |
Number of pages | 15 |
Journal | Pattern Recognition |
Volume | 41 |
Issue number | 2 |
Publication status | Published - 2008 Feb |
All Science Journal Classification (ASJC) codes
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence