Clustering is a classic data mining problem that has attracted researchers from many disciplines because its solutions apply to a wide range of practical problems. With the arrival of the big data era, reducing the computing time of an effective clustering algorithm has become a promising research issue in recent years. This paper therefore presents an effective clustering algorithm that uses so-called searched information to determine later search directions, and implements it on Spark to reduce its response time when analyzing large-scale datasets. Simulation results show that the proposed algorithm provides better results than the other clustering algorithms compared in this paper because it is less sensitive to the initial solutions. The simulation results further show that a cloud computing platform is capable of enhancing the performance of the proposed algorithm.
All Science Journal Classification (ASJC) codes
- Arts and Humanities (miscellaneous)
- Human-Computer Interaction