DPSP: Distributed Progressive Sequential Pattern mining on the cloud

Jen Wei Huang, Su Chen Lin, Ming Syan Chen

Research output: Conference contribution

24 Citations (Scopus)

Abstract

The progressive sequential pattern mining problem has been discussed in previous research works. With the increasing amount of data, single processors struggle to scale up. Traditional algorithms running on a single machine may have scalability troubles. Therefore, mining progressive sequential patterns intrinsically suffers from the scalability problem. In view of this, we design a distributed mining algorithm to address the scalability problem of mining progressive sequential patterns. The proposed algorithm DPSP, standing for Distributed Progressive Sequential Pattern mining algorithm, is implemented on top of the Hadoop platform, which realizes the cloud computing environment. We propose Map/Reduce jobs in DPSP to delete obsolete itemsets, update current candidate sequential patterns, and report up-to-date frequent sequential patterns within each period of interest (POI). The experimental results show that DPSP possesses great scalability and consequently increases the performance and the practicability of mining algorithms.
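
The Map/Reduce division of work mentioned in the abstract can be illustrated with a toy, single-process sketch. This is not the DPSP implementation and it does not use Hadoop: the function names (`map_sequence`, `reduce_support`, `frequent_patterns`), the data layout, and the simplifications (one item per timestamp, patterns of length 1 or 2 only) are all assumptions made for illustration. The map step drops events that have fallen out of the current POI and emits candidate patterns per sequence; the reduce step counts how many sequences support each pattern.

```python
from collections import defaultdict
from itertools import combinations


def map_sequence(seq_id, events, now, poi):
    """Map step (hypothetical): drop obsolete events, emit (pattern, seq_id) pairs."""
    # Keep only events whose timestamp lies inside the window (now - poi, now].
    live = sorted((t, item) for t, item in events if now - poi < t <= now)
    items = [item for _, item in live]
    patterns = set()
    # Enumerate order-preserving subsequences of length 1 and 2.
    # Real miners grow candidates incrementally; this is kept short on purpose.
    for k in (1, 2):
        for combo in combinations(items, k):
            patterns.add(combo)
    return [(pattern, seq_id) for pattern in patterns]


def reduce_support(pairs, n_sequences, min_support):
    """Reduce step (hypothetical): count distinct supporting sequences per pattern."""
    supporters = defaultdict(set)
    for pattern, seq_id in pairs:
        supporters[pattern].add(seq_id)
    result = {}
    for pattern, ids in supporters.items():
        support = len(ids) / n_sequences
        if support >= min_support:
            result[pattern] = support
    return result


def frequent_patterns(db, now, poi, min_support):
    """Run the map step over every sequence, then reduce the emitted pairs."""
    pairs = []
    for seq_id, events in db.items():
        pairs.extend(map_sequence(seq_id, events, now, poi))
    return reduce_support(pairs, len(db), min_support)


# Toy database: sequence id -> list of (timestamp, item) events.
db = {
    "s1": [(1, "A"), (2, "B"), (5, "C")],
    "s2": [(2, "A"), (4, "B")],
    "s3": [(3, "B"), (5, "A")],
}

# At time 5 with a POI of 4, the event (1, "A") in s1 is obsolete and is ignored.
result = frequent_patterns(db, now=5, poi=4, min_support=0.6)
```

With this toy data, only the single-item patterns `("A",)` and `("B",)` reach the 0.6 support threshold. In a distributed setting the map calls would run in parallel over partitions of the sequence database, which is the property the abstract relies on for scalability.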

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 14th Pacific-Asia Conference, PAKDD 2010, Proceedings
Pages: 27-34
Number of pages: 8
Edition: PART 2
DOIs
Publication status: Published - 1 Dec 2010
Event: 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2010 - Hyderabad, India
Duration: 21 Jun 2010 to 24 Jun 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6119 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2010
Country/Territory: India
City: Hyderabad
Period: 21 Jun 2010 to 24 Jun 2010

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
