Progressive sampling for association rules based on sampling error estimation

Kun Ta Chuang, Ming Syan Chen, Wen Chieh Yang

Research output: Conference contribution

30 Citations (Scopus)

Abstract

This paper explores a progressive sampling algorithm, called Sampling Error Estimation (SEE), which aims to identify an appropriate sample size for mining association rules. SEE has two advantages over previous work. First, SEE is highly efficient because an appropriate sample size can be determined without actually running association rule mining. Second, the sample size identified by SEE is very accurate, meaning that association rules can be mined efficiently on a sample of this size to obtain a sufficiently accurate result. This accuracy is attributed to SEE's ability to significantly reduce the influence of randomness by examining several samples of the same size in one database scan. As validated by experiments on various real and synthetic data sets, SEE achieves a prominent improvement in both efficiency and resulting accuracy over previous work.
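The abstract does not spell out how the sampling error is computed, so the following is only a minimal sketch of the general idea: several samples of each candidate size are filled during a single scan, and an error is measured for each size without mining any rules. It assumes, purely for illustration, that the error is the root-mean-square deviation between single-item supports in a sample and in the full database, and that samples are drawn by independent Bernoulli inclusion. The names `estimate_sampling_errors`, `pick_sample_size`, `replicates`, and `tolerance` are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def estimate_sampling_errors(transactions, sample_sizes, replicates=5, seed=0):
    """Return {sample_size: mean RMS item-support error} from a single scan.

    Assumption (not from the paper): error is the RMS gap between item
    supports in a sample and item supports in the whole database.
    """
    rng = random.Random(seed)
    n = len(transactions)
    full_counts = Counter()
    # One item counter and one running size per (sample_size, replicate) pair.
    counts = {(s, r): Counter() for s in sample_sizes for r in range(replicates)}
    taken = {key: 0 for key in counts}

    for t in transactions:  # the single scan over the database
        full_counts.update(t)
        for key in counts:
            s = key[0]
            if rng.random() < s / n:  # Bernoulli inclusion, expected size s
                counts[key].update(t)
                taken[key] += 1

    items = list(full_counts)
    if not items:  # no items at all: nothing to compare
        return {s: 0.0 for s in sample_sizes}

    errors = {}
    for s in sample_sizes:
        per_replicate = []
        for r in range(replicates):
            m = taken[(s, r)]
            if m == 0:
                continue  # an empty sample gives no support estimate
            sq = sum(
                (counts[(s, r)][i] / m - full_counts[i] / n) ** 2 for i in items
            )
            per_replicate.append(math.sqrt(sq / len(items)))
        # Averaging over replicates damps the randomness of any one sample.
        errors[s] = (
            sum(per_replicate) / len(per_replicate) if per_replicate else float("inf")
        )
    return errors


def pick_sample_size(errors, tolerance):
    """Smallest size whose mean error is within a (hypothetical) tolerance."""
    for s in sorted(errors):
        if errors[s] <= tolerance:
            return s
    return None
```

Each transaction is expected to be a set of items. A caller would pass an increasing schedule of candidate sizes, for example `pick_sample_size(estimate_sampling_errors(db, [100, 200, 400, 800]), tolerance=0.01)`, and then mine rules once on a sample of the returned size. The tolerance-based stopping rule is a stand-in; the paper may use a different criterion.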

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 9th Pacific-Asia Conference, PAKDD 2005, Proceedings
Publisher: Springer Verlag
Pages: 505-515
Number of pages: 11
ISBN (Print): 3540260765, 9783540260769
DOIs
Publication status: Published - 1 Jan 2005
Event: 9th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2005 - Hanoi, Viet Nam
Duration: 18 May 2005 to 20 May 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3518 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 9th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2005
Country/Territory: Viet Nam
City: Hanoi
Period: 2005-05-18 to 2005-05-20

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
