A statistical framework for mining substitution rules

Wei Guang Teng, Ming Jyh Hsieh, Ming Syan Chen

Research output: Contribution to journal › Article › peer-review

16 Citations (Scopus)

Abstract

This paper explores a new mining capability, the mining of substitution rules. A substitution is a customer's choice to replace the purchase of some items with that of others. Like association rule mining, mining substitution rules in a transaction database can yield valuable knowledge for market prediction, user behaviour analysis and decision support. The process decomposes into two procedures. The first identifies concrete itemsets among a large number of frequent itemsets, where a concrete itemset is a frequent itemset whose items are statistically dependent. The second generates the substitution rules. We first derive theoretical properties of the substitution rule mining model and devise a technique that induces the supports of negative itemsets from the supports of positive itemsets, which makes support counting for negative itemsets more efficient. In light of these properties, the SRM (substitution rule mining) algorithm is designed and implemented to discover substitution rules efficiently while attaining good statistical significance. Empirical studies evaluate the performance of the proposed SRM algorithm and show that it not only executes efficiently but also produces substitution rules of high quality.
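The abstract does not spell out how negative itemset supports are induced from positive ones or which dependence test is used. The following is a minimal sketch under two loudly stated assumptions, not the SRM algorithm itself: a negated item is taken to mean "absent from the transaction", so a negative count follows from positive counts by inclusion-exclusion, and statistical dependence of a pair of items is measured with a plain 2x2 chi-square statistic. The function names `count`, `negative_count` and `chi_square_pair` and the toy transactions are hypothetical.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    frozenset({"tea", "milk"}),
    frozenset({"coffee", "milk"}),
    frozenset({"tea"}),
    frozenset({"coffee", "sugar"}),
    frozenset({"tea", "sugar"}),
    frozenset({"coffee"}),
]


def count(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def negative_count(transactions, present, absent):
    """Count transactions holding all of `present` and none of `absent`.

    Assumption: only positive itemset counts are consulted, combined by
    inclusion-exclusion over the subsets of `absent`.
    """
    present = frozenset(present)
    absent = sorted(absent)
    total = 0
    for k in range(len(absent) + 1):
        sign = -1 if k % 2 else 1
        for subset in combinations(absent, k):
            total += sign * count(transactions, present | frozenset(subset))
    return total


def chi_square_pair(transactions, a, b):
    """2x2 chi-square statistic for the co-occurrence of items a and b.

    Assumption: this stands in for the paper's dependence test, which the
    abstract does not name.
    """
    n = len(transactions)
    both = count(transactions, {a, b})
    only_a = negative_count(transactions, {a}, {b})
    only_b = negative_count(transactions, {b}, {a})
    neither = n - both - only_a - only_b
    denom = ((both + only_a) * (only_b + neither)
             * (both + only_b) * (only_a + neither))
    if denom == 0:
        return 0.0  # a degenerate table carries no evidence of dependence
    return n * (both * neither - only_a * only_b) ** 2 / denom


if __name__ == "__main__":
    print(negative_count(TRANSACTIONS, {"tea"}, {"coffee"}))
    print(chi_square_pair(TRANSACTIONS, "tea", "coffee"))
```

In this toy data, tea and coffee never appear together, which is the kind of negative dependence a substitution rule would capture. Deriving the negative count from positive counts avoids a separate pass over the database for each negated itemset, at the cost of looking up one positive count per subset of the absent items.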

Original language: English
Pages (from-to): 158-178
Number of pages: 21
Journal: Knowledge and Information Systems
Volume: 7
Issue number: 2
Publication status: Published - 2005 Feb 1

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Human-Computer Interaction
  • Hardware and Architecture
  • Artificial Intelligence
