An efficient algorithm to maintain the discovered frequent sequences with record deletion

Jerry Chun Wei Lin, Wensheng Gan, Tzung Pei Hong, Hsin Yi Chen, Sheng-Tun Li

Research output: Contribution to journalArticle

1 Citation (Scopus)

Abstract

One of the major concerns with Sequential-pattern mining (SPM) is how to discover frequent sequences from transactional databases. Most SPMalgorithms can only handle static databases, which is not practical in real-life situations. The Fast UPdated 2 (FUP2) algorithm was proposed to maintain and update the discovered association rules for transaction deletion. This algorithm can also be extended to SPM for maintaining the discovered frequent sequences, especially when some sequential records are deleted from their original databases. The original database is, however, required to be rescanned if the maintained sequences are small in the deleted sequential records. In the past, a pre-large concept was proposed to reduce the computational cost of database rescans until the number of deleted customers of the deleted sequential records achieves the designed safety bound. In this paper, a PreFUSP-TREE-DEL algorithm is proposed to adopt a pre-large FUSP-tree structure and the pre-large concept is used for maintaining and updating the discovered sequential patterns with record deletion. The proposed algorithm first partitions the discovered sequential patterns into three parts with nine cases. The discovered sequential patterns of each case are then maintained and updated by the designed procedure. Based on the proposed PreFUSP-TREE-DEL algorithm, it is unnecessary to rescan the original database until the cumulative number of deleted customers achieves the designed safety bound. The conducted experiments show that that the proposed PreFUSP-TREE-DEL algorithm has good performance when compared to other batch-mode algorithms or other maintenance algorithms.

Original languageEnglish
Pages (from-to)655-677
Number of pages23
JournalIntelligent Data Analysis
Volume20
Issue number3
DOIs
Publication statusPublished - 2016 Apr 20

Fingerprint

Deletion
Efficient Algorithms
Sequential Patterns
Mining
Customers
Safety
Association rules
Association Rules
Tree Structure
Batch
Transactions
Updating
Computational Cost
Maintenance
Update
Partition
Experiment
Costs
Experiments

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Cite this

Lin, Jerry Chun Wei ; Gan, Wensheng ; Hong, Tzung Pei ; Chen, Hsin Yi ; Li, Sheng-Tun. / An efficient algorithm to maintain the discovered frequent sequences with record deletion. In: Intelligent Data Analysis. 2016 ; Vol. 20, No. 3. pp. 655-677.
@article{71f89edc79cc4cf1860ba05fdcfda0de,
title = "An efficient algorithm to maintain the discovered frequent sequences with record deletion",
abstract = "One of the major concerns with Sequential-pattern mining (SPM) is how to discover frequent sequences from transactional databases. Most SPMalgorithms can only handle static databases, which is not practical in real-life situations. The Fast UPdated 2 (FUP2) algorithm was proposed to maintain and update the discovered association rules for transaction deletion. This algorithm can also be extended to SPM for maintaining the discovered frequent sequences, especially when some sequential records are deleted from their original databases. The original database is, however, required to be rescanned if the maintained sequences are small in the deleted sequential records. In the past, a pre-large concept was proposed to reduce the computational cost of database rescans until the number of deleted customers of the deleted sequential records achieves the designed safety bound. In this paper, a PreFUSP-TREE-DEL algorithm is proposed to adopt a pre-large FUSP-tree structure and the pre-large concept is used for maintaining and updating the discovered sequential patterns with record deletion. The proposed algorithm first partitions the discovered sequential patterns into three parts with nine cases. The discovered sequential patterns of each case are then maintained and updated by the designed procedure. Based on the proposed PreFUSP-TREE-DEL algorithm, it is unnecessary to rescan the original database until the cumulative number of deleted customers achieves the designed safety bound. The conducted experiments show that that the proposed PreFUSP-TREE-DEL algorithm has good performance when compared to other batch-mode algorithms or other maintenance algorithms.",
author = "Lin, {Jerry Chun Wei} and Wensheng Gan and Hong, {Tzung Pei} and Chen, {Hsin Yi} and Sheng-Tun Li",
year = "2016",
month = "4",
day = "20",
doi = "10.3233/IDA-160825",
language = "English",
volume = "20",
pages = "655--677",
journal = "Intelligent Data Analysis",
issn = "1088-467X",
publisher = "IOS Press",
number = "3",

}

An efficient algorithm to maintain the discovered frequent sequences with record deletion. / Lin, Jerry Chun Wei; Gan, Wensheng; Hong, Tzung Pei; Chen, Hsin Yi; Li, Sheng-Tun.

In: Intelligent Data Analysis, Vol. 20, No. 3, 20.04.2016, p. 655-677.

Research output: Contribution to journalArticle

TY - JOUR

T1 - An efficient algorithm to maintain the discovered frequent sequences with record deletion

AU - Lin, Jerry Chun Wei

AU - Gan, Wensheng

AU - Hong, Tzung Pei

AU - Chen, Hsin Yi

AU - Li, Sheng-Tun

PY - 2016/4/20

Y1 - 2016/4/20

N2 - One of the major concerns with Sequential-pattern mining (SPM) is how to discover frequent sequences from transactional databases. Most SPMalgorithms can only handle static databases, which is not practical in real-life situations. The Fast UPdated 2 (FUP2) algorithm was proposed to maintain and update the discovered association rules for transaction deletion. This algorithm can also be extended to SPM for maintaining the discovered frequent sequences, especially when some sequential records are deleted from their original databases. The original database is, however, required to be rescanned if the maintained sequences are small in the deleted sequential records. In the past, a pre-large concept was proposed to reduce the computational cost of database rescans until the number of deleted customers of the deleted sequential records achieves the designed safety bound. In this paper, a PreFUSP-TREE-DEL algorithm is proposed to adopt a pre-large FUSP-tree structure and the pre-large concept is used for maintaining and updating the discovered sequential patterns with record deletion. The proposed algorithm first partitions the discovered sequential patterns into three parts with nine cases. The discovered sequential patterns of each case are then maintained and updated by the designed procedure. Based on the proposed PreFUSP-TREE-DEL algorithm, it is unnecessary to rescan the original database until the cumulative number of deleted customers achieves the designed safety bound. The conducted experiments show that that the proposed PreFUSP-TREE-DEL algorithm has good performance when compared to other batch-mode algorithms or other maintenance algorithms.

AB - One of the major concerns with Sequential-pattern mining (SPM) is how to discover frequent sequences from transactional databases. Most SPMalgorithms can only handle static databases, which is not practical in real-life situations. The Fast UPdated 2 (FUP2) algorithm was proposed to maintain and update the discovered association rules for transaction deletion. This algorithm can also be extended to SPM for maintaining the discovered frequent sequences, especially when some sequential records are deleted from their original databases. The original database is, however, required to be rescanned if the maintained sequences are small in the deleted sequential records. In the past, a pre-large concept was proposed to reduce the computational cost of database rescans until the number of deleted customers of the deleted sequential records achieves the designed safety bound. In this paper, a PreFUSP-TREE-DEL algorithm is proposed to adopt a pre-large FUSP-tree structure and the pre-large concept is used for maintaining and updating the discovered sequential patterns with record deletion. The proposed algorithm first partitions the discovered sequential patterns into three parts with nine cases. The discovered sequential patterns of each case are then maintained and updated by the designed procedure. Based on the proposed PreFUSP-TREE-DEL algorithm, it is unnecessary to rescan the original database until the cumulative number of deleted customers achieves the designed safety bound. The conducted experiments show that that the proposed PreFUSP-TREE-DEL algorithm has good performance when compared to other batch-mode algorithms or other maintenance algorithms.

UR - http://www.scopus.com/inward/record.url?scp=84969850548&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84969850548&partnerID=8YFLogxK

U2 - 10.3233/IDA-160825

DO - 10.3233/IDA-160825

M3 - Article

AN - SCOPUS:84969850548

VL - 20

SP - 655

EP - 677

JO - Intelligent Data Analysis

JF - Intelligent Data Analysis

SN - 1088-467X

IS - 3

ER -