Incrementally updating the discovered sequential patterns based on pre-large concept

Jerry Chun Wei Lin, Tzung Pei Hong, Wensheng Gan, Hsin Yi Chen, Sheng Tun Li

Research output: Contribution to journal (Article)

6 Citations (Scopus)

Abstract

Mining useful information from large databases has become an important research area in recent years. Among the classes of knowledge derived, sequential patterns can be applied in many domains, such as market analysis, web click streams, and biological data. The fast updated sequential pattern tree (FUSP-tree) algorithm was proposed to update discovered sequential patterns in incremental mining. However, it must rescan the original database to maintain the discovered sequential patterns. This study proposes the PreFUSP-TREE-INS algorithm, which is based on the pre-large concept and maintains discovered sequential patterns without rescanning the original database until the cumulative number of newly added customer sequences exceeds a safety bound. Using pre-large sequences reduces the execution time for reconstructing the tree when old or new customer sequences are added to the original database. Pre-large sequences are defined by a lower and an upper support threshold, which prevent sequences from moving directly from large to small and vice versa. Experiments are conducted to show the performance of the proposed algorithm for various minimum support thresholds and ratios of inserted sequences.
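
The two-threshold idea in the abstract can be illustrated with a minimal Python sketch. It is not the PreFUSP-TREE-INS algorithm itself: the function names (`classify`, `safety_bound`, `needs_rescan`) are hypothetical, and the safety-bound formula f = floor((Su - Sl) * d / (1 - Su)) is an assumption taken from the general pre-large literature on insertions, since the abstract does not state it. Supports are passed as exact fractions to avoid floating-point rounding.

```python
from fractions import Fraction
from math import floor


def classify(count: int, db_size: int, lower: Fraction, upper: Fraction) -> str:
    """Label a sequence by its support ratio against the two thresholds."""
    support = Fraction(count, db_size)
    if support >= upper:
        return "large"
    if support >= lower:
        return "pre-large"
    return "small"


def safety_bound(db_size: int, lower: Fraction, upper: Fraction) -> int:
    """Assumed bound on insertions before a rescan of the original database.

    Formula assumed from the pre-large literature, not from this abstract:
    f = floor((upper - lower) * db_size / (1 - upper)).
    """
    return floor((upper - lower) * db_size / (1 - upper))


def needs_rescan(cumulative_inserted: int, bound: int) -> bool:
    """A rescan is needed only once cumulative insertions exceed the bound."""
    return cumulative_inserted > bound


# Hypothetical example: 100 customer sequences, thresholds of 30% and 50%.
lower, upper = Fraction(3, 10), Fraction(1, 2)
bound = safety_bound(100, lower, upper)
print(classify(30, 100, lower, upper), bound, needs_rescan(41, bound))
```

Under these assumptions, a sequence that is small in the original database cannot become large from insertions alone while the cumulative number of inserted sequences stays within the bound, which is why the original database need not be rescanned until then.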

Original language: English
Pages (from-to): 1071-1089
Number of pages: 19
Journal: Intelligent Data Analysis
Volume: 19
Issue number: 5
DOI: 10.3233/IDA-150759
Publication status: Published - 2015 Sep 8

Fingerprint

Sequential Patterns
Updating
Mining
Customers
Tree Algorithms
Execution Time
Concepts
Exceed
Safety
Update
Experiments
Experiment

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Cite this

Lin, Jerry Chun Wei ; Hong, Tzung Pei ; Gan, Wensheng ; Chen, Hsin Yi ; Li, Sheng Tun. / Incrementally updating the discovered sequential patterns based on pre-large concept. In: Intelligent Data Analysis. 2015 ; Vol. 19, No. 5. pp. 1071-1089.
@article{36fec1d3b40342aebf45b5d697cbeef9,
title = "Incrementally updating the discovered sequential patterns based on pre-large concept",
author = "Lin, {Jerry Chun Wei} and Hong, {Tzung Pei} and Gan, Wensheng and Chen, {Hsin Yi} and Li, {Sheng Tun}",
year = "2015",
month = "9",
day = "8",
doi = "10.3233/IDA-150759",
language = "English",
volume = "19",
pages = "1071--1089",
journal = "Intelligent Data Analysis",
issn = "1088-467X",
publisher = "IOS Press",
number = "5",

}

TY - JOUR
T1 - Incrementally updating the discovered sequential patterns based on pre-large concept
AU - Lin, Jerry Chun Wei
AU - Hong, Tzung Pei
AU - Gan, Wensheng
AU - Chen, Hsin Yi
AU - Li, Sheng Tun
PY - 2015/9/8
Y1 - 2015/9/8
UR - http://www.scopus.com/inward/record.url?scp=84941641381&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84941641381&partnerID=8YFLogxK
U2 - 10.3233/IDA-150759
DO - 10.3233/IDA-150759
M3 - Article
AN - SCOPUS:84941641381
VL - 19
SP - 1071
EP - 1089
JO - Intelligent Data Analysis
JF - Intelligent Data Analysis
SN - 1088-467X
IS - 5
ER -