A model-selection framework for concept-drifting data streams

Bo Heng Chen, Kun-Ta Chuang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Research interest in classification for data streams has been increasing. Because data streams evolve, it is highly challenging to detect concept drifts, which invalidate the current classification model as time passes. So far, most stream classification solutions use incremental learning to continuously track deviations in prediction accuracy. Unfortunately, to detect concept drift promptly, such strategies usually rely on an infeasible assumption: that data instances with true labels are available. In this paper we propose a new framework, called Inference of Concept Evolution (ICE), to minimize the need for real-time acquisition of true labels. Specifically, the ICE framework is built on the idea of model reuse. Dictionary learning is used to determine whether a concept drift has occurred without acquiring labels. When a drift happens, the ICE framework selects the best model maintained in the model pool, reducing the need for model re-training and its costly label acquisition. As demonstrated in our experimental results, the ICE framework can track the best model correctly and efficiently, showing its feasibility in real cases.
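
The abstract gives no algorithmic detail, so the following is only a rough sketch of the general idea it describes (label-free drift detection followed by reuse of a pooled model), not the authors' ICE algorithm. It assumes each pooled model keeps a small "dictionary" of prototype vectors and uses nearest-atom distance as a deliberately simplified stand-in for dictionary-learning reconstruction error. The names `PooledModel`, `reconstruction_error`, `select_model`, and the `threshold` parameter are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


@dataclass
class PooledModel:
    """A classifier kept in the pool, plus a summary of the inputs it was trained on."""
    name: str
    atoms: List[Vector]               # hypothetical per-concept "dictionary" of prototypes
    predict: Callable[[Vector], int]  # the reusable classifier itself


def reconstruction_error(atoms: List[Vector], batch: List[Vector]) -> float:
    """Mean distance from each instance to its nearest atom.

    Assumption: this 1-sparse, vector-quantization style error stands in for
    a real dictionary-learning reconstruction error. No labels are needed.
    """
    return sum(min(math.dist(x, a) for a in atoms) for x in batch) / len(batch)


def select_model(
    pool: List[PooledModel],
    current: PooledModel,
    batch: List[Vector],
    threshold: float,
) -> Tuple[PooledModel, bool]:
    """Return (model to use, drift flag) for an unlabeled batch."""
    # No drift: the current concept's dictionary still explains the batch well.
    if reconstruction_error(current.atoms, batch) <= threshold:
        return current, False
    # Drift: reuse the pooled model whose dictionary fits the batch best.
    best = min(pool, key=lambda m: reconstruction_error(m.atoms, batch))
    return best, True


# Toy pool with two concepts living in different regions of the input space.
pool = [
    PooledModel("concept_a", [(0.0, 0.0), (1.0, 0.0)], lambda x: 0),
    PooledModel("concept_b", [(5.0, 5.0), (6.0, 5.0)], lambda x: 1),
]
batch = [(5.1, 5.0), (5.9, 5.1)]  # unlabeled instances arriving from the stream
model, drifted = select_model(pool, pool[0], batch, threshold=1.0)
```

In this toy run the batch lies far from the atoms of `concept_a`, so a drift is flagged and `concept_b` is chosen without consulting any labels. A fuller treatment would also need a fallback for the case where no pooled model fits, which is presumably where re-training and label acquisition come in.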

Original language: English
Title of host publication: DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics
Editors: George Karypis, Longbing Cao, Wei Wang, Irwin King
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 290-296
Number of pages: 7
ISBN (Electronic): 9781479969913
DOIs: https://doi.org/10.1109/DSAA.2014.7058087
Publication status: Published - 2014 Mar 10
Event: 2014 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2014 - Shanghai, China
Duration: 2014 Oct 30 to 2014 Nov 1

Publication series

Name: DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics

Other

Other: 2014 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2014
Country: China
City: Shanghai
Period: 14-10-30 to 14-11-01

Fingerprint

Labels
Glossaries
Data streams
Model selection
Availability

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Information Systems
  • Information Systems and Management

Cite this

Chen, B. H., & Chuang, K-T. (2014). A model-selection framework for concept-drifting data streams. In G. Karypis, L. Cao, W. Wang, & I. King (Eds.), DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics (pp. 290-296). [7058087] (DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/DSAA.2014.7058087
Chen, Bo Heng ; Chuang, Kun-Ta. / A model-selection framework for concept-drifting data streams. DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics. editor / George Karypis ; Longbing Cao ; Wei Wang ; Irwin King. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 290-296 (DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics).
@inproceedings{16becc487d1b4d139157c5d69a693605,
title = "A model-selection framework for concept-drifting data streams",
abstract = "There has been an increasing research interest in classification for data streams. Due to the evolving nature of data streams, it is a highly challenging issue to detect the appearance of concept drifts, which will make the current classification model invalid as time passes. So far most stream classification solutions exploit the so-called incremental learning process to continuously track the deviation of prediction accuracy. Unfortunately, to achieve the prompt concept-drifting detection, such strategies usually rely on an infeasible assumption about the availability of data instances with true labels. We in this paper propose a new framework, called Inference of Concept Evolution (abbreviated as ICE), to minimize the need of real-time acquisition of true labels. Specifically, the ICE framework is devised based on the idea of model reuse. The dictionary learning technique is utilized to determine whether the concept drift appears without the need of label acquisition. When the drift happens, the ICE framework will select the best model maintained in the model pool, decreasing the need of model re-training and its costly label acquisition. As demonstrated in our experimental result, the ICE framework can track the best model correctly and efficiently, showing its feasibility in real cases.",
author = "Chen, {Bo Heng} and Chuang, Kun-Ta",
year = "2014",
month = "3",
day = "10",
doi = "10.1109/DSAA.2014.7058087",
language = "English",
series = "DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "290--296",
editor = "George Karypis and Longbing Cao and Wei Wang and Irwin King",
booktitle = "DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics",
address = "United States",

}

Chen, BH & Chuang, K-T 2014, A model-selection framework for concept-drifting data streams. in G Karypis, L Cao, W Wang & I King (eds), DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics, 7058087, DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics, Institute of Electrical and Electronics Engineers Inc., pp. 290-296, 2014 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2014, Shanghai, China, 14-10-30. https://doi.org/10.1109/DSAA.2014.7058087

TY - GEN

T1 - A model-selection framework for concept-drifting data streams

AU - Chen, Bo Heng

AU - Chuang, Kun-Ta

PY - 2014/3/10

Y1 - 2014/3/10

AB - There has been an increasing research interest in classification for data streams. Due to the evolving nature of data streams, it is a highly challenging issue to detect the appearance of concept drifts, which will make the current classification model invalid as time passes. So far most stream classification solutions exploit the so-called incremental learning process to continuously track the deviation of prediction accuracy. Unfortunately, to achieve the prompt concept-drifting detection, such strategies usually rely on an infeasible assumption about the availability of data instances with true labels. We in this paper propose a new framework, called Inference of Concept Evolution (abbreviated as ICE), to minimize the need of real-time acquisition of true labels. Specifically, the ICE framework is devised based on the idea of model reuse. The dictionary learning technique is utilized to determine whether the concept drift appears without the need of label acquisition. When the drift happens, the ICE framework will select the best model maintained in the model pool, decreasing the need of model re-training and its costly label acquisition. As demonstrated in our experimental result, the ICE framework can track the best model correctly and efficiently, showing its feasibility in real cases.

UR - http://www.scopus.com/inward/record.url?scp=84946693371&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84946693371&partnerID=8YFLogxK

U2 - 10.1109/DSAA.2014.7058087

DO - 10.1109/DSAA.2014.7058087

M3 - Conference contribution

T3 - DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics

SP - 290

EP - 296

BT - DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics

A2 - Karypis, George

A2 - Cao, Longbing

A2 - Wang, Wei

A2 - King, Irwin

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Chen BH, Chuang K-T. A model-selection framework for concept-drifting data streams. In Karypis G, Cao L, Wang W, King I, editors, DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics. Institute of Electrical and Electronics Engineers Inc. 2014. p. 290-296. 7058087. (DSAA 2014 - Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics). https://doi.org/10.1109/DSAA.2014.7058087