Greedy active learning algorithm for logistic regression models

Hsiang Ling Hsu, Yuan chin Ivan Chang, Ray-Bing Chen

Research output: Contribution to journal › Article

Abstract

We study a logistic model-based active learning procedure for binary classification problems, in which we adopt a batch subject selection strategy based on a modified sequential experimental design method. Alongside the proposed subject selection scheme, we conduct a greedy variable selection procedure so that the classification model can be updated with all labeled training subjects. The proposed algorithm repeats the subject and variable selection steps until a prespecified stopping criterion is reached. Our numerical results show that the proposed procedure achieves competitive performance while using a smaller training set and a more compact model than a classifier trained with all variables on the full data set. We also apply the proposed procedure to a well-known wave data set (Breiman et al., 1984) and a MAGIC gamma telescope data set to confirm the performance of our method.
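The loop described in the abstract (select a batch of subjects, label them, reselect variables, refit, repeat until a stopping rule fires) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not state the design criterion, the variable selection rule, or the stopping criterion, so the sketch makes assumptions. Subjects are scored by the Fisher-information weight p(1-p) under the current fit, variables are added by forward selection on log-loss, and the loop stops at a fixed labeling budget. All function names, parameters, and the synthetic pool (make_pool) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, cols, steps=200, lr=0.5, l2=1e-2):
    """Fit an intercept plus weights for the columns in `cols` by gradient descent."""
    rows = [[1.0] + [x[c] for c in cols] for x in X]
    w = [0.0] * (len(cols) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for r, t in zip(rows, y):
            err = sigmoid(sum(a * b for a, b in zip(w, r))) - t
            for j, v in enumerate(r):
                grad[j] += err * v
        w = [wj - lr * (g / len(rows) + l2 * wj) for wj, g in zip(w, grad)]
    return w


def prob(w, x, cols):
    return sigmoid(w[0] + sum(wj * x[c] for wj, c in zip(w[1:], cols)))


def log_loss(w, X, y, cols):
    total = 0.0
    for x, t in zip(X, y):
        p = min(max(prob(w, x, cols), 1e-12), 1.0 - 1e-12)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(X)


def greedy_select(X, y, max_vars, min_gain=0.01):
    """Forward selection (assumed rule): add the variable that lowers log-loss most."""
    cols, best = [], log_loss(fit_logistic(X, y, []), X, y, [])
    while len(cols) < min(max_vars, len(X[0])):
        trials = []
        for c in range(len(X[0])):
            if c not in cols:
                trial = cols + [c]
                trials.append((log_loss(fit_logistic(X, y, trial), X, y, trial), c))
        loss, c = min(trials)
        if best - loss < min_gain:
            break
        cols.append(c)
        best = loss
    return cols


def active_learn(pool, oracle, n_init=20, batch=5, budget=40, max_vars=2, seed=0):
    rng = random.Random(seed)
    order = list(range(len(pool)))
    rng.shuffle(order)
    labeled, unlabeled = order[:n_init], order[n_init:]
    labels = {i: oracle(i) for i in labeled}
    while True:
        X = [pool[i] for i in labeled]
        y = [labels[i] for i in labeled]
        cols = greedy_select(X, y, max_vars)   # variable selection step
        w = fit_logistic(X, y, cols)           # refit on all labeled subjects
        if len(labeled) >= budget or not unlabeled:   # assumed stopping rule
            return w, cols, labeled

        def info(i):
            # Assumed subject score: p(1-p), largest near the decision boundary.
            p = prob(w, pool[i], cols)
            return p * (1.0 - p)

        unlabeled.sort(key=info, reverse=True)
        k = min(batch, budget - len(labeled))
        picked, unlabeled = unlabeled[:k], unlabeled[k:]
        for i in picked:
            labels[i] = oracle(i)              # query labels for the batch
        labeled = labeled + picked


def make_pool(n=300, d=5, seed=1):
    # Hypothetical synthetic pool: only variable 0 determines the class.
    rng = random.Random(seed)
    pool = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
    truth = [1 if x[0] > 0 else 0 for x in pool]
    return pool, truth
```

Calling active_learn(pool, lambda i: truth[i]) on the output of make_pool() labels at most 40 of the 300 subjects and returns the fitted weights, the selected variable indices, and the labeled indices. A design-based criterion such as D-optimality would replace the info score; the rest of the loop would keep the same shape.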

Language: English
Pages: 119-134
Number of pages: 16
Journal: Computational Statistics and Data Analysis
Volume: 129
DOI: 10.1016/j.csda.2018.08.013
Publication status: Published - 2019 Jan 1

Fingerprint

Active Learning
Logistic Regression Model
Learning algorithms
Logistics
Learning Algorithm
Variable Selection
Binary Classification
Stopping Criterion
Logistic Model
Selection Procedures
Experimental design
Telescopes
Classification Problems
Design of experiments
Batch
Design Method
Telescope
Classifiers
Update
Classifier

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Computational Mathematics
  • Computational Theory and Mathematics
  • Applied Mathematics

Cite this

@article{acbc5277cccd4640be51e6dbcff8e4ad,
title = "Greedy active learning algorithm for logistic regression models",
author = "Hsu, {Hsiang Ling} and Chang, {Yuan chin Ivan} and Chen, Ray-Bing",
year = "2019",
month = "1",
day = "1",
doi = "10.1016/j.csda.2018.08.013",
language = "English",
volume = "129",
pages = "119--134",
journal = "Computational Statistics and Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",
}

Greedy active learning algorithm for logistic regression models. / Hsu, Hsiang Ling; Chang, Yuan chin Ivan; Chen, Ray-Bing.

In: Computational Statistics and Data Analysis, Vol. 129, 01.01.2019, p. 119-134.

TY - JOUR

T1 - Greedy active learning algorithm for logistic regression models

AU - Hsu, Hsiang Ling

AU - Chang, Yuan chin Ivan

AU - Chen, Ray-Bing

PY - 2019/1/1

Y1 - 2019/1/1

UR - http://www.scopus.com/inward/record.url?scp=85053065230&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85053065230&partnerID=8YFLogxK

U2 - 10.1016/j.csda.2018.08.013

DO - 10.1016/j.csda.2018.08.013

M3 - Article

VL - 129

SP - 119

EP - 134

JO - Computational Statistics and Data Analysis

T2 - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

SN - 0167-9473

ER -