TY - JOUR
T1 - Greedy active learning algorithm for logistic regression models
AU - Hsu, Hsiang Ling
AU - Chang, Yuan-chin Ivan
AU - Chen, Ray Bing
N1 - Funding Information:
The research of Prof. Hsu was partly supported by the Ministry of Science and Technology (MOST) of Taiwan under Grant No. MOST 105-2118-M-390-004. The research of Dr. Chang was partly supported by MOST under Grant Nos. MOST 106-2118-M-001-007-MY2 and MOST 103-2118-M-001-002-MY3. The research of Prof. Chen was partially supported by MOST under Grant No. MOST 105-2628-M-006-002-MY2 and by the Mathematics Division of the National Center for Theoretical Sciences in Taiwan.
Publisher Copyright:
© 2018 Elsevier B.V.
PY - 2019/1
Y1 - 2019/1
N2 - We study a logistic model-based active learning procedure for binary classification problems, in which we adopt a batch subject selection strategy based on a modified sequential experimental design method. In addition, alongside the proposed subject selection scheme, we conduct a greedy variable selection procedure so that the classification model can be updated with all labeled training subjects. The proposed algorithm repeats the subject and variable selection steps until a predefined stopping criterion is reached. Our numerical results show that the proposed procedure achieves competitive performance with a smaller training size and a more compact model than a classifier trained with all variables and the full data set. We also apply the proposed procedure to the well-known wave data set (Breiman et al., 1984) and the MAGIC gamma telescope data set to confirm the performance of our method.
AB - We study a logistic model-based active learning procedure for binary classification problems, in which we adopt a batch subject selection strategy based on a modified sequential experimental design method. In addition, alongside the proposed subject selection scheme, we conduct a greedy variable selection procedure so that the classification model can be updated with all labeled training subjects. The proposed algorithm repeats the subject and variable selection steps until a predefined stopping criterion is reached. Our numerical results show that the proposed procedure achieves competitive performance with a smaller training size and a more compact model than a classifier trained with all variables and the full data set. We also apply the proposed procedure to the well-known wave data set (Breiman et al., 1984) and the MAGIC gamma telescope data set to confirm the performance of our method.
UR - http://www.scopus.com/inward/record.url?scp=85053065230&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85053065230&partnerID=8YFLogxK
U2 - 10.1016/j.csda.2018.08.013
DO - 10.1016/j.csda.2018.08.013
M3 - Article
AN - SCOPUS:85053065230
VL - 129
SP - 119
EP - 134
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
SN - 0167-9473
ER -