TY - JOUR
T1 - Early stopping in L2 Boosting
AU - Chang, Yuan-chin Ivan
AU - Huang, Yufen
AU - Huang, Yu-Pai
N1 - Funding Information:
The authors would like to thank the editor and referees for detailed and thoughtful comments, which aided our revision. The corresponding author is partially supported by a grant from the National Science Council of Taiwan (NSC-96-2118-M-194-002-MY2).
PY - 2010/10/1
Y1 - 2010/10/1
N2 - It is well known that boosting-like algorithms, such as AdaBoost and many of its modifications, may over-fit the training data when the number of boosting iterations becomes large. Therefore, how to stop a boosting algorithm at an appropriate iteration has been a longstanding problem over the past decade (see Meir and Rätsch, 2003). Bühlmann and Yu (2005) applied model selection criteria to estimate the stopping iteration for L2Boosting, but their approach still requires computing all boosting iterations under consideration for the training data. The main purpose of this paper is therefore to study an early stopping rule for L2Boosting during the training stage that yields very substantial computational savings. The proposed method is based on detecting a change point in the values of the model selection criteria during the training stage. The method is also extended to two-class classification problems, which are very common in medical and bioinformatics applications. A simulation study and a real data example are provided for illustration, and comparisons are made with LogitBoost.
AB - It is well known that boosting-like algorithms, such as AdaBoost and many of its modifications, may over-fit the training data when the number of boosting iterations becomes large. Therefore, how to stop a boosting algorithm at an appropriate iteration has been a longstanding problem over the past decade (see Meir and Rätsch, 2003). Bühlmann and Yu (2005) applied model selection criteria to estimate the stopping iteration for L2Boosting, but their approach still requires computing all boosting iterations under consideration for the training data. The main purpose of this paper is therefore to study an early stopping rule for L2Boosting during the training stage that yields very substantial computational savings. The proposed method is based on detecting a change point in the values of the model selection criteria during the training stage. The method is also extended to two-class classification problems, which are very common in medical and bioinformatics applications. A simulation study and a real data example are provided for illustration, and comparisons are made with LogitBoost.
UR - http://www.scopus.com/inward/record.url?scp=77955272634&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77955272634&partnerID=8YFLogxK
U2 - 10.1016/j.csda.2010.03.024
DO - 10.1016/j.csda.2010.03.024
M3 - Article
AN - SCOPUS:77955272634
SN - 0167-9473
VL - 54
SP - 2203
EP - 2213
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
IS - 10
ER -