Active learning with simultaneous subject and variable selections

Yuan chin Ivan Chang, Ray Bing Chen

Research output: Article › peer-review

1 Citation (Scopus)


Training data are essential for learning classification models. Therefore, if only a limited number of labeled subjects are available as training samples while a considerable amount of unlabeled data already exists, it is desirable to enlarge the training set by labeling more subjects in order to improve the classification model. When labeling unlabeled subjects is costly in time and money, it is crucial to know how many labeled subjects are necessary for training a satisfactory classification model. Although active learning methods can gradually recruit new unlabeled subjects and disclose their label information to enlarge the training set, there is little discussion in the literature about the required size of the training sample. Hence, this paper studies when and how to appropriately stop an active learning procedure. Since active learning procedures recruit subjects sequentially, it is natural to adopt ideas from sequential analysis to determine the training sample size dynamically and adaptively. In this study, we propose a stopping criterion for a linear model-based active learning procedure such that, when the procedure is stopped, the learning process asymptotically achieves its best possible empirical performance in terms of the area under the receiver operating characteristic (ROC) curve. Other statistical properties of the proposed procedure, including estimation consistency and variable selection, are also studied. Numerical results using both synthetic data and a real example are reported.
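To make the workflow concrete, the following is a minimal sketch (not the authors' actual procedure) of a sequential active-learning loop with an AUC-based stopping rule: a logistic model is refit as uncertain subjects are queried one at a time, and the loop stops once held-out AUC stops improving. The plateau rule (`eps`, `window`), the data-generating weights, and all function names here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # Clip logits to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_logistic(X, y, lr=0.1, steps=500):
    """Plain gradient-ascent logistic regression; returns the weight vector."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w += lr * X.T @ (y - p) / len(y)
    return w

def auc(scores, y):
    """Area under the ROC curve via the Mann-Whitney U statistic."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = y == 1
    n1, n0 = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

# Synthetic linear-model data; the last two variables are pure noise,
# mimicking a setting where variable selection matters.
n, d = 600, 5
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 1.0, 0.0, 0.0])
y = (X @ w_true + rng.normal(scale=0.5, size=n) > 0).astype(float)

# Small labeled seed, large unlabeled pool, held-out evaluation set.
labeled = list(range(10))
pool = list(range(10, 400))
test = list(range(400, n))

history, eps, window = [], 1e-3, 5   # illustrative stopping parameters
for t in range(200):
    w = fit_logistic(X[labeled], y[labeled])
    history.append(auc(X[test] @ w, y[test]))
    # Stop once recent AUC gains fall below the tolerance.
    if len(history) > window and history[-1] - history[-1 - window] < eps:
        break
    # Uncertainty sampling: query the pool point closest to the boundary.
    p = sigmoid(X[pool] @ w)
    i = int(np.argmin(np.abs(p - 0.5)))
    labeled.append(pool.pop(i))

print(f"stopped after {len(labeled) - 10} queries, test AUC = {history[-1]:.3f}")
```

The sketch stops on an empirical AUC plateau; the paper's criterion is instead derived from sequential analysis with asymptotic guarantees, which this toy rule does not provide.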

Pages (from - to): 495-505
Publication status: Published - 2019 Feb 15

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Cognitive Neuroscience
  • Artificial Intelligence
