A preliminary study of variable selection in penalized logistic regression with rare events data

  • 林 鼎晃

Student thesis: Doctoral Thesis

Abstract

It is well known that the accuracy of the maximum likelihood estimate (MLE) of the regression coefficients in a logistic regression model is seriously affected by rare events. Less attention has been given to the performance of variable selection in logistic regression with rare events. This thesis therefore studies the performance of three variable selection methods, LASSO (Least Absolute Shrinkage and Selection Operator), SCAD (Smoothly Clipped Absolute Deviation), and Adaptive LASSO, when the event rate is low and the number of explanatory variables is much larger than the sample size. A simulation study is conducted to compare their accuracy in selecting the important explanatory variables of the logistic regression model. Based on a limited set of simulation scenarios, when the event rate is as low as 0.05, the simulation results favor Adaptive LASSO for selecting important explanatory variables. Consequently, Adaptive LASSO is recommended for variable selection and prediction with rare events data.
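
The kind of simulation described in the abstract can be illustrated with a minimal sketch. It is not the thesis's code or design: the sample size, number of variables, intercept, effect size, penalty level, and all function names (`simulate`, `lasso_logistic`, `lambda_max`, and so on) are assumptions chosen for illustration, and only plain LASSO is shown, fitted by a basic proximal-gradient loop in pure Python. SCAD and Adaptive LASSO are not implemented here. The intercept is chosen with the intention of producing a low event rate, but the realized rate depends on the random draw.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def simulate(n, p, k, intercept, effect, seed):
    """Draw X ~ N(0, 1) and a binary y; only the first k columns carry signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = []
    for row in X:
        eta = intercept + effect * sum(row[:k])
        y.append(1 if rng.random() < sigmoid(eta) else 0)
    return X, y


def initial_intercept(y):
    """Logit of the observed event rate, clipped away from 0 and 1."""
    n = len(y)
    rate = min(max(sum(y) / n, 0.5 / n), 1.0 - 0.5 / n)
    return math.log(rate / (1.0 - rate))


def gradient(cols, rows, y, b0, beta):
    """Gradient of the mean logistic loss w.r.t. the intercept and slopes."""
    n = len(y)
    resid = [sigmoid(b0 + sum(b * x for b, x in zip(beta, row))) - yi
             for row, yi in zip(rows, y)]
    g0 = sum(resid) / n
    g = [sum(r * x for r, x in zip(resid, col)) / n for col in cols]
    return g0, g


def lambda_max(X, y):
    """Largest absolute slope gradient at the intercept-only starting point."""
    cols = list(zip(*X))
    _, g = gradient(cols, X, y, initial_intercept(y), [0.0] * len(cols))
    return max(abs(v) for v in g)


def lasso_logistic(X, y, lam, iters=200):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    cols = list(zip(*X))
    # Conservative Lipschitz bound: 0.25 * trace([1 X]'[1 X]) / n.
    lipschitz = 0.25 * (n + sum(x * x for row in X for x in row)) / n
    step = 1.0 / lipschitz
    b0, beta = initial_intercept(y), [0.0] * p
    for _ in range(iters):
        g0, g = gradient(cols, X, y, b0, beta)
        b0 -= step * g0  # the intercept is not penalized
        beta = [soft_threshold(b - step * gj, step * lam)
                for b, gj in zip(beta, g)]
    return b0, beta


# Hypothetical scenario: more variables than observations, few events.
X, y = simulate(n=150, p=200, k=5, intercept=-5.0, effect=1.0, seed=0)
lam = 0.5 * lambda_max(X, y)  # assumed penalty level, not tuned
b0, beta = lasso_logistic(X, y, lam)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("event rate:", sum(y) / len(y))
print("selected columns:", selected, "(true signal columns are 0 to 4)")
```

Comparing `selected` with the known signal columns over many seeds and event rates would give selection-accuracy summaries in the spirit of the study. The step size uses a deliberately loose bound, so convergence is slow, and the penalty would normally be chosen by cross-validation or an information criterion rather than fixed as here.
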
Date of Award: 2019
Original language: English
Supervisor: Yun-Chan Chi
