LASSO Variable Selection in Data Envelopment Analysis and Convex Nonparametric Least Squares

  • 蔡 佳盈

Student thesis: Master's Thesis


The number of input and output factors has a significant impact on the production function estimated by data envelopment analysis (DEA). That is, the "curse of dimensionality" becomes an issue when a small number of observations is used to estimate a high-dimensional frontier. The study uses a data generating process (DGP) to argue that the typical "rule of thumb", e.g., that the number of observations should be larger than twice the number of inputs and outputs used in DEA, is ambiguous and may lead to large deviations in technical efficiency estimation. Hence, this study proposes a variable selection technique to address this issue. The study is divided into two parts: the single-output, multiple-input scenario (Chapter 3) and the multiple-output, multiple-input scenario (Chapter 4). In Chapter 3, we propose applying the Least Absolute Shrinkage and Selection Operator (LASSO), a variable selection technique usually used in data mining, to extract significant factors in the formulation of sign-constrained convex nonparametric least squares (SCNLS), which can be regarded as DEA; the results show that the proposed LASSO-SCNLS method provides useful guidelines for dimension reduction in DEA. In Chapter 4, we suggest a Principal Component Analysis (PCA) Group-LASSO SCNLS method for variable selection, and the results show that it performs well for dimension reduction.
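To illustrate how an L1 penalty can drop a weak factor, here is a minimal sketch of ordinary linear LASSO solved by coordinate descent. It is not the LASSO-SCNLS formulation of the thesis, which adds the penalty to a shape-constrained regression. The function names, the toy data, and the penalty value `lam=0.5` are all assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * sum((y - X @ beta)^2) + lam * sum(|beta_j|).

    X is a list of rows, y a list of responses (hypothetical toy inputs).
    Assumes no column of X is identically zero.
    """
    n = len(X)
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # squared norm of column j
            for i in range(n):
                partial = y[i] - sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * partial
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Toy data: the first factor drives y strongly, the second only weakly.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.0 * x1 + 0.1 * x2 for x1, x2 in X]

beta = lasso_coordinate_descent(X, y, lam=0.5)
print(beta)  # the weak second factor is shrunk to exactly 0.0
```

With these orthogonal columns the penalty removes the weak factor entirely and shrinks the strong one, which is the selection behavior the abstract relies on for dimension reduction.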
Date of Award: 2017 Jul 27
Original language: English
Supervisor: Chia-Yen Lee
