TY - JOUR
T1 - Automated learning of mixtures of factor analysis models with missing information
AU - Wang, Wan Lun
AU - Lin, Tsung I.
N1 - Funding Information:
The authors gratefully acknowledge the editors and two anonymous referees for their comments and suggestions that greatly improved the quality of this paper. We are also grateful to Ms. Ying-Ting Lin for her assistance in initial simulations. This research was supported by the Ministry of Science and Technology of Taiwan under Grant Nos. 107-2628-M-035-001-MY3 and 107-2118-M-005-002-MY2.
Publisher Copyright:
© 2020, Sociedad de Estadística e Investigación Operativa.
PY - 2020/12
Y1 - 2020/12
N2 - The mixture of factor analyzers (MFA) model has emerged as a useful tool for dimensionality reduction and model-based clustering of heterogeneous data. To find the most appropriate number of factors (q) of an MFA model with the number of components (g) fixed a priori, a two-stage procedure is commonly used: parameters are first estimated for each of a prespecified set of numbers of factors, and the best q is then selected according to penalized likelihood criteria. As the dimensionality of the data grows, such a procedure can become computationally prohibitive. To overcome this obstacle, we develop an automated learning scheme, called the automated MFA (AMFA) algorithm, that merges parameter estimation and selection of q into a one-stage algorithm. The proposed AMFA procedure, which has a much lower computational cost, is also extended to accommodate missing values. Moreover, we explicitly derive the score vector and the empirical information matrix for calculating standard errors of the estimated parameters. The potential and applicability of the proposed method are demonstrated on a number of real datasets with genuine and synthetic missing values.
UR - http://www.scopus.com/inward/record.url?scp=85078757155&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078757155&partnerID=8YFLogxK
U2 - 10.1007/s11749-020-00702-6
DO - 10.1007/s11749-020-00702-6
M3 - Article
AN - SCOPUS:85078757155
SN - 1133-0686
VL - 29
SP - 1098
EP - 1124
JO - Test
JF - Test
IS - 4
ER -