TY - JOUR
T1 - Flexible clustering via extended mixtures of common t-factor analyzers
AU - Wang, Wan Lun
AU - Lin, Tsung I.
N1 - Funding Information:
The authors are grateful to the Chief Editor, the Associate Editor, and two anonymous reviewers for their insightful comments and suggestions that greatly improved this article. This work was partially supported by the Ministry of Science and Technology of Taiwan under Grant Nos. MOST 105-2118-M-035-004-MY2 and MOST 105-2118-M-005-003-MY2.
Publisher Copyright:
© 2016, Springer-Verlag Berlin Heidelberg.
PY - 2017/7/1
Y1 - 2017/7/1
N2 - Mixtures of t-factor analyzers have been broadly used for model-based density estimation and clustering of high-dimensional data from a heterogeneous population with longer-than-normal tails or atypical observations. To reduce the number of parameters in the component covariance matrices, mixtures of common t-factor analyzers (MCtFA) were recently proposed, which assume a common factor loading across different components. In this paper, we present an extended version of MCtFA that uses distinct covariance matrices for the component errors. The modified mixture model offers a more appropriate way to represent the data graphically. Two flexible EM-type algorithms are developed for iteratively computing maximum likelihood estimates of the parameters. Practical considerations for the specification of starting values, model-based clustering, classification of new subjects, and identification of potential outliers are also provided. We demonstrate the superiority of the proposed methodology through an analysis of the Italian wine data and a simulation study.
AB - Mixtures of t-factor analyzers have been broadly used for model-based density estimation and clustering of high-dimensional data from a heterogeneous population with longer-than-normal tails or atypical observations. To reduce the number of parameters in the component covariance matrices, mixtures of common t-factor analyzers (MCtFA) were recently proposed, which assume a common factor loading across different components. In this paper, we present an extended version of MCtFA that uses distinct covariance matrices for the component errors. The modified mixture model offers a more appropriate way to represent the data graphically. Two flexible EM-type algorithms are developed for iteratively computing maximum likelihood estimates of the parameters. Practical considerations for the specification of starting values, model-based clustering, classification of new subjects, and identification of potential outliers are also provided. We demonstrate the superiority of the proposed methodology through an analysis of the Italian wine data and a simulation study.
UR - http://www.scopus.com/inward/record.url?scp=84994322684&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994322684&partnerID=8YFLogxK
U2 - 10.1007/s10182-016-0281-0
DO - 10.1007/s10182-016-0281-0
M3 - Article
AN - SCOPUS:84994322684
SN - 1863-8171
VL - 101
SP - 227
EP - 252
JO - AStA Advances in Statistical Analysis
JF - AStA Advances in Statistical Analysis
IS - 3
ER -