TY - JOUR
T1 - Robust clustering of multiply censored data via mixtures of t factor analyzers
AU - Wang, Wan Lun
AU - Lin, Tsung I.
N1 - Funding Information:
The authors gratefully acknowledge the editors and two anonymous referees for their insightful comments and constructive suggestions that greatly improved the quality of this paper. We are also grateful to Ms. Ting-Yu Lin for her skillful assistance in initial simulations and help sketching some graphs. This research was supported by the Ministry of Science and Technology of Taiwan under Grant Nos. 107-2628-M-035-001-MY3 and 109-2118-M-005-005-MY3.
Publisher Copyright:
© 2021, Sociedad de Estadística e Investigación Operativa.
PY - 2022/3
Y1 - 2022/3
N2 - Mixtures of t factor analyzers (MtFA) have been well recognized as a prominent tool in modeling and clustering multivariate data contaminated with heterogeneity and outliers. In certain practical situations, however, data are likely to be censored such that the standard methodology becomes computationally complicated or even infeasible. This paper presents an extended framework of MtFA that can accommodate censored data, referred to as MtFAC for short. For maximum likelihood estimation, we construct an alternating expectation conditional maximization algorithm in which the E-step relies on the first two moments of truncated multivariate-t distributions and the CM-steps offer tractable solutions of updated estimators. Asymptotic standard errors of mixing proportions and component mean vectors are derived by means of the missing information principle, or the so-called Louis’ method. Several numerical experiments are conducted to examine the finite-sample properties of estimators and the ability of the proposed model to downweight the impact of censoring and outlying effects. Further, the efficacy and usefulness of the proposed method are also demonstrated by analyzing a real dataset with genuine censored observations.
AB - Mixtures of t factor analyzers (MtFA) have been well recognized as a prominent tool in modeling and clustering multivariate data contaminated with heterogeneity and outliers. In certain practical situations, however, data are likely to be censored such that the standard methodology becomes computationally complicated or even infeasible. This paper presents an extended framework of MtFA that can accommodate censored data, referred to as MtFAC for short. For maximum likelihood estimation, we construct an alternating expectation conditional maximization algorithm in which the E-step relies on the first two moments of truncated multivariate-t distributions and the CM-steps offer tractable solutions of updated estimators. Asymptotic standard errors of mixing proportions and component mean vectors are derived by means of the missing information principle, or the so-called Louis’ method. Several numerical experiments are conducted to examine the finite-sample properties of estimators and the ability of the proposed model to downweight the impact of censoring and outlying effects. Further, the efficacy and usefulness of the proposed method are also demonstrated by analyzing a real dataset with genuine censored observations.
UR - http://www.scopus.com/inward/record.url?scp=85104092481&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85104092481&partnerID=8YFLogxK
U2 - 10.1007/s11749-021-00766-y
DO - 10.1007/s11749-021-00766-y
M3 - Article
AN - SCOPUS:85104092481
SN - 1133-0686
VL - 31
SP - 22
EP - 53
JO - Test
JF - Test
IS - 1
ER -