TY - JOUR
T1 - Adaptive decision tree-based phone cluster models for speaker clustering
AU - Hsieh, Chia Hsin
AU - Wu, Chung Hsien
AU - Shen, Han Ping
PY - 2008
Y1 - 2008
N2 - This study presents an approach to speaker clustering using adaptive decision tree-based phone cluster models (DT-PCMs). First, a large broadcast news database is used to train a set of phone models for universal speakers. The multi-space probability distribution hidden Markov model (MSD-HMM) is adopted for phone modeling. Confusing phone models are merged into phone clusters. Next, for each state in the phone MSD-HMMs, a decision tree is constructed to store the contextual, phonetic, and speaker characteristics for data sharing over all speakers. For speaker clustering, each input speech segment is used to retrieve the Gaussian models from the DT-PCMs to construct the initial speaker-dependent phone cluster models. Finally, all the corresponding adapted speaker-dependent phone cluster models are used for speaker clustering via a cross-likelihood ratio measure. The experimental results show that the DT-PCMs outperform the conventional GMM-based approach.
AB - This study presents an approach to speaker clustering using adaptive decision tree-based phone cluster models (DT-PCMs). First, a large broadcast news database is used to train a set of phone models for universal speakers. The multi-space probability distribution hidden Markov model (MSD-HMM) is adopted for phone modeling. Confusing phone models are merged into phone clusters. Next, for each state in the phone MSD-HMMs, a decision tree is constructed to store the contextual, phonetic, and speaker characteristics for data sharing over all speakers. For speaker clustering, each input speech segment is used to retrieve the Gaussian models from the DT-PCMs to construct the initial speaker-dependent phone cluster models. Finally, all the corresponding adapted speaker-dependent phone cluster models are used for speaker clustering via a cross-likelihood ratio measure. The experimental results show that the DT-PCMs outperform the conventional GMM-based approach.
UR - http://www.scopus.com/inward/record.url?scp=84867217655&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867217655&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84867217655
SN - 2308-457X
SP - 861
EP - 864
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association
Y2 - 22 September 2008 through 26 September 2008
ER -