Adaptive decision tree-based phone cluster models for speaker clustering

Chia Hsin Hsieh, Chung-Hsien Wu, Han Ping Shen

Research output: Conference article, peer-reviewed

2 Citations (Scopus)

Abstract

This study presents an approach to speaker clustering using adaptive decision tree-based phone cluster models (DT-PCMs). First, a large broadcast news database is used to train a set of phone models for universal speakers. The multi-space probability distribution hidden Markov model (MSD-HMM) is adopted for phone modeling. Confusable phone models are merged into phone clusters. Next, for each state in the phone MSD-HMMs, a decision tree is constructed to store the contextual, phonetic, and speaker characteristics for data sharing over all speakers. For speaker clustering, each input speech segment is used to retrieve the Gaussian models from the DT-PCMs to construct the initial speaker-dependent phone cluster models. Finally, all the corresponding adapted speaker-dependent phone cluster models are used for speaker clustering via a cross-likelihood ratio measure. The experimental results show that the DT-PCMs outperform the conventional GMM-based approach.
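The final step, comparing segments with a cross-likelihood ratio (CLR), can be illustrated with a minimal sketch. It uses the commonly cited CLR form (average log-likelihood of each segment under the other segment's model, relative to a background model), which may differ from the exact normalization used in the paper. As a loud simplification, each segment is modeled by a single diagonal Gaussian rather than by adapted MSD-HMM phone cluster models, the data are synthetic, and all function names are hypothetical.

```python
import math
import random


def fit_diag_gaussian(frames):
    """Fit a diagonal-covariance Gaussian; returns (mean, variance) lists.

    Assumption: stands in for an adapted speaker-dependent model.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # A variance floor keeps log-likelihoods finite for short segments.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-3)
           for d in range(dim)]
    return mean, var


def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood under a diagonal Gaussian."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def cross_likelihood_ratio(seg_i, seg_j, model_i, model_j, background):
    """CLR: each segment scored by the other's model, relative to background.

    Larger values suggest the two segments come from the same speaker.
    """
    return (avg_log_likelihood(seg_i, model_j)
            - avg_log_likelihood(seg_i, background)
            + avg_log_likelihood(seg_j, model_i)
            - avg_log_likelihood(seg_j, background))


# Synthetic 2-D "features": two segments from speaker A, one from speaker B.
rng = random.Random(0)


def make_segment(center, n=200):
    return [[rng.gauss(center, 1.0), rng.gauss(-center, 1.0)] for _ in range(n)]


seg_a1, seg_a2, seg_b = make_segment(2.0), make_segment(2.0), make_segment(-2.0)

# The background model plays the role of the universal (all-speaker) model.
background = fit_diag_gaussian(seg_a1 + seg_a2 + seg_b)
model_a1, model_a2, model_b = (fit_diag_gaussian(s)
                               for s in (seg_a1, seg_a2, seg_b))

clr_same = cross_likelihood_ratio(seg_a1, seg_a2, model_a1, model_a2, background)
clr_diff = cross_likelihood_ratio(seg_a1, seg_b, model_a1, model_b, background)
print("same speaker:", clr_same, "different speakers:", clr_diff)
```

In a clustering loop, the pair of segments or clusters with the highest CLR would typically be merged first, and merging would stop once the best score falls below a chosen threshold. That stopping rule is a common convention, not something stated in the abstract.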

Original language: English
Pages (from-to): 861-864
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 1 Dec 2008
Event: INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association - Brisbane, QLD, Australia
Duration: 22 Sep 2008 to 26 Sep 2008

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Sensory Systems
