Adaptive decision tree-based phone cluster models for speaker clustering

Chia Hsin Hsieh, Chung-Hsien Wu, Han Ping Shen

Research output: Contribution to journal › Conference article

2 Citations (Scopus)

Abstract

This study presents an approach to speaker clustering using adaptive decision tree-based phone cluster models (DT-PCMs). First, a large broadcast news database is used to train a set of phone models for universal speakers. The multi-space probability distribution hidden Markov model (MSD-HMM) is adopted for phone modeling. Confusable phone models are merged into phone clusters. Next, for each state in the phone MSD-HMMs, a decision tree is constructed to store the contextual, phonetic, and speaker characteristics for data sharing over all speakers. For speaker clustering, each input speech segment is used to retrieve the Gaussian models from the DT-PCMs to construct the initial speaker-dependent phone cluster models. Finally, all the corresponding adapted speaker-dependent phone cluster models are used for speaker clustering via a cross-likelihood ratio measure. The experimental results show that the DT-PCMs outperform the conventional GMM-based approach.
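
The cross-likelihood ratio mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes the commonly used form that scores each segment under the other segment's model relative to a shared background model. Single diagonal Gaussians stand in for the adapted phone cluster models, and the synthetic data and all function names are hypothetical.

```python
import numpy as np

def fit_diag_gaussian(x):
    """Fit one diagonal-covariance Gaussian (a stand-in for a richer model)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0) + 1e-6  # small floor avoids division by zero
    return mean, var

def avg_loglik(x, model):
    """Average per-frame log-likelihood of frames x under a diagonal Gaussian."""
    mean, var = model
    ll = -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var).sum(axis=1)
    return ll.mean()

def cross_likelihood_ratio(x_i, x_j, model_i, model_j, background):
    """Assumed CLR form: each segment scored under the other segment's model,
    normalized by a shared background model. Larger means more similar."""
    return (avg_loglik(x_i, model_j) - avg_loglik(x_i, background)
            + avg_loglik(x_j, model_i) - avg_loglik(x_j, background))

# Synthetic "segments": two from one hypothetical speaker, one from another.
rng = np.random.default_rng(0)
seg_a1 = rng.normal(0.0, 1.0, size=(500, 4))
seg_a2 = rng.normal(0.0, 1.0, size=(500, 4))
seg_b = rng.normal(3.0, 1.0, size=(500, 4))

# Background model trained on all data pooled together.
background = fit_diag_gaussian(np.vstack([seg_a1, seg_a2, seg_b]))
m_a1, m_a2, m_b = (fit_diag_gaussian(s) for s in (seg_a1, seg_a2, seg_b))

clr_same = cross_likelihood_ratio(seg_a1, seg_a2, m_a1, m_a2, background)
clr_diff = cross_likelihood_ratio(seg_a1, seg_b, m_a1, m_b, background)
print(f"same speaker: {clr_same:.2f}, different speakers: {clr_diff:.2f}")
```

In a clustering loop, the pair of segments with the highest score would typically be merged first, and merging would stop once the best score falls below a chosen threshold.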

Original language: English
Pages (from-to): 861-864
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2008 Dec 1
Event: INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association - Brisbane, QLD, Australia
Duration: 2008 Sep 22 → 2008 Sep 26

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Sensory Systems
