Chinese-English phone set construction for code-switching ASR using acoustic and DNN-extracted articulatory features

Chung Hsien Wu, Han Ping Shen, Yan Ting Yang

Research output: Article, peer-reviewed

13 citations (Scopus)

Abstract

This study proposes a data-driven approach to phone set construction for code-switching automatic speech recognition (ASR). Acoustic and context-dependent cross-lingual articulatory features (AFs) are incorporated into the estimation of the distance between triphone units for constructing a Chinese-English phone set. The acoustic features of each triphone in the training corpus are extracted for constructing an acoustic triphone HMM. Furthermore, the articulatory features of the "last/first" state of the corresponding preceding/succeeding triphone in the training corpus are used to construct an AF-based GMM. The AFs, extracted using a deep neural network (DNN), are used for code-switching articulation modeling to alleviate the data sparseness problem due to the diverse context-dependent phone combinations in intra-sentential code-switching. The triphones are then clustered to obtain a Chinese-English phone set based on the acoustic HMMs and the AF-based GMMs using a hierarchical triphone clustering algorithm. Experimental results on code-switching ASR show that the proposed method for phone set construction outperformed other traditional methods.
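The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how the acoustic-HMM and AF-GMM distances are combined or which linkage rule is used, so the weighted sum (`weight`), the average linkage, the stopping criterion (`target_size`), and all function and unit names below are assumptions made for illustration. The pairwise distances are taken as already computed.

```python
from itertools import combinations


def combined_distance(d_acoustic, d_af, weight=0.5):
    """Interpolate an acoustic distance and an AF distance.

    The linear weighting is an assumption; the abstract does not
    specify how the two distances are combined.
    """
    return weight * d_acoustic + (1.0 - weight) * d_af


def cluster_units(units, acoustic, af, target_size, weight=0.5):
    """Greedy agglomerative clustering down to target_size clusters.

    units:        list of unit labels (hypothetical names)
    acoustic, af: dicts mapping frozenset({a, b}) -> precomputed distance
    Average linkage is an illustrative choice, not the paper's.
    """
    clusters = [[u] for u in units]

    def unit_dist(a, b):
        key = frozenset((a, b))
        return combined_distance(acoustic[key], af[key], weight)

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(unit_dist(a, b) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > target_size:
        # Find the closest pair of clusters and merge it.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

Under these assumptions, a Chinese unit and an English unit whose combined distance is small end up in the same cluster and so share one entry in the resulting phone set.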

Original language: English
Pages (from-to): 858-862
Number of pages: 5
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 22
Issue number: 4
Publication status: Published - Apr 2014

All Science Journal Classification (ASJC) codes

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering
