Model generation of accented speech using model transformation and verification for bilingual speech recognition

Han Ping Shen, Chung Hsien Wu, Pei Shan Tsai

Research output: Contribution to journalArticlepeer-review

4 Citations (Scopus)

Abstract

Nowadays, bilingual or multilingual speech recognition is confronted with the accent-related problem caused by non-native speech in a variety of real-world applications. Accent modeling of non-native speech is definitely challenging, because the acoustic properties in highly-accented speech pronounced by non-native speakers are quite divergent. The aim of this study is to generate highly Mandarin-accented English models for speakers whose mother tongue is Mandarin. First, a two-stage, state-based verification method is proposed to extract the state-level, highly-accented speech segments automatically. Acoustic features and articulatory features are successively used for robust verification of the extracted speech segments. Second, Gaussian components of the highly-accented speech models are generated from the corresponding Gaussian components of the native speech models using a linear transformation function. A decision tree is constructed to categorize the transformation functions and used for transformation function retrieval to deal with the data sparseness problem. Third, a discrimination function is further applied to verify the generated accented acoustic models. Finally, the successfully verified accented English models are integrated into the native bilingual phone model set for Mandarin-English bilingual speech recognition. Experimental results show that the proposed approach can effectively alleviate recognition performance degradation due to accents and can obtain absolute improvements of 4.1%, 1.8%, and 2.7% in word accuracy for bilingual speech recognition compared to that using traditional ASR approaches, MAP-adapted, and MLLR-adapted ASR methods, respectively.

Original languageEnglish
Article number6
JournalACM Transactions on Asian and Low-Resource Language Information Processing
Volume14
Issue number2
DOIs
Publication statusPublished - 2015 Mar

All Science Journal Classification (ASJC) codes

  • General Computer Science

Fingerprint

Dive into the research topics of 'Model generation of accented speech using model transformation and verification for bilingual speech recognition'. Together they form a unique fingerprint.

Cite this