Automatic pronunciation clustering using a World English archive and pronunciation structure analysis

H. P. Shen, N. Minematsu, T. Makino, S. H. Weinberger, T. Pongkittiphan, Chung-Hsien Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

English is the only language available for global communication. Due to the influence of speakers' mother tongues, however, speakers from different regions inevitably pronounce English with different accents. The ultimate goal of our project is to create a global pronunciation map of World Englishes on an individual basis, which speakers can use to locate English pronunciations similar to their own. A learner can also see how their pronunciation compares with other varieties. Mathematically, creating the map requires a matrix of pronunciation distances among all the speakers considered. This paper investigates invariant pronunciation structure analysis and Support Vector Regression (SVR) to predict the inter-speaker pronunciation distances. In the experiments, the Speech Accent Archive (SAA), which contains speech data of accented English from around the world, provides the training and testing samples. IPA narrow transcriptions in the archive are used to prepare reference pronunciation distances, which are then predicted using structural analysis and SVR, without the IPA transcriptions. The correlation between the reference distances and the predicted distances is then calculated. The experimental results are very promising: the proposed method far outperforms a baseline system built on an HMM-based phoneme recognizer.
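
The evaluation step described in the abstract, correlating reference distances with predicted ones, can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`upper_triangle`, `pearson`), the choice of Pearson correlation, and the four-speaker toy matrices are all assumptions made for illustration, and the numbers are invented rather than taken from the paper.

```python
import math
from itertools import combinations


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square distance matrix.

    A distance matrix is symmetric with a zero diagonal, so each speaker
    pair is counted once.
    """
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data for four speakers (not from the paper).
# reference[i][j]: distance derived from transcriptions of speakers i and j.
reference = [
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 0.0, 1.5, 2.5],
    [2.0, 1.5, 0.0, 1.0],
    [3.0, 2.5, 1.0, 0.0],
]
# predicted[i][j]: distance estimated by some model for the same pair.
predicted = [
    [0.0, 1.2, 1.9, 2.7],
    [1.2, 0.0, 1.6, 2.4],
    [1.9, 1.6, 0.0, 1.1],
    [2.7, 2.4, 1.1, 0.0],
]

r = pearson(upper_triangle(reference), upper_triangle(predicted))
print(f"correlation between reference and predicted distances: {r:.3f}")
```

Only the upper triangle is compared because including both (i, j) and (j, i), or the zero diagonal, would double-count pairs and inflate the agreement.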

Original language: English
Title of host publication: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings
Pages: 222-227
Number of pages: 6
DOIs: https://doi.org/10.1109/ASRU.2013.6707733
Publication status: Published - 2013 Dec 1
Event: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Olomouc, Czech Republic
Duration: 2013 Dec 8 → 2013 Dec 13

Publication series

Name: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings

Other

Other: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013
Country: Czech Republic
City: Olomouc
Period: 13-12-08 → 13-12-13

All Science Journal Classification (ASJC) codes

  • Speech and Hearing

  • Cite this

    Shen, H. P., Minematsu, N., Makino, T., Weinberger, S. H., Pongkittiphan, T., & Wu, C-H. (2013). Automatic pronunciation clustering using a World English archive and pronunciation structure analysis. In 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings (pp. 222-227). [6707733] (2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings). https://doi.org/10.1109/ASRU.2013.6707733