TY - GEN
T1 - Acoustic and Textual Data Augmentation for Code-Switching Speech Recognition in Under-Resourced Language
AU - Hsieh, I. Ting
AU - Wu, Chung Hsien
AU - Wang, Chun Huang
N1 - Publisher Copyright: © 2020 APSIPA.
PY - 2020/12/7
Y1 - 2020/12/7
N2 - Under-resourced and code-switching speech recognition have recently received research interest, resulting in several robust acoustic and language modeling approaches. As Taiwanese and Mandarin are both widely used in Taiwan, this paper aims to address the under-resourced and code-switching issues. First, phone sharing between Taiwanese and Mandarin is employed for acoustic data augmentation to construct the acoustic models of a Taiwanese speech recognizer. To address the lack of a Taiwanese text corpus, this paper translates a Mandarin corpus into a Taiwanese corpus based on word-to-word translation. Moreover, additional translation rules for code-switching text are manually designed. The augmented text corpus is then used for training the code-switching language models. In the experimental results, the word error rate for code-switching speech recognition was 26.02%, which was better than that obtained when training on the pure Taiwanese corpus.
AB - Under-resourced and code-switching speech recognition have recently received research interest, resulting in several robust acoustic and language modeling approaches. As Taiwanese and Mandarin are both widely used in Taiwan, this paper aims to address the under-resourced and code-switching issues. First, phone sharing between Taiwanese and Mandarin is employed for acoustic data augmentation to construct the acoustic models of a Taiwanese speech recognizer. To address the lack of a Taiwanese text corpus, this paper translates a Mandarin corpus into a Taiwanese corpus based on word-to-word translation. Moreover, additional translation rules for code-switching text are manually designed. The augmented text corpus is then used for training the code-switching language models. In the experimental results, the word error rate for code-switching speech recognition was 26.02%, which was better than that obtained when training on the pure Taiwanese corpus.
UR - http://www.scopus.com/inward/record.url?scp=85100947717&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100947717&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85100947717
T3 - 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings
SP - 302
EP - 307
BT - 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020
Y2 - 7 December 2020 through 10 December 2020
ER -