Acoustic and Textual Data Augmentation for Code-Switching Speech Recognition in Under-Resourced Language

I. Ting Hsieh, Chung Hsien Wu, Chun Huang Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Under-resourced and code-switching speech recognition have recently attracted research interest, resulting in several robust acoustic and language modeling approaches. Because Taiwanese and Mandarin are both widely used in Taiwan, this paper aims to address the under-resourced and code-switching issues. First, phone sharing between Taiwanese and Mandarin is employed for acoustic data augmentation to construct the acoustic models of the Taiwanese speech recognizer. To compensate for the lack of a Taiwanese text corpus, this paper translates a Mandarin corpus into a Taiwanese corpus by word-to-word translation. Moreover, additional translation rules for code-switching text are manually designed. The augmented text corpus is then used to train the code-switching language models. In the experiments, the word error rate for code-switching speech recognition was 26.02%, which was better than that obtained by training on the pure Taiwanese corpus.
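The word error rate quoted in the abstract is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' scoring script. It assumes whitespace-tokenized transcripts, and the function name `word_error_rate` is hypothetical. How the paper segments Taiwanese and Mandarin text into words is not stated in this record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()  # assumption: words are separated by whitespace
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) over four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a figure of 26.02% corresponds to roughly 26 word-level errors per 100 reference words.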

Original language: English
Title of host publication: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 302-307
Number of pages: 6
ISBN (Electronic): 9789881476883
Publication status: Published - 2020 Dec 7
Event: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Virtual, Auckland, New Zealand
Duration: 2020 Dec 7 to 2020 Dec 10

Publication series

Name: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings

Conference

Conference: 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020
Country: New Zealand
City: Virtual, Auckland
Period: 20-12-07 to 20-12-10

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Signal Processing
  • Decision Sciences (miscellaneous)
  • Instrumentation
