Selection of Supplementary Acoustic Data for Meta-Learning in Under-Resourced Speech Recognition

I. Ting Hsieh, Chung Hsien Wu, Zhe Hong Zhao

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

2 Citations (Scopus)

Abstract

Automatic speech recognition (ASR) for under-resourced languages has remained a challenging task over the past decade. In this paper, Taiwanese is treated as the under-resourced language, and speech data from high-resourced languages that share the most phonemes with Taiwanese are selected as supplementary resources for meta-training the acoustic models for Taiwanese ASR. Based on the designed selection criteria, Mandarin, English, Japanese, Cantonese, and Thai are chosen as the supplementary languages. Model-agnostic meta-learning (MAML) is then used as the meta-training strategy. In the evaluation, with 4000 utterances selected from each supplementary language, the system achieved a WER of 20.89% and an SER of 8.86% for Taiwanese ASR, improving on the baseline model trained with only the Taiwanese corpus and a traditional method (26.18% and 13.99%, respectively).
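
The abstract says supplementary languages are chosen by how many phonemes they share with the target language, but it does not give the selection criteria themselves. The sketch below is only one plausible reading of that idea: it ranks candidate languages by the size of the phoneme-set intersection with the target. The function name `rank_by_phoneme_overlap`, the language labels `lang_a` to `lang_c`, and the toy phoneme inventories are all hypothetical and are not taken from the paper.

```python
def rank_by_phoneme_overlap(target, candidates):
    """Rank candidate languages by the number of phonemes shared with the target.

    target:     set of phoneme symbols for the under-resourced language
    candidates: dict mapping a language name to its set of phoneme symbols
    Returns a list of (language, shared_count), highest overlap first;
    ties are broken alphabetically by language name.
    """
    scores = {name: len(target & inventory) for name, inventory in candidates.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy, made-up inventories used purely for illustration (not real phoneme sets).
target_inventory = {"a", "i", "u", "p", "t", "k", "m", "n"}
candidate_inventories = {
    "lang_a": {"a", "i", "u", "p", "t", "k"},
    "lang_b": {"a", "i", "s", "z"},
    "lang_c": {"a", "i", "u", "m", "n", "k", "t", "f"},
}

ranking = rank_by_phoneme_overlap(target_inventory, candidate_inventories)
print(ranking)
```

With these toy sets, `lang_c` shares seven phonemes with the target, `lang_a` six, and `lang_b` two, so a selection step would take languages from the top of the ranking. A raw intersection count favors languages with large inventories; a normalized measure such as Jaccard similarity would be an alternative, and the abstract does not say which the authors used.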

Original language: English
Title of host publication: Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 409-414
Number of pages: 6
ISBN (Electronic): 9786165904773
Publication status: Published - 2022
Event: 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022 - Chiang Mai, Thailand
Duration: 2022 Nov 7 to 2022 Nov 10

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
