Speech Enhancement Based on Masking Approach Considering Speech Quality and Acoustic Confidence for Noisy Speech Recognition

Shih Chuan Chu, Chung Hsien Wu, Yun Wen Lin

Research output: Conference contribution

3 Citations (Scopus)

Abstract

In recent years, voice-operated applications have been widely accepted by the public, but background noise remains a challenging issue for automatic speech recognition (ASR). This paper proposes a mask-based speech enhancement front end that takes into account both a speech quality score and acoustic confidence in order to reduce the word error rate (WER) of noisy speech recognition. First, features of speakers, phones, and noises are extracted and incorporated into the loss function to improve speech quality. In addition, the phone confidence from a Kaldi-based ASR system is incorporated into the loss function used to train the mask generation model, so that both speech quality and noisy speech recognition performance improve. Compared with the baseline model, the proposed model improved STOI by 2.14% and PESQ by 7.22%, and reduced WER by 12.13%.
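As a rough illustration of the idea described in the abstract, the following sketch applies a time-frequency mask to a noisy magnitude spectrogram and scores the result with a two-term objective. It is not the authors' implementation: the function names `apply_mask` and `combined_loss`, the weight `alpha`, the use of mean squared error as a stand-in for a speech quality score, and the additive toy data are all assumptions made for this example.

```python
import numpy as np


def apply_mask(noisy_mag, mask):
    """Scale each time-frequency bin of a magnitude spectrogram by a mask in [0, 1]."""
    return np.clip(mask, 0.0, 1.0) * noisy_mag


def combined_loss(enhanced_mag, clean_mag, phone_confidence, alpha=0.5):
    """Hypothetical objective: reconstruction error plus a low-confidence penalty.

    quality_term is a plain mean squared error, used here only as a stand-in
    for a speech quality score. confidence_term grows as the mean per-frame
    phone confidence (assumed to lie in [0, 1]) drops. alpha is an assumed weight.
    """
    quality_term = np.mean((enhanced_mag - clean_mag) ** 2)
    confidence_term = 1.0 - np.mean(phone_confidence)
    return quality_term + alpha * confidence_term


# Toy data: 4 frequency bins x 5 frames, with additive magnitudes for simplicity.
rng = np.random.default_rng(0)
clean = rng.random((4, 5)) + 0.1
noisy = clean + rng.random((4, 5))

ideal_mask = clean / noisy  # ratio mask that would recover `clean` exactly
enhanced = apply_mask(noisy, ideal_mask)

print(combined_loss(enhanced, clean, np.ones(5)))    # perfect mask, fully confident ASR
print(combined_loss(noisy, clean, np.full(5, 0.5)))  # no enhancement, uncertain ASR
```

In a trainable system the mask would come from a neural network and the confidence values from the recognizer, so that lowering the loss rewards both cleaner spectra and more confident phone decisions. Here both are fixed arrays, which keeps the example self-contained.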

Original language: English
Title of host publication: 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 536-540
Number of pages: 5
ISBN (Electronic): 9789881476890
Publication status: Published - 2021
Event: 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021 - Tokyo, Japan
Duration: 14 Dec 2021 to 17 Dec 2021

Publication series

Name: 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021 - Proceedings

Conference

Conference: 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021
Country/Territory: Japan
City: Tokyo
Period: 21-12-14 to 21-12-17

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Instrumentation
