Perceptual speech modeling for noisy speech recognition

Research output: Contribution to journal (Conference article)

Abstract

This paper proposes a perceptual modeling approach with two-stage recognition to address recognition degradation in noisy environments. The auditory masking effect is used for speech enhancement and acoustic modeling to overcome the model inconsistencies between training speech and noisy input. In the two-stage recognition, a maximum a posteriori (MAP) based adaptation algorithm incrementally adapts the noise model. To evaluate the proposed approach, a Mandarin keyword spotting system was constructed. The experimental results show that the proposed method achieves a better recognition rate than the audible noise suppression (ANS) and parallel model combination (PMC) methods in both 70 km/hr (10.3 dB) and 90 km/hr (6.4 dB) car environments.

Original language: English
Pages (from-to): I/385-I/388
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
Publication status: Published - 2002 Jul 11
Event: 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing - Orlando, FL, United States
Duration: 2002 May 13 to 2002 May 17

Fingerprint

Speech recognition
Speech intelligibility
Speech enhancement
Acoustic noise
Railroad cars
Acoustics
Degradation

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

@article{bd0b1df883384634a90cb9e3b51082ef,
title = "Perceptual speech modeling for noisy speech recognition",
abstract = "This paper proposes a perceptual modeling approach with two-stage recognition to address recognition degradation in noisy environments. The auditory masking effect is used for speech enhancement and acoustic modeling to overcome the model inconsistencies between training speech and noisy input. In the two-stage recognition, a maximum a posteriori (MAP) based adaptation algorithm incrementally adapts the noise model. To evaluate the proposed approach, a Mandarin keyword spotting system was constructed. The experimental results show that the proposed method achieves a better recognition rate than the audible noise suppression (ANS) and parallel model combination (PMC) methods in both 70 km/hr (10.3 dB) and 90 km/hr (6.4 dB) car environments.",
author = "Wu, {Chung Hsien} and Chiu, {Yu Hsien} and Lim, Huigan",
year = "2002",
month = "7",
day = "11",
language = "English",
volume = "1",
pages = "I/385--I/388",
journal = "Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing",
issn = "0736-7791",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

Perceptual speech modeling for noisy speech recognition. / Wu, Chung Hsien; Chiu, Yu Hsien; Lim, Huigan.

In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Vol. 1, 11.07.2002, p. I/385-I/388.

TY - JOUR

T1 - Perceptual speech modeling for noisy speech recognition

AU - Wu, Chung Hsien

AU - Chiu, Yu Hsien

AU - Lim, Huigan

PY - 2002/7/11

Y1 - 2002/7/11

AB - This paper proposes a perceptual modeling approach with two-stage recognition to address recognition degradation in noisy environments. The auditory masking effect is used for speech enhancement and acoustic modeling to overcome the model inconsistencies between training speech and noisy input. In the two-stage recognition, a maximum a posteriori (MAP) based adaptation algorithm incrementally adapts the noise model. To evaluate the proposed approach, a Mandarin keyword spotting system was constructed. The experimental results show that the proposed method achieves a better recognition rate than the audible noise suppression (ANS) and parallel model combination (PMC) methods in both 70 km/hr (10.3 dB) and 90 km/hr (6.4 dB) car environments.

UR - http://www.scopus.com/inward/record.url?scp=17444442112&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=17444442112&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:17444442112

VL - 1

SP - I/385

EP - I/388

JO - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing

JF - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing

SN - 0736-7791

ER -