Stochastic vector mapping-based feature enhancement using prior model and environment adaptation for noisy speech recognition

Chia Hsin Hsieh, Chung-Hsien Wu, Jun Yu Lin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents an approach to feature enhancement for noisy speech recognition. Three prior models are introduced to characterize clean speech, noise, and noisy speech, respectively, using sequential noise estimation based on noise-normalized stochastic vector mapping. Environment adaptation is also adopted to reduce the mismatch between training and test data. On the AURORA2 database, the approach achieved a 0.77% improvement in digit accuracy for multi-condition training and a 0.29% improvement for clean-speech training, without stereo training data, compared with the SPLICE-based approach with recursive noise estimation. On the MAT-BN Mandarin broadcast news database, it achieved a 2.6% improvement in syllable accuracy for anchor speech and a 4.2% improvement for field-report speech compared with the MCE-based approach.

Original language: English
Title of host publication: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Publisher: International Speech Communication Association
Pages: 29-32
Number of pages: 4
Volume: 1
ISBN (Print): 9781604234497
Publication status: Published - 2006
Event: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP - Pittsburgh, PA, United States
Duration: 2006 Sep 17 → 2006 Sep 21

Other

Other: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Country: United States
City: Pittsburgh, PA
Period: 06-09-17 → 06-09-21

Fingerprint

Speech recognition
Anchors

All Science Journal Classification (ASJC) codes

  • Computer Science(all)

Cite this

Hsieh, C. H., Wu, C-H., & Lin, J. Y. (2006). Stochastic vector mapping-based feature enhancement using prior model and environment adaptation for noisy speech recognition. In INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP (Vol. 1, pp. 29-32). International Speech Communication Association.
@inproceedings{1f3e4da32e88426c95338d67250601a1,
title = "Stochastic vector mapping-based feature enhancement using prior model and environment adaptation for noisy speech recognition",
author = "Hsieh, {Chia Hsin} and Wu, Chung-Hsien and Lin, {Jun Yu}",
year = "2006",
language = "English",
isbn = "9781604234497",
volume = "1",
pages = "29--32",
booktitle = "INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP",
publisher = "International Speech Communication Association",
}

TY - GEN

T1 - Stochastic vector mapping-based feature enhancement using prior model and environment adaptation for noisy speech recognition

AU - Hsieh, Chia Hsin

AU - Wu, Chung-Hsien

AU - Lin, Jun Yu

PY - 2006

Y1 - 2006

UR - http://www.scopus.com/inward/record.url?scp=44949157217&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=44949157217&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:44949157217

SN - 9781604234497

VL - 1

SP - 29

EP - 32

BT - INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP

PB - International Speech Communication Association

ER -
