Alternative hypothesis generation using a weighted kernel feature matrix for ASR substitution error correction

Chao Hong Liu, Chung-Hsien Wu, David Sarwono

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Although automatic speech recognition (ASR) has been used successfully in several applications, it remains non-robust and imprecise, especially in harsh environments where the input speech is of low quality. Robust error correction of ASR outputs is therefore important, in addition to improving recognition performance. Recent approaches to error correction use linguistic or domain information to generate alternative hypotheses for the ASR outputs and then select the most likely alternative. In this study, the distances between ASR outputs and potentially correct alternatives are estimated from a weighted, context-dependent, syllable-cluster-based kernel feature matrix, followed by multidimensional scaling (MDS)-based distance rescaling. These distances are then used to construct an alternative syllable lattice, and dynamic programming is used to obtain the most likely correct output with respect to the original ASR results. Experiments on the MATBN Mandarin Chinese broadcast news corpus show that the proposed method achieved an improvement of about 1.95% in word error rate over the correction pair approach.
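
The abstract outlines a pipeline of distance rescaling, lattice construction, and a dynamic-programming search. The sketch below illustrates that general shape only and is not the authors' implementation. It assumes classical (eigendecomposition-based) MDS, since the abstract does not say which MDS variant is used, and it starts from a ready-made dissimilarity matrix instead of the weighted kernel feature matrix, whose construction the abstract does not specify. The syllables, the dissimilarity values, the bigram-style transition cost, and the function names (classical_mds, rescaled_distances, best_path) are all hypothetical.

```python
import numpy as np


def classical_mds(dist, dim=2):
    """Embed items so Euclidean distances approximate the entries of `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)            # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dim]              # indices of the largest ones
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))  # guard against tiny negatives
    return eigvecs[:, top] * scale


def rescaled_distances(dist, dim=2):
    """Pairwise Euclidean distances between the MDS coordinates."""
    coords = classical_mds(dist, dim)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def best_path(lattice, transition_cost):
    """Find the cheapest path through a lattice given as a list of slots.

    Each slot is a list of (syllable, distance_to_asr_output) pairs.
    Returns (total_cost, syllable_sequence).
    """
    best = {syl: (cost, [syl]) for syl, cost in lattice[0]}
    for slot in lattice[1:]:
        current = {}
        for syl, cost in slot:
            # Pick the predecessor that minimises accumulated plus transition cost.
            prev = min(best, key=lambda p: best[p][0] + transition_cost(p, syl))
            total = best[prev][0] + transition_cost(prev, syl) + cost
            current[syl] = (total, best[prev][1] + [syl])
        best = current
    return min(best.values(), key=lambda item: item[0])


# Toy data invented for illustration; not taken from the paper.
syllables = ["pei", "bei", "jing", "qing"]
raw = np.array([
    [0.0, 0.4, 2.0, 2.1],
    [0.4, 0.0, 2.1, 2.0],
    [2.0, 2.1, 0.0, 0.5],
    [2.1, 2.0, 0.5, 0.0],
])
dist = rescaled_distances(raw, dim=2)
index = {syl: i for i, syl in enumerate(syllables)}

# Assumed ASR output "pei jing"; each slot lists alternatives with their
# rescaled distance to the recognised syllable in that position.
asr_output = ["pei", "jing"]
candidates = [["pei", "bei"], ["jing", "qing"]]
lattice = [
    [(alt, dist[index[asr], index[alt]]) for alt in alts]
    for asr, alts in zip(asr_output, candidates)
]


def toy_transition(prev, cur):
    """Hypothetical language-model cost: one syllable pair is favoured."""
    return 0.1 if (prev, cur) == ("bei", "jing") else 1.0


total_cost, corrected = best_path(lattice, toy_transition)
print(corrected, round(float(total_cost), 3))
```

In this sketch the node cost stands in for the acoustic or confusion distance to the recognised syllable and the transition cost stands in for linguistic evidence, so a correction is chosen only when the linguistic gain outweighs the distance from the original ASR result.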

Original language: English
Title of host publication: 2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012
Pages: 1-5
Number of pages: 5
DOIs: https://doi.org/10.1109/ISCSLP.2012.6423475
Publication status: Published - 2012 Dec 1
Event: 2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012 - Hong Kong, China
Duration: 2012 Dec 5 to 2012 Dec 8

Publication series

Name: 2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012

Other

Other: 2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012
Country: China
City: Hong Kong
Period: 12-12-05 to 12-12-08

Fingerprint

substitution
multidimensional scaling
broadcast
Error Correction
Kernel
Hypothesis Generation
Automatic Speech Recognition
news
programming
linguistics
experiment
performance

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language

Cite this

Liu, C. H., Wu, C-H., & Sarwono, D. (2012). Alternative hypothesis generation using a weighted kernel feature matrix for ASR substitution error correction. In 2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012 (pp. 1-5). [6423475] (2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012). https://doi.org/10.1109/ISCSLP.2012.6423475
@inproceedings{ad799641ccde4596be8a688f78aa3bf0,
title = "Alternative hypothesis generation using a weighted kernel feature matrix for ASR substitution error correction",
author = "Liu, {Chao Hong} and Chung-Hsien Wu and David Sarwono",
year = "2012",
month = "12",
day = "1",
doi = "10.1109/ISCSLP.2012.6423475",
language = "English",
isbn = "9781467325059",
series = "2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012",
pages = "1--5",
booktitle = "2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012",

}

TY - GEN
T1 - Alternative hypothesis generation using a weighted kernel feature matrix for ASR substitution error correction
AU - Liu, Chao Hong
AU - Wu, Chung-Hsien
AU - Sarwono, David
PY - 2012/12/1
Y1 - 2012/12/1
UR - http://www.scopus.com/inward/record.url?scp=84874471205&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84874471205&partnerID=8YFLogxK
U2 - 10.1109/ISCSLP.2012.6423475
DO - 10.1109/ISCSLP.2012.6423475
M3 - Conference contribution
SN - 9781467325059
T3 - 2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012
SP - 1
EP - 5
BT - 2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012
ER -
