Multi-keyword spotting of telephone speech using a fuzzy search algorithm and keyword-driven two-level CBSM

Chung Hsien Wu, Yeou Jiunn Chen

Research output: Article, peer-reviewed

15 citations (Scopus)

Abstract

In telephone speech recognition, the acoustic mismatch between training and testing environments often causes a severe degradation in the recognition performance. This paper presents a keyword-driven two-level codebook-based stochastic matching (CBSM) algorithm to eliminate the acoustic mismatch. Additionally, in Mandarin speech, it is difficult to correctly recognize the unvoiced part in a syllable. In order to reduce the recognition error of unvoiced segments, a fuzzy search algorithm is proposed to extract keyword candidates from a syllable lattice. Finally, a keyword relation and a weighting function for keyword combinations are presented for multi-keyword spotting. In the multi-keyword spotting of Mandarin speech, 94 right context-dependent and 38 context-independent subsyllables are used as the basic recognition units. A corresponding anti-subsyllable model for each subsyllable is trained and used for verification. In this system, 2583 faculty names and 39 department names are selected as the primary keywords and the secondary keywords, respectively. Using a testing set of 3088 conversational speech utterances from 33 speakers (20 male, 13 female), these techniques reduced the recognition error rate from 29.6% to 20.6% for multi-keywords embedded in non-keyword speech.
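The improvement reported at the end of the abstract can be restated as a relative error reduction. The short calculation below is only a reading aid built from the two error rates quoted above; the function and variable names are illustrative assumptions and do not come from the paper.

```python
def relative_error_reduction(baseline: float, improved: float) -> float:
    """Return the fraction of the baseline error rate removed by the improved system."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - improved) / baseline


# Error rates in percent, as quoted in the abstract.
baseline_error = 29.6
improved_error = 20.6

# Absolute drop in percentage points, and the same drop relative to the baseline.
absolute_drop = baseline_error - improved_error
relative_drop = relative_error_reduction(baseline_error, improved_error)

print(f"absolute reduction: {absolute_drop:.1f} percentage points")
print(f"relative reduction: {relative_drop:.1%}")
```

Under this reading, the drop from 29.6% to 20.6% is 9.0 percentage points, or roughly 30% of the baseline error.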

Original language: English
Pages (from-to): 197-212
Number of pages: 16
Journal: Speech Communication
Volume: 33
Issue number: 3
Publication status: Published - Feb 2001

All Science Journal Classification (ASJC) codes

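  • Software
  • Modeling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications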
