Integration of phonetic and prosodic information for robust utterance verification

C. H. Wu, Y. J. Chen, G. L. Yan

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)


Mandarin speech is known for its tonal characteristic, and prosodic information plays an important role in Mandarin speech recognition. Driven by this property, phonetic and prosodic information are integrated and used for keyword spotting in Mandarin telephone speech. A two-stage strategy, with recognition followed by verification, is adopted. For keyword recognition, 132 subsyllable models, two general acoustic filler models and one background/silence model are separately trained and used as the basic recognition units. For utterance verification, 12 anti-subsyllable models, 175 context-dependent prosodic models and five anti-prosodic models are constructed. A keyword verification function combining phonetic-phase and prosodic-phase verification is investigated. On a test set of 3088 conversational speech utterances from 33 speakers (20 males and 13 females) with a vocabulary of 2583 faculty names, the proposed verification method yields an 18.3% false alarm rate at 8.5% false rejection. Furthermore, the method correctly rejects 90.9% of non-keywords. Comparison with a baseline system without prosodic-phase verification shows that prosodic information can benefit verification performance.
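The abstract does not give the form of the keyword verification function, so the following is only a minimal sketch of the general idea, assuming a weighted sum of a phonetic-phase score and a prosodic-phase score compared against a threshold. The names `combined_score`, `error_rates` and `weight`, and the convention that higher scores favour accepting a keyword, are hypothetical and are not taken from the paper. The second function shows how a false rejection rate and a false alarm rate of the kind reported above are computed from scored trials.

```python
from typing import Sequence, Tuple


def combined_score(phonetic_score: float, prosodic_score: float,
                   weight: float = 0.5) -> float:
    """Hypothetical combination: weighted sum of the two verification scores.

    The paper's actual verification function is not given in the abstract;
    a linear combination is assumed here purely for illustration.
    """
    return weight * phonetic_score + (1.0 - weight) * prosodic_score


def error_rates(keyword_scores: Sequence[float],
                nonkeyword_scores: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (false rejection rate, false alarm rate) at a given threshold.

    A trial is accepted when its score is >= threshold.
    False rejection: a true keyword scored below the threshold.
    False alarm: a non-keyword scored at or above the threshold.
    """
    false_rejections = sum(1 for s in keyword_scores if s < threshold)
    false_alarms = sum(1 for s in nonkeyword_scores if s >= threshold)
    return (false_rejections / len(keyword_scores),
            false_alarms / len(nonkeyword_scores))
```

Raising the threshold trades false alarms for false rejections, which is why the abstract quotes the false alarm rate at a fixed false rejection rate.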

Original language: English
Pages (from-to): 55-61
Number of pages: 7
Journal: IEE Proceedings: Vision, Image and Signal Processing
Issue number: 1
Publication status: Published - 2000

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Electrical and Electronic Engineering

