Flexible speech act identification of spontaneous speech with disfluency

Chung Hsien Wu, Gwo Lang Yan

Research output: Contribution to conference › Paper

Abstract

This paper describes an approach for flexible speech act identification of spontaneous speech with disfluency. In this approach, semantic information, syntactic structure, and fragment features of an input utterance are statistically encapsulated into a proposed speech act hidden Markov model (SAHMM) to characterize the speech act. To deal with the disfluency problem in a sparse training corpus, an interpolation mechanism is exploited to re-estimate the state transition probabilities in the SAHMM. Finally, the dialog system accepts the speech act with the best score and returns the corresponding response. Experiments were conducted to evaluate the proposed approach using a spoken dialogue system for an air travel information service. A test database of 480 dialogues (3,038 sentences) collected from 25 speakers was used for evaluation. The experimental results show that the proposed approach achieves a 90.3% speech act correct rate (SACR) and an 85.5% fragment correct rate (FCR) for fluent speech, and for disfluent speech gains significant improvements of 5.7% in SACR and 6.9% in FCR over a baseline system that does not consider filled pauses.
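The abstract's interpolation mechanism for sparse transition counts can be illustrated with a generic linear-interpolation sketch: the maximum-likelihood transition estimate is blended with a unigram prior, falling back to the prior entirely when a state has no observed transitions. The speech act labels, the function name, and the weighting scheme below are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch (not the paper's exact re-estimation formula):
# smooth sparse speech-act transition counts by linearly interpolating
# the ML bigram estimate with the unigram prior.

def interpolated_transition(counts_bigram, counts_unigram, lam=0.8):
    """Return smoothed P(next | prev) over all speech act pairs.

    counts_bigram  : dict {(prev, next): count} of observed transitions
    counts_unigram : dict {act: count} of observed speech acts
    lam            : interpolation weight on the ML bigram estimate
    """
    total_uni = sum(counts_unigram.values())
    probs = {}
    for prev in counts_unigram:
        total_bi = sum(counts_bigram.get((prev, nxt), 0)
                       for nxt in counts_unigram)
        # With no observed transitions from `prev`, back off to the prior.
        w = lam if total_bi else 0.0
        for nxt in counts_unigram:
            ml = (counts_bigram.get((prev, nxt), 0) / total_bi) if total_bi else 0.0
            prior = counts_unigram[nxt] / total_uni
            probs[(prev, nxt)] = w * ml + (1 - w) * prior
    return probs
```

For each previous act the smoothed distribution still sums to one, so it can be plugged directly into HMM decoding in place of the raw ML transition matrix.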

Original language: English
Pages: 653-656
Number of pages: 4
Publication status: Published - 2003 Jan 1
Event: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland
Duration: 2003 Sep 1 - 2003 Sep 4



All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Software
  • Linguistics and Language
  • Communication

Cite this

Wu, C. H., & Yan, G. L. (2003). Flexible speech act identification of spontaneous speech with disfluency. 653-656. Paper presented at 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, Geneva, Switzerland.