TY - JOUR
T1 - Edit disfluency detection and correction using a cleanup language model and an alignment model
AU - Yeh, Jui Feng
AU - Wu, Chung Hsien
N1 - Funding Information:
Manuscript received October 1, 2005; revised May 19, 2006. This work was supported by the National Science Council, Taiwan, R.O.C., under Contract NSC 94-2213-E-006-018. The associate editor coordinating the review of this paper and approving it for publication was Dr. Geoffrey Zweig.
PY - 2006/9
Y1 - 2006/9
N2 - This investigation presents a novel approach to detecting and correcting edit disfluency in spontaneous speech. Hypothesis testing using acoustic features is first adopted to detect potential interruption points (IPs) in the input speech. The word order of the utterance is then cleaned up based on the potential IPs using a class-based cleanup language model, and the deletable region and the correction are aligned using an alignment model. Finally, log-linear weighting is applied to optimize the performance. Using the acoustic features, the IP detection rate is significantly improved, especially in recall rate. Based on the positions of the potential IPs, the cleanup language model and the alignment model are able to detect and correct the edit disfluency efficiently. Experimental results demonstrate that the proposed approach has achieved error rates of 0.33 and 0.21 for IP detection and edit word deletion, respectively.
AB - This investigation presents a novel approach to detecting and correcting edit disfluency in spontaneous speech. Hypothesis testing using acoustic features is first adopted to detect potential interruption points (IPs) in the input speech. The word order of the utterance is then cleaned up based on the potential IPs using a class-based cleanup language model, and the deletable region and the correction are aligned using an alignment model. Finally, log-linear weighting is applied to optimize the performance. Using the acoustic features, the IP detection rate is significantly improved, especially in recall rate. Based on the positions of the potential IPs, the cleanup language model and the alignment model are able to detect and correct the edit disfluency efficiently. Experimental results demonstrate that the proposed approach has achieved error rates of 0.33 and 0.21 for IP detection and edit word deletion, respectively.
UR - http://www.scopus.com/inward/record.url?scp=34047266604&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34047266604&partnerID=8YFLogxK
U2 - 10.1109/TASL.2006.878267
DO - 10.1109/TASL.2006.878267
M3 - Article
AN - SCOPUS:34047266604
SN - 1558-7916
VL - 14
SP - 1574
EP - 1583
JO - IEEE Transactions on Audio, Speech and Language Processing
JF - IEEE Transactions on Audio, Speech and Language Processing
IS - 5
ER -