Variable-length unit selection in TTS using structural syntactic cost

Chung Hsien Wu, Chi Chun Hsia, Jiun Fu Chen, Jhing Fa Wang

Research output: Article, peer-reviewed

15 citations (Scopus)

Abstract

This paper presents a variable-length unit selection scheme that uses a syntactic cost to select text-to-speech (TTS) synthesis units. The syntactic structure of a sentence is derived from a probabilistic context-free grammar (PCFG) and represented as a syntactic vector. The syntactic difference between target and candidate units (words or phrases) is estimated by the cosine measure, with the inside probability of the PCFG acting as a weight. Latent semantic analysis (LSA) is applied to reduce the dimensionality of the syntactic vectors. A dynamic programming algorithm is adopted to obtain the concatenated unit sequence with minimum cost. A speech database rich in syntactic properties is designed and collected as the unit inventory. Several experiments with statistical testing are conducted to assess the quality of the synthetic speech as perceived by human subjects. The proposed method outperforms a synthesizer that does not consider syntactic properties. The structural syntax estimates the substitution cost better than acoustic features alone.
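
The abstract names two computational steps that a short sketch may make more concrete: a weighted cosine cost between syntactic vectors, and a dynamic programming search over variable-length units. The Python below is only an illustration under stated assumptions, not the authors' implementation. The function names (`weighted_cosine_cost`, `select_units`), the way the weights are applied (element-wise scaling of both vectors), the constant `join_cost`, and the candidate tuple layout are all hypothetical. The LSA reduction step is omitted, and the vectors and weights are assumed to be given.

```python
import math

def weighted_cosine_cost(target, candidate, weights):
    """Return 1 - cosine similarity of two vectors after element-wise weighting.

    Assumption: the weights (standing in for PCFG inside probabilities)
    scale each dimension of both vectors before the cosine is taken.
    """
    t = [w * x for w, x in zip(weights, target)]
    c = [w * x for w, x in zip(weights, candidate)]
    norm_t = math.sqrt(sum(x * x for x in t))
    norm_c = math.sqrt(sum(x * x for x in c))
    if norm_t == 0.0 or norm_c == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    dot = sum(a * b for a, b in zip(t, c))
    return 1.0 - dot / (norm_t * norm_c)

def select_units(n_words, candidates, join_cost=0.1):
    """Cover words [0, n_words) with candidate units at minimum total cost.

    candidates: list of (unit_id, start, end, cost), where the unit spans
    words [start, end) and cost is its substitution cost. Units may be
    single words or longer phrases. join_cost is a hypothetical constant
    penalty for each concatenation point.
    """
    inf = float("inf")
    best = [inf] * (n_words + 1)   # best[i]: minimum cost to cover words [0, i)
    back = [None] * (n_words + 1)  # back[i]: (previous position, unit_id)
    best[0] = 0.0
    for end in range(1, n_words + 1):
        for unit_id, start, unit_end, cost in candidates:
            if unit_end != end or best[start] == inf:
                continue
            total = best[start] + cost + (join_cost if start > 0 else 0.0)
            if total < best[end]:
                best[end] = total
                back[end] = (start, unit_id)
    if best[n_words] == inf:
        raise ValueError("no candidate sequence covers the whole target")
    # Trace the chosen units back from the end of the sentence.
    path = []
    pos = n_words
    while pos > 0:
        pos, unit_id = back[pos]
        path.append(unit_id)
    return best[n_words], path[::-1]

# Toy example: a three-word target with one two-word phrase candidate.
toy_candidates = [
    ("w0", 0, 1, 0.2),
    ("w1", 1, 2, 0.2),
    ("w2", 2, 3, 0.2),
    ("p01", 0, 2, 0.3),
]
toy_cost, toy_path = select_units(3, toy_candidates)
```

In this toy setup the phrase candidate is preferred over two separate words because it avoids one concatenation point. In practice the `cost` field of each candidate would come from `weighted_cosine_cost`, possibly combined with acoustic costs.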

Original language: English
Article number: 4156186
Pages (from-to): 1227-1235
Number of pages: 9
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 15
Issue number: 4
Publication status: Published - May 2007

All Science Journal Classification (ASJC) codes

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering
