MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training

Chung Han Lee, Chung-Hsien Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

72 Citations (Scopus)

Abstract

This study presents an approach to GMM-based speech conversion using maximum a posteriori (MAP) adaptation. First, a conversion function is trained on a parallel corpus containing the same utterances spoken by both the source and the reference speakers. A non-parallel corpus from a new target speaker is then used to adapt the conversion function, so that it models the voice conversion between the source speaker and the new target speaker. The consistency among the adaptation data is estimated in order to select suitable data from the non-parallel corpus for MAP-based adaptation of the GMMs. In the speech conversion evaluation, experimental results show that MAP adaptation using a small non-parallel corpus reduces the conversion error and improves the speech quality for speaker identification compared to the method without adaptation. Objective and subjective tests also confirm the promising performance of the proposed approach.
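As a rough illustration of the MAP adaptation step mentioned in the abstract, the sketch below adapts only the means of a diagonal-covariance GMM toward a small set of adaptation frames. It uses the common relevance-factor form of MAP mean adaptation, which is an assumption here: the abstract does not give the update equations, the data-selection criterion, or the form of the conversion function, so this is not the authors' method. The function name `map_adapt_means` and the `relevance` parameter are hypothetical.

```python
import numpy as np

def map_adapt_means(means, covs, weights, data, relevance=16.0):
    """MAP-adapt the means of a diagonal-covariance GMM (illustrative sketch).

    means:     (K, D) prior component means
    covs:      (K, D) diagonal variances of each component
    weights:   (K,)   mixture weights
    data:      (N, D) adaptation frames from the new speaker
    relevance: assumed relevance factor controlling trust in the prior
    """
    # Log-density of every frame under every diagonal Gaussian: shape (N, K)
    diff = data[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / covs, axis=2)
                        + np.sum(np.log(2.0 * np.pi * covs), axis=1))
    log_post = np.log(weights) + log_gauss

    # Normalize to component posteriors; subtract the row max for stability
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order sufficient statistics of the adaptation data
    n_k = post.sum(axis=0)        # (K,) soft frame counts
    first = post.T @ data         # (K, D) posterior-weighted sums

    # Interpolate between the data mean and the prior mean per component
    alpha = n_k / (n_k + relevance)
    data_mean = first / np.maximum(n_k, 1e-10)[:, None]
    return alpha[:, None] * data_mean + (1.0 - alpha)[:, None] * means
```

Components that receive few adaptation frames stay close to their prior means, which is why this style of adaptation can work with a small corpus; components with many frames move toward the new speaker's data.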

Original language: English
Title of host publication: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Publisher: International Speech Communication Association
Pages: 2254-2257
Number of pages: 4
Volume: 5
ISBN (Print): 9781604234497
Publication status: Published - 2006
Event: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP - Pittsburgh, PA, United States
Duration: 2006 Sep 17 to 2006 Sep 21

Other

Other: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Country: United States
City: Pittsburgh, PA
Period: 06-09-17 to 06-09-21

All Science Journal Classification (ASJC) codes

  • Computer Science(all)

Cite this

Lee, C. H., & Wu, C-H. (2006). MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training. In INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP (Vol. 5, pp. 2254-2257). International Speech Communication Association.