A new eigenvoice approach to speaker adaptation

Chih-Hsien Huang, Jen Tzung Chien, Hsin Min Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

8 Citations (Scopus)

Abstract

In this paper, we present two approaches to improving eigenvoice-based speaker adaptation. First, we present maximum a posteriori eigen-decomposition (MAPED), in which the linear combination coefficients of the eigenvector decomposition are estimated according to the MAP criterion. MAPED incorporates prior knowledge of the decomposition, modeled here by a Gaussian distribution. When little adaptation data is available, MAPED achieves better performance than maximum likelihood eigen-decomposition (MLED). Second, we exploit the adaptation of the covariance matrices of the hidden Markov model (HMM) in the eigenvoice framework. Our method uses principal component analysis (PCA) to project the speaker-specific HMM parameters onto a smaller orthogonal feature space. We then reliably calculate the HMM covariance matrices from the observations in the reduced feature space. The adapted HMM covariance matrices are estimated by transforming the covariance matrices from the reduced feature space back to the original feature space. The experimental results show that eigenvoice speaker adaptation using MAPED and incorporating covariance adaptation improves on the performance of the original eigenvoice adaptation in Mandarin speech recognition.
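The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's formulation: the abstract gives no equations, so the sketch assumes a simplified linear-Gaussian model in which each adaptation frame is the speaker-independent mean plus a weighted sum of eigenvoices plus isotropic Gaussian noise. The function names (`ml_weights`, `map_weights`, `covariance_via_subspace`) and the parameters `noise_var`, `prior_mean` and `prior_var` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Assumed toy model (not taken from the paper):
#   frame_t = mean + E @ w + noise,  noise ~ N(0, noise_var * I)
# E is a (D, K) matrix whose columns are the eigenvoices.


def ml_weights(E, mean, frames):
    """Maximum-likelihood weights under the toy model (least squares)."""
    xbar = frames.mean(axis=0)
    return np.linalg.solve(E.T @ E, E.T @ (xbar - mean))


def map_weights(E, mean, frames, noise_var, prior_mean, prior_var):
    """MAP weights with a diagonal Gaussian prior w ~ N(prior_mean, diag(prior_var)).

    With few frames the estimate stays close to prior_mean; as the number
    of frames grows it approaches the maximum-likelihood estimate.
    """
    num_frames = frames.shape[0]
    xbar = frames.mean(axis=0)
    prior_precision = np.diag(1.0 / np.asarray(prior_var, dtype=float))
    # Posterior precision and right-hand side of the normal equations.
    A = (num_frames / noise_var) * (E.T @ E) + prior_precision
    b = (num_frames / noise_var) * (E.T @ (xbar - mean)) + prior_precision @ prior_mean
    return np.linalg.solve(A, b)


def covariance_via_subspace(P, frames):
    """Estimate a covariance in a reduced space and map it back.

    P is a (D, K) matrix with orthonormal columns (for example, PCA
    directions). The K x K covariance needs far fewer observations to
    estimate than the full D x D one. Mapping back as P @ cov_z @ P.T is
    an assumption of this sketch; the result has rank at most K.
    """
    centered = frames - frames.mean(axis=0)
    z = centered @ P                      # observations in the reduced space
    cov_z = (z.T @ z) / z.shape[0]        # biased (divide-by-N) covariance
    return P @ cov_z @ P.T
```

In this toy setting, calling `map_weights` with a very large `prior_var` reproduces `ml_weights`, while a small `prior_var` pulls the weights toward `prior_mean`. This is the sense in which a MAP estimate can be more robust than an ML estimate when adaptation data is scarce.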

Original language: English
Title of host publication: 2004 International Symposium on Chinese Spoken Language Processing - Proceedings
Pages: 109-112
Number of pages: 4
Publication status: Published - 2004 Dec 1
Event: 2004 International Symposium on Chinese Spoken Language Processing - Hong Kong, China, Hong Kong
Duration: 2004 Dec 15 to 2004 Dec 18

Publication series

Name: 2004 International Symposium on Chinese Spoken Language Processing - Proceedings

Other

Other: 2004 International Symposium on Chinese Spoken Language Processing
Country: Hong Kong
City: Hong Kong, China
Period: 04-12-15 to 04-12-18

Fingerprint

Decomposition
Hidden Markov models
Covariance matrix
Gaussian distribution
Speech recognition
Eigenvalues and eigenfunctions
Principal component analysis
Maximum likelihood

All Science Journal Classification (ASJC) codes

  • Engineering (all)

Cite this

Huang, C-H., Chien, J. T., & Wang, H. M. (2004). A new eigenvoice approach to speaker adaptation. In 2004 International Symposium on Chinese Spoken Language Processing - Proceedings (pp. 109-112). [L5.4] (2004 International Symposium on Chinese Spoken Language Processing - Proceedings).
@inproceedings{25ae8fd2f6194e59b5e085f814435cb7,
title = "A new eigenvoice approach to speaker adaptation",
author = "Huang, Chih-Hsien and Chien, {Jen Tzung} and Wang, {Hsin Min}",
year = "2004",
month = "12",
day = "1",
language = "English",
isbn = "0780386787",
series = "2004 International Symposium on Chinese Spoken Language Processing - Proceedings",
pages = "109--112",
booktitle = "2004 International Symposium on Chinese Spoken Language Processing - Proceedings",

}


TY - GEN

T1 - A new eigenvoice approach to speaker adaptation

AU - Huang, Chih-Hsien

AU - Chien, Jen Tzung

AU - Wang, Hsin Min

PY - 2004/12/1

Y1 - 2004/12/1

UR - http://www.scopus.com/inward/record.url?scp=21444440902&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=21444440902&partnerID=8YFLogxK

M3 - Conference contribution

SN - 0780386787

SN - 9780780386785

T3 - 2004 International Symposium on Chinese Spoken Language Processing - Proceedings

SP - 109

EP - 112

BT - 2004 International Symposium on Chinese Spoken Language Processing - Proceedings

ER -
