Aggregate a posteriori linear regression for speaker adaptation

Chih-Hsien Huang, Jen Tzung Chien

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

In this paper, we present a rapid and discriminative speaker adaptation algorithm for speech recognition. The adaptation paradigm is built on the popular linear regression transformation framework. We estimate the regression matrices from speaker-specific adaptation data according to the aggregate a posteriori criterion, which can be expressed in the form of a classification error function. The proposed aggregate a posteriori linear regression (AAPLR) therefore estimates discriminative linear regression matrices for transformation-based adaptation so that classification errors are minimized. Unlike minimum classification error linear regression (MCELR), the AAPLR algorithm has a closed-form solution, which enables rapid speaker adaptation. The experimental results show that AAPLR speaker adaptation improves speech recognition performance at moderate computational cost compared with maximum likelihood linear regression (MLLR), maximum a posteriori linear regression (MAPLR), and MCELR.
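
The abstract does not give equations, so the following is only a minimal sketch of the general linear regression transformation framework it refers to, as commonly used in MLLR-style adaptation: each speaker-independent Gaussian mean is mapped through an affine transform, written as a regression matrix W = [b, A] applied to an extended mean vector. The function names and the numeric values are hypothetical, and the sketch does not show how AAPLR (or MLLR, MAPLR, MCELR) estimates W from adaptation data.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def extend_mean(mean: Sequence[float]) -> Vector:
    """Build the extended mean vector xi = [1, mu_1, ..., mu_d]."""
    return [1.0, *mean]


def adapt_mean(W: Matrix, mean: Sequence[float]) -> Vector:
    """Apply a d x (d+1) regression matrix W = [b, A] to one Gaussian mean.

    Returns W @ xi, which equals A @ mean + b.
    """
    xi = extend_mean(mean)
    if any(len(row) != len(xi) for row in W):
        raise ValueError("each row of W must have d + 1 entries")
    return [sum(w * x for w, x in zip(row, xi)) for row in W]


# Hypothetical 2-dimensional example: bias (0.5, -1.0) and scaling by 2.
W_example = [
    [0.5, 2.0, 0.0],
    [-1.0, 0.0, 2.0],
]
speaker_independent_mean = [1.0, 3.0]
adapted = adapt_mean(W_example, speaker_independent_mean)
print(adapted)  # [2.5, 5.0]
```

In this framework the adaptation methods named in the abstract differ in the criterion used to choose W, not in how W is applied to the means.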

Original language: English
Title of host publication: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Image and Multidimensional Signal Processing Multimedia Signal Processing
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Print): 0780388747, 9780780388741
DOIs: https://doi.org/10.1109/ICASSP.2005.1415278
Publication status: Published - 2005 Jan 1
Event: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Philadelphia, PA, United States
Duration: 2005 Mar 18 to 2005 Mar 23

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: I
ISSN (Print): 1520-6149

Other

Other: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05
Country: United States
City: Philadelphia, PA
Period: 05-03-18 to 05-03-23

Fingerprint

Linear regression
Speech recognition
Maximum likelihood

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Huang, C-H., & Chien, J. T. (2005). Aggregate a posteriori linear regression for speaker adaptation. In 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Image and Multidimensional Signal Processing Multimedia Signal Processing [1415278] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. I). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICASSP.2005.1415278
@inproceedings{e261a58f28d84bfb98964d7049168e7b,
title = "Aggregate a posteriori linear regression for speaker adaptation",
author = "Chih-Hsien Huang and Chien, {Jen Tzung}",
year = "2005",
month = "1",
day = "1",
doi = "10.1109/ICASSP.2005.1415278",
language = "English",
isbn = "0780388747",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Image and Multidimensional Signal Processing Multimedia Signal Processing",
address = "United States",

}
