Speech emotion classification using multiple kernel Gaussian process

Sih Huei Chen, Jia Ching Wang, Wen Chi Hsieh, Yu Hao Chin, Chin Wen Ho, Chung-Hsien Wu

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

1 Citation (Scopus)

Abstract

Given the increasing attention paid to speech emotion classification in recent years, this work presents a novel speech emotion classification approach based on the multiple kernel Gaussian process. Two major aspects of a classification problem that play an important role in classification accuracy are addressed: feature extraction and classification. Prosodic features and other features widely used in sound effect classification are selected. A semi-nonnegative matrix factorization algorithm is then applied to the proposed features in order to obtain more information about them. Following feature extraction, a multiple kernel Gaussian process (GP) is used for classification, in which two notions of similarity in the data are represented in the learning algorithm by combining the linear kernel and the radial basis function (RBF) kernel. According to our results, the proposed speech emotion classification approach achieves an accuracy of 77.74%. Moreover, a comparison of different approaches shows that the proposed system outperforms the others.
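
The kernel combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the linear and RBF kernels are combined, so the weighted sum used here, the weights `w_lin` and `w_rbf`, the width parameter `gamma`, the toy feature vectors, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math

def linear_kernel(x, y):
    # Inner product: similarity grows with aligned feature magnitudes.
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    # Gaussian of the squared Euclidean distance: similarity decays with distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def combined_kernel(x, y, w_lin=0.5, w_rbf=0.5, gamma=0.5):
    # Assumed combination rule: a non-negative weighted sum of the two kernels.
    return w_lin * linear_kernel(x, y) + w_rbf * rbf_kernel(x, y, gamma)

def gram_matrix(samples, **kernel_args):
    # Covariance (Gram) matrix that a GP would use over the training samples.
    return [[combined_kernel(a, b, **kernel_args) for b in samples] for a in samples]

# Hypothetical 2-dimensional feature vectors, not data from the paper.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = gram_matrix(features)
```

A non-negative weighted sum of valid kernels is itself a valid kernel, so `K` is symmetric and positive semidefinite and could serve as a GP covariance matrix. In practice the weights and `gamma` would be hyperparameters to tune or learn.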

Original language: English
Title of host publication: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9789881476821
DOI: https://doi.org/10.1109/APSIPA.2016.7820708
Publication status: Published - 2017 Jan 17
Event: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016 - Jeju, Korea, Republic of
Duration: 2016 Dec 13 to 2016 Dec 16

Publication series

Name: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016

Other

Other: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016
Country: Korea, Republic of
City: Jeju
Period: 16-12-13 to 16-12-16

Fingerprint

Feature extraction
Factorization
Learning algorithms
Acoustic waves

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Information Systems
  • Signal Processing

Cite this

Chen, S. H., Wang, J. C., Hsieh, W. C., Chin, Y. H., Ho, C. W., & Wu, C-H. (2017). Speech emotion classification using multiple kernel Gaussian process. In 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016 [7820708] (2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/APSIPA.2016.7820708