Speech emotion classification using multiple kernel Gaussian process

Sih Huei Chen, Jia Ching Wang, Wen Chi Hsieh, Yu Hao Chin, Chin Wen Ho, Chung-Hsien Wu

Research output: Conference contribution

1 Citation (Scopus)

Abstract

Given the increasing attention paid to speech emotion classification in recent years, this work presents a novel speech emotion classification approach based on the multiple kernel Gaussian process. Two aspects of a classification problem that strongly affect classification accuracy are addressed: feature extraction and classification. Prosodic features and other features widely used in sound effect classification are selected. A semi-nonnegative matrix factorization algorithm is then applied to the proposed features to obtain more information about them. Following feature extraction, a multiple kernel Gaussian process (GP) is used for classification, in which two notions of similarity in the data are represented in the learning algorithm by combining the linear kernel and the radial basis function (RBF) kernel. According to our results, the proposed speech emotion classification approach achieves an accuracy of 77.74%. Moreover, a comparison of different approaches reveals that the proposed system outperforms the others.
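
The kernel combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the linear and RBF kernels are combined, so the convex weighted sum below is an assumption, and the names `combined_kernel`, `weight`, and `gamma` are hypothetical. The sketch only builds the Gram matrix that a GP classifier would consume; it is not the authors' implementation.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combined_kernel(x, y, weight=0.5, gamma=0.5):
    """Assumed combination rule: a convex weighted sum of both kernels.

    A nonnegative weighted sum of valid kernels is itself a valid kernel,
    so the result can serve as a GP covariance function.
    """
    return weight * linear_kernel(x, y) + (1.0 - weight) * rbf_kernel(x, y, gamma)


def gram_matrix(samples, weight=0.5, gamma=0.5):
    """Pairwise combined-kernel values for a list of feature vectors."""
    return [[combined_kernel(a, b, weight, gamma) for b in samples] for a in samples]


# Toy feature vectors standing in for extracted speech features (made up).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = gram_matrix(features)
for row in K:
    print([round(value, 4) for value in row])
```

In this sketch the linear term captures global, inner-product similarity while the RBF term captures local, distance-based similarity; `weight` trades one off against the other and would normally be tuned or learned rather than fixed.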

Original language: English
Title of host publication: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9789881476821
DOIs: https://doi.org/10.1109/APSIPA.2016.7820708
Publication status: Published - 17 Jan 2017
Event: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016 - Jeju, Korea, Republic of
Duration: 13 Dec 2016 to 16 Dec 2016

Publication series

Name: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016

Other

Other: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016
Country: Korea, Republic of
City: Jeju
Period: 13 Dec 2016 to 16 Dec 2016

Fingerprint

Feature extraction
Factorization
Learning algorithms
Acoustic waves

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Information Systems
  • Signal Processing

Cite this

Chen, S. H., Wang, J. C., Hsieh, W. C., Chin, Y. H., Ho, C. W., & Wu, C-H. (2017). Speech emotion classification using multiple kernel Gaussian process. In 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016 [7820708] (2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/APSIPA.2016.7820708
Chen, Sih Huei ; Wang, Jia Ching ; Hsieh, Wen Chi ; Chin, Yu Hao ; Ho, Chin Wen ; Wu, Chung-Hsien. / Speech emotion classification using multiple kernel Gaussian process. 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016. Institute of Electrical and Electronics Engineers Inc., 2017. (2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016).
@inproceedings{e356ba26f531438b8b785b8e33637bd6,
title = "Speech emotion classification using multiple kernel Gaussian process",
author = "Chen, {Sih Huei} and Wang, {Jia Ching} and Hsieh, {Wen Chi} and Chin, {Yu Hao} and Ho, {Chin Wen} and Wu, Chung-Hsien",
year = "2017",
month = "1",
day = "17",
doi = "10.1109/APSIPA.2016.7820708",
language = "English",
series = "2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016",
address = "United States",

}

Chen, SH, Wang, JC, Hsieh, WC, Chin, YH, Ho, CW & Wu, C-H 2017, Speech emotion classification using multiple kernel Gaussian process. In 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016, 7820708, 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016, Institute of Electrical and Electronics Engineers Inc., 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016, Jeju, Korea, Republic of, 16-12-13. https://doi.org/10.1109/APSIPA.2016.7820708

TY - GEN

T1 - Speech emotion classification using multiple kernel Gaussian process

AU - Chen, Sih Huei

AU - Wang, Jia Ching

AU - Hsieh, Wen Chi

AU - Chin, Yu Hao

AU - Ho, Chin Wen

AU - Wu, Chung-Hsien

PY - 2017/1/17

Y1 - 2017/1/17

UR - http://www.scopus.com/inward/record.url?scp=85013752979&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85013752979&partnerID=8YFLogxK

U2 - 10.1109/APSIPA.2016.7820708

DO - 10.1109/APSIPA.2016.7820708

M3 - Conference contribution

AN - SCOPUS:85013752979

T3 - 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016

BT - 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Chen SH, Wang JC, Hsieh WC, Chin YH, Ho CW, Wu C-H. Speech emotion classification using multiple kernel Gaussian process. In 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016. Institute of Electrical and Electronics Engineers Inc. 2017. 7820708. (2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016). https://doi.org/10.1109/APSIPA.2016.7820708