TY - CHAP
T1 - Emotion perception and recognition from speech
AU - Wu, Chung Hsien
AU - Yeh, Jui Feng
AU - Chuang, Ze Jing
PY - 2009
Y1 - 2009
N2 - With the increasing role of speech interfaces in human-computer interaction applications, automatically recognizing emotions from human speech becomes more and more important. This chapter begins by introducing the correlations between basic speech features such as pitch, intensity, formants, MFCC, and so on, and the emotions. Several recognition methods are then described to illustrate the performance of the previously proposed models, including support vector machine (SVM), K-nearest neighbors (KNN), neural networks, and the like. To give a more practical description of an emotion recognition procedure, a new approach to emotion recognition is provided as a case study. In this case study, the Intonation Groups (IGs) of the input speech signals are first defined and extracted for feature extraction. With the assumption of linear mapping between feature spaces in different emotional states, a feature compensation approach is proposed to characterize the feature space with better discriminability among emotional states. The compensation vector with respect to each emotional state is estimated using the Minimum Classification Error (MCE) algorithm. The IG-based feature vectors compensated by the compensation vectors are used to train the Gaussian Mixture Models (GMMs) for each emotional state. The emotional state with the GMM having the maximal likelihood ratio is determined as the emotion state output.
AB - With the increasing role of speech interfaces in human-computer interaction applications, automatically recognizing emotions from human speech becomes more and more important. This chapter begins by introducing the correlations between basic speech features such as pitch, intensity, formants, MFCC, and so on, and the emotions. Several recognition methods are then described to illustrate the performance of the previously proposed models, including support vector machine (SVM), K-nearest neighbors (KNN), neural networks, and the like. To give a more practical description of an emotion recognition procedure, a new approach to emotion recognition is provided as a case study. In this case study, the Intonation Groups (IGs) of the input speech signals are first defined and extracted for feature extraction. With the assumption of linear mapping between feature spaces in different emotional states, a feature compensation approach is proposed to characterize the feature space with better discriminability among emotional states. The compensation vector with respect to each emotional state is estimated using the Minimum Classification Error (MCE) algorithm. The IG-based feature vectors compensated by the compensation vectors are used to train the Gaussian Mixture Models (GMMs) for each emotional state. The emotional state with the GMM having the maximal likelihood ratio is determined as the emotion state output.
UR - http://www.scopus.com/inward/record.url?scp=77950922687&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77950922687&partnerID=8YFLogxK
U2 - 10.1007/978-1-84800-306-4_6
DO - 10.1007/978-1-84800-306-4_6
M3 - Chapter
AN - SCOPUS:77950922687
SN - 9781848003057
SP - 93
EP - 110
BT - Affective Information Processing
PB - Springer London
ER -