This paper presents an approach to the automatic recognition of human emotions from audio-visual bimodal signals using an error weighted semi-coupled hidden Markov model (EWSC-HMM). The proposed approach combines an SC-HMM with a state-based bimodal alignment strategy and a Bayesian classifier weighting scheme to obtain the optimal emotion recognition result based on audio-visual bimodal fusion. The state-based bimodal alignment strategy in SC-HMM is proposed to align the temporal relation between audio and visual streams. The Bayesian classifier weighting scheme is then adopted to explore the contributions of the SC-HMM-based classifiers for different audio-visual feature pairs in order to obtain the emotion recognition output. For performance evaluation, two databases are considered: the MHMC posed database and the SEMAINE naturalistic database. Experimental results show that the proposed approach not only outperforms other fusion-based bimodal emotion recognition methods for posed expressions but also provides satisfactory results for naturalistic expressions.
All Science Journal Classification (ASJC) codes
- Signal Processing
- Media Technology
- Computer Science Applications
- Electrical and Electronic Engineering