Audiovisual Emotion Recognition Using Semi-Coupled Hidden Markov Model with State-Based Alignment Strategy

Chung-Hsien Wu, Jen Chun Lin, Wen Li Wei

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This chapter introduces current data fusion strategies for combining audio and visual signals in bimodal emotion recognition. Face detection is performed with an AdaBoost cascade face detector, which provides the initial facial position and reduces the time needed for error convergence in feature extraction. An active appearance model (AAM) is employed to extract 68 labeled facial feature points (FPs) from five facial regions (eyebrows, eyes, nose, mouth, and facial contour) for the later calculation of facial animation parameters (FAPs). Three kinds of primary prosodic features are extracted from each speech frame for emotion recognition: pitch, energy, and formants F1-F5. Finally, a semi-coupled hidden Markov model (SC-HMM) is proposed for emotion recognition, using a state-based alignment strategy for the audiovisual bimodal features.
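As a rough illustration of the frame-level prosodic features named in the abstract, the following minimal sketch computes pitch and energy for each speech frame. It is not the chapter's implementation: the autocorrelation pitch estimator, the 25 ms frame and 10 ms hop, the 75-400 Hz search range, and all function names are assumptions chosen for illustration, and formant extraction (F1-F5) is omitted.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames (ragged tail is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def autocorr_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate pitch in Hz as the lag with the largest raw autocorrelation.

    Returns 0.0 when no lag has a positive score (e.g. a silent frame).
    The fmin/fmax search range is an illustrative assumption.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation: fewer terms at larger lags,
        # which biases the estimate toward the shortest matching period.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

def prosodic_features(samples, sample_rate, frame_ms=25, hop_ms=10):
    """Return one (pitch_hz, energy) pair per frame."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    return [(autocorr_pitch(f, sample_rate), short_time_energy(f))
            for f in frame_signal(samples, frame_len, hop)]

# Synthetic test signal (an assumption, not data from the chapter):
# 0.1 s of a 200 Hz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE // 10)]
features = prosodic_features(tone, SAMPLE_RATE)
```

In a bimodal system of the kind the abstract describes, per-frame vectors like these would form the audio observation sequence, with the FAP-based vectors forming the visual one. How the two streams are aligned and fused is specific to the chapter and is not reproduced here.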

Original language: English
Title of host publication: Emotion Recognition
Subtitle of host publication: A Pattern Analysis Approach
Publisher: Wiley
Pages: 493-513
Number of pages: 21
ISBN (Electronic): 9781118910566
ISBN (Print): 9781118130667
DOIs: https://doi.org/10.1002/9781118910566.ch19
Publication status: Published - 2015 Jan 2

Fingerprint

Adaptive boosting
Data fusion
Hidden Markov models
Face recognition
Animation
Feature extraction
Detectors

All Science Journal Classification (ASJC) codes

  • Engineering (all)
  • Computer Science (all)

Cite this

Wu, C-H., Lin, J. C., & Wei, W. L. (2015). Audiovisual Emotion Recognition Using Semi-Coupled Hidden Markov Model with State-Based Alignment Strategy. In Emotion Recognition: A Pattern Analysis Approach (pp. 493-513). Wiley. https://doi.org/10.1002/9781118910566.ch19
Wu, Chung-Hsien ; Lin, Jen Chun ; Wei, Wen Li. / Audiovisual Emotion Recognition Using Semi-Coupled Hidden Markov Model with State-Based Alignment Strategy. Emotion Recognition: A Pattern Analysis Approach. Wiley, 2015. pp. 493-513
@inbook{da68bf0d33154c6e80a7de3a79c367de,
title = "Audiovisual Emotion Recognition Using Semi-Coupled Hidden Markov Model with State-Based Alignment Strategy",
author = "Wu, Chung-Hsien and Lin, {Jen Chun} and Wei, {Wen Li}",
year = "2015",
month = "1",
day = "2",
doi = "10.1002/9781118910566.ch19",
language = "English",
isbn = "9781118130667",
pages = "493--513",
booktitle = "Emotion Recognition",
publisher = "Wiley",
}

TY - CHAP

T1 - Audiovisual Emotion Recognition Using Semi-Coupled Hidden Markov Model with State-Based Alignment Strategy

AU - Wu, Chung-Hsien

AU - Lin, Jen Chun

AU - Wei, Wen Li

PY - 2015/1/2

Y1 - 2015/1/2

UR - http://www.scopus.com/inward/record.url?scp=85016374157&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85016374157&partnerID=8YFLogxK

U2 - 10.1002/9781118910566.ch19

DO - 10.1002/9781118910566.ch19

M3 - Chapter

SN - 9781118130667

SP - 493

EP - 513

BT - Emotion Recognition

PB - Wiley

ER -