Speaking effect removal on emotion recognition from facial expressions based on eigenface conversion

Chung-Hsien Wu, Wen Li Wei, Jen Chun Lin, Wei Yu Lee

Research output: Contribution to journal › Article

26 Citations (Scopus)

Abstract

Speaking effect is a crucial issue that may dramatically degrade performance in emotion recognition from facial expressions. To manage this problem, an eigenface conversion-based approach is proposed to remove speaking effect on facial expressions for improving accuracy of emotion recognition. In the proposed approach, a context-dependent linear conversion function modeled by a statistical Gaussian Mixture Model (GMM) is constructed with parallel data from speaking and non-speaking facial expressions with emotions. To model the speaking effect in more detail, the conversion functions are categorized using a decision tree considering the visual temporal context of the Articulatory Attribute (AA) classes of the corresponding input speech segments. For verification of the identified quadrant of emotional expression on the Arousal-Valence (A-V) emotion plane, which is commonly used to dimensionally define the emotion classes, from the reconstructed facial feature points, an expression template is constructed to represent the feature points of the non-speaking facial expressions for each quadrant. With the verified quadrant, a regression scheme is further employed to estimate the A-V values of the facial expression as a precise point in the A-V emotion plane. Experimental results show that the proposed method outperforms current approaches and demonstrates that removing the speaking effect on facial expression is useful for improving the performance of emotion recognition.
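The core conversion step described in the abstract follows the standard GMM-based mapping between parallel feature sets: fit a joint Gaussian Mixture Model on aligned source/target vectors, then convert each source vector with the mixture of per-component linear (conditional-mean) regressions. The sketch below illustrates that general technique only; it is not the authors' implementation, and all data, dimensions, and component counts are synthetic placeholders.

```python
# Minimal sketch of GMM-based linear conversion between parallel feature
# sets (here standing in for aligned speaking -> non-speaking facial
# feature vectors). Illustrative only, not the paper's actual system.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
D, N = 4, 500    # placeholder feature dimension and frame count

# Synthetic parallel data: y is an affine function of x plus noise.
x = rng.normal(size=(N, D))
A = 0.5 * rng.normal(size=(D, D)) + np.eye(D)
y = x @ A.T + 0.1 * rng.normal(size=(N, D))

# 1) Fit a GMM on the joint vectors z = [x, y].
z = np.hstack([x, y])
gmm = GaussianMixture(n_components=4, covariance_type="full",
                      random_state=0).fit(z)

def convert(x_new, gmm, D):
    """Map source features to target features via the mixture of
    per-component conditional means E[y | x, m], weighted by P(m | x)."""
    mu, cov, M = gmm.means_, gmm.covariances_, gmm.n_components
    # Responsibilities P(m | x) from the x-marginal of each component.
    logp = np.array([multivariate_normal.logpdf(x_new, mu[m, :D],
                                                cov[m, :D, :D])
                     for m in range(M)])
    w = gmm.weights_[:, None] * np.exp(logp - logp.max(axis=0))
    w /= w.sum(axis=0)
    # Per-component linear map: mu_y + S_yx S_xx^{-1} (x - mu_x).
    out = np.zeros((x_new.shape[0], D))
    for m in range(M):
        gain = cov[m, D:, :D] @ np.linalg.inv(cov[m, :D, :D])
        out += w[m][:, None] * (mu[m, D:] + (x_new - mu[m, :D]) @ gain.T)
    return out

y_hat = convert(x, gmm, D)
err = float(np.sqrt(np.mean((y_hat - y) ** 2)))
print(f"conversion RMSE: {err:.3f}")  # typically near the 0.1 noise level
```

In the paper this conversion function is additionally made context-dependent: a decision tree over the visual temporal context of articulatory-attribute classes selects among several such GMM conversion functions, rather than using a single global one as in this sketch.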

Original language: English
Article number: 6557471
Pages (from-to): 1732-1744
Number of pages: 13
Journal: IEEE Transactions on Multimedia
Volume: 15
Issue number: 8
DOIs: 10.1109/TMM.2013.2272917
Publication status: Published - 2013 Dec 2

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Media Technology
  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

@article{313dbed3b4524342b0c0176d7e88f3a7,
title = "Speaking effect removal on emotion recognition from facial expressions based on eigenface conversion",
abstract = "Speaking effect is a crucial issue that may dramatically degrade performance in emotion recognition from facial expressions. To manage this problem, an eigenface conversion-based approach is proposed to remove speaking effect on facial expressions for improving accuracy of emotion recognition. In the proposed approach, a context-dependent linear conversion function modeled by a statistical Gaussian Mixture Model (GMM) is constructed with parallel data from speaking and non-speaking facial expressions with emotions. To model the speaking effect in more detail, the conversion functions are categorized using a decision tree considering the visual temporal context of the Articulatory Attribute (AA) classes of the corresponding input speech segments. For verification of the identified quadrant of emotional expression on the Arousal-Valence (A-V) emotion plane, which is commonly used to dimensionally define the emotion classes, from the reconstructed facial feature points, an expression template is constructed to represent the feature points of the non-speaking facial expressions for each quadrant. With the verified quadrant, a regression scheme is further employed to estimate the A-V values of the facial expression as a precise point in the A-V emotion plane. Experimental results show that the proposed method outperforms current approaches and demonstrates that removing the speaking effect on facial expression is useful for improving the performance of emotion recognition.",
author = "Wu, {Chung-Hsien} and Wei, {Wen Li} and Lin, {Jen Chun} and Lee, {Wei Yu}",
year = "2013",
month = "12",
day = "2",
doi = "10.1109/TMM.2013.2272917",
language = "English",
volume = "15",
pages = "1732--1744",
journal = "IEEE Transactions on Multimedia",
issn = "1520-9210",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "8",
}

Speaking effect removal on emotion recognition from facial expressions based on eigenface conversion. / Wu, Chung-Hsien; Wei, Wen Li; Lin, Jen Chun; Lee, Wei Yu.

In: IEEE Transactions on Multimedia, Vol. 15, No. 8, 6557471, 02.12.2013, p. 1732-1744.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Speaking effect removal on emotion recognition from facial expressions based on eigenface conversion

AU - Wu, Chung-Hsien

AU - Wei, Wen Li

AU - Lin, Jen Chun

AU - Lee, Wei Yu

PY - 2013/12/2

Y1 - 2013/12/2

N2 - Speaking effect is a crucial issue that may dramatically degrade performance in emotion recognition from facial expressions. To manage this problem, an eigenface conversion-based approach is proposed to remove speaking effect on facial expressions for improving accuracy of emotion recognition. In the proposed approach, a context-dependent linear conversion function modeled by a statistical Gaussian Mixture Model (GMM) is constructed with parallel data from speaking and non-speaking facial expressions with emotions. To model the speaking effect in more detail, the conversion functions are categorized using a decision tree considering the visual temporal context of the Articulatory Attribute (AA) classes of the corresponding input speech segments. For verification of the identified quadrant of emotional expression on the Arousal-Valence (A-V) emotion plane, which is commonly used to dimensionally define the emotion classes, from the reconstructed facial feature points, an expression template is constructed to represent the feature points of the non-speaking facial expressions for each quadrant. With the verified quadrant, a regression scheme is further employed to estimate the A-V values of the facial expression as a precise point in the A-V emotion plane. Experimental results show that the proposed method outperforms current approaches and demonstrates that removing the speaking effect on facial expression is useful for improving the performance of emotion recognition.

AB - Speaking effect is a crucial issue that may dramatically degrade performance in emotion recognition from facial expressions. To manage this problem, an eigenface conversion-based approach is proposed to remove speaking effect on facial expressions for improving accuracy of emotion recognition. In the proposed approach, a context-dependent linear conversion function modeled by a statistical Gaussian Mixture Model (GMM) is constructed with parallel data from speaking and non-speaking facial expressions with emotions. To model the speaking effect in more detail, the conversion functions are categorized using a decision tree considering the visual temporal context of the Articulatory Attribute (AA) classes of the corresponding input speech segments. For verification of the identified quadrant of emotional expression on the Arousal-Valence (A-V) emotion plane, which is commonly used to dimensionally define the emotion classes, from the reconstructed facial feature points, an expression template is constructed to represent the feature points of the non-speaking facial expressions for each quadrant. With the verified quadrant, a regression scheme is further employed to estimate the A-V values of the facial expression as a precise point in the A-V emotion plane. Experimental results show that the proposed method outperforms current approaches and demonstrates that removing the speaking effect on facial expression is useful for improving the performance of emotion recognition.

UR - http://www.scopus.com/inward/record.url?scp=84888365943&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84888365943&partnerID=8YFLogxK

U2 - 10.1109/TMM.2013.2272917

DO - 10.1109/TMM.2013.2272917

M3 - Article

VL - 15

SP - 1732

EP - 1744

JO - IEEE Transactions on Multimedia

JF - IEEE Transactions on Multimedia

SN - 1520-9210

IS - 8

M1 - 6557471

ER -