Coupled HMM-based multimodal fusion for mood disorder detection through elicited audio–visual signals

Tsung Hsien Yang, Chung-Hsien Wu, Kun Yi Huang, Ming Hsiang Su

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Mood disorders encompass a wide array of mood disturbances, including unipolar depression (UD) and bipolar disorder (BD). In the diagnostic evaluation of outpatients with mood disorders, a high percentage of BD patients are initially misdiagnosed as having UD. Accurately distinguishing BD from UD is crucial for a correct and early diagnosis, leading to improvements in treatment and course of illness. In this study, emotional videos are first used to elicit the patients’ emotions. After watching each video clip, the patients’ facial expressions and speech responses are collected during an interview with a clinician. For mood disorder detection, facial action unit (AU) profiles and speech emotion profiles (EPs) are obtained using support vector machines (SVMs) built on facial and speech features adapted from two selected databases with a denoising autoencoder-based method. Finally, a Coupled Hidden Markov Model (CHMM)-based fusion method is proposed to characterize the temporal information; the CHMM is modified to fuse the AU and EP sequences with respect to six emotional videos. Experimental results demonstrate the advantage and efficacy of the CHMM-based fusion approach for mood disorder detection.
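The core contribution is the CHMM fusion stage. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' implementation: two Markov chains, one per modality, whose transitions are conditioned on both chains' previous states, with diagonal-Gaussian emissions over the SVM-derived profile vectors; one model per diagnostic class scores a session, and the class with the highest log-likelihood wins. The class name CoupledHMM, the state counts, the profile dimensionalities, and all parameter values are illustrative assumptions.

import numpy as np

# Hypothetical sketch of CHMM-based audio-visual fusion, loosely following
# the abstract. Parameters here are randomly initialized; in practice each
# class model would be trained (e.g. via EM) on that class's sessions.
class CoupledHMM:
    """Two coupled chains (0 = audio EPs, 1 = visual AUs); each chain's
    next state is conditioned on the previous states of BOTH chains."""

    def __init__(self, n_states, dim_audio, dim_video, seed=0):
        rng = np.random.default_rng(seed)
        n = self.n = n_states
        self.pi = rng.dirichlet(np.ones(n), size=2)          # (2, n) initial dists
        # A[c, i, j, k] = P(chain c moves to k | audio was i, video was j)
        self.A = rng.dirichlet(np.ones(n), size=(2, n, n))   # (2, n, n, n)
        # Toy diagonal-Gaussian emission parameters per chain and state.
        self.mu = [rng.normal(size=(n, dim_audio)), rng.normal(size=(n, dim_video))]
        self.var = [np.ones((n, dim_audio)), np.ones((n, dim_video))]

    def _log_emit(self, chain, obs):
        # log N(obs_t | mu_k, diag(var_k)) for every frame t and state k -> (T, n)
        mu, var = self.mu[chain], self.var[chain]
        diff = obs[:, None, :] - mu[None, :, :]
        return -0.5 * (np.log(2 * np.pi * var) + diff ** 2 / var).sum(-1)

    def log_likelihood(self, eps, aus):
        # Forward algorithm over the joint state (i, j), entirely in log space.
        la, lv = self._log_emit(0, eps), self._log_emit(1, aus)
        # trans[i_prev, j_prev, i, j] = log P(audio -> i) + log P(video -> j)
        trans = np.log(self.A[0])[:, :, :, None] + np.log(self.A[1])[:, :, None, :]
        log_alpha = (np.log(self.pi[0])[:, None] + np.log(self.pi[1])[None, :]
                     + la[0][:, None] + lv[0][None, :])
        for t in range(1, len(eps)):
            scores = log_alpha[:, :, None, None] + trans
            m = scores.max(axis=(0, 1))                      # stabilize log-sum-exp
            log_alpha = (m + np.log(np.exp(scores - m).sum(axis=(0, 1)))
                         + la[t][:, None] + lv[t][None, :])
        return np.logaddexp.reduce(log_alpha.ravel())

# Toy usage: score one elicited session against untrained per-class models.
rng = np.random.default_rng(1)
eps = rng.random((40, 7))    # 40 frames of 7-dim speech emotion profiles
aus = rng.random((40, 12))   # 40 frames of 12-dim facial AU profiles
models = {c: CoupledHMM(3, 7, 12, seed=s) for s, c in enumerate(["BD", "UD"])}
print(max(models, key=lambda c: models[c].log_likelihood(eps, aus)))

Coupling the transitions (rather than concatenating the AU and EP vectors into one observation) lets each modality keep its own state dynamics while still letting, say, a vocal-emotion shift influence the facial-expression chain at the next step, which is the motivation for CHMM fusion over early feature-level fusion.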

Original language: English
Pages (from-to): 895-906
Number of pages: 12
Journal: Journal of Ambient Intelligence and Humanized Computing
Volume: 8
Issue number: 6
DOI: 10.1007/s12652-016-0395-y
Publication status: Published - 2017 Nov 1

Fingerprint

Hidden Markov models
Support vector machines

All Science Journal Classification (ASJC) codes

  • Computer Science (all)

Cite this

@article{8daaa46bb3dc498f847214d24507e951,
title = "Coupled HMM-based multimodal fusion for mood disorder detection through elicited audio–visual signals",
author = "Yang, {Tsung Hsien} and Chung-Hsien Wu and Huang, {Kun Yi} and Su, {Ming Hsiang}",
year = "2017",
month = "11",
day = "1",
doi = "10.1007/s12652-016-0395-y",
language = "English",
volume = "8",
pages = "895--906",
journal = "Journal of Ambient Intelligence and Humanized Computing",
issn = "1868-5137",
publisher = "Springer Verlag",
number = "6",

}
