Cell-Coupled Long Short-Term Memory with L-Skip Fusion Mechanism for Mood Disorder Detection through Elicited Audiovisual Features

Ming Hsiang Su, Chung Hsien Wu, Kun Yi Huang, Tsung Hsien Yang

Research output: Contribution to journal › Article

Abstract

In the early stages, patients with bipolar disorder are often misdiagnosed as having unipolar depression. Because long-term monitoring is limited by the delayed detection of mood disorder, an accurate one-time diagnosis is desirable to avoid delays in appropriate treatment due to misdiagnosis. In this paper, an elicitation-based approach is proposed for realizing a one-time diagnosis by using responses elicited from patients by having them watch six emotion-eliciting videos. After each video clip, the conversation between the participant and the clinician conducting the interview, including the patient's facial expressions and speech responses, was recorded. Next, the hierarchical spectral clustering algorithm was employed to adapt the facial expression and speech response features by using the extended Cohn-Kanade and eNTERFACE databases. A denoising autoencoder was further applied to extract the bottleneck features of the adapted data. Then, the facial and speech bottleneck features were input into support vector machines to obtain speech emotion profiles (EPs) and the modulation spectrum (MS) of the facial action unit sequence for each elicited response. Finally, a cell-coupled long short-term memory (LSTM) network with an L-skip fusion mechanism was proposed to model the temporal information of all elicited responses and to loosely fuse the EPs and the MS for mood disorder detection. The experimental results revealed that the cell-coupled LSTM with the L-skip fusion mechanism has promising advantages and efficacy for mood disorder detection.
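
The abstract does not say how the modulation spectrum (MS) of an action unit sequence is computed. As a minimal sketch only, and assuming the common reading that the MS is the magnitude of the discrete Fourier transform of a mean-removed temporal trajectory, the idea could look like the following. The function name `modulation_spectrum` and the sample trajectory are hypothetical and are not taken from the paper.

```python
import cmath
import math


def modulation_spectrum(sequence):
    """Return the DFT magnitude of one mean-removed temporal trajectory.

    Assumption: this is a generic definition of a modulation spectrum,
    not necessarily the one used by the authors.
    """
    n = len(sequence)
    if n == 0:
        return []
    mean = sum(sequence) / n
    centered = [x - mean for x in sequence]
    spectrum = []
    # The input is real, so only frequencies 0 .. n//2 carry information.
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff) / n)
    return spectrum


# Hypothetical intensity trajectory of one action unit:
# a single sine cycle sampled over 8 frames.
au_trajectory = [math.sin(2 * math.pi * t / 8) for t in range(8)]
ms = modulation_spectrum(au_trajectory)
```

In this sketch each action unit would get its own spectrum, and the per-unit spectra could then be concatenated into one feature vector per elicited response; that concatenation step is likewise an assumption.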

Original language: English
Article number: 8668691
Pages (from-to): 124-135
Number of pages: 12
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 31
Issue number: 1
DOI: 10.1109/TNNLS.2019.2899884
Publication status: Published - Jan 2020

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

@article{b2a8c584368f4f689bf5adb68effc0c7,
title = "Cell-Coupled Long Short-Term Memory with L-Skip Fusion Mechanism for Mood Disorder Detection through Elicited Audiovisual Features",
author = "Su, {Ming Hsiang} and Wu, {Chung Hsien} and Huang, {Kun Yi} and Yang, {Tsung Hsien}",
year = "2020",
month = "1",
doi = "10.1109/TNNLS.2019.2899884",
language = "English",
volume = "31",
pages = "124--135",
journal = "IEEE Transactions on Neural Networks and Learning Systems",
issn = "2162-237X",
publisher = "IEEE Computational Intelligence Society",
number = "1",
}

TY - JOUR

T1 - Cell-Coupled Long Short-Term Memory with L-Skip Fusion Mechanism for Mood Disorder Detection through Elicited Audiovisual Features

AU - Su, Ming Hsiang

AU - Wu, Chung Hsien

AU - Huang, Kun Yi

AU - Yang, Tsung Hsien

PY - 2020/1

Y1 - 2020/1

UR - http://www.scopus.com/inward/record.url?scp=85077667606&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85077667606&partnerID=8YFLogxK

U2 - 10.1109/TNNLS.2019.2899884

DO - 10.1109/TNNLS.2019.2899884

M3 - Article

C2 - 30892247

AN - SCOPUS:85077667606

VL - 31

SP - 124

EP - 135

JO - IEEE Transactions on Neural Networks and Learning Systems

JF - IEEE Transactions on Neural Networks and Learning Systems

SN - 2162-237X

IS - 1

M1 - 8668691

ER -