TY - GEN
T1 - Detection of mood disorder using speech emotion profiles and LSTM
AU - Yang, Tsung Hsien
AU - Wu, Chung Hsien
AU - Huang, Kun Yi
AU - Su, Ming Hsiang
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2017/5/2
Y1 - 2017/5/2
N2 - In mood disorder diagnosis, patients with bipolar disorder (BD) are often misdiagnosed as having unipolar depression (UD) on initial presentation. An accurate distinction between BD and UD is crucial for a correct and early diagnosis, which improves treatment and the course of illness. To address this misdiagnosis problem, this study elicited subjects' emotions by having them watch six emotion-eliciting video clips. After each clip, the subjects' speech responses were collected during an interview with a clinician. In mood disorder detection, speech emotions play an important role in detecting manic or depressive symptoms. Therefore, speech emotion profiles (EPs) are obtained using support vector machines (SVMs) built on speech features adapted from selected databases with a denoising autoencoder-based method. Finally, a Long Short-Term Memory (LSTM) recurrent neural network is employed to characterize the temporal information of the EPs with respect to the six emotional videos. Comparative experiments show the advantage and efficacy of the LSTM-based approach for mood disorder detection.
AB - In mood disorder diagnosis, patients with bipolar disorder (BD) are often misdiagnosed as having unipolar depression (UD) on initial presentation. An accurate distinction between BD and UD is crucial for a correct and early diagnosis, which improves treatment and the course of illness. To address this misdiagnosis problem, this study elicited subjects' emotions by having them watch six emotion-eliciting video clips. After each clip, the subjects' speech responses were collected during an interview with a clinician. In mood disorder detection, speech emotions play an important role in detecting manic or depressive symptoms. Therefore, speech emotion profiles (EPs) are obtained using support vector machines (SVMs) built on speech features adapted from selected databases with a denoising autoencoder-based method. Finally, a Long Short-Term Memory (LSTM) recurrent neural network is employed to characterize the temporal information of the EPs with respect to the six emotional videos. Comparative experiments show the advantage and efficacy of the LSTM-based approach for mood disorder detection.
UR - http://www.scopus.com/inward/record.url?scp=85020206734&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85020206734&partnerID=8YFLogxK
U2 - 10.1109/ISCSLP.2016.7918439
DO - 10.1109/ISCSLP.2016.7918439
M3 - Conference contribution
AN - SCOPUS:85020206734
T3 - Proceedings of 2016 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016
BT - Proceedings of 2016 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016
A2 - Wang, Hsin-Min
A2 - Hou, Qingzhi
A2 - Wei, Yuan
A2 - Lee, Tan
A2 - Wei, Jianguo
A2 - Xie, Lei
A2 - Feng, Hui
A2 - Dang, Jianwu
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016
Y2 - 17 October 2016 through 20 October 2016
ER -