TY - JOUR
T1 - Unipolar depression vs. bipolar disorder
T2 - 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016
AU - Huang, Kun Yi
AU - Wu, Chung Hsien
AU - Kuo, Yu Ting
AU - Jang, Fong Lin
N1 - Publisher Copyright: Copyright © 2016 ISCA.
PY - 2016
Y1 - 2016
N2 - Mood disorders include unipolar depression (UD) and bipolar disorder (BD). In this work, an elicitation-based approach to short-term detection of mood disorder based on elicited speech responses is proposed. First, a long short-term memory (LSTM)-based classifier was constructed to generate the emotion likelihood for each segment in the elicited speech responses. The emotion likelihoods were then clustered into emotion codewords using the K-means algorithm. Latent semantic analysis (LSA) was then adopted to model the latent relationship between the emotion codewords and the elicited responses. The structural relationships among the emotion codewords in the LSA-based matrix were employed to construct a latent affective structure model (LASM) for characterizing each mood. For mood disorder detection, the similarity between the input speech LASM and each of the mood-specific LASMs was estimated. Finally, the mood whose LASM is most similar to the input speech LASM is regarded as the detected mood. Experimental results show that the proposed LASM-based method achieved a detection accuracy of 73.3%, an improvement of 13.3% over the commonly used SVM-based classifiers.
AB - Mood disorders include unipolar depression (UD) and bipolar disorder (BD). In this work, an elicitation-based approach to short-term detection of mood disorder based on elicited speech responses is proposed. First, a long short-term memory (LSTM)-based classifier was constructed to generate the emotion likelihood for each segment in the elicited speech responses. The emotion likelihoods were then clustered into emotion codewords using the K-means algorithm. Latent semantic analysis (LSA) was then adopted to model the latent relationship between the emotion codewords and the elicited responses. The structural relationships among the emotion codewords in the LSA-based matrix were employed to construct a latent affective structure model (LASM) for characterizing each mood. For mood disorder detection, the similarity between the input speech LASM and each of the mood-specific LASMs was estimated. Finally, the mood whose LASM is most similar to the input speech LASM is regarded as the detected mood. Experimental results show that the proposed LASM-based method achieved a detection accuracy of 73.3%, an improvement of 13.3% over the commonly used SVM-based classifiers.
UR - http://www.scopus.com/inward/record.url?scp=84994227077&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994227077&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2016-620
DO - 10.21437/Interspeech.2016-620
M3 - Conference article
AN - SCOPUS:84994227077
SN - 2308-457X
VL - 08-12-September-2016
SP - 1452
EP - 1456
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Y2 - 8 September 2016 through 12 September 2016
ER -