Mood disorders, including unipolar depression (UD) and bipolar disorder (BD), are among the most common mental health disorders. Because BD lacks diagnostic markers, it can be misdiagnosed as UD at initial presentation. Short-term detection, which could support early identification and intervention, is therefore desirable. This study proposed an approach for short-term detection of mood disorders based on elicited speech responses. Speech responses were collected in clinician-led interviews conducted after participants viewed six emotion-eliciting videos. A domain adaptation method based on a hierarchical spectral clustering algorithm was proposed to adapt a labeled emotion database to the collected unlabeled mood database, alleviating the data bias problem in the emotion space. To model the local variation of emotions within each response, a convolutional neural network (CNN) with an attention mechanism was used to generate an emotion profile (EP) for each elicited speech response. A long short-term memory (LSTM) network was then employed to characterize the temporal evolution of the EPs across all six speech responses. Moreover, an attention model was applied to the LSTM network to highlight the pertinent speech responses, rather than treating all responses equally, and thereby improve detection performance. For evaluation, this study elicited emotional speech data from 15 people with BD, 15 people with UD, and 15 healthy controls. The proposed method was evaluated on the compiled database using leave-one-group-out cross-validation. The CNN- and LSTM-based attention models improved the mood disorder detection accuracy of the proposed method by approximately 11%. Furthermore, the proposed method achieved an overall detection accuracy of 75.56%, outperforming support-vector-machine-based (62.22%) and CNN-based (66.67%) methods.
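The idea of weighting the six responses rather than treating them equally can be illustrated with a minimal sketch. It assumes dot-product scoring against a hypothetical learned query vector followed by a softmax; the abstract does not specify the scoring function, and the toy vectors below merely stand in for per-response LSTM hidden states.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each response vector by its relevance and sum them.

    hidden_states: one vector per elicited response (toy stand-ins here).
    query: hypothetical learned vector; dot-product scoring is an assumption.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, pooled


# Six toy vectors, one per emotion-eliciting video (illustrative values only).
hidden_states = [
    [0.1, 0.2], [0.9, 0.1], [0.3, 0.3],
    [0.2, 0.8], [0.5, 0.5], [0.0, 0.1],
]
query = [1.0, 0.0]

weights, pooled = attention_pool(hidden_states, query)
uniform_weights, _ = attention_pool(hidden_states, [0.0, 0.0])
```

With an all-zero query every response receives the same weight, which corresponds to the equal-treatment baseline that the attention model is meant to improve on.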
All Science Journal Classification (ASJC) codes
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence