TY - GEN
T1 - Mood disorder identification using deep bottleneck features of elicited speech
AU - Huang, Kun Yi
AU - Wu, Chung Hsien
AU - Su, Ming Hsiang
AU - Chou, Chia Hui
PY - 2018/2/5
Y1 - 2018/2/5
N2 - In the diagnosis of mental health disorder, a large portion of the Bipolar Disorder (BD) patients is likely to be misdiagnosed as Unipolar Depression (UD) on initial presentation. As speech is the most natural way to express emotion, this work focuses on tracking emotion profile of elicited speech for short-term mood disorder identification. In this work, the Deep Scattering Spectrum (DSS) and Low Level Descriptors (LLDs) of the elicited speech signals are extracted as the speech features. The hierarchical spectral clustering (HSC) algorithm is employed to adapt the emotion database to the mood disorder database to alleviate the data bias problem. The denoising autoencoder is then used to extract the bottleneck features of DSS and LLDs for better representation. Based on the bottleneck features, a long short term memory (LSTM) is applied to generate the time-varying emotion profile sequence. Finally, given the emotion profile sequence, the HMM-based identification and verification model is used to determine mood disorder. This work collected the elicited emotional speech data from 15 BDs, 15 UDs and 15 healthy controls for system training and evaluation. Five-fold cross validation was employed for evaluation. Experimental results show that the system using the bottleneck feature achieved an identification accuracy of 73.33%, improving by 8.89%, compared to that without bottleneck features. Furthermore, the system with verification mechanism, improving by 4.44%, outperformed that without verification.
AB - In the diagnosis of mental health disorder, a large portion of the Bipolar Disorder (BD) patients is likely to be misdiagnosed as Unipolar Depression (UD) on initial presentation. As speech is the most natural way to express emotion, this work focuses on tracking emotion profile of elicited speech for short-term mood disorder identification. In this work, the Deep Scattering Spectrum (DSS) and Low Level Descriptors (LLDs) of the elicited speech signals are extracted as the speech features. The hierarchical spectral clustering (HSC) algorithm is employed to adapt the emotion database to the mood disorder database to alleviate the data bias problem. The denoising autoencoder is then used to extract the bottleneck features of DSS and LLDs for better representation. Based on the bottleneck features, a long short term memory (LSTM) is applied to generate the time-varying emotion profile sequence. Finally, given the emotion profile sequence, the HMM-based identification and verification model is used to determine mood disorder. This work collected the elicited emotional speech data from 15 BDs, 15 UDs and 15 healthy controls for system training and evaluation. Five-fold cross validation was employed for evaluation. Experimental results show that the system using the bottleneck feature achieved an identification accuracy of 73.33%, improving by 8.89%, compared to that without bottleneck features. Furthermore, the system with verification mechanism, improving by 4.44%, outperformed that without verification.
UR - http://www.scopus.com/inward/record.url?scp=85050451699&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050451699&partnerID=8YFLogxK
U2 - 10.1109/APSIPA.2017.8282296
DO - 10.1109/APSIPA.2017.8282296
M3 - Conference contribution
AN - SCOPUS:85050451699
T3 - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
SP - 1648
EP - 1652
BT - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
Y2 - 12 December 2017 through 15 December 2017
ER -