Mood disorder identification using deep bottleneck features of elicited speech

Kun Yi Huang, Chung Hsien Wu, Ming Hsiang Su, Chia Hui Chou

Research output: Conference contribution

1 Citation (Scopus)

Abstract

In the diagnosis of mental health disorders, a large portion of Bipolar Disorder (BD) patients are likely to be misdiagnosed with Unipolar Depression (UD) on initial presentation. As speech is the most natural way to express emotion, this work focuses on tracking the emotion profile of elicited speech for short-term mood disorder identification. The Deep Scattering Spectrum (DSS) and Low Level Descriptors (LLDs) of the elicited speech signals are extracted as the speech features. The hierarchical spectral clustering (HSC) algorithm is employed to adapt the emotion database to the mood disorder database to alleviate the data bias problem. A denoising autoencoder is then used to extract bottleneck features of the DSS and LLDs for better representation. Based on the bottleneck features, a long short-term memory (LSTM) network is applied to generate the time-varying emotion profile sequence. Finally, given the emotion profile sequence, an HMM-based identification and verification model is used to determine the mood disorder. Elicited emotional speech data were collected from 15 BD patients, 15 UD patients, and 15 healthy controls for system training and evaluation, and five-fold cross validation was employed. Experimental results show that the system using bottleneck features achieved an identification accuracy of 73.33%, an improvement of 8.89% over the system without bottleneck features. Furthermore, the system with the verification mechanism outperformed the one without verification by 4.44%.
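The evaluation setup in the abstract (45 subjects, five-fold cross validation) can be illustrated with a minimal sketch. It assumes the folds are split at the subject level and stratified by class, which the abstract does not state. The subject IDs and the helper names `stratified_folds` and `accuracy_pct` are hypothetical. The last lines show one reading of the reported figures: 73.33% matches 33 of 45 subjects classified correctly, and the 8.89% and 4.44% gains match 4 and 2 subjects respectively if the gains are absolute differences.

```python
from collections import Counter

# Hypothetical subject IDs: 15 bipolar (BD), 15 unipolar (UD), 15 healthy controls (HC).
subjects = [(f"{label}{i:02d}", label) for label in ("BD", "UD", "HC") for i in range(15)]

def stratified_folds(items, k=5):
    """Deal each class's subjects round-robin into k folds (assumed scheme)."""
    folds = [[] for _ in range(k)]
    by_label = {}
    for subject_id, label in items:
        by_label.setdefault(label, []).append((subject_id, label))
    for members in by_label.values():
        for idx, item in enumerate(members):
            folds[idx % k].append(item)
    return folds

def accuracy_pct(correct, total=45):
    """Percentage of subjects classified correctly, rounded to two decimals."""
    return round(100.0 * correct / total, 2)

folds = stratified_folds(subjects)
fold_class_counts = [Counter(label for _, label in fold) for fold in folds]

for number, counts in enumerate(fold_class_counts, start=1):
    print(f"fold {number}: {dict(counts)}")

# One interpretation of the reported numbers, not confirmed by the abstract.
print(accuracy_pct(33))  # accuracy with bottleneck features
print(accuracy_pct(4))   # gain over the system without bottleneck features
print(accuracy_pct(2))   # gain from the verification mechanism
```

Under this scheme each fold holds out 9 subjects (3 per class) and trains on the remaining 36, so every subject is tested exactly once.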

Original language: English
Title of host publication: Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1648-1652
Number of pages: 5
Volume: 2018-February
ISBN (electronic): 9781538615423
DOI: https://doi.org/10.1109/APSIPA.2017.8282296
Publication status: Published - 5 Feb 2018
Event: 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017 - Kuala Lumpur, Malaysia
Duration: 12 Dec 2017 to 15 Dec 2017

Other

Other: 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
Country: Malaysia
City: Kuala Lumpur
Period: 12 Dec 2017 to 15 Dec 2017

Fingerprint

Scattering
Clustering algorithms
Identification (control systems)
Health
Long short-term memory

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Human-Computer Interaction
  • Information Systems
  • Signal Processing

Cite this

Huang, K. Y., Wu, C. H., Su, M. H., & Chou, C. H. (2018). Mood disorder identification using deep bottleneck features of elicited speech. In Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017 (Vol. 2018-February, pp. 1648-1652). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/APSIPA.2017.8282296
Huang, Kun Yi ; Wu, Chung Hsien ; Su, Ming Hsiang ; Chou, Chia Hui. / Mood disorder identification using deep bottleneck features of elicited speech. Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017. Vol. 2018-February Institute of Electrical and Electronics Engineers Inc., 2018. pp. 1648-1652
@inproceedings{127d8d9892374636bb7a4eb171aab317,
title = "Mood disorder identification using deep bottleneck features of elicited speech",
abstract = "In the diagnosis of mental health disorder, a large portion of the Bipolar Disorder (BD) patients is likely to be misdiagnosed as Unipolar Depression (UD) on initial presentation. As speech is the most natural way to express emotion, this work focuses on tracking emotion profile of elicited speech for short-term mood disorder identification. In this work, the Deep Scattering Spectrum (DSS) and Low Level Descriptors (LLDs) of the elicited speech signals are extracted as the speech features. The hierarchical spectral clustering (HSC) algorithm is employed to adapt the emotion database to the mood disorder database to alleviate the data bias problem. The denoising autoencoder is then used to extract the bottleneck features of DSS and LLDs for better representation. Based on the bottleneck features, a long short term memory (LSTM) is applied to generate the time-varying emotion profile sequence. Finally, given the emotion profile sequence, the HMM-based identification and verification model is used to determine mood disorder. This work collected the elicited emotional speech data from 15 BDs, 15 UDs and 15 healthy controls for system training and evaluation. Five-fold cross validation was employed for evaluation. Experimental results show that the system using the bottleneck feature achieved an identification accuracy of 73.33{\%}, improving by 8.89{\%}, compared to that without bottleneck features. Furthermore, the system with verification mechanism, improving by 4.44{\%}, outperformed that without verification.",
author = "Huang, {Kun Yi} and Wu, {Chung Hsien} and Su, {Ming Hsiang} and Chou, {Chia Hui}",
year = "2018",
month = "2",
day = "5",
doi = "10.1109/APSIPA.2017.8282296",
language = "English",
volume = "2018-February",
pages = "1648--1652",
booktitle = "Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

Huang, KY, Wu, CH, Su, MH & Chou, CH 2018, Mood disorder identification using deep bottleneck features of elicited speech. in Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017. vol. 2018-February, Institute of Electrical and Electronics Engineers Inc., pp. 1648-1652, 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017, Kuala Lumpur, Malaysia, 12 Dec 2017. https://doi.org/10.1109/APSIPA.2017.8282296

Mood disorder identification using deep bottleneck features of elicited speech. / Huang, Kun Yi; Wu, Chung Hsien; Su, Ming Hsiang; Chou, Chia Hui.

Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017. Vol. 2018-February Institute of Electrical and Electronics Engineers Inc., 2018. p. 1648-1652.

TY - GEN

T1 - Mood disorder identification using deep bottleneck features of elicited speech

AU - Huang, Kun Yi

AU - Wu, Chung Hsien

AU - Su, Ming Hsiang

AU - Chou, Chia Hui

PY - 2018/2/5

Y1 - 2018/2/5

AB - In the diagnosis of mental health disorder, a large portion of the Bipolar Disorder (BD) patients is likely to be misdiagnosed as Unipolar Depression (UD) on initial presentation. As speech is the most natural way to express emotion, this work focuses on tracking emotion profile of elicited speech for short-term mood disorder identification. In this work, the Deep Scattering Spectrum (DSS) and Low Level Descriptors (LLDs) of the elicited speech signals are extracted as the speech features. The hierarchical spectral clustering (HSC) algorithm is employed to adapt the emotion database to the mood disorder database to alleviate the data bias problem. The denoising autoencoder is then used to extract the bottleneck features of DSS and LLDs for better representation. Based on the bottleneck features, a long short term memory (LSTM) is applied to generate the time-varying emotion profile sequence. Finally, given the emotion profile sequence, the HMM-based identification and verification model is used to determine mood disorder. This work collected the elicited emotional speech data from 15 BDs, 15 UDs and 15 healthy controls for system training and evaluation. Five-fold cross validation was employed for evaluation. Experimental results show that the system using the bottleneck feature achieved an identification accuracy of 73.33%, improving by 8.89%, compared to that without bottleneck features. Furthermore, the system with verification mechanism, improving by 4.44%, outperformed that without verification.

UR - http://www.scopus.com/inward/record.url?scp=85050451699&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85050451699&partnerID=8YFLogxK

U2 - 10.1109/APSIPA.2017.8282296

DO - 10.1109/APSIPA.2017.8282296

M3 - Conference contribution

AN - SCOPUS:85050451699

VL - 2018-February

SP - 1648

EP - 1652

BT - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Huang KY, Wu CH, Su MH, Chou CH. Mood disorder identification using deep bottleneck features of elicited speech. In Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017. Vol. 2018-February. Institute of Electrical and Electronics Engineers Inc. 2018. p. 1648-1652 https://doi.org/10.1109/APSIPA.2017.8282296