ISCA Archive Interspeech 2022

ECAPA-TDNN Based Depression Detection from Clinical Speech

Dong Wang, Yanhui Ding, Qing Zhao, Peilin Yang, Shuping Tan, Ya Li

Depression is a serious mood disorder and one of the major diseases endangering mental health. Automatic detection of depression from speech signals has become a promising approach to early diagnosis. However, a performance gap remains between research and clinical practice, because most current studies rely on lab-recorded corpora. We therefore collected a Chinese clinical depression corpus; this study uses the speech of 131 participants recorded during Hamilton Rating Scale for Depression (HAMD) interviews. We further developed a speech-based depression detection system built on a Time-Delay Neural Network (TDNN) model. In five-fold cross-validation, our approach achieves a mean F1 score of 90.8% and a mean accuracy of 90.4%. These results suggest that the developed TDNN-based model has potential clinical value in the diagnosis of depression.
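As an illustration of how a mean F1 score and accuracy over five folds can be computed, the following is a minimal sketch and not the authors' evaluation code. The abstract does not state how the folds were formed or which F1 variant was used, so the sketch assumes one label per participant, a simple round-robin split at the participant level, and binary F1 with the depressed class as the positive label. The names `f1_and_accuracy`, `five_fold_indices`, `cross_validate`, and the `train_and_predict` callback are hypothetical.

```python
from statistics import mean


def f1_and_accuracy(y_true, y_pred, positive=1):
    """Binary F1 (for the `positive` class) and accuracy for one fold."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
    return f1, correct / len(pairs)


def five_fold_indices(n_items, k=5):
    """Assumed split: round-robin assignment of participant indices to k folds."""
    return [list(range(i, n_items, k)) for i in range(k)]


def cross_validate(labels, train_and_predict, k=5):
    """Average per-fold F1 and accuracy.

    `train_and_predict(train_idx, test_idx)` is a hypothetical callback that
    trains on the training participants and returns one predicted label per
    test participant, in the order of `test_idx`.
    """
    scores = []
    for test_idx in five_fold_indices(len(labels), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        preds = train_and_predict(train_idx, test_idx)
        truth = [labels[i] for i in test_idx]
        scores.append(f1_and_accuracy(truth, preds))
    return mean(s[0] for s in scores), mean(s[1] for s in scores)
```

Splitting by participant rather than by utterance keeps all speech from one speaker in a single fold, which avoids leaking speaker identity between training and test data. Whether the study split the data this way is an assumption here.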