ISCA Archive Interspeech 2023
ISCA Archive Interspeech 2023

Integrated and Enhanced Pipeline System to Support Spoken Language Analytics for Screening Neurocognitive Disorders

Helen Meng, Brian Mak, Man-Wai Mak, Helene Fung, Xianmin Gong, Timothy Kwok, Xunying Liu, Vincent Mok, Patrick Wong, Jean Woo, Xixin Wu, Ka Ho Wong, Shensheng Xu, Naijun Zheng, Ranzo Huang, Jiawen Kang, Xiaoquan Ke, Junan Li, Jinchao Li, Yi Wang

This paper presents an enhanced pipeline system for automated screening of neurocognitive disorders, e.g. Alzheimer's Disease (AD), using spoken language technologies. To ensure local relevance, the pipeline is applied to two-way interactions between clinical assessors and older adult participants in spoken Cantonese, the predominant language used in Hong Kong. The pipeline includes: (i) Speaker diarization using speaker-turn-aware scoring to capture the temporal structure of conversations. (ii) ASR using XLS-R wav2vec 2.0 models further pre-trained on Cantonese speech data and fine-tuned. (iii) Language modelling using RoBERTa with further fine-tuning. (iv) AD screening with neural network classification. A reference benchmark is obtained using the ADReSS corpus where no diarization is needed, and the partial pipeline attained a competitive detection accuracy of 87.5%.