ISCA Archive Interspeech 2023

An Automatic Multimodal Approach to Analyze Linguistic and Acoustic Cues on Parkinson's Disease Patients

Daniel Escobar-Grisales, Tomás Arias-Vergara, Cristian David Ríos-Urrego, Elmar Nöth, Adolfo M. García, Juan Rafael Orozco-Arroyave

Early detection and monitoring of Parkinson's disease are crucial for properly treating and managing its symptoms. Automatic speech and language analysis has emerged as a promising non-invasive method to monitor a patient's state. This study analyzes different speech and language representations for automatic classification of Parkinson's disease patients versus healthy controls. First, each modality is analyzed independently: general-purpose representations such as Wav2vec or BETO are used together with representations designed to model disease traits, such as phonemic identifiability in the speech modality and grammatical unit analysis in the language modality. The best speech and language representations are then combined using a fusion strategy based on Gated Multimodal Units. The multimodal approach achieves the best results, outperforming all results obtained with unimodal representations and with the traditional fusion strategy.
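
As an illustration of the fusion step, the following is a minimal sketch of a bimodal Gated Multimodal Unit in its commonly cited form: each modality is projected through a tanh layer, and a sigmoid gate computed from the concatenated inputs weights the two projections. It is not the authors' implementation. The class name `GatedMultimodalUnit`, the dimensions, the random weight initialization, and the omission of bias terms and training are all assumptions made for this example.

```python
import math
import random


def _matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def _sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


class GatedMultimodalUnit:
    """Hypothetical bimodal GMU: h = z * h_speech + (1 - z) * h_text."""

    def __init__(self, dim_speech, dim_text, dim_hidden, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            # Small random weights; a real model would learn these.
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.W_s = init(dim_hidden, dim_speech)             # speech projection
        self.W_t = init(dim_hidden, dim_text)               # text projection
        self.W_z = init(dim_hidden, dim_speech + dim_text)  # gate

    def forward(self, x_speech, x_text):
        # Modality-specific hidden representations.
        h_s = [math.tanh(a) for a in _matvec(self.W_s, x_speech)]
        h_t = [math.tanh(a) for a in _matvec(self.W_t, x_text)]
        # Gate computed from the concatenation of both inputs.
        z = [_sigmoid(a)
             for a in _matvec(self.W_z, list(x_speech) + list(x_text))]
        # Per-dimension convex combination of the two modalities.
        h = [zi * si + (1.0 - zi) * ti for zi, si, ti in zip(z, h_s, h_t)]
        return h, z


# Usage with toy vectors standing in for speech and language embeddings.
gmu = GatedMultimodalUnit(dim_speech=4, dim_text=3, dim_hidden=2)
fused, gate = gmu.forward([0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7])
```

In this sketch the gate values indicate, per hidden dimension, how much the fused vector relies on the speech projection versus the text projection; the fused vector would then feed a classifier.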