ISCA Archive Interspeech 2025

Anne Rowling Neurological Speech Corpus: clinically annotated longitudinal dataset for developing speech biomarkers in neurodegenerative disorders

Johnny Tam, Christine Weaver, Oliver Watts, Siddharthan Chandran, Suvankar Pal, Rowling Speech Consortium

There is an urgent need for scalable, non-invasive and quantifiable biomarkers in neurodegenerative disorders. Speech is an attractive candidate, with potential for remote, low-cost assessment. Progress is limited by a lack of high-quality, clinically annotated speech data. We present a longitudinal speech corpus including speakers with dementia, motor neuron disease, Parkinson’s disease and progressive multiple sclerosis, as well as healthy individuals. Participants complete standardised recordings on an app co-produced with patients, aligned to contemporaneous phenotyping (clinical rating scales, cognitive tests and blood-based biomarkers). In total, 780 participants have provided 5169 recordings across 1033 assessments. Benchmark classification and regression models show promising performance, and predictions on non-speech segments demonstrate limited bias from recording conditions. We continue to scale up data collection and analysis across larger, more diverse populations to accelerate clinical translation.