ISCA Archive Interspeech 2023

Speech Taskonomy: Which Speech Tasks are the most Predictive of fMRI Brain Activity?

Subba Reddy Oota, Veeral Agarwal, Mounika Marreddy, Manish Gupta, Raju Bapi

Self-supervised speech-based models have been found to be successful in predicting brain recordings of subjects listening to naturalistic stories. Inspired by the recent progress of deep learning models on various speech-processing tasks, existing literature has leveraged pretrained speech Transformer models for brain encoding. However, no prior work has explored the efficacy of task-specific finetuned Transformer representations for brain encoding. Hence, in this paper, we explore transfer learning from representations finetuned for eight different tasks from the Speech processing Universal PERformance Benchmark (SUPERB) for predicting brain responses. Encoding models based on task features are used to predict activity in different regions across the whole brain, and also in language and auditory brain regions. Our experiments on finetuning the Wav2Vec2.0 model for these eight tasks show that the model finetuned on automatic speech recognition (ASR) yields the best encoding performance for the whole brain, language and auditory regions.
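
As a rough illustration of what a brain encoding model of this kind can look like, the sketch below fits a linear map from stimulus features to voxel responses and scores it by per-voxel Pearson correlation on held-out data. The abstract does not specify the regression method or the evaluation metric, so ridge regression and Pearson correlation are assumptions here, as are the function names `fit_ridge` and `voxelwise_pearson`, the regularization strength, and all array sizes. The features and responses are synthetic stand-ins, not Wav2Vec2.0 representations or fMRI data.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y.

    X: (n_samples, n_features) stimulus features (assumed layout).
    Y: (n_samples, n_voxels) brain responses (assumed layout).
    """
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def voxelwise_pearson(Y_true, Y_pred):
    """Pearson correlation between true and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom

# Synthetic stand-ins for task features and voxel responses (hypothetical sizes).
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 16, 5
W_true = rng.normal(size=(n_features, n_voxels))
X_train = rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_features))
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(n_train, n_voxels))
Y_test = X_test @ W_true + 0.1 * rng.normal(size=(n_test, n_voxels))

# Fit on the training split, then score predictions on the held-out split.
W = fit_ridge(X_train, Y_train, alpha=1.0)
scores = voxelwise_pearson(Y_test, X_test @ W)
print(scores.shape)
```

Under this setup, comparing tasks would amount to repeating the fit with features taken from each finetuned model and comparing the resulting per-voxel scores within a region of interest.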