ISCA Archive Interspeech 2023

Similar Hierarchical Representation of Speech and Other Complex Sounds in the Brain and Deep Residual Networks: An MEG Study

Tzu-Han Zoe Cheng, Kuan-Lin Chen, Juliane Schubert, Ya-Ping Chen, Tim Brown, John Iversen

Listeners recognize a vast number of complex sounds, but vocal sounds, namely speech and song, are essential for communication. Recently, deep neural networks (DNNs) have achieved human-level accuracy in sound classification, but do they share properties with biological brains? In this study, we compared DNNs to primary and secondary auditory cortex to understand the hierarchy of sound representations in the brain. Ten subjects listened to speech and other naturalistic sounds while their magnetoencephalography (MEG) signals were recorded. Widely used DNNs were trained to classify the same sounds. Speech was decoded from brain activity localized to secondary auditory areas significantly more accurately than other, non-human sounds. Selectivity in secondary auditory areas best matched the later, more complex layers of the DNNs. Our results are compatible with special coding for speech in the brain and suggest that DNNs and the neural processing of sounds follow comparable hierarchical principles.