Listeners recognize a vast number of complex sounds, but vocal sounds such as speech and song are essential for communication. Recently, deep neural networks (DNNs) have achieved human-level accuracy in sound classification, but do they share representational properties with biological brains? In this study, we compared DNNs to primary and secondary auditory cortex to understand the hierarchy of sound representations in the brain. Ten subjects listened to speech and other naturalistic sounds while their magnetoencephalography (MEG) signals were recorded. Widely used DNNs were trained to classify the same sounds. Brain activity localized to secondary auditory areas decoded speech significantly more accurately than non-speech sounds. Secondary auditory selectivity best matched the later, more complex layers of the DNNs. Our results are compatible with specialized coding for speech in the brain and suggest that DNNs and neural processing of sounds follow comparable hierarchical principles.