The main goal of the auditory system is to detect and identify incoming sound patterns that are distributed in time and frequency. Since a priori knowledge of the spectrotemporal structure of these patterns is not available, the optimal strategy for the auditory system is to integrate incoming signals in frequency and time according to the average spectrotemporal structure of ecologically relevant stimuli. In the present work, we measure the average spectrotemporal dependencies of continuous speech and show that this dependency structure can be interpreted as a set of optimal filters matched to the structure of speech. Notably, the characteristics of the obtained filters closely resemble the critical bands of human hearing. These results provide further evidence that speech and the auditory system are mutually matched for optimal signaling performance, and that the dependency structure is learnable with a single Hebbian-like learning mechanism.
Index Terms: speech perception, auditory perception, statistical learning, sensory plasticity
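The measurement described above can be sketched in code. This is a minimal illustration, not the paper's actual analysis pipeline: it assumes a log-magnitude STFT as the spectrotemporal representation and uses noise as a stand-in for a speech corpus. It estimates the across-frequency dependency structure as a channel-by-channel correlation matrix, and shows that the same matrix emerges from a simple Hebbian-like accumulation of outer products over time frames.

```python
import numpy as np
from scipy.signal import stft

def spectral_dependencies(x, fs=16000, nperseg=512):
    """Estimate the average across-frequency dependency structure of a signal.

    Returns an (n_freq, n_freq) correlation matrix; each row can be read as a
    frequency-domain filter matched to the signal's spectral statistics.
    """
    _, _, Z = stft(x, fs=fs, nperseg=nperseg)
    env = np.log(np.abs(Z) + 1e-10)   # log-magnitude envelope per frequency channel
    return np.corrcoef(env)           # pairwise correlations between channels

def hebbian_estimate(x, fs=16000, nperseg=512):
    """Same quantity via a Hebbian-like rule: accumulate outer products of the
    (z-scored) channel activations, one update per time frame."""
    _, _, Z = stft(x, fs=fs, nperseg=nperseg)
    env = np.log(np.abs(Z) + 1e-10)
    env = (env - env.mean(axis=1, keepdims=True)) / env.std(axis=1, keepdims=True)
    C = np.zeros((env.shape[0], env.shape[0]))
    for frame in env.T:               # Hebbian update: co-active channels reinforce
        C += np.outer(frame, frame)
    return C / env.shape[1]           # averaging recovers the correlation matrix

rng = np.random.default_rng(0)
fs = 16000
x = rng.standard_normal(fs)           # 1 s of noise as a stand-in for speech
D = spectral_dependencies(x, fs)      # (257, 257) dependency matrix for nperseg=512
H = hebbian_estimate(x, fs)           # converges to the same matrix
```

The equivalence of the two estimates is the point of the sketch: averaging outer products of normalized activations is exactly the sample correlation, so a single Hebbian-like mechanism suffices to learn the dependency structure.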