ISCA Archive PSP 2005

Coding speech into useful structures: a neural basis for Polysp

Sarah Hawkins, Rachel Smith

This paper presents ideas on a neurological and psychoacoustic basis for a polysystemic approach to speech perception, Polysp, with particular reference to behavioural, physiological, and biochemical research on hippocampal function and plasticity in sound classification. Polysp proposes that the incoming signal is mapped probabilistically onto polysystemic prosodic-linguistic structures. Attention to fine detail is crucial for accessing the right structure and optimally adapting to new speakers; it sometimes allows auditory patterns to be understood without intermediate analytical stages. The model includes exemplar and abstract representation. Exemplar representation is plausible logically (speech is but one aspect of communicative actions) and psychoacoustically (signals seem to be analysed in the auditory pathway in terms of spectro-temporal excitation patterns (STEPs), which are assumed to be held in short-term memory; some may be passed to long-term memory). STEPs excite distinctive patterns of spectro-temporal receptive fields (STRFs) in the primary auditory cortex. STRFs reflect particular attributes of auditory stimuli, but, crucially, their response-specificity varies with task demands. Shifts occur within minutes, and can last hours. Such neocortical plasticity in responding to auditory stimuli might underpin the sensitivity of the perceptual system to changes in stimulus or task demands, including those which are typically 'controlled for' in categorical perception experiments (e.g. stimulus range, response set). This type of representation is proposed as the basis for storage, in associative networks, of all units derivable from speech, and for adaptation in ongoing speech perception.

Abstraction (including classification) and long-term storage are hypothesized to include hippocampal processing together with neocortical plasticity. The process involves linking disparate cortical excitation patterns by building hierarchical indices to them in the hippocampus. The links made depend on how attention is directed. Links might be thought of as creating 'road maps', each map representing a distinct polysystemic structure, linguistic, social-affective, cognitive, as attention dictates. For example, connectivity with the limbic system allows social-affective attributes of heard speech to be understood, as well as (or sometimes instead of) the so-called purely linguistic message. Critical for Polysp, novel stimuli have a disproportionate influence on hippocampal coding, and an entire structure can be excited from activation of just part of the hippocampal hierarchy. This model is compatible with others that are based on emergence of auditory objects, e.g. Adaptive Resonance Theory and auditory scene analysis, but includes a fundamental role for fine phonetic detail. The paper includes predictions derived from these principles.

Cite as: Hawkins, S., Smith, R. (2005) Coding speech into useful structures: a neural basis for Polysp. Proc. ISCA Workshop on Plasticity in Speech Perception (PSP 2005), 31

  author={Sarah Hawkins and Rachel Smith},
  title={{Coding speech into useful structures: a neural basis for Polysp}},
  booktitle={Proc. ISCA Workshop on Plasticity in Speech Perception (PSP 2005)},