Recent work has shown that understanding noise-vocoded (NV) speech requires a top-down, lexically driven perceptual learning process. Higher-level influences are revealed by enhanced learning with spoken or written feedback and by the absence of learning from nonword NV sentences (Davis, Johnsrude, Hervais-Adelman, Taylor, & McGettigan, in press). However, learning must modify pre-lexical processes, since report scores improve for words that have not been heard in NV speech. Here, we ask whether perceptual learning alters acoustic processing or more abstract, non-acoustic representations of speech. We address this question by changing the carrier signal (Experiments 1/2) and frequency bands (Experiment 3) used in vocoding and assessing whether training on one form of vocoded speech improves performance on an acoustically different form.
Experiment 1 tested transfer between vocoded stimuli created using noise (NV) or pulse-train (PTV) carriers. Four groups of volunteers reported 40 vocoded sentences with feedback to promote perceptual learning (Davis et al., in press). For two groups, all sentences were vocoded using the same carrier signal. For two transfer groups, the carrier signal changed after 20 sentences, allowing generalisation from one carrier to the other to be tested. Results showed some significant transfer: listeners trained on NV speech outperformed naïve listeners when tested on PTV speech. However, the reverse transfer effect was non-significant, perhaps because report of NV sentences was near ceiling. In Experiment 2, care was taken to ensure that word-report scores were approximately equal for both carriers. Nonetheless, results still showed asymmetric transfer: training on NV speech improved report scores for PTV speech, but not vice versa. Asymmetric transfer suggests that some but not all perceptual learning of vocoded speech is independent of the fine-structure used to encode speech information.
Experiment 3 tested whether perceptual learning of NV speech generalises between non-overlapping frequency ranges. Two groups reported 40 NV sentences filtered into either a low (50 to 1406 Hz) or a high (1593 to 8000 Hz) frequency region, with feedback to assist learning as before. In two transfer groups, the frequency region changed after 20 sentences. Both transfer groups showed complete generalisation: their report scores did not differ from those of listeners exposed to a single frequency range throughout. Thus, perceptual learning of NV speech alters representations that are not frequency-specific.
In summary, perceptual learning of vocoded speech generalises over changes to fine-structure (Experiments 1/2) and frequency range (Experiment 3), demonstrating that perceptual learning modifies pre-lexical representations that are abstracted from the acoustic properties of vocoded speech.
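For readers unfamiliar with the stimulus manipulations, the following is a minimal sketch of a generic channel vocoder. It shows how the same band envelopes can be imposed on either a noise or a pulse-train carrier, and how the analysis can be restricted to one frequency region. It is not the procedure used in these experiments. Only the outer limits of the low region (50 and 1406 Hz) come from the text. The inner band edges, the second-order filters, the 30 Hz envelope cutoff, the 100 Hz pulse rate, and all function names are illustrative assumptions.

```python
import math
import random

# Outer limits (50, 1406 Hz) are from the text; inner edges are hypothetical.
LOW_EDGES = [50.0, 300.0, 700.0, 1406.0]

def bandpass(signal, fs, lo, hi):
    """Second-order band-pass (standard biquad form) between lo and hi Hz."""
    f0 = math.sqrt(lo * hi)              # geometric centre frequency
    q = f0 / (hi - lo)                   # approximate Q from the bandwidth
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2 * math.cos(w0) / a0, (1 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def envelope(signal, fs, cutoff=30.0):
    """Amplitude envelope: full-wave rectify, then one-pole low-pass.
    The 30 Hz cutoff is an assumption, not a value from the text."""
    k = math.exp(-2 * math.pi * cutoff / fs)
    env, state = [], 0.0
    for x in signal:
        state = (1 - k) * abs(x) + k * state
        env.append(state)
    return env

def carrier(kind, n, fs, pulse_rate=100.0, seed=0):
    """Broadband carrier: white noise (NV) or a periodic pulse train (PTV).
    The 100 Hz pulse rate is an assumption."""
    if kind == "noise":
        rng = random.Random(seed)
        return [rng.uniform(-1.0, 1.0) for _ in range(n)]
    if kind == "pulse":
        period = int(round(fs / pulse_rate))
        return [1.0 if i % period == 0 else 0.0 for i in range(n)]
    raise ValueError("kind must be 'noise' or 'pulse'")

def vocode(signal, fs, band_edges, kind="noise"):
    """Replace the fine structure in each band with the chosen carrier,
    keeping only the band's amplitude envelope."""
    n = len(signal)
    car = carrier(kind, n, fs)
    out = [0.0] * n
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        env = envelope(bandpass(signal, fs, lo, hi), fs)
        band_car = bandpass(car, fs, lo, hi)   # carrier limited to this band
        for i in range(n):
            out[i] += env[i] * band_car[i]
    return out
```

Calling `vocode(samples, fs, LOW_EDGES, "noise")` and `vocode(samples, fs, LOW_EDGES, "pulse")` on the same list of samples yields two signals that share band envelopes but differ in fine structure, which is the contrast examined in Experiments 1/2. Passing a list of edges spanning 1593 to 8000 Hz instead (with a sampling rate above 16 kHz) would give the high-region condition of Experiment 3.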