ISCA Archive PSP 2005

Perception of altered formant feedback influences speech production

David Purcell, Ingrid Johnsrude, Kevin Munhall

Auditory feedback from a speaker's voice is an important sensorimotor control signal. Hearing your own speech is essential for vocal learning in infancy, and is also important for the maintenance of fluent speech articulation as an adult. It is well known that hearing impairment can produce changes in pitch and loudness control as well as changes in the precision and reliability of consonant and vowel production. Recently, changes to this speech feedback have been studied in the laboratory using perturbations of the pitch, amplitude and spectral distribution of speech sounds. While subjects produce speech samples wearing headphones, real-time modifications to the auditory feedback can be carried out through signal processing.

In this paper, we present data showing that subjects respond rapidly to formant modification. We also present data on sensitivity to change, both in feedback perception during production and in perception of recorded samples. In the work reported here, the first formant of steady-state isolated vowels was altered within trials. Since previous studies have employed whispered speech, it was necessary to develop a new system to manipulate the formants of voiced speech using real-time formant tracking and filtering. A formant (such as F1 and/or F2) can be shifted by filtering out the signal at the frequency where that formant was estimated, and emphasizing the signal at the new desired formant frequency. This is accomplished using a filter transfer function with a pair of spectral zeros in the numerator to attenuate the energy (harmonics of the glottal fundamental frequency F0 for vowels) near the existing formant, and a pair of spectral poles in the denominator to amplify energy near the new formant. The first formant of the vowel /{/ was manipulated within trials 100% towards either /a/ or /I/ for the individual. Participants responded by altering their production, with average F1 compensation as large as 14.4% and 11.3% of the applied formant shift, respectively. Small concomitant changes were also observed in F2. In a separate study, vowel formant discrimination was tested to examine the similarity between the perception of recorded speech samples and the perception of speech feedback during production. The results will be discussed in terms of the relationship between speech perception and production.
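The zero/pole filtering idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the formant estimate is already available (the real-time tracking step is omitted), and the function names, the sample rate, the pole/zero radius of 0.95 and the example frequencies are all illustrative assumptions. A conjugate pair of zeros is placed at the estimated formant frequency and a conjugate pair of poles at the target frequency.

```python
import cmath
import math


def formant_shift_coeffs(f_old, f_new, fs, radius=0.95):
    """Second-order filter coefficients (b, a) for a formant shift.

    Zeros at angle 2*pi*f_old/fs attenuate energy near the estimated
    formant; poles at angle 2*pi*f_new/fs amplify energy near the target.
    The radius (assumed value, must be < 1 for stability) sets how narrow
    the notch and the resonance are.
    """
    theta_old = 2.0 * math.pi * f_old / fs
    theta_new = 2.0 * math.pi * f_new / fs
    b = [1.0, -2.0 * radius * math.cos(theta_old), radius ** 2]  # zeros
    a = [1.0, -2.0 * radius * math.cos(theta_new), radius ** 2]  # poles
    return b, a


def gain_at(b, a, freq, fs):
    """Magnitude of the transfer function H(z) = B(z)/A(z) at freq (Hz)."""
    z_inv = cmath.exp(-2j * math.pi * freq / fs)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return abs(num / den)


def apply_filter(b, a, samples):
    """Run the difference equation sample by sample (direct form I)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


if __name__ == "__main__":
    fs = 10000.0      # assumed sample rate in Hz
    f1_est = 700.0    # hypothetical estimated F1 in Hz
    f1_target = 900.0  # hypothetical shifted F1 in Hz
    b, a = formant_shift_coeffs(f1_est, f1_target, fs)
    print("gain at estimated F1:", gain_at(b, a, f1_est, fs))
    print("gain at target F1:  ", gain_at(b, a, f1_target, fs))
```

In this sketch the gain is below 1 at the estimated formant and above 1 at the target, which is the qualitative behaviour the text describes. The overall gain is not normalized, and a real-time system would need to update the coefficients as the formant estimate changes from frame to frame.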