ISCA Archive ICSLP 2002

Vowel classification for computer-based visual feedback for speech training for the hearing impaired

Stephen A. Zahorian, A. Matthew Zimmer, Fansheng Meng

A visual speech training aid for persons with hearing impairments has been developed using a Windows-based multimedia computer. The training aid provides real-time visual feedback on the quality of pronunciation for 10 steady-state American English monophthong vowels (/aa/, /iy/, /uw/, /ae/, /er/, /ih/, /eh/, /ao/, /ah/, and /uh/). It is therefore referred to as a Vowel Articulation Training Aid (VATA). Neural network (NN) classifiers are used to classify vowels and then provide real-time feedback for several displays: a 10-category "vowel bargraph", which provides "discrete" feedback; an "ellipse display", which provides continuous feedback over a 2-D space similar to a formant1-formant2 space; and three game displays (a form of "Tetris" controlled by one vowel, a "chicken crossing the road" game controlled by two vowels, and Pac-Man controlled by four vowels). Continuous feedback of this kind is desirable in speech training to help improve articulation. In this paper we describe the overall speech training system, discuss some algorithmic refinements to the vowel classifier, and report some experiments related to the development of a database used for "training" the display.
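As a rough illustration of how a 10-category bargraph could be driven by classifier outputs, the minimal sketch below maps one score per vowel to a text bar. It is not the VATA implementation: the function names `bargraph_rows` and `top_vowel`, the assumption that each score lies in [0, 1], and the text rendering are all hypothetical choices made for this example.

```python
# Hypothetical sketch: render one classifier score per vowel as a text bar.
# The vowel labels come from the abstract; everything else is an assumption.
VOWELS = ["aa", "iy", "uw", "ae", "er", "ih", "eh", "ao", "ah", "uh"]


def bargraph_rows(scores, width=20):
    """Return one text row per vowel; scores are assumed to lie in [0, 1]."""
    if len(scores) != len(VOWELS):
        raise ValueError("expected one score per vowel")
    rows = []
    for label, score in zip(VOWELS, scores):
        clipped = min(max(score, 0.0), 1.0)  # clamp out-of-range scores
        bar = "#" * round(clipped * width)   # bar length proportional to score
        rows.append(f"/{label}/ {bar}")
    return rows


def top_vowel(scores):
    """Return the label with the highest score (first one wins on ties)."""
    if len(scores) != len(VOWELS):
        raise ValueError("expected one score per vowel")
    best = max(range(len(scores)), key=scores.__getitem__)
    return VOWELS[best]


if __name__ == "__main__":
    example = [0.05, 0.80, 0.10, 0.0, 0.0, 0.02, 0.03, 0.0, 0.0, 0.0]
    print("\n".join(bargraph_rows(example)))
    print("strongest category:", top_vowel(example))
```

In a real-time setting such a function would be called once per analysis frame with fresh classifier outputs; the frame rate and any smoothing across frames are not specified here.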