ISCA Archive ICSLP 1992

Applications of generalized linear modeling to vowel data

Terrance M. Nearey

Generalized linear models (GLM) provide a flexible framework for investigating many long-standing issues concerning the relation of F0 and formant frequencies to vowel categories. These issues include the choice of frequency scale (e.g., log Hz, Bark, ERB), the effect of F0, and the importance of longer-term, speaker-dependent extrinsic information such as formant ranges or average fundamental frequency to the specification of vowel quality. As noted in [2, 3], the differences in the empirical consequences of alternate models can be quite subtle. The present paper illustrates how GLM and related techniques may inform the choice among competing approaches from three perspectives: the modeling of patterns in production data, data-analytic pattern recognition, and direct perceptual modeling. Analysis of F1 patterns from the individual Peterson and Barney [4] data indicates a small but consistent advantage to the log scale over the other two and a preference for extrinsic, formant-average information as a normalization parameter. Simple pattern recognition schemes show relatively little difference among the alternate scales. Initial perceptual modeling via logistic regression on the data from [5] also fails to provide evidence for a substantial difference among the scales. Perceptual experiments analogous to those of [6] specifically designed to be sensitive to the relatively subtle differences among the scales will likely be necessary to resolve the issue of "the best scale" for the representation of vowel quality.
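For readers unfamiliar with the three frequency scales compared above, the following is a minimal sketch of how a frequency in Hz might be mapped onto each of them. The abstract does not say which Bark or ERB formulas were used, so the choices here are assumptions: the Traunmüller (1990) approximation for Bark and the Glasberg and Moore (1990) formula for ERB-rate. The function names and the sample frequencies are hypothetical and are not taken from the paper's data.

```python
import math


def hz_to_log(f_hz):
    """Natural log of frequency in Hz (the 'log Hz' scale)."""
    return math.log(f_hz)


def hz_to_bark(f_hz):
    """Bark value, assuming the Traunmueller (1990) approximation."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def hz_to_erb_rate(f_hz):
    """ERB-rate value, assuming the Glasberg and Moore (1990) formula."""
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)


if __name__ == "__main__":
    # Illustrative frequencies only; not measurements from any data set.
    for f in (300.0, 500.0, 1000.0, 2500.0):
        print(f, hz_to_log(f), hz_to_bark(f), hz_to_erb_rate(f))
```

All three mappings are monotonic and compressive at higher frequencies, which is one way to see why the empirical differences among them can be subtle over the range of vowel formants.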