ISCA Archive SpeechProsody 2020

Creak in the phonetic space of low tones in Beijing Mandarin, Cantonese, and White Hmong

Seoyoung Kim, Claudia Matachana, Alex Nyman, Kristine Yu

Low pitch, irregular pitch, and constricted voicing have been proposed as three independent perceptual properties of creaky voice quality, with corresponding acoustic correlates of fundamental frequency, harmonics-to-noise ratio, and the spectral tilt measure H1-H2, respectively. We examined how these three acoustic measures described the variability in a small corpus of multispeaker productions of low falling tones that are often creaky in Beijing Mandarin, Cantonese, and White Hmong. Using principal component analysis, we found that harmonics-to-noise ratios strongly dominated the first principal component (50-60% of the variance across languages), while fundamental frequency and H1-H2 were strongly correlated. Moreover, in all three languages, tokens identified as likely to be creaky by a neural network creak classifier (Drugman et al., 2014) clustered in the high-noise region of the principal component space along the first principal component. No systematic patterns of clustering with respect to fundamental frequency or spectral tilt were found. Principal component analysis restricted to tokens identified as having greater than a 50% likelihood of being creaky indicated a lack of statistical independence between the three acoustic measures across languages, and no distinct clusters were found in the principal component space in any language.
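
The kind of analysis summarized above can be illustrated with a minimal sketch of principal component analysis over three acoustic measures, followed by a second analysis restricted to tokens above a 50% creak likelihood. Everything in it is an assumption made for illustration: the data are synthetic, the function name `pca` and the variables `f0`, `hnr`, `h1h2`, and `creak_prob` are hypothetical, the creak probabilities are random placeholders rather than output of the Drugman et al. (2014) classifier, and standardizing each measure before the decomposition is one common choice that the abstract does not confirm. The numbers it produces are not intended to reproduce the reported results.

```python
import numpy as np


def pca(X):
    """PCA via SVD on z-scored columns.

    Returns (explained_variance_ratio, loadings, scores), where each row of
    `loadings` is one principal component expressed over the input measures.
    """
    # Standardize so measures on different scales (Hz vs. dB) are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    variances = S ** 2 / (Z.shape[0] - 1)
    ratio = variances / variances.sum()
    scores = Z @ Vt.T
    return ratio, Vt, scores


# Synthetic stand-in data (hypothetical values, not from the corpus).
rng = np.random.default_rng(0)
n_tokens = 200
f0 = rng.normal(100.0, 15.0, n_tokens)                        # fundamental frequency (Hz)
h1h2 = 0.1 * (f0 - 100.0) + rng.normal(0.0, 1.0, n_tokens)    # spectral tilt (dB)
hnr = rng.normal(20.0, 8.0, n_tokens)                         # harmonics-to-noise ratio (dB)
X = np.column_stack([f0, hnr, h1h2])

# Placeholder per-token creak probabilities; a real analysis would take
# these from a creak classifier.
creak_prob = rng.uniform(0.0, 1.0, n_tokens)

# PCA over all tokens.
ratio, loadings, scores = pca(X)

# PCA over only the tokens with greater than 50% creak likelihood.
X_creaky = X[creak_prob > 0.5]
ratio_creaky, loadings_creaky, scores_creaky = pca(X_creaky)

print("explained variance ratio (all tokens):", np.round(ratio, 3))
print("PC1 loadings over [f0, HNR, H1-H2]:", np.round(loadings[0], 3))
print("explained variance ratio (creaky subset):", np.round(ratio_creaky, 3))
```

In a sketch like this, the size of each entry in `loadings[0]` indicates how strongly the corresponding measure contributes to the first principal component, and plotting `scores[:, 0]` against `scores[:, 1]`, colored by `creak_prob`, is one way to inspect whether likely-creaky tokens cluster in the component space.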