ISCA Archive Interspeech 2022

An investigation of regression-based prediction of the femininity or masculinity in speech of transgender people

Leon Liebig, Christoph Wagner, Alexander Mainka, Peter Birkholz

Transgender individuals often seek voice modification so that their voice more closely matches their new sex and to avoid potential stigmatization or even discrimination. Although treatment options such as voice therapy or surgery exist, a quantitative measure of the treatment outcome is missing. In this paper, we therefore propose a novel regression-based method to predict the perceived femininity or masculinity of a speaker's voice. To this end, 86 speakers (34 male, 35 female, 17 transgender) were recorded reading aloud a German standard passage. Subsequently, a group of 28 laypersons and 13 experts rated the femininity/masculinity of these speech samples. Each spoken utterance was automatically analysed with respect to nine different pitch-, resonance- and voice quality-related acoustic features. The ratings served as targets for three prediction models (linear, logistic and decision tree regression) based on the extracted features. The results show that, in general, f0 and the vocal tract length (VTL) are the main predictors. Furthermore, the continuous-outcome logistic regression model with f0, smoothed cepstral peak prominence (CPPS), jitter and VTL as input features performed best, achieving a cross-validated root-mean-squared error of 0.117 on ratings normalized to the range [0, 1].
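To make the evaluation idea concrete, the following is a minimal sketch of a logistic regression fitted to continuous targets in [0, 1] and scored by a k-fold cross-validated root-mean-squared error. It is not the implementation used in this work: the cross-entropy loss with gradient descent, the 5-fold split, the hyperparameters, and all function names are assumptions made for illustration, and the data are synthetic stand-ins for four standardized features (such as f0, CPPS, jitter and VTL), not the recorded speakers or their ratings.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    # Gradient descent on the cross-entropy loss with continuous targets in [0, 1]
    # (an assumed formulation of "continuous-outcome" logistic regression).
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def cross_validated_rmse(X, y, k=5, seed=0):
    # Pool the squared errors of all held-out predictions, then take the root.
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_logistic([X[i] for i in train], [y[i] for i in train])
        preds = predict(model, [X[i] for i in fold])
        sq_errors.extend((y[i] - p) ** 2 for i, p in zip(fold, preds))
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def make_synthetic_data(n=86, seed=1):
    # Hypothetical data: four standardized features and a noisy rating in (0, 1).
    # The coefficients below are arbitrary and carry no empirical meaning.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(4)]
        z = 1.5 * x[0] + 0.3 * x[1] + 0.3 * x[2] - 1.0 * x[3] + rng.gauss(0.0, 0.3)
        X.append(x)
        y.append(sigmoid(z))
    return X, y


X, y = make_synthetic_data()
score = cross_validated_rmse(X, y)
print(f"cross-validated RMSE on synthetic ratings: {score:.3f}")
```

Because the sigmoid output is bounded, the predictions always stay within the normalized rating range, which a plain linear regression does not guarantee.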