Predicting the likability of speakers based on their voices is a challenging problem. In this paper, we study this problem in the context of the Likability Sub-challenge of the InterSpeech 2012 Speaker Trait Challenge. We apply and extend the technique of Gaussian Processes to predict likability using acoustic features provided by the sub-challenge organizers. Our best-performing systems modestly improve on the published baselines on the test set. We also show that the likability of male speakers can be predicted more accurately than that of female speakers. Additionally, using a sparse subset of features typically leads to noticeably improved results. The proposed methods are promising, and to fully explore their potential we plan to make further submissions to the competition.
Index Terms: likability of voice, Gaussian Processes, sparse models, intelligibility of voice