ISCA Archive Interspeech 2022

Comparison of Models for Detecting Off-Putting Speaking Styles

Diego Aguirre, Nigel Ward, Jonathan E. Avila, Heike Lehnert-LeHouillier

In human-human interaction, variation in speaking style is pervasive. Modeling such variation has seen increasing interest, but there has been relatively little work on how best to discriminate among styles, and apparently none on how to exploit pretrained models for this purpose. Moreover, little computational work has addressed how styles are perceived, although perception is often the aspect that matters most socially and interpersonally. Here we develop models that predict whether an utterance is likely to be perceived as off-putting. We explore different ways to leverage state-of-the-art pretrained representations, namely those of TRILL, COLA, and TRILLsson. We obtain reasonably good performance in detecting off-putting styles, and find that architectures and learned representations designed to capture multi-second temporal information perform better.