ISCA Archive Interspeech 2014

Analysing the prosodic characteristics of speech-chunks preceding silences in task-based interactions

John Kane, Irena Yanushevskaya, Céline de Looze, Brian Vaughan, Ailbhe Ní Chasaide

For many applications in human-computer interaction, it is desirable to predict between-speaker silences (gaps) and within-speaker silences (pauses) independently of automatic speech recognition (ASR). In this study, we focus on a dataset of 6 dyadic task-based interactions and aim at automatic discrimination of gaps and pauses based on F0, energy and glottal parameters derived from the speech just preceding the silence. Initial manual annotation reveals strong discriminative power of intonation tune types. In a subsequent automatic analysis, using descriptive statistics of the parameter contours as well as a modelling of these contours with principal component analysis, we are able to predict pauses and gaps speaker-independently at an accuracy of 70%, compared to a 56% baseline.
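
The two kinds of contour features mentioned in the abstract (descriptive statistics and a principal component model) can be illustrated with a minimal sketch. This is not the authors' implementation: the choice of statistics, the resampling to 20 points, the number of components, and the function names resample, contour_stats and pca_scores are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def resample(contour, n_points=20):
    """Linearly interpolate a contour onto a fixed number of points.

    Assumption: contours are length-normalised before PCA; n_points is arbitrary.
    """
    contour = np.asarray(contour, dtype=float)
    old = np.linspace(0.0, 1.0, len(contour))
    new = np.linspace(0.0, 1.0, n_points)
    return np.interp(new, old, contour)

def contour_stats(contour):
    """Descriptive statistics of one parameter contour (e.g. F0 or energy).

    Returns [mean, std, min, max, slope]; the slope is that of a straight-line
    fit over normalised time in [0, 1]. This feature set is hypothetical.
    """
    contour = np.asarray(contour, dtype=float)
    t = np.linspace(0.0, 1.0, len(contour))
    slope = np.polyfit(t, contour, 1)[0]
    return np.array([contour.mean(), contour.std(),
                     contour.min(), contour.max(), slope])

def pca_scores(contours, n_components=3, n_points=20):
    """Project length-normalised contours onto their leading principal components."""
    X = np.vstack([resample(c, n_points) for c in contours])
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centred data.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T
```

In such a setup, the statistics and the component scores for each pre-silence speech chunk would be concatenated into one feature vector and passed to a classifier trained to separate gaps from pauses.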