ISCA Archive Interspeech 2015

Towards automatic detection of reported speech in dialogue using prosodic cues

Alessandra Cervone, Catherine Lai, Silvia Pareti, Peter Bell

Reported speech (quoting the words, thoughts and opinions of others, or recounting past dialogue) is widespread in conversational speech. Detecting such quotations automatically has numerous applications, for example in enhancing automatic transcription or spoken language understanding. However, the task is challenging, not least because lexical cues of quotations are frequently ambiguous or absent in spoken language. The aim of this paper is to identify potential prosodic cues of reported speech which could be used, along with the lexical ones, to automatically detect quotations and ascribe them to their rightful source, that is, to reconstruct their attribution relations. To do so, we analyze SARC, a small corpus of telephone conversations that we have annotated with attribution relations. The statistical analysis of the data shows that variations in pitch, intensity, and timing features can be exploited as cues of quotations. Furthermore, we build an SVM classifier that integrates lexical and prosodic cues to automatically detect quotations in speech; the classifier performs significantly better than chance.