This paper investigates the automatic recognition of conflict escalation in spontaneous conversations. In our previous work, we studied whether the level of conflict in a conversation segment can be automatically inferred from prosodic and conversational features. This work investigates whether it is possible to automatically recognize if the conflict is increasing, i.e., escalating, or not. The dataset used for the study consists of political debates in which short clips are labeled as escalation, de-escalation, or constant. Results show a Weighted Accuracy (WA) of 69.6% and an Unweighted Accuracy (UA) of 50.7%, both lower than the accuracies obtained on the simple conflict detection task (WA 86.1%, UA 78.2%). While the task appears more difficult than conflict detection, the results are significantly better than chance level, showing the feasibility of this approach. Furthermore, the paper investigates the use of a speaker diarization algorithm to extract features in a completely automatic fashion, highlighting some limitations of the diarization system.
Index Terms: Spoken Language Understanding, Conflicts, Paralinguistics, Spontaneous Conversation, Prosodic Features, Turn-taking Features