ISCA Archive Interspeech 2022

Which Model is Best: Comparing Methods and Metrics for Automatic Laughter Detection in a Naturalistic Conversational Dataset

Gordon Rennie, Olga Perepelkina, Alessandro Vinciarelli

Laughter is a common paralinguistic vocalization that has been shown to be used for controlling the flow of a conversation, nullifying previous statements, and managing conversations on delicate topics. There have already been concerted efforts to develop methods for automatically detecting laughter in speech. Many of these studies use artificial datasets and report model performance using the AUC metric. This paper replicates previous work on laughter detection on those artificial datasets and then extends it by validating the methods on a larger and more naturalistic dataset made up of 60 spontaneous conversations (120 speakers and roughly 12 hours of material in total), with the best performing model achieving an AUC of 90.39% +/- 1.10 (precision=13.99 +/- 4.09, recall=76.36 +/- 12.00, F1=23.06 +/- 4.99). The paper then discusses the shortcomings of AUC, the current standard comparison metric in the field, and suggests alternatives which may aid in comparing and understanding the methods' effectiveness.
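The gap between a high AUC and a low precision, as in the figures reported above, is typical when the target class is rare. The following is a minimal sketch of that effect and not the authors' evaluation code. The scores, the 10-to-990 class ratio, the threshold, and the function names `auc` and `precision_recall_f1` are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def precision_recall_f1(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float, float]:
    """Precision, recall and F1 when every score >= threshold is labelled positive."""
    tp = sum(s >= threshold for s in pos_scores)  # laughter correctly detected
    fp = sum(s >= threshold for s in neg_scores)  # non-laughter flagged as laughter
    fn = len(pos_scores) - tp                     # laughter missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical, hand-made scores (NOT data from the paper):
# 990 non-laughter frames scored 0..989 and 10 laughter frames scored 950..986.
neg = list(range(990))
pos = [950 + 4 * k for k in range(10)]

demo_auc = auc(pos, neg)
demo_p, demo_r, demo_f1 = precision_recall_f1(pos, neg, threshold=900)
print(f"AUC={demo_auc:.4f} precision={demo_p:.2f} recall={demo_r:.2f} F1={demo_f1:.4f}")
```

In this toy setting the ranking is nearly perfect (AUC of about 0.98) and every positive is recovered, yet only one in ten detections is correct, because the 90 false alarms are drawn from a much larger negative class. AUC does not depend on the class ratio, whereas precision and F1 do.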