ISCA Archive Interspeech 2014

A CRF-based approach to automatic disfluency detection in a French call-centre corpus

Camille Dutrey, Chloé Clavel, Sophie Rosset, Ioana Vasilescu, Martine Adda-Decker

In this paper, we present a Conditional Random Field (CRF) based approach for the automatic detection of edit disfluencies in a French conversational telephone corpus. We define disfluency patterns using both linguistic and acoustic features to perform disfluency detection. Two related tasks are considered. The first task aims at detecting the disfluent speech portion proper, or reparandum, i.e. the portion to be removed if we want to improve the readability of transcribed data. In the second task, we also aim at identifying the corrected portion, or repair, which can be useful in follow-up discourse and dialogue analyses or in opinion mining. For these two tasks, we present comparative results as a function of the type of features involved (acoustic and/or linguistic). Generally speaking, the best results are obtained by CRF models combining both acoustic and linguistic features.


doi: 10.21437/Interspeech.2014-601

Cite as: Dutrey, C., Clavel, C., Rosset, S., Vasilescu, I., Adda-Decker, M. (2014) A CRF-based approach to automatic disfluency detection in a French call-centre corpus. Proc. Interspeech 2014, 2897-2901, doi: 10.21437/Interspeech.2014-601

@inproceedings{dutrey14_interspeech,
  author={Camille Dutrey and Chloé Clavel and Sophie Rosset and Ioana Vasilescu and Martine Adda-Decker},
  title={{A CRF-based approach to automatic disfluency detection in a French call-centre corpus}},
  year=2014,
  booktitle={Proc. Interspeech 2014},
  pages={2897--2901},
  doi={10.21437/Interspeech.2014-601},
  issn={2308-457X}
}