ISCA Archive Eurospeech 1997

A prosody only decision-tree model for disfluency detection

Elizabeth Shriberg, Rebecca Bates, Andreas Stolcke

Speech disfluencies (filled pauses, repetitions, repairs, and false starts) are pervasive in spontaneous speech. The ability to detect and correct disfluencies automatically is important for effective natural language understanding, as well as to improve speech models in general. Previous approaches to disfluency detection have relied heavily on lexical information, which makes them less applicable when word recognition is unreliable. We have developed a disfluency detection method using decision tree classifiers that use only local and automatically extracted prosodic features. Because the model does not rely on lexical information, it is widely applicable even when word recognition is unreliable. The model performed significantly better than chance at detecting four disfluency types. It also outperformed a language model in the detection of false starts, given the correct transcription. Combining the prosody model with a specialized language model improved accuracy over either model alone for the detection of false starts. Results suggest that a prosody-only model can aid the automatic detection of disfluencies in spontaneous speech.


doi: 10.21437/Eurospeech.1997-626

Cite as: Shriberg, E., Bates, R., Stolcke, A. (1997) A prosody only decision-tree model for disfluency detection. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 2383-2386, doi: 10.21437/Eurospeech.1997-626

@inproceedings{shriberg97_eurospeech,
  author={Elizabeth Shriberg and Rebecca Bates and Andreas Stolcke},
  title={{A prosody only decision-tree model for disfluency detection}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={2383--2386},
  doi={10.21437/Eurospeech.1997-626},
  issn={1018-4074}
}