Automatic punctuation makes speech-to-text output more readable and easier to handle in downstream language processing. We describe the development of an automatic punctuation system for French and English. The punctuation model, which is based on adaptive boosting, uses both textual and acoustic (prosodic) information. The system is evaluated on a challenging speech database under real-application conditions, using output from a state-of-the-art speech-to-text system together with automatic audio segmentation and speaker diarization. Unlike previous work, we score automatic punctuation against two independent manual references. We also compare results across the two languages, and we compare the performance of the automatic system with inter-annotator agreement.
Index Terms: automatic punctuation, rich transcription, prosody