ISCA Archive Eurospeech 1999

Speech recognition with automatic punctuation

C. Julian Chen

We present a method of speech recognition with automatic punctuation based on a combination of acoustic and lexical evidence. In the recognizer vocabulary, punctuation marks are treated as word entries. By assigning the acoustic baseforms of silence, breath, and other non-speech sounds to punctuation marks, and using a properly processed N-gram language model, unpronounced punctuation marks of various types (commas, periods, etc.) appear naturally in the recognizer output. This technology can be used in dictation systems to improve usability, in commercial broadcast transcription systems to reduce editing time, and in information retrieval systems to provide phrasing information to facilitate natural language understanding.
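The idea of giving punctuation marks non-speech baseforms and letting the language model choose among them can be illustrated with a small sketch. This is not the paper's recognizer: it assumes the acoustic front end has already produced a sequence of words and non-speech units, and it uses a toy bigram model with invented probabilities. The names `BASEFORMS`, `BIGRAM`, `PAUSE_LOGP`, and `decode` are hypothetical, introduced only for illustration.

```python
import math

# Hypothetical baseform table: each non-speech unit lists the vocabulary
# entries it may realize. "" stands for a plain pause with no punctuation.
BASEFORMS = {
    "SIL": [",", ".", ""],
    "BREATH": [".", ""],
}

# Toy bigram probabilities (invented numbers, not taken from the paper).
BIGRAM = {
    ("<s>", "hello"): 0.5,
    ("hello", ","): 0.6,
    ("hello", "."): 0.1,
    ("hello", "how"): 0.1,
    (",", "how"): 0.5,
    (".", "how"): 0.05,
    ("how", "are"): 0.8,
    ("are", "you"): 0.8,
    ("you", "."): 0.6,
    ("you", ","): 0.1,
    ("you", "</s>"): 0.05,
    (".", "</s>"): 0.9,
}
DEFAULT_LOGP = math.log(1e-4)   # back-off score for unseen bigrams
PAUSE_LOGP = math.log(0.2)      # assumed cost of a pause that emits nothing


def logp(prev, cur):
    """Bigram log-probability with a flat back-off for unseen pairs."""
    p = BIGRAM.get((prev, cur))
    return math.log(p) if p is not None else DEFAULT_LOGP


def decode(units):
    """Viterbi search over the punctuation choices for each non-speech unit.

    The search state is the last emitted token, which is all a bigram
    model needs as context.
    """
    beams = {"<s>": (0.0, [])}  # last token -> (score, output so far)
    for unit in units:
        candidates = BASEFORMS.get(unit, [unit])  # ordinary words map to themselves
        new_beams = {}
        for prev, (score, out) in beams.items():
            for cand in candidates:
                if cand == "":
                    # Plain pause: nothing emitted, language model context unchanged.
                    key, s, o = prev, score + PAUSE_LOGP, out
                else:
                    key, s, o = cand, score + logp(prev, cand), out + [cand]
                if key not in new_beams or s > new_beams[key][0]:
                    new_beams[key] = (s, o)
        beams = new_beams
    # Close the sentence and keep the best-scoring hypothesis.
    best = max(beams.items(), key=lambda kv: kv[1][0] + logp(kv[0], "</s>"))
    return best[1][1]


if __name__ == "__main__":
    print(decode(["hello", "SIL", "how", "are", "you", "SIL"]))
```

In this sketch the same silence unit comes out as a comma after "hello" and as a period after "you", purely because of the surrounding lexical context. A real system would score acoustic and lexical evidence jointly inside the decoder rather than on a pre-segmented sequence.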