ISCA Archive ICSLP 2002

Automatic generation of phonetic transcriptions for large speech corpora

Kris Demuynck, Tom Laureys, Steven Gillis

We describe a method for automatically producing phonetic transcriptions of large speech corpora. We first discuss several techniques for generating pronunciation variants, and then explain how a speech recognition system is used to select the phonetic transcription that best matches the acoustic signal. The system is evaluated on test sets drawn from the Spoken Dutch Corpus, ranging from read-aloud text to spontaneous speech, and achieves promising first results.
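
The two-step procedure summarised in the abstract (generate pronunciation variants, then keep the best-matching one) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy lexicon `LEXICON`, the phone symbols, and the helper names `generate_variants`, `select_best` and `toy_score` are hypothetical. In particular, `toy_score` is only a stand-in for the acoustic match computed by a speech recogniser; it ranks candidates by edit distance to a given phone string instead of scoring them against audio.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Phones = Tuple[str, ...]

# Hypothetical toy lexicon (illustrative only, not from the paper):
# each word maps to one or more pronunciation variants.
LEXICON: Dict[str, List[Phones]] = {
    "we": [("w", "@"), ("w", "e")],
    "lopen": [("l", "o", "p", "@", "n"), ("l", "o", "p", "@")],
}


def generate_variants(words: Sequence[str]) -> List[Phones]:
    """Expand a word sequence into all utterance-level phone sequences."""
    per_word = [LEXICON[w] for w in words]
    return [
        tuple(phone for word_variant in combo for phone in word_variant)
        for combo in product(*per_word)
    ]


def select_best(variants: Sequence[Phones],
                score: Callable[[Phones], float]) -> Phones:
    """Return the variant with the highest score."""
    return max(variants, key=score)


def toy_score(observed: Phones) -> Callable[[Phones], float]:
    """Stand-in for an acoustic score (assumption): the negative
    Levenshtein distance between a candidate and `observed`."""
    def score(candidate: Phones) -> float:
        # prev[j] = distance between the empty prefix and candidate[:j]
        prev = list(range(len(candidate) + 1))
        for i, obs_phone in enumerate(observed, 1):
            cur = [i]
            for j, cand_phone in enumerate(candidate, 1):
                cur.append(min(
                    prev[j] + 1,                              # deletion
                    cur[j - 1] + 1,                           # insertion
                    prev[j - 1] + (obs_phone != cand_phone),  # substitution
                ))
            prev = cur
        return -float(prev[-1])
    return score


variants = generate_variants(["we", "lopen"])
best = select_best(variants, toy_score(("w", "@", "l", "o", "p", "@")))
```

In a real system the scoring function would come from aligning each candidate against the speech signal with acoustic models. Enumerating the full Cartesian product, as done here, is practical only for short utterances with few variants per word.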