ISCA Archive Interspeech 2012

A random, semantically appropriate sentence generator for speaker verification

Jason Lilley, Amanda Stent, Ilija Zeljkovic

In this paper, we describe two systems for automatically generating English sentences, and evaluate the suitability of their output for speaker verification. The first system, SUSGen, generates grammatical but semantically anomalous sentences of controlled length, vocabulary and phonetic content. The second system, SASGen, extends SUSGen to generate a greater variety of sentences, most of which are semantically acceptable. We demonstrate that sentences generated by SASGen are significantly more readable and meaningful than those generated by SUSGen. Although sentences generated by SASGen were not judged to be as readable or meaningful as human-generated sentences, the additional control SASGen provides over sentence length, vocabulary and phonetic content makes it more suitable for speaker verification and other voice collection purposes.