ISCA Archive ICSLP 2002

A training prompts generation algorithm for connected spoken word recognition

Ha-Jin Yu, Jin Suk Kim

This paper describes an efficient algorithm for generating compact prompt lists for training utterances. In building a connected speech recognizer, such as a connected spoken digit recognizer, we have to acquire speech data covering various combinations of word contexts. However, in many speech databases the prompt lists are produced by random generators. We provide an efficient algorithm that generates a compact and complete list of words in various contexts. The algorithm begins with a series of unique digits, which is used as the difference series of another digit series with a wider context. The process is applied recursively until the desired context range is achieved. The algorithm can be generalized to any range of contexts, such as tri-words and four-words. This paper includes a proof of the optimality and completeness of the algorithm.
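
To make the goal of "compact and complete" concrete, the following is a minimal sketch of one standard way to build a shortest digit series that contains every context of a given width exactly once. It is not the difference-series algorithm of this paper, whose details are not given in the abstract; it substitutes the well-known de Bruijn sequence construction based on Lyndon words. The function names `de_bruijn_cycle`, `prompt_series`, and `covered_contexts` are hypothetical, and the sketch assumes that a context is a run of `n` consecutive words drawn from a vocabulary of `k` words, numbered 0 to k-1.

```python
def de_bruijn_cycle(k, n):
    """Cyclic series over symbols 0..k-1 in which every n-gram occurs exactly once.

    Standard Lyndon-word construction; NOT the paper's difference-series method.
    """
    a = [0] * (k * n)
    seq = []

    def db(t, p):
        if t > n:
            # a[1..p] is a Lyndon word when its length p divides n.
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return seq


def prompt_series(k, n):
    """Unroll the cycle into a linear series by repeating its first n-1 symbols."""
    cycle = de_bruijn_cycle(k, n)
    return cycle + cycle[:n - 1]


def covered_contexts(series, n):
    """Set of all n-word contexts (consecutive runs) that appear in the series."""
    return {tuple(series[i:i + n]) for i in range(len(series) - n + 1)}


if __name__ == "__main__":
    digits = prompt_series(10, 2)  # every ordered digit pair, once each
    print(len(digits), len(covered_contexts(digits, 2)))
```

Under these assumptions, a series covering all 100 ordered digit pairs needs 10^2 + 2 - 1 = 101 digits, and one covering all 1000 digit triples needs 1002. A randomly generated list of the same length would in general miss some contexts and repeat others. In practice such a series would still have to be cut into utterance-length prompts, with n-1 digits of overlap at each cut so that no context is lost.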