ISCA Archive Eurospeech 1993

Connectionist segmental post-processing of the n-best solutions in isolated and connected word recognition task

Denys Boiteau, Patrick Haffner

Studies have shown that the segmental discriminative power of Neural Networks (NN) can be very useful for improving the performance of temporal sequence decoders such as Hidden Markov Models (HMM) [1,2]. In this paper, we present a new connectionist segmental approach applied to the reordering of the N-best solutions provided by an HMM. The overall system uses a segmental recognition framework in which phonetic segments are obtained by aligning a speech utterance with an HMM. The score of each solution is computed by a two-level architecture. The first level is a «One Net One Class» connectionist architecture that provides a phonetic score for each phoneme belonging to a word; each phonetic score can be interpreted as a measure of the validity of the corresponding segment labelling. The second level computes each word score as the product of its phonetic scores. In a final step, the NN scores and the HMM scores are combined for each of the N-best solutions in an optimal way so as to minimize word classification errors. Given the 5-best solutions, the present system achieves a 15 to 20% reduction in error rate compared with the HMM alone on several speaker-independent databases recorded over telephone networks.
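The two-level scoring and the final combination step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the NN and HMM scores are combined, so the log-linear interpolation and the `weight` parameter below are assumptions, as are the function names and the toy scores.

```python
import math

def word_log_score(phonetic_scores):
    """Second level: log of the product of the per-phoneme scores.

    Working in the log domain avoids underflow; a small floor guards
    against log(0). Both choices are assumptions of this sketch.
    """
    return sum(math.log(max(p, 1e-12)) for p in phonetic_scores)

def rerank(hypotheses, weight=0.5):
    """Reorder N-best hypotheses by a combined HMM/NN score.

    hypotheses: list of (word, hmm_log_score, phonetic_scores).
    weight: hypothetical interpolation weight given to the NN score;
            in practice it would be tuned on held-out data.
    """
    scored = []
    for word, hmm_log, phonetic_scores in hypotheses:
        nn_log = word_log_score(phonetic_scores)
        combined = (1.0 - weight) * hmm_log + weight * nn_log
        scored.append((combined, word))
    scored.sort(reverse=True)  # best combined score first
    return [word for _, word in scored]

# Toy 2-best list: the HMM prefers "nine", the phonetic scores prefer "five".
nbest = [
    ("nine", -10.0, [0.9, 0.4, 0.5]),
    ("five", -10.5, [0.9, 0.9, 0.8]),
]
reranked = rerank(nbest)
```

With `weight=0.0` the ordering reduces to that of the HMM alone, which makes it easy to compare the reranked list against the baseline.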

Keywords: N-best solutions, segmental connectionist processing, segment labelling validation