ISCA Archive ECST 1987

An efficient technique for isolated word recognition of monosyllabic languages

P. C. Ching, W. M. Lai, Y. T. Chan

An automatic speech recognition system is discussed in which the energy-time profiles at several frequency bands are used to represent an input utterance and then compared with a reference set obtained during training with many different speakers. To reduce considerably the number of misrecognitions as well as the overall matching time, a zero-crossing count front end is used for a voice/fricative initial classification. The recognition scheme is most suitable for monosyllabic languages and has the advantages of being very simple, avoiding time-warping and permitting low-cost implementation on a microcomputer. The system was evaluated for speaker-independent isolated word recognition of the ten Cantonese digits. A mean recognition accuracy of about 90-95% was obtained.
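
The scheme outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameter values, so the band edges, the zero-crossing threshold, the length of the initial segment, the number of time segments, and the squared-distance measure are all assumptions chosen for illustration. A naive DFT stands in for whatever filter bank the original system used, and splitting the utterance into a fixed number of equal segments is one assumed way of comparing profiles without time-warping.

```python
import math

def zero_crossing_count(samples):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0))

def initial_class(samples, rate, head_ms=30, threshold_per_ms=3.0):
    """Classify the start of the word as 'voice' or 'fricative'.

    head_ms and threshold_per_ms are assumed values, not taken from the paper.
    """
    head = samples[: int(rate * head_ms / 1000)]
    if not head:
        return "voice"
    duration_ms = len(head) * 1000 / rate
    zc_per_ms = zero_crossing_count(head) / duration_ms
    return "fricative" if zc_per_ms > threshold_per_ms else "voice"

def band_energies(frame, rate, bands):
    """Energy of one frame in each (low_hz, high_hz) band, via a naive DFT."""
    n = len(frame)
    energies = [0.0] * len(bands)
    for k in range(1, n // 2):
        freq = k * rate / n
        for i, (lo, hi) in enumerate(bands):
            if lo <= freq < hi:
                re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
                im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
                energies[i] += (re * re + im * im) / n
                break
    return energies

def energy_time_profile(samples, rate, bands, n_segments=8):
    """Split the utterance into n_segments equal parts (assumed linear time
    normalisation) and return one row of band energies per part."""
    seg_len = len(samples) // n_segments
    return [
        band_energies(samples[i * seg_len:(i + 1) * seg_len], rate, bands)
        for i in range(n_segments)
    ]

def distance(p, q):
    """Squared Euclidean distance between two profiles (assumed measure)."""
    return sum((a - b) ** 2 for rp, rq in zip(p, q) for a, b in zip(rp, rq))

def recognise(samples, rate, bands, references):
    """Return the label of the closest reference profile.

    references maps label -> (initial_class, profile). Only references whose
    class matches the zero-crossing decision are searched, which is how the
    front end would cut both matching time and confusions.
    """
    cls = initial_class(samples, rate)
    profile = energy_time_profile(samples, rate, bands)
    candidates = [(lab, ref) for lab, (c, ref) in references.items() if c == cls]
    if not candidates:  # fall back to the full reference set
        candidates = [(lab, ref) for lab, (_, ref) in references.items()]
    return min(candidates, key=lambda item: distance(profile, item[1]))[0]
```

In this sketch a reference set would be built by computing `energy_time_profile` for training utterances and storing each profile with its initial class. How the original system combined profiles from many speakers into a single reference is not stated in the abstract, so that step is left out.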