Considering the success of the ESPRIT projects, especially the multilingual speech recognition systems, we examined the possibility of adapting these systems to the Hungarian language. Agglutinative languages (Hungarian, Finnish, Turkish, etc.) have many more word forms than Indo-European languages do. The same problem holds, although to a lesser degree, for some Indo-European languages as well, especially the Slavic languages and perhaps also German. Thus even a large-vocabulary isolated-word recognizer would require the incorporation of ideas from continuous speech recognition (CSR). It would be highly impractical simply to enumerate the possible word forms in some sort of vocabulary, as is generally done for English, because words in agglutinative languages have their own sophisticated grammatical structure. In the case of true CSR the problem is even harder. It is therefore clear that the units to be recognized at the lowest level have to be smaller than words. The exact size and definition of these units, however, depend on the language and have to be determined empirically. To this end, a statistical study of the Hungarian language was conducted at the phonetic to syllabic level, and the acoustic structure of candidate units was also studied. A Hungarian text database was analyzed after grapheme-to-phoneme conversion (which can be done quite well by rules alone for Hungarian). The statistical distribution of several entities (consonant clusters, half syllables, syllables) was found to differ between continuous speech and isolated words. Based on these statistical examinations, we found half-syllable units to be the most compact description of the phonological structure of the Hungarian language.
Keywords: continuous speech recognition, automatic segmentation, half syllable, statistical study
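
The kind of unit counting summarized in the abstract can be illustrated with a minimal sketch. It is not the procedure used in the study; it rests on several simplifying assumptions: each phoneme is written as a single character, every vowel forms a syllable nucleus, only the last consonant of an intervocalic cluster opens the following syllable, and a half syllable is taken to be either the onset plus the vowel or the vowel plus the coda. The function names (syllabify, half_syllables, half_syllable_counts) are hypothetical and chosen only for this illustration.

```python
from collections import Counter

# Assumption: a toy inventory in which every phoneme is one character.
VOWELS = set("aáeéiíoóöőuúüű")


def syllabify(phonemes):
    """Split a phoneme string into syllables.

    Simplified rule (an assumption of this sketch): every vowel is a
    nucleus, and of the consonants between two vowels only the last
    one belongs to the following syllable.
    """
    nuclei = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if not nuclei:
        return [phonemes] if phonemes else []
    syllables = []
    start = 0
    for cur, nxt in zip(nuclei, nuclei[1:]):
        # With no consonant between the vowels, cut right before the next vowel;
        # otherwise cut before the last consonant of the cluster.
        boundary = nxt - 1 if nxt - cur > 1 else nxt
        syllables.append(phonemes[start:boundary])
        start = boundary
    syllables.append(phonemes[start:])
    return syllables


def half_syllables(syllable):
    """Split a syllable at its vowel into an initial and a final half.

    The vowel is shared by both halves: onset+vowel and vowel+coda.
    """
    idx = next((i for i, p in enumerate(syllable) if p in VOWELS), None)
    if idx is None:
        return [syllable]  # no vowel: keep the fragment as it is
    return [syllable[: idx + 1], syllable[idx:]]


def half_syllable_counts(words, continuous):
    """Count half-syllable units over a list of phoneme strings.

    With continuous=True the word boundaries are ignored, so syllable
    boundaries may fall across words; otherwise each word is
    syllabified on its own, as for isolated words.
    """
    units = ["".join(words)] if continuous else list(words)
    counts = Counter()
    for unit in units:
        for syllable in syllabify(unit):
            counts.update(half_syllables(syllable))
    return counts


if __name__ == "__main__":
    words = ["kert", "alma"]
    print(half_syllable_counts(words, continuous=False))
    print(half_syllable_counts(words, continuous=True))
```

In this toy example the two words yield the unit "ert" when treated in isolation, whereas in the continuous reading the final consonant of the first word attaches to the next syllable and the unit "ta" appears instead, which is one simple way the two distributions can diverge.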