ISCA Archive Eurospeech 1997

Intelligent retrieval of very large Chinese dictionaries with speech queries

Sung-Chien Lin, Lee-Feng Chien, Ming-Chiuan Chen, Lin-Shan Lee, Ker-Jiann Chen

To retrieve a word from a Chinese dictionary, the user must know exactly the first character of the desired word. Because there are more than 10,000 Chinese characters, Chinese dictionaries are relatively difficult to use. To alleviate this problem, this paper presents intelligent retrieval techniques for very large Chinese dictionaries with speech queries. The proposed techniques integrate Mandarin speech recognition and Chinese information retrieval through a syllable-based approach that exploits the monosyllabic structure of the language. It is also desirable to retrieve all relevant word entries from the dictionaries using speech queries that describe the "general concepts" of the desired words; to support this challenging function, relevance feedback techniques are also included. Based on these techniques, a retrieval system was successfully implemented on a Pentium PC for a very large Chinese dictionary that includes 160,000 word entries, with the lexical information under these entries exceeding 20,000,000 words in total.
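
The abstract does not spell out the indexing or feedback details, so the following is only a minimal sketch of what syllable-based retrieval with relevance feedback could look like. It assumes that a recognizer has already turned the spoken query into a sequence of toneless pinyin syllables, that each dictionary entry's lexical information is stored as a syllable sequence, and that entries are scored by overlapping syllable unigrams and bigrams. The feedback step is a simple Rocchio-style query expansion, swapped in here for illustration and not confirmed as the authors' method. All names (`syllable_ngrams`, `build_index`, `retrieve`, `feedback`) and the three toy entries are hypothetical.

```python
from collections import Counter, defaultdict


def syllable_ngrams(syllables, n_max=2):
    """Count syllable unigrams and bigrams in a syllable sequence."""
    grams = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(syllables) - n + 1):
            grams[tuple(syllables[i:i + n])] += 1
    return grams


def build_index(entries):
    """Build an inverted index: syllable n-gram -> {word: count}."""
    index = defaultdict(dict)
    for word, syllables in entries.items():
        for gram, count in syllable_ngrams(syllables).items():
            index[gram][word] = count
    return index


def retrieve(index, query_vector, top_k=5):
    """Rank entries by the weighted overlap of syllable n-grams."""
    scores = Counter()
    for gram, weight in query_vector.items():
        for word, count in index.get(gram, {}).items():
            scores[word] += weight * count
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, score in ranked[:top_k] if score > 0]


def feedback(query_vector, entries, relevant_words, beta=0.5):
    """Rocchio-style expansion (an assumption, not the paper's method):
    add the averaged n-grams of user-approved entries to the query."""
    updated = Counter(query_vector)
    for word in relevant_words:
        for gram, count in syllable_ngrams(entries[word]).items():
            updated[gram] += beta * count / len(relevant_words)
    return updated


# Hypothetical toy dictionary: headword -> syllables of its definition.
entries = {
    "gang-qin": ["jian", "pan", "yue", "qi"],   # piano: keyboard instrument
    "yin-yue": ["sheng", "yin", "yi", "shu"],   # music: art of sound
    "dian-nao": ["ji", "suan", "ji", "qi"],     # computer: calculating machine
}
index = build_index(entries)

# Recognized syllables of a spoken "concept" query ("yue qi", instrument).
query = syllable_ngrams(["yue", "qi"])
results = retrieve(index, query)

# The user marks the first hit as relevant; the query is expanded and rerun.
expanded = feedback(query, entries, ["gang-qin"])
results_after_feedback = retrieve(index, expanded)
```

Matching on syllables instead of characters sidesteps homophone ambiguity in the recognizer output, at the cost of some false matches between unrelated words that share syllables, as the `dian-nao` hit on `qi` shows in this toy example.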