ISCA Archive Interspeech 2010

Unsupervised learning of vowels from continuous speech based on self-organized phoneme acquisition model

Kouki Miyazawa, Hideaki Kikuchi, Reiko Mazuka

All normal humans acquire their native phoneme systems naturally. However, it is unclear how infants learn the acoustic expression of each phoneme of their language. Recent studies have examined phoneme acquisition using computational models. However, these studies used read speech with a limited vocabulary as input and did not handle continuous speech. We therefore use natural speech to build a self-organization model that simulates this cognitive ability, and we analyze the information necessary for the acquisition of native vowels. Our model is designed to learn from natural continuous utterances and to estimate the number and boundaries of the vowel categories. In a simulation, we investigate the relationship between the amount of learning and vowel accuracy for the speech of a single Japanese speaker. The results show that the vowel recognition rate of our model is comparable to that of an adult.