All typically developing humans acquire their native phoneme system naturally. However, it remains unclear how infants learn the acoustic realization of each phoneme in their language. Recent studies have investigated phoneme acquisition using computational models. However, these studies used read speech with a limited vocabulary as input and did not handle continuous speech. We therefore use natural speech to build a self-organizing model that simulates this cognitive ability, and we analyze the information necessary for acquiring native vowels. Our model is designed to learn from natural continuous utterances and to estimate the number and boundaries of the vowel categories. In a simulation, we investigate the relationship between the amount of learning and vowel recognition accuracy for the speech of a single Japanese speaker. The results show that the vowel recognition rate of our model is comparable to that of an adult.