Experiments investigated the effects of training set size and diversity of speech data in training an HMM-based, speaker-independent, continuous Japanese speech recognition system. Two types of diversity were investigated: speaker diversity and phonetic diversity. The results indicate that greater amounts of training data improve recognition performance and that, given a fixed amount of training data, greater diversity of training materials, in terms of both speakers and phonetic contexts, improves recognition performance.