This paper describes a set of experiments using a speaker-independent, isolated-word recognition system over the telephone network. We are concerned with the sensitivity of recognition accuracy to a set of external factors. Specifically, we want to know how much performance degradation can be expected when no vocabulary-specific data are available for training. We also want to determine the relative importance of several factors in choosing a vocabulary-independent training set, including data collection paradigm, speaking mode, and recording environment. Our results indicate that recognition accuracy decreases appreciably in the absence of vocabulary-specific training data. If vocabulary-specific training data are not available, reasonable initial recognition performance can be achieved by training on a phonetically balanced corpus, preferably one consisting of isolated words recorded over the telephone network.