In this paper we present the Ensemble of Gaussian Mixture Localized Neural Networks (EGMLNNs), a model of the joint probability density of the input and output variables of a general mapping to be estimated. The model identifies clusters in the input data and thereby replaces one complex classifier with an ensemble of relatively simple classifiers, each localized to operate within its associated cluster. We present a maximum likelihood parameter estimation algorithm for this model based on Expectation Maximization (EM). Results reported on the TIMIT phone recognition task show that the model improves performance over a single complex classifier while also reducing the computational complexity required at test time.
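To make the overall idea concrete, the following is a minimal, illustrative sketch (not the paper's actual EM-based joint training): a Gaussian mixture "gate" partitions the input space, a small neural classifier is trained on each cluster, and test-time predictions combine the experts weighted by the gate's responsibilities. All names (`fit_gmm_ensemble`, `N_CLUSTERS`, the choice of scikit-learn components) are assumptions introduced here for illustration only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPClassifier

N_CLUSTERS = 4  # number of localized experts (assumed; tuned in practice)

def fit_gmm_ensemble(X, y, n_clusters=N_CLUSTERS):
    """Fit a GMM over the inputs and one small MLP per mixture component."""
    gate = GaussianMixture(n_components=n_clusters, random_state=0).fit(X)
    resp = gate.predict_proba(X)          # soft cluster responsibilities
    hard = resp.argmax(axis=1)            # hard assignment used for training
    classes = np.unique(y)
    experts = []
    for k in range(n_clusters):
        idx = hard == k                   # assumes each cluster is non-trivial
        clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                            random_state=0)
        clf.fit(X[idx], y[idx])           # classifier localized to cluster k
        experts.append(clf)
    return gate, experts, classes

def predict_gmm_ensemble(gate, experts, classes, X):
    """Combine expert posteriors weighted by the gate responsibilities."""
    resp = gate.predict_proba(X)          # (n_samples, n_clusters)
    out = np.zeros((X.shape[0], classes.size))
    for k, clf in enumerate(experts):
        p = clf.predict_proba(X)          # expert's class posteriors
        cols = np.searchsorted(classes, clf.classes_)
        out[:, cols] += resp[:, [k]] * p  # responsibility-weighted combination
    return classes[out.argmax(axis=1)]
```

The sketch uses a hard cluster assignment only to select training data for each expert, whereas the paper's EM procedure estimates all parameters jointly under the likelihood of the full model; the combination rule at prediction time, however, follows the same gate-weighted mixture-of-experts form.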