This paper introduces a mixture of auto-associative neural networks for speaker verification. The mixture is trained with a new objective function based on posterior probabilities of phoneme classes, which allows each component to model the part of the acoustic space corresponding to a broad phonetic class. The paper also proposes a way to apply factor analysis in this setting. The proposed techniques show promising results on a subset of the NIST-08 speaker recognition evaluation (SRE) and yield about 10% relative improvement when combined with the state-of-the-art Gaussian mixture model (GMM) i-vector system.
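
As an illustration of how phoneme-class posteriors can steer each mixture component toward its own region of the acoustic space, the following is a minimal sketch of one plausible posterior-weighted reconstruction loss. It is an assumption made for illustration, not the objective function defined in the paper: the function name `posterior_weighted_loss`, the squared-error criterion, and the array shapes are all hypothetical.

```python
import numpy as np


def posterior_weighted_loss(x, reconstructions, posteriors):
    """Hypothetical posterior-weighted reconstruction loss (illustrative only).

    x:               (T, D) acoustic feature frames
    reconstructions: (K, T, D) outputs of K auto-associative networks for x
    posteriors:      (T, K) posterior probability of each broad phonetic
                     class for each frame (assumed to be given)
    """
    # Squared reconstruction error of every component on every frame: (K, T)
    errors = np.sum((reconstructions - x[None, :, :]) ** 2, axis=2)
    # A component is penalized on a frame only in proportion to the
    # posterior of its phonetic class, then the result is averaged over frames.
    return float(np.sum(posteriors.T * errors) / x.shape[0])
```

Under this assumed form, a frame confidently assigned to one class contributes almost entirely to the error of the matching component, so each network is pushed to reconstruct only the frames of its own class.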