To build a speech recognition system customized for services in a particular domain, it is important to keep adding languages to the system's `supported languages' database. In this study, we collected speech data from a sample of the system's target population, namely users performing tasks related to agricultural commodities. We performed acoustic modelling of this data with a hybrid of a Deep Neural Network (DNN) and a Hidden Markov Model (HMM), in which the HMM state likelihoods are taken from the outputs of the DNN. Training proceeded in three stages: Restricted Boltzmann Machine (RBM) pre-training, frame-level cross-entropy training, and sequence training optimizing the MMI/sMBR criteria. After extensive experimentation, our system achieves an accuracy of about 82%. This study motivates further research on fine-tuning such systems.
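
To illustrate how HMM state likelihoods can be obtained from DNN outputs, the following is a minimal sketch of one common convention in hybrid DNN-HMM systems: the per-frame softmax posteriors are divided by the state priors to give scaled likelihoods. The abstract does not state the exact conversion used, so this scaling step, the function names, and the toy values are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn raw DNN output activations into posteriors P(state | frame)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_log_likelihoods(logits, state_priors):
    """Return log P(state | frame) - log P(state) for every HMM state.

    By Bayes' rule this equals log p(frame | state) up to a term that is
    constant across states, so it can stand in for the HMM emission score.
    Assumption: state_priors are relative state frequencies, e.g. counted
    from a frame-level alignment of the training data.
    """
    posteriors = softmax(logits)
    return [math.log(p) - math.log(q) for p, q in zip(posteriors, state_priors)]


# Toy example: one acoustic frame, three HMM states (hypothetical values).
frame_logits = [2.0, 0.5, -1.0]
state_priors = [0.5, 0.3, 0.2]
scores = scaled_log_likelihoods(frame_logits, state_priors)
```

A decoder would use `scores` in place of the emission log-probabilities of a conventional GMM-HMM; dividing by the prior keeps frequently occurring states from dominating the search.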