ISCA Archive Interspeech 2015

Development of Hindi speech recognition system of agricultural commodities using deep neural network

Partho Mandal, Shalini Jain, Gaurav Ojha, Anupam Shukla

For a speech recognition system customized for services in a particular domain, it is important to keep extending the set of languages the system supports. In this study, we collected speech data from a sample of the population the system is targeted at, i.e. users performing tasks related to agricultural commodities. We performed acoustic modelling of this data using a combination of a deep neural network (DNN) and a hidden Markov model (HMM), in which the HMM state likelihoods are taken from the outputs of the DNN. Training was performed in three stages: restricted Boltzmann machine (RBM) pre-training, frame-level cross-entropy training, and sequence training optimizing the MMI/sMBR criteria. After extensive experimentation, the accuracy of our system comes to about 82%. This study motivates further research on fine-tuning such systems.
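As an illustration of how HMM state likelihoods can be taken from DNN outputs, the sketch below follows the common hybrid convention of dividing the DNN state posteriors by the state priors to obtain scaled likelihoods. The abstract does not state that this exact conversion was used, so it is an assumption here; the function names and the numeric values are hypothetical and chosen only for the example.

```python
import math

def softmax(logits):
    """Turn raw DNN output activations into state posteriors P(state | frame)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_log_likelihoods(logits, state_priors):
    """Assumed hybrid convention: log p(frame | state) is taken, up to a
    constant that is the same for every state, as
    log P(state | frame) - log P(state)."""
    posteriors = softmax(logits)
    return [math.log(p) - math.log(q) for p, q in zip(posteriors, state_priors)]

# Hypothetical DNN outputs for one frame over three HMM states,
# and hypothetical state priors (e.g. estimated from alignment counts).
logits = [2.0, 0.5, -1.0]
priors = [0.5, 0.3, 0.2]
scores = scaled_log_likelihoods(logits, priors)
```

In a decoder, scores of this kind would stand in for the per-state emission log-likelihoods of the HMM for each frame.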