In this paper, deep neural networks are investigated for language identification
in Indian languages. Deep neural networks (DNNs) have recently been
proposed for this task; however, many of the architectural choices and training
aspects involved in building such systems have not been
studied carefully. We perform several experiments on a dataset consisting
of 12 Indian languages, with about 120 hours of training data in total,
to evaluate the effect of these choices.
While the DNN-based approach
is inherently frame-based, we propose an attention-based
DNN architecture for utterance-level classification, thereby
making efficient use of context. Models were evaluated
on 30 hours of test data, with 2.5 hours per language. In our
results, we find that deeper architectures outperform their shallower counterparts.
Moreover, the DNN with the attention mechanism outperforms regular DNN models,
indicating the effectiveness of attention.
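To make the idea concrete, the following is a minimal sketch of attention pooling over frame-level DNN features for utterance-level classification, not the paper's exact architecture. The AttentionPooling class, the layer sizes, and the names feat_dim and attn_dim are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Hypothetical sketch: collapse a sequence of frame-level features
    into a single utterance-level vector via learned attention weights."""
    def __init__(self, feat_dim: int, attn_dim: int = 64):
        super().__init__()
        # Small scoring network producing one relevance score per frame.
        self.score = nn.Sequential(
            nn.Linear(feat_dim, attn_dim),
            nn.Tanh(),
            nn.Linear(attn_dim, 1),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, num_frames, feat_dim)
        weights = torch.softmax(self.score(frames), dim=1)  # (batch, num_frames, 1)
        # Attention-weighted sum over frames -> one vector per utterance.
        return (weights * frames).sum(dim=1)                # (batch, feat_dim)

# Illustrative usage: pool frame features, then classify into 12 languages.
pool = AttentionPooling(feat_dim=256)
classifier = nn.Linear(256, 12)
frames = torch.randn(8, 300, 256)   # batch of 8 utterances, 300 frames each
logits = classifier(pool(frames))   # (8, 12) language scores
```

Unlike averaging or taking per-frame decisions independently, the learned weights let the model emphasize frames that are most discriminative for the language, which is one way an attention mechanism can exploit utterance-level context.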