ISCA Archive Odyssey 2014

Neural Network Bottleneck Features for Language Identification

Pavel Matejka, Le Zhang, Tim Ng, Ondrej Glembek, Jeff Ma, Bing Zhang, Sri Harish Mallidi

This paper presents the application of Neural Network Bottleneck (BN) features in Language Identification (LID). BN features are generally used for Large Vocabulary Speech Recognition in conjunction with conventional acoustic features, such as MFCC or PLP. We compare the BN features to several common types of acoustic features used in present-day state-of-the-art LID systems. The test set is from the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded radio communication channels. On this type of noisy data, we show that, on average, the BN features provide a 45% relative improvement in the Cavg or Equal Error Rate (EER) metrics across several test duration conditions, relative to our single best acoustic features.
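To make the reported metric concrete, the following is a minimal sketch of how an EER and a relative improvement could be computed from detection scores. It is not the authors' scoring tool: the function names, the threshold sweep, and the example numbers are all assumptions chosen for illustration, and Cavg is not covered.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: higher scores mean "more likely the target language".
    Returns the mean of miss and false-alarm rates at the threshold
    where the two rates are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (miss + fa) / 2.0
    return best_eer


def relative_improvement(baseline_error, new_error):
    """Fraction by which the error drops relative to the baseline."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical scores, not taken from the paper.
eer = equal_error_rate([2.0, 3.0, 4.0, 5.0], [0.0, 1.0, 2.5, 1.5])

# Hypothetical error rates: a drop from 10% to 5.5% is a 45% relative gain.
gain = relative_improvement(0.10, 0.055)
```

Under this definition, a 45% relative improvement means the error rate falls to 55% of its baseline value, not that it falls by 45 percentage points.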

doi: 10.21437/Odyssey.2014-45

Cite as: Matejka, P., Zhang, L., Ng, T., Glembek, O., Ma, J., Zhang, B., Mallidi, S.H. (2014) Neural Network Bottleneck Features for Language Identification. Proc. The Speaker and Language Recognition Workshop (Odyssey 2014), 299-304, doi: 10.21437/Odyssey.2014-45

  author={Pavel Matejka and Le Zhang and Tim Ng and Ondrej Glembek and Jeff Ma and Bing Zhang and Sri Harish Mallidi},
  title={{Neural Network Bottleneck Features for Language Identification}},
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2014)},