ISCA Archive Interspeech 2014

Language recognition using phonotactic-based shifted delta coefficients and multiple phone recognizers

Luis Fernando D'Haro, Ricardo Cordoba, Christian Salamea, Javier Ferreiros

A new language recognition technique is described that applies the principle of Shifted Delta Coefficients (SDC) to phone log-likelihood ratio (PLLR) features. The new methodology allows the incorporation of long-span phonetic information at a frame-by-frame level while dealing with the temporal length of each phone unit. The proposed features are used to train an i-vector based system, which is tested on the Albayzin LRE 2012 dataset. The results show a relative improvement of 33.3% in Cavg in comparison with different state-of-the-art acoustic i-vector based systems. The paper also presents the integration of parallel phone ASR systems, each of which generates multiple PLLR coefficients that are stacked together and then projected into a reduced dimension. Finally, the paper shows how incorporating state information from the phone ASR provides additional improvements, and how fusion with the other acoustic and phonotactic systems provides an improvement of 25.8% over the system presented during the competition.
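As an illustration of the idea in the abstract, the following is a minimal sketch of how shifted deltas could be computed over PLLR features. It is not the authors' implementation. It assumes the commonly used PLLR definition, log(p / (1 - p)) applied to frame-level phone posteriors, and the conventional SDC parameters d (delta spread), P (shift between blocks), and k (number of stacked blocks). The function names, the parameter values, and the edge-padding strategy are assumptions for illustration; the abstract does not state the configuration used in the paper.

```python
import numpy as np


def pllr(posteriors, eps=1e-10):
    """Phone log-likelihood ratios from frame-level phone posteriors.

    posteriors: array of shape (T, N), one row per frame, one column per phone.
    Uses the common definition log(p / (1 - p)); eps avoids log(0).
    """
    p = np.clip(posteriors, eps, 1.0 - eps)
    return np.log(p) - np.log1p(-p)


def sdc(features, d=1, P=3, k=7):
    """Shifted delta coefficients over a (T, N) feature matrix.

    For frame t and block i (0 <= i < k) the delta is
        features[t + i*P + d] - features[t + i*P - d]
    and the k blocks are stacked, giving an output of shape (T, N * k).
    Frames outside the utterance are filled by repeating the edge frames
    (an assumption; other boundary treatments are possible).
    """
    T, _ = features.shape
    back = d                   # largest backward offset
    fwd = (k - 1) * P + d      # largest forward offset
    padded = np.pad(features, ((back, fwd), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        # Original frame t sits at padded index t + back.
        plus = padded[i * P + 2 * d : i * P + 2 * d + T]
        minus = padded[i * P : i * P + T]
        blocks.append(plus - minus)
    return np.hstack(blocks)
```

With N phone units per frame, `sdc(pllr(posteriors))` yields N * k values per frame, so each frame carries information spanning roughly (k - 1) * P + 2 * d neighbouring frames. The resulting vectors would then serve as input to an i-vector front end, which is not shown here.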


doi: 10.21437/Interspeech.2014-610

Cite as: D'Haro, L.F., Cordoba, R., Salamea, C., Ferreiros, J. (2014) Language recognition using phonotactic-based shifted delta coefficients and multiple phone recognizers. Proc. Interspeech 2014, 3042-3046, doi: 10.21437/Interspeech.2014-610

@inproceedings{dharo14_interspeech,
  author={Luis Fernando D'Haro and Ricardo Cordoba and Christian Salamea and Javier Ferreiros},
  title={{Language recognition using phonotactic-based shifted delta coefficients and multiple phone recognizers}},
  year=2014,
  booktitle={Proc. Interspeech 2014},
  pages={3042--3046},
  doi={10.21437/Interspeech.2014-610},
  issn={2308-457X}
}