ISCA Archive Interspeech 2016

Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information

Yougen Yuan, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li

We assume that only word pairs identified by humans are available in a low-resource target language. The word pairs are parameterized by a bottleneck feature (BNF) extractor trained on transcribed data from a high-resource language. The cross-lingual BNFs of the word pairs are then used to train another neural network that generates a new feature representation in the target language. Pairwise learning of both frame-level and word-level feature representations is investigated. The proposed feature representations were evaluated on a word discrimination task using the Switchboard telephone speech corpus. The learned features bring a 27.5% relative improvement over the best previously reported result on the task.
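
To make the idea of pairwise learning concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the training objective, so the margin-based cosine hinge loss used here is an assumption, as are the function names, the margin value, and the toy 3-dimensional vectors standing in for word-level representations. The sketch only shows the general principle: a representation of a word should lie closer to another instance of the same word than to an instance of a different word.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pairwise_hinge_loss(anchor, same, different, margin=0.5):
    """Assumed objective: the loss is zero once the same-word pair is
    closer than the different-word pair by at least `margin`."""
    d_same = cosine_distance(anchor, same)
    d_diff = cosine_distance(anchor, different)
    return max(0.0, margin + d_same - d_diff)


# Hypothetical embeddings; a real system would obtain these from the
# network trained on cross-lingual BNFs.
anchor = [1.0, 0.0, 0.0]
same = [0.9, 0.1, 0.0]       # another instance of the same word
different = [0.0, 1.0, 0.0]  # an instance of a different word

loss_good = pairwise_hinge_loss(anchor, same, different)
loss_bad = pairwise_hinge_loss(anchor, different, same)
print(loss_good, loss_bad)
```

With these toy vectors the first call incurs no loss because the same-word pair is already well separated from the different-word pair, while the second call, with the roles swapped, is penalized. A frame-level variant would apply the same comparison to aligned frames rather than to whole-word vectors.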