ISCA Archive Interspeech 2016

Convolutional Neural Networks with Data Augmentation for Classifying Speakers’ Native Language

Gil Keren, Jun Deng, Jouni Pohjalainen, Björn Schuller

We use a feedforward Convolutional Neural Network to classify speakers’ native language for the INTERSPEECH 2016 Computational Paralinguistics Challenge Native Language Sub-Challenge. The network uses no features specialized for computational paralinguistics tasks, only MFCCs with their first- and second-order deltas. In addition, we augment the training data by replacing the original examples with shorter overlapping samples extracted from them, which multiplies the number of training examples by almost 40. With the augmented training dataset and enhancements to neural network models such as Batch Normalization, Dropout, and the Maxout activation function, we improve upon the challenge baseline by a large margin on both the development and the test set.
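The augmentation step described above, replacing each training example with shorter overlapping samples, can be illustrated with a minimal sketch. The function name `extract_overlapping_segments`, the segment length, the hop size, and the 39-coefficient frame size are all assumptions chosen for illustration; the abstract does not state these values, and this is not necessarily how the authors implemented the step.

```python
def extract_overlapping_segments(frames, segment_len, hop):
    """Cut a sequence of feature frames into overlapping fixed-length segments.

    frames      : list of per-frame feature vectors (e.g. MFCCs plus deltas)
    segment_len : number of frames per segment (hypothetical value below)
    hop         : frames between consecutive segment starts; hop < segment_len
                  makes neighbouring segments overlap
    """
    if segment_len <= 0 or hop <= 0:
        raise ValueError("segment_len and hop must be positive")
    last_start = len(frames) - segment_len
    # An utterance shorter than segment_len yields no segments.
    return [frames[s:s + segment_len] for s in range(0, last_start + 1, hop)]


# Hypothetical utterance: 1000 frames with 39 coefficients each (assumed size).
utterance = [[float(t)] * 39 for t in range(1000)]

# Assumed settings: 400-frame segments, a new segment every 100 frames.
segments = extract_overlapping_segments(utterance, segment_len=400, hop=100)

# Each segment would inherit the native-language label of its source utterance.
print(len(segments), len(segments[0]), len(segments[0][0]))
```

With these assumed settings one utterance becomes seven training samples. A smaller hop relative to the utterance length gives a larger multiplication factor, which is how a factor close to 40 could arise.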