ISCA Archive Interspeech 2015

Investigation of parametric rectified linear units for noise robust speech recognition

Sunil Sivadas, Zhenzhou Wu, Ma Bin

Convolutional neural networks with rectified linear units (ReLU) have been successful in speech recognition and computer vision tasks. ReLU was proposed as a better match to biological neural activation functions than sigmoidal non-linearities. However, ReLU has the disadvantage that its gradient is zero whenever the unit is inactive, so such units receive no updates during training. To alleviate the potential problems due to this zero gradient, the Leaky ReLU (LReLU) was proposed. Recently, a parametrized form of ReLU (PReLU) was shown to outperform ReLU on large-scale computer vision tasks. PReLU is a generalized version of LReLU in which the slope of the negative part is learned adaptively from the training data. In this paper we investigate PReLU-based deep convolutional neural networks for noise robust speech recognition. We report experimental results on the Aurora-4 multi-condition training task and show that PReLU gives slightly better Word Error Rates (WERs) on noisy test sets than ReLU. In combination with dropout regularization, we report one of the best WERs in the literature for this noisy speech recognition task.
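The relationship among the three activations can be made concrete with a short sketch. The following NumPy code is illustrative only, not the authors' implementation; the slope `a` in `prelu` is the parameter that PReLU learns from the training data, and `prelu_grads` shows how its gradient is obtained in the backward pass:

```python
import numpy as np

def relu(x):
    # Rectified linear unit: output and gradient are zero wherever x <= 0.
    return np.maximum(0.0, x)

def lrelu(x, alpha=0.01):
    # Leaky ReLU: a small, fixed slope alpha keeps a nonzero
    # gradient on the negative side.
    return np.where(x > 0, x, alpha * x)

def prelu(x, a):
    # PReLU: same form as Leaky ReLU, but the negative-side slope a
    # is a learnable parameter (shared or per channel).
    return np.where(x > 0, x, a * x)

def prelu_grads(x, a, grad_out):
    # Backward pass: gradient w.r.t. the input, and gradient w.r.t.
    # the slope a accumulated over the units that were negative.
    grad_x = np.where(x > 0, 1.0, a) * grad_out
    grad_a = np.sum(np.where(x > 0, 0.0, x) * grad_out)
    return grad_x, grad_a
```

With `a = 0` this reduces to ReLU and with a fixed small `a` to LReLU; PReLU instead updates `a` by gradient descent alongside the network weights.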