ISCA Archive Interspeech 2020

Multi-Scale Convolution for Robust Keyword Spotting

Chen Yang, Xue Wen, Liming Song

We propose a robust, small-footprint keyword spotting system for resource-constrained devices. The small footprint is achieved by using depthwise-separable convolutions in a ResNet framework. Noise robustness is achieved with a multi-scale ensemble of classifiers: each classifier specializes in a different view of the input, while heavy parameter sharing keeps the whole ensemble compact. Extensive experiments on the public Google Speech Commands dataset demonstrate the effectiveness of the proposed method.
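
To illustrate why depthwise-separable convolutions reduce the footprint, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise convolution followed by a pointwise (1x1) convolution. The layer sizes, the 1-D setting, and the function names are assumptions chosen only for illustration; the abstract does not give the configuration of the actual model.

```python
def standard_conv_params(c_in, c_out, kernel_size):
    """Weights in a standard 1-D convolution (bias ignored)."""
    return c_in * c_out * kernel_size


def separable_conv_params(c_in, c_out, kernel_size):
    """Weights in a depthwise-separable 1-D convolution (bias ignored)."""
    depthwise = c_in * kernel_size  # one filter per input channel
    pointwise = c_in * c_out        # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 64, 64, 9
standard = standard_conv_params(c_in, c_out, k)
separable = separable_conv_params(c_in, c_out, k)
print(f"standard: {standard}, separable: {separable}")
```

With these assumed sizes, the separable layer needs roughly one eighth of the weights of the standard layer, and the saving grows with the kernel size and the number of output channels.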