ISCA Archive Interspeech 2005

Focused word segmentation for ASR

Amarnag Subramanya, Jeff Bilmes, Chia-Ping Chen

We propose a new set of features based on the temporal statistics of the spectral entropy of speech, and we show why these features make good inputs for a speech detector. We also propose a back-end that uses the evidence from these features in a ‘focused’ manner. Recognition experiments show that this back-end leads to significant performance improvements, whereas merely appending the features to the standard feature vector does not improve performance. We also report a 10% average improvement in word error rate over our baseline for the highly mismatched case in the Aurora3.0 corpus.
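
As a rough illustration of what "temporal statistics of the spectral entropy" could look like in practice, the sketch below computes a per-frame spectral entropy and then its sliding-window mean and variance. The abstract does not specify the exact feature definition, so everything here is an assumption made for illustration: the function names `spectral_entropy` and `temporal_stats`, the frame length, hop size, Hamming window, context width, and the choice of mean and variance as the statistics. It is not the authors' implementation.

```python
import numpy as np


def spectral_entropy(signal, frame_len=400, hop=160, eps=1e-12):
    """Per-frame entropy of the normalized power spectrum.

    frame_len and hop are illustrative (25 ms / 10 ms at 16 kHz),
    not values taken from the paper.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    entropies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Treat the power spectrum as a probability mass function.
        p = power / (power.sum() + eps)
        entropies[i] = -np.sum(p * np.log(p + eps))
    return entropies


def temporal_stats(values, context=5):
    """Mean and variance over a sliding window of 2*context + 1 frames.

    Edge frames are handled by repeating the first and last values.
    The choice of statistics is an assumption for this sketch.
    """
    padded = np.pad(values, context, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, 2 * context + 1)
    return windows.mean(axis=1), windows.var(axis=1)
```

Under these assumptions, a flat, noise-like spectrum gives high entropy and a peaked, harmonic spectrum gives low entropy, which is the intuition behind using such a measure as evidence for a speech detector.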