ISCA Archive Interspeech 2011

Sinusoidal approach for the single-channel speech separation and recognition challenge

P. Mowlaee, R. Saeidi, Zheng-Hua Tan, M. G. Christensen, Tomi Kinnunen, P. Fränti, S. H. Jensen

Most single-channel speech separation (SCSS) systems use short-time Fourier transform (STFT) features as their parametric features. Recent studies have shown that employing sinusoidal features for SCSS results in high perceived speech quality. In this paper, we present a systematic study of automatic speech recognition results for an SCSS system that uses sinusoidal features composed of amplitude and frequency. We compare the speech recognition results with those already reported by other participants in the single-channel speech separation and recognition challenge. Our results show that a newly proposed system achieves an overall recognition accuracy of 52.3%, which lies at the median of the results of all other participants in the challenge.
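To illustrate what "sinusoidal features composed of amplitude and frequency" can look like for a single speech frame, the following is a minimal sketch that picks the strongest peaks of a windowed magnitude spectrum. It is not the estimator used in the paper, which the abstract does not describe; the function name sinusoidal_features, the Hann window, the simple peak picking, and the num_peaks parameter are all assumptions made for illustration.

```python
import numpy as np


def sinusoidal_features(frame, sample_rate, num_peaks=3):
    """Return (amplitudes, frequencies_hz) of the strongest spectral peaks.

    Hypothetical helper: simple peak picking on a Hann-windowed FFT,
    assumed here only to illustrate amplitude/frequency features.
    """
    frame = np.asarray(frame, dtype=float)
    window = np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(frame * window))

    # Local maxima: larger than the left neighbour, not smaller than the right.
    interior = magnitude[1:-1]
    is_peak = (interior > magnitude[:-2]) & (interior >= magnitude[2:])
    peak_bins = np.flatnonzero(is_peak) + 1

    # Keep the num_peaks strongest peaks, then order them by frequency.
    order = np.argsort(magnitude[peak_bins])[::-1][:num_peaks]
    strongest = np.sort(peak_bins[order])

    frequencies_hz = strongest * sample_rate / len(frame)
    # Dividing by the window sum makes a unit-amplitude sinusoid map to about 1.
    amplitudes = 2.0 * magnitude[strongest] / window.sum()
    return amplitudes, frequencies_hz
```

Calling sinusoidal_features(frame, 8000, num_peaks=2) on a 512-sample frame would return two amplitude values and the two corresponding frequencies in Hz. A practical system would likely refine the peak locations (for example by interpolating between bins), which this sketch omits.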