ISCA Archive Interspeech 2006

Vector Taylor series based joint uncertainty decoding

Haitian Xu, Luca Rigazio, David Kryze

Joint uncertainty decoding has recently achieved promising results by using front-end uncertainty in the back-end within a mathematically consistent framework. One drawback of the method is that it relies on stereo data or on numerical algorithms such as Data-driven Parallel Model Combination (DPMC), which are computationally expensive and difficult to deploy in real applications. We propose a Vector Taylor Series (VTS) approach to joint uncertainty decoding that provides a closed-form solution to the key problem of estimating the cross-covariance matrix between clean and noisy speech. Our solution requires neither stereo data nor numerical integration. We also propose a new strategy for dealing with singularity of the cross-covariance matrix. Experiments on Aurora2 show that VTS-based joint uncertainty decoding achieves accuracy similar to that of DPMC-based joint uncertainty decoding while being at least three times faster. Finally, when combined with our new strategy for cross-covariance singularity, VTS-based joint uncertainty decoding provides more than 2% absolute improvement.
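To illustrate what a closed-form, first-order VTS estimate of the clean/noisy cross-covariance can look like, here is a minimal sketch. It is not taken from the paper: it assumes the common log-spectral mismatch function y = x + log(1 + exp(n - x)) with additive noise only (no channel term, no cepstral transform), and the function name `vts_joint_stats` and its arguments are hypothetical. Under these assumptions the Jacobian G = dy/dx is diagonal and the cross-covariance is approximately G Σx.

```python
import math


def vts_joint_stats(mu_x, sigma_x, mu_n, sigma_n):
    """First-order VTS approximation of clean/noisy joint statistics.

    Assumption (not from the paper): log-spectral mismatch function
    y = x + log(1 + exp(n - x)), linearised around (mu_x, mu_n).
    mu_* are lists of length D; sigma_* are D x D nested lists.
    """
    dim = len(mu_x)
    # Diagonal of the Jacobian dy/dx at the expansion point.
    g = [1.0 / (1.0 + math.exp(mu_n[i] - mu_x[i])) for i in range(dim)]
    # Noisy-speech mean: mismatch function evaluated at the means.
    mu_y = [mu_x[i] + math.log1p(math.exp(mu_n[i] - mu_x[i]))
            for i in range(dim)]
    # Noisy-speech covariance: G Sx G^T + (I - G) Sn (I - G)^T.
    sigma_y = [[g[i] * g[j] * sigma_x[i][j]
                + (1.0 - g[i]) * (1.0 - g[j]) * sigma_n[i][j]
                for j in range(dim)] for i in range(dim)]
    # Cross-covariance Cov(y, x) = G Sx (row i scaled by g[i]).
    sigma_yx = [[g[i] * sigma_x[i][j] for j in range(dim)]
                for i in range(dim)]
    return mu_y, sigma_y, sigma_yx


# Example: speech and noise at equal level (0 dB SNR), so g = 0.5.
mu_y, sigma_y, sigma_yx = vts_joint_stats(
    mu_x=[0.0, 0.0],
    sigma_x=[[2.0, 0.5], [0.5, 1.0]],
    mu_n=[0.0, 0.0],
    sigma_n=[[1.0, 0.0], [0.0, 1.0]],
)
```

In this sketch the entries of G move toward 1 at high SNR, so the cross-covariance approaches the clean covariance, and toward 0 at low SNR, so the cross-covariance shrinks toward zero. No stereo data or sampling is involved; every quantity follows directly from the clean and noise model parameters.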