ISCA Archive Eurospeech 2003

Missing feature theory applied to robust speech recognition over IP network

Toshiki Endo, Shingo Kuroiwa, Satoshi Nakamura

This paper addresses the problems involved in performing speech recognition over mobile and IP networks. The main problem is the loss of speech data caused by packet loss in the network. We present two missing-feature-based approaches that compensate for lost regions of speech data: one reconstructs the missing frames, and the other uses marginal distributions. For comparison, we also use a tacking method, which recognizes only received data. We evaluate these approaches under two packet loss models, a random loss model and a Gilbert loss model. The results show that the marginal-distributions-based approach is the most effective in packet loss environments; the degradation of word accuracy is only 5% when the packet loss rate is 30%, and only 3% when the mean burst loss length is 24 frames.
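
The two packet loss models named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' parameterization, so the following are assumptions: one lost unit corresponds to one frame, the Gilbert model is the common two-state Markov chain (a "good" state that delivers frames and a "bad" state that drops them), and it is parameterized by a target loss rate and a mean burst length. The function names `random_loss`, `gilbert_loss`, and `mean_burst_length` are hypothetical.

```python
import random


def random_loss(n_frames, loss_rate, rng):
    """Independent loss: each frame is dropped with probability loss_rate.

    Returns a list of booleans; True marks a lost frame.
    """
    return [rng.random() < loss_rate for _ in range(n_frames)]


def gilbert_loss(n_frames, loss_rate, mean_burst, rng):
    """Two-state Markov (Gilbert) loss; True marks a lost frame.

    Assumed parameterization (not taken from the paper):
      q = P(bad -> good) = 1 / mean_burst
      p = P(good -> bad), chosen so the stationary loss rate
          p / (p + q) equals loss_rate.
    """
    q = 1.0 / mean_burst
    p = q * loss_rate / (1.0 - loss_rate)
    if not (0.0 < q <= 1.0 and 0.0 <= p <= 1.0):
        raise ValueError("loss_rate and mean_burst are not jointly feasible")

    # Draw the initial state from the stationary distribution.
    lost = rng.random() < loss_rate
    mask = []
    for _ in range(n_frames):
        mask.append(lost)
        if lost:
            lost = rng.random() >= q  # remain in the bad state with prob 1 - q
        else:
            lost = rng.random() < p   # enter the bad state with prob p
    return mask


def mean_burst_length(mask):
    """Average length of consecutive runs of lost frames (0.0 if none)."""
    bursts, run = [], 0
    for lost in mask:
        if lost:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return sum(bursts) / len(bursts) if bursts else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    for name, mask in [
        ("random", random_loss(100_000, 0.3, rng)),
        ("gilbert", gilbert_loss(100_000, 0.3, 4.0, rng)),
    ]:
        rate = sum(mask) / len(mask)
        print(f"{name}: loss rate {rate:.3f}, "
              f"mean burst {mean_burst_length(mask):.2f} frames")
```

At the same overall loss rate, the Gilbert mask concentrates losses into longer bursts than the random mask, which is why the two models are reported with different figures of merit (loss rate versus mean burst loss length).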