ISCA Archive Interspeech 2009

Profiling large-vocabulary continuous speech recognition on embedded devices: a hardware resource sensitivity analysis

Kai Yu, Rob A. Rutenbar

When deployed in embedded systems, speech recognizers are necessarily reduced from the large-vocabulary continuous speech recognizers (LVCSR) found on desktops or servers to fit the limited hardware. However, embedded hardware continues to evolve in capability; today’s smartphones are vastly more powerful than their recent ancestors. This raises a new question: which hardware features, not currently found on today’s embedded platforms but potentially available as add-ons to tomorrow’s devices, are most likely to improve recognition performance? Said differently, what is the sensitivity of the recognizer to fine-grained details of the embedded hardware resources? To answer this question rigorously and quantitatively, we offer results from a detailed study of LVCSR performance as a function of micro-architecture options on an embedded ARM11 and an enterprise-class Intel Core2Duo. We estimate speed and energy consumption, and show, feature by feature, how hardware resources impact recognizer performance.