ISCA Archive Interspeech 2006

A study on detection based automatic speech recognition

Chengyuan Ma, Yu Tsao, Chin-Hui Lee

We propose a new approach to automatic speech recognition based on word detection and knowledge-based verification. Given an utterance, we first design a collection of word detectors, one for each lexical item in the vocabulary. Some pruning strategies are used to eliminate unlikely word candidates. Then these detected words are combined into word strings. The proposed approach is different from the conventional maximum a posteriori decoding method, and it is a critical component in building a bottom-up, detection-based speech recognition system in which knowledge in acoustics, speech and language can easily be incorporated into pruning unlikely word hypotheses and rescoring. The proposed approach was evaluated on a connected digit task using phone models trained from the TIMIT corpus. When compared with state-of-the-art connected digit recognition algorithms, we found the proposed detection-based framework works well even though no digit samples were used for training the detectors and recognizers. Other knowledge-based constraints, such as manner and place of articulation detectors, can be incorporated into this detection-based approach to improve the robustness and performance of the overall system.
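The three stages named in the abstract (detect words, prune unlikely candidates, combine the survivors into a word string) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Detection` record, the score threshold used for pruning, and the rule of choosing the non-overlapping set of detections with the highest total score are all assumptions made for illustration, and the example detections are made-up values rather than results from the paper.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Detection(NamedTuple):
    """Hypothetical output of one word detector."""
    word: str
    start: int    # start frame (inclusive)
    end: int      # end frame (exclusive)
    score: float  # detector confidence; higher is better


def prune(dets: List[Detection], threshold: float) -> List[Detection]:
    """Assumed pruning rule: drop candidates scoring below a threshold."""
    return [d for d in dets if d.score >= threshold]


def combine(dets: List[Detection]) -> List[str]:
    """Assumed combination rule: choose non-overlapping detections with
    the highest total score (weighted interval scheduling)."""
    dets = sorted(dets, key=lambda d: d.end)
    ends = [d.end for d in dets]
    # best[i] = (total score, word list) using only the first i detections
    best = [(0.0, [])]
    for i, d in enumerate(dets):
        # count of earlier detections that end at or before d.start
        j = bisect_right(ends, d.start, 0, i)
        take = (best[j][0] + d.score, best[j][1] + [d.word])
        skip = best[i]
        best.append(take if take[0] > skip[0] else skip)
    return best[-1][1]


# Made-up candidate detections for a three-digit utterance.
candidates = [
    Detection("one", 0, 30, 0.90),
    Detection("nine", 5, 32, 0.40),
    Detection("oh", 20, 45, 0.20),
    Detection("two", 30, 60, 0.80),
    Detection("eight", 58, 90, 0.70),
    Detection("three", 60, 95, 0.85),
]
word_string = combine(prune(candidates, threshold=0.3))
print(" ".join(word_string))
```

In a system of the kind the abstract describes, the pruning and rescoring steps would draw on acoustic, speech and language knowledge rather than a single score threshold; the threshold here only marks where such knowledge would enter.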

doi: 10.21437/Interspeech.2006-104

Cite as: Ma, C., Tsao, Y., Lee, C.-H. (2006) A study on detection based automatic speech recognition. Proc. Interspeech 2006, paper 2053-Thu1CaP.13, doi: 10.21437/Interspeech.2006-104

  author={Chengyuan Ma and Yu Tsao and Chin-Hui Lee},
  title={{A study on detection based automatic speech recognition}},
  booktitle={Proc. Interspeech 2006},
  pages={paper 2053-Thu1CaP.13},