ISCA Archive Eurospeech 1999

Selection for acoustic coverage from unlimited speech extracted from closed-captioned TV

Photina Jaeyun Jang, Alexander G. Hauptmann

Given unlimited amounts of speech training data, it is desirable to predict informative subsets that will still improve the resulting acoustic model. We present a triphone frequency threshold measure for predicting informative subsets from vast amounts of speech. Results with single-pass decoding show that acoustic models built from our selection-based speech set perform better than models trained on similar amounts of non-selected speech, and perform similarly to models built from the original, larger amount of speech.
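
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one way a triphone frequency threshold could drive selection, not the authors' implementation. It assumes each utterance is already available as a phone sequence, and it keeps an utterance only if it contains at least one triphone seen fewer than `threshold` times among the utterances selected so far. The names `triphones`, `select_utterances`, and `threshold` are hypothetical and chosen for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Triphone = Tuple[str, ...]


def triphones(phones: Sequence[str]) -> List[Triphone]:
    """Return every (left, center, right) window over a phone sequence."""
    return [tuple(phones[i:i + 3]) for i in range(len(phones) - 2)]


def select_utterances(utterances: Sequence[Sequence[str]], threshold: int) -> List[int]:
    """Greedy selection sketch (an assumption, not the paper's exact method).

    An utterance is kept if at least one of its triphones has been seen
    fewer than `threshold` times in the utterances selected so far.
    Returns the indices of the selected utterances, in input order.
    """
    counts: Counter = Counter()
    selected: List[int] = []
    for idx, phones in enumerate(utterances):
        tris = triphones(phones)
        # Keep the utterance only if it still adds under-represented triphones.
        if any(counts[t] < threshold for t in tris):
            selected.append(idx)
            counts.update(tris)
    return selected


if __name__ == "__main__":
    # Toy phone sequences; real input would come from aligned transcripts.
    data = [
        ["k", "ae", "t"],
        ["k", "ae", "t"],
        ["k", "ae", "t"],
        ["d", "ao", "g"],
    ]
    print(select_utterances(data, threshold=2))
```

With a threshold of 2, the third repetition of the same phone sequence is skipped because its only triphone has already reached the threshold, while the utterance with unseen triphones is kept. A larger threshold admits more data per triphone; a smaller one yields a more aggressively pruned subset.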