ISCA Archive ICSLP 2002

Training topic classifiers for conversational speech with limited data

Rukmini Iyer, Jeffrey Ma, Herbert Gish, Owen Kimball

In this paper, we demonstrate how automatically generated transcriptions can be used to develop an effective topic classification application. Two key contributions of our work are (a) investigating the impact of unsupervised transcriptions on topic classification when the transcription system has been trained on very limited amounts of data, and (b) demonstrating that mixture language models significantly improve topic classification performance.