ISCA Archive Eurospeech 1995

New telephone speech corpora at CSLU

Ronald A. Cole, M. Noel, T. Lander, T. Durham

The Center for Spoken Language Understanding (CSLU) collects, annotates and distributes telephone speech data to enable research in spoken language understanding and automatic language identification. This paper gives a brief overview of recent activities in pursuit of this mission. We summarize corpus development activities at CSLU and describe new corpora useful for research on specific tasks: alphabet recognition, numbers recognition, large vocabulary word recognition, and yes/no recognition. We then discuss our two newest data collection efforts, Cellular Speech and the 22-Language Telephone Speech Corpus. All CSLU corpora are available at no charge to academic institutions.