This paper describes a log-linear modeling framework suitable for large-scale speech recognition tasks. We introduce the modifications to our training procedure that are required to extend our previous work on log-linear models to larger tasks. We give a detailed description of the training procedure, focusing on the aspects that affect computational efficiency. The performance of our approach is evaluated on the English Quaero corpus, a challenging broadcast conversations task. The log-linear model consistently outperforms the maximum likelihood baseline system and achieves performance comparable to that of a system with minimum-phone-error training.
Index Terms: acoustic modeling, discriminative models