ISCA Archive SUS 1995

Stressed speech classification with application to robust speech recognition

Brian D. Womack, John H. L. Hansen

It is well known that variability in speech production due to task-induced stress contributes significantly to loss in speech recognition performance. If an algorithm could be formulated to estimate the speech stress condition, that knowledge could be integrated to improve the robustness of speech recognizers in adverse conditions. In this paper, the problem of automatic stressed speech recognition is addressed. The primary goal is to formulate a tandem HMM and neural-network-based algorithm for stress-independent recognition. To motivate an effective stress classifier, an analysis is performed of speech produced across eleven stress conditions (e.g. Angry, Clear, Fast, Lombard, Loud, Slow, and Soft). Features that differentiate stress are employed, using a previously established stressed speech database (SUSAS) with 11 speakers. Two neural network algorithms are formulated to estimate a speech stress condition probability vector (with classification rates on the order of 58-100%). The stress classification output probability vector is used to weight the outputs of a codebook of stress-dependent HMM recognizers to generate an improved overall recognition score. It is suggested that this approach will accommodate the intra-speaker variability due to task-induced stress in adverse conditions.
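
The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact combination rule, so the linear weighted sum below is an assumption, and the names `combine_scores` and `recognize`, the array shapes, and the example numbers are all hypothetical rather than taken from the paper.

```python
import numpy as np


def combine_scores(stress_probs, hmm_scores):
    """Weight stress-dependent recognizer scores by a stress probability vector.

    stress_probs: shape (S,), one estimated probability per stress condition.
    hmm_scores:   shape (S, W), score of each of W candidate words under each
                  of the S stress-dependent recognizers.
    Returns a length-W vector of combined scores.
    Assumption: a plain weighted sum; the paper may use a different rule.
    """
    p = np.asarray(stress_probs, dtype=float)
    scores = np.asarray(hmm_scores, dtype=float)
    if p.ndim != 1 or scores.ndim != 2 or scores.shape[0] != p.shape[0]:
        raise ValueError("expected stress_probs (S,) and hmm_scores (S, W)")
    total = p.sum()
    if total <= 0:
        raise ValueError("stress_probs must have a positive sum")
    p = p / total  # renormalize in case the classifier output does not sum to 1
    return p @ scores


def recognize(stress_probs, hmm_scores, vocabulary):
    """Return the highest-scoring word and the combined score vector."""
    combined = combine_scores(stress_probs, hmm_scores)
    return vocabulary[int(np.argmax(combined))], combined


# Hypothetical example: two stress conditions, two candidate words.
word, combined = recognize(
    stress_probs=[0.8, 0.2],
    hmm_scores=[[0.1, 0.9], [0.7, 0.3]],
    vocabulary=["help", "break"],
)
```

In this sketch the recognizer matched to the more probable stress condition dominates the final decision, while the other recognizers still contribute in proportion to their estimated probability.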


Cite as: Womack, B.D., Hansen, J.H.L. (1995) Stressed speech classification with application to robust speech recognition. Proc. ESCA/NATO Workshop on Speech under Stress, 41-44

@inproceedings{womack95_sus,
  author={Brian D. Womack and John H. L. Hansen},
  title={{Stressed speech classification with application to robust speech recognition}},
  year=1995,
  booktitle={Proc. ESCA/NATO Workshop on Speech under Stress},
  pages={41--44}
}