Speech recognition in adverse environments: the role of human mediation

Jan Noyes, Chris Baber

Automatic Speech Recognition (ASR) technology allows human speech signals to be used to carry out pre-set activities: upon detection and recognition of a sound or string of sounds, the recogniser can be programmed to execute pre-determined actions. Consequently, many groups of individuals have benefited from using ASR in human-machine interaction, in human-to-human communication, and as a means of control in their immediate environments. It is generally recognised that environmental factors play a major role in influencing the use of ASR technology, and applications can be expected to be more successful in environments which do not impede the attainment of acceptable recognition performance.

In terms of extending the use of speech recognition technology to so-called 'adverse environments', an immediate solution might be to build recognisers robust enough to cope with environmental stressors, and indeed this is happening. However, the characteristics of the technology represent only one side of the interaction process, and do not take into account the role of human mediation. The influence of environmental factors can be shaped by human mediation, i.e. by decisions and efforts made by the user to counter the possible effects of different environmental factors. The conclusion drawn here is that although the research effort has focused on implementing mediation through the technology, there are also many benefits to be gained from considering the user's role when interacting with ASR technology.

Cite as: Noyes, J., Baber, C. (1995) Speech recognition in adverse environments: the role of human mediation. Proc. ESCA/NATO Workshop on Speech under Stress, 17-20

