A specific background noise, speaker, or transmission line condition under which a speech recognizer operates is referred to as an environment. A mismatch between the training and operating environments can severely degrade recognition accuracy. We present a base transformation method for environment adaptation, which converts an environmental difference into a base difference and reduces that difference by a base transformation. Experiments were conducted on adapting to telephone-quality speech, to a new speaker, and to speech corrupted by additive Gaussian noise. Using two sentences (5 sec duration) as adaptation data, the method gives a telephone-line-adapted recognition accuracy of 93.5% and a speaker-adapted accuracy of about 90% on a city-name recognition task. Using nine sentences (20 sec duration) with SNRs better than 10 dB, a noise-adapted recognition accuracy of 90% was obtained on a 206-word recognition task.
Keywords: environment adaptation, base transformation, noisy speech recognition