ISCA Archive Eurospeech 1997

A speech pre-processing technique for end-point detection in highly non-stationary environments

Rafael Martinez, Agustin Alvarez, Vilda Pedro Gomez, Mercedes Perez, Victor Nieto, Victoria Rodellar

Determining the precise moments at which speech begins and ends is an important problem in ASR. As shown in [1], small deviations from the optimum beginning and ending points cause a large decrease in recognition accuracy. The presence of noise [2] [3] is an added problem, especially when its level is high (around 95 dB in the case of this work) and its characteristics are highly non-stationary, since it can produce false detections (more probable when the noise includes speech sounds). For this reason, in such conditions it is important to have a pre-processing stage that removes as much noise as possible and provides clues that help to build an end-point detector for those environments. The method presented here offers a pre-processing technique for highly noisy and non-stationary environments which, while enhancing the speech, yields an equalised version of the SNR improvement (the Mean Spectral Energy Difference). Its main characteristic is that large differences in noise level are reduced to a small ripple, while the presence of speech is marked by a large decrease in this Mean Spectral Energy Difference. Following this technique, any end-point detection approach (explicit, implicit or hybrid [3]) may render acceptable results.


doi: 10.21437/Eurospeech.1997-112

Cite as: Martinez, R., Alvarez, A., Gomez, V.P., Perez, M., Nieto, V., Rodellar, V. (1997) A speech pre-processing technique for end-point detection in highly non-stationary environments. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 1111-1114, doi: 10.21437/Eurospeech.1997-112

@inproceedings{martinez97_eurospeech,
  author={Rafael Martinez and Agustin Alvarez and Vilda Pedro Gomez and Mercedes Perez and Victor Nieto and Victoria Rodellar},
  title={{A speech pre-processing technique for end-point detection in highly non-stationary environments}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={1111--1114},
  doi={10.21437/Eurospeech.1997-112},
  issn={1018-4074}
}