Perceptual Evaluation of Speech Quality (PESQ) [1] is an instrumental model for estimating speech quality. It provides fairly good quality estimates for narrowband transmission. The wideband version of PESQ (WB-PESQ [2]) delivers estimates of wideband (WB) transmission quality. In contrast to PESQ, WB-PESQ shows more pronounced differences between estimated and auditory MOS scores. Based on several subjective tests, the detailed analysis of estimated and auditory MOS scores provided in this paper reveals two problems of WB-PESQ: (1) the model underestimates the quality of wideband hybrid speech coders such as G.722.2 [3] and the recently standardized G.729.1 [4]; (2) WB-PESQ rates male and female talkers differently, underestimating the quality for female talkers. A description of the psychoacoustic model of WB-PESQ, and of how speech signals are transformed at its different stages, shows where these problems originate. In particular, WB-PESQ overestimates the degradation due to noise in hybrid coders. A modified WB-PESQ implemented on the basis of this observation yields reliable speech quality estimates that agree better with auditory results.
[1] ITU-T Recommendation P.862, Perceptual Evaluation of Speech Quality (PESQ), an Objective Method for End-to-End Speech Quality Assessment of Narrowband Telephone Networks and Speech Codecs, International Telecommunication Union, CH-Geneva, 2001.
[2] ITU-T Recommendation P.862.2, Wideband Extension to Recommendation P.862 for the Assessment of Wideband Telephone Networks and Speech Codecs, International Telecommunication Union, CH-Geneva, 2005.
[3] ITU-T Recommendation G.722.2, Wideband Coding of Speech at Around 16 kbit/s Using Adaptive Multi-Rate Wideband (AMR-WB), International Telecommunication Union, CH-Geneva, 2003.
[4] ITU-T Recommendation G.729.1, G.729 Based Embedded Variable Bit-rate Coder: An 8-32 kbit/s Scalable Wideband Coder Bitstream Interoperable with G.729, International Telecommunication Union, CH-Geneva, 2006.
[5] Côté, N., Qualité perçue de parole transmise par voie téléphonique large-bande, Master's thesis, Université Pierre et Marie Curie, FR-Paris, 2005.