ISCA Archive Interspeech 2007
ISCA Archive Interspeech 2007

Context constrained-generalized posterior probability for verifying phone transcriptions

Hua Zhang, Lijuan Wang, Frank K. Soong, Wenju Liu

A new statistical confidence measure, Context Constrained- Generalized Posterior probability (CC-GPP), is proposed for verifying phone transcriptions in speech databases. Different from generalized posterior probability (GPP), CC-GPP is computed by considering string hypotheses that bear a focused phone with partially matched left and right contexts. Parameters used for CC-GPP include context window length, a minimal number of matched context phones, and verification thresholds. They are determined by minimizing verification errors in a development set. Evaluated on a test set of 500 sentences that consist of 2.1% phone errors, CCGPP achieves 99.6% accuracy and 78.7% recall when 90% of the phones are accepted.