ISCA Archive IWSLT 2006

Recent results on MT evaluation in the GALE program

Salim Roukos

We will give an overview of the first-year evaluation results of the GALE program, in which evaluation is based on human post-editing of MT system output. A post-editor edits the MT system output until it conveys the same meaning as the "Gold" reference. The error metric, the Human Translation Error Rate (HTER), is the number of edits performed by the post-editor, normalized by the length of the "Gold" reference. We will report on the sensitivity and stability of the new HTER metric for evaluating MT systems. We will also compare how well various automated metrics (BLEU, TER) correlate with HTER.
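
To make the HTER definition concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the GALE scoring tool: the function names `word_edit_distance` and `hter` are hypothetical, tokenization is plain whitespace splitting, and edits are counted as word-level insertions, deletions, and substitutions only. The edit counting used in an actual evaluation may differ (TER-style scoring, for example, also counts block shifts).

```python
def word_edit_distance(hyp: str, ref: str) -> int:
    """Word-level Levenshtein distance (insert, delete, substitute).

    Assumption: whitespace tokenization and no block shifts, which is a
    simplification of how edits may be counted in a real evaluation.
    """
    h, r = hyp.split(), ref.split()
    # prev[j] holds the distance between the first i-1 words of h
    # and the first j words of r.
    prev = list(range(len(r) + 1))
    for i, hw in enumerate(h, 1):
        curr = [i]
        for j, rw in enumerate(r, 1):
            cost = 0 if hw == rw else 1
            curr.append(min(prev[j] + 1,          # delete hw
                            curr[j - 1] + 1,      # insert rw
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def hter(mt_output: str, post_edited: str, gold_reference: str) -> float:
    """Edits made by the post-editor, normalized by the Gold reference length."""
    gold_len = len(gold_reference.split())
    if gold_len == 0:
        raise ValueError("gold reference must contain at least one word")
    return word_edit_distance(mt_output, post_edited) / gold_len
```

For example, if the post-editor inserts one word into a five-word MT output and the "Gold" reference has six words, `hter` returns 1/6.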