Latency is one of the main challenges in the task of simultaneous spoken
language translation. While significant improvements in recent years
have led to high-quality automatic translations, their usefulness in
real-time settings is still severely limited due to the large delay
between the input speech and the delivered translation.
In this paper, we present a novel scheme that drastically reduces the
latency of a large-scale speech translation system. Within this scheme,
the transcribed
text and its translation can be updated when more context is available,
even after they are presented to the user. This allows
us to display an initial transcript and its translation to the user
with very low latency. If necessary, both transcript and translation
can later be updated to better, more accurate versions until eventually
the final versions are displayed. Using this framework, we are able
to reduce the latency of the source language transcript by half.
For the translation, we achieve an average delay of 3.3 s, which is
more than twice as fast as our initial system.
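
The update mechanism described above can be illustrated with a minimal sketch. This is not the system's actual implementation: the class `RevisableDisplay`, its `update` and `render` methods, and the idea of addressing displayed text by an integer segment id are all hypothetical names and assumptions introduced here for illustration. The sketch only shows the general idea that a low-latency initial hypothesis may be shown at once and later overwritten by a better one until a final version arrives.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class RevisableDisplay:
    """Text currently shown to the user, keyed by segment id (hypothetical)."""

    segments: Dict[int, str] = field(default_factory=dict)
    final: Set[int] = field(default_factory=set)

    def update(self, segment_id: int, text: str, is_final: bool = False) -> bool:
        """Show a new segment or revise an existing one.

        Returns True if the displayed text changed. Segments already marked
        final are never overwritten (an assumption of this sketch).
        """
        if segment_id in self.final:
            return False
        changed = self.segments.get(segment_id) != text
        self.segments[segment_id] = text
        if is_final:
            self.final.add(segment_id)
        return changed

    def render(self) -> str:
        """Join all segments in order of their ids."""
        return " ".join(self.segments[i] for i in sorted(self.segments))


if __name__ == "__main__":
    display = RevisableDisplay()
    display.update(0, "hello word")                   # early, low-latency hypothesis
    display.update(0, "hello world", is_final=True)   # later revision with more context
    print(display.render())
```

The same structure could hold either the transcript or its translation; in this sketch a revision simply replaces the earlier text for the same segment.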