Simultaneous speech translation attempts to reduce the delay inherent in speech translation by beginning translation before an explicit sentence boundary is reached. Despite best efforts, these systems still often face a trade-off between speed and accuracy, with lower-delay systems also achieving lower accuracy. However, somewhat surprisingly, no previous work has examined the relative importance of speed and accuracy, so given two systems that differ in both speed and accuracy, it is difficult to say with certainty which is better. In this paper, we take the first steps towards evaluating simultaneous speech translation systems in consideration of both speed and accuracy. We collect user evaluations of speech translation results with different levels of accuracy and delay, and use these data to learn the parameters of an evaluation measure that can judge the trade-off between these two factors. Based on these results, we find that considering both accuracy and delay in the evaluation of speech translation results helps improve correlation with human judgements, and that users placed higher relative importance on reducing delay when results were presented through text rather than speech.