The evaluation of scientific submissions through peer review is both the most fundamental component of the publication process and the most frequently criticised and questioned. Academic journals and conferences request reviews from multiple reviewers per submission, which an editor or area chair aggregates into the final acceptance decision. Reviewers often disagree due to varying levels of domain expertise, confidence and motivation, as well as heavy workloads and differing interpretations of the score scale. Herein, we explore the possibility of a computational decision-support tool for the editor, based on Natural Language Processing, that offers an additional aggregated recommendation. We provide a comparative study of state-of-the-art text modelling methods on a newly crafted dataset of Interspeech 2019 reviews, the largest review dataset of its kind, and we are the first to explore uncertainty-aware methods (soft labels, quantile regression) to address the subjectivity inherent in this problem.
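To make the quantile-regression idea mentioned above concrete, the sketch below shows a minimal pinball (quantile) loss in PyTorch, which trains a score predictor towards a chosen quantile of the review-score distribution rather than only its mean. This is an illustrative example under assumed names and values (`pinball_loss`, the example scores and quantiles are hypothetical), not the implementation used in the paper.

```python
import torch

def pinball_loss(pred: torch.Tensor, target: torch.Tensor, quantile: float) -> torch.Tensor:
    """Quantile (pinball) loss: penalises over- and under-prediction
    asymmetrically, so minimising it yields the requested quantile of
    the target distribution instead of its mean."""
    err = target - pred
    return torch.mean(torch.maximum(quantile * err, (quantile - 1.0) * err))

# Hypothetical usage: three prediction heads for the 0.25, 0.5 and 0.75
# quantiles of a paper's aggregated review score.
preds = torch.tensor([3.1, 3.4, 3.8])   # illustrative model outputs
target = torch.tensor(3.5)              # illustrative aggregated review score
loss = sum(pinball_loss(p, target, q)
           for p, q in zip(preds, (0.25, 0.5, 0.75)))
```

Predicting several quantiles rather than a single point estimate gives the editor an interval that reflects reviewer disagreement, which is the motivation for the uncertainty-aware methods studied here.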