Speech Quality Assessment (SQA) without reference signals has garnered attention due to its wide range of applications. Current SQA methods often rely on the Mean Squared Error (MSE) loss to approximate human subjective ratings. However, MSE treats all deviations from the ground truth symmetrically, ignoring both their direction and the relative quality distinctions among speech samples; as a result, predictions learned through MSE exhibit limited correlation with subjective ratings. This paper introduces a novel approach that leverages these relative quality distinctions. By enforcing relative ranking with Pairwise and Triplet Ranking Losses, our method encourages the SQA model to learn not only the absolute quality of individual speech samples but also their quality relative to one another, addressing the limitations of MSE-based approaches. Additionally, we propose pretraining the SQA encoder on an ASR task to improve generalization. Experiments on the NISQA test sets confirm the effectiveness of our approach.
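To make the ranking idea concrete, the following is a minimal sketch, in PyTorch, of how margin-based pairwise and triplet ranking terms can be combined with an MSE term on a batch of predicted and ground-truth MOS values. This is an illustration under assumed hinge-style formulations; the function names, margins, and weighting coefficients are placeholders rather than the paper's exact recipe.

```python
# Minimal sketch (PyTorch), assuming margin-based hinge formulations; the
# margins, weights, and helper names are illustrative assumptions, not the
# paper's exact losses.
import torch
import torch.nn.functional as F


def pairwise_ranking_loss(pred, mos, margin=0.1):
    """Hinge loss over all pairs: if mos_i > mos_j, require pred_i > pred_j + margin."""
    diff_pred = pred.unsqueeze(1) - pred.unsqueeze(0)   # pred_i - pred_j
    diff_mos = mos.unsqueeze(1) - mos.unsqueeze(0)      # mos_i - mos_j
    sign = torch.sign(diff_mos)                         # +1 if i should outrank j
    loss = F.relu(margin - sign * diff_pred)            # penalize violated orderings
    mask = (diff_mos.abs() > 0).float()                 # ignore ties and the diagonal
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)


def triplet_ranking_loss(pred, mos, margin=0.1):
    """For triplets ordered by ground-truth MOS, require predicted gaps to respect the ordering by a margin."""
    order = torch.argsort(mos, descending=True)          # sort batch by ground-truth quality
    p = pred[order]
    if p.numel() < 3:
        return pred.new_tensor(0.0)
    anchor, positive, negative = p[:-2], p[1:-1], p[2:]  # consecutive items form triplets
    return (F.relu(margin - (anchor - positive)).mean()
            + F.relu(margin - (positive - negative)).mean())


# Example: combine MSE with the two ranking terms for one training batch.
pred = torch.randn(8, requires_grad=True)   # stand-in for SQA model outputs
mos = torch.rand(8) * 4 + 1                 # stand-in ground-truth MOS in [1, 5]
loss = (F.mse_loss(pred, mos)
        + 1.0 * pairwise_ranking_loss(pred, mos)   # weights are illustrative
        + 1.0 * triplet_ranking_loss(pred, mos))
loss.backward()
```

In this sketch the MSE term anchors predictions to the absolute rating scale, while the pairwise and triplet terms only penalize incorrect or insufficiently separated orderings, which is the complementary signal the abstract argues MSE alone cannot provide.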