Contextual Automatic Speech Recognition (ASR) requires scalable and accurate retrieval of content relevant to the user’s context. This paper presents a comparative study of two independent context retrieval methods: sequence-level and segment-level scoring. Evaluated on datasets with up to 100k phrases, both methods achieve high retrieval recall. Notably, segment-level scoring achieves 75.6% recall over 100k entities. When each method is further integrated with ASR through joint training, both yield significant improvements over non-biased ASR, with word error rate (WER) reductions of up to 36% with 2k entities and 28% with 100k entities. This comparative analysis offers guidance for selecting a context retrieval technique that delivers scalable and accurate performance in contextual ASR applications.