In recent years, large language model (LLM) based voice assistants have risen in popularity. A practical question in evaluating the cascaded automatic speech recognition (ASR) systems of LLM-powered voice assistants is how to determine whether errors in ASR transcriptions will result in task failures for the downstream assistant. Thus, evaluating ASR systems with metrics that reflect the voice assistant's perception and judgement becomes increasingly important. In this paper, we propose novel evaluation metrics that leverage the same assistant LLM to project ASR hypotheses into a vector space and compute their semantic distances with respect to the references. We perform experiments on a curated OpenAssistant test set and demonstrate that our proposed methods, which use semantic embeddings computed from LLMs, outperform conventional metrics in evaluating ASR performance for LLM-based voice assistants.
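To make the core idea concrete, below is a minimal sketch (not the authors' implementation) of scoring an ASR hypothesis by its embedding distance to the reference. The `embed` callable stands in for whatever maps text to a vector with the assistant LLM; `toy_embed`, `asr_semantic_score`, and `cosine_distance` are hypothetical names used purely for illustration.

```python
import numpy as np


def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Semantic distance as 1 - cosine similarity between two embeddings."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def asr_semantic_score(embed, reference: str, hypothesis: str) -> float:
    """Embed both texts with a given `embed: str -> np.ndarray` callable
    (e.g. derived from the assistant LLM) and return their semantic distance."""
    return cosine_distance(embed(reference), embed(hypothesis))


if __name__ == "__main__":
    # Toy stand-in embedder: averages random per-token vectors.
    # A real system would obtain embeddings from the assistant LLM itself.
    rng = np.random.default_rng(0)
    vocab_vectors: dict[str, np.ndarray] = {}

    def toy_embed(text: str) -> np.ndarray:
        vecs = []
        for tok in text.lower().split():
            if tok not in vocab_vectors:
                vocab_vectors[tok] = rng.normal(size=64)
            vecs.append(vocab_vectors[tok])
        return np.mean(vecs, axis=0)

    ref = "set a timer for ten minutes"
    hyp = "set a timer for ten minute"
    print(f"semantic distance: {asr_semantic_score(toy_embed, ref, hyp):.3f}")
```

Under this framing, a hypothesis with a high word error rate but near-zero semantic distance would still be judged usable by the downstream assistant, which is the distinction conventional metrics miss.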