Spoken question answering (SQA), the task of retrieving relevant information from speech recordings, has become a topic of great interest due in part to the recent growth of large language models. However, many state-of-the-art SQA systems have memory and compute costs that scale linearly with the length of the input audio context, making them infeasible to apply to long speech recordings (meetings, podcasts, etc.). This paper proposes a method that uses deep Q-learning to learn a policy for skipping irrelevant segments of a long audio file without analyzing them, enabling more efficient SQA. In this framework, an agent model is trained on BERT sentence embeddings extracted from lightweight ASR transcripts to decide how far it can safely skip forward through an audio file in order to move closer to the target answer span. Applied to the CORAAL QA dataset, this work approaches the performance of state-of-the-art SQA systems while using less than half of the compute to analyze the audio.
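
To make the skip policy concrete, the following is a minimal sketch, not the authors' implementation, of a DQN-style agent in PyTorch. The network names (`SkipQNetwork`, `traverse`), the 768-dimensional embedding size, the hidden-layer width, and the discrete set of skip sizes are all illustrative assumptions; the abstract does not specify these details.

```python
# A minimal sketch of a Q-network for skip decisions over an ASR transcript.
# State: concatenation of the question embedding and the embedding of the
# sentence at the agent's current position. Action: how far to jump forward.
import torch
import torch.nn as nn

SKIP_ACTIONS = [1, 2, 4, 8]  # hypothetical skip sizes, in sentences

class SkipQNetwork(nn.Module):
    """Maps [question embedding; current sentence embedding] to one
    Q-value per candidate skip size."""
    def __init__(self, embed_dim: int = 768, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, len(SKIP_ACTIONS)),
        )

    def forward(self, question_emb: torch.Tensor,
                sentence_emb: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([question_emb, sentence_emb], dim=-1))

def traverse(q_net: SkipQNetwork,
             question_emb: torch.Tensor,
             sentence_embs: torch.Tensor,
             max_steps: int = 100) -> list[int]:
    """Greedy inference: repeatedly jump forward by the argmax skip size.
    Only the visited sentences would incur downstream analysis cost;
    the stopping rule here is simplified to a step budget."""
    pos, visited = 0, []
    for _ in range(max_steps):
        if pos >= len(sentence_embs):
            break
        visited.append(pos)
        with torch.no_grad():
            q_values = q_net(question_emb, sentence_embs[pos])
        pos += SKIP_ACTIONS[q_values.argmax().item()]
    return visited
```

Under these assumptions, the compute savings come from `traverse` visiting only a subset of transcript sentences: every sentence the policy jumps over is never analyzed by the downstream answer-extraction model.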