Measures of speech intelligibility are an essential tool for diagnosing hearing impairment and for tuning hearing aid parameters. This study explores the potential of automatic speech recognition (ASR) for conducting autonomous listening tests. In such tests (here, the Oldenburg sentence matrix test), the responses of participants are usually logged by a human supervisor. The target value is the speech reception threshold (SRT), i.e., the signal-to-noise ratio at which 50% speech intelligibility is achieved. We investigate what ASR error rates can be obtained for such responses and how ASR errors affect the measured SRT. To this end, a speech database was recorded that contains utterances from 20 speakers and covers different levels of language complexity, ranging from simple five-word sentences to utterances as produced in typical human-human interactions during testing. For the most complex speech material, the achievable SRT accuracy was not satisfactory. For sentences without out-of-vocabulary words, however, the ASR error rate was below 1.3%, which is sufficient to obtain a test-retest reliability of 0.5 dB, identical to the reliability of human-supervised tests.