In this paper, we propose an improved approach to spoken term detection based on pseudo-relevance feedback. Acoustic models that are mismatched to spoken utterances produced under different acoustic conditions may give relatively poor recognition output. To remedy this problem, we integrate the relevance scores derived from the lattices with DTW distances computed in the feature space of MFCC parameters or phonetic posteriorgrams. These DTW distances are computed with respect to a carefully selected set of pseudo-relevant utterances, which are obtained from the first-pass returned list given by the search engine. The utterances on the first-pass returned list are then reranked accordingly and finally shown to the user. Very encouraging performance improvements were obtained in preliminary experiments, especially when the acoustic models are poorly matched to the spoken utterances.
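The reranking idea can be illustrated with a minimal sketch. It is not the exact procedure of this paper and rests on several assumptions: the pseudo-relevant set is simply taken to be the top `top_k` entries of the first-pass list, the two scores are combined by a linear interpolation with a hypothetical `weight`, and each utterance is represented as a list of feature frames (for example MFCC vectors or posteriorgram frames). The function names `dtw_distance` and `rerank` are illustrative only.

```python
import math


def dtw_distance(a, b):
    """Length-normalized DTW distance between two sequences of feature frames."""
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] is the cheapest cost of aligning a[:i] with b[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_cost = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            acc[i][j] = frame_cost + min(
                acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1]
            )
    return acc[n][m] / (n + m)


def rerank(first_pass, features, top_k=2, weight=0.5):
    """Rerank a first-pass list of (utterance_id, lattice_score) pairs.

    Assumption: the top_k first-pass results form the pseudo-relevant set,
    and the final score linearly interpolates the lattice score with the
    negated mean DTW distance to that set.
    """
    ranked = sorted(first_pass, key=lambda item: item[1], reverse=True)
    pseudo_relevant = [utt for utt, _ in ranked[:top_k]]
    rescored = []
    for utt, lattice_score in ranked:
        refs = [p for p in pseudo_relevant if p != utt]
        if refs:
            mean_dist = sum(
                dtw_distance(features[utt], features[p]) for p in refs
            ) / len(refs)
            similarity = -mean_dist
        else:
            similarity = 0.0  # no other pseudo-relevant utterance to compare with
        rescored.append((utt, (1 - weight) * lattice_score + weight * similarity))
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Toy example with one-dimensional frames (hypothetical data).
features = {
    "a": [[0.0], [0.0], [0.0]],
    "b": [[0.0], [0.0]],
    "c": [[5.0], [5.0], [5.0]],
}
first_pass = [("a", 0.9), ("c", 0.8), ("b", 0.7)]
reranked = rerank(first_pass, features, top_k=1)
```

In this toy example, utterance "b" is acoustically close to the pseudo-relevant utterance "a" and therefore moves ahead of "c", even though "c" had the higher first-pass score.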