This paper proposes a novel approach to address late reverberation, which arises from the convolution of clean speech with a room impulse response and degrades speech intelligibility. Our method combines metric-guided training and non-uniform state sampling within the Stochastic Regeneration Model (StoRM) diffusion architecture, enabling better modeling of diffusion variability while maintaining computational efficiency. Key metrics, namely STFT loss, spectral convergence loss, Mel-Frequency Cepstral Coefficient (MFCC) loss, and log-magnitude loss, guide the regeneration process, improving convergence by reducing training epochs by ~19.6% with slight improvements in dereverberation. Meanwhile, the non-uniform state sampling approach improves training convergence by ~27.2% with practically equivalent perceptual performance. We evaluate the impact of these modifications on automatic speech recognition and clean-speech distortion relative to the baseline, demonstrating optimal speech-quality-aware performance.
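The metric-guided objective summarized above can be sketched as a weighted sum of spectral terms. The sketch below, in NumPy, is illustrative only: the STFT parameters, unit weights, and helper names are assumptions rather than the paper's exact configuration, and the MFCC term is omitted for brevity since it would require a mel-filterbank and DCT stage.

```python
import numpy as np

def stft_mag(x, n_fft=512, hop=128):
    """Magnitude STFT via framing + rFFT (illustrative parameters)."""
    window = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * window
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=-1))

def metric_guided_loss(est, ref, eps=1e-8):
    """Composite spectral loss: STFT magnitude, spectral convergence,
    and log-magnitude terms, summed with assumed unit weights."""
    E, R = stft_mag(est), stft_mag(ref)
    # L1 distance between magnitude spectrograms
    l_stft = np.mean(np.abs(E - R))
    # Spectral convergence: Frobenius-norm ratio of the residual
    l_sc = np.linalg.norm(R - E) / (np.linalg.norm(R) + eps)
    # L1 distance between log-magnitude spectrograms
    l_logmag = np.mean(np.abs(np.log(E + eps) - np.log(R + eps)))
    return l_stft + l_sc + l_logmag
```

Identical signals yield zero loss, while any spectral deviation (e.g. residual reverberation) increases every term, which is what lets these metrics steer the regeneration stage.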