In this paper, we develop improved schemes for simultaneous speech interpolation and demodulation based on continuous-time models. These schemes lead to robust algorithms that estimate the instantaneous amplitudes and frequencies of the speech resonances and extract novel acoustic features for automatic speech recognition (ASR). The continuous-time models retain the excellent time resolution of the energy separation algorithms (ESAs) based on discrete energy operators, while performing better in the presence of noise. We also introduce a robust ESA-based algorithm for amplitude compensation of the filtered signals. Furthermore, we use the robust nonlinear modulation features to augment the classic cepstrum-based features and apply the augmented feature set to ASR. The ASR experiments provide promising evidence that the robust modulation features improve recognition.
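
For orientation, the discrete baseline mentioned above can be sketched as follows. This is a minimal illustration of the standard discrete Teager-Kaiser energy operator and the DESA-2 energy separation algorithm (ESA), not the continuous-time scheme developed in this paper. The function names `teager_kaiser` and `desa2`, the `eps` guard, and the test tone are assumptions introduced here for illustration only.

```python
import numpy as np


def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns an array two samples shorter than x (the end points are dropped).
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def desa2(x, eps=1e-12):
    """DESA-2 estimates of instantaneous amplitude and frequency.

    Frequency is returned in radians per sample. The estimates cover
    samples 2 .. N-3 of the input, so the output has length N - 4.
    `eps` is a hypothetical guard against non-positive energies.
    """
    x = np.asarray(x, dtype=float)
    y = x[2:] - x[:-2]                    # symmetric difference x[n+1] - x[n-1]
    psi_x = teager_kaiser(x)[1:-1]        # trim so it aligns with psi_y
    psi_y = teager_kaiser(y)
    psi_x = np.maximum(psi_x, eps)        # noise can drive the energy <= 0
    psi_y = np.maximum(psi_y, eps)
    cos_arg = np.clip(1.0 - psi_y / (2.0 * psi_x), -1.0, 1.0)
    freq = 0.5 * np.arccos(cos_arg)
    amp = 2.0 * psi_x / np.sqrt(psi_y)
    return amp, freq


if __name__ == "__main__":
    n = np.arange(200)
    tone = 0.8 * np.cos(0.3 * n + 0.5)    # illustrative test signal
    amp, freq = desa2(tone)
    print(amp.mean(), freq.mean())
```

For a noiseless sinusoid with frequency between 0 and π/2 radians per sample, these estimates are exact up to rounding. With additive noise the energy operator outputs can become negative or very small, which is the kind of failure that motivates more robust formulations.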