Spectro-temporal feature extraction has shown its robustness in the field of speech recognition. However, these features are derived from magnitude spectrum of the complex Fourier Transform (FT). In this work, we investigate to see if phase information can substitute magnitude based spectro-temporal features. We compared with different state of art phase spectrum and evaluated its performance. The experiments are carried out in different noisy environments. We found Modified Group Delay (MODGD) spectrum to closely resemble the structure of power spectrum. A relative performance difference of 0.03% on average is observed for the MODGD spectro-temporal features compared to the magnitude based features. The analysis showed that phase can indeed carry equivalent or complementary information to magnitude based spectro-temporal features.