In this paper we propose a theoretical framework to extend classical continuous density HMM in order to consider different spectral representations depending on the state. We stress the need for a reference space and for spectral transformations between the model spectral representation spaces and the reference space. We show that this framework permits to obtain more precise pdfs in the reference space. Preliminary speech recognition experiments for two spectral representations MFCC and linear frequency scale cepstral coefficients show no improvements ; however they identify that the choice of the spectral representations is crucial and the determination of the spaces transformations is a complex problem.