We report results on vowel and stop consonant recognition with tokens extracted from the TIMIT database. Our current system differs from others doing similar tasks in that we do not use any specific time normalization techniques. We use a very detailed biologically motivated input representation of the speech tokens - Lyon's cochlear model as implemented by Slaney. This detailed, high dimensional representation, known as a cochleagram, is classified by either a backpropagation or by a hybrid supervised/unsupervised neural network classifier. The hybrid network is composed of a biologically motivated unsupervised network and a supervised back-propagation network. This approach produces results comparable to those obtained by others without the addition of time normalization.