By now it should not be surprising that high-performance speech recognition systems can be designed for a wide variety of tasks in many different languages. This success is mainly attributed to the use of powerful statistical pattern-matching paradigms coupled with the availability of large amounts of task-specific language and speech training data. However, it is also well known that such high performance cannot be maintained when the testing data do not resemble the training data. The speech distortion usually appears as a combination of various acoustic differences, but the exact form of the distortion is often unknown and difficult to model. One way to reduce such acoustic mismatches is to adjust the speech features according to some model of the differences. Another is to modify the parameters of the statistical models, e.g., hidden Markov models, so that the modified models better characterize the distorted speech features. Depending on the knowledge used, this family of feature and model compensation techniques can be roughly divided into three classes: (1) training-based compensation, (2) blind compensation, and (3) structure-based compensation. This paper provides an overview of the capabilities and limitations of these compensation approaches and illustrates their similarities and differences. The relationship between adaptation and compensation is also discussed.