Spoken language understanding usually requires manually labeled data, such as domain, intent, and slot labels, to train classifiers. Starting with some manually labeled data, we propose a data generation approach that augments the training set with synthetic data sampled from a joint distribution over input queries and output labels. We model this joint distribution with a recurrent neural network and sample from it to produce synthetic data for classifier training. Evaluated on ATIS and on live logs of Cortana, a Microsoft voice personal assistant, the approach shows consistent performance improvements in domain classification, intent classification, and slot tagging across multiple languages.
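
To make the idea of sampling (query, label) pairs from a joint distribution concrete, the following is a minimal sketch and not the model described above. It swaps the recurrent neural network for a label-conditioned bigram count model, factoring the joint as p(label) times the product of p(word | label, previous word), so that it runs with the standard library alone. The seed queries, the intent names `flight` and `airfare`, and the helpers `fit_joint_model` and `sample_pair` are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict

BOS, EOS = "<s>", "</s>"  # sentence boundary markers (illustrative)

def fit_joint_model(pairs):
    """Count label frequencies and label-conditioned bigram transitions.

    Stand-in for the RNN: p(label) * prod_i p(w_i | label, w_{i-1}).
    """
    label_counts = Counter()
    transitions = defaultdict(Counter)  # (label, prev_word) -> Counter(next_word)
    for query, label in pairs:
        label_counts[label] += 1
        seq = [BOS] + query.split() + [EOS]
        for prev, nxt in zip(seq, seq[1:]):
            transitions[(label, prev)][nxt] += 1
    return label_counts, transitions

def draw(counter, rng):
    """Sample one key from a Counter in proportion to its count."""
    return rng.choices(list(counter), weights=list(counter.values()))[0]

def sample_pair(model, rng, max_len=20):
    """Sample a label first, then the query word by word given that label."""
    label_counts, transitions = model
    label = draw(label_counts, rng)
    words, prev = [], BOS
    while len(words) < max_len:
        nxt = draw(transitions[(label, prev)], rng)
        if nxt == EOS:
            break
        words.append(nxt)
        prev = nxt
    return " ".join(words), label

# Hypothetical seed set standing in for the manually labeled data.
seed = [
    ("show flights from boston to denver", "flight"),
    ("list flights from denver to seattle", "flight"),
    ("show fares from boston to seattle", "airfare"),
    ("what are the fares to denver", "airfare"),
]

model = fit_joint_model(seed)
rng = random.Random(0)
synthetic = [sample_pair(model, rng) for _ in range(5)]
augmented = seed + synthetic  # training set for a downstream classifier
for query, label in synthetic:
    print(label, "|", query)
```

Because every sampled word is conditioned on the sampled label, each synthetic query arrives with its label attached and can be appended directly to the training set. A count-based model like this one can only recombine word transitions it has already seen for a given label, which is one reason a neural sequence model is the more natural choice in practice.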