Abundant speech data for speech emotion recognition (SER) is often unlabeled, making it unusable for supervised model training. Models trained on existing labeled datasets struggle on such data because of mismatched data distributions. To avoid the cost of annotating speech data, it is imperative to explore unsupervised adaptation techniques that exploit the potential of unlabeled data. Motivated by this observation, we propose a novel use of voice conversion (VC) for SER, which effectively improves emotion recognition performance on an unlabeled dataset. Our approach leverages the simplicity and efficacy of the k-nearest neighbor (kNN)-based VC technique to transform speech samples from the unlabeled domain into the labeled domain. In contrast to conventional domain adaptation methods, our approach avoids re-training a model on the transformed unlabeled data: we achieve good results simply by evaluating the transformed unlabeled samples with a model trained on a different labeled dataset.
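
To make the core mechanism concrete, the following is a minimal sketch of the frame-level matching step at the heart of kNN-based VC: each feature frame of an unlabeled-domain utterance is replaced by the average of its k nearest neighbors drawn from a pool of labeled-domain frames. The function name, the cosine-similarity metric, and the NumPy implementation are illustrative assumptions, not the paper's exact pipeline; in practice the features would typically come from a self-supervised encoder (e.g., WavLM) and the converted features would be vocoded back to a waveform before being passed to the pre-trained SER model.

```python
import numpy as np

def knn_convert(src_feats: np.ndarray, pool_feats: np.ndarray, k: int = 4) -> np.ndarray:
    """Replace each source frame with the mean of its k nearest neighbors
    (by cosine similarity) taken from the labeled-domain feature pool.

    src_feats:  (n_src, dim)  frames of one unlabeled-domain utterance
    pool_feats: (n_pool, dim) frames pooled from the labeled domain
    """
    # L2-normalize so dot products equal cosine similarity.
    src = src_feats / np.linalg.norm(src_feats, axis=1, keepdims=True)
    pool = pool_feats / np.linalg.norm(pool_feats, axis=1, keepdims=True)
    sims = src @ pool.T                          # (n_src, n_pool) similarities
    nn_idx = np.argsort(-sims, axis=1)[:, :k]    # indices of the k best matches
    # Average the un-normalized pool frames selected for each source frame.
    return pool_feats[nn_idx].mean(axis=1)

# Hypothetical usage: transform an unlabeled-domain utterance toward the
# labeled domain, then score it with the already-trained SER model
# (no re-training of the model is involved).
src_feats = np.random.randn(200, 1024)   # stand-in for encoder features
pool_feats = np.random.randn(5000, 1024)
converted = knn_convert(src_feats, pool_feats, k=4)
```

Because the conversion is training-free and non-parametric, the only cost of adapting a new unlabeled domain is building the feature pool from the labeled dataset and running nearest-neighbor search at inference time.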