Due to the scarcity of sentiment corpora in many languages, recent studies have proposed cross-lingual sentiment analysis, which transfers sentiment analysis models from resource-rich languages to low-resource ones. However, existing models rely heavily on code-switched sentences to reduce the alignment discrepancy of cross-lingual embeddings, and are therefore limited by the inherent constraints of such sentences. In this paper, we propose a novel method dubbed SOUL (short for Softmix and Multiview learning) to enhance zero-shot cross-lingual sentiment analysis. Instead of using the embeddings of code-switched sentences directly, SOUL first mixes them softly with the embeddings of the original sentences. Furthermore, SOUL employs multi-view learning to encourage contextualized embeddings to align in a refined language-invariant space. Experimental results on four cross-lingual benchmarks across five languages clearly verify the effectiveness of our proposed SOUL.
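
The abstract does not give the exact Softmix formula, but soft mixing of two embedding views is commonly realized as a mixup-style interpolation with a Beta-sampled coefficient. The sketch below illustrates that idea under those assumptions; the function name `softmix` and the `alpha` parameter are illustrative, not taken from the paper.

```python
import numpy as np

def softmix(orig_emb, cs_emb, alpha=0.2, rng=None):
    """Mixup-style soft mixing of original and code-switched embeddings.

    A coefficient lam ~ Beta(alpha, alpha) interpolates the two views, so the
    model sees a continuum between the original sentence and its
    code-switched counterpart instead of the code-switched view alone.
    This is an illustrative sketch, not the paper's exact formulation.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    return lam * orig_emb + (1.0 - lam) * cs_emb

# Toy embeddings of shape (sequence_length, hidden_dim)
orig = np.ones((4, 8))     # stand-in for original-sentence embeddings
cs = np.zeros((4, 8))      # stand-in for code-switched embeddings
mixed = softmix(orig, cs)
```

With these toy inputs every entry of `mixed` equals the sampled coefficient, so the output lies strictly between the two views rather than coinciding with either one.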