Text style transfer is the task of converting the style of a text while preserving its content. Content preservation remains challenging when text style transfer models are trained on non-parallel data. We improve the content preservation of text style transfer using a labeled non-parallel corpus, targeting styles of interest for text-to-speech synthesis. We propose a content word storage mechanism that explicitly preserves "content words," and incorporate it into a conditional variational autoencoder that captures style information from the labeled non-parallel corpus. We conducted bidirectional style transfer experiments on Japanese text, with "disfluency removal/insertion" and "standard/Kansai dialect conversion" as the target styles. Automatic and human evaluations showed that 1) the proposed method improved content preservation without compromising other aspects of performance, and 2) its performance differed depending on the direction of style transfer.
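As a rough illustration of the conditional variational autoencoder setup mentioned above, the following is a minimal PyTorch-style sketch of a sequence CVAE conditioned on a style label. The class name, layer sizes, and GRU-based layout are illustrative assumptions, and the proposed content word storage mechanism itself is not reproduced here.

```python
# Minimal sketch of a conditional VAE for labeled non-parallel style transfer.
# StyleCVAE and all hyperparameters are hypothetical; the content word storage
# mechanism of the paper is NOT implemented in this sketch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StyleCVAE(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512, z_dim=64, n_styles=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.style_embed = nn.Embedding(n_styles, z_dim)   # style label embedding
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, z_dim)
        self.to_logvar = nn.Linear(hid_dim, z_dim)
        self.z_to_h = nn.Linear(2 * z_dim, hid_dim)         # condition on [z; style]
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src, tgt, style):
        # Encode the source sentence into a latent code z (content representation).
        _, h = self.encoder(self.embed(src))                # h: (1, B, hid_dim)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

        # Decode conditioned on the latent code and the target style label.
        cond = torch.cat([z, self.style_embed(style)], dim=-1)
        h0 = torch.tanh(self.z_to_h(cond)).unsqueeze(0)
        dec_out, _ = self.decoder(self.embed(tgt[:, :-1]), h0)
        logits = self.out(dec_out)

        # Standard CVAE objective: token reconstruction loss + KL regularizer.
        rec = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                              tgt[:, 1:].reshape(-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kl
```

At transfer time, one would encode a sentence of the source style and decode with the target style label, so that the latent code carries content while the label carries style; this reflects the general CVAE formulation rather than the specific architecture used in the paper.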