This paper presents an approach to multi-lingual speech corpus design, data collection, and phonetic annotation for text-to-speech (TTS) system development. Under a uniform data structure, speech corpora for more than 10 languages and dialects share language-independent data management approaches and data processing procedures. A specifically defined super phonetic symbol set is used for all languages and related dialects. The defined data management methods enable Motorola's multi-lingual TTS systems to employ a uniform architecture for the cost-function-based unit selection strategy and the speech synthesizer modules on both server-based and embedded platforms.

Keywords: Multi-lingual, TTS, speech corpus.