This paper introduces TALP, a speech-to-speech statistical machine translation system developed at the TALP Research Center (Barcelona, Spain). TALP generates translations by searching for the best-scoring path through a finite-state transducer (FST) that models an Xgram of the bilingual language defined by tuples. A detailed description of the system and the core processes used to train it from a parallel corpus are presented. Results on the Chinese-English supplied task of the International Workshop on Spoken Language Translation (IWSLT'04) Evaluation Campaign are shown and discussed.