This paper briefly introduces VOTIRS 2.0, a Chinese spoken dialog system for accessing travel information. Through this system, users can query travel information about 52 routes and then complete a transaction for a suitable travel plan. The strategy and method used to understand spontaneous speech in the system are discussed in detail. To understand spontaneous speech, a hierarchical Semantic Constituent Spotting and Assembling (SCoSA) model is proposed. It is a semantics-driven, multi-phase parsing process similar to the way humans understand speech. The model is efficient in parsing spontaneous speech.
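To make the spotting-then-assembling idea concrete, the following is a minimal two-phase sketch in Python. It is an illustration under stated assumptions, not the SCoSA implementation: the constituent labels, the regular-expression patterns, the English example utterance, and the function names `spot_constituents`, `assemble`, and `understand` are all hypothetical, and the "later mention overrides earlier mention" rule is only one plausible way to handle self-corrections in spontaneous speech.

```python
import re

# Hypothetical constituent inventory for illustration only; the actual
# semantic constituents used by the system are not specified here.
PATTERNS = {
    "destination": re.compile(r"\bto (Beijing|Xi'an|Guilin)\b", re.IGNORECASE),
    "duration_days": re.compile(r"\b(\d+)[- ]days?\b", re.IGNORECASE),
    "party_size": re.compile(r"\bfor (\d+) (?:people|persons)\b", re.IGNORECASE),
}


def spot_constituents(utterance):
    """Phase 1 (spotting): find semantic constituents and ignore the rest.

    Fillers, hesitations, and ungrammatical fragments are skipped because
    only the spans matching a constituent pattern are kept.
    Returns (start_offset, label, value) tuples in utterance order.
    """
    spotted = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(utterance):
            spotted.append((match.start(), label, match.group(1)))
    return sorted(spotted)


def assemble(spotted):
    """Phase 2 (assembling): merge spotted constituents into one frame.

    Assumption: when a slot is mentioned twice, the later mention wins,
    which treats the second mention as a self-correction.
    """
    frame = {}
    for _, label, value in spotted:
        frame[label] = value
    return frame


def understand(utterance):
    """Run both phases on a single utterance."""
    return assemble(spot_constituents(utterance))
```

For example, `understand("uh I want a trip to Guilin, um, no, to Beijing, 3 days, for 2 people")` yields a frame whose destination is `"Beijing"`, since the corrected mention comes later in the utterance. Separating the two phases keeps the spotting step tolerant of disfluencies while leaving slot-level decisions to the assembling step.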