We investigate features that reflect utterance structure and disfluency profile in order to improve the automated scoring of spontaneous speech responses by non-native speakers of English. In previous work, several features derived from structural events (SEs), e.g., clause structure and disfluencies, showed promisingly high correlations with human proficiency scores, both when the SEs were human-annotated and when they were automatically detected on speech transcriptions. However, the usefulness of these SE-derived features on ASR hypotheses remained unknown. In this paper, we report on the detection of SEs from noisy ASR output and the application of the detected SEs to automated speech scoring. We find that clause boundary (CB) detection is degraded much less by ASR errors than the detection of interruption points (IPs) in speech disfluencies. We then evaluate several features derived from the detected SEs in terms of their correlation with human scores and their relative importance in a linear regression scoring model.
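As an illustration of the evaluation described in the last sentence, the sketch below computes each SE-derived feature's Pearson correlation with human scores and its standardized weight in a linear regression model. The feature names (clauses_per_sec, ips_per_word) and the synthetic data are hypothetical placeholders chosen only to make the sketch runnable; they are not the paper's actual features or data.

```python
# Minimal sketch: per-feature correlation with human scores, plus relative
# importance as standardized linear-regression weights. All names and data
# below are hypothetical, not taken from the paper.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression

def evaluate_features(X, y, feature_names):
    """X: (n_responses, n_features) SE-derived features; y: human scores."""
    # Pearson correlation of each feature with the human proficiency scores.
    for j, name in enumerate(feature_names):
        r, p = pearsonr(X[:, j], y)
        print(f"{name}: r = {r:.3f} (p = {p:.3g})")

    # Relative importance: fit on z-scored features so the magnitudes of the
    # regression weights are comparable across features.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    model = LinearRegression().fit(Xz, y)
    for name, w in zip(feature_names, model.coef_):
        print(f"{name}: standardized weight = {w:+.3f}")

# Hypothetical example: two SE-derived features per spoken response.
rng = np.random.default_rng(0)
n = 200
clauses_per_sec = rng.normal(0.4, 0.1, n)   # e.g., a CB-based structure feature
ips_per_word = rng.normal(0.05, 0.02, n)    # e.g., an IP-based disfluency rate
scores = 3 + 4 * clauses_per_sec - 10 * ips_per_word + rng.normal(0, 0.3, n)
X = np.column_stack([clauses_per_sec, ips_per_word])
evaluate_features(X, scores, ["clauses_per_sec", "ips_per_word"])
```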