This paper presents strategies for measuring and assuring high quality when performing large-scale crowdsourced data collection for acoustic model training. We examine the different types of spam encountered while collecting and validating speech audio from unmanaged crowds, and we describe how we identified these sources of spam and prevented our data from being tainted. We built a custom Android mobile application that funnels workers from a crowdsourcing platform and allows us to gather recordings and control the conditions of the audio collection. We use a two-step validation process that ensures workers are paid only when they have actually used our application to complete their tasks. The collected audio is run through a second crowdsourcing job designed to validate that the speech matches the text with which the speakers were prompted. For this validation task, gold-standard test questions are used in combination with expected answer distribution rules and monitoring of worker activity levels over time to detect and expel likely spammers. Inter-annotator agreement is used to ensure high confidence in the validated judgments. This process yielded millions of recordings with matching transcriptions in American English; the resulting data set is 96% accurate, containing only minor errors.
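To make the validation stage concrete, the sketch below illustrates one way a filter-then-aggregate pass of this kind could work: workers whose accuracy on gold-standard test questions falls below a threshold are expelled, and a transcription is accepted only when enough trusted workers agree. The judgment format, thresholds, and function names are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of gold-question filtering plus inter-annotator agreement.
# Field names and thresholds below are assumptions for illustration only.
from collections import Counter, defaultdict

GOLD_ACCURACY_MIN = 0.8   # assumed cutoff for expelling likely spammers
MIN_AGREEMENT = 2         # assumed number of matching judgments required

def filter_spammers(judgments, gold_answers):
    """Drop workers whose accuracy on gold-standard test questions is too low."""
    correct, total = Counter(), Counter()
    for j in judgments:
        if j["clip_id"] in gold_answers:
            total[j["worker_id"]] += 1
            if j["label"] == gold_answers[j["clip_id"]]:
                correct[j["worker_id"]] += 1
    trusted = {w for w in total if correct[w] / total[w] >= GOLD_ACCURACY_MIN}
    return [j for j in judgments if j["worker_id"] in trusted]

def aggregate(judgments):
    """Accept a clip's judgment only when enough trusted workers agree on it."""
    votes = defaultdict(Counter)
    for j in judgments:
        votes[j["clip_id"]][j["label"]] += 1
    results = {}
    for clip_id, counts in votes.items():
        label, n = counts.most_common(1)[0]
        if n >= MIN_AGREEMENT:
            results[clip_id] = label   # high-confidence validated judgment
    return results

# Toy usage: "g1" is a gold clip; worker w2 fails the gold check and is dropped.
gold = {"g1": "match"}
judgments = [
    {"worker_id": "w1", "clip_id": "g1", "label": "match"},
    {"worker_id": "w2", "clip_id": "g1", "label": "mismatch"},
    {"worker_id": "w3", "clip_id": "g1", "label": "match"},
    {"worker_id": "w1", "clip_id": "c7", "label": "match"},
    {"worker_id": "w3", "clip_id": "c7", "label": "match"},
]
print(aggregate(filter_spammers(judgments, gold)))
```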