Call centers record and store customer-agent conversations for coaching, quality assurance, and compliance with industry regulations. Many of these audio recordings contain sensitive information about customers' financial or personal details. To ensure data security and compliance, and to reduce the risk of abuse or theft, it is important to identify such instances in audio recordings and mask the corresponding segments. To automate this process, we propose a cascaded system. First, Automatic Speech Recognition (ASR) generates a transcript and text-to-audio alignment information for an audio recording. Then, entity extraction is performed on the generated transcript to identify and locate sensitive information, and the corresponding segments are masked in the audio recording using the alignment information. We introduce a novel system for selective masking of sensitive information in both audio and transcript.
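The cascade described above can be illustrated with a minimal sketch. It assumes the ASR stage has already produced word-level alignments, and it replaces the entity extraction model with a toy rule (a run of four or more digit tokens counts as sensitive). The names `AlignedWord`, `find_sensitive_spans`, and `mask_recording`, the `[MASKED]` token, and the choice to overwrite masked audio with silence are all hypothetical and are not taken from the proposed system.

```python
import re
from dataclasses import dataclass


@dataclass
class AlignedWord:
    """One ASR token with its text-to-audio alignment (times in seconds)."""
    text: str
    start: float
    end: float


def find_sensitive_spans(words, min_run=4):
    """Toy stand-in for entity extraction (an assumption, not the real model).

    Returns (first, last_exclusive) word-index pairs for every run of at
    least `min_run` consecutive all-digit tokens.
    """
    spans = []
    run_start = None
    # A trailing None sentinel closes a run that reaches the end of the list.
    for i, word in enumerate(list(words) + [None]):
        is_digit = word is not None and re.fullmatch(r"\d+", word.text) is not None
        if is_digit and run_start is None:
            run_start = i
        elif not is_digit and run_start is not None:
            if i - run_start >= min_run:
                spans.append((run_start, i))
            run_start = None
    return spans


def mask_recording(samples, sample_rate, words, spans, token="[MASKED]"):
    """Mask the given word spans in both the audio and the transcript.

    Audio samples inside each span are set to 0 (silence is an assumption;
    a tone or noise could be used instead). In the transcript, each span
    collapses to a single `token`.
    """
    masked_audio = list(samples)
    for first, last in spans:
        lo = max(0, round(words[first].start * sample_rate))
        hi = min(len(masked_audio), round(words[last - 1].end * sample_rate))
        for k in range(lo, hi):
            masked_audio[k] = 0

    span_end = {first: last for first, last in spans}
    pieces = []
    i = 0
    while i < len(words):
        if i in span_end:
            pieces.append(token)
            i = span_end[i]  # skip every word covered by this span
        else:
            pieces.append(words[i].text)
            i += 1
    return masked_audio, " ".join(pieces)


# Tiny synthetic example: 3 seconds of "audio" at 4 samples per second.
example_words = [
    AlignedWord("my", 0.0, 0.5),
    AlignedWord("card", 0.5, 1.0),
    AlignedWord("is", 1.0, 1.25),
    AlignedWord("4", 1.25, 1.5),
    AlignedWord("1", 1.5, 1.75),
    AlignedWord("1", 1.75, 2.0),
    AlignedWord("1", 2.0, 2.25),
    AlignedWord("thanks", 2.25, 3.0),
]
example_samples = [1] * 12
example_spans = find_sensitive_spans(example_words)
example_audio, example_text = mask_recording(
    example_samples, 4, example_words, example_spans
)
```

In this example the four digit tokens form one span, so the samples between 1.25 s and 2.25 s are zeroed and the transcript becomes `my card is [MASKED] thanks`. A real system would substitute an ASR engine and a trained entity extractor for the two stubbed stages; only the alignment-based masking step carries over as is.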