In a typical contact center, multiple agents handle calls simultaneously, often in close proximity to one another. Consequently, the speech of nearby agents may be inadvertently captured in the recording of an agent-customer interaction. Such unintended background speech can degrade the quality of the conversation and may also contain sensitive information, which poses a security concern for contact centers. Contact centers are therefore interested in identifying such scenarios, since this knowledge can help them implement appropriate mitigation strategies and improve audio quality, thereby improving the overall customer experience. In this work, we utilise the pauses and gaps in the agent's speech to identify background speech. Our approach, which is based on speech features, is simple, tuneable, computationally efficient and cost-effective.