Accurate microphone-based speaker localization in real-world environments, such as office spaces or meeting rooms, must be able to track both a single speaker and multiple concurrent speakers in the presence of reverberation and background noise. Our Multiband Joint Position-Pitch (M-PoPi) algorithm for circular microphone arrays already achieves a frame-wise localization estimation score of about 95% when tracking a single speaker in a noisy, reverberant setting. In this paper, we present two extensions of the M-PoPi algorithm that also improve the localization accuracy for multiple concurrent speakers: a weighted spectro-temporal fragment analysis as a pre-processing step, and particle filter-based tracking as a post-processing step. Experiments using real-world recordings of two concurrent speakers in a meeting room with typical reverberation show that the frame-wise localization estimation score improves from 43% with the plain M-PoPi algorithm to 66% with both extensions.
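To illustrate the idea behind the tracking post-processing step, the following is a minimal sketch of a generic bootstrap particle filter that smooths noisy frame-wise azimuth estimates for a single source. It is not the filter used in this work: the function names (`wrap_deg`, `track_azimuth`), the random-walk motion model, the Gaussian-plus-uniform likelihood, and all parameter values are assumptions chosen for illustration only, and the multi-speaker case would require additional machinery.

```python
import numpy as np


def wrap_deg(angle):
    """Wrap angles in degrees to the interval [-180, 180)."""
    return (np.asarray(angle) + 180.0) % 360.0 - 180.0


def track_azimuth(observations, n_particles=500, process_std=3.0,
                  obs_std=10.0, outlier_prob=0.2, seed=0):
    """Smooth frame-wise azimuth estimates (degrees) with a bootstrap filter.

    All parameters are hypothetical illustration values:
      process_std   std. dev. of the random-walk motion model per frame
      obs_std       std. dev. of the Gaussian observation model
      outlier_prob  weight of a uniform floor that absorbs spurious frames
    """
    rng = np.random.default_rng(seed)
    # Initialise particles uniformly around the circular array.
    particles = rng.uniform(-180.0, 180.0, n_particles)
    track = []
    for z in observations:
        # Predict: random-walk motion model on the circle.
        noise = rng.normal(0.0, process_std, n_particles)
        particles = wrap_deg(particles + noise)

        # Update: Gaussian likelihood of the circular error, mixed with a
        # uniform component so that a single outlier frame cannot
        # collapse the particle set.
        err = wrap_deg(particles - z)
        gauss = np.exp(-0.5 * (err / obs_std) ** 2) / (obs_std * np.sqrt(2.0 * np.pi))
        lik = (1.0 - outlier_prob) * gauss + outlier_prob / 360.0
        weights = lik / lik.sum()

        # Estimate: weighted circular mean of the particles.
        rad = np.deg2rad(particles)
        est = np.rad2deg(np.arctan2(np.sum(weights * np.sin(rad)),
                                    np.sum(weights * np.cos(rad))))
        track.append(est)

        # Resample: multinomial resampling proportional to the weights.
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
    return np.array(track)
```

Calling `track_azimuth` on a sequence of per-frame azimuth estimates returns one smoothed azimuth per frame. The uniform likelihood floor is one simple way to keep isolated erroneous frames from pulling the track away from the speaker; other robust observation models would serve the same purpose.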