ISCA Archive Interspeech 2004

Synchronization of speaker selection for centralized tandem free VoIP conferencing

Peter Kabal, Colm Elliott

Traditional teleconferencing uses a select-and-mix function at a centralized conferencing bridge. In VoIP environments, this mixing operation can degrade speech when high-compression speech codecs are used, because of tandem encodings and the coding of multi-talker signals. A tandem-free architecture can eliminate tandem encodings and preserve speech quality. VoIP conference bridges must also consider the variable network delays experienced by different packetized voice streams. A synchronized speaker selection algorithm at the bridge can smooth out network delay variations and synchronize incoming voice streams. Synchronization provides a clean mapping of the N input packet streams to the M output streams representing selected speakers. This paper presents a synchronized speaker selection algorithm and evaluates its performance using a conference simulator. The synchronization process is shown to account for only a small part of the overall delay experienced by selected packets.
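
The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of the general idea of mapping N synchronized input streams to M selected output streams. Everything in it is an assumption made for the example rather than the authors' method: the `Packet` fields, the per-frame `energy` measure used for ranking, the `SyncSelector` class, and the rule of picking the M highest-energy frames for a given frame index. Frames are buffered per stream and selection is done per frame index, so packets that arrived at different times are compared only against frames from the same time slot. Selected payloads are forwarded unchanged, which is what keeps the path tandem-free.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Packet:
    stream_id: int   # which of the N participants sent the frame
    seq: int         # frame index at the sender (hypothetical field)
    energy: float    # speech-activity measure for the frame (assumed metric)
    payload: bytes   # encoded speech, never decoded or re-encoded here


class SyncSelector:
    """Illustrative bridge: buffer frames per stream, select M per time slot."""

    def __init__(self, num_selected: int) -> None:
        self.num_selected = num_selected
        # stream_id -> {seq -> Packet}; absorbs differences in arrival time
        self.buffers: Dict[int, Dict[int, Packet]] = {}

    def push(self, pkt: Packet) -> None:
        """Store an arriving packet, whatever order it arrives in."""
        self.buffers.setdefault(pkt.stream_id, {})[pkt.seq] = pkt

    def select(self, seq: int) -> List[Packet]:
        """Return up to M packets for frame index `seq`, loudest first."""
        candidates: List[Packet] = []
        for frames in self.buffers.values():
            # Discard frames that missed their slot (arrived too late).
            for old in [k for k in frames if k < seq]:
                del frames[old]
            pkt = frames.pop(seq, None)
            if pkt is not None:
                candidates.append(pkt)
        # Rank by energy; break ties by stream id so the result is stable.
        candidates.sort(key=lambda p: (-p.energy, p.stream_id))
        return candidates[: self.num_selected]


if __name__ == "__main__":
    bridge = SyncSelector(num_selected=2)
    # Packets for frame 0 arrive out of order from three participants.
    bridge.push(Packet(0, 0, 0.9, b"a0"))
    bridge.push(Packet(2, 0, 0.1, b"c0"))
    bridge.push(Packet(1, 0, 0.5, b"b0"))
    print([p.stream_id for p in bridge.select(0)])  # prints [0, 1]
```

A real bridge would also have to decide how long to wait before calling `select` for a given frame index. That waiting time is the synchronization delay the abstract refers to, and this sketch leaves it to the caller.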