This paper proposes an improved approach to summarizing spoken multi-party interaction, in which intra-speaker and inter-speaker topics are modeled in a graph constructed from topical relations. Each utterance is represented as a node of the graph, and the edge between two nodes is weighted by the topical similarity between the two utterances, evaluated by probabilistic latent semantic analysis (PLSA). We model intra-speaker topics by sharing topics among utterances from the same speaker, and inter-speaker topics by partially sharing topics between adjacent utterances based on temporal information. We conducted experiments on both automatic speech recognition (ASR) and manual transcripts. For both types of transcript, the experiments showed that combining intra-speaker and inter-speaker topic modeling helps include the important utterances and thereby improves summarization.
Index Terms: summarization, multi-party meeting, topic model, probabilistic latent semantic analysis (PLSA), topic transition, temporal information, random walk
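As a rough illustration of the graph-based scoring summarized above, the following minimal sketch ranks utterances by a random walk over a topical-similarity graph. It is not the formulation used in this paper: the per-utterance topic distributions are assumed to be given (for example, from a trained PLSA model), cosine similarity is an assumed choice of edge weight, and the function names, the uniform prior, and the interpolation weight `alpha` are hypothetical. Speaker and temporal information are not modeled here.

```python
import math


def cosine(p, q):
    """Cosine similarity between two topic distributions (assumed edge weight)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0


def random_walk_scores(topic_dists, alpha=0.9, iters=100):
    """Score utterances by a random walk with restart over the similarity graph.

    topic_dists: one topic distribution per utterance (hypothetical input,
                 e.g. P(topic | utterance) from a topic model).
    alpha:       assumed weight on propagated scores versus the uniform prior.
    """
    n = len(topic_dists)
    # Edge weights: topical similarity between every pair of distinct utterances.
    weights = [
        [cosine(topic_dists[i], topic_dists[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Row-normalise weights into transition probabilities.
    # A node with no outgoing weight jumps uniformly to any node.
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([x / total for x in row] if total > 0 else [1.0 / n] * n)
    prior = [1.0 / n] * n  # uniform initial importance (an assumption)
    scores = prior[:]
    for _ in range(iters):
        scores = [
            (1 - alpha) * prior[j]
            + alpha * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


if __name__ == "__main__":
    # Three toy utterances over two topics; the third is topically isolated.
    dists = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
    for idx, score in enumerate(random_walk_scores(dists)):
        print(idx, round(score, 3))
```

In such a scheme, the highest-scoring utterances would be selected for the summary; utterances that are topically close to many others accumulate more score than isolated ones.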