Most speech applications to date have attempted to provide a more natural interface for human-computer interaction or human-computer data input. Only recently has a whole new class of applications come to the fore: computer-enhanced human-human interaction. In these applications the computer is no longer addressed directly, but must observe, process, and understand the interactions between humans in a room. In this paper we discuss two such applications: a meeting browser that observes and tracks meetings for later review and summarization, and a lecture tracker that provides not only summarization but also implicit services during a presentation, such as control of AV equipment and selection of the most suitable slides. Processing human-human conversational speech under unpredictable recording conditions and with unpredictable vocabularies presents new challenges for speech and language processing. We describe techniques designed to overcome these difficulties and report speech recognition results as well as overall system performance.