ISCA Archive Interspeech 2023

Composing Spoken Hints for Follow-on Question Suggestion in Voice Assistants

Pedro Faustini, Besnik Fetahu, Giuseppe Castellucci, Anjie Fang, Oleg Rokhlenko, Shervin Malmasi

The adoption of voice assistants like Alexa or Siri has grown rapidly, allowing users instant access to information via voice search. Query suggestion is a standard feature of screen-based search experiences, allowing users to explore additional topics. However, this is not trivial to implement in voice-based settings. To enable this, we tackle the novel task of suggesting questions with compact and natural voice hints to allow users to ask follow-up questions. We first define the task of composing speech-based hints, ground it in syntactic theory, and outline linguistic desiderata for spoken hints. We propose a sequence-to-sequence approach to generate spoken hints from a list of questions. Using a new dataset of 6,681 input questions and human-written hints, we evaluate models with automatic metrics and human evaluation. Results show that a naive approach of concatenating suggested questions creates poor voice hints. Our most sophisticated approach applies a linguistically-motivated pretraining task and was strongly preferred by humans for producing the most natural hints.
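To make the sequence-to-sequence formulation concrete, the sketch below shows one generic way to map a list of suggested questions onto a single spoken hint with an off-the-shelf text-to-text model. The checkpoint (FLAN-T5), the prompt prefix, and the question separator are illustrative assumptions, not the paper's actual model, training data, or pretraining task.

```python
# Hedged sketch of a question-list -> spoken-hint seq2seq baseline.
# Assumptions: FLAN-T5 checkpoint, "compose a spoken hint:" prefix, and " | " separator
# are all placeholders; the paper's trained model and pretraining task are not shown here.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-base"  # assumed checkpoint for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Example suggested follow-up questions (hypothetical).
questions = [
    "How tall is Mount Everest?",
    "When was Mount Everest first climbed?",
]

# Serialize the question list as the encoder input; the decoder produces one compact hint.
source = "compose a spoken hint: " + " | ".join(questions)
inputs = tokenizer(source, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=40, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

In this framing, simply concatenating the questions corresponds to the naive baseline the abstract describes, while a trained model is expected to fuse them into a shorter, more natural spoken prompt.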