ABSTRACT
This talk discusses progress in building collaborative multimodal systems: both systems that offer a collaborative interface that augments human performance, and autonomous systems with which one can collaborate. I begin by discussing what I mean by collaboration, which revolves around plan recognition skills learned in childhood. I then present Sketch-Thru-Plan, a collaborative multimodal operations planning system that lets users interact through speech and pen while it attempts to infer their plans. The system suggests actions and lets the user confirm or disconfirm those suggestions. I show how the collaborative multimodal interface enables more rapid task performance and higher user satisfaction than existing deployed GUIs built for the same task.
In the second part of the talk, I discuss how system design differs between building such a collaborative multimodal interface and building an autonomous agent with which one can collaborate through multimodal dialogue. I argue that interacting with an autonomous agent (e.g., a robot or virtual assistant) may require a more declarative approach to supporting collaborative communication. People’s deeply ingrained collaboration strategies lie at the foundation of dialogue, and human interlocutors expect them. The approach I advocate for implementing such a strategy is to build a belief-desire-intention (BDI) architecture that attempts to recognize the collaborator’s plans and determine obstacles to their success. The system then plans and executes a response to overcome those obstacles, choosing appropriate actions (including speech acts). I will illustrate and demonstrate a system that embodies this type of collaboration, engaging users in dialogue about travel planning. Finally, I will compare this approach with current academic and research approaches to dialogue.
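The recognize-plan, find-obstacles, plan-response cycle described above can be illustrated with a toy sketch. This is not the system presented in the talk: the plan library, the step and precondition names, and the response table are all hypothetical, and plan recognition is reduced to a simple lookup purely to show the control flow in a travel-planning setting.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    precondition: str  # fact the user must hold for this step to succeed


@dataclass
class Plan:
    goal: str
    steps: list[Step]


# Hypothetical plan library (an assumption for illustration only).
PLAN_LIBRARY = [
    Plan("travel_by_train", [
        Step("find_departure", "knows_departure_time"),
        Step("buy_ticket", "has_ticket"),
        Step("board_train", "at_station"),
    ]),
]

# Hypothetical mapping from an unmet precondition to a system action.
RESPONSES = {
    "knows_departure_time": "inform(departure_time)",
    "has_ticket": "offer(book_ticket)",
}


def recognize_plan(observed_action: str, library: list[Plan]) -> Plan | None:
    """Return the first plan that contains the observed action as a step."""
    for plan in library:
        if any(step.name == observed_action for step in plan.steps):
            return plan
    return None


def find_obstacles(plan: Plan, user_beliefs: set[str]) -> list[Step]:
    """Steps whose precondition the user does not currently satisfy."""
    return [s for s in plan.steps if s.precondition not in user_beliefs]


def plan_response(obstacles: list[Step]) -> list[str]:
    """Choose one system action (possibly a speech act) per obstacle."""
    return [
        RESPONSES.get(s.precondition, f"ask_about({s.precondition})")
        for s in obstacles
    ]


def respond(observed_action: str, user_beliefs: set[str]) -> list[str]:
    plan = recognize_plan(observed_action, PLAN_LIBRARY)
    if plan is None:
        return ["request(clarification)"]
    return plan_response(find_obstacles(plan, user_beliefs))


if __name__ == "__main__":
    # The user asks about departures while already at the station.
    print(respond("find_departure", {"at_station"}))
```

In this sketch, a user who asks about departures is treated as pursuing the whole trip, so the response addresses the missing ticket as well as the literal question. A real BDI architecture would reason over beliefs, desires, and intentions rather than a fixed lookup table.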
Index Terms
- Steps towards collaborative multimodal dialogue (sustained contribution award)