About this book

This book constitutes the proceedings of the 17th International Conference on Intelligent Virtual Agents, IVA 2017, held in Stockholm, Sweden, in August 2017. The 30 regular papers and 31 demo papers presented in this volume were carefully reviewed and selected from 78 submissions.
The annual IVA conference represents the main interdisciplinary scientific forum for presenting research on modeling, developing, and evaluating intelligent virtual agents (IVAs) with a focus on communicative abilities and social behavior.



Pedagogical Agents to Support Embodied, Discovery-Based Learning

This paper presents a pedagogical agent designed to support students in an embodied, discovery-based learning environment. Discovery-based learning guides students through a set of activities designed to foster particular insights. In this case, the animated agent explains how to use the Mathematical Imagery Trainer for Proportionality, provides performance feedback, leads students to have different experiences and provides remedial instruction when required. It is a challenging task for agent technology as the amount of concrete feedback from the learner is very limited, here restricted to the location of two markers on the screen. A Dynamic Decision Network is used to automatically determine agent behavior, based on a deep understanding of the tutorial protocol. A pilot evaluation showed that all participants developed movement schemes supporting proto-proportional reasoning. They were able to provide verbal proto-proportional expressions for one of the taught strategies, but not the other.

Ahsan Abdullah, Mohammad Adil, Leah Rosenbaum, Miranda Clemmons, Mansi Shah, Dor Abrahamson, Michael Neff

WalkNet: A Neural-Network-Based Interactive Walking Controller

We present WalkNet, an interactive agent walking movement controller based on neural networks. WalkNet supports controlling the agent’s walking movements with high-level factors that are semantically meaningful, providing an interface between the agent and its movements in such a way that the characteristics of the movements can be directly determined by the internal state of the agent. The controlling factors are defined across the dimensions of planning, affect expression, and personal movement signature. WalkNet employs Factored, Conditional Restricted Boltzmann Machines to learn and generate movements. We train the model on a corpus of motion capture data that contains movements from multiple human subjects, multiple affect expressions, and multiple walking trajectories. The generation process is real-time and is not memory intensive. WalkNet can be used both in interactive scenarios in which it is controlled by a human user and in scenarios in which it is driven by another AI component.

Omid Alemi, Philippe Pasquier

A Virtual Poster Presenter Using Mixed Reality

In this demo, we will showcase a platform we are currently developing for experimenting with situated interaction using mixed reality. The user will wear a Microsoft HoloLens and be able to interact with a virtual character presenting a poster. We argue that a poster presentation scenario is a good test bed for studying phenomena such as multi-party interaction, speaker role, engagement and disengagement, information delivery, and user attention monitoring.

Vanya Avramova, Fangkai Yang, Chengjie Li, Christopher Peters, Gabriel Skantze

Multiparty Interactions for Coordination in a Mixed Human-Agent Teamwork

Virtual environments for human learning enable one or more users to interact with virtual agents in order to perform their tasks. This collaboration necessitates that the members of the team share a set of beliefs and reason about resources, plans and actions to be implemented. This article introduces a new multiparty coordination model allowing several virtual and human agents to dialogue and reason about the tasks that the user must learn. The proposed model relies on a shared plan based approach to represent the beliefs of the team members. The management of the multiparty aspect makes it possible to differentiate the behaviors to be produced according to the type of receiver of a communication: recipient or listener. Finally, in the context of learning a procedural activity, a study examines the effect of our multiparty model on a learner. Results show that the use of proactive pedagogical agents with multiparty competencies boosts the construction of common beliefs.

Mukesh Barange, Julien Saunier, Alexandre Pauchet

A Dynamic Speech Breathing System for Virtual Characters

Human speech production requires the dynamic regulation of air through the vocal system. While virtual character systems commonly are capable of speech output, they rarely take breathing during speaking – speech breathing – into account. We believe that integrating dynamic speech breathing systems in virtual characters can significantly contribute to augmenting their realism. Here, we present a novel control architecture aimed at generating speech breathing in virtual characters. This architecture is informed by behavioral, linguistic and anatomical knowledge of human speech breathing. Based on textual input and controlled by a set of low- and high-level parameters, the system produces dynamic signals in real-time that control the virtual character’s anatomy (thorax, abdomen, head, nostrils, and mouth) and sound production (speech and breathing). The system is implemented in Python, offers a graphical user interface for easy parameter control, and simultaneously controls the visual and auditory aspects of speech breathing through the integration of the character animation system SmartBody [16] and the audio synthesis platform SuperCollider [12]. Beyond contributing to realism, the presented system allows for a flexible generation of a wide range of speech breathing behaviors that can convey information about the speaker such as mood, age, and health.

Ulysses Bernardet, Sin-hwa Kang, Andrew Feng, Steve DiPaola, Ari Shapiro

To Plan or Not to Plan: Lessons Learned from Building Large Scale Social Simulations

Building large scale social simulations in virtual environments requires having a large number of virtual agents. Often we need to simulate hundreds or even thousands of individuals in order to have a realistic and believable simulation. One of the obvious desires of the developers of such simulations is to have a high degree of automation with regard to agent behaviour. The key techniques to provide this automation are crowd simulation, planning and utility based approaches. Crowd simulation algorithms are appropriate for simulating simple pedestrian movement or for showing group activities, which do not require complex object use, but are not suitable for simulating complex everyday life, where agents need to eat, sleep, work, etc. Planning and utility based approaches remain the most suitable for this situation. In our research we are interested in developing advanced history and cultural heritage simulations and have tried to utilise planning and utility based methods (the most popular one of which is used in the game “The Sims”). Here we examine the pros and cons of each of the two techniques and illustrate the key lessons that we have learned with a case study focused on developing a simulation of everyday life in ancient Mesopotamia 5000 B.C.

Anton Bogdanovych, Tomas Trescak

Giving Emotional Contagion Ability to Virtual Agents in Crowds

Recent advances in crowd simulation models attempt to recreate realistic human behaviour by introducing psychological phenomena in virtual agents. In this direction, psychology studies on personality traits, emotions and emotional contagion attempt to cope with emerging behaviours such as panic spreading and fight picking. This work presents a way to introduce a model of emotional contagion in the scope of crowd simulation. Challenges regarding the applicability of an emotional contagion model to a great number (hundreds or thousands) of agents are discussed. Results show that the dynamics of space and time create emergent behaviour in crowd agents that is in tune with emotional contagion phenomena and crowd behaviour as described in the literature.

Amyr Borges Fortes Neto, Catherine Pelachaud, Soraia Raupp Musse

Selecting and Expressing Communicative Functions in a SAIBA-Compliant Agent Framework

In SAIBA-compliant agent systems, the Function Markup Language (FML) is used to describe the agent’s communicative functions that are transformed into utterances accompanied by appropriate non-verbal behaviours. In the context of the ARIA Framework, we propose a template-based approach, grounded in the DIT++ taxonomy, as an interface between the dialogue manager (DM) and the non-verbal behaviour generation (NVBG) components of this framework. Our approach enhances our current FML-APML implementation of FML with the capability of receiving on-the-fly generated natural language and socio-emotional parameters (e.g. emotional stance) for transforming the agent’s intents into believable verbal and non-verbal behaviours in an adaptive manner.

Angelo Cafaro, Merijn Bruijnes, Jelte van Waterschoot, Catherine Pelachaud, Mariët Theune, Dirk Heylen

Racing Heart and Sweaty Palms

What Influences Users’ Self-Assessments and Physiological Signals When Interacting with Virtual Audiences?

In psychotherapy, virtual audiences have been shown to promote successful outcomes when used to help treat public speaking anxiety. Additionally, early experiments have shown their potential to help improve public speaking ability. However, it is still unclear to what extent certain factors, such as audience non-verbal behaviors, impact users when interacting with a virtual audience. In this paper, we design an experimental study to investigate users’ self-assessments and physiological states when interacting with a virtual audience. Our results showed that virtual audience behaviors did not influence participants’ self-assessments or physiological responses, which were instead predominantly determined by participants’ prior anxiety levels.

Mathieu Chollet, Talie Massachi, Stefan Scherer

Effects of Social Priming on Social Presence with Intelligent Virtual Agents

This paper explores whether witnessing an Intelligent Virtual Agent (IVA) in what appears to be a socially engaging discussion with a Confederate Virtual Agent (CVA) prior to a direct interaction, can prime a person to feel and behave more socially engaged with the IVA in a subsequent interaction. To explore this social priming phenomenon, we conducted an experiment in which participants in a control group had no priming while those in an experimental group were briefly exposed to an engaging social interaction between an IVA and a nearby CVA (i.e. a virtual actor). The participants primed by exposure to the brief CVA-IVA interaction reported being significantly more excited and alert, perceiving the IVA as more responsive, and showed significantly higher measures of Co-Presence, Attentional Allocation, and Message Understanding dimensions of social presence for the IVA, compared to those who were not primed.

Salam Daher, Kangsoo Kim, Myungho Lee, Ryan Schubert, Gerd Bruder, Jeremy Bailenson, Greg Welch

Predicting Future Crowd Motion Including Event Treatment

Crowd simulation has become an important area, mainly in entertainment and security applications. In particular, this area has been explored in safety systems to evaluate environments in terms of people’s comfort and security. In general, the evaluation involves the execution of one or more simulations in order to provide statistical information about the crowd behavior in a certain environment. Real-time applications can also be desirable, for instance in order to estimate the crowd behavior in the near future given the current crowd state, aiming to anticipate a potential problem and prevent it. This paper presents a model to estimate crowd behaviors at a future time, offering a good compromise between accuracy and running time. It also presents a new error measure to compare two crowds based on local density.

Cliceres Mack Dal Bianco, Soraia Raupp Musse, Adriana Braun, Rodrigo Poli Caetani, Claudio Jung, Norman Badler

The Intelligent Coaching Space: A Demonstration

Here we demonstrate our Intelligent Coaching Space, an immersive virtual environment in which users learn a motor action (e.g. a squat) under the supervision of a virtual coach. We detail how we assess the ability of the coachee in executing the motor action, how the intelligent coaching space and its features are realized and how the virtual coach leads the coachee through a coaching session.

Iwan de Kok, Felix Hülsmann, Thomas Waltemate, Cornelia Frank, Julian Hough, Thies Pfeiffer, David Schlangen, Thomas Schack, Mario Botsch, Stefan Kopp

Get One or Create One: the Impact of Graded Involvement in a Selection Procedure for a Virtual Agent on Satisfaction and Suitability Ratings

N = 86 participants were either confronted with a predefined virtual agent, or could select a virtual agent from predefined sets of six or 30 graphical models, or had the opportunity to self-customize the agent’s appearance more freely. We investigated the effect of graded user involvement in the selection procedure on their ratings of satisfaction with the agent and perceived task suitability. In a second step, we explored the psychological mechanism underlying this effect. Statistical analyses revealed that satisfaction with the chosen virtual agent increased with the degree of participants’ involvement in terms of more choice, but not in terms of self-customization. Furthermore, we show that this effect was driven by the perceived likeability, attractiveness, and competence of the agent. We discuss implications of our results for the development of a virtual agent serving as a virtual assistant in a smart home environment.

Charlotte Diehl, Birte Schiffhauer, Friederike Eyssel, Jascha Achenbach, Sören Klett, Mario Botsch, Stefan Kopp

Virtual Reality Negotiation Training System with Virtual Cognitions

A number of negotiation training systems have been developed to improve people’s performance in negotiation. They mainly focus on the skills development, and less on negotiation understanding and improving self-efficacy. We propose a virtual reality negotiation training system that exposes users to virtual cognitions during negotiation with virtual characters with the aim of improving people’s negotiation knowledge and self-efficacy. The virtual cognitions, delivered as a personalized voice-over, provide users with a stream of thoughts that reflects on the negotiation and people’s performance. To study the effectiveness of the system, a pilot study with eight participants was conducted. The results suggest that the system significantly enhanced people’s knowledge about negotiation and increased their self-efficacy.

Ding Ding, Franziska Burger, Willem-Paul Brinkman, Mark A. Neerincx

Do We Need Emotionally Intelligent Artificial Agents? First Results of Human Perceptions of Emotional Intelligence in Humans Compared to Robots

Humans are very adept at reading emotional signals in other humans and even artificial agents, which raises the question of whether artificial agents need to be emotionally intelligent to ensure effective social interactions. For artificial agents without emotional intelligence might generate behavior that is misinterpreted, unexpected, and confusing to humans, violating human expectations and possibly causing emotional harm. Surprisingly, there is a dearth of investigations aimed at understanding the extent to which artificial agents need emotional intelligence for successful interactions. Here, we present the first study of the perception of emotional intelligence (EI) in robots vs. humans. The objective was to determine whether people viewed robots as more or less emotionally intelligent when exhibiting similar behaviors as humans, and to investigate which verbal and nonverbal communication methods were most crucial for human observational judgments. Study participants were shown a scene in which either a robot or a human behaved with either high or low empathy, and then they were asked to evaluate the agent’s emotional intelligence and trustworthiness. The results showed that participants could consistently distinguish the high EI condition from the low EI condition regardless of the variations in which communication methods were observed, and that whether the agent was a robot or human had no effect on the perception. We also found that, relative to low EI, high EI conditions led to greater trust in the agent, which implies that we must design robots to be emotionally intelligent if we wish for users to trust them.

Lisa Fan, Matthias Scheutz, Monika Lohani, Marissa McCoy, Charlene Stokes

Pragmatic Multimodality: Effects of Nonverbal Cues of Focus and Certainty in a Virtual Human

In pragmatic multimodality, modal (pragmatic) information is conveyed multimodally by cues in gesture, facial expressions, head movements and prosody. We observed these cues in natural interaction data. They can convey positive and negative focus, in that they emphasise or de-emphasise a piece of information, and they can convey uncertainty. In this work, we test the effects on perception and recall in a human user, when those cues are carried out by a virtual human. The nonverbal behaviour of the virtual human was modelled using motion capture data and ensured a fully multimodal appearance. Results of the study show that the virtual human was perceived as very competent and as saying something important. A special case of de-emphasising cues led to lower content recall.

Farina Freigang, Sören Klett, Stefan Kopp

Simulating Listener Gaze and Evaluating Its Effect on Human Speakers

This paper presents an agent architecture designed as part of a multidisciplinary collaboration between embodied agents development and psycho-linguistic experimentation. This collaboration will lead to an empirical study involving an interactive human-like avatar following participants’ gaze. Instead of adapting existing “off the shelf” embodied agents solutions, experimenters and developers collaboratively designed and implemented the experiment’s logic and the avatar’s real-time behavior from scratch in the Blender environment following an agile methodology. Frequent iterations and short implementation sprints allowed the experimenters to focus on the experiment and test many interaction scenarios in a short time.

Laura Frädrich, Fabrizio Nunnari, Maria Staudte, Alexis Heloir

Predicting Head Pose in Dyadic Conversation

Natural movement plays a significant role in realistic speech animation. Numerous studies have demonstrated the contribution visual cues make to the degree we, as human observers, find an animation acceptable. Rigid head motion is one visual mode that universally co-occurs with speech, and so it is a reasonable strategy to seek features from the speech mode to predict the head pose. Several previous authors have shown that prediction is possible, but experiments are typically confined to rigidly produced dialogue. Expressive, emotive and prosodic speech exhibit motion patterns that are far more difficult to predict with considerable variation in expected head pose. People involved in dyadic conversation adapt speech and head motion in response to the others’ speech and head motion. Using Deep Bi-Directional Long Short Term Memory (BLSTM) neural networks, we demonstrate that it is possible to predict not just the head motion of the speaker, but also the head motion of the listener from the speech signal.

David Greenwood, Stephen Laycock, Iain Matthews

Negative Feedback In Your Face: Examining the Effects of Proxemics and Gender on Learning

While applications of virtual agents in training and pedagogy have largely concentrated on positively valenced environments and interactions, human-human interactions certainly also involve a fair share of negativity that is worth exploring in virtual environments. Further, in natural human interaction as well as in virtual spaces, physical actions arguably account for a great deal of variance in our representations of social concepts (e.g., emotions, attitudes). Proxemics, specifically, is a physical cue that can elicit varying perceptions of a social interaction. In the current paper, we explore the combined and individual effects of proxemic distance and gender in a specifically negative feedback educational context. We pursue this with a 2 (Proxemic Distance) × 2 (Virtual Instructor Gender) between-subject design, where participants actively engage in a learning task with a virtual instructor that provides harsh, negative feedback. While this study demonstrates some anticipated negative reactions to negative feedback from a close distance, such as external attribution of failure, we also observe some unexpected positive outcomes to this negative feedback. Specifically, negative feedback from a close distance raises positive affect and effort, particularly among male participants interacting with a male virtual professor. Objective measures (head movement data) corroborate these same-gender effects as participants demonstrate more engagement when interacting with a virtual professor of their same gender. The results of the present study have broad implications for the design of intelligent virtual agents for pedagogy and mental health outcomes.

David C. Jeong, Dan Feng, Nicole C. Krämer, Lynn C. Miller, Stacy Marsella

A Psychotherapy Training Environment with Virtual Patients Implemented Using the Furhat Robot Platform

We present a demonstration system for psychotherapy training that uses the Furhat social robot platform to implement virtual patients. The system runs an educational program with various modules, starting with training of basic psychotherapeutic skills and then moves on to tasks where these skills need to be integrated. Such training relies heavily on observing and dealing with both verbal and non-verbal in-session patient behavior. Hence, the Furhat robot is an ideal platform for implementing this. This paper describes the rationale for this system and its implementation.

Robert Johansson, Gabriel Skantze, Arne Jönsson

Crowd-Powered Design of Virtual Attentive Listeners

This demo presents a web-based system that generates attentive listening behaviours in a virtual agent acquired from audio-visual recordings of attitudinal feedback behaviour of crowdworkers.

Patrik Jonell, Catharine Oertel, Dimosthenis Kontogiorgos, Jonas Beskow, Joakim Gustafson

Learning and Reusing Dialog for Repeated Interactions with a Situated Social Agent

Content authoring for conversations is a limiting factor in creating verbal interactions with intelligent virtual agents. Building on techniques utilizing semi-situated learning in an incremental crowdworking pipeline, this paper introduces an embodied agent that self-authors its own dialog for social chat. In particular, the autonomous use of crowdworkers is supplemented with a generalization method that borrows and assesses the validity of dialog across conversational states. We argue that the approach offers a community-focused tailoring of dialog responses that is not available in approaches that rely solely on statistical methods across big data. We demonstrate the advantages that this can bring to interactions through data collected from 486 conversations between a situated social agent and 22 users during a 3-week-long evaluation period.

James Kennedy, Iolanda Leite, André Pereira, Ming Sun, Boyang Li, Rishub Jain, Ricson Cheng, Eli Pincus, Elizabeth J. Carter, Jill Fain Lehman

Moveable Facial Features in a Social Mediator

Behavior based on the human face and facial features has a major impact on human-human communication. Creating face-based personality traits and their representations in a social robot is a challenging task. In this paper, we propose an approach for a robotic face presentation based on moveable 2D facial features and present a comparative study in which a synthesized face is projected using three setups: 1) a 3D mask, 2) a 2D screen, and 3) our 2D moveable-facial-feature-based visualization. We found that the robot’s personality and character are highly influenced by the projected face quality as well as the motion of the facial features.

Muhammad Sikandar Lal Khan, Shafiq ur Réhman, Yongcui Mi, Usman Naeem, Jonas Beskow, Haibo Li

Recipe Hunt: Engaging with Cultural Food Knowledge Using Multiple Embodied Conversational Agents

The popularity in recent years of food media, particularly in the domain of documentary films, has brought the communicative potential of food to the fore. Recipe Hunt is an interactive documentary that simulates the cultural experience of connecting over food by sharing recipes. Embodied conversational agents (ECAs) are used to engage users with cultural food heritage from the U.S.-Mexico border. Recipe Hunt aims to use a distributed and participatory model of cross-cultural learning for users to engage with the culinary heritage from this region of the United States.

Sabiha Khan, Adriana Camacho, David Novick

Development and Perception Evaluation of Culture-Specific Gaze Behaviors of Virtual Agents

Gaze plays an important role in human-human communication. Adequate gaze control of a virtual agent is also essential for successful and believable human-agent interaction. Researchers on IVA have developed gaze control models by taking account of gaze duration, frequency, and timing of gaze aversion. However, none of this work has considered cultural differences in gaze behaviors. We aimed to investigate cultural differences in gaze behaviors and their perception by developing virtual agents with Japanese gaze behaviors, American gaze behaviors, hybrid gaze behaviors, and full gaze behaviors. We then compared their effects on the impressions of the agents and interactions. Our experimental results with Japanese participants suggest that the impression of the agent is affected by participants’ shyness and familiarity of the gaze patterns performed by the agent.

Tomoko Koda, Taku Hirano, Takuto Ishioh

A Demonstration of the ASAP Realizer-Unity3D Bridge for Virtual and Mixed Reality Applications

Modern game engines such as Unity make prototyping and developing experiences in virtual and mixed reality environments increasingly accessible and efficient, and their value has long been recognized by the scientific community as well. However, these game engines do not easily allow control of virtual embodied characters, situated in such environments, with the same expressiveness, flexibility, and generalizability, as offered by modern BML realizers that generate synchronized multimodal behavior from Behavior Markup Language (BML). We demonstrate our integration of the ASAP BML Realizer and the Unity3D game engine by means of an Augmented Reality setup. We further show an in-Unity editor for BML animations in the same system.

Jan Kolkmeier, Merijn Bruijnes, Dennis Reidsma

An ASAP Realizer-Unity3D Bridge for Virtual and Mixed Reality Applications

Modern game engines such as Unity make prototyping and developing experiences for virtual and mixed reality contexts increasingly accessible and efficient, and their value has long been recognized by the scientific community as well. However, these game engines do not have the capabilities to control virtual embodied characters, situated in such environments, with the same expressiveness, flexibility, and generalizability, as offered by modern BML realizers that generate synchronized multimodal behavior from Behavior Markup Language (BML). We implemented a Unity embodiment bridge to the Articulated Social Agents Platform (ASAP) to combine the benefits of a modern game engine and a modern BML realizer. The challenges and solutions we report can help others integrate other game engines with BML realizers, and we end with a glimpse at future challenges and features of our implementation.

Jan Kolkmeier, Merijn Bruijnes, Dennis Reidsma, Dirk Heylen

Moral Conflicts in VR: Addressing Grade Disputes with a Virtual Trainer

A Virtual Trainer (VT) for moral expertise development can potentially contribute to organizational and personal moral well-being. In a pilot study a prototype of the VT confronted university employees with a complaint from an anonymous student on unfair grading: a plausible scenario. Addressing criticisms from students may be a stressful situation for many teaching professionals. For successful training, adapting the agent’s strategy based on the performance of the user is crucial. To this end, we further recorded a multimodal dataset of the interactions between the participants and the VT for future analysis. Participants saw the value in a VT that lets them practice such encounters. What is more, many participants felt truly taken aback when our VT announced that a student was unhappy with them. We further describe a first look at the multimodal dataset.

Jan Kolkmeier, Minha Lee, Dirk Heylen

Evaluated by a Machine. Effects of Negative Feedback by a Computer or Human Boss

In today’s remote working environments that include tasks given and performed via the Internet, people will encounter computer bosses that supervise their work. It is not known whether people will accept (negative) feedback that is given by an autonomous agent instead of a human. In a 2x2 between-subject online experiment, 183 participants performed a proofreading task and received either emotional or factual feedback from a human or computer boss. Results indicate that while the bosses’ behavior affects perceived warmness, human-likeness and perceived psychological safety in the sense that factual feedback is perceived as more positive, there was only one significant result for the manipulation of the boss, namely with regard to the perception of human-likeness.

Nicole C. Krämer, Lilly-Marie Leiße, Andrea Hollingshead, Jonathan Gratch

A Web-Based Platform for Annotating Sentiment-Related Phenomena in Human-Agent Conversations

This paper introduces a web-based platform dedicated to the annotation of sentiment-related phenomena in human-agent conversations. The platform focuses on verbal content and deliberately sets aside non-verbal features. It is designed for managing two dialogue features: adjacency pair and conversation progression. Two annotation tasks are considered: (i) the detection of sentiment expressions, (ii) the ranking of user’s preferences. These two tasks focus on a set of specific targets. With this demonstration, we aim to introduce this platform to a large scientific audience and to get feedback for future improvements. Our long-term goal is to make the platform available as an open-source tool.

Caroline Langlet, Guillaume Dubuisson Duplessis, Chloé Clavel

Does a Robot Tutee Increase Children’s Engagement in a Learning-by-Teaching Situation?

This paper presents initial attempts to combine a humanoid robot with the teachable agent approach. Several design choices are discussed, including the decision to use a robot instead of a virtual agent and which behaviours to implement in the robot. A pilot study explored how the interaction with a robot seemed to influence children’s engagement as well as their attribution of mental states to a robot and to a virtual agent. Eight children participated and the interaction was measured via an observational protocol and a conversational interview. A main outcome was large individual differences between the children’s interaction with the robot compared to the virtual agent.

Markus Lindberg, Kristian Månsson, Birger Johansson, Agneta Gulz, Christian Balkenius

The Expression of Mental States in a Humanoid Robot

We explore to what degree movement together with facial features in a humanoid robot, such as eyes and mouth, can be used to convey mental states. Several animation variants were iteratively tested in a series of experiments to reach a set of five expressive states that can be reliably expressed by the robot. These expressions combine biologically motivated cues such as eye movements and pupil dilation with elements that only have a conventional significance, such as changes in eye color.

Markus Lindberg, Hannes Sandberg, Marcus Liljenberg, Max Eriksson, Birger Johansson, Christian Balkenius

You Can Leave Your Head on

Attention Management and Turn-Taking in Multi-party Interaction with a Virtual Human/Robot Duo

In two small studies, we investigated how a virtual human/robot duo can complement each other in joint interaction with one or more users. The robot takes care of turn management while the virtual human draws attention to the robot. Our results show that having the virtual human address the robot highlights the latter’s role in the interaction. Having the robot nonverbally indicate the intended addressee of a question asked by the virtual human proved successful in all cases when the robot was first addressed by the virtual human.

Jeroen Linssen, Meike Berkhoff, Max Bode, Eduard Rens, Mariët Theune, Daan Wiltenburg

Say Hi to Eliza

An Embodied Conversational Agent on the Web

The creation and support of Embodied Conversational Agents (ECAs) has been quite challenging, as the required features might not be straightforward to implement and to integrate in a single application. Furthermore, ECAs as desktop applications present drawbacks for both developers and users; the former have to develop for each device and operating system and the latter must install additional software, limiting their widespread use. In this paper we demonstrate how recent advances in web technologies show promising steps towards capable web-based ECAs, through some off-the-shelf technologies, in particular, the Web Speech API, Web Audio API, WebGL and Web Workers. We describe their integration into a simple fully functional web-based 3D ECA accessible from any modern device, with special attention to our novel work in the creation and support of the embodiment aspects.

Gerard Llorach, Josep Blat

A Computational Model of Power in Collaborative Negotiation Dialogues

This paper presents a conversational agent that can deploy different strategies of negotiation based on its social power. The underlying computational model is based on three principles of collaborative negotiation from the literature in social psychology. The social behavior of the agent is made visible through its dialogue strategy. We evaluated our model by showing that these principles are correctly perceived by human observers on synthetic dialogues.

Lydia Ould Ouali, Nicolas Sabouret, Charles Rich

Prestige Questions, Online Agents, and Gender-Driven Differences in Disclosure

This work considers the possibility of using virtual agents to encourage disclosure of sensitive information. In particular, this research used “prestige questions”, which asked participants to disclose information relevant to their socioeconomic status, such as credit limit, as well as university attendance, and mortgage or rent payments they could afford. We explored the potential for agents to enhance disclosure compared to conventional web-forms, due to their ability to serve as relational agents by creating rapport. To consider this possibility, agents were framed as artificially intelligent versus avatars controlled by a real human, and we compared these conditions to a version of the financial questionnaire with no agent. In this way, both the perceived agency of the agent and its ability to generate rapport were tested. Additionally, we examined the differences in disclosure between men and women in these conditions. Analyses revealed that agents (either AI- or human-framed) evoked greater disclosure compared to the no agent condition. However, there was some evidence that human-framed agents evoked greater lying. Thus, users in general responded more socially to the presence of a human- or AI-framed agent, and the benefits and costs of this approach were made apparent. The results are discussed in terms of rapport and anonymity.

Johnathan Mell, Gale Lucas, Jonathan Gratch

To Tell the Truth: Virtual Agents and Morning Morality

This paper investigates the impact of time of day on truthfulness in human-agent interactions. Time of day has been found to have important implications for moral behavior in human-human interaction. Namely, the morning morality effect shows that people are more likely to act ethically (i.e., tell fewer lies) in the morning than in the afternoon. Based on previous work on disclosure and virtual agents, we propose that this effect will not bear out in human-agent interactions. Preliminary evaluation shows that individuals who lie when engaged in multi-issue bargaining tasks with the Conflict Resolution Agent, a semi-automated virtual human, tell more lies to human negotiation partners than to virtual agent negotiation partners in the afternoon, and are more likely to tell more lies in the afternoon than in the morning when they believe they are negotiating with a human. Time of day does not have a significant effect on the number of lies told to the virtual agent during the multi-issue bargaining task.

Sharon Mozgai, Gale Lucas, Jonathan Gratch

Fixed-pie Lie in Action

Negotiation is a crucial skill for socially intelligent agents. Sometimes negotiators lie to gain advantage. In particular, they can claim that they want the same thing as their opponents (i.e., use a “fixed-pie lie”) to gain an advantage while appearing fair. The current work is the first attempt to examine the effectiveness of this strategy when used by agents against humans in realistic negotiation settings. Using the IAGO platform, we show that the exploitative agent indeed wins more points while appearing fair and honest to its opponent. In a second study, we investigated how far the exploitative agents could push for more gain and examined their effect on people’s behavior. This study shows that even though exploitative agents gained high value in the short term, their long-term success remains questionable, as they left their opponents unhappy and unsatisfied.

Zahra Nazari, Gale Lucas, Jonathan Gratch

Generation of Virtual Characters from Personality Traits

We present a method to generate a virtual character whose physical attributes reflect public opinion of a given personality profile. An initial reverse correlation experiment trains a model which explains the perception of personality traits from physical attributes. The reverse model, solved using linear programming, allows for the real-time generation of virtual characters from an input personality. The method has been applied to three personality traits (dominance, trustworthiness, and agreeableness) and 14 physical attributes and verified through both an analytic test and a subjective study.

Fabrizio Nunnari, Alexis Heloir

Effect of Visual Feedback Caused by Changing Mental States of the Avatar Based on the Operator’s Mental States Using Physiological Indices

Use of a virtual environment allows rehearsal or practice in specialized situations, extraordinary environments, or circumstances in which mistakes are not allowed. However, virtual experiences often do not translate to the real world. We propose to regard a virtual avatar as an agent which mediates between virtual experiences and the real ones. The aim of this study is to investigate how to enhance the effect of virtual experiences. To achieve this, we propose a method of providing feedback on human mental state by using affective avatar expressions based on physiological indices. We conduct experiments to evaluate the effect of the method. As a result, we find that feedback from avatar expressions can increase participants’ physiological responses without reducing concentration on the task. We suggest that this feedback can enhance commitment to virtual-world experiences.

Yoshimasa Ohmoto, Seiji Takeda, Toyoaki Nishida

That’s a Rap

Increasing Engagement with Rap Music Performance by Virtual Agents

Many applications of virtual agents, including those in healthcare and education, require engaging users in dozens or hundreds of interactions over long periods of time. In this effort, we are developing conversational agents to engage young adults in longitudinal lifestyle health behavior change interventions. Hip-hop and rap are among the most popular genres of music among our stakeholders, and we are exploring rap as an engagement mechanism and communication channel in our agent-based interventions. We describe a method for integrating rap into a counseling dialog by a conversational agent, including the acoustic manipulation of synthetic speech and accompanying character dance animation. We demonstrate in a within-subjects study that the participants who like rap music preferred the rapping character significantly more than an equivalent agent that does not rap in its dialog, based on both self-report and behavioral measures. Participants also found the rapping agent significantly more engaging than the non-rapping one.

Stefan Olafsson, Everlyne Kimani, Reza Asadi, Timothy Bickmore

Design of an Emotion Elicitation Tool Using VR for Human-Avatar Interaction Studies

With the development of socially interacting machines, it is important to understand how people react depending on their emotional state. Research in this area requires emotion elicitation devices. This paper presents such a tool using virtual reality (VR), which merges classical elicitation techniques to emphasize emotional response. The design choices are described for four emotions, and a performance analysis using questionnaires is reported.

P-H. Orefice, M. Ammi, M. Hafez, A. Tapus

Toward an Automatic Classification of Negotiation Styles Using Natural Language Processing

We present a natural language processing model that allows automatic classification and prediction of the user’s negotiation style during the interaction with virtual humans in a 3D game. We collected the sentences used in the interactions of the users with virtual artificial agents and their associated negotiation style as measured by the ROCI-II test. We analyzed the documents containing the sentences for each style applying text mining techniques and found statistical differences among the styles in agreement with their theoretical definitions. Finally, we trained two machine learning classifiers on the two datasets using pre-trained Word2Vec embeddings.

Daniela Pacella, Elena Dell’Aquila, Davide Marocco, Steven Furnell

Interactive Narration with a Child: Avatar versus Human in Video-Conference

This article reviews a part of the data collected in a “Wizard-of-Oz” environment, where children interact with a virtual character in a narrative setup. The experiment compares children’s engagement depending on the narrator type: either a piloted virtual character or a human in video-conference. The results show that engagement exists, but the modality of the interaction feedback varies in the two contexts.

Alexandre Pauchet, Ovidiu Şerban, Mélodie Ruinet, Adeline Richard, Émilie Chanoni, Mukesh Barange

Who, Me? How Virtual Agents Can Shape Conversational Footing in Virtual Reality

The nonverbal behaviors of conversational partners reflect their conversational footing, signaling who in the group are the speakers, addressees, bystanders, and overhearers. Many applications of virtual reality (VR) will involve multiparty conversations with virtual agents and avatars of others where appropriate signaling of footing will be critical. In this paper, we introduce computational models of gaze and spatial orientation that a virtual agent can use to signal specific footing configurations. An evaluation of these models through a user study found that participants conformed to conversational roles signaled by the agent and contributed to the conversation more as addressees than as bystanders. We observed these effects in immersive VR, but not on a 2D display, suggesting an increased sensitivity to virtual agents’ footing cues in VR-based interfaces.

Tomislav Pejsa, Michael Gleicher, Bilge Mutlu

Cubus: Autonomous Embodied Characters to Stimulate Creative Idea Generation in Groups of Children

Creativity is an ability that is crucial in today’s societies. It is, therefore, important to develop activities that stimulate creativity at a very young age. It seems, however, that there is a lack of tools to support these activities. In this paper, we introduce Cubus, a tool that uses autonomous synthetic characters to stimulate idea generation in groups of children during a storytelling activity. With Cubus, children can invent a story and use the stop-motion technique to record a movie depicting it. In this paper, we explain Cubus’ system design and architecture and present the evaluation of Cubus’ impact in a creative task. This evaluation investigated idea generation in groups of children during their creative process of storytelling. Results showed that the autonomous behaviors of Cubus’ virtual agents contributed to the generation of more ideas in children, a key dimension of creativity.

André Pires, Patrícia Alves-Oliveira, Patrícia Arriaga, Carlos Martinho

Interacting with a Semantic Affective ECA

This paper presents an affectively enhanced semantic ECA named E-VOX. The core of E-VOX is a cognitive-affective architecture based on Soar and extended with an affective model inspired by ALMA that takes into account emotions, mood, and personality. E-VOX works as an assistant to provide useful information from Wikipedia, supporting real-feel human-computer interaction. User interaction with the ECA is explained and first tests with users are shown. These tests have revealed that the ECA is perceived as useful, easy to use and entertaining. Thanks to the cognitive-affective architecture, the agent’s behavior is modulated by its personality, influencing agent-user interaction and the perception of the agent by the user. The agent’s emotional behavior has been perceived by users as realistic though not always sufficiently expressive.

Joaquín Pérez, Yanet Sánchez, Francisco J. Serón, Eva Cerezo

Towards Believable Interactions Between Synthetic Characters

Believable interactions between synthetic characters are an important factor defining the success of a virtual environment relying on human participants being able to create emotional bonds with artificial characters. As important as the characters being themselves believable is that the interaction with or between such characters is believable. In this work, we bridge affective computing and traditional animation principles to create 3Motion, a model for synthetic character interaction based on anticipation and emotion that allows for precise affective communication of intention-based behaviors. We present an exploratory study with 52 participants supporting that our approach is able to increase overall interaction believability.

Ricardo Rodrigues, Carlos Martinho

Joint Learning of Speech-Driven Facial Motion with Bidirectional Long-Short Term Memory

The face conveys a blend of verbal and nonverbal information playing an important role in daily interaction. While speech articulation mostly affects the orofacial areas, emotional behaviors are externalized across the entire face. Considering the relation between verbal and nonverbal behaviors is important to create naturalistic facial movements for conversational agents (CAs). Furthermore, facial muscles connect areas across the face, creating principled relationships and dependencies between the movements that have to be taken into account. These relationships are ignored when facial movements across the face are separately generated. This paper proposes to create speech-driven models that jointly capture the relationship not only between speech and facial movements, but also across facial movements. The inputs to the models are features extracted from speech that convey the verbal and emotional states of the speakers. We build our models with bidirectional long-short term memory (BLSTM) units, which are shown to be very successful in modeling dependencies for sequential data. The objective and subjective evaluations of the results demonstrate the benefits of joint modeling of facial regions using this framework.

Najmeh Sadoughi, Carlos Busso

Integration of Multi-modal Cues in Synthetic Attention Processes to Drive Virtual Agent Behavior

Simulations and serious games require realistic behavior of multiple intelligent agents in real-time. One particular issue is how attention and multi-modal sensory memory can be modeled in a natural but effective way, such that agents controllably react to salient objects or are distracted by other multi-modal cues from their current intention. We propose a conceptual framework that provides a solution with adherence to three main design goals: natural behavior, real-time performance, and controllability. As a proof of concept, we implement three major components and showcase effectiveness in a real-time game engine scenario. Within the exemplified scenario, a visual sensor is combined with static saliency probes and auditory cues. The attention model weighs bottom-up attention against intention-related top-down processing, controllable by a designer using memory and attention inhibitor parameters. We demonstrate our case and discuss future extensions.

Sven Seele, Tobias Haubrich, Tim Metzler, Jonas Schild, Rainer Herpers, Marcin Grzegorzek

A Categorization of Virtual Agent Appearances and a Qualitative Study on Age-Related User Preferences

Many variables influence the perception of appearance, and they are difficult to examine holistically in a quantitative approach. To give a holistic overview of appearance variables, a systematic categorization of different dimensions was developed. This is of special importance for the application field of companions for seniors, whose preferences regarding appearance are under-researched. Therefore, based on the categorization, 11 interviews with two different target groups (six students, five elderly) were conducted. Results indicate that seniors tend to prefer a realistic humanoid agent, while students mostly rejected this appearance and instead favored zoomorphic or machinelike agents in comic stylization. In sum, the current research gives a first hint that there are age-related differences with regard to appearance.

Carolin Straßmann, Nicole C. Krämer

Towards Reasoned Modality Selection in an Embodied Conversation Agent

We present work in progress on (verbal, facial, and gestural) modality selection in an embodied multilingual and multicultural conversation agent. In contrast to most of the recent proposals, which consider non-verbal behavior as being superimposed on and/or derived from the verbal modality, we argue for a holistic model that assigns modalities to individual content elements in accordance with semantic and contextual constraints as well as with cultural and personal characteristics of the addressee. Our model is thus in line with the SAIBA framework, although methodological differences become apparent at a more fine-grained level of realization.

Carla Ten-Ventura, Roberto Carlini, Stamatia Dasiopoulou, Gerard Llorach Tó, Leo Wanner

Lay Causal Explanations of Human vs. Humanoid Behavior

The present study used a questionnaire-based method for investigating people’s interpretations of behavior exhibited by a person and a humanoid robot, respectively. Participants were given images and verbal descriptions of different behaviors and were asked to judge the plausibility of seven causal explanation types. Results indicate that human and robot behavior are explained similarly, but with some significant differences, and with less agreement in the robot case.

Sam Thellman, Annika Silvervarg, Tom Ziemke

Generating Situation-Based Motivational Feedback in a PTSD E-health System

Motivating users is an important task for virtual agents in behaviour change support systems. In this study we present a system which generates motivational statements based on situation type, aimed at a virtual agent for Post-Traumatic Stress Disorder therapy. Using input from experts (n=13), we built a database containing what categories of motivation to use based on therapy progress and current user trust. A statistical analysis confirmed that we can significantly predict the category of statements to use. Using the database, we present a system that generates motivational statements. Because this system is based on expert data, it has the advantage of not needing large amounts of patient data. We envision that by basing the content directly on expert knowledge, the virtual agent can motivate users as a human expert would.

Myrthe Tielman, Mark Neerincx, Willem-Paul Brinkman

Talk About Death: End of Life Planning with a Virtual Agent

For those nearing the end of life, “wellness” must encompass reduction in suffering as well as the promotion of behaviors that mitigate stress and help people prepare for death. We discuss the design of a virtual conversational palliative care coach that works with individuals during their last year of life to help them manage symptoms, reduce stress, identify and address unmet spiritual needs, and support advance care planning. We present the results of an experiment that features the reactions of older adults in discussing these topics with a virtual agent, and note the importance of discussing spiritual needs in the context of end-of-life conversations. We find that all participants are comfortable discussing these topics with an agent, and that their discussion leads to reductions in state and death anxiety, as well as a significant increase in intent to create a last will and testament.

Dina Utami, Timothy Bickmore, Asimina Nikolopoulou, Michael Paasche-Orlow

Social Gaze Model for an Interactive Virtual Character

This paper describes a live demo of our autonomous social gaze model for an interactive virtual character situated in the real world. We are interested in estimating which user has an intention to interact, in other words which user is engaged with the virtual character. The model takes into account behavioral cues such as proximity, velocity, posture and sound, estimates an engagement score and drives the gaze behavior of the virtual character. Initially, we assign equal weights to these features. Using data collected in a real setting, we analyze which features have higher importance. We found that the model with weighted features correlates better with the ground-truth data.

Bram van den Brink, Christyowidiasmoro, Zerrin Yumak

Studying Gender Bias and Social Backlash via Simulated Negotiations with Virtual Agents

This research investigates whether (female and male) virtual negotiators experience a social backlash during negotiations with an economic outcome when they use a negotiation style that is congruent with the opposite gender. An interactive turn-based negotiation using a virtual agent as employee is used in an experiment with 93 participants. Results show that the effect of gender on negotiation outcome and social backlash was less pronounced in this experiment than expected based on existing literature. Nevertheless, the results provide several interesting pointers for follow-up research.

L. M. van der Lubbe, T. Bosse

The Dynamics of Human-Agent Trust with POMDP-Generated Explanations

Partially Observable Markov Decision Processes (POMDPs) enable optimized decision making by robots, agents, and other autonomous systems. This quantitative optimization can also be a limitation in human-agent interaction, as the resulting autonomous behavior, while possibly optimal, is often impenetrable to human teammates, leading to improper trust and, subsequently, disuse or misuse of such systems [1].

Ning Wang, David V. Pynadath, Susan G. Hill, Chirag Merchant

Virtual Role-Play with Rapid Avatars

Digital doppelgangers possess great potential to serve as powerful models for behavioral change. An emerging technology, the Rapid Avatar Capture and Simulation (RACAS), enables low-cost and high-speed scanning of a human user and creation of a digital doppelganger that is a fully animatable virtual 3D model of the user. We designed a virtual role-playing game, DELTA, with digital doppelgangers to influence a human user’s attitude towards sexism on college campuses. In this demonstration, we will showcase the RACAS system and the DELTA game.

Ning Wang, Ari Shapiro, David Schwartz, Gabrielle Lewine, Andrew Wei-Wen Feng

Motion Capture Synthesis with Adversarial Learning

We propose a new statistical modeling approach that we call Sequential Adversarial Auto-encoder (SAAE) for learning a synthesis model for motion sequences. This model exploits the adversarial idea that has been popularized in the machine learning field for learning accurate generative models. We further propose a conditional variant of this model that takes additional information as input, such as the activity performed in a sequence or the emotion with which it is performed, and which allows synthesis in context.

Qi Wang, Thierry Artières

