
About this Book

The origin of the Intelligent Virtual Agents conference dates from a successful workshop on Intelligent Virtual Environments held in Brighton, UK at the 13th European Conference on Artificial Intelligence (ECAI'98). This workshop was followed by a second one held in Salford in Manchester, UK in 1999. Subsequent events took place in Madrid, Spain in 2001, Irsee, Germany in 2003 and Kos, Greece in 2005. Starting in 2006, Intelligent Virtual Agents moved from being a biennial to an annual event and became a full-fledged international conference, hosted in California. This volume contains the proceedings of the 6th International Conference on Intelligent Virtual Agents, IVA 2006, held in Marina del Rey, California, USA, August 21–23. For the second year in a row, IVA also hosted the Gathering of Animated Lifelike Agents (GALA 2006), an annual festival to showcase the latest animated lifelike agents created by university students and academic or industrial research groups. IVA 2006 received 73 submissions from Europe, the Americas and Asia. The papers published here are the 24 full papers and 11 short papers presented at the conference, as well as one-page descriptions of posters and the featured invited talks by Brian Parkinson of Oxford University, Rod Humble of Electronic Arts, and Michael Mateas of the University of California, Santa Cruz and Andrew Stern of Procedural Arts.



Social Impact of IVAs

Why Fat Interface Characters Are Better e-Health Advisors

In an experimental setting, we investigated whether body shape similarity between user and interface character affected involvement with, distance towards, as well as intentions to use the character in an e-health context. Users interacted with an interface character with the same (similar) or with a different (dissimilar) body shape as their own. Furthermore, the character’s body shape was negatively valenced (heavier than ideal) or positively valenced (same as ideal). In contrast to what one might expect from stereotype research, users perceived non-ideal (fatter) characters as more credible and trustworthy than ideal (slim) characters. Especially users similar in body shape to a non-ideal character felt the least distant towards fatter characters. These users also preferred relatively fat characters over slim characters. Considering the increasing number of overweight people in society, it seems most effective to design interface characters with bodies fatter than those in current e-health applications, which often feature slim characters.

H. C. van Vugt, E. A. Konijn, J. F. Hoorn, J. Veldhuis

Virtual Rapport

Effective face-to-face conversations are highly interactive. Participants respond to each other, engaging in nonconscious behavioral mimicry and backchanneling feedback. Such behaviors produce a subjective sense of rapport and are correlated with effective communication, greater liking and trust, and greater influence between participants. Creating rapport requires a tight sense-act loop that has been traditionally lacking in embodied conversational agents. Here we describe a system, based on psycholinguistic theory, designed to create a sense of rapport between a human speaker and virtual human listener. We provide empirical evidence that it increases speaker fluency and engagement.

Jonathan Gratch, Anna Okhmatovskaia, Francois Lamothe, Stacy Marsella, Mathieu Morales, R. J. van der Werf, Louis-Philippe Morency

IVAs Recognizing Human Behavior

Imitation Learning and Response Facilitation in Embodied Agents

Imitation is supposedly a fundamental mechanism for humans to learn new actions and to gain knowledge about another’s intentions. The basis of this behavior seems to be a direct influencing of the motor system by the perceptual system, affording fast, selective enhancement of a motor response already in the repertoire (response facilitation) as well as learning and delayed reproduction of new actions (true imitation).

In this paper, we present an approach to attain these capabilities in virtual embodied agents. Building upon a computational motor control model, our approach connects visual representations of observed hand and arm movements to graph-based representations of motor commands. Forward and inverse models are employed to allow for both fast mimicking responses as well as imitation learning.

Stefan Kopp, Olaf Graeser

Robust Recognition of Emotion from Speech

This paper presents robust recognition of a subset of emotions by animated agents from salient spoken words. To develop and evaluate the model for each emotion from the chosen subset, both prosodic and acoustic features were used to extract the intonational patterns and correlates of emotion from speech samples. The computed features were projected using a combination of linear projection techniques for a compact and clustered representation of features. The projected features were used to build models of emotions using a set of classifiers organized in a hierarchical fashion. The performances of the models were obtained using a number of classifiers from the WEKA machine learning toolbox. Empirical analysis indicated that the lexical information computed from both the prosodic and acoustic features at word level yielded robust classification of emotions.

Mohammed E. Hoque, Mohammed Yeasin, Max M. Louwerse

Affect Detection from Human-Computer Dialogue with an Intelligent Tutoring System

We investigated the possibility of detecting affect from natural language dialogue in an attempt to endow an intelligent tutoring system, AutoTutor, with the ability to incorporate the learner’s affect into its pedagogical strategies. Training and validation data were collected in a study in which college students completed a learning session with AutoTutor and subsequently affective states of the learner were identified by the learner, a peer, and two trained judges. We analyzed each of these 4 data sets with the judges’ affect decisions, along with several dialogue features that were mined from AutoTutor’s log files. Multiple regression analyses confirmed that dialogue features could significantly predict particular affective states (boredom, confusion, flow, and frustration). A variety of standard classifiers were applied to the dialogue features in order to assess the accuracy of discriminating between the individual affective states compared with the baseline state of neutral.

Sidney D’Mello, Art Graesser

Exploitation in Affect Detection in Improvisational E-Drama

We report progress on adding affect detection to a program for virtual dramatic improvisation, monitored by a human director. To aid the director, we have partially implemented emotion detection within users’ text input. The affect-detection module has been used to help develop an automated virtual actor. The work involves basic research into how affect is conveyed through metaphor, and contributes to conference themes such as building improvisational intelligent virtual agents for interactive narrative environments.

Li Zhang, John A. Barnden, Robert J. Hendley, Alan M. Wallington

Human Interpretation of IVA Behavior

An Exploration of Delsarte’s Structural Acting System

The designers of virtual agents often draw on a large research literature in psychology, linguistics and human ethology to design embodied agents that can interact with people. In this paper, we consider a structural acting system developed by Francois Delsarte as a possible resource in designing the nonverbal behavior of embodied agents. Using human subjects, we evaluate one component of the system, Delsarte’s Cube, which addresses the meaning of differing attitudes of the hand in gestures.

Stacy C. Marsella, Sharon Marie Carnicke, Jonathan Gratch, Anna Okhmatovskaia, Albert Rizzo

Perception of Blended Emotions: From Video Corpus to Expressive Agent

Real life emotions are often blended and involve several simultaneous superposed or masked emotions. This paper reports on a study on the perception of multimodal emotional behaviors in Embodied Conversational Agents. This experimental study aims to evaluate whether people properly detect the signs of emotions in different modalities (speech, facial expressions, gestures) when these signs appear to be superposed or masked. We compared the perception of emotional behaviors annotated in a corpus of TV interviews and replayed by an expressive agent at different levels of abstraction. The results provide insights on the use of such protocols for studying the effect of various models and modalities on the perception of complex emotions.

Stéphanie Buisine, Sarkis Abrilian, Radoslaw Niewiadomski, Jean-Claude Martin, Laurence Devillers, Catherine Pelachaud

Perceiving Visual Emotions with Speech

Embodied Conversational Agents (ECAs) with realistic faces are becoming an intrinsic part of many graphics systems employed in HCI applications. A fundamental issue is how people visually perceive the affect of a speaking agent. In this paper we present the first study evaluating the relation between objective and subjective visual perception of emotion as displayed on a speaking human face, using both full video and sparse point-rendered representations of the face. We found that objective machine learning analysis of facial marker motion data is correlated with evaluations made by experimental subjects, and in particular, the lower face region provides insightful emotion clues for visual emotion perception. We also found that affect is captured in the abstract point-rendered representation.

Zhigang Deng, Jeremy Bailenson, J. P. Lewis, Ulrich Neumann

Embodied Conversational Agents

Dealing with Out of Domain Questions in Virtual Characters

We consider the problem of designing virtual characters that support speech-based interactions in a limited domain. Previously we have shown that classification can be an effective and robust tool for selecting appropriate in-domain responses. In this paper, we consider the problem of dealing with out-of-domain user questions. We introduce a taxonomy of out-of-domain response types. We consider three classification architectures for selecting the most appropriate out-of-domain responses. We evaluate these architectures and show that they significantly improve the quality of the response selection making the user’s interaction with the virtual character more natural and engaging.

Ronakkumar Patel, Anton Leuski, David Traum

MIKI: A Speech Enabled Intelligent Kiosk

We introduce MIKI, a three-dimensional, directory assistance-type digital persona displayed on a prominently positioned 50-inch plasma unit housed at the FedEx Institute of Technology at the University of Memphis. MIKI, which stands for Memphis Intelligent Kiosk Initiative, guides students, faculty and visitors through the Institute’s maze of classrooms, labs, lecture halls and offices through graphically rich, multidimensional, interactive, touch- and voice-sensitive digital content. MIKI differs from other intelligent kiosk systems through its advanced natural language understanding capabilities, which provide it with the ability to answer informal verbal queries without the need for rigorous phraseology. This paper describes, in general, the design, implementation, and observations of visitor reactions to the Intelligent Kiosk.

Lee McCauley, Sidney D’Mello

Architecture of a Framework for Generic Assisting Conversational Agents

In this paper, we focus on the notion of Assisting Conversational Agents (ACA): embodied agents dedicated to the function of assistance for novice users of software components and/or web services. We discuss the main requirements of such agents and we emphasize the genericity issue arising in the dialogical part of such architectures. This prompts us to propose a mediator-based framework, using a dynamic symbolic representation of the runtime of the assisted components. We then define three strategies for the development of the mediators, which are validated by implementations in various situations.

Jean-Paul Sansonnet, David Leray, Jean-Claude Martin

A Comprehensive Context Model for Multi-party Interactions with Virtual Characters

Contextual information plays a crucial role in nearly every conversational setting. When people engage in conversations they rely on what has previously been uttered or done in various ways. Some nonverbal actions are ambiguous when viewed on their own. However, when viewed in their context of use their meaning is obvious. Autonomous virtual characters that perceive and react to events in conversations just like humans do also need a comprehensive representation of this contextual information. In this paper we describe the design and implementation of a comprehensive context model for virtual characters.

Norbert Pfleger, Markus Löckelt

“What Would You Like to Talk About?” An Evaluation of Social Conversations with a Virtual Receptionist

We describe an empirical study of Marve, a virtual receptionist located at the entrance of our research laboratory. Marve engages with lab members and visitors in natural face-to-face communication, takes and delivers messages, tells knock-knock jokes, conducts natural small talk on movies, and discusses the weather. In this research, we investigate the relative popularity of Marve’s social conversational capabilities and his role-specific messaging tasks, as well as his perceived social characteristics. Results indicate that users are interested in interacting with Marve, use social conversational conventions with Marve, and perceive and describe him as a social entity.

Sabarish Babu, Stephen Schmugge, Tiffany Barnes, Larry F. Hodges

Characteristics of Nonverbal Behavior

Gesture Expressivity Modulations in an ECA Application

In this paper, we propose a study of co-verbal gesture properties that could enhance the animation of an Embodied Conversational Agent and its communicative performance. This work is based on an analysis of gesture expressivity over time that we have carried out on a corpus of 2D animations. First results point out two types of modulations in gesture expressivity, which are evaluated for their communicative performance. A model of these modulations is proposed.

Nicolas Ech Chafai, Catherine Pelachaud, Danielle Pelé, Gaspard Breton

Visual Attention and Eye Gaze During Multiparty Conversations with Distractions

Our objective is to develop a computational model to predict visual attention behavior for an embodied conversational agent. During interpersonal interaction, gaze provides signal feedback and directs conversation flow. Simultaneously, in a dynamic environment, gaze also directs attention to peripheral movements. An embodied conversational agent should therefore employ social gaze not only for interpersonal interaction but also to possess human attention attributes so that its eyes and facial expression portray and convey appropriate distraction and engagement behaviors.

Erdan Gu, Norman I. Badler

Behavior Representation Languages

Towards a Common Framework for Multimodal Generation: The Behavior Markup Language

This paper describes an international effort to unify a multimodal behavior generation framework for Embodied Conversational Agents (ECAs). We propose a three-stage model we call SAIBA, where the stages represent intent planning, behavior planning and behavior realization. A Function Markup Language (FML), describing intent without referring to physical behavior, mediates between the first two stages, and a Behavior Markup Language (BML), describing desired physical realization, mediates between the last two stages. In this paper we focus on BML. The hope is that this abstraction and modularization will help ECA researchers pool their resources to build more sophisticated virtual humans.

Stefan Kopp, Brigitte Krenn, Stacy Marsella, Andrew N. Marshall, Catherine Pelachaud, Hannes Pirker, Kristinn R. Thórisson, Hannes Vilhjálmsson

MPML3D: A Reactive Framework for the Multimodal Presentation Markup Language

MPML3D is our first candidate of the next generation of authoring languages aimed at supporting digital content creators in providing highly appealing and highly interactive content with little effort. The language is based on our previously developed family of Multimodal Presentation Markup Languages (MPML) that broadly followed the “sequential” and “parallel” tagging structure scheme for generating pre-synchronized presentations featuring life-like characters and interactions with the user. The new markup language MPML3D deviates from this design framework and proposes a reactive model instead, which is apt to handle interaction-rich scenarios with highly realistic 3D characters. Interaction in previous versions of MPML could be handled only at the cost of considerable scripting effort due to branching. By contrast, MPML3D advocates a reactive model that allows perceptions of other characters or the user to interfere with the presentation flow at any time, and thus facilitates natural and unrestricted interaction. MPML3D is designed as a powerful and flexible language that is easy to use for non-experts, but it is also extensible, as it allows content creators to add functionality such as a narrative model by using popular scripting languages.

Michael Nischt, Helmut Prendinger, Elisabeth André, Mitsuru Ishizuka

Generation of Nonverbal Behavior with Speech

Creativity Meets Automation: Combining Nonverbal Action Authoring with Rules and Machine Learning

Providing virtual characters with natural gestures is a complex task. Even if the range of gestures is limited, deciding when to play which gesture may be considered both an engineering or an artistic task. We want to strike a balance by presenting a system where gesture selection and timing can be human authored in a script, leaving full artistic freedom to the author. However, to make authoring faster we offer a rule system that generates gestures on the basis of human authored rules. To push automation further, we show how machine learning can be utilized to suggest further rules on the basis of previously annotated scripts. Our system thus offers different degrees of automation for the author, allowing for creativity and automation to join forces.

Michael Kipp

Nonverbal Behavior Generator for Embodied Conversational Agents

Believable nonverbal behaviors for embodied conversational agents (ECA) can create a more immersive experience for users and improve the effectiveness of communication. This paper describes a nonverbal behavior generator that analyzes the syntactic and semantic structure of the surface text as well as the affective state of the ECA and annotates the surface text with appropriate nonverbal behaviors. A number of video clips of people conversing were analyzed to extract the nonverbal behavior generation rules. The system works in real-time and is user-extensible so that users can easily modify or extend the current behavior generation rules.

Jina Lee, Stacy Marsella

[HUGE]: Universal Architecture for Statistically Based HUman GEsturing

We introduce a universal architecture for a statistically based HUman GEsturing (HUGE) system, for producing and using statistical models of facial gestures based on any kind of inducement. As inducement we consider any kind of signal that occurs in parallel to the production of gestures in human behaviour and that may have a statistical correlation with the occurrence of gestures, e.g. the text that is spoken, the audio signal of speech, bio-signals, etc. The correlation between the inducement signal and the gestures is used to first build the statistical model of gestures from a training corpus consisting of sequences of gestures and corresponding inducement data sequences. In the runtime phase, raw, previously unknown inducement data is used to trigger (induce) the real-time gestures of the agent based on the previously constructed statistical model. We present the general architecture and implementation issues of our system, and further clarify it through two case studies. We believe that this universal architecture is useful for experimenting with various kinds of potential inducement signals and their features, and for exploring the correlation of such signals or features with gesturing behaviour.

Karlo Smid, Goranka Zoric, Igor S. Pandzic

A Story About Gesticulation Expression

Gesticulation is essential for the storytelling experience; thus, virtual storytellers should be endowed with gesticulation expression. This work proposes a gesticulation expression model based on psycholinguistics. The model supports: (a) real-time gesticulation animation described as sequences of constraints on static (Portuguese Sign Language hand shapes, orientations and positions) and dynamic (motion profiles) features; (b) multimodal synchronization between gesticulation and speech; (c) automatic reproduction of gesticulation annotated according to GestuRA, a gesture transcription algorithm. To evaluate the model, two studies involving 147 subjects were conducted. In both cases, the idea consisted of comparing the narration of the Portuguese traditional story “The White Rabbit” by a human storyteller with a version by a virtual storyteller. Results indicate that synthetic gestures fared well when compared to real gestures; however, subjects preferred the human storyteller.

Celso de Melo, Ana Paiva

IVAs in Serious Games

Introducing EVG: An Emotion Evoking Game

A dungeon role-playing game intended to induce emotions such as boredom, surprise, joy, anger and disappointment is introduced. In a preliminary study, facial expressions indicating boredom and anger were observed. Individual differences were found in the appraisal and facial expression of surprise, joy and disappointment.

Ning Wang, Stacy Marsella

Towards a Reactive Virtual Trainer

A Reactive Virtual Trainer (RVT) is an Intelligent Virtual Agent (IVA) capable of presenting physical exercises that are to be performed by a human, monitoring the user and providing feedback at different levels. Depending on the motivation and the application context, the exercises may be general fitness exercises to improve the user’s physical condition, special exercises to be performed from time to time during work to prevent, for example, RSI, or physiotherapy exercises with medical indications. In the paper we discuss the functional and technical requirements of a framework which can be used to author specific RVT applications. The focus is on the reactivity of the RVT, manifested in natural language comments on readjusting the tempo, pointing out mistakes or rescheduling the exercises. We outline the components we have implemented so far: our animation engine, the composition of exercises from basic motions and the module for analysis of tempo in acoustic input.

Zsófia Ruttkay, Job Zwiers, Herwin van Welbergen, Dennis Reidsma

Making It Up as You Go Along – Improvising Stories for Pedagogical Purposes

We consider the issues involved in taking educational role-play into a virtual environment with intelligent graphical characters, who implement a cognitive appraisal system and autonomous action selection. Issues in organizing emergent narratives are discussed with respect to a Story Facilitator as well as the impact on the authoring process.

Ruth Aylett, Rui Figueiredo, Sandy Louchart, João Dias, Ana Paiva

Cognition and Emotion I

A Neurobiologically Inspired Model of Personality in an Intelligent Agent

We demonstrate how current knowledge about the neurobiology and structure of human personality can be used as the basis for a computational model of personality in intelligent agents (PAC—personality, affect, and culture). The model integrates what is known about the neurobiology of human motivation and personality with knowledge about the psychometric structure of trait language and personality tests. Thus, the current model provides a principled theoretical account that is based on what is currently known about the structure and neurobiology of human personality and tightly integrates it into a computational architecture. The result is a motive-based computational model of personality that provides a psychologically principled basis for intelligent virtual agents with realistic and engaging personality.

Stephen Read, Lynn Miller, Brian Monroe, Aaron Brownstein, Wayne Zachary, Jean-Christophe LeMentec, Vassil Iordanov

Feeling Ambivalent: A Model of Mixed Emotions for Virtual Agents

Mixed emotions, especially those in conflict, sway agent decisions and result in dramatic changes in social scenarios. However, the emotion models and architectures for virtual agents are not yet advanced enough to be imbued with coexisting emotions. In this paper, an improved emotion model integrated with decision-making algorithms is proposed to deal with two topics: the generation of coexisting emotions, and the resolution of ambivalence, in which two emotions conflict. A scenario of ambivalence is provided to illustrate the agent’s decision-making process.

Benny Ping-Han Lee, Edward Chao-Chun Kao, Von-Wun Soo

Are Computer-Generated Emotions and Moods Plausible to Humans?

This paper presents results of the plausibility evaluation of computer-generated emotions and moods. They are generated by ALMA (A Layered Model of Affect), a real-time computational model of affect, designed to serve as a modular extension for virtual humans. By a unique integration of psychological models of affect, it provides three major affect types: emotions, moods and personality that cover short, medium, and long term affect. The evaluation of computer-generated affect is based on textual dialog situations in which at least two characters are interacting with each other. In this setup, elicited emotions or the change of mood are defined as consequences of dialog contributions from the involved characters. The results indicate that ALMA provides authentic believable emotions and moods. They can be used for modules that control cognitive processes and physical behavior of virtual humans in order to improve their lifelikeness and their believable qualities.

Patrick Gebhard, Kerstin H. Kipp

Creating Adaptive and Individual Personalities in Many Characters Without Hand-Crafting Behaviors

Believable characters significantly increase the immersion of users or players in interactive applications. A key component of believable characters is their personality, which has previously been implemented statically using the time consuming task of hand-crafting individuality for each character. Often personality has been modeled based on theories that assume behavior is the same regardless of situation and environment. This paper presents a simple affective and cognitive framework for interactive entertainment characters that allows adaptation of behavior based on the environment and emotions. Different personalities are reflected in behavior preferences which are generated based on individual experience. An initial version of the framework has been implemented in a simple scenario to explore which parameters have the greatest effect on agent diversity.

Jennifer Sandercock, Lin Padgham, Fabio Zambetta

Cognition and Emotion II

Thespian: Modeling Socially Normative Behavior in a Decision-Theoretic Framework

To facilitate conversations with the human players in interactive dramas, virtual characters should follow conversational norms similar to those that govern human-human conversations. In this paper, we present a model of conversational norms in a decision-theoretic framework. This model is employed in the Thespian interactive drama system. In Thespian, characters have explicit goals of following norms, in addition to their other personal goals, and use a unified decision-theoretic framework to reason about conflicts among these goals. Different characters can weigh their goals in different ways and therefore have different behaviors. We discuss the model of conversational norms in Thespian. We also present preliminary experiments on modeling various kinds of characters using this model.

Mei Si, Stacy C. Marsella, David V. Pynadath

Autobiographic Knowledge for Believable Virtual Characters

It has been widely acknowledged in the areas of human memory and cognition that behaviour and emotion are essentially grounded in autobiographic knowledge. In this paper we propose an overall framework of human autobiographic memory for modelling believable virtual characters in narrative story-telling systems and role-playing computer games. We first lay out the background research on autobiographic memory in Psychology, Cognitive Science and Artificial Intelligence. Our autobiographic agent framework is then detailed, with features supporting other cognitive processes which have been extensively modelled in the design of believable virtual characters (e.g. goal structure, emotion, attention, memory schema and reactive behaviour-based control at a lower level). Finally, we outline directions for future research.

Wan Ching Ho, Scott Watson

Teachable Characters: User Studies, Design Principles, and Learning Performance

Teachable characters can enhance entertainment technology by providing new interactions, becoming more competent at game play, and simply being fun to teach. It is important to understand how human players try to teach virtual agents in order to design agents that learn effectively from this instruction. We present results of a user study where people teach a virtual agent a novel task within a reinforcement-based learning framework. Analysis yields lessons of how human players approach the task of teaching a virtual agent: 1) they want to direct the agent’s attention; 2) they communicate both instrumental and motivational intentions; 3) they tailor their instruction to their understanding of the agent; and 4) they use negative communication as both feedback and as a suggestion for the next action. Based on these findings we modify the agent’s learning algorithm and show improvements to the learning interaction in follow-up studies. This work informs the design of real-time learning agents that better match human teaching behavior to learn more effectively and be more enjoyable to teach.

Andrea L. Thomaz, Cynthia Breazeal

Applications of IVAs

FearNot’s Appearance: Reflecting Children’s Expectations and Perspectives

This paper discusses FearNot, a virtual learning environment populated by synthetic characters aimed at the 8-12 year old age group for the exploration of bullying and coping strategies. Currently, FearNot is being redesigned from a lab-based prototype into a classroom tool. In this paper we focus on informing the design of the characters and of the virtual learning environment through our interpretation of qualitative data gathered about interaction with FearNot by 345 children. The paper focuses on qualitative data collected using the Classroom Discussion Forum technique and discusses its implications for the redesign of the media used for FearNot. The interpretation of the data indicates that using fairly naïve synthetic characters to achieve empathic engagement appears to be an appropriate approach. Results do indicate a focus for redesign, with a clear need for improved animation transitions, for identification and repair of inconsistent graphical elements, and for a greater cast of characters and range of sets to achieve optimal engagement levels.

Lynne Hall, Marco Vala, Marc Hall, Marc Webster, Sarah Woods, Adrian Gordon, Ruth Aylett

Populating Reconstructed Archaeological Sites with Autonomous Virtual Humans

Significant multidisciplinary efforts combining archaeology and computer science have yielded virtual reconstructions of archaeological sites for visualization. Yet comparatively little attention has been paid to the difficult problem of populating these models, not only to enhance the quality of the visualization, but also to arrive at quantitative computer simulations of the human inhabitants that can help test hypotheses about the possible uses of these sites in ancient times. We introduce an artificial life approach to populating large-scale reconstructions of archaeological sites with virtual humans. Unlike conventional “crowd” models, our comprehensive, detailed models of individual autonomous pedestrians span several modeling levels, including appearance, locomotion, perception, behavior, and cognition. We review our human simulation system and its application to a “modern archaeological” recreation of activity in New York City’s original Pennsylvania Station. We also describe an extension of our system and present its novel application to the visualization of possible human activity in a reconstruction of the Great Temple of ancient Petra in Jordan.

Wei Shao, Demetri Terzopoulos

Evaluating the Tangible Interface and Virtual Characters in the Interactive COHIBIT Exhibit

When using virtual characters in the human-computer interface, the question arises of how useful this kind of interface is: whether the human user accepts, enjoys and profits from this form of interaction. Thorough system evaluations, however, are rarely done. We propose a post-questionnaire evaluation for a virtual character system that we apply to COHIBIT, an interactive museum exhibit with virtual characters. The evaluation study investigates the subjects’ experiences with the exhibit with regard to informativeness, entertainment and virtual character perception. Our subjects rated the exhibit as both entertaining and informative and gave it a good overall mark. We discuss the detailed results and identify useful factors to consider when building and evaluating virtual character applications.

Michael Kipp, Kerstin H. Kipp, Alassane Ndiaye, Patrick Gebhard

Invited Talks

Invited Talk: Rule Systems and Video Games

In this talk I will examine the different kinds of rule systems that have been used historically within games. I will then explore the emerging field of self-modifying rule systems within computer games. In conclusion, I will cover various techniques that may be applied and how games relating to everyday life, such as The Sims, can benefit from such systems.

Rod Humble

Invited Talk: Façade: Architecture and Authorial Idioms for Believable Agents in Interactive Drama


Façade is a first-person, real-time interactive drama that integrates autonomous characters, an interactive plot that goes beyond simple story graphs, and natural language understanding. Since its release as freeware in July 2005, Façade has been downloaded over 350,000 times and has received widespread critical acclaim from players, game developers and the mainstream press.

Michael Mateas, Andrew Stern

Invited Talk: Social Effects of Emotion: Two Modes of Relation Alignment

This talk proposes that a central function of many emotions is to configure and reconfigure the relational positions of two or more social agents with respect to some intentional object. This process of relation alignment can proceed at two levels (often in parallel). At one level, there is implicit adjustment to the unfolding transaction on a moment-by-moment basis. At the second level, there is a more strategic presentation of a relational stance oriented to anticipated reactions from the other (and from the material environment). I contrast this relational approach to emotion with the appraisal account which sees emotions simply as reactions to apprehended relational meaning. In my view, appraisal often emerges as a consequence of the adoption of a relational stance rather than as its original cause. This distinction is illustrated with apparently anomalous examples of emotions such as embarrassment and guilt.

Brian Parkinson


Computer Model of Emotional Agents

This article presents a computer model of intelligent agents inhabiting a virtual world. In the software model we developed, the agent parameters are stored in relational tables. Agents based on this architecture can be visualized graphically.

The agent’s architecture includes the following components: needs, emotions, actions, self-knowledge, knowledge of places and events, rules, meta-rules and characteristics. Every component can influence every other component.

The agent’s characteristics are energy, work, adaptation, and inertia. The agent performs work to gather features and to evaluate and use them; its potential energy is therefore equal to the work spent collecting features from the world. Inertia is the inflexibility of the intelligent agent towards alteration of its state.

A behavior rule is a statement with one or more antecedents and a conclusion. When the agent acquires new information, it is able to change its behavior rules in accordance with its own “principles”.

A meta-rule is conceived as an abstract “principle” or “consciousness” of the agent. It applies to concrete behavior rules in situations that require a behavior choice or reconsideration. Meta-rules are fewer in number than behavior rules and require more time and knowledge. The rules and meta-rules for the agent’s behavior are dynamic content of relational tables, and the program interprets them as data; thus, any modification is only a change of table content.

The explorer always knows the agent’s state and the reasons for this state. Every place, action, state, event or rule is a function of its features; its value is the sum of the feature values divided by their count. Every feature or rule antecedent is a function of its emotions, the emotion values, the values and weights of the needs, and inertia. Every feature interacts with all other features.
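The valuation scheme above (a component's value is the sum of its feature values divided by their count) can be sketched as follows; the feature names and numbers are illustrative assumptions, not taken from the paper.

```python
def component_value(feature_values):
    """Value of a place, action, state, event or rule:
    the sum of its feature values divided by their count."""
    if not feature_values:
        return 0.0
    return sum(feature_values) / len(feature_values)

# Illustrative example: a "place" with three hypothetical features.
place_features = {"safety": 0.8, "food": 0.4, "novelty": 0.6}
value = component_value(list(place_features.values()))  # mean of the three
```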

Relations between the components are shown, and possibilities for their quantitative representation are suggested. Expressions for calculating various parameters of the model, including the basic agent need weights and the features of places, actions, states and generalized agent states, are presented and summarized. A coefficient used for reordering the basic needs is introduced.

For the purposes of the experiment, a simple scenario is suggested.

Dilyana Budakova, Lyudmil Dakovski

Affective Robots as Mediators in Smart Environments

Following the Ambient Intelligence vision, a Smart Environment (SE) has the main aim of facilitating the user’s interaction with its services by making their use easy, natural and adapted to the user’s needs. In this abstract we propose, as the counterpart of the interaction, an “affective robot” that acts as a mediator between the user and the environment. On one hand, the robot can be thought of as a mobile and intelligent interface to the environment system. On the other hand, users establish an affective and familiar relation with it. Since this kind of interaction involves socio-emotional content, it is important to take social-affective factors into account. In this case robots lose their connotation of technological tools and take on that of a friendly companion [3]. According to this metaphor, the robot has a role, a personality and coherent behaviours that allow it to follow social dynamics, to create relationships with humans and to invoke social responses. As a consequence, user modelling becomes a key issue for developing such a successful proactive robot: it has to be able to consider and combine rational factors (i.e. interests, beliefs, abilities, preferences) and extra-rational ones (i.e. attitude, affective and social factors), taking into account the dependencies and influences between them.

Gianni Cozzolongo, Berardina De Carolis

Expression of Emotion in Body and Face

Intelligent interaction with an environment, other IVAs, and human users requires a system that identifies subtle expressive cues and behaves naturally using modalities such as body, face, and voice to communicate. Although research on individual affective channels has increased, little is known about expressive qualities in whole body movement. This study has three goals: (1) to determine rates of observer recognition of emotion in walking, (2) to use kinematic analysis to quantify how emotions change gait patterns in characteristic ways, and (3) to describe the concurrence of facial and bodily expression of emotion.

Twenty-six undergraduate students recalled an experience from their own lives in which they felt anger, sadness, contentment, joy, or no emotion at all (neutral). After recalling a target emotion, participants walked across the lab. Whole body motion capture data were acquired using a video-based, 6-camera system. Side view video was also recorded. Ten participants wore a special head-mounted camera designed to record video of facial expression. After each trial, participants rated the intensity of eight emotions (4 target and 4 non-target). After blurring the faces in the side view video so that facial expressions were not observable, randomized composite videos were shown to untrained observers. After viewing each video clip, observers selected one of ten responses corresponding to the emotion that they thought the walker felt during the trial. FACS coding was used to evaluate the face video for evidence of emotion and timing of facial expressions with respect to the gait cycle.

Self-report data indicated that the walkers felt the target emotions at levels corresponding to “moderately” or above in all trials. Validation data were collected from five observers on gait trials from a subset of subjects (n=16). Recognition rates for sad, anger, neutral and content were 45%, 25%, 20% and 16%, respectively. Joy was recognized at chance levels (10%). Normalized velocity, normalized stride length, cycle duration and velocity were significantly affected by emotion.

This study is unique in describing the effects of specific emotions on gait. The preliminary results indicate that gait kinematics change with emotion. Although temporal-spatial kinematics were related to arousal levels, angular kinematics are needed to distinguish emotions with similar levels of arousal.

Elizabeth A. Crane, M. Melissa Gross, Barbara L. Fredrickson

Towards Primate-Like Synthetic Sociability

This research addresses synthetic agents as autonomous software entities, capable of managing social relationships in small-scale societies. An individual architecture is structurally designed to enable primate-like social organization, which is in turn individually modulated by an affective action-selection mechanism. The aim is to improve agents’ social reactive and social cognitive capabilities by implementing plain communication that conveys behavioral rewards or sanctions. This artificial society simulation is being developed as an experimental model aimed at exploring the nature of (1) the adaptation of inter-agent social norms, (2) individual behavioral arbitration, and (3) the interplay of reaction and deliberation. This computational outlook on social cognition offers a contrast with traditional socio-unaware action-selection systems, frequently based on function optimization of decision-making processes [1]. To anthropomorphize the model, social networks are analyzed in terms of situated agents and their internal states. Individuals are able to recognize current counterparts and have their community size dependent on accumulated experiences [2]; thus food and relationship management become crucial individual tasks. However, this work does not seek an ethologically realistic approach like [3] – nor does it aim at a complete account of animal or human language interaction. It rather argues for a simpler alternative to represent synthetic social intelligence. By interleaving processes of reaction and planning, agents are expected to act following their individual modulation of pre-configured abilities – dealing both with passive objects (resources) and other active characters (agents). Finally, to interpret interactions and the operation of affective feedback, essential observations and analysis are required on (1) the administration of basic social constraints and (2) the processes producing social change, relating individual behavior choices to group dynamics [4].

Pablo Lucas dos Anjos, Ruth Aylett

Here Be Dragons: Integrating Agent Behaviors with Procedural Emergent Landscapes and Structures

Here Be Dragons is a virtual environment containing creatures and cities that have no direct human designer. Digital genes, defining a structure similar to a Lindenmayer system, form both the agents and structures within the space. Each component of these genes contains behavioral as well as structural content, and their format allows alterations like genetic crossbreeding and mutation. A protocol exists for communication between the cities and their inhabitants, allowing more intricate interactions.

The name “Here Be Dragons” refers to the unexplored regions found past the borders of old maps. The virtual space invites exploration, and to that end creature and city structures form novel shapes and behaviors as the user navigates the terrain. Explicitly authored spaces can only be as large as someone makes them, while traditional procedural spaces either become too predictable or too chaotic. Using algorithms traditionally reserved for virtual agents, as well as a mixture of techniques found in Artificial Life and other emergent schools of thought, the virtual space attempts to balance coherency with novelty. This approach has turned the typical production pipeline on its head. Traditional asset creation has been replaced with designing ways to generate and interpret pliable data. Upon completion of this design phase, a world can be generated in seconds.

The creatures have an enforced symmetry, and are rendered in silhouette, inviting analogies to birds, bats, viruses, and dragons. Each segment within a creature acts as its own state machine, with its actions rippling through the entire body, generating the illusion of animism. The cities take information similar to that of the creatures but instead use it to build walls, spires, and buttresses. While the underlying code can inform very concrete visuals, Here Be Dragons intentionally abstracts the forms, invoking an effect similar to a Rorschach inkblot.

As the line between agents and environment blurs, the potential for coherent interaction and visualization increases. A universal gene could potentially define 3D models, music, decision trees, and story flow within the same virtual space.

Todd Furmanski

Virtual Pedagogical Agents: Naturalism vs. Stylization

In discussions on naturalism vs. stylization in the design of virtual pedagogical agents (VPAs), the smooth communication argument is one of the most central in favour of visual naturalism. Yet this argument dissolves if we separate design into the levels of: (1) linguistic performance, where the support for naturalism is strong; (2) dynamic visual appearance, in the sense of bodily behaviour, where increasing design freedom still matches positive user responses; and (3) static visual appearance, the underlying, inanimate visual model, where naturalism is a well-defined state but stylization spans a complex, multidimensional design space. A comparison with cartoons and animated movies suggests a potential for stylized designs.

Agneta Gulz, Magnus Haake

The Role of Social Norm in User-Engagement and Appreciation of the Web Interface Agent Bonzi Buddy

Whether an agent application is accepted or rejected is not only the effect of its empathic qualities but also of the user’s assessment of empathy as influenced by peer-group norms. If a Computer Science student visits, for example, a Web site that advertises the Bonzi Buddy Web agent, he or she will be prejudiced against the agent before even interacting with it, despite the empathic qualities it advertises. One of the factors causing these effects is that Computer Science students reckon with the group norms of their peers, which forbid appreciating Microsoft applications in the first place, particularly user applications that may be judged childish. This study focused on the effects of peer-group norms on individual judgments of Bonzi Buddy. Hypotheses and other details on the experiment can be found in Hoorn and Van Vugt (2004).

Johan F. Hoorn, Henriette C. van Vugt

Countering Adversarial Strategies in Multi-agent Virtual Scenarios

Mutual modeling and plotting between virtual characters is an important issue in interactive multi-agent dramas and games. This paper provides a framework for agents to model the intentions of others and to select the optimal strategy by simulating multi-level mutual modeling. The resulting plan considers multiple rounds of counter-strategy and possible adversarial reactions, creating deeper strategic interaction.

Yu-Cheng Hsu, Paul Hsueh-Min Chang, Von-Wun Soo

Avatar’s Gaze Control to Facilitate Conversational Turn-Taking in Virtual-Space Multi-user Voice Chat System

Aiming at facilitating multi-party conversations in a shared-virtual-space voice chat environment, we propose an avatar gaze behavior model for turn-taking in multi-party conversations, and a shared-virtual-space voice chat system with an automatic avatar gaze direction control function that uses user utterance information. The use of utterance information enables easy-to-use automatic gaze control without an eye-tracking camera or manual operation. In our gaze behavior model, a conversation is divided into three states: during-utterance, right-after-utterance, and silence. For each state, the avatar’s gaze behaviors are controlled based on a probabilistic state transition model.

Previous studies revealed that gaze has the power to select the next speaker and urge him/her to speak, and that continuous gaze risks giving an intimidating impression to the listener. Although an explicit look-away from the conversational partner generally signals interest in something else, such gaze behaviors seem to help the speaker avoid threatening the listener’s face. In order to express a less face-threatening eye-gaze in virtual-space avatars, our model introduces the vague-gaze: the avatar looks five degrees lower than the user’s eye position. Thus, in the during-utterance state, the avatars are controlled using a probabilistic state transition model that transitions among three states: eye contact, vague-gaze and look-away. The vague-gaze is expected to reduce the intimidating impression as well as facilitate conversational turn-taking. In the right-after-utterance state, the speaker avatar keeps eye contact for a few seconds to urge the next speaker to start a new turn; this is based on an observation of real face-to-face conversation. Finally, in the silence state, the avatar’s gaze direction is changed randomly to avoid giving an intimidating impression.
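The during-utterance model described above can be sketched as a small probabilistic state machine; since the abstract does not publish its transition probabilities, the values below are illustrative assumptions.

```python
import random

# Gaze states in the during-utterance model.
STATES = ("eye_contact", "vague_gaze", "look_away")

# Hypothetical transition probabilities (each row sums to 1); the
# paper's actual values are not given, so these are placeholders.
TRANSITIONS = {
    "eye_contact": {"eye_contact": 0.5, "vague_gaze": 0.4, "look_away": 0.1},
    "vague_gaze":  {"eye_contact": 0.3, "vague_gaze": 0.5, "look_away": 0.2},
    "look_away":   {"eye_contact": 0.4, "vague_gaze": 0.4, "look_away": 0.2},
}

def next_gaze_state(current, rng=random):
    """Sample the avatar's next gaze state from the transition model."""
    r = rng.random()
    cumulative = 0.0
    for state, p in TRANSITIONS[current].items():
        cumulative += p
        if r < cumulative:
            return state
    return current  # numerical safety net

def vague_gaze_pitch(user_eye_pitch_deg):
    """Vague-gaze: look five degrees below the user's eye position."""
    return user_eye_pitch_deg - 5.0
```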

In our evaluation experiment, twelve subjects were divided into four groups and asked to chat with the avatars and rate their impressions of them on a Likert scale. For the during-utterance state, in terms of naturalness, reduction of intimidating impression and turn-taking facilitation, a transition model consisting of vague-gaze and look-away was significantly more effective than the vague-gaze-alone, look-away-alone and fixed-gaze-alone models. In the right-after-utterance state, all of the gaze control methods were significantly more effective in facilitating turn-taking than the fixed-gaze method. The evaluation experiment demonstrated the effectiveness of our avatar gaze control mechanism, and suggested that gaze control based on user utterances facilitates multi-party conversations in a virtual-space voice chat system.

Ryo Ishii, Toshimitsu Miyajima, Kinya Fujita, Yukiko Nakano

The Role of Discourse Structure and Response Time in Multimodal Communication

In an ongoing project on multimodal communication in humans and agents [1], we investigate the interaction between two linguistic modalities (prosody and dialog structure) and two non-linguistic modalities (eye gaze and facial expressions). The goal is to gain a better understanding of the use of communicative channels in discourse, which can subsequently aid the development of more effective animated conversational agents. We studied conversations between humans in the Map Task scenario, whereby an Instruction Giver (IG) navigates an Instruction Follower (IF) from a starting point to an end point on a map [2].

Patrick Jeuniaux, Max M. Louwerse, Xiangen Hu

The PAC Cognitive Architecture



The PAC (Personality, Affect, and Cognition) Architecture is a new modeling architecture designed to create Intelligent Virtual Agents with specific personality traits, emotions, and cultural characteristics. PAC integrates theory and data from personality and social psychology, cognitive science, and neuroscience to build a model of personality, emotion, and culture based on fundamental underlying human motivational systems (e.g., dominance, coalition formation, affectional relationships, self-protection, mate choice, parenting, and attachment). These motives are activated by situational cues, but individual agents can have differing baseline activations for different motives. Motives are controlled through a hierarchy of control processes (e.g., Approach and Avoidance systems, Disinhibition/Constraint system) that can be differentially set to capture individual differences. In PAC, the activation dynamics of the underlying motives influence both the way in which another agent’s behavior is interpreted and the target agent’s choice of actions. Thus, the motive dynamics give rise to persistent individual behavioral tendencies, that is, differences in personality.
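The motive dynamics described above (an agent-specific baseline activation plus situational cue activation, modulated by control processes) might look roughly like this; the additive combination rule and all numbers are illustrative assumptions, not the published PAC equations.

```python
from dataclasses import dataclass

@dataclass
class Motive:
    """One motivational system in the spirit of PAC: a baseline
    activation (an individual difference) plus situational activation.
    Names and the additive rule are illustrative, not PAC's own."""
    name: str
    baseline: float  # agent-specific resting activation

    def activation(self, cue_strength, approach_gain=1.0):
        # Situational cues raise activation; the gain parameter stands
        # in for PAC's hierarchy of control processes (approach/avoid).
        return self.baseline + approach_gain * cue_strength

# Two agents differing only in baseline dominance show different
# action tendencies when faced with the same situational cue.
dominant = Motive("dominance", baseline=0.7)
submissive = Motive("dominance", baseline=0.2)
cue = 0.3  # e.g., a challenge from another agent
```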

Lynn C. Miller, Stephen J. Read, Wayne Zachary, Jean-Christophe LeMentec, Vassil Iordanov, Andrew Rosoff, James Eilbert

Control of Avatar’s Facial Expression Using Fundamental Frequency in Multi-user Voice Chat System

An automatic facial expression control algorithm for a CG avatar, based on the fundamental frequency of the user’s utterance, is proposed in order to facilitate multi-party casual chat in a multi-user virtual-space voice chat system. The proposed method utilizes the common tendency of the voice fundamental frequency to reflect emotional activity, especially the strength of delight. This study simplified the facial expression control problem by limiting the expression to the strength of delight, because the expression of delight appears to be the most important for facilitating casual chat. The problem with using the fundamental frequency is that it varies with intonation as well as emotion; hence the raw fundamental frequency changes the avatar’s expression too rapidly. Therefore, the Emotional Point by emotional Activity (EPa) was defined as the moving average of the normalized fundamental frequency, to suppress the influence of intonation. The strength of delight in the avatar’s facial expression was controlled linearly using EPa, based on the Facial Action Coding System (FACS). The duration of the moving average was chosen experimentally as five seconds. However, the moving average delays the avatar’s behavior, and the delay is especially serious in response utterances. Therefore, to compensate for the delay of the response, the Emotional Point by Response (EPr) was defined using the initial voice volume of the response utterance. EPr was calculated only for response utterances, i.e., utterances immediately following another user’s utterance. The ratio of EPr to EPa was decided experimentally as one to one. The proposed automatic avatar facial expression control algorithm was implemented on the previously developed virtual-space multi-user voice chat system, and a subjective evaluation was performed with ten subjects.
Each subject, in a separate room, was asked to chat with an experimental partner using the system for four minutes and to answer four questions on a Likert scale. Throughout the experiments, the subjects reported a better impression of the automatic control of facial expression according to the utterances. Facial control using both EPa and EPr demonstrated better performance in terms of naturalness, favorability, familiarity and interactivity, compared to the fixed facial expression, the EPa-alone and the EPr-alone conditions.
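The EPa/EPr scheme described above can be sketched as follows; the five-second window and the 1:1 mixing ratio come from the abstract, while the frame count and F0 normalization are left as assumptions.

```python
from collections import deque

class EmotionEstimator:
    """Sketch of the EPa/EPr scheme: EPa is a moving average of the
    normalized fundamental frequency (F0), which suppresses intonation;
    EPr (from the initial voice volume of a response utterance)
    compensates for the moving average's delay. The normalization and
    frame rate are assumptions, not from the paper."""

    def __init__(self, window_frames):
        # window_frames should correspond to five seconds of F0 frames.
        self.f0_window = deque(maxlen=window_frames)

    def update_f0(self, normalized_f0):
        self.f0_window.append(normalized_f0)

    def epa(self):
        """Emotional Point by Activity: moving-average normalized F0."""
        if not self.f0_window:
            return 0.0
        return sum(self.f0_window) / len(self.f0_window)

    def emotional_point(self, epr=None):
        """Combine EPa and EPr at the paper's 1:1 ratio; EPr exists
        only for response utterances, otherwise EPa alone is used."""
        if epr is None:
            return self.epa()
        return 0.5 * self.epa() + 0.5 * epr
```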

Toshimitsu Miyajima, Kinya Fujita

Modeling Cognition with a Human Memory Inspired Advanced Neural Controller

This paper studies how imitating the structures and processes of memory can make cognition arise in a computational model. More precisely, the combination of a perceptron and an associative memory leads to a scalable behavioral controller expected to exhibit adaptive behaviors. This approach differs from traditional behavioral animation hybrid architectures [1], in which the agent’s knowledge is a collection of modeller-defined symbolic objects or frames [2] and its behavior a set of scripts or automata [3]. In our view, this prevents the agent from being adaptive in dynamic environments.

D. Panzoli, H. Luga, Y. Duthen

Storytelling – The Difference Between Fantasy and Reality

Can we create virtual storytellers with enough expressive power to convey a story? This paper presents a study comparing the storytelling ability of a virtual and a human storyteller. Three means of communication were taken into account in the evaluation: voice, facial expression and gestures. One hundred and eight computer engineering students watched a video in which a storyteller narrated the traditional Portuguese story “O Coelhinho Branco” (The Little White Rabbit). The students were divided into four groups, each of which saw one video in which the storyteller was portrayed either by a synthetic character or by a human. The storyteller’s voice, regardless of the nature of the character, could also be real or synthetic. After watching the video, the participants filled in a questionnaire rating the storyteller’s performance.

For all dependent variables concerning facial expression, the synthesized version received significantly lower ratings than the real one. Of particular interest is that the rating of this communication channel is strongly affected not only by the visual expression but also by the voice: the use of a synthesized voice has a significant negative effect on the rating of the facial expression.

Regarding gestures, only one significant difference was found, in the rating of believability, where the synthetic storyteller performed worse than the human actor. For the remaining ratings, the data suggest that the synthetic gestures perform close to the real ones. It is also worth noting that gesture ratings always have a majority of positive responses. As with facial expression, gestures also seem to be affected by the nature of the voice used, but this time in an inverse manner: the percentage of positive gesture ratings increases or stays the same when the synthesized voice is used.

The voice was the medium with the clearest significant difference between the real and synthetic versions, with the real voice receiving higher ratings than its counterpart. Nevertheless, only satisfaction with the synthetic voice obtained a majority of negative ratings; both the emotion and believability aspects of the synthetic voice gathered a majority of positive ratings.

Guilherme Raimundo, João Cabral, Celso Melo, Luís C. Oliveira, Ana Paiva

A Plug-and-Play Framework for Theories of Social Group Dynamics

We present an extensible framework for the behavior control of social agents in a multi-agent system with the following features. It implements a basic repertoire of socio-psychological models of behavior and interpersonal interactions that can be plugged and unplugged at will, depending on the specific context of the application. This enables us to test several theories in isolation or in combination, to increase the transparency of the system, and to investigate how the inclusion of a certain theory influences the behavior of the agents. Unlike earlier approaches, our approach is not bound to a specific theory. Thus, it becomes possible to run a simulation with the same set of agents using different theories to compare their effect.
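A plug-and-play arrangement of the kind described could be sketched as follows; the theory names and the way their influences are combined are illustrative assumptions, not the framework's actual API.

```python
# Minimal sketch of pluggable social theories: each "theory" maps an
# interaction context to a behavior adjustment, and theories can be
# plugged or unplugged between simulation runs to compare their effect.

class TheoryRegistry:
    def __init__(self):
        self.theories = {}

    def plug(self, name, influence_fn):
        self.theories[name] = influence_fn

    def unplug(self, name):
        self.theories.pop(name, None)

    def evaluate(self, context):
        """Collect the influence of every plugged theory on a context."""
        return {name: fn(context) for name, fn in self.theories.items()}

registry = TheoryRegistry()
# Hypothetical theories with made-up influence rules.
registry.plug("politeness", lambda ctx: -0.5 if ctx["status_gap"] > 0 else 0.0)
registry.plug("reciprocity", lambda ctx: 0.3 if ctx["was_helped"] else 0.0)

# Same agents, different theory sets: rerun after unplugging one theory.
with_both = registry.evaluate({"status_gap": 1, "was_helped": True})
registry.unplug("politeness")
without_politeness = registry.evaluate({"status_gap": 1, "was_helped": True})
```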

Matthias Rehm, Birgit Endraß, Elisabeth André

Learning Classifier Systems and Behavioural Animation of Virtual Characters

Producing intuitive systems for directing virtual actors is one of the major objectives of research in virtual animation. It is therefore often interesting to conceive systems that enable the behavioral animation of autonomous characters able to correctly fulfill directives from a human user, considering their goal and their perception of the virtual environment. Common ways to generate the behaviors of such virtual characters usually use deterministic algorithms (scripts or automata [1]); the autonomy of the characters is thus a fixed routine that cannot adapt to novelty or to any situation not previously considered. To make these virtual actors capable of adaptation, we propose to combine a behavioral framework (ViBes [2]) with an evolutionist learning system, the Learning Classifier System [3]. Using classifier systems we managed to make a virtual human learn to select and cook a food item in order to eat. The association of the ViBes framework and two trained classifier systems produced the following real-time animation (fig. [1]) in a dynamic virtual environment.

S. Sanchez, H. Luga, Y. Duthen

Using Intelligent Agents to Facilitate Game Based Cultural Familiarization Training

CHI Systems, under contract to the U.S. Army Research Institute, developed an immersive training system called Virtual Environment Composable Training for Operational Readiness Training Delivery (VECTOR-TD), which provides scenario-based virtual environments for cultural familiarization. VECTOR-TD was designed to provide a new technology for game-based cultural familiarization training through the application of scenario-based training. It integrates the Lithtech Jupiter game engine for the virtual environment and CHI Systems’ iGEN® cognitive agent development toolkit to implement intelligent game characters and integrated performance monitoring. In order for VECTOR-TD to have true long-term value to the Army, it was determined that it would be necessary to develop content management tools allowing Army training personnel to edit and author VECTOR-TD scenarios. To address this, the VECTOR-Scenario Editor (VECTOR-SE) was developed, based on VECTOR-TD’s XML-based scripting language, to externalize the representation of the training scenario, the behavior of non-player characters (NPCs) within the scenario, and the behavior of the instructor NPC responsible for performance evaluation and After Action Review (AAR). The XML-based language provides direct control of the behavior, dialog, emotional state, and predispositions of NPCs within the scenario. VECTOR-SE was successfully developed to provide content development and management tools that allow non-programmers to author scenarios and manipulate training parameters. It is based on an instructional design process model and involves the specification of learning objectives, the creation of scenario segments (i.e., vignettes), the authoring of character/trainee dialog, the designation of dialog branching, the placement of characters at game locations, and the designation of when training assistance/remediation will be delivered. The significance of VECTOR-SE is twofold.
First, it drastically reduces the time and skill required to develop VECTOR scenarios. Within the original VECTOR project, a 15-minute cultural training scenario was developed which required 2 months of programming and cognitive modeling effort. The same scenario was later implemented in VECTOR-SE in under 3 days. Additional scenarios with comparable complexity have also been implemented using VECTOR-SE and similar significant reductions in development time have been observed. Second, VECTOR-SE makes scenario development or modification accessible to a wider audience of professionals. VECTOR-TD and SE are currently being evaluated at the U.S. Military Academy at West Point.

Thomas Santarelli, Charles Barba, Floyd A. Glenn, Daphne Bogert

Mind the Body

Filling the Gap Between Minds and Bodies in Synthetic Characters

Interactive virtual environments (IVEs) are inhabited by synthetic characters that guide and engage children in a wide variety of activities, like playing games or learning new things. To build those environments, we need believable autonomous synthetic characters that are able to think and act in very dynamic environments. These characters often have able minds that are limited by the actions their bodies can perform. On the one hand, we have minds capable of creating interesting non-linear behaviour; on the other hand, we have bodies that are limited by the number of animations they can perform. This usually leads to a large planning effort to anticipate possible situations and define which animations are necessary. When we aim at non-linear narrative and non-deterministic plots, there is an obvious gap between what minds can think and what bodies can do. We propose smart bodies as a way to fill this gap between minds and bodies. A smart body extends the notion of a standard body: it is enriched with semantic information and can do things on its own. The mind still decides what the character should do, but the body chooses how it is done. Smart bodies, like standard bodies, have a model and a collection of animations provided by a graphics engine, but they also have access to knowledge about other elements in the world, such as locations, interaction information and particular attributes. At this point, the notions of interaction spot and action trigger come into play. Interaction spots are specific positions around smart bodies or items where other smart bodies can perform particular interactions. Action triggers define automatic reactions which smart bodies fire when certain actions or interactions occur. We use both constructs to create abstract references for physical elements, to act as resource and pre-condition mechanisms, and to simulate physics using rule-based reactions.
Smart bodies use all this information to create high-level actions which are used by the minds. Thus, minds operate at a higher level and do not have to deal with low-level body geometry or physics. Smart bodies were used in FearNot!, an anti-bullying application in which children experience virtual stories generated in real time, witnessing (from a third-person perspective) a series of bullying situations towards a character. Clearly, in such an emergent narrative scenario, minds need to work at a higher level of abstraction without worrying about bodies and how a particular action is carried out at a low level; smart bodies provide this abstraction layer. We performed a small study to validate our work in FearNot!, with positive results. We believe there may be other applications where smart bodies have much to offer, particularly those using unscripted and non-linear narrative approaches.
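The interaction-spot and action-trigger constructs could be sketched roughly as follows; all class and field names are hypothetical illustrations, not the FearNot! implementation.

```python
from dataclasses import dataclass, field

@dataclass
class InteractionSpot:
    """A position around a smart body or item where another smart
    body can perform a particular interaction."""
    name: str
    position: tuple
    interaction: str

@dataclass
class SmartBody:
    """Sketch of a smart body: semantic knowledge lets the body
    resolve HOW an action is performed, while the mind only decides
    WHAT to do."""
    name: str
    spots: list = field(default_factory=list)
    triggers: dict = field(default_factory=dict)  # action -> reaction
    log: list = field(default_factory=list)

    def perform(self, action, target):
        # The mind issues a high-level action; the body finds a spot.
        spot = next((s for s in target.spots if s.interaction == action), None)
        if spot is None:
            return False
        self.log.append(f"{self.name} moves to {spot.name} and does '{action}'")
        # Action triggers: automatic rule-based reaction on the target.
        reaction = target.triggers.get(action)
        if reaction:
            target.log.append(reaction)
        return True

chair = SmartBody("chair",
                  spots=[InteractionSpot("seat", (1, 0), "sit")],
                  triggers={"sit": "chair creaks"})
john = SmartBody("john")
ok = john.perform("sit", chair)
```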

Marco Vala, João Dias, Ana Paiva

CAB: A Tool for Interoperation Among Cognitive Architectures

The rapid development of technology for modeling and simulating human behavior and cognition as Intelligent Virtual Agents (IVAs) has resulted in broad incompatibilities among underlying architectures and specific models. At the same time, the growing interest in practical applications of IVAs in defense/aerospace, healthcare, and training systems brings demands for easier and cheaper IVA development and increased re-usability of agent features and components. The Cognitive Architecture Bridge (CAB) is being developed as a new ‘middleware’ approach to providing multiple levels of interoperability and composability between and among IVA models and model components. CAB is designed to allow capabilities from different IVA models and modeling architectures to be integrated into a single ‘virtual’ IVA model, through four general classes of mechanisms and a common run-time infrastructure.

Jean-Christophe LeMentec, Wayne Zachary
