2011 | Book

Affective Computing and Intelligent Interaction

4th International Conference, ACII 2011, Memphis, TN, USA, October 9–12, 2011, Proceedings, Part I

Editors: Sidney D’Mello, Arthur Graesser, Björn Schuller, Jean-Claude Martin

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science

About this book

The two-volume set LNCS 6974 and LNCS 6975 constitutes the refereed proceedings of the Fourth International Conference on Affective Computing and Intelligent Interaction, ACII 2011, held in Memphis, TN, USA, in October 2011. The 135 papers in this two-volume set, presented together with 3 invited talks, were carefully reviewed and selected from 196 submissions. The papers are organized in topical sections on recognition and synthesis of human affect, affect-sensitive applications, methodological issues in affective computing, affective and social robotics, affective and behavioral interfaces, relevant insights from psychology, affective databases, and evaluation and annotation tools.

Frontmatter

Invited Talks

To Our Emotions, with Love: How Affective Should Affective Computing Be?

“Affective computing” has become the rallying call for a heterogeneous group of researchers that, among other goals, tries to improve the interaction of humans and machines via the development of affective and multimodal intelligent systems. This development appears logical based on the popular notion that emotions play an important part in social interactions. In fact, research shows that humans who lack the ability to express or perceive/interpret emotions seem to have difficulties navigating social conventions and experience a negative impact on their relationships.

However, while emotions are certainly somewhat important, the desire to implement affect in machines might also be influenced by romantic notions, echoed in the plight of iconic science fiction characters, such as Star Trek’s Data, who struggles to achieve humanity via emotions. The emphasis on emotions in the psychological research on nonverbal communication is in part due to theoretical discussions in that community. However, taken out of this context, there is a risk of overestimating the importance of discrete emotional expressions. Conversely, behaviors relevant to successful interaction will be underestimated in psychology because they might be culturally variable and perhaps even idiosyncratic for smaller groups or individuals. In other words, what might be noise for the emotion theorist could be the data to create believable conversational agents.

I will discuss how much emotion might or might not be needed when trying to build emotional or emotion-savvy systems, depending on the type of application that is desired, based on a multi-level approach. At one level of analysis, a clear distinction of encoding and decoding processes is required to know what (real) people actually show in certain situations, or what people might in fact perceive. It is not obvious how much information is actually “read” from faces, as opposed to “read” into faces. In other words, context plays a large role for the interpretation of nonverbal behavior. Some of this context is verbal, but some is situational.

At a different level of analysis, interactive characteristics in conversation need to be considered. This refers to issues such as responsiveness, synchrony, or imitation that are often neglected in affective computing applications and in basic psychological research. For example, an artificial system that will only react to observed patterns of verbal/nonverbal behavior might be too slow and create strange delayed effects, as opposed to systems that seem to react, but that, in fact, anticipate the interactant’s reactions. It is these areas where much interesting work is, should be, and will be happening in the next few years.

Arvid Kappas

Affect, Learning, and Delight

Because of the growing recognition of the role that affect plays in learning, affective computing has become the subject of increasing attention in research on interactive learning environments. The intelligent tutoring systems community has begun actively exploring computational models of affect, and game-based learning environments present a significant opportunity for investigating student affect in interactive learning. One family of game-based learning environments, narrative-centered learning environments, offer a particularly compelling laboratory for investigating student affect. In narrative-centered environments, learning activities play out in dynamically generated interactive narratives and training scenarios. These afford significant opportunities for investigating computational models of student emotion. In this talk, we explore the role that affective computing can play in next-generation interactive learning environments, with a particular focus on affect recognition, affect understanding, and affect synthesis in game-based learning.

James C. Lester

Measuring Affect in the Wild

Our teams at MIT and at Affectiva have invented mobile sensors and software that can help sense autonomic stress and activity levels comfortably while you are on the go, e.g. the Affectiva Q Sensor for capturing sympathetic nervous system activation, or without distracting you while you are online, e.g. webcam-based software capturing heart rate variability and facial expressions. We are also developing new technologies that capture and respond to negative and positive thoughts, combining artificial intelligence and crowdsourced online human computation to provide just-in-time emotional support through a mobile phone with texting. Our technologies are all opt-in, and are currently being used robustly for “outside the lab, mobile” studies where core emotional processes are involved in autism, PTSD, sleep disorders, eating disorders, substance abuse, epilepsy, stressful workplaces and learning environments, online customer experiences, and more. The new technologies enable collecting orders of magnitude more data than previous lab-based studies, containing many fascinating variations of “what people really do” especially when making expressions such as smiles. This talk will highlight some of the most interesting findings from recent work together with stories of personal adventures in emotion measurement out in the wild.

Rosalind W. Picard

Oral Presentations

Affective Modeling from Multichannel Physiology: Analysis of Day Differences

Physiological signals are widely considered to contain affective information. Consequently, pattern recognition techniques such as classification are commonly used to detect affective states from physiological data. Previous studies have achieved some success in detecting affect from physiological measures, especially in controlled environments where emotions are experimentally induced. One challenge that arises is that physiological measures are expected to exhibit considerable day-to-day variation due to a number of extraneous factors such as environmental changes and sensor placements. These variations make it difficult to classify affective states from future physiological data effectively, which is a common requirement in real-world applications. The present study provides a quantitative analysis of day variations of physiological signals from different subjects. We propose a classifier ensemble approach using a Winnow algorithm to address the problem of day variation in physiological signals. Our results show that the Winnow ensemble approach outperformed a static classification approach for detecting affective states from physiological signals that exhibited day variations.

Omar Alzoubi, Md. Sazzad Hussain, Sidney D’Mello, Rafael A. Calvo
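The abstract above does not say how the Winnow ensemble is constructed, so the following is only a minimal sketch of a Winnow-style weighted vote. It assumes one binary base classifier per training day, whose votes are combined with weights that are promoted or demoted multiplicatively after a mistake. The names `ensemble_predict` and `winnow_update`, the threshold, and the factor `alpha` are illustrative assumptions, not the authors' implementation.

```python
def ensemble_predict(weights, votes, threshold):
    """Return 1 if the total weight of classifiers voting 1 reaches the threshold."""
    score = sum(w for w, v in zip(weights, votes) if v == 1)
    return 1 if score >= threshold else 0


def winnow_update(weights, votes, label, threshold, alpha=2.0):
    """Winnow-style multiplicative update; weights change only after a mistake."""
    prediction = ensemble_predict(weights, votes, threshold)
    if prediction == label:
        return list(weights)
    if label == 1:
        # Missed a positive: promote the classifiers that voted 1.
        return [w * alpha if v == 1 else w for w, v in zip(weights, votes)]
    # False alarm: demote the classifiers that voted 1.
    return [w / alpha if v == 1 else w for w, v in zip(weights, votes)]


# Hypothetical example: three base classifiers (assumed: one per training day).
weights = [1.0, 1.0, 1.0]
threshold = len(weights) / 2
votes = [1, 0, 0]  # only the first classifier predicts the positive class
weights = winnow_update(weights, votes, label=1, threshold=threshold)
```

Under these assumptions, a classifier that keeps agreeing with the true label on new days gains influence, while classifiers trained on unrepresentative days are gradually discounted.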

The Dynamics between Student Affect and Behavior Occurring Outside of Educational Software

We present an analysis of the affect that precedes, follows, and co-occurs with students’ choices to go off-task or engage in on-task conversation within two versions of a virtual laboratory for chemistry. This analysis is conducted using field observation data collected within undergraduate classes using the virtual laboratory software as part of their regular chemistry classes. We find that off-task behavior co-occurs with boredom, but appears to relieve boredom, leading to significantly lower probability of later boredom. We also find that on-task conversation leads to greater future probability of engaged concentration. These results help to clarify the role that behavior outside of educational software plays in students’ affect during use of that software.

Ryan S. J. d. Baker, Gregory R. Moore, Angela Z. Wagner, Jessica Kalka, Aatish Salvi, Michael Karabinos, Colin A. Ashe, David Yaron

ikannotate – A Tool for Labelling, Transcription, and Annotation of Emotionally Coloured Speech

In speech recognition and emotion recognition from speech, high-quality transcription and annotation of the given material is important. To analyse prosodic features, linguistics provides several transcription systems. Furthermore, in emotion labelling different methods are proposed and discussed. In this paper, we introduce the tool ikannotate, which combines prosodic information with emotion labelling. It allows the generation of a transcription of material directly annotated with prosodic features. Moreover, material can be emotionally labelled according to Basic Emotions, the Geneva Emotion Wheel, and Self Assessment Manikins. Finally, we present results of two usability tests observing the ability to identify emotions in labelling and comparing the transcription tool “Folker” with our application.

Ronald Böck, Ingo Siegert, Matthias Haase, Julia Lange, Andreas Wendemuth

Being Happy, Healthy and Whole Watching Movies That Affect Our Emotions

This paper discusses the power of emotions in our health, happiness and wholeness, and the emotional impact of movies. It presents iFelt, an interactive video application to classify, access, explore and visualize movies based on their emotional properties and impact.

Teresa Chambel, Eva Oliveira, Pedro Martins

Investigating the Prosody and Voice Quality of Social Signals in Scenario Meetings

In this study we propose a methodology to investigate possible prosody and voice quality correlates of social signals, and test-run it on annotated naturalistic recordings of scenario meetings. The core method consists of computing a set of prosody and voice quality measures, followed by a Principal Components Analysis (PCA) and Support Vector Machine (SVM) classification to identify the core factors predicting the associated social signal or related annotation. We apply the methodology to controlled data and two types of annotations in the AMI meeting corpus that are relevant for social signalling: dialogue acts and speaker roles.

Marcela Charfuelan, Marc Schröder

Fast-FACS: A Computer-Assisted System to Increase Speed and Reliability of Manual FACS Coding

FACS (Facial Action Coding System) coding is the state of the art in manual measurement of facial actions. FACS coding, however, is labor intensive and difficult to standardize. A goal of automated FACS coding is to eliminate the need for manual coding and realize automatic recognition and analysis of facial actions. Success of this effort depends in part on access to reliably coded corpora; however, manual FACS coding remains expensive and slow. This paper proposes Fast-FACS, a computer-vision-aided system that improves speed and reliability of FACS coding. The system has three main novelties: (1) to the best of our knowledge, this is the first paper to predict onsets and offsets from peaks; (2) it uses Active Appearance Models for computer-assisted FACS coding; and (3) it learns an optimal metric to predict onsets and offsets from peaks. The system was tested on the RU-FACS database, which consists of natural facial behavior during a two-person interview. Fast-FACS reduced manual coding time by nearly 50% and demonstrated strong concurrent validity with manual FACS coding.

Fernando De la Torre, Tomas Simon, Zara Ambadar, Jeffrey F. Cohn

A Computer Model of the Interpersonal Effect of Emotion Displayed in a Social Dilemma

The paper presents a computational model for decision-making in a social dilemma that takes into account the other party’s emotion displays. The model is based on data collected in a series of recent studies where participants play the iterated prisoner’s dilemma with agents that, even though following the same action strategy, show different emotion displays according to how the game unfolds. We collapse data from all these studies and fit, using maximum likelihood estimation, probabilistic models that predict likelihood of cooperation in the next round given different features. Model 1 predicts based on round outcome alone. Model 2 predicts based on outcome and emotion displays. Model 3 also predicts based on outcome and emotion but additionally considers contrast effects found in the empirical studies regarding the order in which participants play cooperators and non-cooperators. To evaluate the models, we replicate the original studies but substitute the models for the humans. The results reveal that Model 3 best replicates human behavior in the original studies and Model 1 does the worst. The results, first, emphasize recent research about the importance of nonverbal cues in social dilemmas and, second, reinforce that people attend to contrast effects in their decision-making. Theoretically, the model provides further insight into how people behave in social dilemmas. Pragmatically, the model could be used to drive an agent that is engaged in a social dilemma with a human (or another agent).

Celso M. de Melo, Peter Carnevale, Dimitrios Antos, Jonathan Gratch
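As a rough illustration of fitting such a probabilistic model by maximum likelihood, the sketch below estimates the probability of cooperation in the next round for each combination of features. For a separate Bernoulli parameter per feature combination, the maximum likelihood estimate is simply the observed cooperation frequency. The feature values (`"mutual_cooperation"`, `"joy"`, and so on), the data, and the function name are hypothetical; the abstract does not specify the form of the models actually fitted.

```python
from collections import defaultdict


def fit_cooperation_model(observations):
    """Estimate P(cooperate next round | features) by maximum likelihood.

    observations: iterable of (features, cooperated) pairs, where features is
    a hashable tuple and cooperated is 0 or 1.
    """
    counts = defaultdict(lambda: [0, 0])  # features -> [cooperations, trials]
    for features, cooperated in observations:
        counts[features][0] += int(cooperated)
        counts[features][1] += 1
    # The Bernoulli MLE for each feature combination is the relative frequency.
    return {features: c / n for features, (c, n) in counts.items()}


# Hypothetical observations: (round outcome, agent's emotion display) -> cooperated?
observations = [
    (("mutual_cooperation", "joy"), 1),
    (("mutual_cooperation", "joy"), 1),
    (("mutual_cooperation", "joy"), 0),
    (("exploited", "anger"), 0),
]
model = fit_cooperation_model(observations)
```

In this framing, a model in the spirit of Model 1 would use only the round outcome as the feature tuple, while Models 2 and 3 would add further features.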

Agents with Emotional Intelligence for Storytelling

One core aspect of engaging narratives is the existence and development of social relations between the characters. However, creating agents for interactive storytelling and making a user perceive them as a close friend or a hated enemy is a hard task. This paper addresses the problem of creating autonomous agents capable of establishing social relations with others in an interactive narrative. We present an innovative approach by looking at emotional intelligence and in particular at the skills of understanding and regulating emotions in others. To that end we propose a model for an agent architecture that has an explicit model of Social Relations and a Theory of Mind about others, and is able to plan about emotions of others and perform interpersonal emotion regulation in order to dynamically create relations with others. Some sample scenarios are presented in order to illustrate the type of behaviour achieved by the model and the creation of social relations.

João Dias, Ana Paiva

“That’s Aggravating, Very Aggravating”: Is It Possible to Classify Behaviors in Couple Interactions Using Automatically Derived Lexical Features?

Psychology is often grounded in observational studies of human interaction behavior, and hence on human perception and judgment. There are many practical and theoretical challenges in observational practice. Technology holds the promise of mitigating some of these difficulties by assisting in the evaluation of higher level human behavior. In this work we attempt to address two questions: (1) Does the lexical channel contain the information necessary for such an evaluation? And, if so, (2) can such information be captured by a noisy automated transcription process? We utilize a large corpus of couple interaction data, collected in the context of a longitudinal study of couple therapy. In the original study, each spouse was manually evaluated with several session-level behavioral codes (e.g., level of acceptance toward other spouse). Our results show that both of our research questions can be answered positively and encourage future research into such assistive observational technologies.

Panayiotis G. Georgiou, Matthew P. Black, Adam C. Lammert, Brian R. Baucom, Shrikanth S. Narayanan

Predicting Facial Indicators of Confusion with Hidden Markov Models

Affect plays a vital role in learning. During tutoring, particular affective states may benefit or detract from student learning. A key cognitive-affective state is confusion, which has been positively associated with effective learning. Although identifying episodes of confusion presents significant challenges, recent investigations have identified correlations between confusion and specific facial movements. This paper builds on those findings to create a predictive model of learner confusion during task-oriented human-human tutorial dialogue. The model leverages textual dialogue, task, and facial expression history to predict upcoming confusion within a hidden Markov modeling framework. Analysis of the model structure also reveals meaningful modes of interaction within the tutoring sessions. The results demonstrate that because of its predictive power and rich qualitative representation, the model holds promise for informing the design of affective-sensitive tutoring systems.

Joseph F. Grafsgaard, Kristy Elizabeth Boyer, James C. Lester

Recording Affect in the Field: Towards Methods and Metrics for Improving Ground Truth Labels

One of the primary goals of affective computing is enabling computers to recognize human emotion. To do this we need accurately labeled affective data. This is challenging to obtain in real situations where affective events are not scripted and occur simultaneously with other activities and feelings. Affective labels also rely heavily on subject self-report, which can be problematic. This paper reports on methods for obtaining high quality emotion labels with reduced bias and variance and also shows that better training sets for machine learning algorithms can be created by combining multiple sources of evidence. During a 7-day, 13-participant field study we found that recognition accuracy for physiological activation improved from 63% to 79% with two sources of evidence, and in an additional pilot study this improved to 100% accuracy for one subject over 10 days when context evidence was also included.

Jennifer Healey

Using Individual Light Rigs to Control the Perception of a Virtual Character’s Personality

We investigate how lighting can be used to influence how the personality of virtual characters is perceived. We propose a character-centric lighting system composed of three dynamic lights that can be configured using an interactive editor. To study the effect of character-centric lighting on observers, we created four lighting configurations derived from the photography and film literature. A user study with 32 subjects shows that the lighting setups do influence the perception of the characters’ personality. We found lighting effects with regard to the perception of dominance. Moreover, we found that the personality perception of female characters seems to change more easily than for male characters.

Alexis Heloir, Kerstin H. Kipp, Michael Kipp

Call Center Stress Recognition with Person-Specific Models

Nine call center employees wore a skin conductance sensor on the wrist for a week at work and reported stress levels of each call. Although everyone had the same job profile, we found large differences in how individuals reported stress levels, with similarity from day to day within the same participant, but large differences across the participants. We examined two ways to address the individual differences to automatically recognize classes of stressful/non-stressful calls, namely modifying the loss function of Support Vector Machines (SVMs) to adapt to the varying priors, and giving more importance to training samples from the most similar people in terms of their skin conductance lability. We tested the methods on 1500 calls and achieved an accuracy across participants of 78.03% when trained and tested on different days from the same person, and of 73.41% when trained and tested on different people using the proposed adaptations to SVMs.

Javier Hernandez, Rob R. Morris, Rosalind W. Picard

Are You Friendly or Just Polite? – Analysis of Smiles in Spontaneous Face-to-Face Interactions

This work is part of a research effort to understand and characterize the morphological and dynamic features of polite and amused smiles. We analyzed a dataset consisting of young adults (n=61), interested in learning about banking services, who met with a professional banker face-to-face in a conference room while both participants’ faces were unobtrusively recorded. We analyzed 258 instances of amused and polite smiles from this dataset, noting also if they were shared, which we defined as the rise of one smile starting before the decay of another. Our analysis confirms previous findings showing longer durations of amused smiles while also suggesting new findings about symmetry of the smile dynamics. We found more symmetry in the velocities of the rise and decay of the amused smiles, and less symmetry in the polite smiles. We also found the fastest decay velocity for polite but shared smiles.

Mohammed Hoque, Louis-Philippe Morency, Rosalind W. Picard

Multiple Instance Learning for Classification of Human Behavior Observations

Analysis of audiovisual human behavior observations is a common practice in behavioral sciences. It is generally carried out by expert annotators who are asked to evaluate several aspects of the observations along various dimensions. This can be a tedious task. We propose that automatic classification of behavioral patterns in this context can be viewed as a multiple instance learning problem. In this paper, we analyze a corpus of married couples interacting about a problem in their relationship. We extract features from both the audio and the transcriptions and apply the Diverse Density-Support Vector Machine framework. Apart from attaining classification on the expert annotations, this framework also allows us to estimate salient regions of the complex interaction.

Athanasios Katsamanis, James Gibson, Matthew P. Black, Shrikanth S. Narayanan

Form as a Cue in the Automatic Recognition of Non-acted Affective Body Expressions

The advent of whole-body interactive technology has increased the importance of creating systems that take into account body expressions to determine the affective state of the user. In doing so, the role played by the form and motion information needs to be understood. Neuroscience studies have shown that biological motion is recognized by separate pathways in the brain. This paper investigates the contribution of body configuration (form) in the automatic recognition of non-acted affective dynamic expressions in a video game context. Sequences of static postures are automatically extracted from motion capture data and presented to the system which is a combination of an affective posture recognition module and a sequence classification rule to finalize the affective state of each sequence. Our results show that using form information only, the system recognition reaches performances very close to the agreement between observers who viewed the affective expressions as animations containing both form and temporal information.

Andrea Kleinsmith, Nadia Bianchi-Berthouze

Design of a Virtual Reality Based Adaptive Response Technology for Children with Autism Spectrum Disorder

Impairments in social communication skills are thought to be core deficits in children with autism spectrum disorder (ASD). In recent years, several assistive technologies, particularly Virtual Reality (VR), have been investigated to promote social interactions in this population. It is well known that these children demonstrate atypical viewing patterns during social interactions and thus monitoring eye-gaze can be valuable to design intervention strategies. However, presently available VR-based systems are designed to chain learning via aspects of one’s performance only, permitting a limited degree of individualization. Given the promise of VR-based social interaction and the usefulness of monitoring eye-gaze in real-time, a novel VR-based dynamic eye-tracking system is developed in this work. The developed system was tested through a small usability study with four adolescents with ASD. The results indicate the potential of the system to promote improved social task performance along with socially-appropriate mechanisms during VR-based social conversation tasks.

Uttama Lahiri, Esubalew Bekele, Elizabeth Dohrmann, Zachary Warren, Nilanjan Sarkar

Exploring the Relationship between Novice Programmer Confusion and Achievement

Using a discovery-with-models approach, we study the relationships between novice Java programmers’ experiences of confusion and their achievement, as measured through their midterm examination scores. Two coders manually labeled samples of student compilation logs with whether they represent a student who was confused. From the labeled data, we built a model that we used to label the entire data set. We then analysed the relationship between patterns of confusion and non-confusion over time, and students’ midterm scores. We found that, in accordance with prior findings, prolonged confusion is associated with poorer student achievement. However, confusion which is resolved is associated with statistically significantly better midterm performance than never being confused at all.

Diane Marie C. Lee, Ma. Mercedes T. Rodrigo, Ryan S. J. d. Baker, Jessica O. Sugay, Andrei Coronel

Semi-Coupled Hidden Markov Model with State-Based Alignment Strategy for Audio-Visual Emotion Recognition

This paper presents an approach to bi-modal emotion recognition based on a semi-coupled hidden Markov model (SC-HMM). A simplified state-based bi-modal alignment strategy in SC-HMM is proposed to align the temporal relation of states between audio and visual streams. Based on this strategy, the proposed SC-HMM can alleviate the problem of data sparseness and achieve better statistical dependency between states of audio and visual HMMs in most real world scenarios. For performance evaluation, audio-visual signals with four emotional states (happy, neutral, angry and sad) were collected. Each of the invited seven subjects was asked to utter 30 types of sentences twice to generate emotional speech and facial expression for each emotion. Experimental results show the proposed bi-modal approach outperforms other fusion-based bi-modal emotion recognition methods.

Jen-Chun Lin, Chung-Hsien Wu, Wen-Li Wei

Associating Textual Features with Visual Ones to Improve Affective Image Classification

Many images carry strong emotional semantics. In recent years, several investigations have sought to automatically identify the emotions that images induce in viewers, based on low-level image properties. Since these features can only capture the atmosphere of an image, they may fail when the emotional semantics are carried by objects. Additional information is therefore needed, and in this paper we propose to make use of textual information describing the image, such as tags. To this end, we have developed two textual features to capture the emotional meaning of the text: one is based on the semantic distance matrix between the text and an emotional dictionary, and the other carries the valence and arousal meanings of words. Experiments were conducted on two datasets to evaluate visual and textual features and their fusion. The results show that our textual features can improve the classification accuracy of affective images.

Ningning Liu, Emmanuel Dellandréa, Bruno Tellez, Liming Chen

3D Corpus of Spontaneous Complex Mental States

Hand-over-face gestures, a subset of emotional body language, are overlooked by automatic affect inference systems. We propose the use of hand-over-face gestures as a novel affect cue for automatic inference of cognitive mental states. Moreover, affect recognition systems rely on the existence of publicly available datasets; often the approach is only as good as the data. We present the collection and annotation methodology of a 3D multimodal corpus of 108 audio/video segments of natural complex mental states. The corpus includes spontaneous facial expressions and hand gestures labelled using crowd-sourcing and is publicly available.

Marwa Mahmoud, Tadas Baltrušaitis, Peter Robinson, Laurel D. Riek

Evaluating the Communication of Emotion via Expressive Gesture Copying Behaviour in an Embodied Humanoid Agent

We present an evaluation of copying behaviour in an embodied agent capable of processing expressivity characteristics of a user’s movement and conveying aspects of it in real-time. The agent responds to affective cues from gestures performed by actors, producing synthesised gestures that exhibit similar expressive qualities. Thus, copying is performed only at the expressive level and information about other aspects of the gesture, such as the shape, is not retained. This research is significant to social interaction between agents and humans, for example, in cases where an agent wishes to show empathy with a conversational partner without an exact copying of their motions.

Maurizio Mancini, Ginevra Castellano, Christopher Peters, Peter W. McOwan

Multi-score Learning for Affect Recognition: The Case of Body Postures

An important challenge in building automatic affective state recognition systems is establishing the ground truth. When the ground-truth is not available, observers are often used to label training and testing sets. Unfortunately, inter-rater reliability between observers tends to vary from fair to moderate when dealing with naturalistic expressions. Nevertheless, the most common approach used is to label each expression with the most frequent label assigned by the observers to that expression. In this paper, we propose a general pattern recognition framework that takes into account the variability between observers for automatic affect recognition. This leads to what we term a multi-score learning problem in which a single expression is associated with multiple values representing the scores of each available emotion label. We also propose several performance measurements and pattern recognition methods for this framework, and report the experimental results obtained when testing and comparing these methods on two affective posture datasets.

Hongying Meng, Andrea Kleinsmith, Nadia Bianchi-Berthouze
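The contrast drawn in the abstract above, between keeping only the most frequent observer label and keeping a score for every available label, can be sketched as follows. This is a minimal illustration that assumes each score is the fraction of observers who chose that label; the abstract does not define the scores precisely, and the function names and example labels are hypothetical.

```python
from collections import Counter


def majority_label(observer_labels):
    """Common baseline: keep only the label chosen most often by observers."""
    return Counter(observer_labels).most_common(1)[0][0]


def multi_score(observer_labels, label_set):
    """Multi-score target: one value per available label.

    Assumed here: the score is the fraction of observers who chose the label.
    """
    counts = Counter(observer_labels)
    total = len(observer_labels)
    return [counts[label] / total for label in label_set]


# Hypothetical annotation of a single posture by four observers.
label_set = ["happy", "sad", "angry"]
observer_labels = ["happy", "happy", "sad", "happy"]

single_target = majority_label(observer_labels)
score_target = multi_score(observer_labels, label_set)
```

The single target discards the one dissenting observer, whereas the score vector retains that disagreement as training signal.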

Multi-modal Affect Induction for Affective Brain-Computer Interfaces

Reliable applications of affective brain-computer interfaces (aBCI) in realistic, multi-modal environments require a detailed understanding of the processes involved in emotions. To explore the modality-specific nature of affective responses, we studied neurophysiological responses (i.e., EEG) of 24 participants during visual, auditory, and audiovisual affect stimulation. The affect induction protocols were validated by participants’ subjective ratings and physiological responses (i.e., ECG). Consistent with the literature, we found modality-specific responses in the EEG: posterior alpha power decreases during visual stimulation and increases during auditory stimulation, whereas anterior alpha power tends to decrease during auditory stimulation and to increase during visual stimulation. We discuss the implications of these results for multi-modal aBCI.

Christian Mühl, Egon L. van den Broek, Anne-Marie Brouwer, Femke Nijboer, Nelleke van Wouwe, Dirk Heylen

Toward a Computational Framework of Suspense and Dramatic Arc

We propose a computational framework for the recognition of suspense and dramatic arc in stories. Suspense is an affective response to narrative structure that accompanies the reduction in quantity or quality of plans available to a protagonist faced with potential goal failure and/or harm. Our work is motivated by the recognition that computational systems are historically unable to reliably reason about aesthetic or affective qualities of story structures. Our proposed framework, Dramatis, reads a story, identifies potential failures in the plans and goals of the protagonist, and computes a suspense rating at various points in the story. To compute suspense, Dramatis searches for ways in which the protagonist can overcome the failure and produces a rating inversely proportional to the likelihood of the best approach to overcoming the failure. If applied to story generation, Dramatis could allow for the creation of stories with knowledge of suspense and dramatic arc.

Brian O’Neill, Mark Riedl
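The suspense computation described in the Dramatis abstract above (a rating inversely proportional to the likelihood of the protagonist's best escape plan) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `suspense_rating`, the `floor` parameter, and the representation of plans as a plain list of likelihoods are assumptions made here for illustration.

```python
def suspense_rating(escape_plan_likelihoods, floor=1e-6):
    """Return a suspense score inversely proportional to the best plan's likelihood.

    escape_plan_likelihoods: hypothetical list of probabilities in [0, 1], one per
    plan the protagonist could use to avert the predicted failure.
    floor: assumed lower bound that avoids division by zero when no plan exists.
    """
    best = max(escape_plan_likelihoods) if escape_plan_likelihoods else 0.0
    return 1.0 / max(best, floor)


# The fewer or less likely the escape plans, the higher the rating.
print(suspense_rating([0.5, 0.25]))  # best plan has likelihood 0.5
print(suspense_rating([0.1]))        # only an unlikely plan remains
```

Evaluating this function at several points in a story would give a suspense curve over time, which is one possible way to read the "dramatic arc" the abstract refers to.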

A Generic Emotional Contagion Computational Model

This work describes a computational model designed for emotional contagion simulation in societies of agents, integrating the influence of interpersonal relationships and personality. It models the fundamental differences in individual susceptibilities to contagion based on the psychology study of Emotional Contagion Scale. The contagion process can also be biased by inter-individual relationships depending on the intimacy and power difference aspects of relationships between agents. Individuals’ expressiveness in a group is influenced by both the extroversion personality trait and power difference.

Additionally, the computational model includes the process of mood decay, as usually observed in people, expanding its application domain beyond that of pure simulation, like games. In this paper we present simulation results that verify the basic emotional contagion behaviors. The possibility of more complex contagion dynamics depending on agent group relationships is also presented.

Gonçalo Pereira, Joana Dimas, Rui Prada, Pedro A. Santos, Ana Paiva
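To make the ingredients named in the emotional contagion abstract above more concrete (individual susceptibility, sender expressiveness, relationship intimacy, and mood decay), here is a minimal simulation sketch. The update rule, the value ranges, and all names (`Agent`, `contagion_step`, `intimacy`, `decay`) are assumptions chosen for illustration and are not taken from the paper; power difference is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    mood: float            # assumed range [-1, 1]
    susceptibility: float  # assumed range [0, 1]; how strongly the agent catches moods
    expressiveness: float  # assumed range [0, 1]; how strongly the agent displays its mood


def contagion_step(agents, intimacy, decay=0.1):
    """Apply one synchronous contagion update to every agent.

    intimacy[i][j] (assumed range [0, 1]) weights how much agent j influences agent i.
    decay is an assumed per-step fraction by which mood relaxes towards neutral (0).
    """
    new_moods = []
    for i, receiver in enumerate(agents):
        pull = 0.0
        for j, sender in enumerate(agents):
            if i == j:
                continue
            # Expressive, intimate senders pull the receiver's mood towards their own.
            pull += intimacy[i][j] * sender.expressiveness * (sender.mood - receiver.mood)
        mood = (receiver.mood + receiver.susceptibility * pull) * (1.0 - decay)
        new_moods.append(max(-1.0, min(1.0, mood)))  # keep mood within bounds
    for agent, mood in zip(agents, new_moods):
        agent.mood = mood


happy = Agent(mood=1.0, susceptibility=0.0, expressiveness=1.0)
neutral = Agent(mood=0.0, susceptibility=0.5, expressiveness=0.0)
contagion_step([happy, neutral], intimacy=[[0.0, 1.0], [1.0, 0.0]], decay=0.0)
print(neutral.mood)  # the neutral agent has moved towards the happy one
```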

Generic Physiological Features as Predictors of Player Experience

This paper examines the generality of features extracted from heart rate (HR) and skin conductance (SC) signals as predictors of self-reported player affect expressed as pairwise preferences. Artificial neural networks are trained to accurately map physiological features to expressed affect in two dissimilar and independent game surveys. The performance of the obtained affective models which are trained on one game is tested on the unseen physiological and self-reported data of the other game. Results in this early study suggest that there exist features of HR and SC such as average HR and one and two-step SC variation that are able to predict affective states across games of different genre and dissimilar game mechanics.

Héctor Perez Martínez, Maurizio Garbarino, Georgios N. Yannakakis
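The features named in the abstract above (average HR and one- and two-step SC variation) could be computed along the following lines. The abstract does not define "step variation"; this sketch assumes it means the mean absolute difference between samples that are one or two steps apart. The function names and the dictionary keys are illustrative only.

```python
from statistics import mean


def step_variation(signal, step):
    """Mean absolute difference between samples `step` positions apart (assumed definition)."""
    return mean(abs(signal[i + step] - signal[i]) for i in range(len(signal) - step))


def extract_features(hr, sc):
    """Build a small feature dictionary from heart rate and skin conductance samples."""
    return {
        "avg_hr": mean(hr),
        "sc_var_1": step_variation(sc, 1),
        "sc_var_2": step_variation(sc, 2),
    }


print(extract_features(hr=[60, 70, 80], sc=[1.0, 2.0, 4.0]))
```

A feature vector like this would then be fed to a preference learner; the abstract mentions artificial neural networks trained on pairwise preferences, which this sketch does not cover.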

Guess What? A Game for Affective Annotation of Video Using Crowd Sourcing

One of the most time consuming and laborious problems facing researchers in Affective Computing is annotation of data, particularly with the recent adoption of multimodal data. Other fields, such as Computer Vision, Language Processing and Information Retrieval have successfully used crowd sourcing (or human computation) games to label their data sets. Inspired by their work, we have developed a Facebook game called Guess What? for labeling multimodal, affective video data. This paper describes the game and an initial evaluation of it for social context labeling. In our experiment, 33 participants used the game to label 154 video/question pairs over the course of a few days, and their overall inter-rater reliability was good (Krippendorff’s α = .70). We believe this game will be a useful resource for other researchers and ultimately plan to make Guess What? open source and available to anyone who is interested.

Laurel D. Riek, Maria F. O’Connor, Peter Robinson
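The abstract above reports inter-rater reliability as a Krippendorff statistic. As a reminder of how that coefficient is obtained for nominal labels, here is a minimal sketch based on the standard coincidence-matrix formulation. It is not the authors' analysis code; the function name and the input layout (one list of labels per rated item, with missing ratings simply omitted) are assumptions.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: list of lists; each inner list holds the labels given to one item.
    Items with fewer than two labels carry no agreement information and are skipped.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        # Every ordered pair of ratings within an item contributes 1 / (m - 1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n == 0:
        raise ValueError("need at least one item with two or more labels")

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b] for a in totals for b in totals if a != b) / (n - 1)
    if expected == 0:
        # Only one category was ever used; alpha is undefined, 1.0 is returned by convention here.
        return 1.0
    return 1.0 - observed / expected


print(krippendorff_alpha_nominal([["yes", "yes"], ["no", "no"], ["yes", "no"]]))
```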

Modeling Learner Affect with Theoretically Grounded Dynamic Bayesian Networks

Evidence of the strong relationship between learning and emotion has fueled recent work in modeling affective states in intelligent tutoring systems. Many of these models are based on general models of affect without a specific focus on learner emotions. This paper presents work that investigates the benefits of using theoretical models of learner emotions to guide the development of Bayesian networks for prediction of student affect. Predictive models are empirically learned from data acquired from 260 students interacting with the game-based learning environment, Crystal Island. Results indicate the benefits of using theoretical models of learner emotions to inform predictive models. The most successful model, a dynamic Bayesian network, also highlights the importance of temporal information in predicting learner emotions. This work demonstrates the benefits of basing predictive models of learner emotions on theoretical foundations and has implications for how these models may be used to validate theoretical models of emotion.

Jennifer Sabourin, Bradford Mott, James C. Lester

Evaluations of Piezo Actuated Haptic Stimulations

The present aim was to study emotion-related evaluations of piezo actuated haptic stimulations. We conducted three experiments where the presentation type (i.e., haptic only, haptic auditory, and auditory only) of the stimulus was varied. The participants’ task was to rank which of the two sequentially presented stimuli was more pleasant and which was more arousing. All pairwise comparisons were created from 9 stimuli varied by rise time (i.e., 1, 3, and 10 ms) and amplitude (i.e., 2, 7, and 30 µm). The results showed that in general the haptic only and haptic auditory stimuli were ranked as more pleasant and arousing than the auditory only stimuli. In addition, the results suggest that the stimuli with long rise times can be seen as more applicable than the stimuli with short rise times as they were in general ranked as more pleasant and arousing.

Katri Salminen, Veikko Surakka, Jani Lylykangas, Jussi Rantala, Pauli Laitinen, Roope Raisamo

The Relationship between Carelessness and Affect in a Cognitive Tutor

We study the relationship between student carelessness and affect among high-school students using a Cognitive Tutor for Scatterplots, using a machine-learned detector of carelessness and field observations of student affect. In line with previous research, we say a student is careless when he/she makes a mistake performing a task that he/she already knows. This construct is also known as slipping. Somewhat non-intuitively, we find that students exhibiting high levels of engaged concentration slip frequently. These findings imply that a student who is engaged in a task may be overconfident, impulsive or hurried, leading to more careless errors. On the other hand, students who display confusion or boredom make fewer careless errors. Further analysis over time suggests that confused and bored students have lower learning overall. Therefore, these students’ mistakes stem from a genuine lack of knowledge rather than carelessness. The use of two versions of the tutor in this study, with and without an Embodied Conversational Agent (ECA), shows no significant difference in terms of the relationship between carelessness and affect.

Maria Ofelia Clarissa Z. San Pedro, Ma. Mercedes T. Rodrigo, Ryan S. J. d. Baker

EmotionML – An Upcoming Standard for Representing Emotions and Related States

The present paper describes the specification of Emotion Markup Language (EmotionML) 1.0, which is undergoing standardisation at the World Wide Web Consortium (W3C). The language aims to strike a balance between practical applicability and scientific well-foundedness. We briefly review the history of the process leading to the standardisation of EmotionML. We describe the syntax of EmotionML as well as the vocabularies that are made available to describe emotions in terms of categories, dimensions, appraisals and/or action tendencies. The paper concludes with a number of relevant aspects of emotion that are not covered by the current specification.

Marc Schröder, Paolo Baggia, Felix Burkhardt, Catherine Pelachaud, Christian Peter, Enrico Zovato
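For readers unfamiliar with the markup the abstract above describes, the following sketch builds a very small EmotionML document with Python's standard library. The namespace URI and the "big6" category vocabulary URI are written as they appear, to the best of my knowledge, in the later W3C Recommendation; the draft discussed in the 2011 paper may differ, so treat both constants as assumptions. The helper name `build_emotionml` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed values, taken from the W3C EmotionML 1.0 Recommendation as recalled here.
EMOTIONML_NS = "http://www.w3.org/2009/10/emotionml"
BIG6_VOCABULARY = "http://www.w3.org/TR/emotion-voc/xml#big6"


def build_emotionml(category):
    """Return an EmotionML string annotating a single categorical emotion."""
    ET.register_namespace("", EMOTIONML_NS)  # serialize with a default namespace
    root = ET.Element(f"{{{EMOTIONML_NS}}}emotionml", {"version": "1.0"})
    emotion = ET.SubElement(
        root, f"{{{EMOTIONML_NS}}}emotion", {"category-set": BIG6_VOCABULARY}
    )
    ET.SubElement(emotion, f"{{{EMOTIONML_NS}}}category", {"name": category})
    return ET.tostring(root, encoding="unicode")


print(build_emotionml("happiness"))
```

Dimensions, appraisals, and action tendencies, which the abstract also mentions, are described with analogous child elements and their own vocabularies; they are not shown here.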

Emotion-Based Intrinsic Motivation for Reinforcement Learning Agents

In this paper, we propose an adaptation of four common appraisal dimensions that evaluate the relation of an agent with its environment into reward features within an intrinsically motivated reinforcement learning framework. We show that, by optimizing the relative weights of such features for a given environment, the agents attain a greater degree of fitness while overcoming some of their perceptual limitations. This optimization process resembles the evolutionary adaptive process that living organisms are subject to. We illustrate the application of our method in several simulated foraging scenarios.

Pedro Sequeira, Francisco S. Melo, Ana Paiva
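The idea in the abstract above, reward features derived from appraisal dimensions and combined with tunable relative weights, can be sketched as a weighted sum. The feature names used below ("novelty", "control") are placeholders, since the abstract does not name the four dimensions, and the linear combination with the extrinsic reward is an assumption rather than the paper's stated formula.

```python
def intrinsic_reward(features, weights, extrinsic_reward=0.0, extrinsic_weight=1.0):
    """Combine appraisal-based reward features with an extrinsic reward.

    features: hypothetical mapping from feature name to its current value.
    weights: mapping from feature name to its relative weight; these are the
             parameters that would be optimized for a given environment.
    """
    shaped = sum(weights[name] * value for name, value in features.items())
    return extrinsic_weight * extrinsic_reward + shaped


reward = intrinsic_reward(
    features={"novelty": 0.5, "control": 1.0},
    weights={"novelty": 2.0, "control": -1.0},
    extrinsic_reward=1.0,
)
print(reward)
```

Searching over `weights` for the setting that maximizes an agent's long-run fitness corresponds to the optimization step the abstract compares to evolutionary adaptation.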

The Good, the Bad and the Neutral: Affective Profile in Dialog System-User Communication

We describe the use of affective profiles in a dialog system and its effect on participants’ perception of conversational partners and experienced emotional changes in an experimental setting, as well as the mechanisms for realising three different affective profiles and for steering task-oriented follow-up dialogs. Experimental results show that the system’s affective profile determines the rating of chatting enjoyment and user-system emotional connection to a large extent. Self-reported emotional changes experienced by participants during an interaction with the system are also strongly correlated with the type of applied profile. Perception of core capabilities of the system, realism and coherence of dialog, are only influenced to a limited extent.

Marcin Skowron, Stefan Rank, Mathias Theunis, Julian Sienkiewicz

Effect of Affective Profile on Communication Patterns and Affective Expressions in Interactions with a Dialog System

Interlocutors’ affective profile and character traits play an important role in interactions. In the presented study, we apply a dialog system to investigate the effects of the affective profile on user-system communication patterns and users’ expressions of affective states. We describe the data-set acquired from experiments with the affective dialog system, the tools used for its annotation and findings regarding the effect of affective profile on participants’ communication style and affective expressions.

Marcin Skowron, Mathias Theunis, Stefan Rank, Anna Borowiec

Persuasive Language and Virality in Social Networks

This paper aims to provide new insights on the concept of virality and on its structure - especially in social networks. We argue that: (a) virality is a phenomenon strictly connected to the nature of the content being spread (b) virality is a phenomenon with many affective responses, i.e. under this generic term several different effects of persuasive communication are comprised. To give ground to our claims, we provide initial experiments in a machine learning framework to show how various aspects of virality can be predicted according to content features. We further provide a class-based psycholinguistic analysis of the features salient for virality components.

Carlo Strapparava, Marco Guerini, Gözde Özbal

A Multimodal Database for Mimicry Analysis

In this paper we introduce a multi-modal database for the analysis of human interaction, in particular mimicry, and elaborate on the theoretical hypotheses of the relationship between the occurrence of mimicry and human affect. The recorded experiments are designed to explore this relationship. The corpus is recorded with 18 synchronised audio and video sensors, and is annotated for many different phenomena, including dialogue acts, turn-taking, affect, head gestures, hand gestures, body movement and facial expression. Recordings were made of two experiments: a discussion on a political topic, and a role-playing game. 40 participants were recruited, all of whom self-reported their felt experiences. The corpus will be made available to the scientific community.

Xiaofan Sun, Jeroen Lichtenauer, Michel Valstar, Anton Nijholt, Maja Pantic

Mood Recognition Based on Upper Body Posture and Movement Features

While studying body postures in relation to mood is not a new concept, the majority of these studies rely on actors’ interpretations. This project investigated the temporal aspects of naturalistic body postures while users listened to mood-inducing music. Video data was collected while participants listened to eight minutes of music during two sessions (happy and sad) in a within-subjects design. Subjectively reported mood scores validated that mood did differ significantly for valence and energy. Video analysis consisted of postural ratings for the head, shoulders, trunk, arms, and head and hand tapping. Results showed significant differences for the majority of these dimensions by mood. This study showed that certain body postures are indicative of certain mood states in a naturalistic setting.

Michelle Thrasher, Marjolein D. Van der Zwaag, Nadia Bianchi-Berthouze, Joyce H. D. M. Westerink

Emotional Aware Clustering on Micro-blogging Sources

Microblogging services have become a very popular communication tool among Internet users. Since millions of users share opinions on different aspects of life every day, microblogging web sites are considered a credible source for exploring both factual and subjective information. This fact has inspired research in the area of automatic sentiment analysis. In this paper we propose an emotion-aware clustering approach which performs sentiment analysis of users’ tweets on the basis of an emotional dictionary and groups tweets according to the degree to which they express a specific set of emotions. Experimental evaluations on datasets derived from Twitter demonstrate the efficiency of the proposed approach.

Katerina Tsagkalidou, Vassiliki Koutsonikola, Athena Vakali, Konstantinos Kafetsios

A Phonetic Analysis of Natural Laughter, for Use in Automatic Laughter Processing Systems

In this paper, we present the detailed phonetic annotation of the publicly available AVLaughterCycle database, which can readily be used for automatic laughter processing (analysis, classification, browsing, synthesis, etc.). The phonetic annotation is used here to analyze the database, as a first step. Unsurprisingly, we find that h-like phones and central vowels are the most frequent sounds in laughter. However, laughs can contain many other sounds. In particular, nareal fricatives (voiceless friction in the nostrils) are frequent both in inhalation and exhalation phases. We show that the airflow direction (inhaling or exhaling) significantly changes the duration of laughter sounds. Individual differences in the choice of phones and their duration are also examined. The paper concludes with some perspectives that the annotated database opens.

Jérôme Urbain, Thierry Dutoit

The Impact of Music on Affect during Anger Inducing Drives

Driver anger could be potentially harmful for road safety and long-term health. Because of its mood inducing properties, music is assumed to be a potential medium that could prevent anger induction during driving. In the current study the influence of music on anger, mood, skin conductance, and systolic blood pressure was investigated during anger inducing scenarios in a driving simulator. 100 participants were split into five groups: four listened to different types of music (high / low energy in combination with both positive / negative valence) or a no music control. Results showed that anger induction was highest during high energy negative music compared to positive music irrespective of energy level. Systolic blood pressure and skin conductance levels were higher during high energy negative music and no music compared to low energy music. Music was demonstrated to mediate the state of anger and therefore can have positive health benefits in the long run.

Marjolein D. van der Zwaag, Stephen Fairclough, Elena Spiridon, Joyce H. D. M. Westerink

Unsupervised Temporal Segmentation of Talking Faces Using Visual Cues to Improve Emotion Recognition

The mouth region of the human face possesses highly discriminative information regarding the expressions on the face. Facial expression analysis to infer the emotional state of a user becomes very challenging when the user talks, as most of the mouth actions while uttering certain words match mouth shapes expressing various emotions. We introduce a novel unsupervised method to temporally segment talking faces from the faces displaying only emotions, and use the knowledge of talking face segments to improve emotion recognition. The proposed method uses an integrated gradient histogram of local binary patterns to represent mouth features suitably and identifies temporal segments of talking faces online by estimating the uncertainties of mouth movements over a period of time. The algorithm accurately identifies talking face segments on a real-world database where talking and emotion happen naturally. Also, the emotion recognition system, using talking face cues, showed considerable improvement in recognition accuracy.

Sudha Velusamy, Viswanath Gopalakrishnan, Bilva Navathe, Hariprasad Kannan, Balasubramanian Anand, Anshul Sharma

The Affective Experience of Handling Digital Fabrics: Tactile and Visual Cross-Modal Effects

In the textile sector, emotions are often associated with both physical touch and manipulation of the product. Thus there is the need to recreate the affective experiences of touching and interacting with fabrics using commonly available internet technology. New digital interactive representations of fabrics simulating handling have been proposed with the idea of bringing the digital experience of fabrics closer to the reality. This study evaluates the contribution of handling real fabrics to viewing digital interactive animations of said fabrics and vice versa. A combination of self-report and physiological measures was used. Results showed that having previous physical handling experience of the fabrics significantly increased pleasure and engagement in the visual experience of the digital handling of the same fabrics. Two factors mediated these experiences: gender and interoceptive awareness. Significant results were not found for the opposite condition.

Di Wu, Ting-I Wu, Harsimrat Singh, Stefano Padilla, Douglas Atkinson, Nadia Bianchi-Berthouze, Mike Chantler, Sharon Baurley

Ranking vs. Preference: A Comparative Study of Self-reporting

This paper introduces a comparative analysis between rating and pairwise self-reporting via questionnaires in user survey experiments. Two dissimilar game user survey experiments are employed in which the two questionnaire schemes are tested and compared for reliable affect annotation. The statistical analysis followed to test our hypotheses shows that even though the two self-reporting schemes are consistent there are significant order of reporting effects when subjects report via a rating questionnaire. The paper concludes with a discussion of the appropriateness of each self-reporting scheme under conditions drawn from the experimental results obtained.

Georgios N. Yannakakis, John Hallam

Poster Papers

Towards a Generic Framework for Automatic Measurements of Web Usability Using Affective Computing Techniques

We propose a generic framework for the automatic usability evaluation of web sites by combining traditional automatic usability methods with affective computing techniques. To evaluate the framework, a pilot study was carried out in which users (n=4) reported their affective states using dimensional and categorical models. Binary task completion, time, mouse clicks, and error rates were automatically captured for each page as indicators of web usability. Results suggested that frustration was experienced when error rates and time for the task were higher. Delight, on the other hand, was at the other end of the spectrum. When usability measurements had almost the same values (e.g. confusing or engaging pages), affective states may be a way to show the difference.

Payam Aghaei Pour, Rafael A. Calvo

Simulating Affective Behaviours : An Approach Based on the COR Theory

The expression of emotion is usually considered an important step towards the believability of a virtual agent. However, current models based on emotion categories face important challenges in their attempts to model the influence of emotions on agents’ behaviour. To address this problem, we propose an architecture based on the COnservation of Resources theory (COR) which aims at producing affective behaviours in various scenarios. In this paper we explain the principle of such a model, how it is implemented, and how it can be evaluated.

Sabrina Campano, Etienne de Sevin, Vincent Corruble, Nicolas Sabouret

Emotional Investment in Naturalistic Data Collection

We present results from two experiments intended to allow naturalistic data collection of the physiological effects of cognitive load. Considering the example of command and control environments, we identify shortcomings of previous studies which use either laboratory-based scenarios, lacking realism, or real-world scenarios, lacking repeatability. We identify the hybrid approach of remote-control which allows experimental subjects to remain in a laboratory setting, performing a real-world task in a completely controlled environment. We show that emotional investment is vital for evoking natural responses and that physiological indications of cognitive load manifest themselves more readily in our hybrid experimental setup. Finally, we present a set of experimental design recommendations for naturalistic data collection.

Ian Davies, Peter Robinson

When Do We Smile? Analysis and Modeling of the Nonverbal Context of Listener Smiles in Conversation

In this paper we look into reactive models for generating smiling behavior in embodied conversational agents. One trigger for smiling behavior is smiling by the human interlocutor, which is used in reactive models based on mimicry. However, other features might be useful as well. In order to develop such models we look at the nonverbal context of smiles in human-human conversation. We distinguish between three types of smiles (amused, polite and embarrassed) and highlight differences in the context where each type occurs in conversation. Using machine learning techniques we have built predictive models using the nonverbal contextual features analyzed. Results show that reactive models can offer an interesting contribution to the generation of smiling behaviors.

Iwan de Kok, Dirk Heylen

Emotional Cognitive Architectures

We investigate the value of bringing emotional components into cognitive architectures. We start by presenting CELTS, an emotional cognitive architecture, with the aim of showing that the emotional component of the architecture is an essential element of CELTS’s value as a cognitive architecture. We do so by analyzing the role that the emotional mechanism plays and how respecting the emotion criterion defined by Picard [15] may be a way to address at once several of the architectural features covered by Sun’s desiderata [10] or Newell’s functional criteria [9].

Usef Faghihi, Pierre Poirier, Othalia Larue

Kalman Filter-Based Facial Emotional Expression Recognition

In this work we examine the use of State-Space Models to model the temporal information of dynamic facial expressions. The latter are represented by the 3D animation parameters, which are recovered using the 3D Candide model. The 3D animation parameters of an image sequence can be seen as the observation of a stochastic process which can be modeled by a linear State-Space Model, the Kalman Filter. In the proposed approach each emotion is represented by a Kalman Filter, whose parameters are the State Transition matrix, the Observation matrix, and the State and Observation noise covariance matrices. Person-independent experimental results demonstrate the validity and the good generalization ability of the proposed approach for emotional facial expression recognition. Moreover, compared to state-of-the-art techniques, the proposed system yields significant improvements in recognizing facial expressions.

Ping Fan, Isabel Gonzalez, Valentin Enescu, Hichem Sahli, Dongmei Jiang

SARA: Social Affective Relational Agent: A Study on the Role of Empathy in Artificial Social Agents

Over the last decade, extensive research has been conducted in the area of conversational agents, focusing on many different aspects of these agents. In this research, which aims at building agents that maintain a social connection with users, empathy has been one of those areas, as it plays a leading role in the establishment of social relationships. In this paper we present a relationship model of empathy that takes advantage of Social Penetration Theory’s concepts for relationship building. This model has been implemented in an agent that attempts to establish a relationship with the user, expressing empathy both verbally and visually. The visual expression of empathy consists of facial expression and physical proximity representation. The user tests showed that, while users were able to develop a simple relationship with the agents, they developed stronger relationships with the version of the agent that is most visually expressive and takes advantage of the proximity element. This confirms the significance that our model based on Social Penetration Theory may have and, consequently, the importance of the visual representation of empathic responses.

Sandra Gama, Gabriel Barata, Daniel Gonçalves, Rui Prada, Ana Paiva

Learning General Preference Models from Physiological Responses in Video Games: How Complex Is It?

An affective preference model can be successfully learnt from pairwise comparison of physiological responses. Several approaches to this task achieve different levels of performance. The highest-ranked ones seem to use non-linear models and complex feature selection strategies. We present a comparison of three linear and non-linear classification methods, combined with a simple and a complex feature selection strategy (sequential forward selection and a genetic algorithm), on two datasets. We apply a strict cross-validation framework to test the generalization capability of the models when facing physiological data coming from a new user. We show that, when generalization is the goal, complex non-linear models trained using fancy strategies might easily get trapped by overfitting, while linear ones might be preferable. Although this could be expected, the only way to appreciate it is through proper cross-validation, and this is often forgotten in the rush for the “best” performance.

Maurizio Garbarino, Simone Tognetti, Matteo Matteucci, Andrea Bonarini
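The abstract above stresses testing on physiological data from a new user. One common way to enforce this, assumed here since the abstract does not spell out the exact protocol, is leave-one-user-out cross-validation, where every fold holds out all samples of a single user. The generator name and its return layout are illustrative.

```python
def leave_one_user_out(user_ids):
    """Yield (held_out_user, train_indices, test_indices) for each distinct user.

    user_ids: sequence giving, for every sample, the identifier of the user who produced it.
    No sample of the held-out user ever appears in the training indices.
    """
    for held_out in sorted(set(user_ids)):
        train = [i for i, user in enumerate(user_ids) if user != held_out]
        test = [i for i, user in enumerate(user_ids) if user == held_out]
        yield held_out, train, test


for user, train_idx, test_idx in leave_one_user_out(["a", "a", "b", "c"]):
    print(user, train_idx, test_idx)
```

Any feature selection step, such as sequential forward selection, would need to run inside each training fold only; otherwise information from the held-out user leaks into the model and the performance estimate becomes optimistic.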

Towards Real-Time Affect Detection Based on Sample Entropy Analysis of Expressive Gesture

Aiming at providing a solid foundation to the creation of future affect detection applications in HCI, we propose to analyze human expressive gesture by computing movement Sample Entropy (SampEn). This method provides two main advantages: (i) it is adapted to the non-linearity and non-stationarity of human movement; (ii) it allows a fine-grain analysis of the information encoded in the movement features dynamics. A realtime application is presented, implementing the SampEn method. Preliminary results obtained by computing SampEn on two expressive features, smoothness and symmetry, are provided in a video available on the web.

Donald Glowinski, Maurizio Mancini
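As background for the abstract above, Sample Entropy can be computed with a short routine following the standard definition: count pairs of matching templates of length m and of length m + 1, then take the negative logarithm of their ratio. This is a plain, quadratic-time sketch rather than the real-time implementation the paper presents. The tolerance `r` is treated as an absolute value here (it is often set relative to the signal's standard deviation), and matches use a less-than-or-equal comparison; both are choices made for this example.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample Entropy of a numeric sequence, with absolute tolerance r."""
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths, as in the standard definition.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # exclude self-matches
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # undefined for this series; reported as infinity here
    return -math.log(a / b)


# A perfectly regular signal is fully predictable; an irregular one scores higher.
print(sample_entropy([0, 1] * 10))
print(sample_entropy([0, 0, 0, 1], m=1, r=0))
```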

Predicting Learner Engagement during Well-Defined and Ill-Defined Computer-Based Intercultural Interactions

This article reviews the first of two experiments investigating the effect that tailoring of training content has on a learner’s perceived engagement, and examining the influence that the Big Five Personality Test and the Self-Assessment Manikin (SAM) mood dimensions have on these outcome measures. A secondary objective is to correlate signals from physiological sensors with other variables of interest, and to develop a model of learner engagement. Self-reported measures were derived from the engagement index of the Independent Television Commission-Sense of Presence Inventory (ITC-SOPI). Physiological measures were based on the commercial Emotiv Epoc Electroencephalograph (EEG) brain-computer interface. Analysis shows personality factors to be reliable predictors of general engagement within well-defined and ill-defined tasks, and suggests they could be used to tailor instructional strategies where engagement is predicted to be non-optimal. It was also evident that Emotiv provides reliable measures of engagement and excitement in near real-time.

Benjamin S. Goldberg, Robert A. Sottilare, Keith W. Brawner, Heather K. Holden

Context-Independent Facial Action Unit Recognition Using Shape and Gabor Phase Information

In this paper we investigate the combination of shape features and Phase-based Gabor features for context-independent Action Unit Recognition. For our recognition goal, three regions of interest have been devised that efficiently capture the AUs activation/deactivation areas. In each of these regions a feature set consisting of geometrical and histogram of Gabor phase appearance-based features have been estimated. For each Action Unit, we applied Adaboost for feature selection, and used a binary SVM for context-independent classification. Using the Cohn-Kanade database, we achieved an average F1 score of 93.8% and an average area under the ROC curve of 97.9%, for the 11 AUs considered.

Isabel Gonzalez, Hichem Sahli, Valentin Enescu, Werner Verhelst

Conveying Emotion with Moving Images: Relationship between Movement and Emotion

We investigated the relationship between movement and conveying emotion with moving images. We developed a software system for generating moving images in which movement is specified with moving effects consisting of a few elements. We prepared eight movements from the Vertex Noise moving effect, which consists of the three elements of speed, density, and strength, by giving each element different values and combined them with still images and sound data to generate moving images that would convey emotions. Subjects looked at moving images without sound and determined whether they felt certain emotions. The results showed that a higher density value affects the conveyance of any emotion with moving images, and that strength distinguishes the conveyance of anger from fear and sadness. Anger is the most recognizable emotion, and fear and sadness are difficult to distinguish from movements.

Rumi Hiraga, Keitaro Takahashi

Hybrid Fusion Approach for Detecting Affects from Multichannel Physiology

Bringing emotional intelligence to computer interfaces is one of the primary goals of affective computing. This goal requires detecting emotions often through multichannel physiology and/or behavioral modalities. While most affective computing studies report high affect detection rate from physiological data, there is no consensus on which methodology in terms of feature selection or classification works best for this type of data. This study presents a framework for fusing physiological features from multiple channels using machine learning techniques to improve the accuracy of affect detection. A hybrid fusion based on weighted majority vote technique for integrating decisions from individual channels and feature level fusion is proposed. The results show that decision fusion can achieve higher classification accuracy for affect detection compared to the individual channels and feature level fusion. However, the highest performance is achieved using the hybrid fusion model.

Md. Sazzad Hussain, Rafael A. Calvo, Payam Aghaei Pour
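The decision-level part of the fusion described in the abstract above relies on a weighted majority vote over per-channel decisions. A minimal sketch of such a vote follows. The channel names, the weights, and the idea of adding the feature-level-fusion classifier as one more voter are assumptions made for illustration; the paper's actual weighting scheme is not given in the abstract.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: mapping from channel name to the label predicted by that channel's classifier.
    weights: mapping from channel name to its vote weight (for example, its validation accuracy).
    Ties are broken by taking the alphabetically first label, so results are deterministic.
    """
    scores = defaultdict(float)
    for channel, label in predictions.items():
        scores[label] += weights.get(channel, 0.0)
    return max(sorted(scores), key=scores.__getitem__)


# Hypothetical channels; "fused" stands for a classifier trained on concatenated features.
votes = {"ecg": "high", "gsr": "low", "emg": "low", "fused": "high"}
weights = {"ecg": 0.9, "gsr": 0.4, "emg": 0.3, "fused": 0.8}
print(weighted_majority_vote(votes, weights))
```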

Investigating the Suitability of Social Robots for the Wellbeing of the Elderly

This study aims to understand if, and how, social robots can promote wellbeing in the elderly. The existing literature suggests that social robots have the potential to improve wellbeing in the elderly, but existing robots focus more on healthcare and healthy behaviour among the elderly. This work describes a new investigation based on focus groups and home studies, in which we produced a set of requirements for social robots that reduce loneliness and improve psychological wellbeing among the elderly. The requirements were validated with the participants of our study. We anticipate that the results of this work will lead to the design of a new social robot more suited to improving wellbeing of the elderly.

Suzanne Hutson, Soo Ling Lim, Peter J. Bentley, Nadia Bianchi-Berthouze, Ann Bowling

The Effects of Emotionally Worded Synthesized Speech on the Ratings of Emotions and Voice Quality

The present research investigated how the verbal content of synthetic messages affects participants’ emotional responses and the ratings of voice quality. Twenty-eight participants listened to emotionally worded sentences produced in a monotonous and a prosodic tone of voice while the activity of the corrugator supercilii facial muscle was measured. Ratings of emotions and voice quality were also collected. The results showed that the ratings of emotions were significantly affected by the emotional content of the sentences. The prosodic tone of voice evoked more emotion-relevant ratings of arousal than the monotonous voice. Corrugator responses did not seem to reflect emotional reactions. Interestingly, the quality of the same voice was rated higher when the content of the sentences was positive than when it was neutral or negative. Thus, the emotional content of spoken messages can be used to regulate users’ emotions and to evoke positive feelings about the voices.

Mirja Ilves, Veikko Surakka, Toni Vanhala

Evaluating a Cognitive-Based Affective Student Model

Predicting students’ emotions raises several questions about the data on which these predictions should be grounded. This article describes an empirical evaluation of a cognitive-based affective user model, carried out with 7th grade students. The affective model is based on the OCC psychological theory of emotions and infers the students’ emotions from their actions and choices in the interface of the learning system. The model relies on a BDI model to implement the inference of students’ emotions in a web-based learning environment. Two experiments were conducted, based on a direct and an indirect approach. The results of the evaluation are discussed, and some ideas for improving the experimental protocol are presented.

Patricia A. Jaques, Rosa Vicari, Sylvie Pesty, Jean-Claude Martin

Audio Visual Emotion Recognition Based on Triple-Stream Dynamic Bayesian Network Models

We present a triple stream DBN model (T_AsyDBN) for audio visual emotion recognition, in which the two audio feature streams are synchronous, while they are asynchronous with the visual feature stream within controllable constraints. MFCC features and the principal component analysis (PCA) coefficients of local prosodic features are used for the audio streams. For the visual stream, 2D facial features as well as 3D facial animation unit features are defined and concatenated, and the feature dimensions are reduced by PCA. Emotion recognition experiments on the eNTERFACE’05 database show that by adjusting the asynchrony constraint, the proposed T_AsyDBN model obtains an 18.73% higher correction rate than the traditional multi-stream state synchronous HMM (MSHMM), and a 10.21% higher rate than the two stream asynchronous DBN model (Asy_DBN).

Dongmei Jiang, Yulu Cui, Xiaojing Zhang, Ping Fan, Isabel Ganzalez, Hichem Sahli

Erratum

Erratum: Ranking vs. Preference: A Comparative Study of Self-reporting