
About this book

This volume constitutes the proceedings of the 11th International Conference on Augmented Cognition, AC 2017, held as part of the International Conference on Human-Computer Interaction, HCII 2017, which took place in Vancouver, BC, Canada, in July 2017. HCII 2017 received a total of 4340 submissions, of which 1228 papers were accepted for publication after a careful reviewing process. The papers thoroughly cover the entire field of Human-Computer Interaction, addressing major advances in knowledge and effective use of computers in a variety of application areas.

The two-volume set of AC 2017 presents 81 papers, which are organized in the following topical sections: electroencephalography and brain activity measurement, eye tracking in augmented cognition, physiological measuring and bio-sensing, machine learning in augmented cognition, cognitive load and performance, adaptive learning systems, brain-computer interfaces, human cognition and behavior in complex tasks and environments.



Electroencephalography and Brain Activity Measurement


My Brain Is Out of the Loop: A Neuroergonomic Approach of OOTL Phenomenon

The world surrounding us has become increasingly technological. Nowadays, the influence of automation is perceived in every aspect of everyday life, not only in industry. Automation certainly makes some aspects of life easier, faster, and safer. Nonetheless, empirical data suggest that traditional automation has many negative performance and safety consequences. In particular, in cases of automatic equipment failure, human supervisors have often proved effectively helpless to diagnose the situation, determine the appropriate solution, and retake control, a set of difficulties called the “out-of-the-loop” (OOL) performance problem. Because automation is not powerful enough to handle all abnormalities, this difficulty in takeover is a central problem in automation design. The OOL performance problem represents a key challenge for both system designers and the human factors community. After decades of research, this phenomenon remains difficult to grasp and treat, and recent tragic accidents remind us of the difficulty human operators face when interacting with highly automated systems. The general objective of our research project is to improve our comprehension of the OOL performance problem. To address this issue, we aim (1) to identify the neuro-functional correlates of the OOL performance problem, and (2) to propose design recommendations to optimize human-automation interaction and decrease the occurrence of the OOL performance problem. Behavioral data and brain imaging studies will be used to provide a better understanding of this phenomenon at both the physiological and psychological levels.

Bruno Berberian, Jonas Gouraud, Bertille Somon, Aisha Sahai, Kevin Le Goff

Testing the Specificity of EEG Neurofeedback Training on First- and Second-Order Measures of Attention

During electroencephalography (EEG) neurofeedback training, individuals learn to willfully modulate their brain oscillations. Successful modulation has been shown to be related to cognitive benefits and wellbeing. The current paper addresses the specificity of three neurofeedback protocols in influencing first-order (basic Stroop effect) and second-order (Gratton effect) measures of attentional control. The data come from two previously presented studies that included the Stroop task to assess attentional control. The three neurofeedback protocols were upregulation of frontal alpha, sensorimotor rhythm (SMR), and mid-frontal theta oscillations. The results show specific effects of the different EEG neurofeedback protocols on attentional control that are modulated by the cognitive effort required by the Stroop task. To summarize, in less demanding versions of the Stroop task, alpha training improves first- and second-order attentional control, whereas SMR and theta training have no effect. In the demanding version of the Stroop task, theta training improves first-order, but not second-order, control, and SMR training has no effect on either. Using a drift diffusion model-based analysis, it is shown that only alpha and theta training modulate the underlying cognitive processing, with theta upregulation enhancing evidence accumulation. Although the current results need to be interpreted with caution, they support the use of different neurofeedback protocols to augment specific aspects of the attentional system. Recommendations for future work are made.

Eddy J. Davelaar

Neural Dynamics of Spontaneous Thought: An Electroencephalographic Study

Spontaneous thinking is a ubiquitous aspect of our mental life and has increasingly become a hot topic of research in cognitive neuroscience. To date, functional neuroimaging studies of spontaneous thought have revealed general brain recruitment centered on a combination of default mode network and executive regions. Despite recent findings about general brain recruitment, very little is known about how these regions are recruited dynamically over time. The current research addresses this gap in the literature by using EEG to investigate the fine-grained temporal dynamics of brain activity underlying spontaneous thoughts. We employed the first-person reports of experienced meditators to index the onset of spontaneous thoughts, and examined brain electrical activity preceding indications of spontaneous thought onset. An independent component analysis-based source localization procedure recovered sources very similar to those previously found with fMRI (Ellamil et al. in NeuroImage 136:186–196, 2016). In addition, phase synchrony analyses revealed a temporal trajectory that begins with default network midline and salience network connectivity, followed by the incorporation of language and executive regions during the period from thought generation to appraisal.

Manesh Girn, Caitlin Mills, Eric Laycock, Melissa Ellamil, Lawrence Ward, Kalina Christoff

Deep Transfer Learning for Cross-subject and Cross-experiment Prediction of Image Rapid Serial Visual Presentation Events from EEG Data

Transfer learning (TL) has recently gained significant interest in brain-computer interface (BCI) research as a key approach to designing robust predictors for cross-subject and cross-experiment prediction of brain activity in response to cognitive events. We carried out in this paper the first comprehensive investigation of the transferability of deep convolutional neural networks (CNNs) for cross-subject and cross-experiment prediction of image Rapid Serial Visual Presentation (RSVP) events. We show that for both cross-subject and cross-experiment predictions, all convolutional layers and fully connected layers contain both general and subject/experiment-specific features, and that transfer learning with weight fine-tuning can improve prediction performance over training without transfer. However, for cross-subject prediction, the convolutional layers capture more subject-specific features, whereas for cross-experiment prediction, the convolutional layers capture more general features across experiments. Our study provides important information that will guide the design of more sophisticated deep transfer learning algorithms for EEG-based classification in BCI applications.
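The transfer strategy described, reusing learned feature layers across subjects and fine-tuning on the new subject's data, can be sketched in miniature. This is a toy illustration only, not the authors' CNN: a fixed random projection stands in for frozen feature layers, and only a logistic-regression "head" is retrained on a small amount of target-subject data.

```python
import numpy as np

rng = np.random.default_rng(42)

def make_subject_data(w_true, n=400):
    """Toy 'EEG features' for one subject: a linear rule plus label noise."""
    X = rng.standard_normal((n, 10))
    y = (X @ w_true + 0.3 * rng.standard_normal(n) > 0).astype(float)
    return X, y

def train_head(F, y, lr=0.1, steps=500):
    """Fit a logistic-regression head on fixed (transferred) features F."""
    w = np.zeros(F.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(F @ w)))
        w -= lr * F.T @ (p - y) / len(y)
    return w

# A 'general' feature extractor: a fixed random projection standing in
# for layers frozen after training on the source subject.
P = rng.standard_normal((10, 8))

w_src = rng.standard_normal(10)
w_tgt = w_src + 0.3 * rng.standard_normal(10)   # related target subject

X_tgt, y_tgt = make_subject_data(w_tgt)

# Transfer with fine-tuning: keep P frozen, retrain only the head on a
# small amount of target-subject data, then test on held-out trials.
head = train_head(X_tgt[:100] @ P, y_tgt[:100])
acc = np.mean(((X_tgt[100:] @ P) @ head > 0) == (y_tgt[100:] > 0.5))
```

In the paper's setting the same question is asked layer by layer: which layers hold features general enough to freeze, and which must be fine-tuned per subject or per experiment.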

Mehdi Hajinoroozi, Zijing Mao, Yuan-Pin Lin, Yufei Huang

Using Portable EEG to Assess Human Visual Attention

Over the past ten years there has been a rapid increase in the number of portable electroencephalographic (EEG) systems available to researchers. However, to date, there has been little work validating these systems for event-related potential (ERP) research. Here we demonstrate that the MUSE portable EEG system can be used to quickly assess and quantify the ERP responses associated with visuospatial attention. Specifically, in the present experiment participants completed a standard “oddball” task wherein they saw a series of infrequently (targets) and frequently (controls) appearing circles while EEG data were recorded from a MUSE headband. For task performance, participants were instructed to count the number of target circles that they saw. After the experiment, an analysis of the EEG data evoked by the target circles, contrasted with the EEG data evoked by the control circles, revealed two ERP components: the N200 and the P300. The N200 is typically associated with stimulus/perceptual processing, whereas the P300 is typically associated with a variety of cognitive processes including the allocation of visuospatial attention [1]. It is important to note that the physical manifestation of the N200 and P300 components differed from reports using standard EEG systems; however, we verified that this is due to the quantification of these ERP components at non-standard electrode locations. Importantly, our results demonstrate that a portable EEG system such as the MUSE can be used to examine the ERP responses associated with the allocation of visuospatial attention.
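The target-versus-control contrast used in oddball paradigms like this one amounts to averaging stimulus-locked epochs per condition and subtracting the two averages to isolate the target-specific ERP. A minimal numpy sketch on synthetic data (all signal parameters below are illustrative, not MUSE recordings):

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 250                             # sampling rate (Hz), illustrative
t = np.arange(-0.1, 0.6, 1 / fs)     # epoch window around stimulus onset

def epoch(p300_amp):
    """One synthetic single-trial epoch: noise plus a P300-like bump."""
    bump = p300_amp * np.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2))
    return bump + rng.standard_normal(t.size)

# Oddball design: infrequent targets evoke a larger P300 than controls.
targets = np.mean([epoch(5.0) for _ in range(40)], axis=0)
controls = np.mean([epoch(1.0) for _ in range(160)], axis=0)
difference_wave = targets - controls   # isolates the target-specific ERP

peak_time = t[np.argmax(difference_wave)]
```

Averaging cancels activity that is not time-locked to the stimulus, which is why components like the P300 emerge clearly in the difference wave even when single trials are noisy.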

Olave E. Krigolson, Chad C. Williams, Francisco L. Colino

Investigating Brain Dynamics in Industrial Environment – Integrating Mobile EEG and Kinect for Cognitive State Detection of a Worker

In the present work we used a wearable EEG sensor to record brain activity during simulated assembly work in a replicated industrial environment. We investigated attention-related modalities: the P300 ERP component and the engagement index (EI), which is extracted from signal power ratios of the α, β, and θ frequency bands. Simultaneously, we quantified task-unrelated movements, which have previously been reported to be related to attention level, in an automated way employing a Kinect™ sensor. Reaction times (RTs) were also recorded and investigated. We found that during the monotonous task, both the P300 amplitude and the EI decreased as the task progressed. At the same time, the quantity of task-unrelated movement increased, together with an increase in RTs. These findings lead to the conclusion that monotonous assembly work induces a decrease in workers' attention and engagement as the task progresses, which is observable in both neural (EEG) and behavioral (RT and unrelated movement) signal modalities. Apart from observing how the attention-related modalities change over time, we investigated the functional relationship between the neural and behavioral modalities using Pearson's correlation. Since the Pearson correlation coefficients showed a functional relationship between the attention-related modalities, we propose the creation of a multimodal implicit Human-Computer Interaction (HCI) system that could acquire and process neural and behavioral data in real time, with the aim of creating a system that is aware of the operator's mental states during industrial work, consequently improving the operator's well-being.
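The engagement index is commonly computed as the band-power ratio β/(α + θ); the paper does not spell out its exact formulation here, so this numpy sketch assumes that standard ratio and conventional band limits:

```python
import numpy as np

def band_power(signal, fs, lo, hi):
    """Mean power of `signal` in the [lo, hi) Hz band via the FFT."""
    freqs = np.fft.rfftfreq(signal.size, d=1 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / signal.size
    band = (freqs >= lo) & (freqs < hi)
    return psd[band].mean()

def engagement_index(signal, fs):
    """EI = beta / (alpha + theta), a standard band-power ratio."""
    theta = band_power(signal, fs, 4, 8)
    alpha = band_power(signal, fs, 8, 13)
    beta = band_power(signal, fs, 13, 30)
    return beta / (alpha + theta)

fs = 256
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(1)
# A beta-dominated segment should score higher than an alpha-dominated one.
engaged = np.sin(2 * np.pi * 20 * t) + 0.2 * rng.standard_normal(t.size)
drowsy = np.sin(2 * np.pi * 10 * t) + 0.2 * rng.standard_normal(t.size)
```

Computed over successive windows, such a ratio yields the kind of time course whose decline over a monotonous task the study reports.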

Pavle Mijović, Miloš Milovanović, Ivan Gligorijević, Vanja Ković, Ivana Živanović-Mačužić, Bogdan Mijović

Characteristic Alpha Reflects Predictive Anticipatory Activity (PAA) in an Auditory-Visual Task

Several lines of evidence suggest that humans can predict events that seem to be unpredictable through ordinary sensory means. After reviewing the literature in this controversial field, I present an exploratory EEG study that addresses this hypothesis. I used a pattern classification algorithm drawing on EEG data prior to stimulus presentation to successfully predict upcoming motor responses that were constrained by the upcoming stimulus. Both the phase of peak alpha activity and overall amplitude at ~550 ms prior to the presentation of the stimulus were useful in predicting the upcoming motor response. Although these results support the idea that brain activity may reflect precognitive processes in certain situations, due to the exploratory nature of this study, additional pre-registered confirmatory experiments are required before the results can be considered solid. Implications for creating a closed-loop predictive system based on human physiology are discussed.

Julia A. Mossbridge

Influence of Spontaneous Rhythm on Movement-Related Cortical Potential - A Preliminary Neurofeedback Study

In this work, the variation of the waveform of the movement-related cortical potential (MRCP) was investigated in a real-time neurofeedback study, in which the spontaneous slow cortical potential (SCP) within the same frequency band as the MRCP (0.05–3 Hz) was provided as feedback to the subjects. The experiments showed that the background SCP activity has a strong influence on the waveform of the self-paced MRCP: negative SCP potentials increased the negative peak of the MRCP waveform, while positive SCP potentials reduced it. The variation of the single-trial MRCP waveform was correlated with the background SCP activity. This study provides a new approach to evaluating and modulating the MRCP waveform, which directly determines the performance of brain-switch detection BCIs.

Lin Yao, Mei Lin Chen, Xinjun Sheng, Natalie Mrachacz-Kersting, Xiangyang Zhu, Dario Farina, Ning Jiang

Multiple Human EEG Synchronous Analysis in Group Interaction-Prediction Model for Group Involvement and Individual Leadership

Successful communication relies on the ability to express and obtain information and to adapt quickly to exchanges that others judge to be of high quality [1]. In group-based communication, the member with the highest exchange quality is generally assumed to have leadership. The leader's neural mechanisms during communication have not been deeply studied in previous research. In this paper, a new method is proposed to evaluate leadership in group activity using characteristics of the EEG. We collected the brain electrical activity of the group members with a non-intrusive, high-precision wireless EEG acquisition device to reduce barriers to the exchange activity. Through a multivariate analysis classifying interactive and non-interactive conditions across the multi-person EEG electrodes, we found that in the leader elected by voting, electrodes over the left temporal lobe showed clear activation after the leader received messages from others; furthermore, the leader's α-band EEG was significantly inhibited and the β-band EEG clearly activated. This cerebral region is considered to be involved in processing and predicting errors, which indicates that the leader is good at analyzing each person's information, handling errors, and using these resources for prediction and planning after receiving a problem. In addition, the leader's frontal α wave was clearly inhibited during the communication and discussion stage, consistent with the voting result.

Jiacai Zhang, Zixiong Zhou

Interactive Image Segmentation Method of Eye Movement Data and EEG Data

Interactive image segmentation plays a vital role in various applications, such as image processing and computer vision. Traditional interactive image segmentation methods rely on manually added interactive information, such as sketching the edges or marking foreground and background with dotted frames. As acquisition and decoding technologies for physiological signals such as eye movements and electroencephalography (EEG) have matured, this paper presents an interactive image segmentation method that uses eye-movement trajectories and EEG as the interactive information. While the user observes the image, EEG and eye-movement data are collected, and these physiological signals are used to establish a more natural interactive image object segmentation method. The results show that image segmentation based on brain-computer interaction has advantages in the following respects: first, it is hands-free and can be applied in special settings; second, it offers higher efficiency and better results in multi-target image segmentation. This research provides a new way to establish image segmentation methods based on human-computer cooperation.

Jiacai Zhang, Song Liu, Jialiang Li

Eye Tracking in Augmented Cognition


Geometry and Gesture-Based Features from Saccadic Eye-Movement as a Biometric in Radiology

In this study, we present a novel application of sketch gesture recognition to eye movements for biometric identification and the estimation of task expertise. The study was performed for the task of mammographic screening with simultaneous viewing of four coordinated breast views, as typically done in clinical practice. Eye-tracking data and diagnostic decisions collected for 100 mammographic cases (25 normal, 25 benign, 50 malignant) from 10 readers (three board-certified radiologists and seven radiology residents) formed the corpus for this study. Sketch gesture recognition techniques were employed to extract geometric and gesture-based features from saccadic eye movements. Our results show that saccadic eye movements, characterized using sketch-based features, yield more accurate models for predicting individual identity and level of expertise than more traditional eye-tracking features.

Folami T. Alamudun, Tracy Hammond, Hong-Jun Yoon, Georgia D. Tourassi

Assessing Workload with Low Cost Eye Tracking During a Supervisory Control Task

Automation is fundamentally shifting the tasks that many humans perform. Unmanned aerial vehicles, which originally had stick-and-rudder control, now rely on waypoint-based navigation. The future operators of these systems are increasingly becoming supervisors of automated systems, and their primary role is shifting to simply monitoring those systems. This represents a challenge for assessing human performance, since there is limited interaction with the systems. Low-cost eye tracking, specifically measures of pupil diameter and gaze dispersion, may serve as a means of assessing operator engagement and workload while using these automated systems. The present study investigated the use of a low-cost eye tracking system to differentiate low and high workload during an unmanned vehicle supervisory control task. The results indicated that pupil diameter significantly increased during periods of high workload; however, there was no change in the distribution of eye gazes. These results suggest that low-cost eye tracking may be an effective means of determining an operator's workload in an automated environment; however, more research is needed on the relationship between gaze distribution, workload, and performance within a supervisory control environment.

Joseph T. Coyne, Ciara Sibley, Sarah Sherwood, Cyrus K. Foroughi, Tatana Olson, Eric Vorm

The Analysis and Prediction of Eye Gaze When Viewing Statistical Graphs

Statistical graphs are images that display quantitative information in a visual format that allows for easy and consistent interpretation. Often, statistical graphs take the form of line graphs or bar graphs. In fields such as cybersecurity, sets of statistical graphs are used to present complex information; however, the interpretation of these more complex graphs is often not obvious. Unless the viewer has been trained to understand each graph used, the interpretation of the data may be limited or incomplete [1]. In order to study the perception of statistical graphs, we tracked users' eyes while they studied simple statistical graphs. Participants studied a graph and later viewed a graph purporting to be a subset of the data; they were asked to look for a substantive change in the meaning of the second graph compared to the first.

To model where the participants would direct their attention, we ran several visual saliency models over the graphs [2–4]. Visual saliency models try to predict where people will look in an image; however, they are typically designed and evaluated to predict where people look in natural images (images of natural or real-world scenes), which contain a great deal of potential information, invite subjective interpretation, and are not typically very quantitative. The ideal observer model [2], unlike most saliency models, tries to predict where people look based on the amount of information contained at each location in an image. The underlying theory of the ideal observer model is that when a person sees a new image, they want to understand it as quickly as possible; to do so, the observer directs attention first to the locations in the image that will provide the most information (i.e., give the best understanding of the information).

In this paper, we analyze the eye gaze from a study on statistical graphs to evaluate the consistency between participants in the way they gazed at the graphs and how well a saliency model can predict where people are likely to look in a graph. During the study, as a form of mental diversion from the primary task, participants also looked at natural images between each set of graphs. When the participants looked at these images, they did so without guidance, i.e., they were not told to look at the images for any particular reason or objective. This allowed the viewing pattern for graphs to be compared to eye-gaze data for the natural images, while also showing the differences in the processing of simple graphs versus complex natural images.

An interesting result is that viewers processed the graphs differently than natural images. The center of the graph was not a strong predictor of attention. In natural images, a Gaussian kernel at the center of an image can achieve a receiver operating characteristic (ROC) score of over 80% due to inherent center bias in both the selection of natural images and the gaze patterns of participants [5]. This viewing pattern was present when participants looked at the natural images during the diversion task, but not when they studied the graphs. The study also found fairly consistent, but unusually low, inter-subject consistency ROC scores. Inter-subject consistency is the ability to predict one participant's gaze locations using the gaze positions of the other (n − 1) participants [3]. The saliency model itself was an inconsistent predictor of participants' eye gaze by default. Like the participants, the saliency model identified titles and axis labels as salient. The saliency model also found the bars and lines on the graphs to be salient; however, the eye gaze of most participants rarely fell or focused on the lines or bars. This may be due to the simplicity of the graphs, implying that very little time or attention needed to be directed to the actual bar or line graph in order to remember it.
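Inter-subject consistency of the kind described can be sketched as follows: build a fixation heat map from the other viewers, then score a held-out viewer's fixation locations against random locations using an ROC AUC. The map construction and simulated fixations below are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

def auc(pos, neg):
    """ROC AUC via the rank-sum identity: P(pos score > neg score)."""
    pos, neg = np.asarray(pos), np.asarray(neg)
    wins = (pos[:, None] > neg[None, :]).sum() \
        + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

def fixation_map(points, shape=(32, 32), sigma=2.0):
    """Blurred heat map of fixation points (simple Gaussian splat)."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    m = np.zeros(shape)
    for (y, x) in points:
        m += np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma ** 2))
    return m

rng = np.random.default_rng(7)
# Simulated fixations: most viewers cluster on one region (e.g. a title).
others = [(int(rng.normal(8, 2)) % 32, int(rng.normal(8, 2)) % 32)
          for _ in range(50)]
held_out = [(int(rng.normal(8, 2)) % 32, int(rng.normal(8, 2)) % 32)
            for _ in range(20)]
random_pts = [(int(rng.integers(32)), int(rng.integers(32)))
              for _ in range(20)]

m = fixation_map(others)
score = auc([m[y, x] for y, x in held_out],
            [m[y, x] for y, x in random_pts])
```

When viewers agree on where to look, held-out fixations land on hot regions of the map and the AUC is high; the unusually low scores reported here indicate that viewers agreed less on graphs than they typically do on natural images.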

Andre Harrison, Mark A. Livingston, Derek Brock, Jonathan Decker, Dennis Perzanowski, Christopher Van Dolson, Joseph Mathews, Alexander Lulushi, Adrienne Raglin

Performance Evaluation of the Gazepoint GP3 Eye Tracking Device Based on Pupil Dilation

Eye tracking is considered one of the most salient methods for studying the cognitive demands on humans in human-computer interactive systems, due to its unobtrusiveness and flexibility and the development of inexpensive eye trackers. In this work, we evaluate the applicability of these low-cost eye trackers to studying pupillary response under varying memory loads and luminance conditions. Specifically, we examine a low-cost eye tracker, the Gazepoint GP3, and objectively evaluate its ability to differentiate pupil dilation metrics under different cognitive loads and luminance conditions. The classification performance is computed in the form of a receiver operating characteristic (ROC) curve, and the results indicate that the Gazepoint GP3 is a reliable eye tracker for human-computer interaction applications requiring pupil dilation studies.

Pujitha Mannaru, Balakumar Balasingam, Krishna Pattipati, Ciara Sibley, Joseph T. Coyne

Patterns of Attention: How Data Visualizations Are Read

Data visualizations are used to communicate information to people in a wide variety of contexts, but few tools are available to help visualization designers evaluate the effectiveness of their designs. Visual saliency maps that predict which regions of an image are likely to draw the viewer’s attention could be a useful evaluation tool, but existing models of visual saliency often make poor predictions for abstract data visualizations. These models do not take into account the importance of features like text in visualizations, which may lead to inaccurate saliency maps. In this paper we use data from two eye tracking experiments to investigate attention to text in data visualizations. The data sets were collected under two different task conditions: a memory task and a free viewing task. Across both tasks, the text elements in the visualizations consistently drew attention, especially during early stages of viewing. These findings highlight the need to incorporate additional features into saliency models that will be applied to visualizations.

Laura E. Matzen, Michael J. Haass, Kristin M. Divis, Mallory C. Stites

Eye Tracking for Dynamic, User-Driven Workflows

Researchers at Sandia National Laboratories in Albuquerque, New Mexico, are engaged in the empirical study of human-information interaction in high-consequence national security environments. This focus emerged from our longstanding interactions with military and civilian intelligence analysts working across a broad array of domains, from signals intelligence to cybersecurity to geospatial imagery analysis. In this paper, we discuss how several years of work with Synthetic Aperture Radar (SAR) imagery analysts revealed the limitations of eye tracking systems for capturing gaze events in the dynamic, user-driven problem-solving strategies characteristic of geospatial analytic workflows, and we explain the need for eye tracking systems capable of supporting inductive study of such workflows. We then discuss an ongoing project in which we are leveraging some of the unique properties of SAR image products to develop a prototype eye-tracking data collection and analysis system that will support inductive studies of visual workflows in SAR image analysis environments.

Laura A. McNamara, Kristin M. Divis, J. Daniel Morrow, David Perkins

Investigating Eye Movements in Natural Language and C++ Source Code - A Replication Experiment

Natural language text and source code are very different in their structure and semantics. Source code uses words from natural language, such as English, mainly in comments and identifier names. Is there an inherent difference in the way programmers read natural language text compared to source code? Does expertise play a role in the reading behavior of programmers? In order to start answering these questions, we conducted a controlled experiment with novice and non-novice programmers while they read short snippets of natural language text and C++ source code. This study is a replication of an earlier study by Busjahn et al. [1] but uses C++ instead of Java source code. The study was conducted with 33 students, each given ten tasks: a set of seven programs and three natural language texts. They were asked one of three random comprehension questions after each task. Using several linearity metrics presented in the earlier study [1], we analyze the eye movements on source code and natural language text. The results indicate that both novices and non-novices read source code less linearly than natural language text. We did not find any differences between novices and non-novices on either natural language text or source code. We compare our results to the Busjahn study and provide directions for future work.
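One simple linearity metric of the kind used in such studies is the fraction of gaze transitions that move forward in reading order. The sketch below is a hypothetical illustrative variant (Busjahn et al. define several related metrics):

```python
def forward_saccade_fraction(fixated_elements):
    """Fraction of gaze transitions that move forward in reading order.
    `fixated_elements` is the sequence of reading-order indices a reader
    fixated; 1.0 means perfectly linear reading."""
    moves = list(zip(fixated_elements, fixated_elements[1:]))
    forward = sum(1 for a, b in moves if b > a)
    return forward / len(moves)

# Natural-language reading tends to be nearly linear...
prose = [0, 1, 2, 3, 4, 5, 6, 7]
# ...while code reading jumps around (e.g. following call sites).
code = [0, 1, 5, 2, 6, 1, 3, 7]
```

Lower scores on code than on prose are exactly the "less linear" reading pattern the study reports for both novices and non-novices.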

Patrick Peachock, Nicholas Iovino, Bonita Sharif

Adapting Human-Computer-Interaction of Attentive Smart Glasses to the Trade-Off Conflict in Purchase Decisions: An Experiment in a Virtual Supermarket

In many everyday purchase decisions, consumers have to make trade-offs between alternatives. For example, consumers often have to decide whether to buy the more expensive, high-quality product or the less expensive product of lower quality. Marketing researchers are especially interested in finding out how consumers make decisions when facing such trade-off conflicts, and eye tracking has been used as a tool to investigate the allocation of attention in such situations. Conflicting decision situations are also particularly interesting for human-computer interaction research, because designers may use knowledge about information acquisition behavior to build assistance systems which can help the user solve the trade-off conflict. In this paper, we build and test such an assistance system that monitors the user's information acquisition processes using mobile eye tracking in virtual reality. In particular, we test whether and how strongly the trade-off conflict influences how consumers direct their attention to products and features. We find that trade-off conflict, task experience, and task involvement significantly influence how much attention products receive. We discuss how this knowledge might be used in the future to build assistance systems in the form of attentive smart glasses.

Jella Pfeiffer, Thies Pfeiffer, Anke Greif-Winzrieth, Martin Meißner, Patrick Renner, Christof Weinhardt

Practical Considerations for Low-Cost Eye Tracking: An Analysis of Data Loss and Presentation of a Solution

This paper presents data loss figures from three experiments, varying in length and visual complexity, in which low-cost eye tracking data were collected. Analysis of data from the first two experiments revealed higher levels of data loss in the visually complex task environment and that task duration did not appear to impact data loss. Results from the third experiment demonstrate how data loss can be mitigated by including periodic eye tracking data quality assessments, which are described in detail. The paper concludes with a discussion of overall findings and provides suggestions for researchers interested in employing low-cost eye tracking in human subject experiments.

Ciara Sibley, Cyrus K. Foroughi, Tatana Olson, Cory Moclaire, Joseph T. Coyne

A Comparison of an Attention Acknowledgement Measure and Eye Tracking: Application of the as Low as Reasonable Assessment (ALARA) Discount Usability Principle for Control System Studies

The measurement of attention allocation is a valuable diagnostic tool for research. As Low As Reasonable Assessment (ALARA) is a research approach concerned with leveraging the simplest and most straightforward methods to capture the usability data needed for the design process. Complicated environments, such as nuclear process control, often create an impetus to use correspondingly complicated experimental designs and technical data collection methods; however, simple methods can in many circumstances capture equivalent data that can be used to answer the same theoretical and applied research questions. The attention acknowledgment method is an example of a simple measure capable of capturing attention allocation. It assesses attention allocation via attention markers dispersed through the visual scene. As participants complete a scenario and interact with an associated interface, they perform a secondary acknowledgment task in which they respond to any attention markers they detect in their designated target state. The patterns of acknowledgment serve as a means to assess both the location and temporal dimensions of attention allocation. The attention acknowledgment method was compared against a standard accepted measure of attention allocation consisting of infrared pupil and corneal reflection gaze tracking. The attention acknowledgment method is not able to measure attention at the same temporal and spatial resolution as the eye tracking method; however, the resolutions it is capable of achieving are sufficient to answer usability evaluation questions. Furthermore, the ease of administration and analysis of the attention acknowledgment measure are advantageous for rapid usability evaluation.

Thomas A. Ulrich, Ronald L. Boring, Steffen Werner, Roger Lew

Physiological Measuring and Bio-sensing


Rim-to-Rim Wearables at the Canyon for Health (R2R WATCH): Experimental Design and Methodology

The Rim-to-Rim Wearables At The Canyon for Health (R2R WATCH) study examines metrics recordable on commercial off-the-shelf (COTS) devices that are most relevant and reliable for the earliest possible indication of a health or performance decline. This is accomplished through a collaboration between Sandia National Laboratories (SNL) and The University of New Mexico (UNM), in which the two organizations team up to collect physiological, cognitive, and biological markers from volunteer hikers who attempt the Rim-to-Rim (R2R) hike at the Grand Canyon. Three forms of data are collected as hikers travel from rim to rim: physiological data through wearable devices, cognitive data through a cognitive task taken every 3 hours, and blood samples obtained before and after completing the hike. Data is collected from both civilian and warfighter hikers. Once the data is obtained, it is analyzed to understand the effectiveness of each COTS device and the validity of the data collected. We also aim to identify which physiological and cognitive phenomena collected by wearable devices are the most relatable to overall health and task performance in extreme environments, and of these ascertain which markers provide the earliest yet reliable indication of health decline. Finally, we analyze the data for significant differences between civilians’ and warfighters’ markers and the relationship to performance. The study is funded by the Defense Threat Reduction Agency (DTRA, Project CB10359; SAND2017-1872 C), which supports the main portion of the R2R WATCH study, and by the University of New Mexico, which currently funds all activities related to bloodwork. This paper describes the experimental design and methodology for the first year of the R2R WATCH project.

Glory Emmanuel Aviña, Robert Abbott, Cliff Anderson-Bergman, Catherine Branda, Kristin M. Divis, Lucie Jelinkova, Victoria Newton, Emily Pearce, Jon Femling

Investigation of Breath Counting, Abdominal Breathing and Physiological Responses in Relation to Cognitive Load

Computers and mobile devices can enhance learning processes but may also impose or exacerbate stress. This may be particularly applicable to some college and university students who already experience high stress levels. Breathing has long been used in meditative traditions for self-regulation, and Western science has clearly shown the complex relationship between breathing, blood circulation and the autonomic nervous system. Since breathing is both automatic and volitional, this study examines whether college students can manage physiological responses to the cognitive load imposed by a Stroop color-word test by using breath counting, abdominal breathing, or the two combined. The findings of this study may provide evidence supporting the teaching of breath-based self-regulation strategies in college and university settings. They may also be of interest to designers of affective computing systems by suggesting that device interfaces and software can be configured to monitor users’ cognitive load indirectly through physiological signals and alert the user to irregularities or adapt to the user’s needs.

Hubert K. Brumback

Investigating the Role of Biofeedback and Haptic Stimulation in Mobile Paced Breathing Tools

Previous studies have shown that mindfulness meditation and paced breathing are effective tools for stress management. There are a number of mobile applications currently available that are designed to guide the breath to support these relaxation practices. However, these focus mainly on audio/visual cues and are mostly non-interactive. Our goal is to develop a mobile paced breathing tool focusing on the exploration of haptic cues and biofeedback. We conducted user studies to investigate the effectiveness of the system. This study explores the following questions: Do users prefer control of the breathing rate interval through an on-screen slider (manual mode) or through a physiological sensor (biofeedback mode)? How effective is haptic guidance on its own? And how may the addition of haptic feedback enhance audio-based guidance? Our analysis suggests that while both manual and biofeedback modes are desirable, manual control leads to a greater overall increase in relaxation. Additionally, the findings of this study support the value of haptic guidance in mobile paced breathing tools.

Antoinette Bumatay, Jinsil Hwaryoung Seo

Pupil Dilation and Task Adaptation

Individuals adapt to tasks as they repeatedly practice them, resulting in increased overall performance. Historically, time and accuracy are the two metrics used to measure these adaptations. Here we show preliminary evidence that changes in pupil dilation may be able to capture within-task learning changes. A group of enlisted Sailors and Marines completed forty-eight trials of a cognitive task while their pupils were recorded with a low-cost eye tracking system. As expected, accuracy increased across trials while reaction times significantly decreased. We also found a strong negative correlation between pupil size and trial number. These data suggest that changes in pupil dilation can be used to measure within-task adaptations.
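The within-task correlation reported here can be illustrated with a minimal sketch; the per-trial pupil values below are simulated for illustration, not the study's data:

```python
import numpy as np
from scipy import stats

# Simulated per-trial data: mean pupil diameter (mm) for 48 trials,
# with a small downward drift standing in for task adaptation.
rng = np.random.default_rng(0)
trials = np.arange(1, 49)
pupil = 4.0 - 0.01 * trials + rng.normal(0, 0.05, size=48)

# Pearson correlation between trial number and mean pupil size;
# a strong negative r would suggest within-task adaptation.
r, p = stats.pearsonr(trials, pupil)
print(f"r = {r:.2f}, p = {p:.3g}")
```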

Cyrus K. Foroughi, Joseph T. Coyne, Ciara Sibley, Tatana Olson, Cory Moclaire, Noelle Brown

Rim-to-Rim Wearables at the Canyon for Health (R2R WATCH): Correlation of Clinical Markers of Stress with Physiological COTS Data

Commercial off-the-shelf (COTS) wearable devices can provide easily deployable physiologic measurement systems that generate large amounts of crucial health status data. This data, although similar to physiologic data recorded and used routinely in the health care environment, lacks validation in the non-clinical environment. To address this gap in knowledge and to translate clinical expertise to the field, we examined healthy volunteers attempting the strenuous task of crossing the Grand Canyon from rim to rim (R2R) in a single day. Subjects completed a pre-crossing questionnaire with baseline biometric measurements and blood collection for analysis of a comprehensive metabolic panel. Enrolled subjects were then asked to wear COTS wearable fitness devices as they attempted the crossing. Subjects were asked to complete a post-crossing questionnaire and to repeat the biometric measurements and blood collections. We obtained 52 complete sets of pre- and post-hike blood samples. We identified multiple significant changes in metabolic measurements consistent with the expected stresses endured. In addition to the subjective fatigue expectedly reported by subjects, subjects showed signs of significant muscle breakdown, yet no subject required immediate medical attention upon completing the task. We linked these clinical markers of stress to the physiologic output from COTS wearable devices and are now able to translate the output measures of these devices to meaningful clinical outcomes. In addition, we have begun to establish new expected ranges for physiologic data during extreme stress that does not require immediate medical attention. This data is crucial to defining usage parameters for wearable devices in deployed field settings.

Lucie Jelinkova, Emily Pearce, Christopher Bossart, Risa Garcia, Jon Femling

Grounded Approach for Understanding Changes in Human Emotional States in Real Time Using Psychophysiological Sensory Apparatuses

This paper discusses the technical and philosophical challenges that researchers and practitioners face when attempting to classify human emotion based upon raw physiological data. It proposes the use of a representational learning approach that adopts techniques from industrial internet of things (IoT) solutions, and applies this approach to the classification of emotional states using functional near infrared spectroscopy (fNIRS) sensor data. The algorithm first pre-processes the data using a combination of signal processing and vector quantization techniques. Next, it finds the optimal number of natural clusters within human emotional states and uses these as the target variables for either shallow or deep learning classification. The deep learning variant uses a Restricted Boltzmann Machine (RBM) to form a compressive representation of the input data prior to classification. A final single-layer perceptron model learns the relationship between the input and output states. This approach would be useful for detecting real-time changes in human emotional state. It can automatically create emotional states that are both highly separable and balanced, and it is able to distinguish between low vs. high emotional states across all tasks (F1-score of 71.4%), performing better for tasks intended to elicit higher cognitive load, such as the Tetris video game (F1-score of 87.1%) or the Multi-Attribute Task Battery (F1-score of 77%).
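One step in the pipeline above, finding the optimal number of natural clusters to serve as target labels, could be sketched as follows; the silhouette criterion and the synthetic feature windows are illustrative assumptions, not the author's exact method:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for pre-processed fNIRS feature windows:
# three well-separated groups of 50 samples in 8 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 8)) for c in (0.0, 2.0, 4.0)])

# Choose the "natural" number of emotional-state clusters by silhouette
# score; the chosen assignments would then label the classifier targets.
scores = {k: silhouette_score(X, KMeans(k, n_init=10, random_state=0).fit_predict(X))
          for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k)  # with three well-separated groups, this should be 3
```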

Ryan A. Kirk

Augmented Cognition for Continuous Authentication

Authentication serves the gatekeeping function in computing systems. Methods used in authentication fall into three major paradigms: ‘what you know’, ‘who you are’ and ‘what you have’, of which the first is still the most commonly applied, in the form of password authentication. Recall and recognition are the cognitive functions central to the ‘what you know’ authentication paradigm. Studies have shown that more secure passwords are harder to recall, and this often leads to habits that facilitate recollection at the expense of security. Combining the uniqueness of physiological measures, such as brainwave patterns, with memorable augmented passwords shows promise for providing a secure and memorable authentication process. In this paper, we discuss authentication and related problems and considerations in the literature. We then test a password system designed to make use of character property transformations, such as color and font, to minimize the need for complex passwords without compromising security. The findings from this study suggest that applying transformations to passwords facilitates memorability. We then discuss a study combining an augmented password system with physiological measures that can provide a more secure model for continuous authentication.

Nancy Mogire, Michael-Brian Ogawa, Brent Auernheimer, Martha E. Crosby

Analysis of Social Interaction Narratives in Unaffected Siblings of Children with ASD Through Latent Dirichlet Allocation

Children with autism spectrum disorders (ASD) and their unaffected siblings (US) are frequent targets of social bullying, which leads to severe physical, emotional, and social consequences. Understanding the risk factors is essential for developing preventative measures. We suggest that one such risk factor may be difficulty discriminating among different biological body movements (BBM), a task that requires fast and flexible processing and interpretation of complex visual cues, especially during social interactions. Deficits in the cognition of BBM have been reported in ASD. Since US display an autism endophenotype, we expect that they will also display deficits in the social interpretation of BBM. Methods. Participants: 8 US and 8 matched typically developing (TD) children, age 7–14; Tasks/Measurements: Social Blue Man Task: narrative interpretation with a Latent Dirichlet Allocation (LDA) analysis; Social Experience Questionnaires with children and parents. Results. Compared to TD children, the US displayed: (i) low self-awareness of social bullying in contrast to high parental reports; (ii) reduced speed in identifying social cues; (iii) lower quality and repetitious wording in social interaction narratives (LDA). Conclusions. US demonstrate a social endophenotype of autism reflected in delayed identification, interpretation and verbalization of social cues; these may constitute a high risk factor for becoming a victim of social bullying.
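As a rough illustration of the LDA step, a topic model can be fit to short narratives and each narrative summarized as a topic mixture; the toy texts and the two-topic choice below are assumptions for the sketch, not the study's materials:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Invented narrative snippets standing in for children's responses.
narratives = [
    "the blue man waves hello to his friend",
    "he is angry and pushes the other man away",
    "they are dancing together and having fun",
    "the man runs away because he is scared",
    "waves hello and smiles at his friend",
    "pushes and shoves the other man angrily",
]

counts = CountVectorizer(stop_words="english").fit_transform(narratives)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Per-narrative topic mixtures; low topic diversity across one child's
# narratives would reflect the repetitious wording reported in the study.
doc_topics = lda.transform(counts)
print(doc_topics.shape)  # (6, 2)
```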

Victoria Newton, Isabel Solis, Glory Emmanuel Aviña, Jonathan T. McClain, Cynthia King, Kristina T. Rewin Ciesielski

Smart Watch Potential to Support Augmented Cognition for Health-Related Decision Making

In this paper, we review current smart watch research in the health domain to inform an Augmented Cognition (AugCog) research agenda for health-related decision making and patient self-management. We connect this AugCog research agenda to prior Clinical Decision Support (CDS), workflow, and informatics research efforts using Persons Living With HIV (PLWH) and Chronic Obstructive Pulmonary Disorder (COPD) patients as examples to illustrate potential research directions.

Blaine Reeder, Paul F. Cook, Paula M. Meek, Mustafa Ozkaynak

Multidimensional Real-Time Assessment of User State and Performance to Trigger Dynamic System Adaptation

In adaptive human-machine interaction, technical systems adapt their behavior to the current state of the human operator to mitigate critical user states and performance decrements. While many researchers use measures of workload as triggers for adjusting levels of automation, we have proposed a more holistic approach to adaptive system design that includes a multidimensional assessment of user state. This paper outlines the design requirements, conceptual framework, and proof-of-concept implementation of a Real-time Assessment of Multidimensional User State (RASMUS). RASMUS diagnostics provide information on user performance, potentially critical user states, and their related impact factors on a second-by-second basis in real time. Based on these diagnoses, adaptive systems are enabled to infer when the user needs support and to dynamically select and apply an appropriate adaptation strategy for a given situation. While the conceptual framework is generic, the implementation has been applied to an air surveillance task, providing real-time diagnoses for high workload, passive task-related fatigue, and incorrect attentional focus.

Jessica Schwarz, Sven Fuchs

An Affordable Bio-Sensing and Activity Tagging Platform for HCI Research

We present a novel multi-modal bio-sensing platform capable of integrating multiple data streams for use in real-time applications. The system is composed of a central compute module and a companion headset. The compute node collects, time-stamps and transmits the data while also providing an interface for a wide range of sensors including electroencephalogram, photoplethysmogram, electrocardiogram, and eye gaze among others. The companion headset contains the gaze tracking cameras. By integrating many of the measurement systems into an accessible package, we are able to explore previously unanswerable questions ranging from open-environment interactions to emotional-response studies. Though some of the integrated sensors are designed from the ground up to fit into a compact form factor, we validate the accuracy of the sensors and find that they perform similarly to, and in some cases better than, alternatives.

Siddharth, Aashish Patel, Tzyy-Ping Jung, Terrence J. Sejnowski

Machine Learning in Augmented Cognition


Facial Expression Recognition from Still Images

With the development of technology, Facial Expression Recognition (FER) has become one of the important research areas in Human-Computer Interaction. Facial expressions are created by changes in the movement of certain muscles in the face; by detecting these changes, facial expressions can be recognized. In this study, a cascaded structure consisting of Local Zernike Moments (LZM), Local XOR Patterns (LXP) and Global Zernike Moments (GZM) methods is proposed for the FER problem. The most commonly used database in FER problems is the Extended Cohn-Kanade (CK+) database, which consists of image sequences of 327 expressions from 118 people. Most FER systems recognize 7 classes of emotion: happiness, sadness, surprise, anger, disgust, fear and contempt. We use the Library of Support Vector Machines (LIBSVM) classifier for multi-class classification with the leave-one-out cross-validation method. Our overall system performance is measured as 90.34% for FER.
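The classification protocol, LIBSVM with leave-one-out cross-validation, can be sketched with scikit-learn, whose SVC wraps LIBSVM; the random features below are stand-ins for the actual LZM/LXP/GZM descriptors, and the linear kernel is an assumption:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for cascaded feature vectors:
# 60 samples, 7 emotion classes, 32-dimensional features.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 32))
y = rng.integers(0, 7, size=60)
X[np.arange(60), y] += 3.0  # make the classes separable for the demo

# Leave-one-out cross-validation mirrors the evaluation protocol above.
acc = cross_val_score(SVC(kernel="linear"), X, y, cv=LeaveOneOut()).mean()
print(f"LOO accuracy: {acc:.2%}")
```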

Bilge Süheyla Akkoca Gazioğlu, Muhittin Gökmen

CHISSL: A Human-Machine Collaboration Space for Unsupervised Learning

We developed CHISSL, a human-machine interface that utilizes interactive supervision to help the user group unlabeled instances by her own mental model. The user primarily interacts via correction (moving a misplaced instance into its correct group) or confirmation (accepting that an instance is placed in its correct group). Concurrent with the user’s interactions, CHISSL trains a classification model guided by the user’s grouping of the data. It then predicts the group of unlabeled instances and arranges some of these alongside the instances manually organized by the user. We hypothesize that this mode of human and machine collaboration is more effective than Active Learning, wherein the machine decides for itself which instances should be labeled by the user. We found supporting evidence for this hypothesis in a pilot study where we applied CHISSL to organize a collection of handwritten digits.

Dustin Arendt, Caner Komurlu, Leslie M. Blaha

Toward an Open Data Repository and Meta-Analysis of Cognitive Data Using fNIRS Studies of Emotion

HCI research has increasingly incorporated the use of neurophysiological sensors to identify users’ cognitive and affective states. However, a persistent problem in machine learning on cognitive data is generalizability across participants. A proposed solution has been aggregating cognitive and survey data across studies to generate larger sample populations so that machine learning and statistical analyses converge on stable, generalizable results. In this paper, I argue that large data-sharing projects can facilitate the aggregation of results of brain imaging studies to address these issues by smoothing noise in high-dimensional datasets. This paper contributes a small step towards the design of large cognitive data-sharing systems by proposing methods that facilitate the merging of currently incompatible fNIRS and fMRI datasets through term-based metadata analysis. To that end, I analyze 20 fNIRS studies of emotion using content analysis for: (1) synonym terms and definitions for ‘emotion,’ (2) the experimental stimuli, and (3) the use or non-use of self-report surveys. Results suggest that fNIRS studies of emotion have stable synonymy, using technical and folk conceptualizations of affective terms within and between publications to refer to emotion. The studies use different stimuli to elicit emotion but also show commonalities in the shared use of standardized stimulus materials and self-report surveys. These similarities in conceptual synonymy and standardized experiment materials indicate promise for neuroimaging communities to establish open-data repositories based on term-based metadata analyses. This work contributes to efforts toward merging datasets across studies and between labs, unifying new modalities in neuroimaging such as fNIRS with fMRI datasets, increasing the generalizability of machine learning models, and promoting the acceleration of science through open data-sharing infrastructure.

Sarah Bratt

Establishing Ground Truth on Psychophysiological Models for Training Machine Learning Algorithms: Options for Ground Truth Proxies

One of the core aspects of human-human interaction is the ability to recognize and respond to the emotional and cognitive states of the other person; human-computer interaction systems must, at their core, perform many of the same tasks.

Keith Brawner, Michael W. Boyce

The Impact of Streaming Data on Sensemaking with Mixed-Initiative Visual Analytics

Visual data analysis helps people gain insights into data via interactive visualizations. People generate and test hypotheses and questions about data in the context of the domain, a process generally referred to as sensemaking. Much of the work on studying sensemaking (and creating visual analytic techniques in support of it) has focused on static datasets. However, how do the cognitive processes of sensemaking change when the data are changing? Further, what implications for design does this create for mixed-initiative visual analytics systems? This paper presents the results of a user study analyzing the impact of streaming data on sensemaking. To perform this study, we developed a mixed-initiative visual analytic prototype, the Streaming Canvas, that affords the analysis of streaming text data. We compare the sensemaking process of people using this tool on a static and a streaming dataset. We present the results of this study and discuss the implications for future visual analytic systems that combine machine learning and interactive visualization to help people make sense of streaming data.

Nick Cramer, Grant Nakamura, Alex Endert

Some Syntax-Only Text Feature Extraction and Analysis Methods for Social Media Data

Automated characterization of online social behavior is becoming increasingly important as day-to-day human interaction migrates from expensive “real world” encounters to less expensive virtual interactions over computing networks. The effective automated characterization of human interaction in social media has important political, economic, and social applications. New analytic concepts are presented for the extraction and enhancement of salient numeric features from unstructured text. These concepts employ relatively simple syntactic metrics for characterizing and distinguishing human and automated social media posting behaviors. The concepts are domain agnostic, and are empirically demonstrated using posted text from a particular social medium (Twitter). An innovation is the use of a feature-imputation regression method to perform feature sensitivity analysis.
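A minimal sketch of syntax-only feature extraction might look like the following; the specific metrics are illustrative choices, not the authors' exact feature set:

```python
import string

def syntax_features(post: str) -> dict:
    """Simple, domain-agnostic syntactic metrics for a text post
    (illustrative choices, not the paper's published features)."""
    tokens = post.split()
    n = max(len(tokens), 1)
    length = max(len(post), 1)
    return {
        "n_tokens": len(tokens),
        "mean_token_len": sum(len(t) for t in tokens) / n,
        "punct_ratio": sum(c in string.punctuation for c in post) / length,
        "upper_ratio": sum(c.isupper() for c in post) / length,
        "hashtag_count": sum(t.startswith("#") for t in tokens),
    }

print(syntax_features("Check this out! #deal #sale WOW"))
```

Such purely syntactic vectors can be computed without any semantic parsing, which is what makes the approach domain agnostic.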

Monte Hancock, Charles Li, Shakeel Rajwani, Payton Brown, Olivia Hancock, Corinne Lee, Yaniv Savir, Nicolas Nuon, Francesca Michaels

Using the Hash Tag Histogram and Social Kinematics for Semantic Clustering in Social Media

This work addresses automated semantic clustering of Twitter users by analysis of their aggregated text posts (tweets). This semantic clustering of text is an application of a theory we refer to as Social Kinematics, a term coined by our team for the field-theoretic approach we develop and describe in [1–3, 5]. It is used here to model human interaction in social media. This social modeling technique regards social media users as field sources, and uses the Laplacian to model their interaction. This yields a natural analogy with physical kinematics. Automation is described that allows social media text posts (organized by author into “threads”) to self-organize as a precursor to analysis and characterization. The goal of this work is to automate the characterization of user-generated text content in terms of its semantics (meaning). Characterization here means the determination of intuitive “categories” for content, and the automatic assignment of user-generated content to these categories. Categories might include: advertising; subscribed feeds (news, weather, traffic, etc.); discussion of current events (politics, sports, popular culture, etc.); and casual conversation (filial, friend-to-friend, etc.). Characterization is performed by retrieving text posts by Twitter users, numericizing these using a field model, and clustering them by their semantics. An innovation is the application of the field model to semantic characterization of text, based upon the observation that user hash tags are a priori semantic tags, making expensive and brittle semantic mapping of the tweet text unnecessary.
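The hash tag histogram of the title can be illustrated with a minimal sketch; the sample tweets are invented, and the field-model clustering step that would consume these histograms is omitted:

```python
from collections import Counter

def hashtag_histogram(tweets):
    """Aggregate a user's tweets into a hash-tag histogram; the tags act
    as a priori semantic labels for downstream clustering."""
    tags = [t.lower() for tweet in tweets for t in tweet.split() if t.startswith("#")]
    return Counter(tags)

user_tweets = ["Big game tonight #nba #sports", "what a match! #sports"]
print(hashtag_histogram(user_tweets))  # Counter({'#sports': 2, '#nba': 1})
```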

Monte Hancock, Chloe Lo, Shakeel Rajwani, Shai Neumann, Dale Franklin, Esnet Gros Negre, Tracy Hollis, Steven Knight, Vikram Tutupalli, Vineet Chintamaneni, Sheila Daniels, Brian Gabak, Venkata Undavalli, Payton Brown, Olivia Hancock

Interface Metaphors for Interactive Machine Learning

To promote more interactive and dynamic machine learning, we revisit the notion of user-interface metaphors. User-interface metaphors provide intuitive constructs for supporting user needs through interface design elements. A user-interface metaphor provides a visual or action pattern that leverages a user’s knowledge of another domain. Metaphors suggest both the visual representations that should be used in a display as well as the interactions that should be afforded to the user. We argue that user-interface metaphors can also offer a method of extracting interaction-based user feedback for use in machine learning. Metaphors offer indirect, context-based information that can be used in addition to explicit user inputs, such as user-provided labels. Implicit information from user interactions with metaphors can augment explicit user input for active learning paradigms. Or it might be leveraged in systems where explicit user inputs are more challenging to obtain. Each interaction with the metaphor provides an opportunity to gather data and learn. We argue this approach is especially important in streaming applications, where we desire machine learning systems that can adapt to dynamic, changing data.

Robert J. Jasper, Leslie M. Blaha

Classifying Tweets Using User Account Information

Twitter is a short-text message system developed 6 years ago. It now has more than 100 million users generating over 300 million tweets every day. Twitter accounts are used for diverse purposes, such as social, advertising, political, religious, benevolent or vicious ideologies, among other activities. These activities can be carried out by humans, a machine or a robot. The purpose of this paper is to build predictive models, such as Logistic Regression, K-Nearest Neighbors and Neural Networks, in order to identify the variables that best predict, based on the contents and with the least possible error, whether tweets are coming from a human or a machine.
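A sketch of such a model comparison might look like this; the account-level features and their distributions are invented for illustration, not drawn from the paper's data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Invented account features: tweets/day, follower count, mean posting
# interval (s) -- humans vs. highly regular automated accounts.
rng = np.random.default_rng(1)
humans = np.column_stack([rng.normal(5, 2, 200),
                          rng.normal(300, 100, 200),
                          rng.normal(3600, 600, 200)])
machines = np.column_stack([rng.normal(80, 20, 200),
                            rng.normal(50, 30, 200),
                            rng.normal(60, 20, 200)])
X = np.vstack([humans, machines])
y = np.array([0] * 200 + [1] * 200)  # 0 = human, 1 = machine

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(),
    "neural net": MLPClassifier(max_iter=2000, random_state=0),
}
# 5-fold cross-validated accuracy per model, with feature scaling.
scores = {name: cross_val_score(make_pipeline(StandardScaler(), m), X, y, cv=5).mean()
          for name, m in models.items()}
print(scores)
```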

John Khoury, Charles Li, Chloe Lo, Corinne Lee, Shakeel Rajwani, David Woolfolk, Alexis-Walid Ahmed, Loredana Crusov, Arnold Pérez-Goicochea, Christopher Romero, Rob French, Vasco Ribeiro

Machine Learning-Based Prediction of Changes in Behavioral Outcomes Using Functional Connectivity and Clinical Measures in Brain-Computer Interface Stroke Rehabilitation

The goal of this work is to evaluate whether changes in brain connectivity can predict behavioral changes among subjects who have suffered a stroke and have completed brain-computer interface (BCI) interventional therapy. A total of 23 stroke subjects, with persistent upper-extremity motor deficits, received stroke rehabilitation therapy using a closed-loop neurofeedback BCI device. Over the course of the interventional therapy, resting-state fMRI data were collected at two time points: prior to the start and immediately upon completion of therapy. Behavioral assessments were administered at each time point via neuropsychological testing to collect measures on the Action Research Arm Test, Nine-Hole Peg Test, Barthel Index and Stroke Impact Scale. Resting-state functional connectivity changes in the motor network were computed from pre- to post-interventional therapy and were combined with clinical data corresponding to each subject to estimate the change in behavioral performance between the two time points using a machine learning-based predictive model. Inter-hemispheric correlations emerged as stronger predictors of changes across multiple behavioral measures in comparison to intra-hemispheric links. Additionally, age predicted behavioral changes better than other clinical variables such as gender and pre-stroke handedness. The machine learning model serves as a valuable tool in predicting BCI therapy-induced behavioral changes on the basis of functional connectivity and clinical data.

Rosaleena Mohanty, Anita Sinha, Alexander Remsik, Janerra Allen, Veena Nair, Kristin Caldera, Justin Sattin, Dorothy Edwards, Justin C. Williams, Vivek Prabhakaran

Content Feature Extraction in the Context of Social Media Behavior

Twitter accounts are used for a multitude of reasons, including social, commercial, political, religious, and ideological purposes. The wide variety of activities on Twitter may be automated or non-automated. Any serious attempt to explore the nature of the vast amount of information being broadcast over such a medium may depend on identifying a potentially useful set of content features hidden within the data. This paper proposes a set of content features that may be promising in efforts to categorize social media activities, with the goal of creating predictive models that will classify or estimate the probabilities of automated behavior given certain account content history. Suggestions for future work are offered.

Shai Neumann, Charles Li, Chloe Lo, Corinne Lee, Shakeel Rajwani, Suraj Sood, Buttons A. Foster, Toni Hadgis, Yaniv Savir, Frankie Michaels, Alexis-Walid Ahmed, Nikki Bernobic, Markus Hollander

Detecting Mislabeled Data Using Supervised Machine Learning Techniques

Many data sets, gathered for instance during user experiments, are contaminated with noise. Some noise in the measured features is not much of a problem; it even increases the performance of many Machine Learning (ML) techniques. For noise in the labels (mislabeled data), however, the situation is quite different: label noise deteriorates the performance of all ML techniques. The research question addressed in this paper is to what extent one can detect mislabeled data using a committee of supervised Machine Learning models. The committee under consideration consists of a Bayesian model, a Random Forest, a Logistic classifier, a Neural Network and a Support Vector Machine. This committee is applied to a given data set in several iterations of 5-fold cross-validation. If a data sample is misclassified by all committee members in all iterations (consensus), then it is tagged as mislabeled. This approach was tested on the Iris plant data set, artificially contaminated with mislabeled data. For this data set the precision of detecting mislabeled samples is 100% and the recall is approximately 5%. The approach was also tested on the Touch data set, a data set of naturalistic social touch gestures. It is known that this data set contains mislabeled data, but the amount is unknown. For this data set the proposed method achieved a precision of 70%, and for almost all other tagged samples the corresponding touch gesture deviated considerably from the prototypical touch gesture. Overall the proposed method shows high potential for detecting mislabeled samples, but its precision on other data sets needs to be investigated.
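The consensus rule described above can be sketched on the Iris data as follows; the committee substitutes a Gaussian Naive Bayes model for the paper's Bayesian model and omits the neural network, so this is an approximation of the method rather than a reimplementation:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Load Iris and flip a few labels to simulate mislabeled samples.
X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
flipped = rng.choice(len(y), size=10, replace=False)
y_noisy = y.copy()
y_noisy[flipped] = (y_noisy[flipped] + 1) % 3

committee = [GaussianNB(), RandomForestClassifier(random_state=0),
             LogisticRegression(max_iter=1000), SVC()]

# Tag a sample only if *every* member misclassifies it in *every*
# repeated 5-fold cross-validation run (the consensus rule).
always_wrong = np.ones(len(y), dtype=bool)
for seed in range(3):  # repeated CV iterations
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    for model in committee:
        pred = cross_val_predict(model, X, y_noisy, cv=cv)
        always_wrong &= (pred != y_noisy)

tagged = np.flatnonzero(always_wrong)
precision = np.isin(tagged, flipped).mean() if len(tagged) else 0.0
print(f"tagged {len(tagged)} samples, precision {precision:.0%}")
```

Because the rule requires unanimity across all members and all iterations, it tends to be precise but conservative, which matches the high-precision, low-recall behavior reported above.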

Mannes Poel

