ABSTRACT
To overcome the limitations of current technologies for remote collaboration, we propose a system that adapts a video feed based on task properties, people's actions, and message properties. First, we examined how participants manage different visual resources in a laboratory experiment using a collaborative task in which one partner (the helper) instructs another (the worker) on how to assemble online puzzles. We analyzed helpers' eye gaze as a function of these three sets of properties. Helpers gazed at the set of alternative pieces more frequently when the pieces were harder for workers to differentiate, and less frequently over repeated trials. The results further suggest that a helper's desired focus of attention can be predicted from task properties, the partner's actions, and message properties. We propose a conditional Markov model classifier to explore the feasibility of predicting gaze from these properties. Model accuracy ranged from 65.40% on puzzles with easy-to-name pieces to 74.25% on puzzles with difficult-to-name pieces. These results suggest that our model can be used to automatically adapt video feeds to show helpers what they want to see when they want to see it.
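The abstract names a conditional Markov model classifier but gives no implementation details. The sketch below is a minimal illustration of one way such a predictor could work: it estimates P(gaze target at time t | gaze target at time t-1, current features) by smoothed counting over labeled gaze sequences. The gaze-target labels (`pieces_bay`, `workspace`), the feature tuple (piece nameability, partner action, message type), and the counting estimator are all assumptions for illustration, not the paper's actual method.

```python
from collections import defaultdict

# Hypothetical gaze targets; the paper's actual label set is not given here.
TARGETS = ["pieces_bay", "workspace"]


class ConditionalMarkovGazePredictor:
    """Sketch of a conditional Markov classifier: estimate
    P(target_t | target_{t-1}, features_t) by smoothed counting."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant
        self.counts = defaultdict(lambda: defaultdict(float))

    def fit(self, sequences):
        """sequences: list of trials, each a list of (features, target)
        pairs in temporal order; features is a hashable tuple such as
        (piece_nameability, partner_action, message_type)."""
        for trial in sequences:
            prev = "<start>"
            for features, target in trial:
                self.counts[(prev, features)][target] += 1.0
                prev = target

    def predict(self, prev_target, features):
        """Return the most likely next gaze target given the previous
        target and the current task/partner/message features."""
        scores = self.counts.get((prev_target, features), {})
        # Smoothing keeps unseen targets from scoring zero.
        smoothed = {t: scores.get(t, 0.0) + self.alpha for t in TARGETS}
        return max(smoothed, key=smoothed.get)


# Toy usage with hypothetical feature values.
trial = [(("hard_to_name", "searching", "reference"), "pieces_bay"),
         (("hard_to_name", "placing", "acknowledgment"), "workspace")]
model = ConditionalMarkovGazePredictor()
model.fit([trial])
print(model.predict("pieces_bay",
                    ("hard_to_name", "placing", "acknowledgment")))
# -> "workspace"
```

In a video-switching system of the kind the abstract proposes, the predicted target would drive which camera view is shown; how the paper maps predictions to video feeds is not specified in this abstract.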