ABSTRACT
To overcome the limitations of current technologies for remote collaboration, we propose a system that adapts a video feed based on task properties, people's actions, and message properties. First, we examined how participants manage different visual resources in a laboratory experiment using a collaborative task in which one partner (the helper) instructs another (the worker) on how to assemble online puzzles. We analyzed helpers' eye gaze as a function of these three sets of properties. Helpers gazed at the set of alternative pieces more frequently when the pieces were harder for workers to differentiate, and less frequently over repeated trials. The results further suggest that a helper's desired focus of attention can be predicted from task properties, the partner's actions, and message properties. We propose a conditional Markov model classifier to explore the feasibility of predicting gaze from these properties. Model accuracy ranged from 65.40% on puzzles with easy-to-name pieces to 74.25% on puzzles with difficult-to-name pieces. These results suggest that our model can be used to automatically adapt video feeds to show helpers what they want to see when they want to see it.
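The abstract names a conditional Markov model classifier but gives no implementation details. The sketch below is a minimal illustration of one way such a predictor could work: it estimates P(gaze target at time t | gaze target at time t-1, current features) by smoothed counting over labeled gaze sequences. The gaze-target labels (`pieces_bay`, `workspace`), the feature tuple (piece nameability, partner action, message type), and the counting estimator are all assumptions for illustration, not the paper's actual method.

```python
from collections import defaultdict

# Hypothetical gaze targets; the paper's actual label set is not given here.
TARGETS = ["pieces_bay", "workspace"]


class ConditionalMarkovGazePredictor:
    """Sketch of a conditional Markov classifier: estimate
    P(target_t | target_{t-1}, features_t) by smoothed counting."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant
        self.counts = defaultdict(lambda: defaultdict(float))

    def fit(self, sequences):
        """sequences: list of trials, each a list of (features, target)
        pairs in temporal order; features is a hashable tuple such as
        (piece_nameability, partner_action, message_type)."""
        for trial in sequences:
            prev = "<start>"
            for features, target in trial:
                self.counts[(prev, features)][target] += 1.0
                prev = target

    def predict(self, prev_target, features):
        """Return the most likely next gaze target given the previous
        target and the current task/partner/message features."""
        scores = self.counts.get((prev_target, features), {})
        # Smoothing keeps unseen targets from scoring zero.
        smoothed = {t: scores.get(t, 0.0) + self.alpha for t in TARGETS}
        return max(smoothed, key=smoothed.get)


# Toy usage with hypothetical feature values.
trial = [(("hard_to_name", "searching", "reference"), "pieces_bay"),
         (("hard_to_name", "placing", "acknowledgment"), "workspace")]
model = ConditionalMarkovGazePredictor()
model.fit([trial])
print(model.predict("pieces_bay",
                    ("hard_to_name", "placing", "acknowledgment")))
# -> "workspace"
```

In a video-switching system of the kind the abstract proposes, the predicted target would drive which camera view is shown; how the paper maps predictions to video feeds is not specified in this abstract.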