ABSTRACT
This article presents a User Interface (UI) framework for multimodal interactions targeted at immersive virtual environments. Its configurable input and gesture processing components provide an advanced behavior graph capable of routing continuous data streams asynchronously. The framework introduces a Knowledge Representation Layer that augments objects of the simulated environment with Semantic Entities, a central object model that bridges Virtual Reality (VR) and Artificial Intelligence (AI) representations. Specialized node types use these facilities to implement required processing tasks such as gesture detection, preprocessing of the visual scene for multimodal integration, or translation of movements into multimodally initialized gestural interactions. A modified Augmented Transition Network (ATN) approach accesses the knowledge layer as well as the preprocessing components to integrate linguistic, gestural, and context information in parallel. The overall framework emphasizes extensibility, adaptivity, and reusability, e.g., by using persistent and interchangeable XML-based formats to describe its processing stages.
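To make the ATN-based integration idea concrete, the following is a minimal, hypothetical sketch (in Python, since the article itself ships no code) of a transition network whose arcs are guarded by predicates over time-stamped multimodal tokens and whose registers accumulate the resulting interpretation. All class, function, and variable names here are invented for illustration and do not reflect the framework's actual API or its XML description format.

```python
# Hypothetical sketch of ATN-style multimodal integration; names are illustrative only.
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Token:
    """A time-stamped input event from one modality."""
    modality: str                    # "speech" or "gesture"
    value: str                       # e.g., a word or a gesture label
    referent: Optional[str] = None   # object resolved from scene context
    time: float = 0.0

@dataclass
class Transition:
    target: str
    guard: Callable[[Token], bool]          # does this token fire the arc?
    action: Callable[[Token, dict], None]   # update interpretation registers

class ATN:
    """A flat transition network; full ATNs also allow recursive sub-networks."""
    def __init__(self, start: str, finals: set):
        self.start, self.finals = start, finals
        self.arcs: Dict[str, List[Transition]] = {}

    def add(self, source: str, transition: Transition) -> None:
        self.arcs.setdefault(source, []).append(transition)

    def run(self, tokens: List[Token]) -> Optional[dict]:
        state, registers = self.start, {}
        for tok in sorted(tokens, key=lambda t: t.time):
            for arc in self.arcs.get(state, []):
                if arc.guard(tok):
                    arc.action(tok, registers)
                    state = arc.target
                    break                     # take the first matching arc
        return registers if state in self.finals else None

# Example: a "put <that> <there>" style utterance combined with two pointing gestures.
net = ATN(start="S", finals={"DONE"})
net.add("S", Transition("VERB", lambda t: t.value == "put",
                        lambda t, r: r.update(action="put")))
net.add("VERB", Transition("OBJ", lambda t: t.modality == "gesture",
                           lambda t, r: r.update(object=t.referent)))
net.add("OBJ", Transition("DONE", lambda t: t.modality == "gesture",
                          lambda t, r: r.update(location=t.referent)))

tokens = [
    Token("speech", "put", time=0.0),
    Token("gesture", "point", referent="red_chair", time=0.4),
    Token("speech", "that", time=0.5),
    Token("speech", "there", time=1.0),
    Token("gesture", "point", referent="table_corner", time=1.1),
]
print(net.run(tokens))
# -> {'action': 'put', 'object': 'red_chair', 'location': 'table_corner'}
```

In the framework described above, the arcs would additionally consult the Knowledge Representation Layer to resolve deictic references against Semantic Entities of the scene; the toy version simply ignores speech tokens that do not match any arc and assumes referents are already resolved.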