Conceptual spatial representations for indoor mobile robots☆
Introduction
Recently, there has been an increasing interest in service robots, such as domestic or elderly care robots, whose aim is to assist people in human-made environments. In such situations, the robots will no longer be operated by trained personnel but instead have to interact with people from the general public. Thus, an important challenge lies in facilitating the communication between robots and humans.
One of the most intuitive and powerful ways for humans to communicate is spoken language. It is therefore interesting to design robots that are able to speak with people and understand their words and expressions. If a dialogue between robots and humans is to be successful, the robots must make use of the same concepts to refer to things and phenomena as a person would do. For this, the robot needs to perceive the world similarly to a human.
An important aspect of human-like perception of the world is the robot’s understanding of the spatial and functional properties of human-made environments, while still being able to act safely in them. One of the robot’s first tasks consists of learning the environment in the same way a person does, sharing common concepts like, for instance, corridor or living room. These terms are used not only as labels, but as semantic expressions that relate them to some complex object or situation. For example, the term living room usually implies a place with a particular structure, which includes objects like a couch or a television set. Thus, representing space in a human-like way must also account for the way linguistic references to spatial entities are established in situated natural-language dialogues. In addition, a spatial knowledge representation for robotic assistants must address the issues involved in safe and reliable navigation control. Only then can robots be deployed in semi-structured environments, such as offices, where they have to interact with humans in everyday situations.
The specific problem we focus on in this article is how, given innate (possibly human-like) concepts a robot may have of spatial organization, the robot can autonomously build an internal representation of the environment by combining these concepts with different low-level sensory systems. This is done by creating a conceptual representation of the environment, in which the concepts represent spatial and functional properties of typical human-made indoor environments.
In order to meet both of the aforementioned requirements, robust robot control and human-like conceptualization, we propose a spatial representation that contains maps at different levels of abstraction. This stepwise abstraction from raw sensor input not only produces maps that are suitable for reliable robot navigation, but also yields a level of representation that is similar to a human conceptualization of spatial organization. Furthermore, this model provides a richer semantic view of an environment that permits the robot to do spatial categorization rather than only instantiation.
Our approach has been integrated into a system running on a mobile robot. This robot is capable of conceptual spatial mapping in an indoor environment, perceiving the world through different typical sensors like a laser range finder and a camera. Moreover, the robot is endowed with the necessary abilities to conduct a reflected, situated dialogue about its environment.
The rest of the paper is organized as follows. In Section 2 we present related work. Section 3 gives an overview of the components of our robotic system. After explaining the individual techniques that are used for evaluating the sensory input in Section 4, we describe our approach to a multi-layered conceptual spatial representation that bridges the gap between sensory input and human spatial concepts in Section 5. Then, the general principles of our robot’s situated dialogue capabilities are introduced in Section 6. In Section 7, we discuss the integration of the complete system in a mobile robot. Finally, concluding remarks are given in Section 8.
Related work
An approach to endowing autonomous robots with a human-like conceptualization of space inherently needs to take into account research in sensor-based mapping and localization for robots as well as findings about human spatial cognition.
Research in cognitive psychology addresses the inherently qualitative nature of human spatial knowledge. Backed up by experimental studies, it is nowadays generally assumed that humans adopt a partially hierarchical representation of spatial organization [1], [2].
System overview
Following the research in spatial cognition and qualitative spatial reasoning on the one hand, and in mobile robotics and artificial intelligence on the other hand, we propose a spatial representation for indoor mobile robots that is divided into layers. These layers represent different levels of abstraction from sensory input to human-like spatial concepts.
This multi-layered spatial representation is the centerpiece of our integrated robotic system. It is created using information coming from the robot's different sensor modalities.
Perception
The perception subsystem gathers information from the laser range scanner and from a camera. Different techniques are used for evaluation of the sensory input. The laser data is processed and used to create the low-level layers of the spatial representation. At the same time the input from the laser scanner is used by a component for detecting and following people [25]. Finally, the images acquired by the camera are analyzed by a computer vision component for object recognition.
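The fan-out described above, where one laser scan feeds both mapping and people tracking while camera images go to object recognition, can be sketched as follows. This is a minimal illustration, not the paper's actual module interfaces; the class and method names are assumptions.

```python
# Hypothetical sketch of a perception subsystem routing sensor data to
# its consumers. Component names are illustrative only.

class PerceptionHub:
    def __init__(self, map_builder, people_tracker, object_recognizer):
        self.map_builder = map_builder          # builds low-level map layers
        self.people_tracker = people_tracker    # detects and follows people
        self.object_recognizer = object_recognizer

    def on_laser_scan(self, scan):
        # A single laser scan serves two purposes at once:
        # updating the low-level map layers and tracking people.
        self.map_builder.update(scan)
        return self.people_tracker.update(scan)

    def on_camera_image(self, image):
        # Camera images are analyzed for known object categories.
        return self.object_recognizer.detect(image)
```

The point of the sketch is simply that the laser data is consumed by two independent components in parallel, while vision is handled separately.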
Multi-layered spatial representation
The sensors that a robot has are very different from the human sensory modalities. Yet if a robot is to act in a human-populated environment, and to interact with users who are not expert roboticists, it needs to understand its surroundings in terms of human spatial concepts. We propose a layered model of space at different levels of abstraction that range from low-level metric maps for robot localization and navigation to a conceptual layer that provides a human-like decomposition and categorization of space.
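The layered structure described above can be sketched as a simple data model. This is an illustrative sketch only: the layer names (metric, navigation, topological, conceptual), the fields, and the toy categorization rule are assumptions for exposition, not the authors' actual data structures.

```python
from dataclasses import dataclass, field

@dataclass
class MetricMap:
    """Low-level layer: geometric features extracted from laser scans."""
    lines: list = field(default_factory=list)   # e.g. [(x1, y1, x2, y2), ...]

@dataclass
class NavigationNode:
    """A free-space node the robot can safely reach."""
    node_id: int
    x: float
    y: float
    area_class: str = "unknown"   # e.g. "corridor" or "room"

@dataclass
class TopologicalArea:
    """A connected group of navigation nodes forming one area."""
    area_id: int
    node_ids: list

@dataclass
class ConceptualEntry:
    """Human-like concept inferred for an area (e.g. 'livingroom')."""
    area_id: int
    concept: str
    evidence: list   # observed objects supporting the concept

class MultiLayerMap:
    """Container tying the abstraction layers together."""
    def __init__(self):
        self.metric = MetricMap()
        self.nav_nodes = {}    # node_id -> NavigationNode
        self.areas = {}        # area_id -> TopologicalArea
        self.concepts = []     # list of ConceptualEntry

    def categorize(self, area_id, objects):
        """Toy rule: infer a concept from objects observed in an area."""
        if "couch" in objects or "tv" in objects:
            concept = "livingroom"
        elif "oven" in objects:
            concept = "kitchen"
        else:
            concept = "room"
        self.concepts.append(ConceptualEntry(area_id, concept, objects))
        return concept
```

The essential idea is that each layer abstracts the one below it, and the topmost layer associates areas with human spatial concepts via observed evidence such as recognized objects.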
Situated dialogue
In this section, we discuss the functionality which enables a robot to carry out a natural language dialogue with a human.
A core characteristic of our approach is that the robot builds up a semantic representation for each utterance. The robot interprets it against the dialogue context, relating it to previously mentioned objects and events, and to previous utterances in terms of “speech acts” (dialogue moves). Since dialogues in human–robot interaction are inherently situated, the robot also relates utterances to its spatial representation of the environment.
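The interpretation of utterances against dialogue context can be illustrated with a toy sketch: each utterance is recorded as a speech act with content, and context-dependent expressions like “it” or “there” are resolved against previously mentioned referents. All names here are hypothetical; the paper's actual dialogue system is far richer.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Utterance:
    speech_act: str            # e.g. "assert", "question", "command"
    content: str
    referent: Optional[str]    # entity the utterance talks about

class DialogueContext:
    """Toy dialogue context: history of utterances with referent resolution."""
    def __init__(self):
        self.history = []

    def interpret(self, speech_act, content, referent=None):
        # Pronoun-like expressions resolve to the most recently
        # mentioned concrete referent in the dialogue history.
        if referent in ("it", "there") and self.history:
            for prev in reversed(self.history):
                if prev.referent not in (None, "it", "there"):
                    referent = prev.referent
                    break
        utt = Utterance(speech_act, content, referent)
        self.history.append(utt)
        return utt
```

For example, after interpreting “this is the kitchen”, the command “go there” would resolve its referent to the kitchen, mimicking how situated dialogue ties utterances back to earlier mentions.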
System integration
Our approach has been implemented as an integrated system, running on an ActivMedia PeopleBot mobile robot platform. In this section, we discuss the integration of the components presented in the earlier sections. We focus on what integration brings us in terms of achieving a better understanding of sensory signals, i.e. one that is more complete and more appropriate for interacting with humans; particularly, given that sensory information usually only provides a partial, potentially noisy view of the environment.
Conclusions
We presented an integrated approach for creating conceptual representations of human-made environments where the concepts represent spatial and functional properties of typical office indoor environments. Our representation is based on multiple maps at different levels of abstraction. The information needed for each level stems from different modalities, including a laser sensor, a camera, and a natural language processing system. The complete system was integrated and tested on a mobile robot.
H. Zender is a PhD student researcher at the Language Technology Lab of the German Research Center for Artificial Intelligence (DFKI). His research interests are linguistic aspects of spatial cognition and spatial knowledge representations for human–robot interaction. He received his Diploma degree in Computational Linguistics from Saarland University in 2006.
References (46)

- Distortions in judged spatial relations, Cognitive Psychology (1978)
- Mental representations of spatial relations, Cognitive Psychology (1986)
- The spatial semantic hierarchy, Artificial Intelligence (2000)
- Robox at Expo.02: A large scale installation of personal robots, Robotics and Autonomous Systems (2003)
- Qualitative spatial representation and reasoning: An overview, Fundamenta Informaticae (2001)
- Evidence for hierarchies in cognitive maps, Memory and Cognition (1985)
- How shall a thing be called?, Psychological Review (1958)
- Principles of categorization
- A taxonomy of spatial knowledge for navigation and its application to the Bremen autonomous wheelchair
- S. Vasudevan, S. Gachter, M. Berger, R. Siegwart, Cognitive maps for mobile robots: an object based approach, in: Proc. ...
- Robovie: An interactive humanoid robot, Int. J. Industrial Robotics
- Biron - the Bielefeld robot companion
- The information state approach to dialogue management
O. Martínez Mozos is a Ph.D. student at the lab of Autonomous Intelligent Systems headed by Wolfram Burgard at the University of Freiburg in Germany. His areas of interest lie in mobile robotics, artificial intelligence, and pattern recognition. In 2005, he received an M.Sc. in Applied Computer Science at the University of Freiburg. In 1997 he completed an M.Eng. in Computer Science at the University of Alicante in Spain.
P. Jensfelt is an assistant professor at the Centre for Autonomous Systems at the Royal Institute of Technology, Stockholm, Sweden. He received his M.Sc. in Engineering Physics in 1996 and Ph.D. in Automatic Control in 2001. His research interests include mapping and localization, mobile robotics, and system integration.
G.-J. Kruijff is a Senior Researcher at the DFKI Language Technology Lab, where he leads efforts in the area of “cognitive systems”. His research focuses on developing “talking robots”: theories and implemented architectures that enable cognitive robots to understand and produce situated dialogue with human users. He has over 90 refereed conference papers and articles in human–robot interaction and in formal and computational linguistics. He is a member of the IEEE.
W. Burgard is a professor at the Department of Computer Science at the University of Freiburg, where he heads the lab for Autonomous Intelligent Systems. He studied Computer Science at the University of Dortmund and received his Ph.D. degree in Computer Science from the University of Bonn in 1991. His research focuses on mobile robotics and system integration.
☆ This work was supported by the EU FP6 IST Cognitive Systems Integrated Project “CoSy” FP6-004250-IP.