Self-localization in non-stationary environments using omni-directional vision

https://doi.org/10.1016/j.robot.2007.02.002

Abstract

This paper presents an image-based approach for localization in non-static environments using local feature descriptors, and its experimental evaluation in a large, dynamic, populated environment where the time interval between the collected data sets is up to two months. By combining local features with panoramic images, the method achieves robustness and invariance to large changes in the environment. Results are shown for global place recognition with no evidence accumulation and for a Monte Carlo localization method. To test the approach even further, experiments were conducted with up to 90% virtual occlusion in addition to the dynamic changes in the environment.

Introduction

One of the most essential abilities needed by robots is self-localization (“where am I?”), which can be divided into geometric and topological localization. Geometric localization tries to estimate the position of the robot as accurately as possible, e.g., by calculating a pose estimate (x,y,θ), while topological localization gives a more abstract position estimate, e.g., “I’m in the coffee room”. There has been much research on using accumulated sensory evidence to improve localization performance, including a highly successful class of algorithms that estimate posterior probability distributions over the space of possible locations [1], [2], [3], [4], [5], [6]. This approach enables both position tracking and relocalization from scratch, for example, when the robot is started (no prior knowledge of its position) or becomes lost or “kidnapped” (incorrect prior knowledge).

In recent years, panoramic or omni-directional cameras have become popular for self-localization because of their relatively low cost and large field of view. This makes it possible to create features that are invariant to the robot’s orientation, for example, using various colour histograms [7], [8], [9] or Eigenspace models [10], [11], [12].
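As a concrete illustration of this orientation invariance, the sketch below (not taken from any of the cited systems) builds a global colour histogram over a panoramic image: because the histogram ignores pixel positions, a circular shift of the image columns, i.e., a pure rotation of the robot, leaves it unchanged. The bin count and the intersection measure are illustrative choices.

```python
import numpy as np

def colour_histogram(panorama: np.ndarray, bins: int = 8) -> np.ndarray:
    """Rotation-invariant global feature for a panoramic image.

    `panorama` is an H x W x 3 RGB image whose columns span 360 degrees.
    A joint RGB histogram discards pixel positions, so circularly shifting
    the columns (a pure rotation of the robot) leaves it unchanged.
    """
    # Quantize each channel into `bins` levels and build a joint histogram.
    quantized = (panorama.astype(np.uint32) * bins) // 256            # H x W x 3
    idx = (quantized[..., 0] * bins + quantized[..., 1]) * bins + quantized[..., 2]
    hist = np.bincount(idx.ravel(), minlength=bins ** 3).astype(float)
    return hist / hist.sum()                                          # normalize

def histogram_similarity(h1: np.ndarray, h2: np.ndarray) -> float:
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return float(np.minimum(h1, h2).sum())
```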

Other innovations include increased robustness to lighting variations [11] and multi-view registration to deal with occlusions [13]. Recent work has combined panoramic vision with particle filters for global localization, including feature matching using Fourier transform [14], PCA [15] and colour histograms [16].

Most previous work on robot mapping and localization assumes a static world, i.e., that there are no changes to the environment between the time of mapping and the time of using the map. However, this assumption does not hold for typical populated indoor environments. Humans (and other robots) are not merely “dynamic obstacles” that may occlude the robot’s sensors — they also make changes to the world. For example, they may leave temporary objects such as packages, or move the furniture. In addition to these sudden changes, there may be gradual changes such as plants growing, coloured paint fading, etc.

Our approach to self-localization in non-static environments uses an image matching scheme that is robust to many of the changes that occur under natural conditions. Our hypothesis is that a map that is out of date can still contain much useful information. Thus the important question is how to extract features that can be used for matching new sensor data to a map that is only partially correct. We present an appearance-based approach to matching panoramic images that does not require calibration or geometric modelling of the scene or imaging system, so it should be applicable to any mobile robot using omni-directional vision for self-localization. The hypothesis is validated through experiments using sensor data collected by a mobile robot in a real dynamic environment over a period of two months.

The image matching algorithm uses local features extracted from many small subregions of the image rather than global features extracted from the whole image, which makes the method very robust to variations and occlusions. For example, Fig. 2 shows some of the local features that were matched between two different panoramic images of a laboratory environment, recorded 56 days apart, despite changes such as a television appearing, chairs moving and people working. The approach is similar to other approaches using local features for self-localization [17], [18], [19], with the differences that our method is adapted for panoramic images and was specifically designed and tested to work in long-term experiments conducted in a real dynamic environment.

We use a version of Lowe’s SIFT algorithm [20], which is modified so that stored panoramic images are only recognised from a local area around the corresponding location in the world (Section 2.3). A novel scheme is introduced for combining local feature matching with a particle filter for global localization (Section 2.4), which minimizes computational costs as the filter converges. See also Fig. 1 for a brief overview of the method. To evaluate the method, we also compare its performance with that of several other types of features, including both global and local features (Section 3). We also show how image matching performance can be further improved by incorporating information about the relative orientation of corresponding features between images (Section 3.3). Our experiments were designed to test the system under a wide variety of conditions, including results in a large populated indoor environment (up to 5 persons visible) on different days under different lighting conditions (Section 4).
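The sketch below indicates how an image-match score can drive a particle filter of this kind; the motion-noise values, the weighting rule and the `nearest_image` lookup are illustrative assumptions, not the exact scheme of Section 2.4.

```python
import numpy as np

def monte_carlo_step(particles, weights, odometry, current_features, database,
                     match_score, motion_noise=(0.05, 0.05, 0.02)):
    """One predict/update/resample cycle of a particle filter whose
    measurement model is an image-match score against the database image
    stored closest to each particle. All tuning values are illustrative."""
    n = len(particles)

    # Predict: apply odometry (dx, dy, dtheta) with additive Gaussian noise.
    particles = particles + odometry + np.random.randn(n, 3) * motion_noise

    # Update: weight each particle by how well the current image matches
    # the database image recorded nearest to the particle's position.
    for i, pose in enumerate(particles):
        ref = database.nearest_image(pose[:2])        # hypothetical lookup
        weights[i] *= 1.0 + match_score(current_features, ref)
    weights /= weights.sum()

    # Resample when the effective sample size drops below half the particles.
    if 1.0 / np.sum(weights ** 2) < n / 2:
        idx = np.random.choice(n, size=n, p=weights)
        particles, weights = particles[idx], np.full(n, 1.0 / n)

    return particles, weights
```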

The results demonstrate that the robot is able to localize itself from scratch, including experiments in “kidnapping”, and that performance degrades gracefully under occlusions of up to 90% of the robot’s field of view.

Section snippets

Basic methods

This section describes the methods used for extracting and matching the features, see also Fig. 1. To be able to match the current image with the images stored in the database, each image is converted into a set of features. The matching is done by comparing the features in the database with the features created from the current image.
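In outline, the global place-recognition step (no evidence accumulation) can be sketched as below, where `extract_features` and `match_score` are placeholders for the descriptor extraction and matching described in the following subsections.

```python
def recognise_place(current_image, database, extract_features, match_score):
    """Global place recognition with no evidence accumulation: return the
    pose of the stored image that best matches the current view.
    `database` is assumed to be a sequence of (features, pose) pairs."""
    query = extract_features(current_image)
    best_pose, best_score = None, -1.0
    for stored_features, pose in database:
        score = match_score(query, stored_features)
        if score > best_score:
            best_pose, best_score = pose, score
    return best_pose, best_score
```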

Our localization methods assume that the database (map) is already created. The database is constructed from another set of images collected by the same robot.

Other image matching methods compared

Together with MSIFT, three other image matching methods were evaluated: one based on a global feature and two based on local features. Matching of local descriptors was done as described in Section 2.3, where the match score is the total number of matched features between two images.
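As a sketch of such a score, the function below counts the descriptors of one image whose nearest neighbour in the other image passes Lowe's standard ratio test; the threshold is an illustrative value and stands in for the exact matching criterion of Section 2.3.

```python
import numpy as np

def count_matches(desc_a: np.ndarray, desc_b: np.ndarray, ratio: float = 0.8) -> int:
    """Match score between two images: the number of descriptors in `desc_a`
    whose nearest neighbour in `desc_b` is sufficiently closer than the
    second-nearest neighbour (Lowe-style ratio test). Each input is an
    (N, 128) array of SIFT-like descriptors."""
    if len(desc_a) == 0 or len(desc_b) < 2:
        return 0
    matches = 0
    for d in desc_a:
        dists = np.linalg.norm(desc_b - d, axis=1)
        nearest, second = np.partition(dists, 1)[:2]
        if nearest < ratio * second:
            matches += 1
    return matches
```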

Results

The results are divided into two parts: the first considers location and orientation recognition performance with no prior knowledge (Section 4.2), and the second evaluates the full Monte Carlo localization scheme (Section 4.3).

Conclusion

A self-localization algorithm for mobile robots using panoramic images was presented. The main contributions are the integration of existing state-of-the-art algorithms for creating local descriptors [20] and probabilistic state estimation [25] with an omni-directional imaging system on a mobile robot, and the experimental evaluation of the entire system in a real dynamic environment over an extended period of time. By using experiments with data collected on different days over a period of…


References (28)

  • J. Gonzalez-Barbosa et al., Rover localization in natural environments by indexing panoramic images
  • N. Winters et al., Omni-directional vision for robot navigation
  • M. Artac et al., Mobile robot localization using an incremental Eigenspace model
  • L. Paletta et al., Robust localization using context in omnidirectional imaging


    Henrik Andreasson is a Ph.D. student at the Centre for Applied Autonomous Sensor Systems, Örebro University, Sweden. He received his Master's degree in Mechatronics from the Royal Institute of Technology, Sweden, in 2001. His research interests include mobile robotics, computer vision, and machine learning.

    André Treptow received his diploma in computer engineering from the University of Siegen, Germany, in 2000. He was a Ph.D. student at the W.-Schickard-Institute of Computer Science at the Eberhard-Karls-University of Tuebingen, Germany, and received his Ph.D. in 2007. Since 2006 he has been working at Robert Bosch GmbH in the field of signal processing for automotive radar sensors. His research interests include real-time object detection and tracking, computer vision, biologically motivated vision algorithms and evolutionary algorithms.

    Tom Duckett is a Reader at the Department of Computing and Informatics, University of Lincoln. He was formerly a docent (Associate Professor) at Örebro University, where he founded the Learning Systems Laboratory, one of four research laboratories within the Centre for Applied Autonomous Sensor Systems. He obtained his Ph.D. from Manchester University, M.Sc. with distinction from Heriot-Watt University and B.Sc. (Hons.) from Warwick University, and has also studied at Karlsruhe and Bremen Universities. His research interests include mobile robotics, navigation, machine learning, AI, computer vision, and sensor fusion for perception-based control of autonomous systems.

    1. Present address: Department of Computing and Informatics, University of Lincoln, Lincoln LN6 7TS, UK.
