Published in: Intelligent Service Robotics 4/2016

1 October 2016 | Original Research Paper

Building 3D semantic maps for mobile robots using RGB-D camera

Authors: Zhe Zhao, Xiaoping Chen

Abstract

The wide availability of affordable RGB-D sensors is changing the landscape of indoor scene analysis. Years of research on simultaneous localization and mapping (SLAM) have made it possible to merge multiple RGB-D images into a single point cloud and provide a 3D model of a complete indoor scene. However, these reconstructed models contain only geometric information and no semantic knowledge. Endowing environment models with semantic knowledge can greatly enhance robot autonomy and the ability of robots to carry out more complex tasks in unstructured environments. Towards this goal, we propose a novel approach to generating 3D semantic maps of an indoor scene. Our approach first creates a 3D reconstructed map from an RGB-D image sequence, then jointly infers the semantic object category and the structural class for each point of the global map. Twelve object categories (e.g. walls, tables, chairs) and four structural classes (ground, structure, furniture and props) are labeled in the global map, so that the map captures both object and structure information. To obtain the semantic information, we compute a semantic segmentation for each RGB-D image and merge the labeling results with a Dense Conditional Random Field. Unlike previous techniques, we use temporal information and higher-order cliques to enforce label consistency in each image labeling result. Our experiments demonstrate that temporal information and higher-order cliques are significant for the semantic mapping procedure and improve the precision of the semantic mapping results.
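To make the idea of merging per-image labels into a single labeled map concrete, here is a minimal Python sketch. It is not the Dense Conditional Random Field described in the abstract: it is a much simpler stand-in that treats every map point independently and sums log-probabilities across frames, ignoring the pairwise, temporal and higher-order terms. The function names, the three-category label set and the object-to-structure table are all assumptions made for illustration; the abstract names twelve object categories but gives neither the full list nor the mapping.

```python
import math
from collections import defaultdict

# Hypothetical subset of the object categories and an assumed mapping to
# structural classes; the abstract does not specify this table.
OBJECT_TO_STRUCTURE = {
    "wall": "structure",
    "table": "furniture",
    "chair": "furniture",
}


def fuse_point_labels(observations, labels, eps=1e-6):
    """Fuse per-frame class probabilities into one label per map point.

    observations: iterable of (point_id, {label: probability}) pairs, one
    pair per frame in which the point was observed.
    Naive independent fusion: sum the log-probabilities per point and take
    the argmax. This is a simplification, not a CRF.
    """
    scores = defaultdict(lambda: {label: 0.0 for label in labels})
    for point_id, probs in observations:
        for label in labels:
            # Clamp to eps so a missing or zero probability does not give -inf.
            scores[point_id][label] += math.log(max(probs.get(label, 0.0), eps))
    return {pid: max(s, key=s.get) for pid, s in scores.items()}


def structural_classes(point_labels, mapping=OBJECT_TO_STRUCTURE):
    """Look up the (assumed) structural class for each fused object label."""
    return {pid: mapping[label] for pid, label in point_labels.items()}


if __name__ == "__main__":
    labels = list(OBJECT_TO_STRUCTURE)
    # Point 0 is seen in three frames, point 1 in one frame.
    observations = [
        (0, {"wall": 0.6, "table": 0.3, "chair": 0.1}),
        (0, {"wall": 0.2, "table": 0.7, "chair": 0.1}),
        (0, {"wall": 0.7, "table": 0.2, "chair": 0.1}),
        (1, {"wall": 0.1, "table": 0.1, "chair": 0.8}),
    ]
    fused = fuse_point_labels(observations, labels)
    print(fused)
    print(structural_classes(fused))
```

In this sketch a single frame that disagrees (the second observation of point 0) is outvoted by the other two, which gives a rough intuition for why combining evidence across frames can stabilise the labels. A CRF additionally couples neighbouring points, which this sketch deliberately omits.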


Metadata
Title
Building 3D semantic maps for mobile robots using RGB-D camera
Authors
Zhe Zhao
Xiaoping Chen
Publication date
1 October 2016
Publisher
Springer Berlin Heidelberg
Published in
Intelligent Service Robotics / Issue 4/2016
Print ISSN: 1861-2776
Electronic ISSN: 1861-2784
DOI
https://doi.org/10.1007/s11370-016-0201-x
