
2018 | Original Paper | Book Chapter

5. Model-Based 3D Object Recognition in RGB-D Images

Authors: Maciej Stefańczyk, Włodzimierz Kasprzak

Published in: Bridging the Semantic Gap in Image and Video Analysis

Publisher: Springer International Publishing


Abstract

A computational framework for 3D object recognition in RGB-D images is presented. The focus is on computer vision applications in indoor autonomous robotics, where either individual objects must be recognized so that the robot can grasp and manipulate them, or the entire scene must be recognized to allow high-level cognitive tasks to be performed. The framework integrates solutions for generic (i.e. type-based) object representation (e.g. semantic networks), trainable transformations between abstraction levels (e.g. by neural networks), reasoning under uncertain and partial data (e.g. Dynamic Bayesian Networks, Fuzzy Logic), optimized model-to-data matching (e.g. constraint optimization problems) and efficient search strategies (switching between data- and model-driven inference steps). The computational implementation of the object model and the object recognition strategy is presented in more detail. Testing scenarios deal with the recognition of cups and bottles or household furniture. The conducted experiments and the chosen applications confirmed that this approach is valid and may easily be adapted to multiple scenarios.
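To make the idea of model-to-data matching under constraints more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical two-part "cup" model (a `body` and a `handle`), hypothetical segment records with a shape label and a 3D centre, and an assumed distance threshold of 0.10 m. It uses plain exhaustive search over assignments rather than any optimized solver.

```python
from itertools import permutations

# Hypothetical toy model of a "cup": two parts that must be bound to segments.
MODEL_PARTS = ["body", "handle"]

# Assumed unary constraints: the shape label each model part expects.
EXPECTED_SHAPE = {"body": "cylinder", "handle": "arc"}

# Assumed binary constraint: maximum body-to-handle distance in metres.
MAX_PART_DISTANCE = 0.10


def unary_ok(part, segment):
    """Check that a segment's shape label fits the given model part."""
    return segment["shape"] == EXPECTED_SHAPE[part]


def binary_ok(assignment):
    """Check that the handle segment lies close enough to the body segment."""
    body = assignment["body"]["center"]
    handle = assignment["handle"]["center"]
    dist = sum((a - b) ** 2 for a, b in zip(body, handle)) ** 0.5
    return dist <= MAX_PART_DISTANCE


def match(segments):
    """Return the first consistent part-to-segment assignment, or None.

    Every ordered choice of distinct segments is tried (brute force),
    so this is only practical for a handful of segments.
    """
    for combo in permutations(segments, len(MODEL_PARTS)):
        assignment = dict(zip(MODEL_PARTS, combo))
        if all(unary_ok(p, s) for p, s in assignment.items()) and binary_ok(assignment):
            return {part: seg["id"] for part, seg in assignment.items()}
    return None


# Hypothetical segments, as a segmentation stage might produce them.
segments = [
    {"id": "s1", "shape": "plane", "center": (0.00, 0.00, 0.00)},
    {"id": "s2", "shape": "cylinder", "center": (0.30, 0.10, 0.05)},
    {"id": "s3", "shape": "arc", "center": (0.35, 0.10, 0.05)},
]

print(match(segments))
```

In this toy setting the segments are the data-driven input and the part constraints are the model-driven knowledge. A real system would score partial matches under uncertainty instead of accepting or rejecting them outright.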


Metadata
Title
Model-Based 3D Object Recognition in RGB-D Images
Authors
Maciej Stefańczyk
Włodzimierz Kasprzak
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-73891-8_5