Published in: Autonomous Robots 2/2014

01.08.2014

Object segmentation in cluttered and visually complex environments

Authors: Dmitri Ignakov, Guangjun Liu, Galina Okouneva


Abstract

Object segmentation is essential for systems that acquire object models online for robotic grasping. However, it remains a major technical challenge in visually complex and uncontrolled environments. Segmentation algorithms that rely on image features alone can perform poorly under certain lighting conditions, or when the object and the background have a similar appearance. Likewise, existing object segmentation algorithms that rely exclusively on three-dimensional (3D) geometric data are derived under strong assumptions about the geometry of the scene. A promising alternative is to combine appearance and 3D features. In this paper, an object segmentation algorithm is presented that combines multiple appearance and geometric cues. Segmentation is formulated as a binary labeling problem, and the Conditional Random Field (CRF) framework is used to model the conditional probability of the labeling given the appearance and geometric data. The maximum a posteriori estimate of the labeling is obtained by minimizing the energy function corresponding to the CRF using graph cuts. A simple and efficient method for initializing the proposed algorithm is also presented. Experimental results demonstrate the effectiveness of the proposed algorithm.
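For readers unfamiliar with the formulation, the sketch below shows the generic pairwise CRF energy and MAP labeling that the abstract refers to. The unary potential \psi_i, pairwise potential \psi_{ij}, smoothness weight \lambda, and neighbourhood \mathcal{N} are placeholders; they do not reproduce the authors' specific appearance- and geometry-based cue terms.

  % Binary labels x_i in {0,1} (object/background) for each image element i,
  % conditioned on the observed appearance and geometric data d.
  P(\mathbf{x} \mid \mathbf{d}) \;\propto\; \exp\!\bigl(-E(\mathbf{x})\bigr)

  % CRF energy: unary data terms plus pairwise smoothness terms over neighbours in \mathcal{N}.
  E(\mathbf{x}) \;=\; \sum_{i} \psi_i(x_i \mid \mathbf{d})
    \;+\; \lambda \sum_{(i,j)\in\mathcal{N}} \psi_{ij}(x_i, x_j \mid \mathbf{d})

  % MAP labeling: maximizing the posterior is equivalent to minimizing the energy.
  \mathbf{x}^{*} \;=\; \arg\max_{\mathbf{x}} P(\mathbf{x} \mid \mathbf{d})
    \;=\; \arg\min_{\mathbf{x}\in\{0,1\}^{n}} E(\mathbf{x})

When the pairwise terms are submodular (they penalize neighbouring elements that take different labels), this binary minimization can be solved exactly with a single min-cut/max-flow computation, which is the graph-cut step mentioned in the abstract.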


Metadata
Title
Object segmentation in cluttered and visually complex environments
Authors
Dmitri Ignakov
Guangjun Liu
Galina Okouneva
Publication date
01.08.2014
Publisher
Springer US
Published in
Autonomous Robots / Issue 2/2014
Print ISSN: 0929-5593
Electronic ISSN: 1573-7527
DOI
https://doi.org/10.1007/s10514-013-9381-9
