Published in: International Journal of Computer Vision, Issue 3/2015

July 1, 2015

Efficient Dense Rigid-Body Motion Segmentation and Estimation in RGB-D Video

Authors: Jörg Stückler, Sven Behnke

Abstract

Motion is a fundamental grouping cue in video. Many current approaches to motion segmentation in monocular or stereo image sequences either rely on sparse interest points or are dense but computationally demanding. We propose an efficient expectation–maximization (EM) framework for dense 3D segmentation of moving rigid parts in RGB-D video. Our approach segments images into pixel regions that undergo coherent 3D rigid-body motion. Our formulation treats background and foreground objects equally and makes no assumptions about the motion of the camera or the objects beyond rigidity. While our EM formulation is not restricted to a specific image representation, we supplement it with an efficient image representation and registration method for rapid segmentation of RGB-D video. In experiments, we demonstrate that our approach recovers segmentation and 3D motion with good precision.
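
To make the EM idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that 3D point correspondences between two frames are already given (arrays P and Q), that the number of rigid motions is fixed by the initial labels, and that residuals follow an isotropic Gaussian with a hand-picked sigma. It also omits the efficient image representation and registration mentioned in the abstract. The names weighted_rigid_fit and em_rigid_segmentation are hypothetical. The M-step fits one rigid-body motion per segment with a weighted Kabsch alignment; the E-step softly reassigns each point to the motion that explains its displacement best.

```python
import numpy as np


def weighted_rigid_fit(P, Q, w):
    """Weighted Kabsch fit: find R, t minimizing sum_i w_i * ||R p_i + t - q_i||^2."""
    w = w / w.sum()
    p_mean = w @ P
    q_mean = w @ Q
    # Weighted cross-covariance between the centered point sets.
    H = (P - p_mean).T @ (w[:, None] * (Q - q_mean))
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections so that R is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t


def em_rigid_segmentation(P, Q, init_labels, n_iters=20, sigma=0.05):
    """Toy EM: softly assign correspondences (P[i] -> Q[i]) to K rigid motions.

    Assumptions (not from the source): known correspondences, fixed K,
    isotropic Gaussian residuals with standard deviation sigma.
    """
    K = int(init_labels.max()) + 1
    resp = np.eye(K)[init_labels]  # N x K one-hot initial responsibilities
    motions = []
    for _ in range(n_iters):
        # M-step: one weighted rigid-body fit per segment.
        # The small constant keeps the weights valid if a segment empties out.
        motions = [weighted_rigid_fit(P, Q, resp[:, k] + 1e-12) for k in range(K)]
        # E-step: responsibilities from squared residuals under each motion.
        sq = np.stack(
            [np.sum((P @ R.T + t - Q) ** 2, axis=1) for R, t in motions], axis=1
        )
        log_r = -sq / (2.0 * sigma ** 2)
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
    return resp.argmax(axis=1), motions
```

Calling em_rigid_segmentation(P, Q, init_labels) returns a hard label per correspondence and a list of (R, t) pairs, one per segment. Like any EM scheme, this sketch converges to a local optimum, so the result depends on the initial labels.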

Footnotes
2. Due to the high run-time requirements of the method, we evaluated the approach at full frame-rate for sequence lengths that are multiples of 30 frames.

Metadata
Title
Efficient Dense Rigid-Body Motion Segmentation and Estimation in RGB-D Video
Authors
Jörg Stückler
Sven Behnke
Publication date
July 1, 2015
Publisher
Springer US
Published in
International Journal of Computer Vision, Issue 3/2015
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-014-0796-3
