research-article

Autoscanning for coupled scene reconstruction and proactive object analysis

Published: 02 November 2015
Abstract

Detailed scanning of indoor scenes is tedious for humans. We propose autonomous scene scanning by a robot to relieve humans of this laborious task. In an autonomous setting, detailed scene acquisition is inevitably coupled with scene analysis at the required level of detail. We develop a framework for object-level scene reconstruction coupled with object-centric scene analysis. As a result, the autoscanning and reconstruction are object-aware, guided by the object analysis. The analysis, in turn, improves gradually as object-wise data fidelity increases. To realize this framework, we drive the robot to execute an iterative analyze-and-validate algorithm that interleaves object analysis with guided validation.

The object analysis incorporates online learning into a robust graph-cut based segmentation framework, achieving a global update of object-level segmentation based on the knowledge gained from robot-operated local validation. Based on the current analysis, the robot performs proactive validation over the scene with physical push and scan refinement, aiming at reducing the uncertainty of both object-level segmentation and object-wise reconstruction. We propose a joint entropy to measure such uncertainty based on segmentation confidence and reconstruction quality, and formulate the selection of validation actions as a maximum information gain problem. The output of our system is a reconstructed scene with both object extraction and object-wise geometry fidelity.
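The action selection described above can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes that segmentation confidence and reconstruction quality are each a probability in [0, 1], that the two terms are independent so their binary entropies simply add, and that a predicted post-action state is available for every candidate action. The names `ObjectState`, `Action`, `joint_entropy`, and `select_action` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


@dataclass
class ObjectState:
    # ASSUMPTION: both quantities are modeled as probabilities in [0, 1].
    seg_conf: float     # belief that the object-level segmentation is correct
    rec_quality: float  # belief that the object geometry is adequately reconstructed


def joint_entropy(state: ObjectState) -> float:
    # ASSUMPTION: the two terms are independent, so their entropies add.
    return binary_entropy(state.seg_conf) + binary_entropy(state.rec_quality)


@dataclass
class Action:
    kind: str               # "push" or "scan"
    target: str             # name of the object the action is applied to
    predicted: ObjectState  # hypothetical predicted state after the action


def information_gain(current: ObjectState, action: Action) -> float:
    """Expected reduction in joint entropy if the action is executed."""
    return joint_entropy(current) - joint_entropy(action.predicted)


def select_action(scene: dict, actions: list) -> Action:
    """Pick the candidate validation action with maximum information gain."""
    return max(actions, key=lambda a: information_gain(scene[a.target], a))


# Toy scene: the mug's segmentation is uncertain, the box is poorly scanned.
scene = {
    "mug": ObjectState(seg_conf=0.5, rec_quality=0.9),
    "box": ObjectState(seg_conf=0.9, rec_quality=0.6),
}
actions = [
    Action("push", "mug", ObjectState(seg_conf=0.95, rec_quality=0.9)),
    Action("scan", "box", ObjectState(seg_conf=0.9, rec_quality=0.9)),
]
best = select_action(scene, actions)
print(best.kind, best.target)
```

In this toy scene the push on the mug removes more uncertainty than the rescan of the box, so it is selected first. A real system would have to estimate the predicted states from the current analysis rather than supply them by hand.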

      • Published in

        ACM Transactions on Graphics, Volume 34, Issue 6
        November 2015
        944 pages
        ISSN:0730-0301
        EISSN:1557-7368
        DOI:10.1145/2816795
        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States
