research-article
Open Access

Instant 3D photography

Published: 30 July 2018

Abstract

We present an algorithm for constructing 3D panoramas from a sequence of aligned color-and-depth image pairs. Such sequences can be conveniently captured using dual-lens cell phone cameras that reconstruct depth maps from synchronized stereo image capture. Due to the small baseline and the resulting triangulation error, the depth maps are considerably degraded and contain low-frequency error, which prevents alignment using simple global transformations. We propose a novel optimization that jointly estimates the camera poses as well as spatially-varying adjustment maps that are applied to deform the depth maps and bring them into good alignment. When fusing the aligned images into a seamless mosaic, we utilize a carefully designed data term and the high quality of our depth alignment to remove the need for label smoothness optimization, achieving a speedup of two orders of magnitude over previous solutions that rely on discrete optimization. Our algorithm processes about one input image per second, resulting in an end-to-end runtime of about one minute for mid-sized panoramas. The final 3D panoramas are highly detailed and can be viewed with binocular and head-motion parallax in VR.
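
To illustrate why a small stereo baseline degrades depth, the following minimal sketch applies the standard first-order triangulation error relation for a rectified stereo pair: from z = f * b / d, a disparity error of dd pixels gives a depth error of roughly z^2 / (f * b) * dd. This is textbook stereo geometry, not code from the paper; the function name `depth_error` and all numeric values (focal length, baselines, disparity error) are assumptions chosen only for illustration.

```python
def depth_error(depth_m, focal_px, baseline_m, disparity_error_px=0.25):
    """First-order depth uncertainty of a rectified stereo pair.

    From z = f * b / d, a disparity error dd gives
    dz ~= z**2 / (f * b) * dd.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


if __name__ == "__main__":
    # Hypothetical values: a ~1 cm phone baseline vs. a 10 cm rig,
    # both with an assumed focal length of 1500 px.
    for baseline in (0.01, 0.10):
        for depth in (1.0, 2.0, 4.0):
            err = depth_error(depth, focal_px=1500.0, baseline_m=baseline)
            print(f"baseline={baseline:.2f} m  depth={depth:.1f} m  "
                  f"error~{err:.3f} m")
```

Under these assumed values, the error grows with the square of the depth and shrinks in proportion to the baseline, which is consistent with the abstract's observation that small-baseline depth maps are considerably degraded.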



Published in

ACM Transactions on Graphics, Volume 37, Issue 4
August 2018
1670 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3197517

Copyright © 2018 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
