Depth synthesis and local warps for plausible image-based navigation

Published: 04 July 2013

Abstract

Modern camera calibration and multiview stereo techniques enable users to smoothly navigate between different views of a scene captured using standard cameras. The underlying automatic 3D reconstruction methods work well for buildings and regular structures but often fail on vegetation, vehicles, and other complex geometry present in everyday urban scenes. Consequently, missing depth information makes Image-Based Rendering (IBR) for such scenes very challenging. Our goal is to provide plausible free-viewpoint navigation for such datasets. To do this, we introduce a new IBR algorithm that is robust to missing or unreliable geometry, providing plausible novel views even in regions quite far from the input camera positions. We first oversegment the input images, creating superpixels of homogeneous color content, which tend to preserve depth discontinuities. We then introduce a depth synthesis approach for poorly reconstructed regions, based on a graph structure defined on the oversegmentation and an appropriate traversal of that graph. The superpixels, augmented with synthesized depth, allow us to define a local shape-preserving warp that compensates for inaccurate depth. Our rendering algorithm blends the warped images and generates plausible image-based novel views for our challenging target scenes. Our results demonstrate novel view synthesis in real time for multiple challenging scenes with significant depth complexity, providing a convincing immersive navigation experience.
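
The depth synthesis step described in the abstract (a graph over superpixels, traversed to reach regions that do have depth) can be illustrated with a minimal sketch. The abstract does not specify the edge weights, the traversal, or the interpolation rule, so everything below is an assumption made for illustration: edges are weighted by Euclidean distance between mean colors, the traversal is Dijkstra's shortest-path search, and the synthesized depth is the median over the `k` closest superpixels with known depth. The function name `synthesize_depth` and its parameters are hypothetical and are not taken from the paper.

```python
import heapq
from statistics import median


def synthesize_depth(adjacency, color, depth, target, k=3):
    """Estimate a depth for `target` from the k closest superpixels with depth.

    adjacency: dict mapping superpixel id -> iterable of neighbouring ids
    color:     dict mapping superpixel id -> mean colour (tuple of floats)
    depth:     dict mapping superpixel id -> known depth (absent if unknown)
    Closeness is the shortest-path cost in the superpixel graph, where each
    edge costs the Euclidean colour distance between its endpoints
    (an assumed weighting, chosen only for this sketch).
    """
    def edge_cost(a, b):
        return sum((x - y) ** 2 for x, y in zip(color[a], color[b])) ** 0.5

    best = {target: 0.0}          # cheapest known path cost per superpixel
    heap = [(0.0, target)]        # (path cost, superpixel id)
    found = []                    # depths of the closest superpixels with depth

    while heap and len(found) < k:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue              # stale heap entry, a cheaper path was found
        if node != target and node in depth:
            found.append(depth[node])
        for nb in adjacency[node]:
            new_cost = cost + edge_cost(node, nb)
            if new_cost < best.get(nb, float("inf")):
                best[nb] = new_cost
                heapq.heappush(heap, (new_cost, nb))

    # Median is an assumed, outlier-tolerant way to combine the samples.
    return median(found) if found else None


# Chain of five superpixels: 0-1 are grey wall, 2-4 are green foliage.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
color = {0: (0.5, 0.5, 0.5), 1: (0.5, 0.5, 0.5),
         2: (0.1, 0.6, 0.1), 3: (0.1, 0.6, 0.1), 4: (0.1, 0.6, 0.1)}
depth = {0: 2.0, 4: 10.0}         # only the two ends were reconstructed

print(synthesize_depth(adjacency, color, depth, target=2, k=1))  # 10.0
```

In this toy example superpixel 2 is two hops from both reconstructed superpixels, but the path through similarly colored foliage is cheaper, so it takes its depth from superpixel 4 rather than from the wall. This is the intuition behind traversing a color-weighted graph instead of using purely spatial distance.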

      • Published in

        ACM Transactions on Graphics, Volume 32, Issue 3
        June 2013
        129 pages
        ISSN: 0730-0301
        EISSN: 1557-7368
        DOI: 10.1145/2487228

        Copyright © 2013 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 4 July 2013
        • Accepted: 1 February 2013
        • Revised: 1 January 2013
        • Received: 1 July 2012

        Qualifiers

        • research-article
        • Research
        • Refereed
