
2020 | OriginalPaper | Chapter

Geometry-Guided View Synthesis with Local Nonuniform Plane-Sweep Volume

Authors : Ao Li, Li Fang, Long Ye, Wei Zhong, Qin Zhang

Published in: Digital TV and Wireless Multimedia Communication

Publisher: Springer Singapore


Abstract

In this paper, we develop a geometry-guided image generation technique for scene-independent novel view synthesis from a stereo image pair. We employ the successful plane-sweep strategy to approximate 3D scene structure, but instead of adopting a generic uniform configuration, we use depth information to perform local nonuniform plane spacing. More specifically, we first explicitly estimate a depth map in the reference view and use it to guide the spacing of planes in the plane-sweep volume, yielding a geometry-guided approximation of the scene geometry. We then learn to predict a multiplane image (MPI) representation, which can be used to efficiently synthesize a range of novel views of the scene, including views that extrapolate significantly beyond the input baseline. Our results on a large dataset of YouTube video frames indicate that our approach synthesizes higher-quality images while keeping the number of depth planes fixed.
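The core idea above — spacing the sweep planes nonuniformly according to an estimated depth map, rather than uniformly in inverse depth as in a conventional plane-sweep volume — can be sketched as follows. The chapter abstract does not specify the placement rule, so the quantile-based variant below is a hypothetical illustration; the function names and the quantile heuristic are our own assumptions, not the authors' method.

```python
import numpy as np

def uniform_inverse_depth_planes(d_min, d_max, num_planes):
    """Standard plane-sweep spacing: planes placed uniformly in
    inverse depth (disparity) between d_min and d_max."""
    disparities = np.linspace(1.0 / d_max, 1.0 / d_min, num_planes)
    return np.sort(1.0 / disparities)

def depth_guided_planes(depth_map, num_planes):
    """Nonuniform spacing guided by an estimated depth map: place
    planes at quantiles of the observed inverse-depth distribution,
    so planes concentrate where scene content actually lies
    (a hypothetical stand-in for the paper's local nonuniform spacing)."""
    inv = 1.0 / np.clip(depth_map.ravel(), 1e-6, None)
    quantiles = np.linspace(0.0, 1.0, num_planes)
    inv_planes = np.quantile(inv, quantiles)
    return np.sort(1.0 / inv_planes)
```

With this kind of placement, a scene whose depths cluster in a narrow band receives all of its planes inside that band, which is the intuition behind improving quality without increasing the plane count.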


DOI
https://doi.org/10.1007/978-981-15-3341-9_32