Skip to main content

2016 | OriginalPaper | Buchkapitel

Generative Visual Manipulation on the Natural Image Manifold

verfasst von : Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, Alexei A. Efros

Erschienen in: Computer Vision – ECCV 2016

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Realistic image manipulation is challenging because it requires modifying the image appearance in a user-controlled way, while preserving the realism of the result. Unless the user has considerable artistic skill, it is easy to “fall off” the manifold of natural images while editing. In this paper, we propose to learn the natural image manifold directly from data using a generative adversarial neural network. We then define a class of image editing operations, and constrain their output to lie on that learned manifold at all times. The model automatically adjusts the output keeping all edits as realistic as possible. All our manipulations are expressed in terms of constrained optimization and are applied in near-real time. We evaluate our algorithm on the task of realistic photo manipulation of shape and color. The presented method can further be used for changing one image to look like the other, as well as generating novel imagery from scratch based on user’s scribbles.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
For simplicity, we omit the pixel subscript (xy) for all the variables.
 
Literatur
1.
Zurück zum Zitat Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: NIPS. 2672–2680. (2014) Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: NIPS. 2672–2680. (2014)
2.
Zurück zum Zitat Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. In: ICLR (2016) Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. In: ICLR (2016)
3.
Zurück zum Zitat Kingma, D.P., Welling, M.: Auto-encoding variational bayes. In: ICLR (2014) Kingma, D.P., Welling, M.: Auto-encoding variational bayes. In: ICLR (2014)
4.
Zurück zum Zitat Denton, E.L., Chintala, S., Fergus, R., et al.: Deep generative image models usinga laplacian pyramid of adversarial networks. In: NIPS, pp. 1486–1494 (2015) Denton, E.L., Chintala, S., Fergus, R., et al.: Deep generative image models usinga laplacian pyramid of adversarial networks. In: NIPS, pp. 1486–1494 (2015)
5.
Zurück zum Zitat Dosovitskiy, A., Brox, T.: Generating images with perceptual similarity metrics based on deep networks. arXiv preprint arXiv:1602.02644 (2016) Dosovitskiy, A., Brox, T.: Generating images with perceptual similarity metrics based on deep networks. arXiv preprint arXiv:​1602.​02644 (2016)
6.
Zurück zum Zitat Reinhard, E., Ashikhmin, M., Gooch, B., Shirley, P.: Color transfer between images. IEEE Comput. Graph. Appl. 21, 34–41 (2001)CrossRef Reinhard, E., Ashikhmin, M., Gooch, B., Shirley, P.: Color transfer between images. IEEE Comput. Graph. Appl. 21, 34–41 (2001)CrossRef
7.
Zurück zum Zitat Levin, A., Lischinski, D., Weiss, Y.: Colorization using optimization. In: SIGGRAPH, SIGGRAPH 2004, pp. 689–694. ACM, New York (2004) Levin, A., Lischinski, D., Weiss, Y.: Colorization using optimization. In: SIGGRAPH, SIGGRAPH 2004, pp. 689–694. ACM, New York (2004)
8.
Zurück zum Zitat Alexa, M., Cohen-Or, D., Levin, D.: As-rigid-as-possible shape interpolation. In: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2000 (2000) Alexa, M., Cohen-Or, D., Levin, D.: As-rigid-as-possible shape interpolation. In: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2000 (2000)
9.
Zurück zum Zitat Krähenbühl, P., Lang, M., Hornung, A., Gross, M.: A system for retargeting of streaming video. In: ACM Trans. Graph. (TOG), vol. 28. p. 126. ACM (2009) Krähenbühl, P., Lang, M., Hornung, A., Gross, M.: A system for retargeting of streaming video. In: ACM Trans. Graph. (TOG), vol. 28. p. 126. ACM (2009)
10.
Zurück zum Zitat Barnes, C., Shechtman, E., Finkelstein, A., Goldman, D.: Patchmatch: a randomized correspondence algorithm for structural image editing. SIGGRAPH 28(3), 24 (2009) Barnes, C., Shechtman, E., Finkelstein, A., Goldman, D.: Patchmatch: a randomized correspondence algorithm for structural image editing. SIGGRAPH 28(3), 24 (2009)
11.
Zurück zum Zitat Wolberg, G.: Digital Image Warping. IEEE Computer Society Press, Los Alamitos (1990) Wolberg, G.: Digital Image Warping. IEEE Computer Society Press, Los Alamitos (1990)
12.
Zurück zum Zitat Shechtman, E., Rav-Acha, A., Irani, M., Seitz, S.: Regenerative morphing. In: CVPR, San-Francisco, CA, June 2010 Shechtman, E., Rav-Acha, A., Irani, M., Seitz, S.: Regenerative morphing. In: CVPR, San-Francisco, CA, June 2010
13.
Zurück zum Zitat Kemelmacher-Shlizerman, I., Shechtman, E., Garg, R., Seitz, S.M.: Exploring photobios. In: SIGGRAPH, vol. 30, p. 61 (2011) Kemelmacher-Shlizerman, I., Shechtman, E., Garg, R., Seitz, S.M.: Exploring photobios. In: SIGGRAPH, vol. 30, p. 61 (2011)
14.
Zurück zum Zitat Olshausen, B.A., Field, D.J.: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996)CrossRef Olshausen, B.A., Field, D.J.: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996)CrossRef
15.
Zurück zum Zitat Portilla, J., Simoncelli, E.P.: A parametric texture model based on joint statistics of complex wavelet coefficients. IJCV 40(1), 49–70 (2000)CrossRefMATH Portilla, J., Simoncelli, E.P.: A parametric texture model based on joint statistics of complex wavelet coefficients. IJCV 40(1), 49–70 (2000)CrossRefMATH
16.
Zurück zum Zitat Zoran, D., Weiss, Y.: From learning models of natural image patches to whole image restoration. In: Proceedings of ICCV, pp. 479–486 (2011) Zoran, D., Weiss, Y.: From learning models of natural image patches to whole image restoration. In: Proceedings of ICCV, pp. 479–486 (2011)
17.
Zurück zum Zitat Roth, S., Black, M.J.: Fields of experts: a framework for learning image priors. In: CVPR (2005) Roth, S., Black, M.J.: Fields of experts: a framework for learning image priors. In: CVPR (2005)
18.
Zurück zum Zitat Zhu, J.Y., Krähenbühl, P., Shechtman, E., Efros, A.A.: Learning a discriminative model for the perception of realism in composite images. In: ICCV (2015) Zhu, J.Y., Krähenbühl, P., Shechtman, E., Efros, A.A.: Learning a discriminative model for the perception of realism in composite images. In: ICCV (2015)
19.
Zurück zum Zitat Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)MathSciNetCrossRefMATH Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)MathSciNetCrossRefMATH
20.
Zurück zum Zitat Salakhutdinov, R., Hinton, G.E.: Deep boltzmann machines. In: AISTATS (2009) Salakhutdinov, R., Hinton, G.E.: Deep boltzmann machines. In: AISTATS (2009)
21.
Zurück zum Zitat Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.A.: Extracting and composing robust features with denoising autoencoders. In: ICML (2008) Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.A.: Extracting and composing robust features with denoising autoencoders. In: ICML (2008)
22.
Zurück zum Zitat Bengio, Y., Laufer, E., Alain, G., Yosinski, J.: Deep generative stochastic networks trainable by backprop. In: ICML, pp. 226–234 (2014) Bengio, Y., Laufer, E., Alain, G., Yosinski, J.: Deep generative stochastic networks trainable by backprop. In: ICML, pp. 226–234 (2014)
23.
Zurück zum Zitat Gregor, K., Danihelka, I., Graves, A., Wierstra, D.: Draw: a recurrent neural network for image generation. In: ICML (2015) Gregor, K., Danihelka, I., Graves, A., Wierstra, D.: Draw: a recurrent neural network for image generation. In: ICML (2015)
24.
Zurück zum Zitat Dosovitskiy, A., Tobias Springenberg, J., Brox, T.: Learning to generate chairs with convolutional neural networks. In: CVPR, pp. 1538–1546 (2015) Dosovitskiy, A., Tobias Springenberg, J., Brox, T.: Learning to generate chairs with convolutional neural networks. In: CVPR, pp. 1538–1546 (2015)
25.
Zurück zum Zitat Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. arXiv preprint arXiv:1603.08155 (2016) Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. arXiv preprint arXiv:​1603.​08155 (2016)
26.
Zurück zum Zitat Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: NIPS, pp. 1097–1105 (2012) Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: NIPS, pp. 1097–1105 (2012)
27.
Zurück zum Zitat Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: CVPR, pp. 248–255. IEEE (2009) Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: CVPR, pp. 248–255. IEEE (2009)
28.
Zurück zum Zitat Byrd, R.H., Lu, P., Nocedal, J., Zhu, C.: A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 16(5), 1190–1208 (1995)MathSciNetCrossRefMATH Byrd, R.H., Lu, P., Nocedal, J., Zhu, C.: A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 16(5), 1190–1208 (1995)MathSciNetCrossRefMATH
29.
Zurück zum Zitat Gershman, S.J., Goodman, N.D.: Amortized inference in probabilistic reasoning. In: Proceedings of the 36th Annual Conference of the Cognitive Science Society (2014) Gershman, S.J., Goodman, N.D.: Amortized inference in probabilistic reasoning. In: Proceedings of the 36th Annual Conference of the Cognitive Science Society (2014)
30.
Zurück zum Zitat Brox, T., Bruhn, A., Papenberg, N., Weickert, J.: High accuracy optical flow estimation based on a theory for warping. In: Pajdla, T., Matas, J.G. (eds.) ECCV 2004. LNCS, vol. 3024, pp. 25–36. Springer, Heidelberg (2004)CrossRef Brox, T., Bruhn, A., Papenberg, N., Weickert, J.: High accuracy optical flow estimation based on a theory for warping. In: Pajdla, T., Matas, J.G. (eds.) ECCV 2004. LNCS, vol. 3024, pp. 25–36. Springer, Heidelberg (2004)CrossRef
31.
Zurück zum Zitat Bruhn, A., Weickert, J., Schnörr, C.: Lucas/kanade meets horn/schunck: combining local and global optic flow methods. IJCV 61(3), 211–231 (2005)CrossRef Bruhn, A., Weickert, J., Schnörr, C.: Lucas/kanade meets horn/schunck: combining local and global optic flow methods. IJCV 61(3), 211–231 (2005)CrossRef
32.
Zurück zum Zitat Shih, Y., Paris, S., Durand, F., Freeman, W.T.: Data-driven hallucination of different times of day from a single outdoor photo. ACM Trans. Graph. (TOG) 32(6), 200 (2013)CrossRef Shih, Y., Paris, S., Durand, F., Freeman, W.T.: Data-driven hallucination of different times of day from a single outdoor photo. ACM Trans. Graph. (TOG) 32(6), 200 (2013)CrossRef
33.
Zurück zum Zitat He, K., Sun, J., Tang, X.: Guided image filtering. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 1–14. Springer, Heidelberg (2010)CrossRef He, K., Sun, J., Tang, X.: Guided image filtering. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 1–14. Springer, Heidelberg (2010)CrossRef
34.
Zurück zum Zitat Parikh, D., Grauman, K.: Relative attributes. In: ICCV, pp. 503–510. IEEE (2011) Parikh, D., Grauman, K.: Relative attributes. In: ICCV, pp. 503–510. IEEE (2011)
35.
Zurück zum Zitat Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR, vol. 1, pp. 886–893. IEEE (2005) Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR, vol. 1, pp. 886–893. IEEE (2005)
36.
Zurück zum Zitat Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. ICML 37, 448–456 (2015) Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. ICML 37, 448–456 (2015)
37.
Zurück zum Zitat Yu, A., Grauman, K.: Fine-grained visual comparisons with local learning. In: CVPR, pp. 192–199 (2014) Yu, A., Grauman, K.: Fine-grained visual comparisons with local learning. In: CVPR, pp. 192–199 (2014)
38.
Zurück zum Zitat Yu, F., Zhang, Y., Song, S., Seff, A., Xiao, J.: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365 (2015) Yu, F., Zhang, Y., Song, S., Seff, A., Xiao, J.: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:​1506.​03365 (2015)
39.
Zurück zum Zitat Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: NIPS, pp. 487–495 (2014) Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: NIPS, pp. 487–495 (2014)
40.
Zurück zum Zitat Seitz, S.M., Dyer, C.R.: View Morphing, pp. 21–30, New York (1996) Seitz, S.M., Dyer, C.R.: View Morphing, pp. 21–30, New York (1996)
41.
Zurück zum Zitat Sun, X., Wang, C., Xu, C., Zhang, L.: Indexing billions of images for sketch-based retrieval. In: ACM MM (2013) Sun, X., Wang, C., Xu, C., Zhang, L.: Indexing billions of images for sketch-based retrieval. In: ACM MM (2013)
42.
Zurück zum Zitat Zhu, J.Y., Lee, Y.J., Efros, A.A.: Averageexplorer: interactive exploration and alignment of visual data collections. SIGGRAPH 33(4) (2014) Zhu, J.Y., Lee, Y.J., Efros, A.A.: Averageexplorer: interactive exploration and alignment of visual data collections. SIGGRAPH 33(4) (2014)
43.
Zurück zum Zitat Risser, E., Han, C., Dahyot, R., Grinspun, E.: Synthesizing structured image hybrids. SIGGRAPH 29(4), 85:1–85:6 (2010) Risser, E., Han, C., Dahyot, R., Grinspun, E.: Synthesizing structured image hybrids. SIGGRAPH 29(4), 85:1–85:6 (2010)
44.
Zurück zum Zitat Liu, C., Yuen, J., Torralba, A.: Sift flow: dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2011)CrossRef Liu, C., Yuen, J., Torralba, A.: Sift flow: dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2011)CrossRef
45.
Zurück zum Zitat Kim, J., Liu, C., Sha, F., Grauman, K.: Deformable spatial pyramid matching for fast dense correspondences. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2307–2314 (2013) Kim, J., Liu, C., Sha, F., Grauman, K.: Deformable spatial pyramid matching for fast dense correspondences. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2307–2314 (2013)
Metadaten
Titel
Generative Visual Manipulation on the Natural Image Manifold
verfasst von
Jun-Yan Zhu
Philipp Krähenbühl
Eli Shechtman
Alexei A. Efros
Copyright-Jahr
2016
DOI
https://doi.org/10.1007/978-3-319-46454-1_36