2016 | OriginalPaper | Chapter

Generative Visual Manipulation on the Natural Image Manifold

Authors: Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, Alexei A. Efros

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing

Abstract

Realistic image manipulation is challenging because it requires modifying the image appearance in a user-controlled way while preserving the realism of the result. Unless the user has considerable artistic skill, it is easy to “fall off” the manifold of natural images while editing. In this paper, we propose to learn the natural image manifold directly from data using a generative adversarial neural network. We then define a class of image editing operations and constrain their output to lie on that learned manifold at all times. The model automatically adjusts the output, keeping all edits as realistic as possible. All our manipulations are expressed in terms of constrained optimization and are applied in near-real time. We evaluate our algorithm on the task of realistic photo manipulation of shape and color. The presented method can further be used for changing one image to look like another, as well as for generating novel imagery from scratch based on a user’s scribbles.
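
The idea of constraining edits to a learned manifold can be illustrated with a small toy sketch. It is not the authors' implementation: the generator below is a hypothetical fixed linear map standing in for a trained generative adversarial network, and the names `generate`, `edit_loss`, `edit`, the weights `W` and `B`, and the penalty weight `lam` are all assumptions made for illustration. The sketch optimizes a latent code so that a user-specified pixel approaches a target value; because the output is always `generate(z)`, it stays on the model's manifold by construction.

```python
# Toy sketch: edit an image by optimizing a latent code z, so the result
# always lies on the "manifold" {generate(z)}. The generator is a hypothetical
# linear map (assumed values), standing in for a trained GAN generator.

W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0],
     [1.0, -1.0]]          # 4 "pixels", 2 latent dimensions
B = [0.5, 0.5, 0.0, 0.0]


def generate(z):
    """Map a latent code z to an image (a list of 4 pixel values)."""
    return [sum(w * zj for w, zj in zip(row, z)) + b for row, b in zip(W, B)]


def edit_loss(z, z0, constraints, lam):
    """Squared error on user-constrained pixels plus a penalty for leaving z0."""
    x = generate(z)
    data = sum((x[i] - v) ** 2 for i, v in constraints.items())
    smooth = sum((a - b) ** 2 for a, b in zip(z, z0))
    return data + lam * smooth


def edit_grad(z, z0, constraints, lam):
    """Analytic gradient of edit_loss with respect to z."""
    x = generate(z)
    grad = [2.0 * lam * (a - b) for a, b in zip(z, z0)]
    for i, v in constraints.items():
        for j in range(len(z)):
            grad[j] += 2.0 * (x[i] - v) * W[i][j]
    return grad


def edit(z0, constraints, lam=0.1, step=0.1, iters=200):
    """Plain gradient descent on z; returns the edited latent code."""
    z = list(z0)
    for _ in range(iters):
        g = edit_grad(z, z0, constraints, lam)
        z = [zj - step * gj for zj, gj in zip(z, g)]
    return z


z_start = [0.2, -0.3]
user_constraints = {0: 1.5}   # the user asks pixel 0 to take the value 1.5
z_edit = edit(z_start, user_constraints)
image_edit = generate(z_edit)
```

In this sketch the penalty on the distance from `z_start` keeps the edit close to the original image, so the constrained pixel moves toward the target without reaching it exactly. A real system would replace the linear map with a trained generator and would likely need a more capable optimizer.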

Title: Generative Visual Manipulation on the Natural Image Manifold
Authors: Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, Alexei A. Efros
Publisher: Springer International Publishing
Book: Computer Vision – ECCV 2016
Print ISBN: 978-3-319-46453-4

Electronic ISBN: 978-3-319-46454-1

Copyright Year: 2016
DOI: https://doi.org/10.1007/978-3-319-46454-1_36
