
Artistic Style Transfer for Videos and Spherical Images

Authors: Manuel Ruder, Alexey Dosovitskiy, Thomas Brox

Published in: International Journal of Computer Vision | Issue 11/2018

Published: 21 April 2018


Abstract

Manually re-drawing an image in a certain artistic style takes a professional artist a long time. Doing this for a video sequence single-handedly is beyond imagination. We present two computational approaches that transfer the style from one image (for example, a painting) to a whole video sequence. In our first approach, we adapt the original image style transfer technique by Gatys et al., which is based on energy minimization, to videos. We introduce new ways of initialization and new loss functions to generate consistent and stable stylized video sequences even in cases with large motion and strong occlusion. Our second approach formulates video stylization as a learning problem. We propose a deep network architecture and training procedures that allow us to stylize arbitrary-length videos in a consistent and stable way, and nearly in real time. We show that the proposed methods clearly outperform simpler baselines both qualitatively and quantitatively. Finally, we propose a way to adapt these approaches to 360° images and videos as they emerge with recent virtual reality hardware.
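The abstract does not spell out the new loss functions. As a rough illustration only, one common way to encourage frame-to-frame consistency under occlusion is to penalize the difference between the current stylized frame and the previous stylized frame warped by optical flow, while masking out pixels where the flow is unreliable. The function name, array shapes, and normalization below are assumptions made for this sketch, not details taken from the text above.

```python
import numpy as np

def temporal_consistency_loss(current, warped_previous, weights):
    """Mean weighted squared difference between two frames.

    current, warped_previous: float arrays of shape (H, W, C).
    weights: float array of shape (H, W); 0 where the optical flow is
             assumed unreliable (e.g. occlusions), 1 elsewhere.
    """
    diff_sq = (current - warped_previous) ** 2
    # Broadcast the per-pixel weights across the channel axis.
    weighted = weights[..., None] * diff_sq
    # Normalize by the total number of values in the frame.
    return weighted.sum() / current.size


# Tiny example: two 2x2 RGB frames that differ only at pixel (0, 0).
prev_warped = np.zeros((2, 2, 3))
curr = np.zeros((2, 2, 3))
curr[0, 0] = 1.0

all_valid = np.ones((2, 2))
occluded = all_valid.copy()
occluded[0, 0] = 0.0  # treat pixel (0, 0) as occluded

loss_valid = temporal_consistency_loss(curr, prev_warped, all_valid)
loss_occluded = temporal_consistency_loss(curr, prev_warped, occluded)
```

With every pixel marked valid, the changed pixel contributes to the penalty. Once it is marked occluded, the penalty vanishes, so the stylization is free to change there.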



https://github.com/manuelruder/artistic-videos.

https://github.com/jcjohnson/neural-style.

https://github.com/manuelruder/fast-artistic-videos.

https://github.com/jcjohnson/fast-neural-style.

In the pretrained VGG models, which we use for our perceptual loss, values are centered around the mean pixel of the ImageNet dataset.
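As a minimal sketch of what this centering could look like in practice: the mean pixel is subtracted from each image before it is passed to the network, and added back to turn a network-space image into displayable pixel values. The numeric mean below is the value commonly distributed with the original VGG weights (RGB order, 0 to 255 scale). It is an assumption here and is not stated in the footnote, as are the helper names `center` and `uncenter`.

```python
import numpy as np

# Assumed per-channel ImageNet mean pixel in RGB order on a 0-255 scale.
IMAGENET_MEAN_RGB = np.array([123.68, 116.779, 103.939])

def center(image_rgb):
    """Shift an (H, W, 3) RGB image so values are centered on the mean pixel."""
    return image_rgb.astype(np.float64) - IMAGENET_MEAN_RGB

def uncenter(centered):
    """Undo the shift and clip to the valid 0-255 pixel range."""
    return np.clip(centered + IMAGENET_MEAN_RGB, 0.0, 255.0)


# Example: a uniform mid-gray image.
gray = np.full((2, 2, 3), 128, dtype=np.uint8)
centered = center(gray)
restored = uncenter(centered)
```

Some VGG ports expect BGR channel order instead, so the order of the mean vector should match whatever convention the chosen weights use.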

Butler, D. J., Wulff, J., Stanley, G. B., & Black, M. J. (2012). A naturalistic open source movie for optical flow evaluation. In ECCV (pp. 611–625).

Chen, D., Liao, J., Yuan, L., Yu, N., & Hua, G. (2017). Coherent online video style transfer. In ICCV (pp. 1114–1123).

Collobert, R., Kavukcuoglu, K., & Farabet, C. (2011). Torch7: A Matlab-like environment for machine learning. In BigLearn, NIPS Workshop.

Gatys, L. A., Ecker, A. S., & Bethge, M. (2015). Texture synthesis using convolutional neural networks. In NIPS (pp. 262–270).

Gatys, L. A., Ecker, A. S., & Bethge, M. (2016). Image style transfer using convolutional neural networks. In CVPR (pp. 2414–2423).

Ghiasi, G., Lee, H., Kudlur, M., Dumoulin, V., & Shlens, J. (2017). Exploring the structure of a real-time, arbitrary neural artistic stylization network. In BMVC.

Gupta, A., Johnson, J., Alahi, A., & Fei-Fei, L. (2017). Characterizing and improving stability in neural style transfer. In ICCV (pp. 4087–4096).

Hays, J., & Essa, I. (2004). Image and video based painterly animation. In Proceedings of the 3rd international symposium on non-photorealistic animation and rendering, NPAR (pp. 113–120).

Huang, H., Wang, H., Luo, W., Ma, L., Jiang, W., Zhu, X., Li, Z., & Liu, W. (2017). Real-time neural style transfer for videos. In CVPR (pp. 7044–7052).

Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., & Brox, T. (2017). FlowNet 2.0: Evolution of optical flow estimation with deep networks. In CVPR.

Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML.

Johnson, J., Alahi, A., & Fei-Fei, L. (2016). Perceptual losses for real-time style transfer and super-resolution. In ECCV (pp. 694–711).

Li, C., & Wand, M. (2016a). Combining Markov random fields and convolutional neural networks for image synthesis. In CVPR (pp. 2479–2486).

Li, C., & Wand, M. (2016b). Precomputed real-time texture synthesis with Markovian generative adversarial networks. In ECCV (pp. 702–716).

Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. In ECCV.

Litwinowicz, P. (1997). Processing images and video for an impressionist effect. In Proceedings of the 24th annual conference on computer graphics and interactive techniques, SIGGRAPH (pp. 407–414).

Luan, F., Paris, S., Shechtman, E., & Bala, K. (2017). Deep photo style transfer. arXiv:1703.07511.

Marszalek, M., Laptev, I., & Schmid, C. (2009). Actions in context. In CVPR (pp. 2929–2936).

Nikulin, Y., & Novak, R. (2016). Exploring the neural algorithm of artistic style. CoRR. arXiv:1602.07188.

O’Donovan, P., & Hertzmann, A. (2012). Anipaint: Interactive painterly animation from video. Transactions on Visualization and Computer Graphics, 18(3), 475–487.

Revaud, J., Weinzaepfel, P., Harchaoui, Z., & Schmid, C. (2015). EpicFlow: Edge-preserving interpolation of correspondences for optical flow. In CVPR (pp. 1164–1172).

Ruder, M., Dosovitskiy, A., & Brox, T. (2016). Artistic style transfer for videos. In GCPR (pp. 26–36).

Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In ICLR.

Sundaram, N., Brox, T., & Keutzer, K. (2010). Dense point trajectories by GPU-accelerated large displacement optical flow. In ECCV (pp. 438–451).

Ulyanov, D., Lebedev, V., Vedaldi, A., & Lempitsky, V. S. (2016). Texture networks: Feed-forward synthesis of textures and stylized images. In ICML (pp. 1349–1357).

Ulyanov, D., Vedaldi, A., & Lempitsky, V. S. (2016). Instance normalization: The missing ingredient for fast stylization. CoRR. arXiv:1607.08022.

Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013). DeepFlow: Large displacement optical flow with deep matching. In ICCV (pp. 1385–1392).

Yu, F., & Koltun, V. (2016). Multi-scale context aggregation by dilated convolutions. In ICLR.

Zhang, H., & Dana, K. J. (2017). Multi-style generative network for real-time transfer. CoRR. arXiv:1703.06953.

Title: Artistic Style Transfer for Videos and Spherical Images
Authors: Manuel Ruder, Alexey Dosovitskiy, Thomas Brox
Publication date: 21 April 2018
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 11/2018
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-018-1089-z
