Skip to main content
Top
Published in: Soft Computing 23/2020

12-06-2020 | Methodologies and Application

Multimodal image-to-image translation between domains with high internal variability

Authors: Jian Wang, Jiancheng Lv, Xue Yang, Chenwei Tang, Xi Peng

Published in: Soft Computing | Issue 23/2020

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Multimodal image-to-image translation based on generative adversarial networks (GANs) shows suboptimal performance in the visual domains with high internal variability, e.g., translation from multiple breeds of cats to multiple breeds of dogs. To alleviate this problem, we recast the training procedure as modeling distinct distributions which are observed sequentially, for example, when different classes are encountered over time. As a result, the discriminator may forget about the previous target distributions, known as catastrophic forgetting, leading to non-/slow convergence. Through experimental observation, we found that the discriminator does not always forget the previously learned distributions during training. Therefore, we propose a novel generator regulating GAN (GR-GAN). The proposed method encourages the discriminator to teach the generator more effectively when it remembers more of the previously learned distributions, while discouraging the discriminator to guide the generator when catastrophic forgetting happens on the discriminator. Both qualitative and quantitative results show that the proposed method is significantly superior to the state-of-the-art methods in handling the image data that are with high variability.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Footnotes
1
Considering the memory consumption, the batch size of image translation models is usually very small, e.g., 1.
 
2
Some models use LSGAN (Mao et al. 2017) objective.
 
4
All testers are independent of the authors’ research group.
 
Literature
go back to reference Arjovsky M, Chintala S, Bottou L (2007) Wasserstein generative adversarial networks. In: International conference on machine learning, pp 214–223 Arjovsky M, Chintala S, Bottou L (2007) Wasserstein generative adversarial networks. In: International conference on machine learning, pp 214–223
go back to reference Bousmalis K, Silberman N, Dohan D, Erhan D, Krishnan D (2017) Unsupervised pixel-level domain adaptation with generative adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3722–3731 Bousmalis K, Silberman N, Dohan D, Erhan D, Krishnan D (2017) Unsupervised pixel-level domain adaptation with generative adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3722–3731
go back to reference Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 248–255 Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 248–255
go back to reference Dong C, Loy CC, He K, Tang X (2014) Learning a deep convolutional network for image super-resolution. In: European conference on computer vision, pp 184–199 Dong C, Loy CC, He K, Tang X (2014) Learning a deep convolutional network for image super-resolution. In: European conference on computer vision, pp 184–199
go back to reference French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cognit Sci 3(4):128–135CrossRef French RM (1999) Catastrophic forgetting in connectionist networks. Trends Cognit Sci 3(4):128–135CrossRef
go back to reference Ganin Y, Ustinova E, Ajakan H, Germain P, Larochelle H, Laviolette F, Marchand M, Lempitsky V (2016) Domain-adversarial training of neural networks. J Mach Learn Res 17(1):59.1–59.35 Ganin Y, Ustinova E, Ajakan H, Germain P, Larochelle H, Laviolette F, Marchand M, Lempitsky V (2016) Domain-adversarial training of neural networks. J Mach Learn Res 17(1):59.1–59.35
go back to reference Gatys LA, Ecker AS, Bethge M (2016) Image style transfer using convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2414–2423 Gatys LA, Ecker AS, Bethge M (2016) Image style transfer using convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2414–2423
go back to reference Gonzalez-Garcia A, van de Weijer J, Bengio Y (2018) Image-to-image translation for cross-domain disentanglement. In: Advances in neural information processing systems 31: Annual conference on neural information processing Systems 2018, pp 1287–1298 Gonzalez-Garcia A, van de Weijer J, Bengio Y (2018) Image-to-image translation for cross-domain disentanglement. In: Advances in neural information processing systems 31: Annual conference on neural information processing Systems 2018, pp 1287–1298
go back to reference Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Advances in neural information processing systems 27: Annual conference on neural information processing systems 2014, pp 2672–2680 Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Advances in neural information processing systems 27: Annual conference on neural information processing systems 2014, pp 2672–2680
go back to reference Hinterstoisser S, Lepetit V, Ilic S, Holzer S, Bradski G, Konolige K, Navab N (2012) Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes. In: Asian conference on computer vision, pp 548–562 Hinterstoisser S, Lepetit V, Ilic S, Holzer S, Bradski G, Konolige K, Navab N (2012) Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes. In: Asian conference on computer vision, pp 548–562
go back to reference Huang X, Liu MY, Belongie S, Kautz J (2018) Multimodal unsupervised image-to-image translation. In: Proceedings of the European conference on computer vision (ECCV), pp 172–189 Huang X, Liu MY, Belongie S, Kautz J (2018) Multimodal unsupervised image-to-image translation. In: Proceedings of the European conference on computer vision (ECCV), pp 172–189
go back to reference Isola P, Zhu JY, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1125–1134 Isola P, Zhu JY, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1125–1134
go back to reference Kim T, Cha M, Kim H, Lee JK, Kim J (2017) Learning to discover cross-domain relations with generative adversarial networks. In: Proceedings of the 34th international conference on machine learning, vol 70, pp 1857–1865 Kim T, Cha M, Kim H, Lee JK, Kim J (2017) Learning to discover cross-domain relations with generative adversarial networks. In: Proceedings of the 34th international conference on machine learning, vol 70, pp 1857–1865
go back to reference Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, Quan J, Ramalho T, Grabska-Barwinska A et al (2017) Overcoming catastrophic forgetting in neural networks. Proc Natl Acad Sci 114(13):3521–3526MathSciNetCrossRef Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, Quan J, Ramalho T, Grabska-Barwinska A et al (2017) Overcoming catastrophic forgetting in neural networks. Proc Natl Acad Sci 114(13):3521–3526MathSciNetCrossRef
go back to reference Larsson G, Maire M, Shakhnarovich G (2016) Learning representations for automatic colorization. In: European conference on computer vision, pp 577–593 Larsson G, Maire M, Shakhnarovich G (2016) Learning representations for automatic colorization. In: European conference on computer vision, pp 577–593
go back to reference LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551CrossRef LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551CrossRef
go back to reference LeCun Y, Bottou L, Bengio Y, Haffner P et al (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324CrossRef LeCun Y, Bottou L, Bengio Y, Haffner P et al (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324CrossRef
go back to reference Lee HY, Tseng HY, Huang JB, Singh M, Yang MH (2018) Diverse image-to-image translation via disentangled representations. In: Proceedings of the European conference on computer vision (ECCV), pp 35–51 Lee HY, Tseng HY, Huang JB, Singh M, Yang MH (2018) Diverse image-to-image translation via disentangled representations. In: Proceedings of the European conference on computer vision (ECCV), pp 35–51
go back to reference Liu MY, Breuel T, Kautz J (2017) Unsupervised image-to-image translation networks. In: Advances in neural information processing systems 30: Annual conference on neural information processing systems 2017, pp 700–708 Liu MY, Breuel T, Kautz J (2017) Unsupervised image-to-image translation networks. In: Advances in neural information processing systems 30: Annual conference on neural information processing systems 2017, pp 700–708
go back to reference Liu D, Fu J, Qu Q, Lv J (2018) BFGAN: Backward and forward generative adversarial networks for lexically constrained sentence generation. IEEE ACM Trans Audio Speech Lang Process 27(12):2350–2361CrossRef Liu D, Fu J, Qu Q, Lv J (2018) BFGAN: Backward and forward generative adversarial networks for lexically constrained sentence generation. IEEE ACM Trans Audio Speech Lang Process 27(12):2350–2361CrossRef
go back to reference Ma L, Jia X, Georgoulis S, Tuytelaars T, Van Gool L (2019) Exemplar guided unsupervised image-to-image translation with semantic consistency. In: International conference on learning representations Ma L, Jia X, Georgoulis S, Tuytelaars T, Van Gool L (2019) Exemplar guided unsupervised image-to-image translation with semantic consistency. In: International conference on learning representations
go back to reference Mao X, Li Q, Xie H, Lau RY, Wang Z, Paul Smolley S (2017) Least squares generative adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp 2794–2802 Mao X, Li Q, Xie H, Lau RY, Wang Z, Paul Smolley S (2017) Least squares generative adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp 2794–2802
go back to reference Parkhi OM, Vedaldi A, Zisserman A, Jawahar CV (2012) Cats and dogs. In: Proceedings of the IEEE conference on computer vision and pattern recognition Parkhi OM, Vedaldi A, Zisserman A, Jawahar CV (2012) Cats and dogs. In: Proceedings of the IEEE conference on computer vision and pattern recognition
go back to reference Pathak D, Krahenbuhl P, Donahue J, Darrell T, Efros AA (2016) Context encoders: feature learning by inpainting. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2536–2544 Pathak D, Krahenbuhl P, Donahue J, Darrell T, Efros AA (2016) Context encoders: feature learning by inpainting. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2536–2544
go back to reference Radford A, Metz L, Chintala S (2016) Unsupervised representation learning with deep convolutional generative adversarial networks. In: International conference on learning representations Radford A, Metz L, Chintala S (2016) Unsupervised representation learning with deep convolutional generative adversarial networks. In: International conference on learning representations
go back to reference Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X (2016) Improved techniques for training gans. In: Advances in neural information processing systems, pp 2234–2242 Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X (2016) Improved techniques for training gans. In: Advances in neural information processing systems, pp 2234–2242
go back to reference Sangkloy P, Lu J, Fang C, Yu F, Hays J (2017) Scribbler: controlling deep image synthesis with sketch and color. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5400–5409 Sangkloy P, Lu J, Fang C, Yu F, Hays J (2017) Scribbler: controlling deep image synthesis with sketch and color. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5400–5409
go back to reference Tang C, Xu K, He Z, Lv J (2019) Exaggerated portrait caricatures synthesis. Inf Sci 502:363–375CrossRef Tang C, Xu K, He Z, Lv J (2019) Exaggerated portrait caricatures synthesis. Inf Sci 502:363–375CrossRef
go back to reference Thanh-Tung H, Tran T, Venkatesh S (2018) On catastrophic forgetting and mode collapse in generative adversarial networks. arXiv preprint arXiv:1807.04015 Thanh-Tung H, Tran T, Venkatesh S (2018) On catastrophic forgetting and mode collapse in generative adversarial networks. arXiv preprint arXiv:​1807.​04015
go back to reference Wu C, Herranz L, Liu X, Wang Y, van de Weijer J, Raducanu B (2018) Memory replay gans: learning to generate images from new categories without forgetting. In: Conference on neural information processing systems (NIPS) Wu C, Herranz L, Liu X, Wang Y, van de Weijer J, Raducanu B (2018) Memory replay gans: learning to generate images from new categories without forgetting. In: Conference on neural information processing systems (NIPS)
go back to reference Yi Z, Zhang H, Tan P, Gong M (2017) Dualgan: unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE international conference on computer vision, pp 2849–2857 Yi Z, Zhang H, Tan P, Gong M (2017) Dualgan: unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE international conference on computer vision, pp 2849–2857
go back to reference Yu X, Ying Z, Li G, Gao W (2018) Multi-mapping image-to-image translation with central biasing normalization. arXiv preprint arXiv:1806.10050 Yu X, Ying Z, Li G, Gao W (2018) Multi-mapping image-to-image translation with central biasing normalization. arXiv preprint arXiv:​1806.​10050
go back to reference Zhang R, Isola P, Efros AA (2016) Colorful image colorization. In: European conference on computer vision, pp 649–666 Zhang R, Isola P, Efros AA (2016) Colorful image colorization. In: European conference on computer vision, pp 649–666
go back to reference Zhang R, Isola P, Efros AA, Shechtman E, Wang O (2018) The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 586–595 Zhang R, Isola P, Efros AA, Shechtman E, Wang O (2018) The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 586–595
go back to reference Zhu JY, Krähenbühl P, Shechtman E, Efros AA (2016) Generative visual manipulation on the natural image manifold. In: European conference on computer vision, pp. 597–613 Zhu JY, Krähenbühl P, Shechtman E, Efros AA (2016) Generative visual manipulation on the natural image manifold. In: European conference on computer vision, pp. 597–613
go back to reference Zhu JY, Park T, Isola P, Efros AA (2017a) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp 2223–2232 Zhu JY, Park T, Isola P, Efros AA (2017a) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp 2223–2232
go back to reference Zhu JY, Zhang R, Pathak D, Darrell T, Efros AA, Wang O, Shecht man E (2017b) Toward multimodal image-to-image translation. In: Advances in neural information processing systems 30: Annual conference on neural information processing systems 2017, pp 465–476 Zhu JY, Zhang R, Pathak D, Darrell T, Efros AA, Wang O, Shecht man E (2017b) Toward multimodal image-to-image translation. In: Advances in neural information processing systems 30: Annual conference on neural information processing systems 2017, pp 465–476
Metadata
Title
Multimodal image-to-image translation between domains with high internal variability
Authors
Jian Wang
Jiancheng Lv
Xue Yang
Chenwei Tang
Xi Peng
Publication date
12-06-2020
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 23/2020
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-020-05073-6

Other articles of this Issue 23/2020

Soft Computing 23/2020 Go to the issue

Premium Partner