Published in: Neural Processing Letters 6/2022

10 June 2022

Precise Correspondence Enhanced GAN for Person Image Generation

Authors: Ji Liu, Yuesheng Zhu

Abstract

Generating realistic person images in pose-guided person image generation is challenging, especially for local body parts. Two factors account for this: (1) the difficulty of modeling long-range relations, and (2) the lack of precise local correspondence. We propose a Precise Correspondence Enhanced Generative Adversarial Network (PCE-GAN) to address these problems. PCE-GAN comprises a global branch and a local branch: the former maintains the global consistency of the generated person image, and the latter captures precise local correspondence. More specifically, long-range relations are established by the spatial-channel multi-layer perceptron (MLP) module in the transformation blocks of both branches, while precise local correspondence is captured by the local branch’s local-pair building and local-guiding modules. Finally, the outputs of the two branches are combined so that they mutually benefit from the enhanced correspondences. On the Market-1501 dataset, PCE-GAN quantitatively outperforms previous state-of-the-art methods, improving SSIM and IS scores by \(5.53\%\) and \(7.74\%\), respectively. Qualitative results on both the Market-1501 and DeepFashion datasets further validate the effectiveness of our method.
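
The abstract does not describe the internals of the spatial-channel MLP module, so the following is only a minimal sketch of how such a block could model long-range relations, assuming an MLP-Mixer-style design: one MLP mixes information across all spatial positions, and a second MLP mixes across channels. The function names, hidden width, GELU activation, residual connections, and the omission of normalization layers are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def init_mlp(rng, dim, hidden, scale=0.1):
    # Hypothetical initializer: weights and biases of a two-layer MLP (dim -> hidden -> dim).
    return (
        rng.normal(0.0, scale, (dim, hidden)), np.zeros(hidden),
        rng.normal(0.0, scale, (hidden, dim)), np.zeros(dim),
    )


def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron applied along the last axis of x.
    return gelu(x @ w1 + b1) @ w2 + b2


def spatial_channel_mlp(feat, spatial_params, channel_params):
    """Assumed MLP-Mixer-style block for a (C, H, W) feature map."""
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w)  # (C, N) with N = H * W positions

    # Spatial mixing: the MLP runs along the N axis, so every output
    # position depends on every input position (long-range relation).
    tokens = tokens + mlp(tokens, *spatial_params)

    # Channel mixing: transpose so the MLP runs along the C axis.
    tokens = tokens + mlp(tokens.T, *channel_params).T

    return tokens.reshape(c, h, w)


rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
feat = rng.normal(size=(C, H, W))
spatial_params = init_mlp(rng, H * W, 32)
channel_params = init_mlp(rng, C, 32)
out = spatial_channel_mlp(feat, spatial_params, channel_params)
print(out.shape)
```

In this sketch, the spatial MLP connects every pair of positions in a single layer. This is one way a distant body part in the source feature map could influence a given output location without stacking many convolutions.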

Metadata
Title
Precise Correspondence Enhanced GAN for Person Image Generation
Authors
Ji Liu
Yuesheng Zhu
Publication date
10 June 2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 6/2022
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-10853-2