2018 | Original Paper | Book Chapter

Unsupervised Image-to-Image Translation with Stacked Cycle-Consistent Adversarial Networks

Authors: Minjun Li, Haozhi Huang, Lin Ma, Wei Liu, Tong Zhang, Yugang Jiang

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing


Abstract

Recent studies on unsupervised image-to-image translation have made remarkable progress by training a pair of generative adversarial networks with a cycle-consistent loss. However, such unsupervised methods may generate inferior results when the image resolution is high or when the two image domains differ significantly in appearance, such as in the translations between semantic layouts and natural images in the Cityscapes dataset. In this paper, we propose novel Stacked Cycle-Consistent Adversarial Networks (SCANs) that decompose a single translation into multi-stage transformations, which not only boost the image translation quality but also enable higher-resolution image-to-image translation in a coarse-to-fine fashion. Moreover, to properly exploit the information from the previous stage, an adaptive fusion block is devised to learn a dynamic integration of the current stage's output and the previous stage's output. Experiments on multiple datasets demonstrate that our proposed approach improves translation quality compared with previous single-stage unsupervised methods.
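The adaptive fusion described in the abstract can be viewed as a learned per-pixel convex combination of two stage outputs. The minimal NumPy sketch below illustrates only that blending step; in the paper the weight map would be predicted by a learned network from the two stages' outputs, whereas here it is passed in as a given input, and the function name and array shapes are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def adaptive_fusion(prev_out: np.ndarray, curr_out: np.ndarray,
                    alpha: np.ndarray) -> np.ndarray:
    """Per-pixel blend of the previous stage's coarse output and the
    current stage's refinement, weighted by a map alpha in [0, 1]:
    fused = (1 - alpha) * previous + alpha * current."""
    return (1.0 - alpha) * prev_out + alpha * curr_out

# Toy example with 4x4 single-channel "images".
prev_out = np.zeros((4, 4))       # coarse result from an earlier stage
curr_out = np.ones((4, 4))        # refined result from the current stage
alpha = np.full((4, 4), 0.25)     # trust the refinement 25% at every pixel
fused = adaptive_fusion(prev_out, curr_out, alpha)
print(fused[0, 0])  # 0.25
```

Because alpha varies per pixel, the block can keep the coarse result where the refinement is unreliable and adopt the refinement elsewhere, rather than committing to a single global mixing weight.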
Metadata
Title
Unsupervised Image-to-Image Translation with Stacked Cycle-Consistent Adversarial Networks
Authors
Minjun Li
Haozhi Huang
Lin Ma
Wei Liu
Tong Zhang
Yugang Jiang
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01240-3_12
