
2019 | OriginalPaper | Chapter

Context-Aware GANs for Image Generation from Multimodal Queries

Authors : Kenki Nakamura, Qiang Ma

Published in: Database and Expert Systems Applications

Publisher: Springer International Publishing


Abstract

In this paper, we propose a novel model of context-aware generative adversarial networks (GANs) to generate images from a multimodal query: a pair consisting of a condition text and a context image. In our study, context is defined as the objects and concepts that appear in the image but not in the text. We construct two object trees expressing the objects and their hierarchical relationships described in the input condition text and context image, respectively. We compare these two object trees to extract the context. Then, based on the extracted context, we generate parameters for the generator in context-aware GANs. To guarantee that the generated image is related to the multimodal query, i.e., both the condition text and context image, we also construct a context discriminator in addition to the condition discriminator, which is similar to that of conditional GANs. The experimental results reveal that the proposed model generates images at higher resolution and containing more contextual information than previous models.
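The core idea of the context definition above can be illustrated with a small sketch. The tree encoding below (a `(label, children)` tuple) and the function names are illustrative assumptions, not the authors' implementation; the sketch only shows the set-difference view of context: objects present in the context-image tree but absent from the condition-text tree.

```python
# Hypothetical sketch: context extraction as a set difference between the
# object labels of the context-image tree and those of the condition-text tree.

def tree_objects(tree):
    """Collect all object labels from a nested (label, children) tree."""
    label, children = tree
    objs = {label}
    for child in children:
        objs |= tree_objects(child)
    return objs

def extract_context(text_tree, image_tree):
    """Context = objects that appear in the image but not in the text."""
    return tree_objects(image_tree) - tree_objects(text_tree)

# Toy example: the text mentions a bird on a branch; the image also shows
# sky and leaves, which therefore constitute the context.
text_tree = ("bird", [("branch", [])])
image_tree = ("bird", [("branch", []), ("sky", []), ("leaves", [])])
print(sorted(extract_context(text_tree, image_tree)))  # ['leaves', 'sky']
```

In the paper, the extracted context then conditions the generator's parameters, while the additional context discriminator checks the generated image against this context rather than against the text alone.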


Footnotes
1
For details of these relationships, please refer to [2].
 
Literature
1. Antol, S., Agrawal, A., Lu, J., Mitchell, M., Batra, D., Zitnick, C.L., Parikh, D.: VQA: visual question answering. ICCV 2015, 2425–2433 (2015)
2. Chen, D., Manning, C.: A fast and accurate dependency parser using neural networks. EMNLP 2014, 740–750 (2014)
3. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. NIPS 2014, 2672–2680 (2014)
4. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
6. Ma, Q.: Utilization and analysis of user generated contents toward personalized and distributed sightseeing. Syst. Control Inf. 63(1), 32–37 (2019)
7. Ma, Q.: Forefront of sightseeing informatics - technologies of collective intelligence for promotion of personalized and distributed sightseeing. Inf. Process. 58(3), 220–226 (2017)
8. Zhuang, C.Y., Ma, Q., Liang, X.F., Yoshikawa, M.: Discovering obscure sightseeing spots by analysis of geo-tagged social images. ASONAM 2015, 590–595 (2015)
9. Nakamura, K., Ma, Q.: Context-aware image generation by using generative adversarial networks. ISM 2017, 516–523 (2017)
10. Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. EMNLP 2014, 1532–1543 (2014)
12. Reed, S.E., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative adversarial text to image synthesis. ICML 2016, 1060–1069 (2016)
13. Teney, D., Liu, L., van den Hengel, A.: Graph-structured representations for visual question answering. CVPR 2017, 3233–3241 (2017)
14. Vondrick, C., Pirsiavash, H., Torralba, A.: Generating videos with scene dynamics. NIPS 2016, 613–621 (2016)
15. Zhang, H., Xu, T., Li, H.: StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks. ICCV 2017, 5908–5916 (2017)
Metadata
Title
Context-Aware GANs for Image Generation from Multimodal Queries
Authors
Kenki Nakamura
Qiang Ma
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-27615-7_33
