2020 | OriginalPaper | Chapter

Learning from the Guidance: Knowledge Embedded Meta-learning for Medical Visual Question Answering

Authors: Wenbo Zheng, Lan Yan, Fei-Yue Wang, Chao Gou

Published in: Neural Information Processing

Publisher: Springer International Publishing

Abstract

Traditional medical visual question answering approaches require large amounts of labeled data for training, yet still cannot jointly consider image and text information. To address this issue, we propose a novel framework called Knowledge Embedded Meta-Learning. In particular, we present a deep relation network to capture and memorize the relations among different samples. First, we introduce an embedding approach to perform feature-fusion representation learning. Then, we present the construction of our knowledge graph, which relates images with text and serves as the guidance of our meta-learner. We design a knowledge embedding mechanism to incorporate the knowledge representation into our network. The final result is derived from our relation network by learning to compare the features of samples. Experimental results demonstrate that the proposed approach achieves significantly higher performance than other state-of-the-art methods.
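The abstract describes the pipeline only at a high level. The following is a minimal PyTorch-style sketch of how such a knowledge-guided relation network could be organized, assuming concatenation-based image/question fusion, an additive knowledge-graph embedding as the "guidance", and an MLP relation module that compares support and query features. The class name RelationScorer, all dimensions, and these fusion choices are illustrative assumptions, not the authors' exact architecture.

```python
# A minimal sketch (assumed design, not the paper's exact model): fuse image and
# question features, inject a knowledge-graph embedding as guidance, and score
# support/query feature pairs with a learned relation module.
import torch
import torch.nn as nn


class RelationScorer(nn.Module):
    def __init__(self, img_dim=512, txt_dim=768, kg_dim=128, hid=256):
        super().__init__()
        # Feature-fusion representation learning (assumed: concatenate + MLP).
        self.fuse = nn.Sequential(nn.Linear(img_dim + txt_dim, hid), nn.ReLU())
        # Projection of the knowledge-graph node embedding used as guidance.
        self.kg_proj = nn.Linear(kg_dim, hid)
        # Relation module: compares a (support, query) feature pair, output in [0, 1].
        self.relation = nn.Sequential(
            nn.Linear(2 * hid, hid), nn.ReLU(), nn.Linear(hid, 1), nn.Sigmoid()
        )

    def embed(self, img_feat, txt_feat, kg_emb):
        # Knowledge embedding mechanism, sketched here as additive guidance.
        fused = self.fuse(torch.cat([img_feat, txt_feat], dim=-1))
        return fused + self.kg_proj(kg_emb)

    def forward(self, support_feat, query_feat):
        # Relation score: how closely the query sample matches the support sample.
        return self.relation(torch.cat([support_feat, query_feat], dim=-1))


if __name__ == "__main__":
    # Usage sketch with random tensors standing in for extracted features.
    model = RelationScorer()
    img, txt, kg = torch.randn(1, 512), torch.randn(1, 768), torch.randn(1, 128)
    query = model.embed(img, txt, kg)
    support = model.embed(torch.randn(1, 512), torch.randn(1, 768), kg)
    print(model(support, query))  # relation score in [0, 1]
```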

Metadata
Title
Learning from the Guidance: Knowledge Embedded Meta-learning for Medical Visual Question Answering
Authors
Wenbo Zheng
Lan Yan
Fei-Yue Wang
Chao Gou
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-63820-7_22
