2019 | OriginalPaper | Chapter

An Effective Deep Transfer Learning and Information Fusion Framework for Medical Visual Question Answering

Authors: Feifan Liu, Yalei Peng, Max P. Rosen

Published in: Experimental IR Meets Multilinguality, Multimodality, and Interaction

Publisher: Springer International Publishing

Abstract

Medical visual question answering (Med-VQA) is important for improving clinical decision support and enhancing patient engagement in patient-centered medical care. Compared with open-domain VQA tasks, VQA in the medical domain is more challenging due to limited training resources as well as the unique characteristics of medical images and domain vocabulary. In this paper, we propose and develop a novel deep transfer learning model, ETM-Trans, which applies embedding topic modeling (ETM) to the textual questions to derive topic labels that are paired with the associated medical images for fine-tuning a pre-trained ImageNet model. We also explore and implement a co-attention mechanism in which a residual network extracts visual features from the image that interact with a long short-term memory (LSTM) based question representation, providing fine-grained contextual information for answer derivation. To efficiently integrate the visual features from the image and the textual features from the question, we employ Multimodal Factorized Bilinear (MFB) pooling as well as Multimodal Factorized High-order (MFH) pooling. The ETM-Trans model won the international Med-VQA 2018 challenge, achieving the best WBSS score of 0.186.
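To make the fusion step concrete, the sketch below shows how MFB pooling can combine ResNet image features with an LSTM question encoding before answer prediction. It is a minimal PyTorch-style illustration, assuming 2048-dimensional image features, a 1024-dimensional question vector, and a hypothetical class name (MFBFusion) with illustrative dimension choices; it is not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MFBFusion(nn.Module):
    # Minimal sketch of Multimodal Factorized Bilinear (MFB) pooling.
    # img_dim, q_dim, joint_dim, and k are illustrative defaults,
    # not values reported in the paper.
    def __init__(self, img_dim=2048, q_dim=1024, joint_dim=1000, k=5):
        super().__init__()
        self.joint_dim, self.k = joint_dim, k
        self.img_proj = nn.Linear(img_dim, joint_dim * k)  # project image features
        self.q_proj = nn.Linear(q_dim, joint_dim * k)      # project question features
        self.dropout = nn.Dropout(0.1)

    def forward(self, img_feat, q_feat):
        # Element-wise product in the expanded (joint_dim * k) space
        # approximates a full bilinear interaction with low rank k.
        joint = self.dropout(self.img_proj(img_feat) * self.q_proj(q_feat))
        # Sum-pool over the k factors, then apply power and L2 normalization.
        joint = joint.view(-1, self.joint_dim, self.k).sum(dim=2)
        joint = torch.sign(joint) * torch.sqrt(torch.abs(joint) + 1e-12)
        return F.normalize(joint, dim=-1)

# Usage: fuse a batch of ResNet image vectors with LSTM question vectors,
# then feed the 1000-d joint vector to an answer classifier.
fusion = MFBFusion()
joint = fusion(torch.randn(8, 2048), torch.randn(8, 1024))  # -> shape (8, 1000)

MFH pooling can be viewed as cascading several such MFB blocks and concatenating their outputs for higher-order interactions.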

Metadata
Title
An Effective Deep Transfer Learning and Information Fusion Framework for Medical Visual Question Answering
Authors
Feifan Liu
Yalei Peng
Max P. Rosen
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-28577-7_20
