
A Survey on Deep Learning and Explainability for Automatic Report Generation from Medical Images

Published: 13 September 2022

Abstract

Every year, physicians face an increasing demand for image-based diagnosis from patients, a problem that recent artificial intelligence methods can help address. In this context, we survey works on automatic report generation from medical images, with emphasis on methods using deep neural networks, organized along four axes: (1) datasets, (2) architecture design, (3) explainability, and (4) evaluation metrics. Our survey identifies promising developments as well as remaining challenges. Among them, the current evaluation of generated reports is especially weak: it mostly relies on traditional Natural Language Processing (NLP) metrics, which do not accurately capture medical correctness.
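The weakness of traditional NLP metrics can be illustrated with a small sketch. The snippet below (hypothetical report sentences, not drawn from any dataset) computes BLEU-style modified n-gram precision, the core ingredient of BLEU: a report that copies the reference's wording but flips a single negation — and is therefore medically wrong — scores far higher than a medically correct paraphrase.

```python
from collections import Counter

def ngram_precision(candidate, reference, n=1):
    """Modified n-gram precision: the fraction of candidate n-grams
    that also appear in the reference (counts clipped to the reference)."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate.split(), n), ngrams(reference.split(), n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

reference = "the heart is enlarged with no evidence of pneumothorax"
# Medically WRONG: one inserted negation, yet nearly all words match.
wrong = "the heart is not enlarged with no evidence of pneumothorax"
# Medically CORRECT paraphrase, but with little lexical overlap.
correct = "cardiomegaly is present and there is no pneumothorax"

print(ngram_precision(wrong, reference))    # 0.9
print(ngram_precision(correct, reference))  # 0.375
```

The wrong report scores 0.9 while the correct paraphrase scores 0.375, which is exactly the failure mode discussed above: n-gram overlap rewards surface similarity, not clinical accuracy.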

  122. [122] Simonyan Karen and Zisserman Andrew. 2014. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014).Google ScholarGoogle Scholar
  123. [123] Singh Sonit, Karimi Sarvnaz, Ho-Shon Kevin, and Hamey Len. 2019. From chest X-rays to radiology reports: A multimodal machine learning approach. In 2019 Digital Image Computing: Techniques and Applications (DICTA’19). IEEE, 18.Google ScholarGoogle Scholar
  124. [124] Smilkov Daniel, Thorat Nikhil, Kim Been, Viégas Fernanda, and Wattenberg Martin. 2017. Smoothgrad: Removing noise by adding noise. arXiv:1706.03825 (2017).Google ScholarGoogle Scholar
  125. [125] Soldaini Luca and Goharian Nazli. 2016. Quickumls: A fast, unsupervised approach for medical concept extraction. In MedIR Workshop, Sigir. 14.Google ScholarGoogle Scholar
  126. [126] Spinks Graham and Moens Marie-Francine. 2019. Justifying diagnosis decisions by deep neural networks. Journal of Biomedical Informatics 96 (2019), 103248.Google ScholarGoogle ScholarDigital LibraryDigital Library
  127. [127] Springenberg J., Dosovitskiy Alexey, Brox Thomas, and Riedmiller M.. 2015. Striving for simplicity: The all convolutional net. In ICLR (Workshop Track).Google ScholarGoogle Scholar
  128. [128] Sun Li, Wang Weipeng, Li Jiyun, and Lin Jingsheng. 2019. Study on medical image report generation based on improved encoding-decoding method. In Intelligent Computing Theories and Application. Springer Intl. Publishing, Cham, 686696.Google ScholarGoogle Scholar
  129. [129] Szegedy Christian, Liu Wei, Jia Yangqing, Sermanet Pierre, Reed Scott, Anguelov Dragomir, Erhan Dumitru, Vanhoucke Vincent, and Rabinovich Andrew. 2015. Going deeper with convolutions. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR’15). 19.Google ScholarGoogle Scholar
  130. [130] Szegedy Christian, Vanhoucke Vincent, Ioffe Sergey, Shlens Jon, and Wojna Zbigniew. 2016. Rethinking the inception architecture for computer vision. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR’16). 28182826.Google ScholarGoogle Scholar
  131. [131] Tian Jiang, Li Cong, Shi Zhongchao, and Xu Feiyu. 2018. A diagnostic report generator from CT volumes on liver tumor with semi-supervised attention mechanism. In Medical Image Computing and Computer Assisted Intervention (MICCAI’18). Springer Intl. Publishing, Cham, 702710.Google ScholarGoogle Scholar
  132. [132] Tian Jiang, Zhong Cheng, Shi Zhongchao, and Xu Feiyu. 2019. Towards automatic diagnosis from multi-modal medical data. In Interpretability of Machine Intelligence in Medical Image Computing and Multimodal Learning for Clinical Decision Support. Springer Intl. Publishing, Cham, 6774.Google ScholarGoogle Scholar
  133. [133] Tjoa Erico and Guan Cuntai. 2019. A survey on explainable artificial intelligence (XAI): Towards medical XAI. arXiv:1907.07374 (2019).Google ScholarGoogle Scholar
  134. [134] Tonekaboni Sana, Joshi Shalmali, McCradden Melissa D., and Goldenberg Anna. 2019. What clinicians want: Contextualizing explainable machine learning for clinical end use. In Proc. of the 4th Machine Learning for Healthcare Conference (Proc of Machine Learning Research), Vol. 106. PMLR, 359380.Google ScholarGoogle Scholar
  135. [135] Topol Eric. 2019. Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again (1st ed.). Basic Books, Inc.Google ScholarGoogle Scholar
  136. [136] Tsai Min-Jen and Tao Yu-Han. 2019. Machine learning based common radiologist-level pneumonia detection on chest X-rays. In 2019 13th Intl. Conf. on Signal Processing and Communication Systems (ICSPCS’19). IEEE, 17.Google ScholarGoogle Scholar
  137. [137] Miltenburg Emiel van, Clinciu Miruna, Dušek Ondřej, Gkatzia Dimitra, Inglis Stephanie, Leppänen Leo, Mahamood Saad, Manning Emma, Schoch Stephanie, Thomson Craig, and Wen Luou. 2021. Underreporting of errors in NLG output, and what to do about it. In Proc. of the 14th International Conference on Natural Language Generation. Association for Computational Linguistics, 140153. https://aclanthology.org/2021.inlg-1.14.Google ScholarGoogle Scholar
  138. [138] Miltenburg Emiel van, Lu Wei-Ting, Krahmer Emiel, Gatt Albert, Chen Guanyi, Li Lin, and Deemter Kees van. 2020. Gradations of error severity in automatic image descriptions. In Proc. of the 13th International Conference on Natural Language Generation. Association for Computational Linguistics, 398411. https://aclanthology.org/2020.inlg-1.45.Google ScholarGoogle Scholar
  139. [139] Vaswani Ashish, Shazeer Noam, Parmar Niki, Uszkoreit Jakob, Jones Llion, Gomez Aidan N., Kaiser Łukasz, and Polosukhin Illia. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30. Curran Associates, Inc., 59986008.Google ScholarGoogle Scholar
  140. [140] Vedantam Ramakrishna, Zitnick C. Lawrence, and Parikh Devi. 2015. Cider: Consensus-based image description evaluation. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR’15). 45664575.Google ScholarGoogle Scholar
  141. [141] Vinyals Oriol, Toshev Alexander, Bengio Samy, and Erhan Dumitru. 2015. Show and tell: A neural image caption generator. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR’15). 31563164.Google ScholarGoogle Scholar
  142. [142] Wang Jinhua, Yang Xi, Cai Hongmin, Tan Wanchang, Jin Cangzheng, and Li Li. 2016. Discrimination of breast cancer with microcalcifications on mammography by deep learning. Scientific Reports (Nature Publisher Group) 6 (2016), 27327.Google ScholarGoogle ScholarCross RefCross Ref
  143. [143] Wang Xiaosong, Peng Yifan, Lu Le, Lu Zhiyong, Bagheri Mohammadhadi, and Summers Ronald M.. 2017. ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In The IEEE Conf. on Computer Vision and Pattern Recognition (CVPR’17). 34623471.Google ScholarGoogle Scholar
  144. [144] Wang Xiaosong, Peng Yifan, Lu Le, Lu Zhiyong, and Summers Ronald M.. 2018. Tienet: Text-image embedding network for common thorax disease classification and reporting in chest x-rays. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR’18). 90499058.Google ScholarGoogle Scholar
  145. [145] Wang Xuwen, Zhang Yu, Guo Zhen, and Li Jiao. 2019. A computational framework towards medical image explanation. In Artificial Intelligence in Medicine: Knowledge Representation and Transparent and Explainable Systems. Springer Intl. Publishing, Cham, 120131.Google ScholarGoogle Scholar
  146. [146] Williams Ronald J. and Zipser David. 1989. A learning algorithm for continually running fully recurrent neural networks. Neural Computation 1, 2 (1989), 270280.Google ScholarGoogle ScholarDigital LibraryDigital Library
  147. [147] Wu C., Chang H., Liu J., and Jang J. R.. 2018. Adaptive generation of structured medical report using NER regarding deep learning. In 2018 Conf. on Technologies and Applications of Artificial Intelligence (TAAI’18). 1013.Google ScholarGoogle Scholar
  148. [148] Wu Luhui, Wan Cheng, Wu Yiquan, and Liu Jiang. 2017. Generative caption for diabetic retinopathy images. In 2017 Intl. Conf. on Security, Pattern Analysis, and Cybernetics (SPAC’17). 515519.Google ScholarGoogle Scholar
  149. [149] Xie Xiaozheng, Niu Jianwei, Liu Xuefeng, Chen Zhengsu, and Tang Shaojie. 2020. A survey on domain knowledge powered deep learning for medical image analysis. arXiv:2004.12150 (2020).Google ScholarGoogle Scholar
  150. [150] Xie Xiancheng, Xiong Yun, Yu Philip S., Li Kangan, Zhang Suhua, and Zhu Yangyong. 2019. Attention-based abnormal-aware fusion network for radiology report generation. In Database Systems for Advanced Applications. Springer Intl. Publishing, Cham, 448452.Google ScholarGoogle Scholar
  151. [151] Xiong Yuxuan, Du Bo, and Yan Pingkun. 2019. Reinforced transformer for medical image captioning. In Machine Learning in Medical Imaging. Springer Intl. Publishing, Cham, 673680.Google ScholarGoogle Scholar
  152. [152] Xu Kelvin, Ba Jimmy Lei, Kiros Ryan, Cho Kyunghyun, Courville Aaron, Salakhutdinov Ruslan, Zemel Richard S., and Bengio Yoshua. 2015. Show, attend and tell: Neural image caption generation with visual attention. In Proc. of the 32nd Intl. Conf. on Intl. Conf. on Machine Learning - Volume 37 (ICML’15). JMLR.org, 20482057.Google ScholarGoogle Scholar
  153. [153] Xue Yuan and Huang Xiaolei. 2019. Improved disease classification in chest X-rays with transferred features from report generation. In Information Processing in Medical Imaging. Springer Intl. Publishing, Cham, 125138.Google ScholarGoogle Scholar
  154. [154] Xue Yuan, Xu Tao, Long L. Rodney, Xue Zhiyun, Antani Sameer, Thoma George R., and Huang Xiaolei. 2018. Multimodal recurrent model with attention for automated radiology report generation. In Intl. Conf. on Medical Image Computing and Computer-Assisted Intervention. Springer, 457466.Google ScholarGoogle Scholar
  155. [155] Yin C., Qian B., Wei J., Li X., Zhang X., Li Y., and Zheng Q.. 2019. Automatic generation of medical imaging diagnostic report with hierarchical recurrent neural network. In 2019 IEEE Intl. Conf. on Data Mining (ICDM’19). 728737.Google ScholarGoogle Scholar
  156. [156] Yuan Jianbo, Liao Haofu, Luo Rui, and Luo Jiebo. 2019. Automatic radiology report generation based on multi-view image fusion and medical concept enrichment. In Medical Image Computing and Computer Assisted Intervention (MICCAI’19). Springer Intl. Publishing, Cham, 721729.Google ScholarGoogle Scholar
  157. [157] Zeng Xianhua, Wen Li, Liu Banggui, and Qi Xiaojun. 2020. Deep learning for ultrasound image caption generation based on object detection. Neurocomputing 392 (2020), 132141.Google ScholarGoogle ScholarCross RefCross Ref
  158. [158] Zeng Xian-Hua, Liu Bang-Gui, and Zhou Meng. 2018. Understanding and generating ultrasound image description. Journal of Computer Science and Technology 33, 5 (2018), 10861100.Google ScholarGoogle ScholarCross RefCross Ref
  159. [159] Zhang Han, Xu Tao, Li Hongsheng, Zhang Shaoting, Wang Xiaogang, Huang Xiaolei, and Metaxas Dimitris N.. 2017. Stackgan: Text to photo-realistic image synthesis with stacked generative adversarial networks. In Proc. of the IEEE Intl. Conf. on Computer Vision (ICCV’17). 59075915.Google ScholarGoogle Scholar
  160. [160] Zhang Xiang, Zhao Junbo, and LeCun Yann. 2015. Character-level convolutional networks for text classification. In Proc. of the 28th Intl. Conf. on Neural Information Processing Systems - Volume 1 (NIPS’15). MIT Press, Cambridge, MA, 649657.Google ScholarGoogle Scholar
  161. [161] Zhang Yuhao, Merck Derek, Tsai Emily, Manning Christopher D., and Langlotz Curtis. 2020. Optimizing the factual correctness of a summary: A study of summarizing radiology reports. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 51085120. Google ScholarGoogle ScholarCross RefCross Ref
  162. [162] Zhang Yixiao, Wang Xiaosong, Xu Ziyue, Yu Qihang, Yuille Alan, and Xu Daguang. 2020. When radiology report generation meets knowledge graph. Proceedings of the AAAI Conference on Artificial Intelligence 34, 7 (April 2020), 1291012917. Google ScholarGoogle ScholarCross RefCross Ref
  163. [163] Zhang Zizhao, Xie Yuanpu, Xing Fuyong, McGough Mason, and Yang Lin. 2017. Mdnet: A semantically and visually interpretable medical image diagnosis network. In Proc of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR’17). 35493557.Google ScholarGoogle Scholar
  164. [164] Zhao Jake, Kim Yoon, Zhang Kelly, Rush Alexander M., and LeCun Yann. 2017. Adversarially regularized autoencoders. arXiv:cs.LG/1706.04223.Google ScholarGoogle Scholar
  165. [165] Zhou Bolei, Khosla Aditya, Lapedriza Agata, Oliva Aude, and Torralba Antonio. 2016. Learning deep features for discriminative localization. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR’16). 29212929.Google ScholarGoogle Scholar
  166. [166] Zhu Feng, Li Hongsheng, Ouyang Wanli, Yu Nenghai, and Wang Xiaogang. 2017. Learning spatial regularization with image-level supervisions for multi-label image classification. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR’17). 20272036.Google ScholarGoogle Scholar
  167. [167] Zhu Jun-Yan, Park Taesung, Isola Phillip, and Efros Alexei A.. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proc. of the IEEE Intl. Conf. on Computer Vision. 22232232.Google ScholarGoogle Scholar


Published in ACM Computing Surveys, Volume 54, Issue 10s (January 2022), 831 pages. ISSN: 0360-0300. EISSN: 1557-7341. DOI: 10.1145/3551649.


Publisher: Association for Computing Machinery, New York, NY, United States.

Publication History: Received 14 September 2020; Revised 2 November 2021; Accepted 7 December 2021; Online AM 23 March 2022; Published 13 September 2022.


Qualifiers: survey; refereed.
