Published in: Neural Computing and Applications 13/2021

05-04-2021 | S.I. : DICTA 2019

Show, tell and summarise: learning to generate and summarise radiology findings from medical images

Authors: Sonit Singh, Sarvnaz Karimi, Kevin Ho-Shon, Len Hamey


Abstract

Radiology plays a vital role in health care by imaging the human body for the diagnosis, monitoring, and treatment of medical problems. In radiology practice, radiologists routinely examine medical images such as chest X-rays and describe their findings in the form of radiology reports. However, this task of reading medical images and summarising their insights is time-consuming, tedious, and error-prone, and often represents a bottleneck in the clinical diagnosis process. A computer-aided diagnosis system that can automatically generate radiology reports from medical images could be of great significance in reducing workload, reducing diagnostic errors, speeding up clinical workflow, and helping to alleviate any shortage of radiologists. Existing research in radiology report generation focuses on generating the concatenation of the findings and impression sections, and it ignores important differences between normal and abnormal radiology reports. The text of normal and abnormal reports differs in style, and it is difficult for a single model both to learn the text style and to learn the transition from findings to impression. To alleviate these challenges, we propose a Show, Tell and Summarise model that first generates findings from chest X-rays and then summarises them to produce the impression section. The proposed approach generates the findings and impression sections separately, overcoming the limitation of previous research. We also use separate models for generating normal and abnormal radiology reports, which provides true insight into a model's performance. Experimental results on the publicly available IU-CXR dataset show the effectiveness of our proposed model. Finally, we highlight limitations in radiology report generation research and present recommendations for future work.
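The two-stage decomposition the abstract describes (image → findings → impression) can be sketched schematically. The toy encoder and decoders below are untrained, randomly initialised stand-ins for the CNN and recurrent components of such a pipeline; the vocabulary, class names, and greedy decoding loop are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary; a real model would be trained on IU-CXR report text.
VOCAB = ["<eos>", "heart", "size", "normal", "no", "effusion", "lungs", "clear"]

class ToyImageEncoder:
    """Stand-in for a CNN: maps an image to a fixed-length feature vector."""
    def __init__(self, dim=16):
        self.w = rng.standard_normal((3, dim))  # projects pooled channels

    def encode(self, image):
        pooled = image.mean(axis=(0, 1))   # global average pooling -> (channels,)
        return pooled @ self.w             # -> (dim,)

class ToyDecoder:
    """Stand-in for an LSTM decoder: greedily emits tokens from a context vector."""
    def __init__(self, dim, max_len=8):
        self.w = rng.standard_normal((dim, len(VOCAB)))
        self.max_len = max_len

    def generate(self, context):
        tokens, state = [], context.copy()
        for _ in range(self.max_len):
            tok = VOCAB[int(np.argmax(state @ self.w))]
            if tok == "<eos>":
                break
            tokens.append(tok)
            state = np.roll(state, 1)      # trivial state update for the sketch
        return " ".join(tokens)

def generate_report(image):
    """Stage 1: image -> findings. Stage 2: findings -> impression (summarisation)."""
    context = ToyImageEncoder().encode(image)
    findings = ToyDecoder(dim=16).generate(context)
    # Stage 2 would be a seq2seq summariser conditioned on the findings text;
    # here the findings are re-encoded as a bag-of-words vector.
    bow = np.zeros(len(VOCAB))
    for t in findings.split():
        bow[VOCAB.index(t)] += 1
    impression = ToyDecoder(dim=len(VOCAB), max_len=4).generate(bow)
    return {"findings": findings, "impression": impression}

report = generate_report(rng.random((64, 64, 3)))
print(report)
```

The design point the sketch captures is the separation of concerns: the findings decoder only learns descriptive report style, while the summariser only learns the findings-to-impression transition, rather than one model handling both.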


Footnotes
2
We use the terms summary and impression interchangeably to denote the conclusive remarks of a radiology report.
 
Literature
1.
go back to reference Lewis SJ, Gandomkar Z, Brennan PC (2019) Artificial intelligence in medical imaging practice: looking to the future. J Med Radiat Sci 66(4):292–295CrossRef Lewis SJ, Gandomkar Z, Brennan PC (2019) Artificial intelligence in medical imaging practice: looking to the future. J Med Radiat Sci 66(4):292–295CrossRef
2.
go back to reference Demner-Fushman D, Kohli MD, Rosenman MB, Shooshan SE, Rodriguez L, Antani S, Thoma GR, McDonald CJ (2016) Preparing a collection of radiology examinations for distribution and retrieval. J Am Med Inform Assoc 23(2):304–310CrossRef Demner-Fushman D, Kohli MD, Rosenman MB, Shooshan SE, Rodriguez L, Antani S, Thoma GR, McDonald CJ (2016) Preparing a collection of radiology examinations for distribution and retrieval. J Am Med Inform Assoc 23(2):304–310CrossRef
3.
go back to reference Kisilev P, Walach E, Barkan E, Ophir B, Alpert S, Hashoul SY (2015) From medical image to automatic medical report generation. IBM J Res Dev 59(2/3):2:1–2:7CrossRef Kisilev P, Walach E, Barkan E, Ophir B, Alpert S, Hashoul SY (2015) From medical image to automatic medical report generation. IBM J Res Dev 59(2/3):2:1–2:7CrossRef
4.
go back to reference Kisilev P, Sason E, Barkan E, Hashoul S (2016) Medical image description using multi-task-loss CNN. In: Carneiro G, Mateus D, Peter L, Bradley A, Tavares JMRS, Belagiannis V, Papa JP, Nascimento JC, Loog M, Lu Z, Cardoso JS, Cornebise J (eds) Deep learning and data labeling for medical applications. Springer, Berlin, pp 121–129CrossRef Kisilev P, Sason E, Barkan E, Hashoul S (2016) Medical image description using multi-task-loss CNN. In: Carneiro G, Mateus D, Peter L, Bradley A, Tavares JMRS, Belagiannis V, Papa JP, Nascimento JC, Loog M, Lu Z, Cardoso JS, Cornebise J (eds) Deep learning and data labeling for medical applications. Springer, Berlin, pp 121–129CrossRef
5.
go back to reference Jing B, Xie P, Xing E (2018) On the automatic generation of medical imaging reports. In: Proceedings of the 56th annual meeting of the association for computational linguistics (volume 1: long Papers). Association for Computational Linguistics, pp 2577–2586 Jing B, Xie P, Xing E (2018) On the automatic generation of medical imaging reports. In: Proceedings of the 56th annual meeting of the association for computational linguistics (volume 1: long Papers). Association for Computational Linguistics, pp 2577–2586
6.
go back to reference Yin C, Qian B, Wei J, Li X, Zhang X, Li Y, Zheng Q (2019) Automatic generation of medical imaging diagnostic report with hierarchical recurrent neural network. In: 2019 IEEE international conference on data mining (ICDM), pp 728–737 Yin C, Qian B, Wei J, Li X, Zhang X, Li Y, Zheng Q (2019) Automatic generation of medical imaging diagnostic report with hierarchical recurrent neural network. In: 2019 IEEE international conference on data mining (ICDM), pp 728–737
7.
go back to reference Jing B, Wang Z, Xing E (2019) Show, describe and conclude: on exploiting the structure information of chest x-ray reports. In: Proceedings of the 57th annual meeting of the association for computational linguistics. Association for Computational Linguistics, Florence, pp 6570–6580 Jing B, Wang Z, Xing E (2019) Show, describe and conclude: on exploiting the structure information of chest x-ray reports. In: Proceedings of the 57th annual meeting of the association for computational linguistics. Association for Computational Linguistics, Florence, pp 6570–6580
8.
go back to reference Vinyals O, Toshev A, Bengio S, Erhan D (2017) Show and tell: lessons learned from the 2015 mscoco image captioning challenge. IEEE Trans Pattern Anal Mach Intell 39(4):652–663CrossRef Vinyals O, Toshev A, Bengio S, Erhan D (2017) Show and tell: lessons learned from the 2015 mscoco image captioning challenge. IEEE Trans Pattern Anal Mach Intell 39(4):652–663CrossRef
9.
go back to reference Goodfellow I, Bengio Y, Courville A (2016) Deep learning. The MIT Press, CambridgeMATH Goodfellow I, Bengio Y, Courville A (2016) Deep learning. The MIT Press, CambridgeMATH
10.
go back to reference Lee LIT, Kanthasamy S, Ayyalaraju RS, Ganatra R (2019) The current state of artificial intelligence in medical imaging and nuclear medicine. BJR|Open 1(1):20190037CrossRef Lee LIT, Kanthasamy S, Ayyalaraju RS, Ganatra R (2019) The current state of artificial intelligence in medical imaging and nuclear medicine. BJR|Open 1(1):20190037CrossRef
11.
go back to reference Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM (2017) ChestX-Ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: IEEE conference on computer vision and pattern recognition. Hawaii, United States, pp 3462–3471 Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM (2017) ChestX-Ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: IEEE conference on computer vision and pattern recognition. Hawaii, United States, pp 3462–3471
12.
go back to reference Bustos A, Pertusa A, Salinas J, de la Iglesia-Vayá M (2019) Padchest: a large chest x-ray image dataset with multi-label annotated reports. arXiv:1901.07441 Bustos A, Pertusa A, Salinas J, de la Iglesia-Vayá M (2019) Padchest: a large chest x-ray image dataset with multi-label annotated reports. arXiv:​1901.​07441
13.
go back to reference Johnson AEW, Pollard TJ, Berkowitz SJ, Greenbaum NR, Lungren MP, Deng CY, Mark RG, Horng S (2019) MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci Data 6(1):317CrossRef Johnson AEW, Pollard TJ, Berkowitz SJ, Greenbaum NR, Lungren MP, Deng CY, Mark RG, Horng S (2019) MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci Data 6(1):317CrossRef
14.
go back to reference Irvin J, Rajpurkar P, Ko M, Yu Y, Ciurea-Ilcus S, Chute C, Marklund H, Haghgoo B, Ball, RL, Shpanskaya KS, Seekins J, Mong DA, Halabi SS, Sandberg JK, Jones R, Larson DB, Langlotz CP, Patel BN, Lungren, MP, Ng AY (2019) Chexpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In: The thirty-third AAAI conference on artificial intelligence, AAAI 2019, the thirty-first innovative applications of artificial intelligence conference, IAAI 2019, the ninth AAAI symposium on educational advances in artificial intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27–February 1, 2019. AAAI Press, pp 590–597 Irvin J, Rajpurkar P, Ko M, Yu Y, Ciurea-Ilcus S, Chute C, Marklund H, Haghgoo B, Ball, RL, Shpanskaya KS, Seekins J, Mong DA, Halabi SS, Sandberg JK, Jones R, Larson DB, Langlotz CP, Patel BN, Lungren, MP, Ng AY (2019) Chexpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In: The thirty-third AAAI conference on artificial intelligence, AAAI 2019, the thirty-first innovative applications of artificial intelligence conference, IAAI 2019, the ninth AAAI symposium on educational advances in artificial intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27–February 1, 2019. AAAI Press, pp 590–597
15.
go back to reference Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems. Curran Associates, Inc., Red Hook, pp 1097–1105 Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems. Curran Associates, Inc., Red Hook, pp 1097–1105
16.
go back to reference Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: IEEE conference on computer vision and pattern recognition, pp 2818–2826 Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: IEEE conference on computer vision and pattern recognition, pp 2818–2826
17.
go back to reference He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition, pp 770–778 He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition, pp 770–778
18.
go back to reference Gündel S, Grbic S, Georgescu B, Liu S, Maier A, Comaniciu D (2019) Learning to recognize abnormalities in chest x-rays with location-aware dense networks. In: Vera-Rodriguez R, Fierrez J, Morales A (eds) Progress in pattern recognition, image analysis, computer vision, and applications. Springer, Berlin, pp 757–765CrossRef Gündel S, Grbic S, Georgescu B, Liu S, Maier A, Comaniciu D (2019) Learning to recognize abnormalities in chest x-rays with location-aware dense networks. In: Vera-Rodriguez R, Fierrez J, Morales A (eds) Progress in pattern recognition, image analysis, computer vision, and applications. Springer, Berlin, pp 757–765CrossRef
19.
go back to reference Rajpurkar P, Irvin J, Zhu K, Yang B, Mehta H, Duan T, Ding D, Bagul A, Langlotz C, Shpanskaya K, Lungren MP, Ng AY (2017) Chexnet: radiologist-level pneumonia detection on chest x-rays with deep learning Rajpurkar P, Irvin J, Zhu K, Yang B, Mehta H, Duan T, Ding D, Bagul A, Langlotz C, Shpanskaya K, Lungren MP, Ng AY (2017) Chexnet: radiologist-level pneumonia detection on chest x-rays with deep learning
20.
go back to reference Baltruschat IM, Nickisch H, Grass M, Knopp T, Saalbach A (2019) Comparison of deep learning approaches for multi-label chest X-ray classification. Sci Rep 9(1):6381CrossRef Baltruschat IM, Nickisch H, Grass M, Knopp T, Saalbach A (2019) Comparison of deep learning approaches for multi-label chest X-ray classification. Sci Rep 9(1):6381CrossRef
21.
go back to reference Yao L, Poblenz E, Dagunts D, Covington B, Bernard D, Lyman K (2017) Learning to diagnose from scratch by exploiting dependencies among labels. CoRR. arXiv:1710.10501 Yao L, Poblenz E, Dagunts D, Covington B, Bernard D, Lyman K (2017) Learning to diagnose from scratch by exploiting dependencies among labels. CoRR. arXiv:​1710.​10501
22.
go back to reference Singh S, Ho-Shon K, Karimi S, Hamey L (2018) Modality classification and concept detection in medical images using deep transfer learning. In: 2018 International conference on image and vision computing New Zealand (IVCNZ), pp 1–9 Singh S, Ho-Shon K, Karimi S, Hamey L (2018) Modality classification and concept detection in medical images using deep transfer learning. In: 2018 International conference on image and vision computing New Zealand (IVCNZ), pp 1–9
23.
go back to reference Wang W, Liang D, Chen Q, Iwamoto Y, Han XH, Zhang Q, Hu H, Lin L, Chen YW (2020) Medical image classification using deep learning. Springer, Berlin, pp 33–51 Wang W, Liang D, Chen Q, Iwamoto Y, Han XH, Zhang Q, Hu H, Lin L, Chen YW (2020) Medical image classification using deep learning. Springer, Berlin, pp 33–51
24.
go back to reference Yadav SS, Jadhav SM (2019) Deep convolutional neural network based medical image classification for disease diagnosis. J Big Data 6(1):113CrossRef Yadav SS, Jadhav SM (2019) Deep convolutional neural network based medical image classification for disease diagnosis. J Big Data 6(1):113CrossRef
25.
go back to reference Zhang J, Xie Y, Wu Q, Xia Y (2019) Medical image classification using synergic deep learning. Med Image Anal 54:10–19CrossRef Zhang J, Xie Y, Wu Q, Xia Y (2019) Medical image classification using synergic deep learning. Med Image Anal 54:10–19CrossRef
26.
go back to reference Kumar A, Kim J, Lyndon D, Fulham M, Feng D (2017) An ensemble of fine-tuned convolutional neural networks for medical image classification. IEEE J Biomed Health Inform 21(1):31–40CrossRef Kumar A, Kim J, Lyndon D, Fulham M, Feng D (2017) An ensemble of fine-tuned convolutional neural networks for medical image classification. IEEE J Biomed Health Inform 21(1):31–40CrossRef
27.
go back to reference Faes L, Wagner SK, Fu DJ, Liu X, Korot E, Ledsam JR, Back T, Chopra R, Pontikos N, Kern C, Moraes G, Schmid MK, Sim D, Balaskas K, Bachmann LM, Denniston AK, Keane PA (2019) Automated deep learning design for medical image classification by health-care professionals with no coding experience: a feasibility study. Lancet Digit Health 1(5):e232–e242CrossRef Faes L, Wagner SK, Fu DJ, Liu X, Korot E, Ledsam JR, Back T, Chopra R, Pontikos N, Kern C, Moraes G, Schmid MK, Sim D, Balaskas K, Bachmann LM, Denniston AK, Keane PA (2019) Automated deep learning design for medical image classification by health-care professionals with no coding experience: a feasibility study. Lancet Digit Health 1(5):e232–e242CrossRef
28.
go back to reference Hossain MZ, Sohel F, Shiratuddin MF, Laga H (2019) A comprehensive survey of deep learning for image captioning. ACM Comput Surv 51(6):118:1–118:36CrossRef Hossain MZ, Sohel F, Shiratuddin MF, Laga H (2019) A comprehensive survey of deep learning for image captioning. ACM Comput Surv 51(6):118:1–118:36CrossRef
29.
go back to reference Farhadi A, Hejrati M, Sadeghi MA, Young P, Rashtchian C, Hockenmaier J, Forsyth D (2010) Every picture tells a story: generating sentences from images. In: Daniilidis K, Maragos P, Paragios N (eds) Computer vision—ECCV 2010. Springer, Berlin, pp 15–29CrossRef Farhadi A, Hejrati M, Sadeghi MA, Young P, Rashtchian C, Hockenmaier J, Forsyth D (2010) Every picture tells a story: generating sentences from images. In: Daniilidis K, Maragos P, Paragios N (eds) Computer vision—ECCV 2010. Springer, Berlin, pp 15–29CrossRef
30.
go back to reference Li S, Kulkarni G, Berg TL, Berg AC, Choi Y (2011) Composing simple image descriptions using web-scale n-grams. In: Proceedings of the fifteenth conference on computational natural language learning, CoNLL’11. Association for Computational Linguistics, USA, pp 220–228 Li S, Kulkarni G, Berg TL, Berg AC, Choi Y (2011) Composing simple image descriptions using web-scale n-grams. In: Proceedings of the fifteenth conference on computational natural language learning, CoNLL’11. Association for Computational Linguistics, USA, pp 220–228
31.
go back to reference Kulkarni G, Premraj V, Ordonez V, Dhar S, Li S, Choi Y, Berg AC, Berg TL (2013) Babytalk: understanding and generating simple image descriptions. IEEE Trans Pattern Anal Mach Intell 35(12):2891–2903CrossRef Kulkarni G, Premraj V, Ordonez V, Dhar S, Li S, Choi Y, Berg AC, Berg TL (2013) Babytalk: understanding and generating simple image descriptions. IEEE Trans Pattern Anal Mach Intell 35(12):2891–2903CrossRef
32.
go back to reference Hodosh M, Young P, Hockenmaier J (2013) Framing image description as a ranking task: data, models and evaluation metrics. J Artif Int Res 47(1):853–899MathSciNetMATH Hodosh M, Young P, Hockenmaier J (2013) Framing image description as a ranking task: data, models and evaluation metrics. J Artif Int Res 47(1):853–899MathSciNetMATH
33.
go back to reference Mason R, Charniak E (2014) Nonparametric method for data-driven image captioning. In: Proceedings of the 52nd annual meeting of the association for computational linguistics (volume 2: short papers). Association for Computational Linguistics, Baltimore, pp 592–598 Mason R, Charniak E (2014) Nonparametric method for data-driven image captioning. In: Proceedings of the 52nd annual meeting of the association for computational linguistics (volume 2: short papers). Association for Computational Linguistics, Baltimore, pp 592–598
34.
go back to reference Ordonez V, Kulkarni G, Berg TL (2011) Im2text: describing images using 1 million captioned photographs. In: Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ (eds) Advances in neural information processing systems, vol 24. Curran Associates, Inc, Red Hook, pp 1143–1151 Ordonez V, Kulkarni G, Berg TL (2011) Im2text: describing images using 1 million captioned photographs. In: Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ (eds) Advances in neural information processing systems, vol 24. Curran Associates, Inc, Red Hook, pp 1143–1151
35.
go back to reference Mason R, Charniak E (2014) Nonparametric method for data-driven image captioning. In: Proceedings of the 52nd annual meeting of the association for computational linguistics (volume 2: short papers). Association for Computational Linguistics, pp 592–598 Mason R, Charniak E (2014) Nonparametric method for data-driven image captioning. In: Proceedings of the 52nd annual meeting of the association for computational linguistics (volume 2: short papers). Association for Computational Linguistics, pp 592–598
36.
go back to reference Kiros R, Salakhutdinov R, Zemel R (2014) Multimodal neural language models. In: Xing EP, Jebara T (eds) Proceedings of the 31st international conference on machine learning, proceedings of machine learning research, vol 32. PMLR, Bejing, China, pp 595–603 Kiros R, Salakhutdinov R, Zemel R (2014) Multimodal neural language models. In: Xing EP, Jebara T (eds) Proceedings of the 31st international conference on machine learning, proceedings of machine learning research, vol 32. PMLR, Bejing, China, pp 595–603
37.
go back to reference Karpathy A, Fei-Fei L (2017) Deep visual-semantic alignments for generating image descriptions. IEEE Trans Pattern Anal Mach Intell 39(4):664–676CrossRef Karpathy A, Fei-Fei L (2017) Deep visual-semantic alignments for generating image descriptions. IEEE Trans Pattern Anal Mach Intell 39(4):664–676CrossRef
38.
go back to reference Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: neural image caption generation with visual attention. In: Bach F, Blei D (eds) Proceedings of the 32nd international conference on machine learning, proceedings of machine learning research, vol 37. PMLR, Lille, France, pp 2048–2057 Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: neural image caption generation with visual attention. In: Bach F, Blei D (eds) Proceedings of the 32nd international conference on machine learning, proceedings of machine learning research, vol 37. PMLR, Lille, France, pp 2048–2057
39.
go back to reference Liu C, Mao J, Sha F, Yuille A (2017) Attention correctness in neural image captioning. In: Proceedings of the thirty-first AAAI conference on artificial intelligence, AAAI’17. AAAI Press, pp 4176–4182 Liu C, Mao J, Sha F, Yuille A (2017) Attention correctness in neural image captioning. In: Proceedings of the thirty-first AAAI conference on artificial intelligence, AAAI’17. AAAI Press, pp 4176–4182
40.
go back to reference You Q, Jin H, Wang Z, Fang C, Luo J (2016) Image captioning with semantic attention. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 4651–4659 You Q, Jin H, Wang Z, Fang C, Luo J (2016) Image captioning with semantic attention. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 4651–4659
41.
go back to reference Anderson P, He X, Buehler C, Teney D, Johnson M, Gould S, Zhang L (2018) Bottom-up and top-down attention for image captioning and visual question answering. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 6077–6086 Anderson P, He X, Buehler C, Teney D, Johnson M, Gould S, Zhang L (2018) Bottom-up and top-down attention for image captioning and visual question answering. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 6077–6086
42.
go back to reference Krause J, Johnson J, Krishna R, Fei-Fei L (2017) A hierarchical approach for generating descriptive image paragraphs. In: 2017 IEEE conference on computer vision and pattern recognition, pp 3337–3345 Krause J, Johnson J, Krishna R, Fei-Fei L (2017) A hierarchical approach for generating descriptive image paragraphs. In: 2017 IEEE conference on computer vision and pattern recognition, pp 3337–3345
43.
go back to reference Johnson J, Karpathy A, Fei-Fei L (2016) DenseCap: fully Convolutional Localization Networks for Dense Captioning. In: 2016 IEEE conference on computer vision and pattern recognition, pp 4565–4574 Johnson J, Karpathy A, Fei-Fei L (2016) DenseCap: fully Convolutional Localization Networks for Dense Captioning. In: 2016 IEEE conference on computer vision and pattern recognition, pp 4565–4574
44.
go back to reference Xue Y, Xu T, Rodney Long L, Xue Z, Antani S, Thoma GR, Huang X (2018) Multimodal recurrent model with attention for automated radiology report generation. In: Frangi AF, Schnabel JA, Davatzikos C, Alberola-López C, Fichtinger G (eds) Medical image computing and computer assisted intervention—MICCAI 2018. Springer, Berlin, pp 457–466CrossRef Xue Y, Xu T, Rodney Long L, Xue Z, Antani S, Thoma GR, Huang X (2018) Multimodal recurrent model with attention for automated radiology report generation. In: Frangi AF, Schnabel JA, Davatzikos C, Alberola-López C, Fichtinger G (eds) Medical image computing and computer assisted intervention—MICCAI 2018. Springer, Berlin, pp 457–466CrossRef
45.
go back to reference Xiong Y, Du B, Yan P (2019) Reinforced transformer for medical image captioning. In: Suk HI, Liu M, Yan P, Lian C (eds) Machine learning in medical imaging. Springer, Berlin, pp 673–680CrossRef Xiong Y, Du B, Yan P (2019) Reinforced transformer for medical image captioning. In: Suk HI, Liu M, Yan P, Lian C (eds) Machine learning in medical imaging. Springer, Berlin, pp 673–680CrossRef
46.
go back to reference Schlegl T, Waldstein SM, Vogl WD, Schmidt-Erfurth U, Langs G (2015) Predicting semantic descriptions from medical images with convolutional neural networks. In: Ourselin S, Alexander DC, Westin CF, Cardoso MJ (eds) Information processing in medical imaging. Springer, Cham, pp 437–448CrossRef Schlegl T, Waldstein SM, Vogl WD, Schmidt-Erfurth U, Langs G (2015) Predicting semantic descriptions from medical images with convolutional neural networks. In: Ourselin S, Alexander DC, Westin CF, Cardoso MJ (eds) Information processing in medical imaging. Springer, Cham, pp 437–448CrossRef
47.
go back to reference Shin HC, Lu L, Kim L, Seff A, Yao J, Summers RM (2016) Interleaved text/image deep mining on a large-scale radiology database for automated image interpretation. J Mach Learn Res 17(107):1–31MathSciNet Shin HC, Lu L, Kim L, Seff A, Yao J, Summers RM (2016) Interleaved text/image deep mining on a large-scale radiology database for automated image interpretation. J Mach Learn Res 17(107):1–31MathSciNet
48.
go back to reference Shin H, Roberts K, Lu L, Demner-Fushman D, Yao J, Summers RM (2016) Learning to read chest x-rays: recurrent neural cascade model for automated image annotation. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 2497–2506. https://doi.org/10.1109/CVPR.2016.274 Shin H, Roberts K, Lu L, Demner-Fushman D, Yao J, Summers RM (2016) Learning to read chest x-rays: recurrent neural cascade model for automated image annotation. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 2497–2506. https://​doi.​org/​10.​1109/​CVPR.​2016.​274
49.
go back to reference Zhang Z, Xie Y, Xing F, McGough M, Yang L (2017) MDNet: a Semantically and Visually interpretable medical image diagnosis network. In: IEEE conference on computer vision and pattern recognition, Hawaii, United States, pp 3549–3557 Zhang Z, Xie Y, Xing F, McGough M, Yang L (2017) MDNet: a Semantically and Visually interpretable medical image diagnosis network. In: IEEE conference on computer vision and pattern recognition, Hawaii, United States, pp 3549–3557
50.
go back to reference Li Y, Liang X, Hu Z, Xing EP (2018) Hybrid retrieval-generation reinforced agent for medical image report generation. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31. Curran Associates, Inc, Red Hook, pp 1530–1540 Li Y, Liang X, Hu Z, Xing EP (2018) Hybrid retrieval-generation reinforced agent for medical image report generation. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31. Curran Associates, Inc, Red Hook, pp 1530–1540
51.
go back to reference Zeng XH, Liu BG, Zhou M (2018) Understanding and generating ultrasound image description. J Comput Sci Technol 33(5):1086–1100CrossRef Zeng XH, Liu BG, Zhou M (2018) Understanding and generating ultrasound image description. J Comput Sci Technol 33(5):1086–1100CrossRef
52.
go back to reference Radev DR, Hovy E, McKeown K (2002) Introduction to the special issue on summarization. Comput Linguist 28(4):399–408CrossRef Radev DR, Hovy E, McKeown K (2002) Introduction to the special issue on summarization. Comput Linguist 28(4):399–408CrossRef
53.
go back to reference Mishra R, Bian J, Fiszman M, Weir CR, Jonnalagadda S, Mostafa J, Del Fiol G (2014) Text summarization in the biomedical domain. J Biomed Inform 52(C):457–467CrossRef Mishra R, Bian J, Fiszman M, Weir CR, Jonnalagadda S, Mostafa J, Del Fiol G (2014) Text summarization in the biomedical domain. J Biomed Inform 52(C):457–467CrossRef
54.
go back to reference Neto JL, Freitas AA, Kaestner CAA (2002) Automatic text summarization using a machine learning approach. In: Proceedings of the 16th Brazilian symposium on artificial intelligence: advances in artificial intelligence, SBIA’02. Springer, Berlin, pp 205–215 Neto JL, Freitas AA, Kaestner CAA (2002) Automatic text summarization using a machine learning approach. In: Proceedings of the 16th Brazilian symposium on artificial intelligence: advances in artificial intelligence, SBIA’02. Springer, Berlin, pp 205–215
55.
go back to reference Filippova K, Altun Y (2013) Overcoming the lack of parallel data in sentence compression. In: Proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, Washington, USA, pp 1481–1491 Filippova K, Altun Y (2013) Overcoming the lack of parallel data in sentence compression. In: Proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, Washington, USA, pp 1481–1491
56.
go back to reference Colmenares CA, Litvak M, Mantrach A, Silvestri F (2015) HEADS: Headline generation as sequence prediction using an abstract feature-rich space. In: Proceedings of the 2015 conference of the North American chapter of the association for computational linguistics: human language technologies. Association for Computational Linguistics, Denver, Colorado, pp 133–142 Colmenares CA, Litvak M, Mantrach A, Silvestri F (2015) HEADS: Headline generation as sequence prediction using an abstract feature-rich space. In: Proceedings of the 2015 conference of the North American chapter of the association for computational linguistics: human language technologies. Association for Computational Linguistics, Denver, Colorado, pp 133–142
57.
go back to reference Kryscinski W, Keskar NS, McCann B, Xiong C, Socher R (2019) Neural text summarization: a critical evaluation. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, pp 540–551 Kryscinski W, Keskar NS, McCann B, Xiong C, Socher R (2019) Neural text summarization: a critical evaluation. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, pp 540–551
58.
go back to reference See A, Liu PJ, Manning CD (2017) Get to the point: summarization with pointer-generator networks. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Vancouver, Canada, pp 1073–1083 See A, Liu PJ, Manning CD (2017) Get to the point: summarization with pointer-generator networks. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Vancouver, Canada, pp 1073–1083
59.
Tan J, Wan X, Xiao J (2017) Abstractive document summarization with a graph-based attentional neural model. In: Proceedings of the 55th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Vancouver, Canada, pp 1171–1181
60.
Cohan A, Dernoncourt F, Kim DS, Bui T, Kim S, Chang W, Goharian N (2018) A discourse-aware attention model for abstractive summarization of long documents. In: Proceedings of the 2018 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, volume 2 (short papers). Association for Computational Linguistics, New Orleans, Louisiana, pp 615–621
61.
Hsu WT, Lin CK, Lee MY, Min K, Tang J, Sun M (2018) A unified model for extractive and abstractive summarization using inconsistency loss. In: Proceedings of the 56th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Melbourne, Australia, pp 132–141
62.
Liu L, Tang J, Wan X, Guo Z (2019) Generating diverse and descriptive image captions using visual paraphrases. In: 2019 IEEE/CVF international conference on computer vision (ICCV), pp 4239–4248
63.
Gehrmann S, Deng Y, Rush A (2018) Bottom-up abstractive summarization. In: Proceedings of the 2018 conference on empirical methods in natural language processing. Association for Computational Linguistics, Brussels, Belgium, pp 4098–4109
64.
Chen YC, Bansal M (2018) Fast abstractive summarization with reinforce-selected sentence rewriting. In: Proceedings of the 56th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Melbourne, Australia, pp 675–686
65.
Moirangthem DS, Lee M (2020) Abstractive summarization of long texts by representing multiple compositionalities with temporal hierarchical pointer generator network. Neural Netw 124:1–11
66.
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition, pp 248–255
67.
Vinyals O, Toshev A, Bengio S, Erhan D (2015) Show and tell: a neural image caption generator. In: 2015 IEEE conference on computer vision and pattern recognition, pp 3156–3164
68.
Zhang Y, Ding DY, Qian T, Manning CD, Langlotz CP (2018) Learning to summarize radiology findings. In: Proceedings of the ninth international workshop on health text mining and information analysis. Association for Computational Linguistics, Brussels, Belgium, pp 204–213
69.
Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: 3rd international conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015
70.
Razavian AS, Azizpour H, Sullivan J, Carlsson S (2014) CNN features off-the-shelf: an astounding baseline for recognition. In: 2014 IEEE conference on computer vision and pattern recognition workshops, pp 512–519
71.
Raghu M, Zhang C, Kleinberg J, Bengio S (2019) Transfusion: understanding transfer learning for medical imaging. In: Wallach H, Larochelle H, Beygelzimer A, d'Alché-Buc F, Fox E, Garnett R (eds) Advances in neural information processing systems, vol 32. Curran Associates, Inc, Red Hook, pp 3347–3357
72.
Singh S, Karimi S, Ho-Shon K, Hamey L (2019) From chest X-rays to radiology reports: a multimodal machine learning approach. In: 2019 digital image computing: techniques and applications (DICTA), pp 1–8
73.
Papineni K, Roukos S, Ward T, Zhu WJ (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics. Philadelphia, Pennsylvania, United States
74.
Lin CY (2004) ROUGE: a package for automatic evaluation of summaries. In: 42nd annual meeting of the Association for Computational Linguistics. Barcelona, Spain, pp 1–8
75.
Banerjee S, Lavie A (2005) METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization. Ann Arbor, Michigan, United States, pp 65–72
76.
Vedantam R, Zitnick CL, Parikh D (2015) CIDEr: consensus-based image description evaluation. In: IEEE conference on computer vision and pattern recognition. Boston, Massachusetts, United States, pp 4566–4575
77.
Chen X, Fang H, Lin TY, Vedantam R, Gupta S, Dollár P, Zitnick CL (2015) Microsoft COCO captions: data collection and evaluation server. arXiv:1504.00325
79.
Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M et al (2016) TensorFlow: a system for large-scale machine learning. In: Proceedings of the 12th USENIX conference on operating systems design and implementation, OSDI'16. USENIX Association, USA, pp 265–283
80.
Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: Bengio Y, LeCun Y (eds) 3rd international conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, Conference Track Proceedings
81.
Pennington J, Socher R, Manning CD (2014) GloVe: global vectors for word representation. In: Empirical methods in natural language processing. Doha, Qatar, pp 1532–1543
82.
Johnson AEW, Pollard TJ, Greenbaum NR, Lungren MP, Deng CY, Peng Y, Lu Z, Mark RG, Berkowitz SJ, Horng S (2019) MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs
83.
Lindh A, Ross RJ, Mahalunkar A, Salton G, Kelleher JD (2018) Generating diverse and meaningful captions. In: Kůrková V, Manolopoulos Y, Hammer B, Iliadis L, Maglogiannis I (eds) Artificial neural networks and machine learning—ICANN 2018. Springer, Cham, pp 176–187
84.
Deshpande A, Aneja J, Wang L, Schwing AG, Forsyth D (2019) Fast, diverse and accurate image captioning guided by part-of-speech. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 10687–10696
85.
Peters M, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of the 2018 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, volume 1 (long papers). Association for Computational Linguistics, New Orleans, Louisiana, pp 2227–2237
86.
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, vol 30. Curran Associates, Inc, Red Hook, pp 5998–6008
87.
Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, volume 1 (long and short papers). Association for Computational Linguistics, Minneapolis, Minnesota, pp 4171–4186
Metadata
Title
Show, tell and summarise: learning to generate and summarise radiology findings from medical images
Authors
Sonit Singh
Sarvnaz Karimi
Kevin Ho-Shon
Len Hamey
Publication date
05-04-2021
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 13/2021
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-021-05943-6