
2019 | Original Paper | Book Chapter

19. Automatic Classification and Reporting of Multiple Common Thorax Diseases Using Chest Radiographs

Authors: Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Ronald M. Summers

Published in: Deep Learning and Convolutional Neural Networks for Medical Imaging and Clinical Informatics

Publisher: Springer International Publishing


Abstract

Chest X-rays are among the most common radiological examinations in daily clinical routine. Reporting thorax diseases from chest X-rays is often an entry-level task for radiologist trainees. Yet reading a chest X-ray image remains a challenging task for learning-oriented machine intelligence, due to (1) the shortage of large-scale, machine-learnable medical image datasets, and (2) the lack of techniques that can mimic the high-level reasoning of human radiologists, which requires years of knowledge accumulation and professional training. In this paper, we show that the clinical free-text radiological reports accompanying X-ray images in hospital picture archiving and communication systems can be utilized as a priori knowledge to tackle these two key problems. We propose a novel text-image embedding network (TieNet) for extracting distinctive image and text representations. Multi-level attention models are integrated into an end-to-end trainable CNN-RNN architecture to highlight meaningful text words and image regions. We first apply TieNet to classify chest X-rays using both image features and text embeddings extracted from the associated reports. The proposed auto-annotation framework achieves high accuracy (over 0.9 average AUC) in assigning disease labels on our hand-labeled evaluation dataset. Furthermore, we transform TieNet into a chest X-ray reporting system: it simulates the reporting process and outputs a disease classification together with a preliminary report, with X-ray images as the only input. The classification results are significantly improved (a 6% average increase in AUC) over the state-of-the-art baseline on an unseen, hand-labeled dataset (OpenI).
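To make the kind of architecture described in the abstract concrete, below is a minimal sketch of an end-to-end CNN-RNN model with additive spatial attention that jointly produces multi-label disease scores and report tokens from a chest X-ray. It is written in PyTorch; the ResNet-50 backbone, layer sizes, module names, the single additive attention layer, and the simple feature fusion for classification are all illustrative assumptions and do not reproduce the authors' TieNet.

```python
# Illustrative sketch only: a CNN-RNN with spatial attention that jointly
# predicts multi-label disease scores and generates report tokens.
# All sizes, names, and the attention form are assumptions, not TieNet itself.
import torch
import torch.nn as nn
import torchvision.models as models


class ChestXrayCaptionClassifier(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=256, hidden_dim=512, num_labels=14):
        super().__init__()
        # CNN backbone (ResNet-50 here) provides spatial image features.
        backbone = models.resnet50(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # B x 2048 x 7 x 7
        self.img_proj = nn.Linear(2048, hidden_dim)

        # Word embedding and LSTM decoder for report generation.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTMCell(embed_dim + hidden_dim, hidden_dim)

        # Additive (Bahdanau-style) attention over image regions.
        self.att_img = nn.Linear(hidden_dim, hidden_dim)
        self.att_hid = nn.Linear(hidden_dim, hidden_dim)
        self.att_score = nn.Linear(hidden_dim, 1)

        # Heads: next-word prediction and multi-label disease classification.
        self.word_head = nn.Linear(hidden_dim, vocab_size)
        self.cls_head = nn.Linear(hidden_dim * 2, num_labels)

    def attend(self, regions, h):
        # regions: B x R x H, h: B x H -> attended context: B x H
        scores = self.att_score(torch.tanh(self.att_img(regions) +
                                           self.att_hid(h).unsqueeze(1)))
        alpha = torch.softmax(scores, dim=1)
        return (alpha * regions).sum(dim=1)

    def forward(self, images, captions):
        feats = self.cnn(images)                                   # B x 2048 x 7 x 7
        B, C, H, W = feats.shape
        regions = self.img_proj(feats.view(B, C, H * W).transpose(1, 2))  # B x 49 x hid

        h = regions.mean(dim=1)                                    # init from pooled image
        c = torch.zeros_like(h)
        word_logits = []
        for t in range(captions.size(1) - 1):                      # teacher forcing
            ctx = self.attend(regions, h)
            h, c = self.lstm(torch.cat([self.embed(captions[:, t]), ctx], dim=1), (h, c))
            word_logits.append(self.word_head(h))

        # Fuse final text state with pooled image features for classification.
        cls_logits = self.cls_head(torch.cat([h, regions.mean(dim=1)], dim=1))
        return torch.stack(word_logits, dim=1), cls_logits


if __name__ == "__main__":
    model = ChestXrayCaptionClassifier()
    imgs = torch.randn(2, 3, 224, 224)
    caps = torch.randint(0, 1000, (2, 12))
    words, labels = model(imgs, caps)
    print(words.shape, labels.shape)   # (2, 11, 1000) (2, 14)
```

In a setup like this, the word-prediction head would be trained with a cross-entropy loss on report tokens and the classification head with a multi-label binary cross-entropy loss, so a single backward pass updates both tasks end to end.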


Metadata
Title
Automatic Classification and Reporting of Multiple Common Thorax Diseases Using Chest Radiographs
Authors
Xiaosong Wang
Yifan Peng
Le Lu
Zhiyong Lu
Ronald M. Summers
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-13969-8_19
