Diagnostic Knowledge Extraction from MedlinePlus: An Application for Infectious Diseases

  • Conference paper
9th International Conference on Practical Applications of Computational Biology and Bioinformatics

Abstract

Building an accurate diagnostic decision support system (DDSS) requires validated and precise knowledge. Medical experts are typically the source of this knowledge, but it is not always possible to obtain all the desired information from them. Medical books and articles that describe the diagnosis of the diseases a DDSS covers are another valuable source, although extracting the information from them is not easy either. In this paper we present the results of our research, in which we used Web scraping combined with natural language processing techniques to extract diagnostic criteria from MedlinePlus articles about infectious diseases.

An erratum to this chapter can be found under DOI 10.1007/978-3-319-19776-0_16.

Notes

  1. http://www.diseasesdatabase.com/
  2. http://jsoup.org/
  3. http://www.medicinenet.com/procedures_and_tests/article.htm
  4. http://bioportal.bioontology.org/
  5. http://www.openlifedata.org/
  6. https://jena.apache.org/
  7. http://sourceforge.net/projects/simmetrics/
  8. https://github.com/alejandrorg/medlineplus2ddx
  9. https://github.com/alejandrorg/medlineplus2ddx/blob/master/diseasesList.txt
  10. https://github.com/alejandrorg/medlineplus2ddx/blob/master/Results.xlsx

Acknowledgments

The work of Alejandro Rodríguez-González and Mark Wilkinson is supported by the Isaac Peral Programme of the UPM. The work of Marcos Martínez-Romero has been supported by a postdoctoral fellowship from the Xunta de Galicia, Spain (ref. POS-A/2013/197).

Author information

Corresponding author

Correspondence to Alejandro Rodríguez-González.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Rodríguez-González, A., Martínez-Romero, M., Costumero, R., Wilkinson, M.D., Menasalvas-Ruiz, E. (2015). Diagnostic Knowledge Extraction from MedlinePlus: An Application for Infectious Diseases. In: Overbeek, R., Rocha, M., Fdez-Riverola, F., De Paz, J. (eds) 9th International Conference on Practical Applications of Computational Biology and Bioinformatics. Advances in Intelligent Systems and Computing, vol 375. Springer, Cham. https://doi.org/10.1007/978-3-319-19776-0_9

  • DOI: https://doi.org/10.1007/978-3-319-19776-0_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19775-3

  • Online ISBN: 978-3-319-19776-0

  • eBook Packages: Engineering, Engineering (R0)
