
2019 | OriginalPaper | Chapter

Concept Bag: A New Method for Computing Concept Similarity in Biomedical Data

Authors: Richard L. Bradshaw, Ramkiran Gouripeddi, Julio C. Facelli

Published in: Bioinformatics and Biomedical Engineering

Publisher: Springer International Publishing


Abstract

Biomedical data are a rich source of information and knowledge, not only for direct patient care but also for secondary use in population health, clinical research, and translational research. Biomedical data are typically scattered across multiple systems, so syntactic and semantic data integration is necessary to realize the data’s full potential. This paper introduces new algorithms devised to support automatic and semi-automatic integration of semantically heterogeneous biomedical data. The algorithms combine data mining and biomedical informatics methods to create “concept bags” in the same way that “word bags” are used in data mining and text retrieval. The methods are highly configurable and were tested in five different ways on different types of biomedical data. They performed well in computing similarity between medical terms and data elements, both of which are critical for automatic and semi-automatic data integration operations.
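The analogy between “word bags” and “concept bags” can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the concept-extraction step nor the similarity measure. The lookup table `CONCEPT_LOOKUP`, its placeholder codes, the n-gram matching in `concept_bag`, and the use of Jaccard similarity are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet

# Hypothetical phrase-to-concept lookup. The codes are made-up placeholders,
# not identifiers from any real terminology.
CONCEPT_LOOKUP: Dict[str, str] = {
    "heart attack": "C001",
    "myocardial infarction": "C001",
    "acute": "C002",
    "chronic": "C003",
}


def word_bag(text: str) -> FrozenSet[str]:
    """Classic bag of words: the set of lowercased tokens."""
    return frozenset(text.lower().split())


def concept_bag(
    text: str,
    lookup: Dict[str, str] = CONCEPT_LOOKUP,
    max_words: int = 3,
) -> FrozenSet[str]:
    """Assumed approach: map every word n-gram found in the lookup to a code."""
    words = text.lower().split()
    bag = set()
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            code = lookup.get(" ".join(words[i:i + n]))
            if code is not None:
                bag.add(code)
    return frozenset(bag)


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined here as 0.0 for two empty bags."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    t1 = "acute heart attack"
    t2 = "acute myocardial infarction"
    # The word bags share only one token out of five.
    print(jaccard(word_bag(t1), word_bag(t2)))
    # The concept bags are identical because both phrases map to the same codes.
    print(jaccard(concept_bag(t1), concept_bag(t2)))
```

In this toy setting the two terms share only one token at the word level, yet their concept bags coincide. That is the kind of semantic match a word-level comparison would miss.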


Metadata
Title: Concept Bag: A New Method for Computing Concept Similarity in Biomedical Data
Authors: Richard L. Bradshaw, Ramkiran Gouripeddi, Julio C. Facelli
Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-17935-9_2
