2017 | OriginalPaper | Chapter

An Extensible Ontology Modeling Approach Using Post Coordinated Expressions for Semantic Provenance in Biomedical Research

Authors: Joshua Valdez, Michael Rueschman, Matthew Kim, Sara Arabyarmohammadi, Susan Redline, Satya S. Sahoo

Published in: On the Move to Meaningful Internet Systems. OTM 2017 Conferences

Publisher: Springer International Publishing

Abstract

Provenance metadata, which describes the source or origin of data, is critical for verifying and validating the results of scientific experiments. Indeed, the reproducibility of scientific studies is rapidly gaining attention in the research community, for example in biomedical and healthcare research. To address this challenge in the biomedical research domain, we have developed Provenance for Clinical and Healthcare Research (ProvCaRe) using the World Wide Web Consortium (W3C) PROV specifications, including the PROV Ontology (PROV-O). In the ProvCaRe project, we are extending PROV-O to create a formal model of the provenance information that is necessary for scientific reproducibility and replication in biomedical research. However, developing the ProvCaRe ontology raises several challenges: (1) Ontology engineering: modeling all biomedical provenance-related terms in an ontology has an undefined scope and is not feasible before the release of the ontology; (2) Redundancy: a large number of existing biomedical ontologies already model relevant biomedical terms; and (3) Ontology maintenance: adding or deleting terms in a large ontology is error-prone, which makes the ontology difficult to maintain over time. Therefore, instead of modeling all classes and properties in an ontology before deployment (also called precoordination), we propose the “ProvCaRe Compositional Grammar Syntax” to model ontology classes on demand (also called postcoordination). The compositional grammar syntax allows us to reuse existing biomedical ontology classes and to compose provenance-specific terms that extend PROV-O classes and properties. We demonstrate the application of this approach in the ProvCaRe ontology and the use of the ontology in the development of the ProvCaRe knowledgebase, which consists of more than 38 million provenance triples automatically extracted from 384,802 published research articles using a text processing workflow.
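The idea of postcoordination described in the abstract can be illustrated with a minimal Python sketch. The abstract does not reproduce the ProvCaRe Compositional Grammar Syntax itself, so the `base : attribute = value` notation, the class and method names, and the `ex:` terms below are hypothetical and chosen only for illustration. Only `prov:Entity`, `prov:Activity`, `prov:Agent`, and `prov:used` are taken from PROV-O. The triple output is a simplification and not a full OWL class definition.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# The three core PROV-O classes that a composed term may extend.
PROV_BASE_CLASSES = {"prov:Entity", "prov:Activity", "prov:Agent"}


@dataclass(frozen=True)
class Refinement:
    """One attribute/value pair that narrows the base class."""
    attribute: str  # a property, e.g. the PROV-O property prov:used
    value: str      # a class reused from an existing ontology (placeholder here)


@dataclass
class PostCoordinatedExpression:
    """A term composed on demand instead of being predefined in the ontology."""
    base: str
    refinements: List[Refinement] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumption of this sketch: every composed term extends a core PROV-O class.
        if self.base not in PROV_BASE_CLASSES:
            raise ValueError(f"{self.base} is not a core PROV-O class")

    def refine(self, attribute: str, value: str) -> "PostCoordinatedExpression":
        """Add a refinement and return self so that calls can be chained."""
        self.refinements.append(Refinement(attribute, value))
        return self

    def render(self) -> str:
        """Serialize in a hypothetical 'base : attribute = value' notation."""
        if not self.refinements:
            return self.base
        parts = ", ".join(f"{r.attribute} = {r.value}" for r in self.refinements)
        return f"{self.base} : {parts}"

    def to_triples(self, subject: str) -> List[Tuple[str, str, str]]:
        """Flatten the expression into simplified subject-predicate-object triples."""
        triples = [(subject, "rdfs:subClassOf", self.base)]
        triples += [(subject, r.attribute, r.value) for r in self.refinements]
        return triples


# Hypothetical usage: an activity that uses a term reused from another ontology.
expression = PostCoordinatedExpression("prov:Activity").refine(
    "prov:used", "ex:Polysomnography"
)
print(expression.render())
for triple in expression.to_triples("ex:SleepRecording"):
    print(triple)
```

In this sketch a new term exists only as a composition of a PROV-O class and terms that other ontologies already define, so nothing has to be added to or deleted from the ontology when a new provenance term is needed.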

Footnotes
2
The namespaces of the terms used in the expressions are omitted for brevity.
 
Metadata
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-69459-7_23