Published in: Neuroinformatics 3/2019

31-10-2018 | Software Original Article

Automated Metadata Suggestion During Repository Submission

Authors: Robert A. McDougal, Isha Dalal, Thomas M. Morse, Gordon M. Shepherd



Abstract

Knowledge discovery via an informatics resource is constrained by the completeness of the resource, both in the amount of data it contains and in the metadata that describes the data. Increasing completeness in one of these categories risks reducing completeness in the other, because manually curating metadata is time-consuming and is limited by the curator's familiarity with both the data and the metadata annotation scheme. The diverse interests of a research community may drive a resource to have hundreds of metadata tags with few examples for each, making it challenging for humans or machine learning algorithms to learn how to assign metadata tags properly. We demonstrate with ModelDB, a computational neuroscience model discovery resource, that manually curated regular-expression-based rules can overcome this challenge: during data entry, the rules parse existing texts supplied by data providers to suggest metadata annotations and prompt the providers to suggest other related metadata annotations, rather than leaving the task to a curator. In the ModelDB implementation, analyzing the abstract identified 6.4 metadata tags per abstract at 79% precision. Using the full text produced higher recall but low precision (41%), and the title alone produced few (1.3) metadata annotations per entry; we thus recommend that data providers use their abstract during upload. Grouping the possible metadata annotations into categories (e.g. cell type, biological topic) revealed that precision and recall for the different text sources vary by category. Given this proof of concept, other bioinformatics resources can likewise improve the quality of their metadata by adopting our approach of prompting data uploaders with relevant metadata, at the minimal cost of formalizing rules for each potential metadata annotation.
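The rule-based approach described in the abstract can be illustrated with a minimal Python sketch. It is not ModelDB's implementation: the tag names, the regular expressions, and the helper functions `suggest_tags` and `precision_recall` are all hypothetical, chosen only to show how one curated pattern per tag could turn submitted text into suggested annotations, and how precision and recall against a curator's tags might then be computed.

```python
import re

# Hypothetical rules (tag name -> compiled regular expression).
# These patterns are illustrative assumptions, not ModelDB's actual rule set.
RULES = {
    "Hippocampus CA1 pyramidal cell": re.compile(
        r"\bCA1\b.*\bpyramidal\b", re.IGNORECASE | re.DOTALL
    ),
    "Calcium dynamics": re.compile(
        r"\bcalcium\s+(dynamics|kinetics)\b", re.IGNORECASE
    ),
    "Synaptic plasticity": re.compile(
        r"\b(LTP|long[- ]term[- ]potentiation|synaptic plasticity)\b",
        re.IGNORECASE,
    ),
    "Oscillations": re.compile(r"\boscillat(ion|ions|ory)\b", re.IGNORECASE),
}


def suggest_tags(text, rules=RULES):
    """Return the sorted tags whose rule matches anywhere in the text."""
    return sorted(tag for tag, pattern in rules.items() if pattern.search(text))


def precision_recall(suggested, curated):
    """Compare suggested tags with a curator's tags.

    precision = correct suggestions / all suggestions
    recall    = correct suggestions / all curated tags
    """
    suggested, curated = set(suggested), set(curated)
    true_positives = len(suggested & curated)
    precision = true_positives / len(suggested) if suggested else 0.0
    recall = true_positives / len(curated) if curated else 0.0
    return precision, recall


# Made-up abstract text and curator tags, for demonstration only.
example_abstract = (
    "We model calcium dynamics in dendritic spines of CA1 pyramidal "
    "neurons during theta oscillations."
)
example_suggested = suggest_tags(example_abstract)
example_curated = ["Calcium dynamics", "Oscillations"]
example_precision, example_recall = precision_recall(
    example_suggested, example_curated
)
```

In this toy example three tags are suggested and two of them agree with the made-up curator list, so precision is 2/3 and recall is 1. Applying the same rules to a longer text such as the full paper would be expected to trigger more rules, which is consistent with the higher recall and lower precision the abstract reports for full text.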


Metadata
Title
Automated Metadata Suggestion During Repository Submission
Authors
Robert A. McDougal
Isha Dalal
Thomas M. Morse
Gordon M. Shepherd
Publication date
31-10-2018
Publisher
Springer US
Published in
Neuroinformatics / Issue 3/2019
Print ISSN: 1539-2791
Electronic ISSN: 1559-0089
DOI
https://doi.org/10.1007/s12021-018-9403-z
