2022 | OriginalPaper | Chapter

Predicting the Usage of Scientific Datasets Based on Article, Author, Institution, and Journal Bibliometrics

Authors: Daniel E. Acuna, Zijun Yi, Lizhen Liang, Han Zhuang

Published in: Information for a Better World: Shaping the Global Future

Publisher: Springer International Publishing

Abstract

Scientific datasets are increasingly crucial for knowledge accumulation and reproducibility, making it essential to understand how they are used. Although usage information is hard to obtain, features of the publications that describe a dataset can provide clues. This article associates dataset downloads with the authors’ h-index, institutional prestige, journal ranking, and the references used in the publication that first introduces each dataset. Our analysis uses tens of thousands of datasets and associated publications from figshare.com. We found that a gradient boosting model outperformed linear regression, random forests, and artificial neural networks. Our interpretation results suggest that journal ranking is highly predictive of usage, while the authors’ institutional prestige and h-index are less critical. In addition, we found that publications with a long but focused body of references are associated with more dataset downloads. We also show that prediction performance decays rapidly the farther into the future we estimate downloads. Finally, we discuss the implications of our work for reproducibility and data policies.
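For readers unfamiliar with the h-index named above as one of the author-level features: it is the largest number h such that an author has h publications with at least h citations each. The following is a minimal sketch of that standard definition only; the function name `h_index` and the citation counts are hypothetical illustrations, not data or code from the chapter.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one author (not taken from the chapter).
example_citations = [10, 8, 5, 4, 3]
print(h_index(example_citations))  # prints 4
```

How the chapter aggregates the h-index across several co-authors is not stated in the abstract, so this sketch covers a single author only.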

Metadata

Title: Predicting the Usage of Scientific Datasets Based on Article, Author, Institution, and Journal Bibliometrics
Authors: Daniel E. Acuna, Zijun Yi, Lizhen Liang, Han Zhuang
Copyright Year: 2022
DOI: https://doi.org/10.1007/978-3-030-96957-8_5
