Published in: International Journal on Digital Libraries 1/2020

28.10.2018

Assessing plausibility of scientific claims to support high-quality content in digital collections

Authors: José María González Pinto, Wolf-Tilo Balke

Abstract

This paper presents a formalization and extension of a novel approach to support high-quality content in digital libraries. Building on the concept of plausibility used in cognitive sciences, we aim to judge the plausibility of new scientific papers in light of prior knowledge. In particular, our work proposes a novel assessment of scientific papers to qualitatively support the work of reviewers. To do this, our approach focuses on the key aspect of scientific papers: claims. Claims are sentences in empirical scientific papers that state statistical associations between entities and correspond to the papers' core contributions. Such claims appear, for instance, in medicine, chemistry, and biology, where consuming a drug, a substance, or a product causes an effect on another entity such as a disease, another drug, or another substance. To operationalize this notion of plausibility, we treat claims as first-class citizens of scientific digital libraries and exploit state-of-the-art neural embedding representations of text and topic models. As a proof of concept of the potential usefulness of this notion, we report extensive experiments on scientific papers from the PubMed digital library.

Footnotes
1
PubMed comprises more than 28 million citations for biomedical literature from MEDLINE, life science journals, and online books.
 
Metadata
Title
Assessing plausibility of scientific claims to support high-quality content in digital collections
Authors
José María González Pinto
Wolf-Tilo Balke
Publication date
28.10.2018
Publisher
Springer Berlin Heidelberg
Published in
International Journal on Digital Libraries / Issue 1/2020
Print ISSN: 1432-5012
Electronic ISSN: 1432-1300
DOI
https://doi.org/10.1007/s00799-018-0256-8
