2019 | Original Paper | Book Chapter

How Lexical Gold Standards Have Effects on the Usefulness of Text Analysis Tools for Digital Scholarship

Abstract

This paper describes how current gold standards for lexical similarity and analogy are built to conform to certain ideas about what the models they evaluate are used for. Topical relevance has always been the most important target notion for information access tools and related language technologies. While this has proven a useful starting point for much of what information technology is used for, it does not always align well with other uses to which these technologies are being put, most notably use cases from digital scholarship in the humanities and social sciences. This paper argues for a more systematic formulation of requirements from the digital humanities and social sciences and a more explicit description of the assumptions underlying model design.

Footnotes
1
Spärck Jones argues that this should be understood in terms of occurrence statistics rather than more elusive statistical notions. However, the target notion is a relevance-oriented one.
 
References
1.
Baroni, M., Bernardi, R., Do, N.Q., Shan, C.C.: Entailment above the word level in distributional semantics. In: Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics. ACL (2012)
2.
Baroni, M., Lenci, A.: How we BLESSed distributional semantic evaluation. In: Proceedings of the GEMS 2011 Workshop on GEometrical Models of Natural Language Semantics. Association for Computational Linguistics (2011)
3.
Chiarello, C., Burgess, C., Richards, L., Pollock, A.: Semantic and associative priming in the cerebral hemispheres: some words do, some words don’t... sometimes, some places. Brain Lang. 38(1), 75–104 (1990)
4.
Da, N.Z.: The computational case against computational literary studies. Crit. Inq. 45(3), 601–639 (2019)
5.
Da, N.Z.: The digital humanities debacle: computational methods repeatedly come up short. The Chronicle of Higher Education (2019)
6.
Finkelstein, L., et al.: Placing search in context: the concept revisited. In: Proceedings of the International Conference on World Wide Web. ACM (2001)
7.
Fitzpatrick, K.: The humanities, done digitally. The Chronicle of Higher Education (2011)
8.
Hill, F., Reichart, R., Korhonen, A.: Simlex-999: evaluating semantic models with (genuine) similarity estimation. Comput. Linguist. 41, 665–695 (2016)
9.
Jänicke, S., Franzini, G., Cheema, M.F., Scheuermann, G.: On close and distant reading in digital humanities: a survey and future challenges. In: Eurographics Conference on Visualization (EuroVis), vol. 2 (2015)
10.
Katz, S.M.: Distribution of content words and phrases in text and language modelling. Nat. Lang. Eng. 2(1), 15–59 (1996)
11.
Landauer, T.K., Dumais, S.T.: A solution to Plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol. Rev. 104(2), 211–240 (1997)
12.
Mikolov, T., Yih, W., Zweig, G.: Linguistic regularities in continuous space word representations. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (2013)
13.
Moretti, F.: Distant Reading. Verso Books, London (2013)
14.
O’Connor, B., Bamman, D., Smith, N.A.: Computational text analysis for social science: model assumptions and complexity. In: Second Workshop on Computational Social Science and the Wisdom of Crowds (2011)
15.
Rubenstein, H., Goodenough, J.B.: Contextual correlates of synonymy. Commun. ACM 8(10), 627–633 (1965)
16.
Schwartz, H.A., Gomez, F.: Evaluating semantic metrics on tasks of concept similarity. In: Proceedings of FLAIRS (2011)
17.
Sparck Jones, K.: A statistical interpretation of term specificity and its application in retrieval. J. Doc. 28(1), 11–21 (1972)
18.
Underwood, T.: Dear Humanists: Fear Not the Digital Revolution. The Chronicle of Higher Education (2019)
Metadata
Title
How Lexical Gold Standards Have Effects on the Usefulness of Text Analysis Tools for Digital Scholarship
Author
Jussi Karlgren
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-28577-7_14