2021 | OriginalPaper | Chapter

Knowledge-Based Named Entity Recognition of Archaeological Concepts in Dutch

Authors: Andreas Vlachidis, Douglas Tudhope, Milco Wansleeben

Published in: Metadata and Semantic Research

Publisher: Springer International Publishing

Abstract

Advances in Natural Language Processing (NLP) allow the extraction of information from large volumes of text to be automated, making text-based resources more discoverable and useful. This paper turns to one of the most important, but traditionally difficult to access, resources in archaeology: the largely unpublished reports generated by commercial or “rescue” archaeology, commonly known as “grey literature”. It presents the development and evaluation of a Named Entity Recognition system for Dutch archaeological grey literature, targeted at extracting mentions of artefacts, archaeological features, materials, places and time entities. The role of domain vocabulary is discussed in the development of a pipeline driven by a Knowledge Organization System (KOS), which is evaluated against a human-annotated Gold Standard corpus.
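To make the idea of vocabulary-driven recognition and Gold Standard evaluation concrete, the following is a minimal sketch and not the authors' system. It assumes a tiny hand-made vocabulary of Dutch terms, exact case-insensitive matching, and exact-span scoring; the term list, the label names and the functions `recognise` and `evaluate` are all hypothetical and chosen only for illustration.

```python
import re
from typing import NamedTuple


class Entity(NamedTuple):
    start: int   # character offset where the mention begins
    end: int     # character offset just past the mention
    label: str   # entity type assigned by the vocabulary


# Hypothetical mini-vocabulary (assumption): a real KOS would hold
# thousands of terms with preferred and alternative labels.
VOCABULARY = {
    "bijl": "ARTEFACT",
    "kuil": "FEATURE",
    "vuursteen": "MATERIAL",
    "leiden": "PLACE",
    "late bronstijd": "PERIOD",
    "bronstijd": "PERIOD",
}


def build_matcher(vocabulary):
    # Longest terms first, so "late bronstijd" is tried before "bronstijd".
    terms = sorted(vocabulary, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def recognise(text, vocabulary=VOCABULARY):
    """Return every non-overlapping vocabulary mention found in text."""
    matcher = build_matcher(vocabulary)
    return [
        Entity(m.start(), m.end(), vocabulary[m.group(0).lower()])
        for m in matcher.finditer(text)
    ]


def evaluate(predicted, gold):
    """Exact-match precision, recall and F1 against gold annotations."""
    true_positives = len(set(predicted) & set(gold))
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    sentence = "Een bijl van vuursteen uit de late bronstijd lag in een kuil bij Leiden."
    for entity in recognise(sentence):
        print(sentence[entity.start:entity.end], entity.label)
```

Exact lookup of this kind cannot handle inflected forms, spelling variants or ambiguous terms, which is one reason a full pipeline would need more than a flat term list.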

Footnotes
1
Simple Knowledge Organization System (SKOS) is a Semantic Web format and a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject heading systems, or any other type of structured controlled vocabulary: https://www.w3.org/2004/02/skos/.
 
Metadata
Title
Knowledge-Based Named Entity Recognition of Archaeological Concepts in Dutch
Authors
Andreas Vlachidis
Douglas Tudhope
Milco Wansleeben
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-71903-6_6