2021 | OriginalPaper | Chapter

Reconciling and Using Historical Person Registers as Linked Open Data in the AcademySampo Portal and Data Service

Authors: Petri Leskinen, Eero Hyvönen

Published in: The Semantic Web – ISWC 2021

Publisher: Springer International Publishing

Abstract

This paper presents a method for automatically extracting and reassembling a genealogical network from a biographical register of historical people. The method is applied to a dataset of short textual biographies of all 28 000 Finnish and Swedish academic people educated in Finland in 1640–1899. The aim is to connect and disambiguate the relatives mentioned in the biographies in order to build a continuous genealogical network that can be used in Digital Humanities for data and network analysis of historical academic people and their lives. An artificial neural network approach is presented for the supervised learning task of disambiguating relatives mentioned in the register descriptions, using basic biographical information enhanced with an ontology of vocations and additional, occasionally sparse, genealogical information. Evaluation results of the record linkage are promising and provide novel insights into the problem of reconciling historical person registers. The outcome of the work has been used in practice as part of the in-use AcademySampo portal and linked open data service, a new member of the Sampo series of cultural heritage applications for Digital Humanities.
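As a rough illustration of the disambiguation step described above, the following sketch shows how a relative mentioned in a biography and a candidate register entry might be turned into a numeric feature vector for a supervised classifier. It is not the authors' implementation: the `Person` fields, the `pair_features` function, and the choice of features are assumptions made for illustration, and exact vocation matching stands in for the ontology-based comparison that the abstract mentions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Person:
    # Hypothetical record layout; the real register fields are not given here.
    name: str
    birth_year: Optional[int] = None
    vocation: Optional[str] = None


def pair_features(mention: Person, candidate: Person) -> List[float]:
    """Build a feature vector for one (mention, candidate) pair."""
    # String similarity of the two names, in the range 0..1.
    name_sim = SequenceMatcher(
        None, mention.name.lower(), candidate.name.lower()
    ).ratio()

    # Genealogical data is occasionally sparse, so a flag records whether
    # the year comparison was possible at all.
    if mention.birth_year is None or candidate.birth_year is None:
        year_gap, year_known = 0.0, 0.0
    else:
        year_gap = abs(mention.birth_year - candidate.birth_year) / 100.0
        year_known = 1.0

    # Assumption: exact match as a stand-in for an ontology-based comparison.
    same_vocation = (
        1.0 if mention.vocation and mention.vocation == candidate.vocation else 0.0
    )
    return [name_sim, year_gap, year_known, same_vocation]
```

Vectors of this kind, labeled as match or non-match, could then serve as training input for a neural network classifier. The explicit "year known" flag is one simple way to let a model distinguish a missing value from a true zero difference.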
Footnotes
1
The portal and its linked open data service, including a SPARQL endpoint, were released on February 5, 2021. More information about AcademySampo can be found on the project homepage: https://seco.cs.aalto.fi/projects/yo-matrikkelit/.
 
8
This statistical result was obtained after we used the reconciled data in AcademySampo for data analysis.
 
Metadata
Title
Reconciling and Using Historical Person Registers as Linked Open Data in the AcademySampo Portal and Data Service
Authors
Petri Leskinen
Eero Hyvönen
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-88361-4_42
