2020 | OriginalPaper | Chapter

Incremental Multi-source Entity Resolution for Knowledge Graph Completion

Authors: Alieh Saeedi, Eric Peukert, Erhard Rahm

Published in: The Semantic Web

Publisher: Springer International Publishing

Abstract

We present and evaluate new methods for incremental entity resolution, as needed for completing knowledge graphs that integrate data from multiple sources. Compared to previous approaches, we aim to reduce the dependency on the order in which new sources and entities are added. For this purpose, we consider sets of new entities together in order to optimize their assignment to entity clusters. We also propose a lightweight approach for repairing entity clusters in order to correct wrong clusters. The new approaches are integrated within the FAMER framework for parallel and scalable entity clustering. A detailed evaluation of the new approaches on real-world workloads shows their high effectiveness. In particular, the repair approach outperforms the other incremental approaches and achieves the same quality as batch-like entity resolution, showing that its results are independent of the order in which new entities are added.
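
To illustrate the idea of assigning a set of new entities jointly rather than one at a time, here is a minimal sketch in Python. It is not the FAMER implementation and does not reproduce the chapter's algorithms. The function names (`jaccard`, `assign_batch`), the token-based Jaccard similarity, the threshold value, and the rule that a cluster holds at most one entity per source are all assumptions made for this example. The sketch scores every pair of new entity and existing cluster, then accepts the globally best pairs first, so the outcome does not depend on the order in which the new entities are listed.

```python
from typing import List, Tuple

Entity = Tuple[str, str]  # (source id, name); a hypothetical, simplified record


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (an assumed, illustrative measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def assign_batch(clusters: List[List[Entity]],
                 new_entities: List[Entity],
                 threshold: float = 0.5) -> List[List[Entity]]:
    """Assign a batch of new entities to existing (non-empty) clusters.

    Assumption: a cluster may contain at most one entity per source.
    Entities without a sufficiently similar cluster become new singletons.
    """
    candidates = []
    for ei, (src, name) in enumerate(new_entities):
        for ci, members in enumerate(clusters):
            if any(m_src == src for m_src, _ in members):
                continue  # cluster already holds an entity from this source
            score = max(jaccard(name, m_name) for _, m_name in members)
            if score >= threshold:
                candidates.append((score, name, ei, ci))

    # Look at the whole batch: best-scoring pairs are accepted first.
    # Ties are broken by entity name, not by input position.
    candidates.sort(key=lambda t: (-t[0], t[1], t[3]))

    result = [list(c) for c in clusters]
    assigned = set()
    for _, _, ei, ci in candidates:
        if ei in assigned:
            continue
        src = new_entities[ei][0]
        if any(m_src == src for m_src, _ in result[ci]):
            continue  # an earlier, better match from this source took the slot
        result[ci].append(new_entities[ei])
        assigned.add(ei)

    # Unmatched entities start their own clusters.
    for ei, entity in enumerate(new_entities):
        if ei not in assigned:
            result.append([entity])
    return result
```

Because candidate pairs are ranked over the whole batch, a weaker match that happens to arrive first cannot block a stronger one. A repair step as described in the abstract would additionally revisit existing clusters, which this sketch does not attempt.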

Metadata
Title
Incremental Multi-source Entity Resolution for Knowledge Graph Completion
Authors
Alieh Saeedi
Eric Peukert
Erhard Rahm
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-49461-2_23