2022 | OriginalPaper | Chapter

14. Graph-Based Hierarchical Record Clustering for Unsupervised Entity Resolution

Authors: Islam Akef Ebeid, John R. Talburt, Md Abdus Salam Siddique

Published in: ITNG 2022 19th International Conference on Information Technology-New Generations

Publisher: Springer International Publishing

Abstract

The chapter addresses entity resolution, a critical problem in data cleaning, curation, and integration, with a focus on unsupervised methods. It introduces GDWM, a graph-based hierarchical record clustering technique that modifies the Data Washing Machine (DWM) algorithm to improve accuracy and efficiency. The method combines graph-based transitive closure with modularity optimization, which removes the need for repeated iterations and threshold setting. Experiments on synthetic benchmark datasets show that GDWM outperforms the original DWM, with significant improvements in precision, recall, F1 score, and execution time.
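The two building blocks named in the abstract can be illustrated with a minimal sketch. This is not the authors' GDWM implementation: it assumes records are numbered 0..n-1, that candidate matches arrive as a list of unique undirected pairs, and it uses the standard Newman modularity formula for an unweighted graph. The function names `transitive_closure` and `modularity` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def transitive_closure(n, pairs):
    """Cluster record ids 0..n-1 as connected components of the match graph."""
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for record in range(n):
        clusters[find(record)].append(record)
    return sorted(clusters.values())


def modularity(n, pairs, clusters):
    """Newman modularity Q = sum_c [L_c / m - (d_c / 2m)^2] of a partition.

    Assumes an unweighted, undirected graph with no duplicate pairs, and
    that every record id appears in exactly one cluster.
    """
    m = len(pairs)
    if m == 0:
        return 0.0
    label = {r: cid for cid, members in enumerate(clusters) for r in members}

    degree = [0] * n
    internal_edges = defaultdict(int)  # L_c: edges with both ends in cluster c
    for a, b in pairs:
        degree[a] += 1
        degree[b] += 1
        if label[a] == label[b]:
            internal_edges[label[a]] += 1

    degree_sum = defaultdict(int)  # d_c: total degree of cluster c
    for record in range(n):
        degree_sum[label[record]] += degree[record]

    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical example: six records, three pairwise matches.
example_pairs = [(0, 1), (1, 2), (3, 4)]
example_clusters = transitive_closure(6, example_pairs)
example_q = modularity(6, example_pairs, example_clusters)
```

In this sketch, transitive closure merges records 0, 1, and 2 into one cluster even though 0 and 2 were never matched directly, while the modularity score gives a single number for how well a candidate partition separates densely linked groups. How GDWM actually combines the two steps is described in the chapter itself and is not reproduced here.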

Metadata

Title: Graph-Based Hierarchical Record Clustering for Unsupervised Entity Resolution
Authors: Islam Akef Ebeid, John R. Talburt, Md Abdus Salam Siddique
Copyright Year: 2022
DOI: https://doi.org/10.1007/978-3-030-97652-1_14
