2016 | OriginalPaper | Book chapter

9. Recent Advances in Object Identification

Written by: Carlo Batini, Monica Scannapieco

Published in: Data and Information Quality

Publisher: Springer International Publishing

Abstract

Research on object identification has produced several significant results in recent years across different areas of computer science. As observed in [140], it is well known that in data mining projects a large proportion of effort is spent on understanding data (20–30 %, reported in [566]) and on data preparation (50–70 %). Governmental organizations need to reconcile and integrate their huge and heterogeneous data assets; statistical agencies routinely link survey and administrative data; in the health sector, historical data on patients and analyses are to be linked to improve the effectiveness of operations and policies [80]; security agencies increasingly rely on the ability to correlate files referring to a single individual; and in bioinformatics, data linkage can help relate known genome sequences to a new, unknown sequence. Given this increasing interest in object identification, this chapter reviews the main trends and results in the area, with a focus on the most recent ones.

References
6.
Abowd JM, Vilhuber L (2005) The sensitivity of economic statistics to coding errors in personal identifiers. Journal of Business & Economic Statistics 23(2)
14.
Aizawa A, Oyama K (2005) A fast linkage detection scheme for multi-source information integration. In: Proceedings of the International Workshop on Challenges in Web Information Retrieval and Integration, 2005 (WIRI’05). IEEE, New York, pp 30–39
15.
Al-Lawati A, Lee D, McDaniel P (2005) Blocking-aware private record linkage. In: Proceedings of the 2nd International Workshop on Information Quality in Information Systems. ACM, New York, pp 59–68
16.
Altowim Y, Kalashnikov DV, Mehrotra S (2014) Progressive approach to relational entity resolution. Proceedings of the VLDB Endowment 7(11):999–1010
17.
Altwaijry H, Kalashnikov DV, Mehrotra S (2013) Query-driven approach to entity resolution. Proceedings of the VLDB Endowment 6(14):1846–1857
23.
Ananthakrishna R, Chaudhuri S, Ganti V (2002) Eliminating fuzzy duplicates in data warehouses. In: Proceedings of VLDB 2002, Hong Kong, pp 586–597
24.
Arasu A, Chaudhuri S, Kaushik R (2008) Transformation-based framework for record matching. In: IEEE 24th International Conference on Data Engineering (ICDE 2008). IEEE, New York, pp 40–49
25.
Arasu A, Chaudhuri S, Kaushik R (2009) Learning string transformations from examples. Proceedings of the VLDB Endowment 2(1):514–525
26.
Arasu A, Götz M, Kaushik R (2010) On active learning of record matching packages. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data. ACM, New York, pp 783–794
29.
Asher J, Fienberg SE, Stuart E, Zaslavsky A (2003) Inferences for finite populations using multiple data sources with different reference times. In: Proceedings of Statistics Canada Symposium 2002: Modelling Survey Data For Social and Economic Research. Statistics Canada, Ottawa, vol 385
37.
Batini C, Scannapieco M (2006) Data Quality: Concepts, Methodologies and Techniques (Data-Centric Systems and Applications). Springer, New York
50.
Beeri C, Kanza Y, Safra E, Sagiv Y (2004) Object fusion in geographic information systems. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases. VLDB Endowment, vol 30, pp 816–827
51.
Beeri C, Doytsher Y, Kanza Y, Safra E, Sagiv Y (2005) Finding corresponding objects when integrating several geo-spatial datasets. In: Proceedings of the 13th Annual ACM International Workshop on Geographic Information Systems. ACM, New York, pp 87–96
54.
Benjelloun O, Garcia-Molina H, Menestrina D, Su Q, Whang SE, Widom J (2009) Swoosh: a generic approach to entity resolution. The VLDB Journal, The International Journal on Very Large Data Bases 18(1):255–276
58.
Berjawi B (2013) Introduction to the Integration of Location-Based Services of Several Providers
66.
Bhattacharya I, Getoor L (2004) Deduplication and group detection using links. In: KDD Workshop on Link Analysis and Group Detection
67.
Bhattacharya I, Getoor L (2004) Iterative record linkage for cleaning and integration. In: Proceedings of the 9th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. ACM, New York, pp 11–18
68.
Bhattacharya I, Getoor L (2007) Collective entity resolution in relational data. ACM Transactions on Knowledge Discovery from Data (TKDD) 1(1):5
69.
Bhattacharya I, Getoor L, Licamele L (2006) Query-time entity resolution. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, pp 529–534
75.
Bilenko M, Mooney RJ (2003) Adaptive duplicate detection using learnable string similarity measures. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, pp 39–48
76.
Bilenko M, Kamath B, Mooney RJ (2006) Adaptive blocking: learning to scale up record linkage. In: Sixth International Conference on Data Mining, 2006 (ICDM’06). IEEE, New York, pp 87–96
80.
Blakely T, Salmond C (2002) Probabilistic record linkage and a method to calculate the positive predictive value. International Journal of Epidemiology 31(6):1246–1252
94.
Brizan DG, Tansel AU (2006) A survey of entity resolution and record linkage methodologies. Communications of the IIMA 6(3):41–50
121.
Chaudhuri S, Das Sarma A, Ganti V, Kaushik R (2007) Leveraging aggregate constraints for deduplication. In: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. ACM, New York, pp 437–448
122.
Chen CC, Knoblock CA, Shahabi C, Chiang YY, Thakkar S (2004) Automatically and accurately conflating orthoimagery and street maps. In: Proceedings of the 12th Annual ACM International Workshop on Geographic Information Systems. ACM, New York, pp 47–56
123.
Chen CC, Shahabi C, Knoblock CA, Kolahdouzan M (2006) Automatically and efficiently matching road networks with spatial attributes in unknown geometry systems. In: Proceedings of the Third Workshop on Spatio-Temporal Database Management (Co-located with VLDB2006), Seoul, pp 1–8
124.
Chen CC, Knoblock CA, Shahabi C (2008) Automatically and accurately conflating raster maps with orthoimagery. GeoInformatica 12(3):377–410
126.
Chen Z, Kalashnikov DV, Mehrotra S (2007) Adaptive graphical approach to entity resolution. In: Proceedings of the 7th ACM/IEEE-CS Joint Conference on Digital Libraries. ACM, New York, pp 204–213
127.
Chen Z, Kalashnikov DV, Mehrotra S (2009) Exploiting context analysis for combining multiple entity resolution systems. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data. ACM, New York, pp 207–218
136.
Christen P (2006) A comparison of personal name matching: techniques and practical issues. In: Sixth IEEE International Conference on Data Mining Workshops, 2006 (ICDM Workshops 2006). IEEE, New York, pp 290–294
137.
Christen P (2007) A two-step classification approach to unsupervised record linkage. In: Proceedings of the Sixth Australasian Conference on Data Mining and Analytics. Australian Computer Society, Inc., vol 70, pp 111–119
138.
Christen P (2008) Automatic record linkage using seeded nearest neighbour and support vector machine classification. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, pp 151–159
139.
Christen P (2012) A survey of indexing techniques for scalable record linkage and deduplication. IEEE Transactions on Knowledge and Data Engineering 24(9):1537–1555
140.
Christen P, Goiser K (2007) Quality and complexity measures for data linkage and deduplication. In: Quality Measures in Data Mining. Springer, New York, pp 127–151
141.
Christen P, Pudjijono A (2009) Accurate synthetic generation of realistic personal information. In: Advances in Knowledge Discovery and Data Mining. Springer, New York, pp 507–514
142.
Christen P, et al (2007) Towards parameter-free blocking for scalable record linkage. Australian National University, Canberra
145.
Cohen WW, Richman J (2002) Learning to match and cluster large high-dimensional data sets for data integration. In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, pp 475–480
159.
Culotta A, McCallum A (2005) Joint deduplication of multiple record types in relational data. In: Proceedings of the 14th ACM International Conference on Information and Knowledge Management. ACM, New York, pp 257–258
160.
Damerau FJ (1964) A technique for computer detection and correction of spelling errors. Communications of the ACM 7(3):171–176
171.
De Vries T, Ke H, Chawla S, Christen P (2009) Robust record linkage blocking using suffix arrays. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management. ACM, New York, pp 305–314
179.
Dong X, Halevy A, Madhavan J (2005) Reference reconciliation in complex information spaces. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data. ACM, New York, pp 85–96
183.
Draisbach U, Naumann F (2009) A comparison and generalization of blocking and windowing algorithms for duplicate detection. In: Proceedings of the International Workshop on Quality in Databases (QDB), pp 51–56
187.
Durham E, Xue Y, Kantarcioglu M, Malin B (2012) Quantifying the correctness, computational complexity, and security of privacy-preserving string comparators for record linkage. Information Fusion 13(4):245–259
188.
Dusserre L, Quantin C, Bouzelat H (1994) A one way public key cryptosystem for the linkage of nominal files in epidemiological studies. Medinfo 8:644–647
193.
Elfeky MG, Verykios VS, Elmagarmid AK (2002) Tailor: a record linkage toolbox. In: Proceedings of the 18th International Conference on Data Engineering, 2002. IEEE, New York, pp 17–28
197.
Elmagarmid AK, Ipeirotis PG, Verykios VS (2007) Duplicate record detection: a survey. IEEE Transactions on Knowledge and Data Engineering 19(1):1–16
226.
Fawcett T (2004) ROC graphs: notes and practical considerations for researchers. Machine Learning 31:1–38
231.
Filin S, Doytsher Y (2000) Detection of corresponding objects in linear-based map conflation. Surveying and Land Information Systems 60(2):117–128
232.
Filin S, Doytsher Y (2000) A linear conflation approach for the integration of photogrammetric information and GIS data. International Archives of Photogrammetry and Remote Sensing 33(B3/1; PART 3):282–288
242.
Fortier MFA, Ziou D, Armenakis C, Wang S (2000) Automated updating of road information from aerial images. In: American Society Photogrammetry and Remote Sensing Conference, pp 16–23
247.
Friedman C, Sideli R (1992) Tolerating spelling errors during patient validation. Computers and Biomedical Research 25(5):486–509
248.
Fung B, Wang K, Chen R, Yu PS (2010) Privacy-preserving data publishing: a survey of recent developments. ACM Computing Surveys (CSUR) 42(4):14
251.
Gabay Y, Doytsher Y (2000) Features-an approach to matching lines in partly similar engineering maps. Geomatica 54(3):297–310
262.
Getoor L, Machanavajjhala A (2012) Entity resolution: theory, practice & open challenges. Proceedings of the VLDB Endowment 5(12):2018–2019
268.
Goiser K, Christen P (2006) Towards automated record linkage. In: Proceedings of the Fifth Australasian Conference on Data Mining and Analytics. Australian Computer Society, Inc., vol 61, pp 23–31
278.
Gruenheid A, Dong XL, Srivastava D (2014) Incremental record linkage. PVLDB 7(9):697–708
279.
Grünwald PD (2007) The Minimum Description Length Principle. MIT Press, Cambridge
280.
Gu L, Baxter RA (2004) Adaptive filtering for efficient record linkage. In: SDM. SIAM, Philadelphia, pp 477–481
284.
Guo S, Dong XL, Srivastava D, Zajac R (2010) Record linkage with uniqueness constraints and erroneous values. Proceedings of the VLDB Endowment 3(1–2):417–428
287.
Hagan MT, Demuth HB, Beale MH, et al (1996) Neural Network Design, vol 1. PWS, Boston
290.
Hall R, Fienberg SE (2011) Privacy-preserving record linkage. In: Privacy in Statistical Databases. Springer, New York, pp 269–283
301.
Hassanzadeh O, Chiang F, Lee HC, Miller RJ (2009) Framework for evaluating clustering algorithms in duplicate detection. Proceedings of the VLDB Endowment 2(1):1282–1293
302.
Hastings J (2008) Automated conflation of digital gazetteer data. International Journal of Geographical Information Science 22(10):1109–1127
308.
Hernández MA, Stolfo SJ (1995) The merge/purge problem for large databases. In: ACM SIGMOD Record. ACM, New York, vol 24, pp 127–138
319.
Huang J, Ertekin S, Giles CL (2006) Efficient name disambiguation for large-scale databases. In: Knowledge Discovery in Databases: PKDD 2006. Springer, New York, pp 536–544
323.
Inan A, Kantarcioglu M, Bertino E, Scannapieco M (2008) A hybrid approach to private record linkage. In: IEEE 24th International Conference on Data Engineering, 2008 (ICDE 2008). IEEE, New York, pp 496–505
324.
Inan A, Kantarcioglu M, Ghinita G, Bertino E (2010) Private record matching using differential privacy. In: Proceedings of the 13th International Conference on Extending Database Technology. ACM, New York, pp 123–134
353.
Kalashnikov DV, Mehrotra S (2006) Domain-independent data cleaning via analysis of entity-relationship graph. ACM Transactions on Database Systems (TODS) 31(2):716–767
354.
Karakasidis A, Verykios VS (2010) Advances in privacy preserving record linkage. In: E-Activity and Intelligent Web Construction: Effects of Social Design, pp 22–29
355.
Kargupta H, Datta S, Wang Q, Sivakumar K (2003) On the privacy preserving properties of random data perturbation techniques. In: Third IEEE International Conference on Data Mining, 2003 (ICDM 2003). IEEE, New York, pp 99–106
364.
Keßler C, Janowicz K, Bishr M (2009) An agenda for the next generation gazetteer: geographic information contribution and retrieval. In: Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, New York, pp 91–100
377.
Köpcke H, Rahm E (2010) Frameworks for entity matching: a comparison. Data & Knowledge Engineering 69(2):197–210
378.
Köpcke H, Thor A, Rahm E (2010) Evaluation of entity resolution approaches on real-world match problems. Proceedings of the VLDB Endowment 3(1–2):484–493
382.
Kukich K (1992) Techniques for automatically correcting words in text. ACM Computing Surveys (CSUR) 24(4):377–439
384.
Lafferty JD, McCallum A, Pereira FCN (2001) Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the Eighteenth International Conference on Machine Learning (ICML ’01). Morgan Kaufmann, San Francisco, pp 282–289, URL http://dl.acm.org/citation.cfm?id=645530.655813
385.
Lait A, Randell B (1996) An assessment of name matching algorithms. Technical Report Series-University of Newcastle Upon Tyne Computing Science
391.
Lee C, Rey T, Mentele J, Garver M (2005) Structured neural network techniques for modeling loyalty and profitability. In: Proceedings of SAS User Group International (SUGI 30), pp 082–30
400.
Li L, Goodchild M (2012) Automatically and accurately matching objects in geospatial datasets. Advances in Geo-Spatial Information Science 10:71–79
424.
Malin B (2005) Unsupervised name disambiguation via social network similarity. In: Workshop on Link Analysis, Counterterrorism, and Security, vol 1401, pp 93–102
437.
Michelson M, Knoblock CA (2006) Learning blocking schemes for record linkage. Proceedings of the National Conference on Artificial Intelligence 21(1):440
438.
Michelson M, Knoblock CA (2007) Mining heterogeneous transformations for record linkage. In: Proceedings of the 6th International Workshop on Information Integration on the Web, pp 68–73
440.
Minami M, et al (2002) Using ArcMap. ESRI
441.
Minkov E, Cohen WW, Ng AY (2006) Contextual search and name disambiguation in email using graphs. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, pp 27–34
442.
Minton SN, Nanjo C, Knoblock CA, Michalowski M, Michelson M (2005) A heterogeneous field matching method for record linkage. In: Fifth IEEE International Conference on Data Mining. IEEE, New York
457.
Musavi MT, Shirvaikar MV, Ramanathan E, Nekovei A (1988) A vision based method to automate map processing. Pattern Recognition 21(4):319–326
473.
Nin J, Muntes-Mulero V, Martinez-Bazan N, Larriba-Pey JL (2007) On the use of semantic blocking techniques for data cleansing and integration. In: 11th International Database Engineering and Applications Symposium, 2007 (IDEAS 2007). IEEE, New York, pp 190–198
476.
Nuray-Turan R, Kalashnikov DV, Mehrotra S (2013) Adaptive connection strength models for relationship-based entity resolution. Journal of Data and Information Quality (JDIQ) 4(2):8
488.
Papadimitriou CH (2003) Computational Complexity. Wiley, New York
500.
Phua C, Smith-Miles K, Lee V, Gayler R (2012) Resilient identity crime detection. IEEE Transactions on Knowledge and Data Engineering 24(3):533–546
505.
Pixton B, Giraud-Carrier C (2006) Using structured neural networks for record linkage. In: Proceedings of the Sixth Annual Workshop on Technology for Family History and Genealogical Research
512.
Porter EH, Winkler WE, et al (1997) Approximate string comparison and its effect on an advanced record linkage system. In: Advanced Record Linkage System. US Bureau of the Census, Research Report, Citeseer
518.
Recchia G, Louwerse M (2013) A comparison of string similarity measures for toponym matching. In: Proceedings of The First ACM SIGSPATIAL International Workshop on Computational Models of Place, pp 54–61
523.
Reuther P, Walter B (2006) Survey on test collections and techniques for personal name matching. International Journal of Metadata, Semantics and Ontologies 1(2):89–99
526.
Richardson M, Domingos P (2006) Markov logic networks. Machine Learning 62(1–2):107–136
532.
Ruibin G, Tony K (2006) Syllable alignment: a novel model for phonetic string search. IEICE Transactions on Information and Systems 89(1):332–339
534.
Saalfeld AJ (1993) Conflation: Automated map compilation. PhD thesis, University of Maryland at College Park, College Park, MD, USA, UMI Order No. GAX93-27487
536.
Sadinle M, Fienberg SE (2013) A generalized Fellegi–Sunter framework for multiple record linkage with application to homicide record systems. Journal of the American Statistical Association 108(502):385–397
538.
Safra E, Kanza Y, Sagiv Y, Doytsher Y (2006) Efficient integration of road maps. In: Proceedings of the 14th Annual ACM International Symposium on Advances in Geographic Information Systems. ACM, New York, pp 59–66
539.
Safra E, Kanza Y, Sagiv Y, Beeri C, Doytsher Y (2010) Location-based algorithms for finding sets of corresponding objects over several geo-spatial data sets. International Journal of Geographical Information Science 24(1):69–106
540.
Safra E, Kanza Y, Sagiv Y, Doytsher Y (2013) Ad hoc matching of vectorial road networks. International Journal of Geographical Information Science 27(1):114–153
544.
Salzberg SL (1997) On comparing classifiers: pitfalls to avoid and a recommended approach. Data Mining and Knowledge Discovery 1(3):317–328
545.
Sarawagi S, Bhamidipaty A (2002) Interactive deduplication using active learning. Edmonton, Alberta, Canada
550.
Scannapieco M, Figotin I, Bertino E, Elmagarmid AK (2007) Privacy preserving schema and data matching. In: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. ACM, New York, pp 653–664
553.
Cohen WW, Schapire RE, Singer Y (1998) Learning to order things. In: Advances in Neural Information Processing Systems 10: Proceedings of the 1997 Conference. MIT Press, Cambridge, vol 10, p 451
555.
Schneier B (2007) Applied Cryptography: Protocols, Algorithms, and Source Code in C. Wiley, New York
558.
Sehgal V, Getoor L, Viechnicki PD (2006) Entity resolution in geospatial data integration. In: Proceedings of the 14th Annual ACM International Symposium on Advances in Geographic Information Systems. ACM, New York, pp 83–90
566.
Shearer C (2000) The CRISP-DM model: the new blueprint for data mining. Journal of Data Warehousing 5(4):13–22
580.
Singla P, Domingos P (2006) Entity resolution with Markov logic. In: Sixth International Conference on Data Mining, 2006 (ICDM’06). IEEE, New York, pp 572–582
582.
Smart PD, Jones CB, Twaroch FA (2010) Multi-source toponym data integration and mediation for a meta-gazetteer service. In: Geographic Information Science. Springer, New York, pp 234–248
613.
Trepetin S (2008) Privacy-preserving string comparisons in record linkage systems: a review. Information Security Journal: A Global Perspective 17(5–6):253–266
622.
Vapnik VN (1998) Statistical Learning Theory. Wiley, New York, vol 2
623.
Vatsalan D, Christen P, Verykios VS (2013) A taxonomy of privacy-preserving record linkage techniques. Information Systems 38(6):946–969
627.
Verykios VS, Karakasidis A, Mitrogiannis VK (2009) Privacy preserving record linkage approaches. International Journal of Data Mining, Modelling and Management 1(2):206–221
642.
Wang J, Kraska T, Franklin MJ, Feng J (2012) CrowdER: crowdsourcing entity resolution. Proceedings of the VLDB Endowment 5(11):1483–1494
660.
Weis M, Naumann F (2005) DogmatiX tracks down duplicates in XML. In: Proceedings of the SIGMOD 2005, pp 431–442
662.
Whang SE, Garcia-Molina H (2010) Entity resolution with evolving rules. Proceedings of the VLDB Endowment 3(1–2):1326–1337
663.
Whang SE, Garcia-Molina H (2014) Incremental entity resolution on rules and data. The VLDB Journal, The International Journal on Very Large Data Bases 23(1):77–102
664.
Whang SE, Marmaros D, Garcia-Molina H (2013) Pay-as-you-go entity resolution. IEEE Transactions on Knowledge and Data Engineering 25(5):1111–1124
670.
Winkler WE (1995) Matching and record linkage. Business Survey Methods 1:355–384
674.
Winkler WE (2006) Overview of record linkage and current research directions. In: Bureau of the Census, Citeseer
682.
Yakout M, Atallah MJ, Elmagarmid A (2009) Efficient private record linkage. In: IEEE 25th International Conference on Data Engineering, 2009 (ICDE’09). IEEE, New York, pp 1283–1286
683.
Yakout M, Elmagarmid AK, Elmeleegy H, Ouzzani M, Qi A (2010) Behavior based record linkage. Proceedings of the VLDB Endowment 3(1–2):439–448
685.
Yan S, Lee D, Kan MY, Giles LC (2007) Adaptive sorted neighborhood methods for efficient record linkage. In: Proceedings of the 7th ACM/IEEE-CS Joint Conference on Digital Libraries. ACM, New York, pp 185–194
695.
Zhao H, Ram S (2005) Entity identification for heterogeneous database integration: a multiple classifier system approach and empirical evaluation. Information Systems 30(2):119–132
696.
Zingmond DS, Ye Z, Ettner SL, Liu H (2004) Linking hospital discharge and death records: accuracy and sources of bias. Journal of Clinical Epidemiology 57(1):21–29
Metadata
Title
Recent Advances in Object Identification
Authors
Carlo Batini
Monica Scannapieco
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-24106-7_9