Published in: Data Mining and Knowledge Discovery 2/2020

23-12-2019

Mining relaxed functional dependencies from data

Authors: Loredana Caruccio, Vincenzo Deufemia, Giuseppe Polese

Abstract

Relaxed functional dependencies (rfds) are properties expressing important relationships among data. By introducing approximations in data comparison and/or validity, they can capture constraints useful for several purposes, such as identifying data inconsistencies or patterns of semantically related data. Nevertheless, rfds provide benefits only if they can be discovered automatically from data. In this paper we present an rfd discovery algorithm that relies on a lattice-structured search space previously used for fd discovery, together with new pruning strategies and a new candidate rfd validation method. An experimental evaluation demonstrates the discovery performance of the proposed algorithm on real datasets and compares it with other algorithms.
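
To make the two kinds of relaxation mentioned in the abstract concrete, the following is a minimal sketch of how a single candidate rfd might be checked against a small table. It is an illustration only, not the validation method proposed in the paper. The function name `rfd_holds`, the use of absolute difference on numeric attributes as the comparison function, and the ratio of violating tuple pairs as the validity measure are all assumptions made for this example.

```python
from itertools import combinations


def rfd_holds(rows, lhs, rhs, max_violation_ratio=0.0):
    """Check a candidate relaxed dependency on a list of dict rows.

    lhs: dict mapping each left-hand-side attribute to a distance threshold.
    rhs: (attribute, threshold) pair for the right-hand side.
    max_violation_ratio: tolerated fraction of violating tuple pairs
        (hypothetical validity measure chosen for this sketch).
    """
    rhs_attr, rhs_thr = rhs
    total = 0
    violations = 0
    for a, b in combinations(rows, 2):
        total += 1
        # Relaxed comparison: two tuples "agree" on an attribute when their
        # values differ by at most the threshold (assumed numeric values).
        similar_lhs = all(abs(a[attr] - b[attr]) <= thr for attr, thr in lhs.items())
        if similar_lhs and abs(a[rhs_attr] - b[rhs_attr]) > rhs_thr:
            violations += 1
    # Relaxed validity: the dependency may fail on a bounded share of pairs.
    return total == 0 or violations / total <= max_violation_ratio


# Invented toy data, not taken from the paper's experiments.
rows = [
    {"height": 170, "weight": 65},
    {"height": 171, "weight": 66},
    {"height": 180, "weight": 80},
    {"height": 181, "weight": 95},
]

strict = rfd_holds(rows, {"height": 1}, ("weight", 2))
tolerant = rfd_holds(rows, {"height": 1}, ("weight", 2), max_violation_ratio=0.2)
```

In this toy table, one of the six tuple pairs has similar heights but weights that differ by more than 2, so the dependency fails when no violations are tolerated and holds once a violation ratio of 0.2 is allowed. Setting every threshold and the violation ratio to zero reduces the check to an ordinary functional dependency test.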

Metadata
Title: Mining relaxed functional dependencies from data
Authors: Loredana Caruccio, Vincenzo Deufemia, Giuseppe Polese
Publication date: 23-12-2019
Publisher: Springer US
Published in: Data Mining and Knowledge Discovery, Issue 2/2020
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI: https://doi.org/10.1007/s10618-019-00667-7
