
2020 | OriginalPaper | Chapter

Functional Dependency Discovery on Distributed Database: Sampling Verification Framework

Authors: Chenxin Gu, Jie Cao

Published in: Data Science

Publisher: Springer Singapore


Abstract

In relational databases, functional dependency discovery is an important database analysis technique with a wide range of applications in knowledge discovery, database semantic analysis, data quality assessment, and database design. Existing functional dependency discovery algorithms are designed mainly for centralized data and are usually applicable only when the data size is small. As databases grow rapidly in scale, functional dependency discovery in distributed environments has increasing practical significance. A functional dependency discovery algorithm for big data in a distributed environment is proposed. The basic idea is to first perform functional dependency discovery on a sampled data set and then globally verify the candidate dependencies that may hold on the full data, so that all functional dependencies can be discovered. Parallel computing can be used to improve discovery efficiency while ensuring correctness.
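The sample-then-verify idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract gives no implementation details, so the function names (`holds`, `local_summary`, `verify_globally`, `discover_fds`), the Bernoulli row sampling, the naive enumeration of left-hand sides, and the representation of partitions as in-memory lists of tuples are all assumptions made for illustration. The sketch relies on one fact: a dependency that holds on the full data also holds on every subset of it, so the sample can only rule candidates out, and global verification decides the rest.

```python
import random
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True

def local_summary(rows, lhs, rhs):
    """Per-partition map: LHS value -> set of RHS values seen with it."""
    summary = {}
    for row in rows:
        summary.setdefault(tuple(row[i] for i in lhs), set()).add(row[rhs])
    return summary

def verify_globally(partitions, lhs, rhs):
    """Merge per-partition summaries; fail on any LHS value with two RHS values."""
    merged = {}
    for part in partitions:  # hypothetical: each summary could be built on its own node
        for key, values in local_summary(part, lhs, rhs).items():
            merged.setdefault(key, set()).update(values)
            if len(merged[key]) > 1:
                return False
    return True

def discover_fds(partitions, n_attrs, sample_rate=0.5, seed=0):
    """Return minimal FDs (lhs, rhs) with non-empty lhs, by attribute index."""
    rng = random.Random(seed)
    sample = [row for part in partitions for row in part
              if rng.random() < sample_rate]
    found = []
    for rhs in range(n_attrs):
        others = [a for a in range(n_attrs) if a != rhs]
        for size in range(1, n_attrs):
            for lhs in combinations(others, size):
                # Skip supersets of an already verified LHS (not minimal).
                if any(r == rhs and set(l) <= set(lhs) for l, r in found):
                    continue
                # Cheap filter on the sample, then the authoritative global check.
                if holds(sample, lhs, rhs) and verify_globally(partitions, lhs, rhs):
                    found.append((lhs, rhs))
    return found

# Two partitions of a three-attribute relation (toy data).
p1 = [(1, "a", "x"), (2, "b", "x")]
p2 = [(1, "a", "y"), (3, "b", "y")]
fds = discover_fds([p1, p2], n_attrs=3)
```

In the toy data, attribute 0 determines attribute 2 inside each partition taken alone but not across partitions, which is why a global verification step is needed in addition to any local or sampled check. The exhaustive enumeration is exponential in the number of attributes and serves only to keep the sketch short.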


Metadata
Title: Functional Dependency Discovery on Distributed Database: Sampling Verification Framework
Authors: Chenxin Gu, Jie Cao
Copyright Year: 2020
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-15-2810-1_43
