Skip to main content
Erschienen in: The VLDB Journal 6/2018

02.07.2018 | Regular Paper

Probabilistic Cardinality Constraints

Validation, Reasoning, and Semantic Summaries

verfasst von: Tania Roblot, Miika Hannula, Sebastian Link

Erschienen in: The VLDB Journal | Ausgabe 6/2018

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Probabilistic databases address the requirements of applications that produce large collections of uncertain data. They should provide declarative means to control the integrity of data. Cardinality constraints, in particular, control the occurrences of data patterns by declaring in how many records a combination of data values can occur. We propose cardinality constraints on probabilistic data, which stipulate lower bounds on the marginal probability by which a cardinality constraint holds. We investigate limits and opportunities for automating their use in integrity control. This includes hardness results for their validation, axiomatic and efficient algorithmic characterisations of their implication problem, and an algorithm that computes succinct semantic summaries for any collection of these constraints. Experiments complement our theoretical analysis on the time and space complexity of computing semantic summaries, suggesting that their computation provides the basis to acquire meaningful constraints. We also establish evidence that probabilistic functional and inclusion dependencies cannot be managed as simply as probabilistic cardinality constraints.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
Available for download at www.​cs.​auckland.​ac.​nz/​~tkr/​fortuna.​html where additional examples are presented as well.
 
Literatur
1.
Zurück zum Zitat Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley, Boston (1995)MATH Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley, Boston (1995)MATH
2.
Zurück zum Zitat Armstrong, W.W.: Dependency structures of data base relationships. In: IFIP Congress, pp. 580–583 (1974) Armstrong, W.W.: Dependency structures of data base relationships. In: IFIP Congress, pp. 580–583 (1974)
3.
Zurück zum Zitat Arora, S., Barak, B.: Computational Complexity: A Modern Approach, 1st edn. Cambridge University Press, New York (2009)CrossRef Arora, S., Barak, B.: Computational Complexity: A Modern Approach, 1st edn. Cambridge University Press, New York (2009)CrossRef
4.
Zurück zum Zitat Baader, F., Buchheit, M., Hollunder, B.: Cardinality restrictions on concepts. Artif. Intell. 88(1–2), 195–213 (1996)CrossRef Baader, F., Buchheit, M., Hollunder, B.: Cardinality restrictions on concepts. Artif. Intell. 88(1–2), 195–213 (1996)CrossRef
5.
Zurück zum Zitat Balcazar, J.L., Diaz, J., Gabarro, J.: Structural Complexity I. Springer, Berlin (2012)MATH Balcazar, J.L., Diaz, J., Gabarro, J.: Structural Complexity I. Springer, Berlin (2012)MATH
6.
Zurück zum Zitat Beeri, C., Dowd, M., Fagin, R., Statman, R.: On the structure of Armstrong relations for functional dependencies. J. ACM 31(1), 30–46 (1984)MathSciNetCrossRef Beeri, C., Dowd, M., Fagin, R., Statman, R.: On the structure of Armstrong relations for functional dependencies. J. ACM 31(1), 30–46 (1984)MathSciNetCrossRef
7.
Zurück zum Zitat Berardi, D., Calvanese, D., De Giacomo, G.: Reasoning on UML class diagrams. Artif. Intell. 168(1–2), 70–118 (2005)MathSciNetCrossRef Berardi, D., Calvanese, D., De Giacomo, G.: Reasoning on UML class diagrams. Artif. Intell. 168(1–2), 70–118 (2005)MathSciNetCrossRef
8.
Zurück zum Zitat Biskup, J., Menzel, R., Polle, T., Sagiv, Y.: Decomposition of relationships through pivoting. In: ER, pp. 28–41 (1996) Biskup, J., Menzel, R., Polle, T., Sagiv, Y.: Decomposition of relationships through pivoting. In: ER, pp. 28–41 (1996)
9.
Zurück zum Zitat Brown, P., Ganesan, J., Köhler, H., Link, S.: Keys with probabilistic intervals. In: ER, pp. 164–179 (2016) Brown, P., Ganesan, J., Köhler, H., Link, S.: Keys with probabilistic intervals. In: ER, pp. 164–179 (2016)
10.
Zurück zum Zitat Brown, P., Link, S.: Probabilistic keys for data quality management. In: CAiSE, pp. 118–132 (2015) Brown, P., Link, S.: Probabilistic keys for data quality management. In: CAiSE, pp. 118–132 (2015)
11.
Zurück zum Zitat Brown, P., Link, S.: Probabilistic keys. IEEE Trans. Knowl. Data Eng. 29(3), 670–682 (2017)CrossRef Brown, P., Link, S.: Probabilistic keys. IEEE Trans. Knowl. Data Eng. 29(3), 670–682 (2017)CrossRef
12.
Zurück zum Zitat Bruno, N., Chaudhuri, S., Thomas, D.: Generating queries with cardinality constraints for DBMS testing. IEEE Trans. Knowl. Data Eng. 18(12), 1721–1725 (2006)CrossRef Bruno, N., Chaudhuri, S., Thomas, D.: Generating queries with cardinality constraints for DBMS testing. IEEE Trans. Knowl. Data Eng. 18(12), 1721–1725 (2006)CrossRef
13.
Zurück zum Zitat Calvanese, D., Lenzerini, M.: On the interaction between ISA and cardinality constraints. In: ICDE, pp. 204–213 (1994) Calvanese, D., Lenzerini, M.: On the interaction between ISA and cardinality constraints. In: ICDE, pp. 204–213 (1994)
14.
Zurück zum Zitat Casanova, M.A., Fagin, R., Papadimitriou, C.H.: Inclusion dependencies and their interaction with functional dependencies. J. Comput. Syst. Sci. 28(1), 29–59 (1984)MathSciNetCrossRef Casanova, M.A., Fagin, R., Papadimitriou, C.H.: Inclusion dependencies and their interaction with functional dependencies. J. Comput. Syst. Sci. 28(1), 29–59 (1984)MathSciNetCrossRef
15.
Zurück zum Zitat Chaudhuri, S.: Query optimizers: time to rethink the contract? In: SIGMOD, pp. 961–968 (2009) Chaudhuri, S.: Query optimizers: time to rethink the contract? In: SIGMOD, pp. 961–968 (2009)
16.
Zurück zum Zitat Chen, P.P.: The Entity-Relationship model—toward a unified view of data. ACM Trans. Database Syst. 1(1), 9–36 (1976)CrossRef Chen, P.P.: The Entity-Relationship model—toward a unified view of data. ACM Trans. Database Syst. 1(1), 9–36 (1976)CrossRef
17.
Zurück zum Zitat Chen, W., Fan, W., Ma, S.: Incorporating cardinality constraints and synonym rules into conditional functional dependencies. Inf. Process. Lett. 109(14), 783–789 (2009)MathSciNetCrossRef Chen, W., Fan, W., Ma, S.: Incorporating cardinality constraints and synonym rules into conditional functional dependencies. Inf. Process. Lett. 109(14), 783–789 (2009)MathSciNetCrossRef
18.
Zurück zum Zitat Cormode, G., Srivastava, D., Shen, E., Yu, T.: Aggregate query answering on possibilistic data with cardinality constraints. In: ICDE, pp. 258–269 (2012) Cormode, G., Srivastava, D., Shen, E., Yu, T.: Aggregate query answering on possibilistic data with cardinality constraints. In: ICDE, pp. 258–269 (2012)
20.
Zurück zum Zitat Fuxman, A., Miller, R.J.: First-order query rewriting for inconsistent databases. J. Comput. Syst. Sci. 73(4), 610–635 (2007)MathSciNetCrossRef Fuxman, A., Miller, R.J.: First-order query rewriting for inconsistent databases. J. Comput. Syst. Sci. 73(4), 610–635 (2007)MathSciNetCrossRef
21.
Zurück zum Zitat Hall, N., Köhler, H., Link, S., Prade, H., Zhou, X.: Cardinality constraints on qualitatively uncertain data. Data Knowl. Eng. 99, 126–150 (2015)CrossRef Hall, N., Köhler, H., Link, S., Prade, H., Zhou, X.: Cardinality constraints on qualitatively uncertain data. Data Knowl. Eng. 99, 126–150 (2015)CrossRef
22.
Zurück zum Zitat Hartmann, S.: On the consistency of Int-cardinality constraints. In: ER, pp. 150–163 (1998)CrossRef Hartmann, S.: On the consistency of Int-cardinality constraints. In: ER, pp. 150–163 (1998)CrossRef
23.
Zurück zum Zitat Hartmann, S.: On the implication problem for cardinality constraints and functional dependencies. Ann. Math. Artif. Intell. 33(2–4), 253–307 (2001)MathSciNetCrossRef Hartmann, S.: On the implication problem for cardinality constraints and functional dependencies. Ann. Math. Artif. Intell. 33(2–4), 253–307 (2001)MathSciNetCrossRef
24.
Zurück zum Zitat Hartmann, S.: Reasoning about participation constraints and chen’s constraints. In: ADC, pp. 105–113 (2003) Hartmann, S.: Reasoning about participation constraints and chen’s constraints. In: ADC, pp. 105–113 (2003)
25.
Zurück zum Zitat Hartmann, S., Kirchberg, M., Link, S.: Design by example for SQL table definitions with functional dependencies. VLDB J. 21(1), 121–144 (2012)CrossRef Hartmann, S., Kirchberg, M., Link, S.: Design by example for SQL table definitions with functional dependencies. VLDB J. 21(1), 121–144 (2012)CrossRef
26.
Zurück zum Zitat Hartmann, S., Link, S.: Efficient reasoning about a robust XML key fragment. ACM Trans. Database Syst. 34(2), 10 (2009)CrossRef Hartmann, S., Link, S.: Efficient reasoning about a robust XML key fragment. ACM Trans. Database Syst. 34(2), 10 (2009)CrossRef
28.
Zurück zum Zitat Ilyas, I.F., Markl, V., Haas, P.J., Brown, P., Aboulnaga, A.: CORDS: automatic discovery of correlations and soft functional dependencies. In: SIGMOD, pp. 647–658 (2004) Ilyas, I.F., Markl, V., Haas, P.J., Brown, P., Aboulnaga, A.: CORDS: automatic discovery of correlations and soft functional dependencies. In: SIGMOD, pp. 647–658 (2004)
29.
Zurück zum Zitat Jones, T.H., Song, I.-Y.: Analysis of binary/ternary cardinality combinations in Entity-Relationship modeling. Data Knowl. Eng. 19(1), 39–64 (1996)CrossRef Jones, T.H., Song, I.-Y.: Analysis of binary/ternary cardinality combinations in Entity-Relationship modeling. Data Knowl. Eng. 19(1), 39–64 (1996)CrossRef
30.
Zurück zum Zitat Köhler, H., Link, S.: Inclusion dependencies and their interaction with functional dependencies in SQL. J. Comput. Syst. Sci. 85, 104–131 (2017)MathSciNetCrossRef Köhler, H., Link, S.: Inclusion dependencies and their interaction with functional dependencies in SQL. J. Comput. Syst. Sci. 85, 104–131 (2017)MathSciNetCrossRef
31.
Zurück zum Zitat Langeveldt, W.-D., Link, S.: Empirical evidence for the usefulness of Armstrong relations in the acquisition of meaningful FDs. Inf. Syst. 35(3), 352–374 (2010)CrossRef Langeveldt, W.-D., Link, S.: Empirical evidence for the usefulness of Armstrong relations in the acquisition of meaningful FDs. Inf. Syst. 35(3), 352–374 (2010)CrossRef
32.
Zurück zum Zitat Lenzerini, M., Nobili, P.: On the satisfiability of dependency constraints in entity-relationship schemata. In: VLDB, pp. 147–154 (1987) Lenzerini, M., Nobili, P.: On the satisfiability of dependency constraints in entity-relationship schemata. In: VLDB, pp. 147–154 (1987)
33.
Zurück zum Zitat Letchner, J., Ré, C., Balazinska, M., Philipose, M.: Access methods for Markovian streams. In: ICDE, pp. 246–257 (2009) Letchner, J., Ré, C., Balazinska, M., Philipose, M.: Access methods for Markovian streams. In: ICDE, pp. 246–257 (2009)
34.
Zurück zum Zitat Liddle, S.W., Embley, D.W., Woodfield, S.N.: Cardinality constraints in semantic data models. Data Knowl. Eng. 11(3), 235–270 (1993)CrossRef Liddle, S.W., Embley, D.W., Woodfield, S.N.: Cardinality constraints in semantic data models. Data Knowl. Eng. 11(3), 235–270 (1993)CrossRef
35.
Zurück zum Zitat Liu, J., Li, J., Liu, C., Chen, Y.: Discover dependencies from data—a review. IEEE TKDE 24(2), 251–264 (2012) Liu, J., Li, J., Liu, C., Chen, Y.: Discover dependencies from data—a review. IEEE TKDE 24(2), 251–264 (2012)
36.
Zurück zum Zitat Mannila, H., Räihä, K.-J.: Design by example: an application of armstrong relations. J. Comput. Syst. Sci. 33(2), 126–141 (1986)MathSciNetCrossRef Mannila, H., Räihä, K.-J.: Design by example: an application of armstrong relations. J. Comput. Syst. Sci. 33(2), 126–141 (1986)MathSciNetCrossRef
37.
Zurück zum Zitat Marchi, F.D., Petit, J.: Semantic sampling of existing databases through informative Armstrong databases. Inf. Syst. 32(3), 446–457 (2007)CrossRef Marchi, F.D., Petit, J.: Semantic sampling of existing databases through informative Armstrong databases. Inf. Syst. 32(3), 446–457 (2007)CrossRef
38.
Zurück zum Zitat Olivé, A.: Conceptual modeling of information systems. Springer, Berlin (2007)MATH Olivé, A.: Conceptual modeling of information systems. Springer, Berlin (2007)MATH
39.
Zurück zum Zitat Orr, L., Suciu, D., Balazinska, M.: Probabilistic database summarization for interactive data exploration. PVLDB 10(10), 1154–1165 (2017) Orr, L., Suciu, D., Balazinska, M.: Probabilistic database summarization for interactive data exploration. PVLDB 10(10), 1154–1165 (2017)
40.
Zurück zum Zitat Roblot, T., Link, S.: Probabilistic cardinality constraints. In: ER, pp. 214–228 (2015) Roblot, T., Link, S.: Probabilistic cardinality constraints. In: ER, pp. 214–228 (2015)
41.
Zurück zum Zitat Roblot, T.K., Link, S.: Possibilistic cardinality constraints and functional dependencies. In ER, pp. 133–148 (2016) Roblot, T.K., Link, S.: Possibilistic cardinality constraints and functional dependencies. In ER, pp. 133–148 (2016)
42.
Zurück zum Zitat Roblot, T.K., Link, S.: URD: A data summarization tool for the acquisition of meaningful cardinality constraints with probabilistic intervals. In: ICDE, pp. 1379–1380 (2017) Roblot, T.K., Link, S.: URD: A data summarization tool for the acquisition of meaningful cardinality constraints with probabilistic intervals. In: ICDE, pp. 1379–1380 (2017)
43.
Zurück zum Zitat Shen, W., Li, X., Doan, A.: Constraint-based entity matching. In: AAAI, pp. 862–867 (2005) Shen, W., Li, X., Doan, A.: Constraint-based entity matching. In: AAAI, pp. 862–867 (2005)
44.
Zurück zum Zitat Siau, K., Wand, Y., Benbasat, I.: The relative importance of structural constraints and surface semantics in information modeling. Inf. Syst. 22(2/3), 155–170 (1997)CrossRef Siau, K., Wand, Y., Benbasat, I.: The relative importance of structural constraints and surface semantics in information modeling. Inf. Syst. 22(2/3), 155–170 (1997)CrossRef
45.
Zurück zum Zitat Singla, P., Domingos, P.M.: Entity resolution with markov logic. In: ICDM, pp. 572–582 (2006) Singla, P., Domingos, P.M.: Entity resolution with markov logic. In: ICDM, pp. 572–582 (2006)
46.
Zurück zum Zitat Suciu, D., Olteanu, D., Ré, C., Koch, C.: Probabilistic Databases. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, San Rafael (2011)MATH Suciu, D., Olteanu, D., Ré, C., Koch, C.: Probabilistic Databases. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, San Rafael (2011)MATH
47.
Zurück zum Zitat Thalheim, B.: Fundamentals of cardinality constraints. In: ER, pp. 7–23 (1992)CrossRef Thalheim, B.: Fundamentals of cardinality constraints. In: ER, pp. 7–23 (1992)CrossRef
48.
49.
Zurück zum Zitat Thalheim, B.: Integrity constraints in (conceptual) database models. In: The Evolution of Conceptual Modeling, pp. 42–67 (2008) Thalheim, B.: Integrity constraints in (conceptual) database models. In: The Evolution of Conceptual Modeling, pp. 42–67 (2008)
Metadaten
Titel
Probabilistic Cardinality Constraints
Validation, Reasoning, and Semantic Summaries
verfasst von
Tania Roblot
Miika Hannula
Sebastian Link
Publikationsdatum
02.07.2018
Verlag
Springer Berlin Heidelberg
Erschienen in
The VLDB Journal / Ausgabe 6/2018
Print ISSN: 1066-8888
Elektronische ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-018-0511-z

Weitere Artikel der Ausgabe 6/2018

The VLDB Journal 6/2018 Zur Ausgabe