Skip to main content
Top
Published in: Theory of Computing Systems 4/2018

08-05-2017

Reasoning about integrity constraints for tree-structured data

Authors: Wojciech Czerwiński, Claire David, Filip Murlak, Paweł Parys

Published in: Theory of Computing Systems | Issue 4/2018

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

We study a class of integrity constraints for tree-structured data modelled as data trees, whose nodes have a label from a finite alphabet and store a data value from an infinite data domain. The constraints require each tuple of nodes selected by a conjunctive query (using navigational axes and labels) to satisfy a positive combination of equalities and a positive combination of inequalities over the stored data values. Such constraints are instances of the general framework of XML-to-relational constraints proposed recently by Niewerth and Schwentick. They cover some common classes of constraints, including W3C XML Schema key and unique constraints, as well as domain restrictions and denial constraints, but cannot express inclusion constraints, such as reference keys. Our main result is that consistency of such integrity constraints with respect to a given schema (modelled as a tree automaton) is decidable. An easy extension gives decidability for the entailment problem. Equivalently, we show that validity and containment of unions of conjunctive queries using navigational axes, labels, data equalities and inequalities is decidable, as long as none of the conjunctive queries uses both equalities and inequalities; without this restriction, both problems are known to be undecidable. In the context of XML data exchange, our result can be used to establish decidability for a consistency problem for XML schema mappings. All the decision procedures are doubly exponential, with matching lower bounds. The complexity may be lowered to singly exponential, when conjunctive queries are replaced by tree patterns, and the number of data comparisons is bounded.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Footnotes
1
Provided by Michał Pilipczuk, during the Warsaw Automata Group’s research camp Autob ó z 2015.
 
Literature
1.
go back to reference Arenas, M., Barceló, P., Libkin, L., Murlak, F.: Foundations of data exchange. Cambridge University Press (2014) Arenas, M., Barceló, P., Libkin, L., Murlak, F.: Foundations of data exchange. Cambridge University Press (2014)
2.
go back to reference Arenas, M., Fan, W., Libkin, L.: On the complexity of verifying consistency of XML specifications. SIAM J. Comput. 38(3), 841–880 (2008)MathSciNetCrossRefMATH Arenas, M., Fan, W., Libkin, L.: On the complexity of verifying consistency of XML specifications. SIAM J. Comput. 38(3), 841–880 (2008)MathSciNetCrossRefMATH
3.
go back to reference Arenas, M., Libkin, L.: A normal form for XML documents. ACM Trans. Database Syst. 29, 195–232 (2004)CrossRef Arenas, M., Libkin, L.: A normal form for XML documents. ACM Trans. Database Syst. 29, 195–232 (2004)CrossRef
4.
go back to reference Arenas, M., Libkin, L.: XML data exchange: Consistency and query answering. J. ACM, 55(2) (2008) Arenas, M., Libkin, L.: XML data exchange: Consistency and query answering. J. ACM, 55(2) (2008)
5.
go back to reference Benedikt, M., Fan, W., Geerts, F.: XPath satisfiability in the presence of DTDs. J. ACM, 55(2) (2008) Benedikt, M., Fan, W., Geerts, F.: XPath satisfiability in the presence of DTDs. J. ACM, 55(2) (2008)
6.
go back to reference Björklund, H., Martens, W., Schwentick, T.: Conjunctive query containment over trees using schema information. Acta Informatica, 1–40 (2016) Björklund, H., Martens, W., Schwentick, T.: Conjunctive query containment over trees using schema information. Acta Informatica, 1–40 (2016)
7.
go back to reference Bojańczyk, M., Murlak, F., Witkowski, A.: Containment of monadic datalog programs via bounded clique-width. In: Proceedings of the ICALP 2015, pp. 427–439 (2015) Bojańczyk, M., Murlak, F., Witkowski, A.: Containment of monadic datalog programs via bounded clique-width. In: Proceedings of the ICALP 2015, pp. 427–439 (2015)
8.
go back to reference Bojańczyk, M., Muscholl, A., Schwentick, T., Segoufin, L.: Two-variable logic on data trees and XML reasoning. J. ACM 56(3), 13:1–13:48 (2009)MathSciNetMATH Bojańczyk, M., Muscholl, A., Schwentick, T., Segoufin, L.: Two-variable logic on data trees and XML reasoning. J. ACM 56(3), 13:1–13:48 (2009)MathSciNetMATH
9.
go back to reference Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in relational data bases. In: Proceedings of the STOC 1977, pp. 77–90 (1977) Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in relational data bases. In: Proceedings of the STOC 1977, pp. 77–90 (1977)
10.
go back to reference Courcelle, B., Makowsky, J.A., Rotics, U.: Linear time solvable optimization problems on graphs of bounded clique-width. Theory Comput. Syst. 33(2), 125–150 (2000)MathSciNetCrossRefMATH Courcelle, B., Makowsky, J.A., Rotics, U.: Linear time solvable optimization problems on graphs of bounded clique-width. Theory Comput. Syst. 33(2), 125–150 (2000)MathSciNetCrossRefMATH
12.
go back to reference David, C., Gheerbrant, A., Libkin, L., Martens, W.: Containment of pattern-based queries over data trees. In: Proceedings of the ICDT 2013, pp. 201–212 (2013) David, C., Gheerbrant, A., Libkin, L., Martens, W.: Containment of pattern-based queries over data trees. In: Proceedings of the ICDT 2013, pp. 201–212 (2013)
13.
go back to reference David, C., Hofman, P., Murlak, F., Pilipczuk, M.: Synthesizing transformations from XML schema mappings. In: Proceedings of the ICDT 2014, pp. 61–71 (2014) David, C., Hofman, P., Murlak, F., Pilipczuk, M.: Synthesizing transformations from XML schema mappings. In: Proceedings of the ICDT 2014, pp. 61–71 (2014)
14.
go back to reference Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: semantics and query answering. Theor. Comput. Sci. 336(1), 89–124 (2005)MathSciNetCrossRefMATH Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: semantics and query answering. Theor. Comput. Sci. 336(1), 89–124 (2005)MathSciNetCrossRefMATH
15.
go back to reference Fagin, R., Kolaitis, P.G., Popa, L., Tan, W.C.: Composing schema mappings: Second-order dependencies to the rescue. ACM Trans. Database Syst. 30(4), 994–1055 (2005)CrossRef Fagin, R., Kolaitis, P.G., Popa, L., Tan, W.C.: Composing schema mappings: Second-order dependencies to the rescue. ACM Trans. Database Syst. 30(4), 994–1055 (2005)CrossRef
16.
go back to reference Fagin, R., Vardi, M.Y.: The theory of data dependencies - a survey. In: Mathematics of information processing, volume 34 of proceedings of symposia in applied mathematics, pp. 19–71. American Mathematical Society, Providence, Rhode Island (1986) Fagin, R., Vardi, M.Y.: The theory of data dependencies - a survey. In: Mathematics of information processing, volume 34 of proceedings of symposia in applied mathematics, pp. 19–71. American Mathematical Society, Providence, Rhode Island (1986)
17.
go back to reference Figueira, D.: Alternating register automata on finite words and trees. Logical Methods Comput. Sci. 8(1), 2012 Figueira, D.: Alternating register automata on finite words and trees. Logical Methods Comput. Sci. 8(1), 2012
18.
go back to reference Gao, S., Sperberg-McQueen, C.M., Thompson, H.S., Mendelsohn, N., Beech, D., Maloney, M.: W3C XML Schema Definition Language (XSD) 1.1, Part 1: Structures. Technical report, World Wide Web Consortium, April (2009) Gao, S., Sperberg-McQueen, C.M., Thompson, H.S., Mendelsohn, N., Beech, D., Maloney, M.: W3C XML Schema Definition Language (XSD) 1.1, Part 1: Structures. Technical report, World Wide Web Consortium, April (2009)
19.
go back to reference Gogacz, T., Marcinkowski, J.: All-instances termination of chase is undecidable. In: Proceedings of the ICALP 2014, pp. 293–304 (2014) Gogacz, T., Marcinkowski, J.: All-instances termination of chase is undecidable. In: Proceedings of the ICALP 2014, pp. 293–304 (2014)
20.
go back to reference Gogacz, T., Marcinkowski, J.: Red spider meets a rainworm: query finite determinacy is undecidable. In: Proceedings of the PODS 2016, pp. 121–134 (2016) Gogacz, T., Marcinkowski, J.: Red spider meets a rainworm: query finite determinacy is undecidable. In: Proceedings of the PODS 2016, pp. 121–134 (2016)
21.
go back to reference Hartmann, S., Link, S.: More functional dependencies for XML. In: Proceedings of the ADBIS 2003, pp. 355–369 (2003) Hartmann, S., Link, S.: More functional dependencies for XML. In: Proceedings of the ADBIS 2003, pp. 355–369 (2003)
22.
go back to reference Hartmann, S., Link, S., Trinh, T.: Solving the implication problem for XML, functional dependencies with properties. In: Proceedings of the WoLLIC 2010, pp. 161–175 (2010) Hartmann, S., Link, S., Trinh, T.: Solving the implication problem for XML, functional dependencies with properties. In: Proceedings of the WoLLIC 2010, pp. 161–175 (2010)
23.
go back to reference Jurdzinski, M., Lazic, R.: Alternating automata on data trees and XPath satisfiability. ACM Trans. Comput. Logic 12(3), 19:1–19:21 (2011)MathSciNetCrossRefMATH Jurdzinski, M., Lazic, R.: Alternating automata on data trees and XPath satisfiability. ACM Trans. Comput. Logic 12(3), 19:1–19:21 (2011)MathSciNetCrossRefMATH
24.
go back to reference Lenzerini, M.: Data integration: A theoretical perspective. In: Proceedings of the PODS 2002, pp. 233–246 (2002) Lenzerini, M.: Data integration: A theoretical perspective. In: Proceedings of the PODS 2002, pp. 233–246 (2002)
26.
go back to reference Neven, F., Schwentick, T.: On the complexity of XPath containment in the presence of disjunction, DTDs, and variables. Log. Meth. Comput. Sci., 2(3) (2006) Neven, F., Schwentick, T.: On the complexity of XPath containment in the presence of disjunction, DTDs, and variables. Log. Meth. Comput. Sci., 2(3) (2006)
27.
go back to reference Niewerth, M., Schwentick, T.: Reasoning XML constraints based on XML-to-relational mappings. In: Proceedings of the ICDT 2014, pp. 72–83 (2014) Niewerth, M., Schwentick, T.: Reasoning XML constraints based on XML-to-relational mappings. In: Proceedings of the ICDT 2014, pp. 72–83 (2014)
28.
go back to reference Vardi, M.Y.: Fundamentals of dependency theory. In: Borger, E. (ed.) Trends in theoretical computer science, pp. 171–224. Computer Science Press (1987) Vardi, M.Y.: Fundamentals of dependency theory. In: Borger, E. (ed.) Trends in theoretical computer science, pp. 171–224. Computer Science Press (1987)
Metadata
Title
Reasoning about integrity constraints for tree-structured data
Authors
Wojciech Czerwiński
Claire David
Filip Murlak
Paweł Parys
Publication date
08-05-2017
Publisher
Springer US
Published in
Theory of Computing Systems / Issue 4/2018
Print ISSN: 1432-4350
Electronic ISSN: 1433-0490
DOI
https://doi.org/10.1007/s00224-017-9771-z

Other articles of this Issue 4/2018

Theory of Computing Systems 4/2018 Go to the issue

Premium Partner