ABSTRACT
This paper provides new worst-case bounds for the size and treewith of the result Q(D) of a conjunctive query Q to a database D. We derive bounds for the result size |Q(D)| in terms of structural properties of Q, both in the absence and in the presence of keys and functional dependencies. These bounds are based on a novel "coloring" of the query variables that associates a coloring number C(Q) to each query Q. Using this coloring number, we derive tight bounds for the size of Q(D) in case (i) no functional dependencies or keys are specified, and (ii) simple (one-attribute) keys are given. These results generalize recent size-bounds for join queries obtained by Atserias, Grohe, and Marx (FOCS 2008). An extension of our coloring technique also gives a lower bound for |Q(D)| in the general setting of a query with arbitrary functional dependencies. Our new coloring scheme also allows us to precisely characterize (both in the absence of keys and with simple keys) the treewidth-preserving queries--the queries for which the output treewidth is bounded by a function of the input treewidth. Finally we characterize the queries that preserve the sparsity of the input in the general setting with arbitrary functional dependencies.
- S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995. Google ScholarDigital Library
- A.V. Aho, C. Beeri, and J.D. Ullman. The theory of joins in relational databases. ACM Trans. Database Syst., 4(3):297--314, 1979. Google ScholarDigital Library
- A.V. Aho, Y. Sagiv, and J.D. Ullman. Equivalence of relational expressions. SIAM J. of Computing, 8(2):218--246, May 1979.Google ScholarDigital Library
- S. Arnborg, J. Lagergren, and D. Seese. Easy problems for tree-decomposable graphs. J. Algorithms, 12(2):308--340, 1991. Google ScholarDigital Library
- A. Atserias, M. Grohe, and D. Marx. Size bounds and query plans for relational joins. In IEEE FOCS'08. Google ScholarDigital Library
- C. Beeri and M.Y. Vardi. A proof procedure for data dependencies. J. ACM, 31(4):718--741, 1984. Google ScholarDigital Library
- A.K. Chandra and P.M. Merlin. Optimal implementation of conjunctive queries in relational data bases. In ACM STOC, 1977. Google ScholarDigital Library
- S. Chaudhuri. An overview of query optimization in relational systems. In PODS 1998. Google ScholarDigital Library
- B. Courcelle. The monadic second-order logic of graphs. i. recognizable sets of finite graphs. Inf. Comput., 85(1):12--75, 1990. Google ScholarDigital Library
- A. Deutsch, L. Popa, and V. Tannen. Query reformulation with constraints. SIGMOD Rec., 35(1):65--73, 2006. Google ScholarDigital Library
- R. Fagin, P.G. Kolaitis, R.J. Miller, and L. Popa. Data exchange: Semantics and query answering. In ICDT, 2003. Google ScholarDigital Library
- G. Gottlob and S.T. Lee. A logical approach to multicut problems. Inf. Proc. Letters, 103(4):136--141, 2007. Google ScholarDigital Library
- G. Gottlob, S.T. Lee, and G.J. Valiant. Size and treewidth bounds for conjunctive queries (extended version). Available from the authors upon request.Google Scholar
- G. Gottlob, R. Pichler, and F. Wei. Efficient datalog abduction through bounded treewidth. In AAAI, pages 1626--1631, 2007. Google ScholarDigital Library
- M. Grohe and D. Marx. Constraint solving via fractional edge covers. In SODA 2006. Google ScholarDigital Library
- P.J. Haas, J.F. Naughton, S. Seshadri, and A.N. Swami. Selectivity and cost estimation for joins based on random sampling. J. Comput. Syst. Sci., 52(3):550--569, 1996. Google ScholarDigital Library
- M. Jarke and J. Koch. Query optimization in database systems. ACM Comput. Surv., 16(2):111--152, 1984. Google ScholarDigital Library
- P. Kolaitis. Schema mappings, data exchange, and metadata management. In PODS, 2005. Google ScholarDigital Library
- M. Lenzerini. Data integration: a theoretical perspective. In PODS, 2002. Google ScholarDigital Library
- A.Y. Levy, A.O. Mendelzon, and Y. Sagiv. Answering queries using views. In PODS 1995. Google ScholarDigital Library
- D. Maier, A.O. Mendelzon, and Y. Sagiv. Testing implications of data dependencies. ACM Trans. Database Syst., 4(4):455--469, 1979. Google ScholarDigital Library
- F. Olken and D. Rotem. Random sampling from database files: A survey. In Proc. of Stat. and Scientific Database Management, 1990. Google ScholarDigital Library
- N. Robertson and P.D. Seymour. Graph minors II: Algorithmic Aspects of Tree-Width. Journal of Algorithms, 7:309--322, 1986.Google ScholarCross Ref
- A.N. Swami and K.B. Schiefer. On the estimation of join result sizes. In Advances in Database Technology -- EDBT'94. 4th Int. Conf. on Extending Database Technology, 1994. Google ScholarDigital Library
- M. Thorup. Structured programs have small tree-width and good register allocation. Information and Computation, 142:318--332, 1998. Google ScholarDigital Library
Index Terms
- Size and treewidth bounds for conjunctive queries
Recommendations
Size and Treewidth Bounds for Conjunctive Queries
This article provides new worst-case bounds for the size and treewith of the result Q(D) of a conjunctive query Q applied to a database D. We derive bounds for the result size |Q(D)| in terms of structural properties of Q, both in the absence and in the ...
Efficient approximations of conjunctive queries
PODS '12: Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database SystemsWhen finding exact answers to a query over a large database is infeasible, it is natural to approximate the query by a more efficient one that comes from a class with good bounds on the complexity of query evaluation. In this paper we study such ...
Size Bounds for Factorised Representations of Query Results
We study two succinct representation systems for relational data based on relational algebra expressions with unions, Cartesian products, and singleton relations: f-representations, which employ algebraic factorisation using distributivity of product ...
Comments