ABSTRACT
The problem of query containment is fundamental to many aspects of database systems, including query optimization, determining independence of queries from updates, and rewriting queries using views. In the data integration framework, however, the standard notion of query containment does not suffice. We define relative containment, which formalizes the notion of query containment relative to the sources available to the integration system. First we provide optimal bounds for relative containment for several important classes of datalog queries, including the common case of conjunctive queries. Next we provide bounds for the case when sources enforce access restrictions in the form of binding pattern constraints. Surprisingly, we show that relative containment for conjunctive queries is still decidable in this case, even though it is known that finding all answers to such queries may require a recursive datalog program over the sources. Finally, we provide tight bounds for variants of relative containment when the queries and source descriptions may contain comparison predicates.
- 1.S. Abiteboul and O. Duschka. Complexity of answering queries using materialized views. In Proe. of the A CM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), Seattle, WA, 1998.]] Google ScholarDigital Library
- 2.S. Adali, K. Candan, Y. Papakonstantinou, and V. Subrahmanian. Query caching and optimization in distributed mediator systems. In Proc. of A CM SIGMOD Conf. on Management of Data, Montreal, Canada, 1996.]] Google ScholarDigital Library
- 3.A. Aho, Y. Sagiv, and J. D. Ullman. Equivalence of relational expressions. SIAM Journal of Computing, (8)2:218- 246, 1979.]]Google ScholarDigital Library
- 4.J. L. Ambite, N. Ashish, G. Barish, C. A. Knoblock, S. Minton, P. J. Modi, I. Muslea, A. Philpot, and S. Tejada. ARIADNE: A system for constructing mediators for internet sources (system demonstration). In Proe. of A CM SIGMOD Conf. on Management of Data, Seattle, WA, 1998.]] Google ScholarDigital Library
- 5.C. Beeri, G. Elber, T. Milo, Y. Sagiv, O.Shmueli, N.Tishby, Y.Kogan, D.Konopnicki, P. Mogilevski, and N.Slonim. Websuite-a tool suite for harnessing web data. In Proceedings of the International Workshop on the Web and Databases, Valencia, Spain, 1998.]] Google ScholarDigital Library
- 6.C. Beeri, A. Y. Levy, and M.-C. Rousset. Rewriting queries using views in description logics. In Proc. of the A CM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), Tucson, Arizona., 1997.]] Google ScholarDigital Library
- 7.D. Calvanese, G. D. Giacomo, and M. Lenzerini. Answering queries using views in description logics. In Working notes of the KRDB Workshop, 1999.]]Google Scholar
- 8.T. Catarci and M. Lenzerini. Representing and using interschema knowledge in cooperative information systems. Journal of Intelligent and Cooperative Information Systems, 1993.]]Google ScholarCross Ref
- 9.A. Chandra and P. Merlin. Optimal implementation of conjunctive queries in relational databases. In Proceedings of the Ninth Annual A CM Symposium on Theory of Computing, pages 77-90, 1977.]] Google ScholarDigital Library
- 10.S. Chaudhuri, R. Krishnamurthy, S. Potamianos, and K. Shim. Optimizing queries with materialized views. In Proe. of Int. Conf. on Data Engineering (ICDE), Taipei, Taiwan, 1995.]] Google ScholarDigital Library
- 11.S. Chaudhuri and M. Vardi. On the equivalence of recursive and nonrecursive datalog programs. In Proc. of the A CM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 55-66, San Diego, CA., 1992.]] Google ScholarDigital Library
- 12.C. Chen and N. Roussopoulos. Implementation and performance evaluation of the ADMS query optimizer. In Proc. of the Conf. on Extending Database Technology (EDBT), March 1994.]] Google ScholarDigital Library
- 13.W. Cohen. Integration of heterogeneous databases without common domains using queries based on textual similarity. In Proc. of A CM SIGMOD Conf. on Management of Data, Seattle, WA, 1998.]] Google ScholarDigital Library
- 14.S. Dar, M. J. Franklin, B. Jonsson, D. Srivastava, and M. Tan. Semantic data caching and replacement. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), pages 330-341, 1996.]] Google ScholarDigital Library
- 15.O. Duschka, M. Genesereth, and A. Levy. Recursive query plans for data integration. Journal of Logic Programming, special issue on Logic Based Heterogeneous Information Systerns, 1999.]]Google Scholar
- 16.O. M. Duschka and M. R. Genesereth. Answering recursive queries using views. In Proc. of the A CM SIGACT- SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), Tucson, Arizona., 1997.]] Google ScholarDigital Library
- 17.O. M. Duschka and M. R. Genesereth. Query planning in infomaster. In Proceedings of the A CM Symposium on Applied Computing, San Jose, CA, 1997.]] Google ScholarDigital Library
- 18.D. Florescu, L. Raschid, and P. Valduriez. A methodology for query reformulation in cis using semantic knowledge. Int. Journal of Intelligent ~4 Cooperative Information Systems, special issue on Formal Methods in Cooperative Information Systems, 5(4), 1996.]]Google ScholarCross Ref
- 19.M. Friedman, A. Levy, and T. Millstein. Navigational plans for data integration. In Proceedings of the National Conference on Artificial Intelligence, 1999.]] Google ScholarDigital Library
- 20.M. Friedman and D. Weld. Efficient execution of information gathering plans. In Proceedings of the International Joint Conference on Artificial Intelligence, Nagoya, Japan, 1997.]]Google Scholar
- 21.M. T. Friedman. Optimization Issues in Data Integration. PhD thesis, University of Washington, 1999.]]Google Scholar
- 22.H. Garcia-Molina, Y. Papakonstantinou, D. Quass, A. Rajaraman, Y. Sagiv, J. Ullman, and J. Widom. The TSIMMIS project: Integration of heterogeneous information sources. Journal of Intelligent Information Systems, 8(2):117-132, March 1997.]] Google ScholarDigital Library
- 23.G. Grahne and A. O. Mendelzon. Tableau techniques for querying information sources through global schemas. In Proc. of the Int. Conf. on Database Theory (ICDT), 1999.]] Google ScholarDigital Library
- 24.A. Gupta, Y. Sagiv, J. D. Ullman, and J. Widom. Constraint checking with partial information. In Proc. of the A CM SIGA CT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 45-55, Minneapolis, Minnesota, 1994.]] Google ScholarDigital Library
- 25.L. Haas, D. Kossmann, E. Wimmers, and J. Yang. Optimizing queries across diverse data sources. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), Athens, Greece, 1997.]] Google ScholarDigital Library
- 26.Z. Ives, D. Florescu, M. Friedman, A. Levy, and D. Weld. An adaptive query execution engine for data integration. In Proc. of A CM SIGMOD Conf. on Management of Data, 1999.]] Google ScholarDigital Library
- 27.A. M. Keller and J. Basu. A predicate-based caching scheme for client-server database architectures. VLDB Journal, 5(1):35-47, 1996.]] Google ScholarDigital Library
- 28.A. Klug. On conjunctive queries containing inequalities. Journal of the ACM, pages 35(1): 146-160, 1988.]] Google ScholarDigital Library
- 29.C. T. Kwok and D. S. Weld. Planning to gather information. In Proceedings of the AAAI Thirteenth National Conference on Artificial Intelligence, 1996.]]Google Scholar
- 30.E. Lambrecht, S. Kambhampati, and S. Gnanaprakasam. Optimizing recursive information gathering plans. In Proceedings of the 16th International Joint Conference on Attificial Intelligence, pages 1204-1210, 1999.]] Google ScholarDigital Library
- 31.A. Y. Levy, A. O. Mendelzon, Y. Sagiv, and D. Srivastava. Answering queries using views. In Proc. of the A CM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), San Jose, CA, 1995.]] Google ScholarDigital Library
- 32.A. Y. Levy, A. Rajaraman, and J. J. Ordille. Querying heterogeneous information sources using source descriptions. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), Bombay, India, 1996.]] Google ScholarDigital Library
- 33.A. Y. Levy and Y. Sagiv. Queries independent of updates. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), pages 171-181, Dublin, Ireland, 1993.]] Google ScholarDigital Library
- 34.A. Rajaraman, Y. Sagiv, and J. D. Ullman. Answering queries using templates with binding patterns. In Proc. of the A CM SIGA CT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), San Jose, CA, 1995.]] Google ScholarDigital Library
- 35.Y. Sagiv and M. Yannakakis. Equivalence among relational expressions with the union and difference operators. Journal of the ACM, 27(4):633-655, 1981.]] Google ScholarDigital Library
- 36.O. Shmueli. Equivalence of datalog queries is undecidable. Journal of Logic Programming, 15:231-241, 1993.]] Google ScholarDigital Library
- 37.L. Stockmeyer. The polynomiM-time hierarchy. Theoretical Computer Science, 3:1-22, 1976.]]Google ScholarCross Ref
- 38.O. G. TsatMos, M. H. Solomon, and Y. E. Ioannidis. The GMAP: A versatile tool for physical data independence. VLDB Journal, 5(2):101-118, 1996.]] Google ScholarDigital Library
- 39.R. van der Meyden. The complexity of querying indefinite data about linearly ordered domains. In Proc. of the A CM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 331-345, San Diego, CA., 1992.]] Google ScholarDigital Library
- 40.H. Z. Yang and P. A. Larson. Query transformation for PSJ- queries. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), pages 245-254, Brighton, England, 1987.]] Google ScholarDigital Library
Index Terms
- Query containment for data integration systems
Recommendations
Query containment for data integration systems
Special issue on PODS 2000The problem of query containment is fundamental to many aspects of database systems, including query optimization, determining independence of queries from updates, and rewriting queries using views. In the data-integration framework, however, the ...
View-based query containment
PODS '03: Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systemsQuery containment is the problem of checking whether for all databases the answer to a query is a subset of the answer to a second query. In several data management tasks, such as data integration, mobile computing, etc., the data of interest are only ...
Query containment under bag and bag-set semantics
Conjunctive queries (CQs) are at the core of query languages encountered in many logic-based research fields such as AI, or database systems. The majority of existing work assumes set semantics but often in real applications the manipulation of ...
Comments