skip to main content
10.1145/335168.335208acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
Article
Free Access

Query containment for data integration systems

Published:01 May 2000Publication History

ABSTRACT

The problem of query containment is fundamental to many aspects of database systems, including query optimization, determining independence of queries from updates, and rewriting queries using views. In the data integration framework, however, the standard notion of query containment does not suffice. We define relative containment, which formalizes the notion of query containment relative to the sources available to the integration system. First we provide optimal bounds for relative containment for several important classes of datalog queries, including the common case of conjunctive queries. Next we provide bounds for the case when sources enforce access restrictions in the form of binding pattern constraints. Surprisingly, we show that relative containment for conjunctive queries is still decidable in this case, even though it is known that finding all answers to such queries may require a recursive datalog program over the sources. Finally, we provide tight bounds for variants of relative containment when the queries and source descriptions may contain comparison predicates.

References

  1. 1.S. Abiteboul and O. Duschka. Complexity of answering queries using materialized views. In Proe. of the A CM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), Seattle, WA, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. 2.S. Adali, K. Candan, Y. Papakonstantinou, and V. Subrahmanian. Query caching and optimization in distributed mediator systems. In Proc. of A CM SIGMOD Conf. on Management of Data, Montreal, Canada, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. 3.A. Aho, Y. Sagiv, and J. D. Ullman. Equivalence of relational expressions. SIAM Journal of Computing, (8)2:218- 246, 1979.]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. 4.J. L. Ambite, N. Ashish, G. Barish, C. A. Knoblock, S. Minton, P. J. Modi, I. Muslea, A. Philpot, and S. Tejada. ARIADNE: A system for constructing mediators for internet sources (system demonstration). In Proe. of A CM SIGMOD Conf. on Management of Data, Seattle, WA, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. 5.C. Beeri, G. Elber, T. Milo, Y. Sagiv, O.Shmueli, N.Tishby, Y.Kogan, D.Konopnicki, P. Mogilevski, and N.Slonim. Websuite-a tool suite for harnessing web data. In Proceedings of the International Workshop on the Web and Databases, Valencia, Spain, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. 6.C. Beeri, A. Y. Levy, and M.-C. Rousset. Rewriting queries using views in description logics. In Proc. of the A CM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), Tucson, Arizona., 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. 7.D. Calvanese, G. D. Giacomo, and M. Lenzerini. Answering queries using views in description logics. In Working notes of the KRDB Workshop, 1999.]]Google ScholarGoogle Scholar
  8. 8.T. Catarci and M. Lenzerini. Representing and using interschema knowledge in cooperative information systems. Journal of Intelligent and Cooperative Information Systems, 1993.]]Google ScholarGoogle ScholarCross RefCross Ref
  9. 9.A. Chandra and P. Merlin. Optimal implementation of conjunctive queries in relational databases. In Proceedings of the Ninth Annual A CM Symposium on Theory of Computing, pages 77-90, 1977.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. 10.S. Chaudhuri, R. Krishnamurthy, S. Potamianos, and K. Shim. Optimizing queries with materialized views. In Proe. of Int. Conf. on Data Engineering (ICDE), Taipei, Taiwan, 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. 11.S. Chaudhuri and M. Vardi. On the equivalence of recursive and nonrecursive datalog programs. In Proc. of the A CM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 55-66, San Diego, CA., 1992.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. 12.C. Chen and N. Roussopoulos. Implementation and performance evaluation of the ADMS query optimizer. In Proc. of the Conf. on Extending Database Technology (EDBT), March 1994.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. 13.W. Cohen. Integration of heterogeneous databases without common domains using queries based on textual similarity. In Proc. of A CM SIGMOD Conf. on Management of Data, Seattle, WA, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. 14.S. Dar, M. J. Franklin, B. Jonsson, D. Srivastava, and M. Tan. Semantic data caching and replacement. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), pages 330-341, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. 15.O. Duschka, M. Genesereth, and A. Levy. Recursive query plans for data integration. Journal of Logic Programming, special issue on Logic Based Heterogeneous Information Systerns, 1999.]]Google ScholarGoogle Scholar
  16. 16.O. M. Duschka and M. R. Genesereth. Answering recursive queries using views. In Proc. of the A CM SIGACT- SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), Tucson, Arizona., 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. 17.O. M. Duschka and M. R. Genesereth. Query planning in infomaster. In Proceedings of the A CM Symposium on Applied Computing, San Jose, CA, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. 18.D. Florescu, L. Raschid, and P. Valduriez. A methodology for query reformulation in cis using semantic knowledge. Int. Journal of Intelligent ~4 Cooperative Information Systems, special issue on Formal Methods in Cooperative Information Systems, 5(4), 1996.]]Google ScholarGoogle ScholarCross RefCross Ref
  19. 19.M. Friedman, A. Levy, and T. Millstein. Navigational plans for data integration. In Proceedings of the National Conference on Artificial Intelligence, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. 20.M. Friedman and D. Weld. Efficient execution of information gathering plans. In Proceedings of the International Joint Conference on Artificial Intelligence, Nagoya, Japan, 1997.]]Google ScholarGoogle Scholar
  21. 21.M. T. Friedman. Optimization Issues in Data Integration. PhD thesis, University of Washington, 1999.]]Google ScholarGoogle Scholar
  22. 22.H. Garcia-Molina, Y. Papakonstantinou, D. Quass, A. Rajaraman, Y. Sagiv, J. Ullman, and J. Widom. The TSIMMIS project: Integration of heterogeneous information sources. Journal of Intelligent Information Systems, 8(2):117-132, March 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. 23.G. Grahne and A. O. Mendelzon. Tableau techniques for querying information sources through global schemas. In Proc. of the Int. Conf. on Database Theory (ICDT), 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. 24.A. Gupta, Y. Sagiv, J. D. Ullman, and J. Widom. Constraint checking with partial information. In Proc. of the A CM SIGA CT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 45-55, Minneapolis, Minnesota, 1994.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. 25.L. Haas, D. Kossmann, E. Wimmers, and J. Yang. Optimizing queries across diverse data sources. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), Athens, Greece, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. 26.Z. Ives, D. Florescu, M. Friedman, A. Levy, and D. Weld. An adaptive query execution engine for data integration. In Proc. of A CM SIGMOD Conf. on Management of Data, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. 27.A. M. Keller and J. Basu. A predicate-based caching scheme for client-server database architectures. VLDB Journal, 5(1):35-47, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. 28.A. Klug. On conjunctive queries containing inequalities. Journal of the ACM, pages 35(1): 146-160, 1988.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. 29.C. T. Kwok and D. S. Weld. Planning to gather information. In Proceedings of the AAAI Thirteenth National Conference on Artificial Intelligence, 1996.]]Google ScholarGoogle Scholar
  30. 30.E. Lambrecht, S. Kambhampati, and S. Gnanaprakasam. Optimizing recursive information gathering plans. In Proceedings of the 16th International Joint Conference on Attificial Intelligence, pages 1204-1210, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. 31.A. Y. Levy, A. O. Mendelzon, Y. Sagiv, and D. Srivastava. Answering queries using views. In Proc. of the A CM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), San Jose, CA, 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. 32.A. Y. Levy, A. Rajaraman, and J. J. Ordille. Querying heterogeneous information sources using source descriptions. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), Bombay, India, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. 33.A. Y. Levy and Y. Sagiv. Queries independent of updates. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), pages 171-181, Dublin, Ireland, 1993.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. 34.A. Rajaraman, Y. Sagiv, and J. D. Ullman. Answering queries using templates with binding patterns. In Proc. of the A CM SIGA CT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), San Jose, CA, 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. 35.Y. Sagiv and M. Yannakakis. Equivalence among relational expressions with the union and difference operators. Journal of the ACM, 27(4):633-655, 1981.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. 36.O. Shmueli. Equivalence of datalog queries is undecidable. Journal of Logic Programming, 15:231-241, 1993.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. 37.L. Stockmeyer. The polynomiM-time hierarchy. Theoretical Computer Science, 3:1-22, 1976.]]Google ScholarGoogle ScholarCross RefCross Ref
  38. 38.O. G. TsatMos, M. H. Solomon, and Y. E. Ioannidis. The GMAP: A versatile tool for physical data independence. VLDB Journal, 5(2):101-118, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. 39.R. van der Meyden. The complexity of querying indefinite data about linearly ordered domains. In Proc. of the A CM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 331-345, San Diego, CA., 1992.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. 40.H. Z. Yang and P. A. Larson. Query transformation for PSJ- queries. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), pages 245-254, Brighton, England, 1987.]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Query containment for data integration systems

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        PODS '00: Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
        May 2000
        281 pages
        ISBN:158113214X
        DOI:10.1145/335168

        Copyright © 2000 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 May 2000

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • Article

        Acceptance Rates

        PODS '00 Paper Acceptance Rate26of119submissions,22%Overall Acceptance Rate642of2,707submissions,24%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader