skip to main content
10.1145/1007568.1007629acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Efficient query reformulation in peer data management systems

Published:13 June 2004Publication History

ABSTRACT

Peer data management systems (PDMS) offer a flexible architecture for decentralized data sharing. In a PDMS, every peer is associated with a schema that represents the peer's domain of interest, and semantic relationships between peers are provided locally between pairs (or small sets) of peers. By traversing semantic paths of mappings, a query over one peer can obtain relevant data from any reachable peer in the network. Semantic paths are traversed by reformulating queries at a peer into queries on its neighbors.Naively following semantic paths is highly inefficient in practice. We describe several techniques for optimizing the reformulation process in a PDMS and validate their effectiveness using real-life data sets. In particular, we develop techniques for pruning paths in the reformulation process and for minimizing the reformulated queries as they are created. In addition, we consider the effect of the strategy we use to search through the space of reformulations. Finally, we show that pre-computing semantic paths in a PDMS can greatly improve the efficiency of the reformulation process. Together, all of these techniques form a basis for scalable query reformulation in PDMS.To enable our optimizations, we developed practical algorithms, of independent interest, for checking containment and minimization of XML queries, and for composing XML mappings.

References

  1. K. Aberer, P. Cudre-Mauroux, and M. Hauswirth. A framework for semantic gossiping. SIGMOD Record, 31(4), 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. S. Abiteboul and O. Duschka. Complexity of answering queries using materialized views. In Proc. of PODS, pages 254--263, Seattle, WA, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison Weseley, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. S. Amer-Yahia, S. Cho. L. Lakshmanan, and D. Srivastava. Minimization of tree pattern queries. In Proc. of SIGMOD, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. Arenas, V. Kantere, A. Kementsietsidis, I. Kiringa, R. J. Miller, and J. Mylopoulos. The hyperion project: From data integration to data coordination. SIGMOD Record, September 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. T. Berners-Lee, J. Hendler, and O. Lassila. The semantic web. Scientific American, May 2001.Google ScholarGoogle ScholarCross RefCross Ref
  7. P. Bernstein, F. Giunchiglia, A. Kementsietsidis, J. Mylopoulos, L. Serafini, and I. Zaihrayeu. Data management for peer-to-peer computing: A vision. In Proceedings of the WebDB Workshop, 2002.Google ScholarGoogle Scholar
  8. S. Cluet, P. Veltri, and D. Vodislav. Views in a large scale XML repository. In Proc. of VLDB, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. A. Deutsch and V. Tannen. Containment and integrity constraints for xpath fragments. In KRDB, 2001.Google ScholarGoogle Scholar
  10. A. Deutsch and V. Tannen. Mars: A system for publishing xml from mixed and redundant storage. In Proc. of VLDB, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. H.-H. Do and E. Rahm. COMA - a system for flexible combination of schema matching approaches. In Proc. of VLDB, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. Doan, P. Domingos, and A. Halevy. Reconciling schemas of disparate data sources: a machine learning approach. In Proc. of SIGMOD, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. X. Dong, A. Halevy, and I. Tatarinov. Containment of nested XML queries. Submitted for publication, 2004.Google ScholarGoogle Scholar
  14. O. M. Duschka and M. R. Genesereth. Query planning in Informaster. In Proc. of the ACM Symposium on Applied Computing, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. R. Fagin, P. Kolaitis, L. Popa, and W.-C. Tan. Composing schema mappings: Second-order dependencies to the rescue. In Proc. of PODS, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. M. Fernandez, W.-C. Tan, and D. Suciu. Silkroute: Trading between relations and xml. In Proc. of the Int. WWW Conf., 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. S. Flesca, F. Furfaro, and E. Masciari. On the minimization of xpath queries. In VLDB, 2003.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. D. Florescu, A. Levy, and A. Mendelzon. Database techniques for the world-wide web: A survey. SIGMOD Record, 27(3):59--74, September 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. A. Halevy, Z. Ives, P. Mork, and I. Tatarinov. Piazza: Data management infrastructure for semantic web applications. In Proc. of the Int. WWW Conf., 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. A. Halevy, Z. Ives, D. Suciu, and I. Tatarinov. Schema mediation in peer data management systems. In Proc. of ICDE, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  21. A. Y. Halevy. Answering queries using views: A survey. VLDB Journal, 10(4), 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. A. Y. Halevy, Z. G. Ives, D. Suciu, and I. Tatarinov. Schema mediation in peer data management systems. In Proc. of ICDE, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  23. M. Lenzerini. Data integration: A theoretical perspective. In Proc. of PODS, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. A. Y. Levy and D. Suciu. Deciding containment for queries with complex objects and aggregations. In Proc. of PODS, Tucson, Arizona., 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. J. Madhavan and A. Halevy. Composing mappings among data sources. In Proc. of VLDB, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. G. Miklau and D. Suciu. Containment and equivalence for an xpath fragment. In Proc. of PODS, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. B. Ooi, Y. Shu, and K.-L. Tan. Relational data sharing in peer-based data management systems. SIGMOD Record, 23(3), 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. E. Rahm and P. A. Bernstein. A survey of approaches to automatic schema matching. VLDB Journal, 10(4):334--350, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  1. Efficient query reformulation in peer data management systems

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGMOD '04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data
        June 2004
        988 pages
        ISBN:1581138598
        DOI:10.1145/1007568

        Copyright © 2004 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 13 June 2004

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate785of4,003submissions,20%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader