DOI: 10.1145/1007568.1007611

Constraint-based XML query rewriting for data integration

Published: 13 June 2004

ABSTRACT

We study the problem of answering queries through a target schema, given a set of mappings between one or more source schemas and this target schema, and given that the data is at the sources. The schemas can be any combination of relational or XML schemas, and can be independently designed. In addition to the source-to-target mappings, we consider as part of the mapping scenario a set of target constraints specifying additional properties on the target schema. This becomes particularly important when integrating data from multiple data sources with overlapping data and when such constraints can express data merging rules at the target. We define the semantics of query answering in such an integration scenario, and design two novel algorithms, basic query rewrite and query resolution, to implement the semantics. The basic query rewrite algorithm reformulates target queries in terms of the source schemas, based on the mappings. The query resolution algorithm generates additional rewritings that merge related information from multiple sources and assemble a coherent view of the data, by incorporating target constraints. The algorithms are implemented and then evaluated using a comprehensive set of experiments based on both synthetic and real-life data integration scenarios.
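To make the two stages in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the source relations, the field names, the mapping functions `map_a` and `map_b`, and the helpers `basic_rewrite` and `resolve_with_key` are all hypothetical names chosen for illustration. The sketch also swaps in a simpler technique for the second stage: it merges result tuples after the fact using a key constraint on `id`, whereas the abstract describes query resolution as generating additional rewritings.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]

# Hypothetical data: two independently designed sources with overlapping people.
SOURCE_A: List[Row] = [
    {"pid": "1", "fullname": "Ada"},
    {"pid": "2", "fullname": "Bob"},
]
SOURCE_B: List[Row] = [
    {"person": "1", "mail": "ada@example.org"},
]

# Hypothetical source-to-target mappings into a target schema (id, name, email).
# Each source can only populate part of a target tuple; the rest stays None.
def map_a(r: Row) -> Row:
    return {"id": r["pid"], "name": r["fullname"], "email": None}

def map_b(r: Row) -> Row:
    return {"id": r["person"], "name": None, "email": r["mail"]}

MAPPINGS: List[Tuple[List[Row], Callable[[Row], Row]]] = [
    (SOURCE_A, map_a),
    (SOURCE_B, map_b),
]

def basic_rewrite(predicate: Callable[[Row], bool]) -> List[Row]:
    """Answer a target-side selection by pushing it through each mapping
    and evaluating it directly over the source data."""
    answers: List[Row] = []
    for rows, mapping in MAPPINGS:
        for r in rows:
            t = mapping(r)
            if predicate(t):
                answers.append(t)
    return answers

def resolve_with_key(rows: List[Row], key: str = "id") -> List[Row]:
    """Assumed target constraint: `key` identifies a target tuple, so tuples
    that agree on it are merged. Missing fields are filled from later tuples;
    on conflict the first non-null value wins (an arbitrary choice here)."""
    merged: Dict[Optional[str], Row] = {}
    for t in rows:
        current = merged.setdefault(t[key], dict(t))
        for field, value in t.items():
            if current.get(field) is None:
                current[field] = value
    return list(merged.values())

if __name__ == "__main__":
    partial = basic_rewrite(lambda t: t["id"] == "1")
    print(partial)                    # two partial tuples, one per source
    print(resolve_with_key(partial))  # one merged tuple for id "1"
```

Under these assumptions, the first stage alone returns one partial tuple per source for the same person, and only the key constraint lets the two be assembled into a single coherent answer.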

Published in

SIGMOD '04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data
June 2004
988 pages
ISBN: 1581138598
DOI: 10.1145/1007568

Copyright © 2004 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Overall Acceptance Rate: 785 of 4,003 submissions, 20%
