Abstract
A schema mapping is a declarative specification of the relationship between instances of a source schema and a target schema. The data exchange (or data translation) problem asks: given an instance over the source schema, materialize an instance (or solution) over the target schema that satisfies the schema mapping. In general, a given source instance may have numerous different solutions. Among all the solutions, universal solutions and core universal solutions have been singled out and extensively studied. A universal solution is a most general one and also represents the entire space of solutions, while a core universal solution is the smallest universal solution and is unique up to isomorphism (hence, we can talk about the core).
The problem of designing efficient algorithms for computing the core has attracted considerable attention in recent years. In this paper, we present a method for directly computing the core by SQL queries, when schema mappings are specified by source-to-target tuple-generating dependencies (s-t tgds). Unlike prior methods that, given a source instance, first compute a target instance and then recursively minimize that instance to the core, our method avoids the construction of such intermediate instances. This is done by rewriting the schema mapping into a laconic schema mapping that is specified by first-order s-t tgds with a linear order in the active domain of the source instances. A laconic schema mapping has the property that a "direct translation" of the source instance according to the laconic schema mapping produces the core. Furthermore, a laconic schema mapping can be easily translated into SQL, hence it can be optimized and executed by a database system to produce the core. We also show that our results are optimal: the use of the linear order is inevitable and, in general, schema mappings with constraints over the target schema cannot be rewritten to a laconic schema mapping.
- A. Bonifati, E. Q. Chang, T. Ho, L. V. Lakshmanan, and R. Pottinger. HePToX: Marrying XML and Heterogeneity in Your P2P Databases. In VLDB (demo), pages 1267--1270, 2006. Google ScholarDigital Library
- A. K. Chandra and P. M. Merlin. Optimal Implementation of Conjunctive Queries in Relational Data Bases. In STOC, pages 77--90, 1977. Google ScholarDigital Library
- L. Chiticariu. Computing the Core in Data Exchange: Algorithmic Issues. MS project report. CS Dept., UCSC, 2005.Google Scholar
- J. V. den Bussche and D. V. Gucht. A semideterministic approach to object creation and nondeterminism in database queries. J. Comput. Syst. Sci., 54(1):34--47, 1997.Google ScholarDigital Library
- R. Fagin, P. G. Kolaitis, R. J. Miller, and L. Popa. Data Exchange: Semantics and Query Answering. TCS, 336(1):89--124, 2005. Google ScholarDigital Library
- R. Fagin, P. G. Kolaitis, A. Nash, and L. Popa. Towards a Theory of Schema-mapping Optimization. In PODS, pages 33--42, 2008. Google ScholarDigital Library
- R. Fagin, P. G. Kolaitis, and L. Popa. Data Exchange: Getting to the Core. TODS, 30(1):174--210, 2005. Google ScholarDigital Library
- A. Fuxman, M. A. Hernández, H. Ho, R. J. Miller, P. Papotti, and L. Popa. Nested Mappings: Schema Mapping Reloaded. In VLDB, pages 67--78, 2006. Google ScholarDigital Library
- G. Gottlob and A. Nash. Efficient Core Computation in Data Exchange. JACM, 55(2):1--49, 2008. Google ScholarDigital Library
- L. M. Haas, M. A. Hernández, H. Ho, L. Popa, and M. Roth. Clio Grows Up: From Research Prototype to Industrial Tool. In SIGMOD, pages 805--810, 2005. Google ScholarDigital Library
- M. Lenzerini. Data Integration: A Theoretical Perspective. In PODS, pages 233--246, 2002. Google ScholarDigital Library
- G. Mecca, P. Papotti, and S. Raunich. Core schema mappings. In SIGMOD, 2009. Google ScholarDigital Library
- R. Pottinger and A. Halevy. MiniCon: A Scalable Algorithm for Answering Queries using Views. VLDB Journal, 10(2--3):182--198, 2001. Google ScholarDigital Library
- N. C. Shu, B. C. Housel, R. W. Taylor, S. P. Ghosh, and V. Y. Lum. EXPRESS: A Data EXtraction, Processing, amd REStructuring System. TODS, 2(2):134--174, 1977. Google ScholarDigital Library
- B. ten Cate and P. G. Kolaitis. Structural Characterizations of Schema Mapping Languages. In ICDT, 2009. Google ScholarDigital Library
Index Terms
- Laconic schema mappings: computing the core with SQL queries
Recommendations
Composing schema mappings: second-order dependencies to the rescue
PODS '04: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systemsA schema mapping is a specification that describes how data structured under one schema (the source schema) is to be transformed into data structured under a different schema (the target schema). Schema mappings play a key role in numerous areas of ...
Composing schema mappings: Second-order dependencies to the rescue
Special Issue: SIGMOD/PODS 2004A schema mapping is a specification that describes how data structured under one schema (the source schema) is to be transformed into data structured under a different schema (the target schema). A fundamental problem is composing schema mappings: given ...
Quasi-inverses of schema mappings
Schema mappings are high-level specifications that describe the relationship between two database schemas. Two operators on schema mappings, namely the composition operator and the inverse operator, are regarded as especially important. Progress on the ...
Comments