ABSTRACT
Data Exchange is the problem of inserting data structured under a source schema into a target schema of different structure (possibly with integrity constraints), while reflecting the source data as accurately as possible. We study computational issues related to data exchange in the setting of Fagin, Kolaitis, and Popa(PODS'03). We use the technique of hypertree decompositions to derive improved algorithms for computing the core of a relational instance with labeled nulls, a problem we show to be fixed-parameter intractable with respect to the block size of the input instances. We show that computing the core of a data exchange problem is tractable for two large and useful classes of target constraints. The first class includes functional dependencies and weakly acyclic inclusion dependencies. The second class consists of full tuple generating dependencies and arbitrary equation generating dependencies. Finally, we show that computing cores is NP-hard in presence of a system-predicate NULL(x), which is true iff x is a null value.
- S. Abiteboul, R. Hull, V. Vianu. Foundations of Databases, Addison-Wesley Publishing Company, 1995. Google ScholarDigital Library
- A. Aho, J. Hopcroft, and J. Ullman. Data Structures and Algorithms. Addison-Wesley, Reading MA, 1983. Google ScholarDigital Library
- A. Aho, J. Sagiv, and J. Ullman. Equivalence of Relational Expressions. SIAM J. Comput. 8:2, 1979.Google ScholarDigital Library
- A. Aho, J. Sagiv, and J. Ullman. Efficient Optimization of a Class of Relational Expressions. TODS 4:4, 1979. Google ScholarDigital Library
- M. Arenas, P. Barceló, R. Fagin, and L. Libkin, Locally Consistent Transformations and Query Answering in Data Exchange. Proc. ACM PODS 2004, pp. 229--240. Google ScholarDigital Library
- P. Atzeni and E. P. F. Chan. Independent Database Schemes under Functional and Inclusion Dependencies. Acta Informatica 28:8, pp. 777--779, 1991. Google ScholarDigital Library
- C. Beeri and M. Vardi. A Proof Procedure for Data Dependencies. J.ACM. 31:4, PP. 718--741, 1984. Google ScholarDigital Library
- M. A. Casanova, R. Fagin, and C. H. Papadimitriou. Inclusion Dependencies and Their Interaction with Functional Dependencies. JCSS 28:1, 1984.Google Scholar
- A. K. Chandra and P. M. Merlin. Optimal Implementation of Conjunctive Queries in Relational Databases. STOC 1977. Google ScholarDigital Library
- C. Chekuri and A. Rajaraman. Conjunctive Query Containment Revisited. TCS 239:2, 2000. Google ScholarDigital Library
- R. G. Downey and M. R. Fellows. Parameterized Complexity. Springer, 1999. Google ScholarCross Ref
- R. G. Downey and M. R. Fellows and U. Tailor. The parameterized Complexity of Relational Database Queries and an Improved Characterization of W{1}. In D. S. Bridges et al., editors, Combinatorics, Complexity, and Logic -- Proc. DMTCS'96.Google Scholar
- R. Fagin, Ph. Kolaitis, R. Miller, and L. Popa, Data Exchange: Semantics and Query Answering, ICDT'03, pp. 207--224. Full version to appear in TCS. Google ScholarDigital Library
- R. Fagin, Ph. Kolaitis, and L. Popa, Data Exchange: Getting to the Core. In Proc. PODS'03, pp. 90--101, 2003. Full version to appear in ACM TODS. Google ScholarDigital Library
- J. Flum, M. Frick, and M. Grohe. Query Evaluation via Tree-Decompositions. J. ACM49:6, 2002. Google ScholarDigital Library
- G. Gottlob. Computing Cores for Data Exchange: New Algorithms and Practical Solutions. Extended version of the present paper. Currently available at: www.dbai.tuwien.ac.at/staff/gottlob/extcore.pdfGoogle Scholar
- G. Gottlob and A. Leitsch. On the efficiency of Subsumption Algorithms. J.ACM 32:2, 1985. Google ScholarDigital Library
- G. Gottlob, N. Leone, and F. Scarcello. Hypertree Decompositions and Tractable queries. Journal of Computer and System Sciences 64:3, pp. 579--627, 2002.Google ScholarDigital Library
- G. Gottlob, N. Leone, and F. Scarcello. The Complexity of Acyclic Conjunctive Queries. J.ACM 48:3, pp. 431--498, 2001. Google ScholarDigital Library
- G. Gottlob, N. Leone, and F. Scarcello. Robbers, Marshals, and Guards: Game Theoretic and Logical Characterizations of Hypertree Width. Journal of Computer and System Sciences 66(4): 775--808, 2003. Google ScholarDigital Library
- G. Gottlob, N. Leone, and F. Scarcello. A Comparison of Structural CSP Decomposition Methods. Artificial Intelligence 124:2, pp. 243--282, 2000. Google ScholarDigital Library
- G. Gottlob, R. Pichler. Hypergraphs in Model Checking: Acyclicity in Hypertree-Width versus Clique-Width. SIAM J. Comput., 33:2, 2004. Google ScholarDigital Library
- C. Gutiérrez, C. Hurtado, and A. Mendelzon. Foundations of Semantic Web Databases. ACM PODS 2004, pp. 95--106, 2004. Google ScholarDigital Library
- M. Grohe. Parameterized Complexity for the Database Theorist. ACM SIGMOD Record, 31:4, December 2002, pp. 86--96, 2002. Google ScholarDigital Library
- T. Imielinski. Abstraction in Query Processing. Journal of the ACM 38:3, pp. 534--558, 1991. Google ScholarDigital Library
- Ph. Kolaitis. Data Exchange: Schema Mappings, Data Exchange, and Metadata Management.PODS'05. Google ScholarDigital Library
- Ph. G. Kolaitis, M. Y. Vardi. Conjunctive-Query Containment and Constraint Satisfaction. Journal of Computer and System Sciences 61:2, pp. 302--332, 2000. Google ScholarDigital Library
- M. Levene, G. Loizou. How to Prevent Interaction of Functional and Inclusion Dependencies. Information Processing Letters 71:3--4, pp. 115--125, 1999.Google ScholarCross Ref
- M. Levene, G. Loizou. Guaranteeing no interaction between functional dependencies and tree-like inclusion dependencies. TCS 254:1--2, 2001. Google ScholarDigital Library
- A. Y. Levy, A. O. Mendelzon, Y. Sagiv, D. Srivastava. Answering Queries Using Views. PODS 1995. pp. 95--104, 1995. Google ScholarDigital Library
- H. Mannila and K.-J. Räihä. The Design of Relational Datases. Addison Wesley 1992. Google ScholarDigital Library
- D. Maier, A. Mendelzon, and Y. Sagiv. Testing Implication of Data Dependencies, ACM TODS 4:4, pp. 455--469, 1979. Google ScholarDigital Library
- R. Miller, L. Haas, and M. Hernández. Schema Mapping as Query Discovery. VLDB 2000. Google ScholarDigital Library
- L. Popa, Y. Velegrakis, R. Miller, M. Hernández, and R. Fagin. Translating Web Data. VLDB 2002. Google ScholarDigital Library
- C.H. Papadimitriou and M. Yannakakis. On the Complexity of Database Queries. Journal of Computers and System Sciences, 58:407--427, 1999. Google ScholarDigital Library
- X. Qian. Query Folding. ICDE 1996. Google ScholarDigital Library
- N. Robertson and P. D. Seymour. Graph Minors II. Algorithmic Aspects of Tree-Width. J. Algorithms, 7:309--322, 1986.Google Scholar
- F. Scarcello, G. Greco, and N. Leone. Weighted Hypertree. Decompositions and Optimal Query Plans. PODS 2004. Google ScholarDigital Library
- J.D. Ullman. Principles of Database and Knowledge Base Systems, Computer Science Press, Rockville, MD, vol. I, 1988, and vol. II, 1989. Google ScholarDigital Library
- M. Yannakakis. Algorithms for Acyclic Database Schemes. VLDB 1981, pp. 82--94.Google Scholar
Recommendations
Data exchange: getting to the core
Special Issue: SIGMOD/PODS 2003Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. Given a source instance, there may be many solutions to the data exchange ...
Data exchange: getting to the core
PODS '03: Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systemsData exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. Given a source instance, there may be many solutions to the data exchange ...
Data exchange: computing cores in polynomial time
PODS '06: Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systemsData exchange deals with inserting data from one database into another database having a different schema. We study and solve a central computational problem of data exchange, namely, computing the core of a universal solution to a data exchange ...
Comments