Abstract
In this article, we introduce and study a framework, called peer data exchange, for sharing and exchanging data between peers. This framework is a special case of a full-fledged peer data management system and a generalization of data exchange between a source schema and a target schema. The motivation behind peer data exchange is to model authority relationships between peers, where a source peer may contribute data to a target peer, specified using source-to-target constraints, and a target peer may use target-to-source constraints to restrict the data it is willing to receive, but cannot modify the data of the source peer.A fundamental algorithmic problem in this framework is that of deciding the existence of a solution: given a source instance and a target instance for a fixed peer data exchange setting, can the target instance be augmented in such a way that the source instance and the augmented target instance satisfy all constraints of the setting? We investigate the computational complexity of the problem for peer data exchange settings in which the constraints are given by tuple generating dependencies. We show that this problem is always in NP, and that it can be NP-complete even for “acyclic” peer data exchange settings. We also show that the data complexity of the certain answers of target conjunctive queries is in coNP, and that it can be coNP-complete even for “acyclic” peer data exchange settings.After this, we explore the boundary between tractability and intractability for deciding the existence of a solution and for computing the certain answers of target conjunctive queries. To this effect, we identify broad syntactic conditions on the constraints between the peers under which the existence-of-solutions problem is solvable in polynomial time. We also identify syntactic conditions between peer data exchange settings and target conjunctive queries that yield polynomial-time algorithms for computing the certain answers. For both problems, these syntactic conditions turn out to be tight, in the sense that minimal relaxations of them lead to intractability. Finally, we introduce the concept of a universal basis of solutions in peer data exchange and explore its properties.
- Abiteboul, S. and Duschka, O. M. 1998. Complexity of answering queries using materialized views. In Proceedings of the ACM Symposium on Principles of Database Systems (PODS). 254--263.]] Google ScholarDigital Library
- Arenas, M., Bertossi, L., and Chomicki, J. 1999. Consistent query answers in inconsistent databases. In Proceedings of the ACM Symposium on Principles of Database Systems (PODS). 68--79.]] Google ScholarDigital Library
- Beeri, C. and Vardi, M. 1984. A proof procedure for data dependencies. J. Assoc. Comput. Mach. 31, 4, 718--741.]] Google ScholarDigital Library
- Bernstein, P., Giunchiglia, F., Kementsietsidis, A., Mylopoulos, J., Serafini, L., and Zaihrayeu, I. 2002. Data management for peer-to-peer computing: A vision. In Proceedings of the International Workshop on the Web and Databases (WebDB). 89--94.]]Google Scholar
- Bertossi, L. and Bravo, L. 2004. Query answering in peer-to-peer data exchange systems. In Proceedings of the EDBT Workshop on Peer-to-Peer Computing and Databases. 476--485.]]Google Scholar
- Calì, A., Calvanese, D., De Giacomo, G., and Lenzerini, M. 2004. Data integration under integrity constraints. Inform. Syst. 29, 147--163.]] Google ScholarDigital Library
- Calì, A., Lembo, D., and Rosati, R. 2003. On the decidability and complexity of query answering over inconsistent and incomplete databases. In Proceedings of the ACM Symposium on Principles of Database Systems (PODS). 260--271.]] Google ScholarDigital Library
- Calvanese, D., Damaggio, E., De Giacomo, G., Lenzerini, M., and Rosati, R. 2004a. Semantic data integration in P2P systems. In Databases, Information Systems, and Peer-to-Peer Computing. Lecture Notes in Computer Science, vol. 2944. Springer-Verlag, Berlin, Germany, 77--90.]]Google Scholar
- Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., and Rosati, R. 2005. Inconsistency tolerance in P2P data integration: An epistemic logic approach. In Database Programming Languages. Lecture Notes in Computer Science, vol. 3774. Springer-Verlag, Berlin, Germany, 90--105.]] Google ScholarDigital Library
- Calvanese, D., De Giacomo, G., Lenzerini, M., and Rosati, R. 2004b. Logical foundations of peer-to-peer data integration. In Proceedings of the ACM Symposium on Principles of Database Systems (PODS). 241--251.]] Google ScholarDigital Library
- Chandra, A. K. and Merlin, P. M. 1977. Optimal implementation of conjunctive queries in relational data bases. In Proceedings of the ACM Symposium on Theory of Computing (STOC). 77--90.]] Google ScholarDigital Library
- Deutsch, A. and Tannen, V. 2003. Reformulation of XML queries and constraints. In Proceedings of the International Conference on Database Theory (ICDT). 225--241.]] Google ScholarDigital Library
- Fagin, R., Kolaitis, P. G., Miller, R. J., and Popa, L. 2003. Data exchange: Semantics and query answering. In Proceedings of the International Conference on Database Theory (ICDT). 207--224.]] Google ScholarDigital Library
- Fagin, R., Kolaitis, P. G., Miller, R. J., and Popa, L. 2005a. Data exchange: Semantics and query answering. Theoret. Comput. Sci. (special issue with selected papers from ICDT 2003) 336, 1, 89--124.]] Google ScholarCross Ref
- Fagin, R., Kolaitis, P. G., and Popa, L. 2005b. Data exchange: Getting to the core. ACM Trans. Database Syst. (special issue with selected papers from PODS 2003) 30, 1, 174--210.]] Google ScholarDigital Library
- Franconi, E., Kuper, G., Lopatenko, A., and Serafini, L. 2003. A robust logical and computational characterisation of peer-to-peer database systems. In Proceedings of the VLDB Workshop on Databases, Information Systems and Peer-to-Peer Computing.]]Google Scholar
- Franconi, E., Kuper, G., Lopatenko, A., and Zaihrayeu, I. 2004. The coDB robust peer-to-peer database system. In Proceedings of the Symposium on Advanced Database Systems. 382--393.]]Google Scholar
- Grahne, G. 1991. The Problem of Incomplete Information in Relational Databases. Lecture Notes in Computer Science, vol. 554. Springer-Verlag, Berlin, Germany.]] Google ScholarDigital Library
- Grahne, G. and Mendelzon, A. 1999. Tableau techniques for querying information sources through global schemas. In Proceedings of the International Conference on Database Theory (ICDT). 332--347.]] Google ScholarDigital Library
- Halevy, A., Ives, Z., Suciu, D., and Tatarinov, I. 2005. Schema mediation for large-scale semantic data sharing. VLDB J. 14, 1, 68--83.]] Google ScholarDigital Library
- Kolaitis, P. G., Panttaja, J., and Tan, W. 2006. The complexity of data exchange. In Proceedings of the ACM Symposium on Principles of Database Systems (PODS). To appear.]] Google ScholarDigital Library
- Lenzerini, M. 2002. Data integration: A theoretical perspective. In Proceedings of the ACM Symposium on Principles of Database Systems (PODS). 233--246.]] Google ScholarDigital Library
- Li, C. 2004. Raccoon: A peer-based system for data integration and sharing. In Proceedings of the International Conference on Data Engineering (ICDE). 852. (System Demonstration).]] Google ScholarDigital Library
- Nash, A., Deutsch, A., and Remmel, J. 2006. Data exchange, data integration, and chase. Tech. rep. CS2006-0859, University of California, San Diego, San Diego, CA.]]Google Scholar
- O'Donovan, C., Martin, M. J., Gattiker, A., Gasteiger, E., Bairoch, A., and Apweiler, R. 2002. High-quality protein knowledge resource: Swiss-prot and trembl. Briefings Bioinformat. 3, 3, 275--284.]]Google ScholarCross Ref
- Popa, L., Velegrakis, Y., Miller, R. J., Hernández, M. A., and Fagin, R. 2002. Translating Web data. In Proceedings of the International Conference on Very Large Data Bases (VLDB). 598--609.]]Google ScholarDigital Library
- Tatarinov, I. 2004. Semantic data sharing with a peer data management system. Ph.D. dissertation. University of Washington, Seattle, Washington.]] Google ScholarDigital Library
- Tatarinov, I. and Halevy, A. Y. 2004. Efficient query reformulation in peer data management systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 539--550.]] Google ScholarDigital Library
- van der Meyden, R. 1998. Logical approaches to incomplete information: A survey. In Logics for Databases and Information Systems, J. Chomicki and G. Saake, Eds. Kluwer, Dordrecht, The Netherlands, 307--356.]] Google ScholarDigital Library
Index Terms
- Peer data exchange
Recommendations
Data exchange: getting to the core
Special Issue: SIGMOD/PODS 2003Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. Given a source instance, there may be many solutions to the data exchange ...
Composing schema mappings: Second-order dependencies to the rescue
Special Issue: SIGMOD/PODS 2004A schema mapping is a specification that describes how data structured under one schema (the source schema) is to be transformed into data structured under a different schema (the target schema). A fundamental problem is composing schema mappings: given ...
Data exchange: computing cores in polynomial time
PODS '06: Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systemsData exchange deals with inserting data from one database into another database having a different schema. We study and solve a central computational problem of data exchange, namely, computing the core of a universal solution to a data exchange ...
Comments