skip to main content
research-article

Reverse data exchange: Coping with nulls

Published:02 June 2011Publication History
Skip Abstract Section

Abstract

An inverse of a schema mapping M is intended to undo what M does, thus providing a way to perform reverse data exchange. In recent years, three different formalizations of this concept have been introduced and studied, namely the notions of an inverse of a schema mapping, a quasi-inverse of a schema mapping, and a maximum recovery of a schema mapping. The study of these notions has been carried out in the context in which source instances are restricted to consist entirely of constants, while target instances may contain both constants and labeled nulls. This restriction on source instances is crucial for obtaining some of the main technical results about these three notions, but, at the same time, limits their usefulness, since reverse data exchange naturally leads to source instances that may contain both constants and labeled nulls.

We develop a new framework for reverse data exchange that supports source instances that may contain nulls, and we thereby overcome the semantic mismatch between source and target instances of the previous formalizations. The development of this new framework requires a careful reformulation of all the important notions, including the notions of the identity schema mapping, inverse, and maximum recovery. To this effect, we introduce the notions of extended identity schema mapping, extended inverse, and maximum extended recovery, by making systematic use of the homomorphism relation on instances. We give results concerning the existence of extended inverses and of maximum extended recoveries, and results concerning their applications to reverse data exchange and query answering. Moreover, we show that maximum extended recoveries can be used to capture in a quantitative way, the amount of information loss embodied in a schema mapping specified by source-to-target tuple-generating dependencies.

References

  1. Abiteboul, S., Hull, R., and Vianu, V. 1995. Foundations of Databases. Addison-Wesley. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Afrati, F., Li, C., and Pavlaki, V. 2008. Data exchange: Query answering on incomplete data sources. In Proceedings of the International ICST Conference on Scalable Information Systems (InfoScale). Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Arenas, M., Pérez, J., Reutter, J. L., and Riveros, C. 2009a. Composition and inversion of schema mappings. SIGMOD Record 38, 3, 17--28. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Arenas, M., Pérez, J., Reutter, J. L., and Riveros, C. 2009b. Inverting schema mappings: Bridging the gap between theory and practice. In Proceedings of the International Conference on Very Large Data Bases (VLDB). 1018--1029. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Arenas, M., Pérez, J., Reutter, J. L., and Riveros, C. 2010. Foundations of schema mapping management. In Proceedings of ACM Symposium on Principles of Database Systems (PODS). 227--238. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Arenas, M., Pérez, J., and Riveros, C. 2009. The recovery of a schema mapping: Bringing exchanged data back. ACM Trans. Data. Syst. 34, 4. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Beeri, C. and Vardi, M. Y. 1984. A proof procedure for data dependencies. J. ACM 31, 4, 718--741. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Bernstein, P. A. 2003. Applying model management to classical meta-data problems. In Proceedings of the Conference on Innovative Data Systems Research (CIDR). 209--220.Google ScholarGoogle Scholar
  9. Bernstein, P. A., Green, T. J., Melnik, S., and Nash, A. 2008. Implementing mapping composition. VLDB J. 17, 2, 333--353. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Deutsch, A. and Tannen, V. 2001. Optimization properties for classes of conjunctive regular path queries. In Proceedings of the International Workshop on Database Programming Languages (DBPL). 21--39. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Fagin, R. 2007. Inverting schema mappings. ACM Trans. Data. Syst. 32, 4. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Fagin, R., Kolaitis, P. G., Miller, R. J., and Popa, L. 2005a. Data exchange: Semantics and query answering. Theor. Comput. Sci. 336, 1, 89--124. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Fagin, R., Kolaitis, P. G., Popa, L., and Tan, W.-C. 2005b. Composing schema mappings: Second-order dependencies to the rescue. ACM Trans. Data. Syst. 30, 4, 994--1055. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Fagin, R., Kolaitis, P. G., Popa, L., and Tan, W. C. 2008. Quasi-inverses of schema mappings. ACM Trans. Data. Syst. 33, 2. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Fagin, R., Kolaitis, P. G., Popa, L., and Tan, W.-C. 2009. Reverse data exchange: Coping with nulls. In Proceedings of the ACM Symposium on Principles of Database Systems (PODS). 23--32. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Fagin, R., Kolaitis, P. G., Popa, L., and Tan, W.-C. 2011. Schema mapping evolution through composition and inversion. In Schema Matching and Mapping, Z. Bellahsene and A. Bonifati and E. Rahm, Ed. Springer, 191--222.Google ScholarGoogle Scholar
  17. Fagin, R. and Nash, A. 2010. The structure of inverses in schema mappings. J. ACM 57, 6. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Fuxman, A., Hernández, M. A., Ho, C. T. H., Miller, R. J., Papotti, P., and Popa, L. 2006. Nested mappings: Schema mapping reloaded. In Proceedings of the International Conference on Very Large Data Bases (VLDB). 67--78. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Imieliński, T. and Lipski, Jr., W. 1983. Incomplete information and dependencies in relational databases. In Proceedings of the ACM Symposium on Management of Data (SIGMOD). 178--184. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Imieliński, T. and Lipski, Jr., W. 1984. Incomplete information in relational databases. J. ACM 31, 4, 761--791. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Lenzerini, M. 2002. Data integration: A theoretical perspective. In Proceedings of the ACM Symposium on Principles of Database Systems (PODS). 233--246. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Madhavan, J. and Halevy, A. Y. 2003. Composing mappings among data sources. In Proceedings of the International Conference on Very Large Data Bases (VLDB). 572--583. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Melnik, S. 2004. Generic Model Management: Concepts and Algorithms. Lecture Notes in Computer Science, vol. 2967, Springer. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Nash, A., Bernstein, P. A., and Melnik, S. 2005. Composition of mappings given by embedded dependencies. In Proceedings of the ACM Symposium on Principles of Database Systems (PODS). 172--183. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Popa, L., Velegrakis, Y., Miller, R. J., Hernández, M. A., and Fagin, R. 2002. Translating Web data. In Proceedings of the International Conference on Very Large Data Bases (VLDB). 598--609. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Reverse data exchange: Coping with nulls

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          • Published in

            cover image ACM Transactions on Database Systems
            ACM Transactions on Database Systems  Volume 36, Issue 2
            May 2011
            257 pages
            ISSN:0362-5915
            EISSN:1557-4644
            DOI:10.1145/1966385
            Issue’s Table of Contents

            Copyright © 2011 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 2 June 2011
            • Accepted: 1 February 2011
            • Revised: 1 January 2010
            • Received: 1 October 2009
            Published in tods Volume 36, Issue 2

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article
            • Research
            • Refereed

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader