ABSTRACT
A schema mapping is a high-level specification of the relationship between two database schemas. For the past fifteen years, schema mappings have played an essential role in the modeling and analysis of data exchange, data integration, and related data interoperability tasks. The aim of this talk is to critically reflect on the body of work carried out to date, describe some of the persisting challenges, and suggest directions for future work. The first part of the talk will focus on schema-mapping languages, especially on the language of GLAV (global-and-local-as-view) mappings and its two main sublanguages, the language of GAV (global-as-view) mappings and the language of LAV (local-as-view) mappings. After highlighting the fundamental structural properties of these languages, we will discuss how structural properties can actually characterize schema-mapping languages. The second part of the talk will focus on metadata management by considering operators on schema mappings, such as the composition operator and the inverse operator. We will discuss why richer languages are needed to express these operators, and will illustrate some of their uses in schema-mapping evolution. The third and final part of the talk will focus on the derivation of schema mappings from semantic information. In particular, we will discuss a variety of approaches for deriving schema mappings from data examples, including casting the derivation of schema mappings as an optimization problem and as a learning problem.
Index Terms
- Reflections on Schema Mappings, Data Exchange, and Metadata Management