Abstract
Though partially automated, developing schema mappings remains a complex and potentially error-prone task. In this paper, we present TRAMP (TRAnsformation Mapping Provenance), an extensive suite of tools supporting the debugging and tracing of schema mappings and transformation queries. TRAMP combines and extends data provenance with two novel notions, transformation provenance and mapping provenance, to explain the relationship between transformed data and the transformations and mappings that produced it. In addition, we provide query support for transformations, data, and all forms of provenance. We formally define transformation and mapping provenance, present an efficient implementation of both forms of provenance, and evaluate the resulting system through extensive experiments.