ABSTRACT
At the heart of many data-intensive applications is the problem of quickly and accurately transforming data into a new form. Database researchers have long advocated the use of declarative queries for this process. Yet tools for creating, managing and understanding the complex queries necessary for data transformation are still too primitive to permit widespread adoption of this approach. We present a new framework that uses data examples as the basis for understanding and refining declarative schema mappings. We identify a small set of intuitive operators for manipulating examples. These operators permit a user to follow and refine an example by walking through a data source. We show that our operators are powerful enough both to identify a large class of schema mappings and to distinguish effectively between alternative schema mappings. These operators permit a user to quickly and intuitively build and refine complex data transformation queries that map one data source into another.
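As a rough illustration of how a data example can distinguish between alternative schema mappings, the sketch below compares two candidate mappings over a toy source and reports the target tuples on which they disagree. The relations, attribute names, and the two candidate mappings (an inner join versus a left outer join) are hypothetical and chosen only for illustration; this is not the framework's operator set or the authors' implementation.

```python
# Toy source relations (hypothetical data, not taken from the paper).
children = [
    {"id": 1, "name": "Ann", "parent_id": 10},
    {"id": 2, "name": "Bob", "parent_id": None},
]
parents = [{"id": 10, "phone": "555-0100"}]


def inner_join_mapping(children, parents):
    """Candidate mapping A: keep only children that have a matching parent."""
    by_id = {p["id"]: p for p in parents}
    return [
        (c["name"], by_id[c["parent_id"]]["phone"])
        for c in children
        if c["parent_id"] in by_id
    ]


def outer_join_mapping(children, parents):
    """Candidate mapping B: keep every child, padding with None when no parent matches."""
    by_id = {p["id"]: p for p in parents}
    return [
        (c["name"], by_id[c["parent_id"]]["phone"] if c["parent_id"] in by_id else None)
        for c in children
    ]


def distinguishing_examples(mapping_a, mapping_b, children, parents):
    """Return the target tuples produced by exactly one of the two mappings."""
    out_a = set(mapping_a(children, parents))
    out_b = set(mapping_b(children, parents))
    return sorted(out_a ^ out_b, key=repr)


print(distinguishing_examples(inner_join_mapping, outer_join_mapping, children, parents))
```

In this toy setting the child without a parent is the only example that separates the two mappings, so showing it to a user would be enough to ask which behavior is intended.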