Abstract
Supporting graceful schema evolution remains an unsolved problem for traditional information systems, and it is further exacerbated in web information systems such as Wikipedia and public scientific databases: in these projects, which are based on multiparty cooperation, the frequency of database schema changes has increased while tolerance for downtime has nearly disappeared. Today, schema evolution is still an error-prone and time-consuming undertaking, because the database administrator (DBA) lacks the methods and tools needed to manage and automate it by (i) predicting and evaluating the effects of proposed schema changes, (ii) rewriting queries and applications to operate on the new schema, and (iii) migrating the database.
Our PRISM system takes a big first step toward addressing this pressing need by providing: (i) a language of Schema Modification Operators to concisely express complex schema changes, (ii) tools that allow the DBA to evaluate the effects of such changes, (iii) optimized translation of old queries to work on the new schema version, (iv) automatic data migration, and (v) full documentation of the changes that have occurred, as needed to support data provenance, database flashback, and historical queries. PRISM solves these problems by integrating recent theoretical advances on mapping composition and invertibility into a design that also achieves usability and scalability. Wikipedia and its 170+ schema versions provided an invaluable testbed for validating PRISM tools and their ability to support legacy queries.
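To make the idea of operator-based schema evolution concrete, the following is a minimal Python sketch, not PRISM's implementation. It assumes two hypothetical, deliberately simple operators (`RenameTable` and `RenameColumn`), a schema modeled as a dictionary from table names to column lists, and a "query" reduced to a table name plus the columns it reads. The table and column names are invented for illustration. Under those assumptions it shows three of the ideas named in the abstract: applying a sequence of operators to a schema, translating a legacy query to the new schema, and inverting the sequence to recover the old schema.

```python
from dataclasses import dataclass

# Hypothetical, simplified operators for illustration only. They are NOT
# PRISM's actual Schema Modification Operator syntax or semantics.


@dataclass(frozen=True)
class RenameTable:
    old: str
    new: str

    def apply(self, schema):
        # Return a new schema in which table `old` is called `new`.
        out = dict(schema)
        out[self.new] = out.pop(self.old)
        return out

    def rewrite(self, table, columns):
        # Redirect a query on the old table name to the new one.
        return (self.new if table == self.old else table, columns)

    def inverse(self):
        return RenameTable(self.new, self.old)


@dataclass(frozen=True)
class RenameColumn:
    table: str
    old: str
    new: str

    def apply(self, schema):
        out = dict(schema)
        out[self.table] = [self.new if c == self.old else c
                           for c in schema[self.table]]
        return out

    def rewrite(self, table, columns):
        if table != self.table:
            return (table, columns)
        return (table, [self.new if c == self.old else c for c in columns])

    def inverse(self):
        return RenameColumn(self.table, self.new, self.old)


def evolve(schema, ops):
    """Apply the operators in order to obtain the new schema version."""
    for op in ops:
        schema = op.apply(schema)
    return schema


def rewrite_query(query, ops):
    """Translate a (table, columns) query from the old schema to the new one."""
    table, columns = query
    for op in ops:
        table, columns = op.rewrite(table, columns)
    return (table, columns)


def invert(ops):
    """Inverse sequence: invert each operator and reverse the order."""
    return [op.inverse() for op in reversed(ops)]


# Invented example schema and evolution steps.
v1 = {"article": ["id", "title", "body"]}
ops = [RenameTable("article", "page"),
       RenameColumn("page", "title", "page_title")]

v2 = evolve(v1, ops)
legacy_query = ("article", ["id", "title"])
new_query = rewrite_query(legacy_query, ops)
restored = evolve(v2, invert(ops))
```

Renames are trivially invertible, which is why the round trip back to `v1` works here. Operators that lose or restructure data (for example, dropping a column or merging tables) would need the mapping composition and invertibility machinery the abstract refers to, which this sketch does not attempt.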
Index Terms
- Graceful database schema evolution: the PRISM workbench