Understanding database schema evolution: A case study

https://doi.org/10.1016/j.scico.2013.11.025
Under an Elsevier user license
open archive

Highlights

  • We present a tool-supported method to analyze the history of a database schema.

  • The method makes use of mining software repositories (MSR) techniques.

  • We report on the application of the method to a large-scale case study.

Abstract

Database reverse engineering (DRE) has traditionally been carried out by considering three main information sources: (1) the database schema, (2) the stored data, and (3) the application programs. These information sources are not always all available, or of sufficient quality to inform the DRE process. For example, getting access to real-world data is often extremely problematic for information systems that maintain private data. In recent years, the analysis of the evolution history of software programs has gained an increasing role in reverse engineering in general, but comparatively little such research has been carried out in the context of database reverse engineering. The goal of this paper is to contribute to narrowing this gap and to explore the use of the database evolution history as an additional information source to aid database schema reverse engineering. We present a tool-supported method for analyzing the evolution history of legacy databases, and we report on a large-scale case study of reverse engineering a complex information system, which we curate as a benchmark for future research efforts within the community.

Keywords

Database understanding
Schema evolution
Software repository mining
