Translating relational schema into XML schema definition with data semantic preservation and XSD graph

https://doi.org/10.1016/j.infsof.2004.09.010

Abstract

Many legacy systems were built on relational databases that were not designed for Internet use. Because the relational database is not well suited to the data explosion, electronic transfer of data, and electronic business on the Web, we introduce a methodology that translates a relational schema into an Extensible Markup Language (XML) schema definition, from which an XML database can be created in a simple and efficient format for the Web. We apply the Indirect Schema Translation Method, a semantic-based methodology, in this project. First, the relational schema is translated into a conceptual model, an Extended Entity Relationship (EER) model, by reverse engineering. Next, the EER model is mapped to an XML Schema Definition Language (XSD) Graph, which serves as the XML conceptual schema, by semantic transformation. Finally, the XSD Graph is mapped into the XSD, the XML logical schema, by forward engineering. The data semantics of participation, cardinality, generalization, aggregation, categorization, and N-ary and unary relationships are preserved in the translated XML schema definition.

Introduction

On the Internet, data retrieval is an important issue tied to the growth of the Internet. Many companies use legacy systems that lack Internet-oriented expression, such as hierarchical, relational, and object-oriented databases, for their daily interoperation with users over the Internet. To overcome this issue, the XML database has become an important structure for presenting and storing data on the Internet. However, the XML database is limited in its ability to present the semantics of its structure when storing data. Biancheri et al. [1] chose to store their data in a relational database rather than in native XML because such a system would enable them to store a larger set of information. In data storage, XML technology is expected to improve so that it can store larger sets of information. At present, system analysts have no toolset for modeling and analyzing XML systems. Our solution applies an XSD Graph as a toolset that eases analysis of the structure of an XML database, as shown in Fig. 1. The tree diagram of the XSD Graph represents the inter-relationships of the different elements inside a system. It is important not only for visualizing, specifying, and documenting structural models, but also for constructing executable systems through forward engineering [2] (from design to implementation).

Many papers have focused on the Document Type Definition (DTD); we focus on the XML schema definition (XSD). We use the XSD Graph, introduced in this paper, to represent the conceptual schema of an XML model. Many papers have used an XML tree that expands not only the elements but also the attributes and data. The benefit is that the reader can analyze all components of an XML schema on the tree. Ng [3] extended the notion of functional dependency (FD) and compared the values of leaf nodes in a specified context of the corresponding XML tree to form an integrated XML tree. The XML tree consisted of elements, attributes, and data values. Vincent and Liu [4] modeled an XML document as a tree consisting of elements, attributes, and data values. They proposed FDs for XML and justified them by mapping a relation to an XML document with the aid of the XML tree. Yue [5] applied key constraints to create an instance XML tree with elements, attributes, and data values.

We propose the XSD Graph for representing the conceptual schema of an XML model because it presents the data semantics to readers in a more user-friendly way. In our approach, the relational schema is mapped into an EER model by reverse engineering (from implementation to design), the EER model is mapped into an XSD Graph, and the XSD Graph is then mapped into the XML Schema Definition Language [6], as shown in Fig. 2.

The data semantic constraint notations are defined as follows:

Functional dependency (FD). A functional dependency is a statement of the form X→Y, where X and Y are sets of attributes. The FD X→Y holds for relation R if, whenever s and t are tuples of R with s[X]=t[X], then s[Y]=t[Y].

Inclusion dependency (ID). An inclusion dependency is a statement of the form X⊆Y, meaning that the values of X are a subset of the values of Y. For example, X is a foreign key of a child relation and Y is the referenced primary key of its parent relation.

Multi-valued dependency (MVD). Let R be a relation, and let X, Y, and Z be attributes of R. Then Y is multi-dependent on X in MVD: X→→Y|Z if and only if the set of Y-values matching a given (X-value, Z-value) pair in R depends only on the X-value and is independent of the Z-value.
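The three definitions above can be checked mechanically on small relations. The following is a minimal Python sketch, not part of the paper's method: relations are assumed to be lists of dictionaries, the function names and the sample rows are hypothetical, and the MVD check assumes that the attributes of R are exactly X, Y, and Z.

```python
from itertools import product


def project(row, attrs):
    """Return the values of `attrs` in `row` as a tuple."""
    return tuple(row[a] for a in attrs)


def fd_holds(rows, X, Y):
    """FD X -> Y: any two rows that agree on X must also agree on Y."""
    seen = {}
    for row in rows:
        x, y = project(row, X), project(row, Y)
        if seen.setdefault(x, y) != y:
            return False
    return True


def id_holds(child_rows, X, parent_rows, Y):
    """ID X subset-of Y: every X-value of the child occurs as a Y-value of the parent."""
    parent_values = {project(r, Y) for r in parent_rows}
    return all(project(r, X) in parent_values for r in child_rows)


def mvd_holds(rows, X, Y, Z):
    """MVD X ->-> Y | Z: for each X-value, the Y-values and Z-values combine freely."""
    groups = {}
    for row in rows:
        pair = (project(row, Y), project(row, Z))
        groups.setdefault(project(row, X), set()).add(pair)
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        # The (Y, Z) pairs must be the full cross product of the Y set and the Z set.
        if pairs != set(product(ys, zs)):
            return False
    return True


# Hypothetical sample data, for illustration only.
suppliers = [
    {"item_no": 1, "name": "ABC Company"},
    {"item_no": 2, "name": "Whitehouse Company"},
]
catalog = [{"item_no": 1}, {"item_no": 2}]
product_rows = [
    {"product": "p1", "part": "a", "feature": "f1"},
    {"product": "p1", "part": "a", "feature": "f2"},
    {"product": "p1", "part": "b", "feature": "f1"},
    {"product": "p1", "part": "b", "feature": "f2"},
]

print(fd_holds(suppliers, ["item_no"], ["name"]))
print(id_holds(catalog, ["item_no"], suppliers, ["item_no"]))
print(mvd_holds(product_rows, ["product"], ["part"], ["feature"]))
```

Dropping any one of the four `product_rows` breaks the cross product of parts and features for `p1`, so the MVD check then fails, which matches the "independent of the Z-value" condition in the definition.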

The rest of the paper is organized as follows. Section 2 presents related work by other researchers. Section 3 presents our methodology of indirect schema translation. Section 4 applies the methodology in a case study. Section 5 shows a prototype, and Section 6 concludes the paper.

Section snippets

Related work

Nicolle [7] created X-Time, which consisted of three layers: a specification layer, a grammar integration layer, and a translation layer. He used the XML model to integrate all other models in order to create a universal data management system. Fong [8] used an XML-based topology for system integration between the relational model and the XML model. The topology consisted of four types: (1) functional dependency; (2) multi-valued dependency; (3) join dependency; and (4) M:N cardinality. Chen [9] used

Methodology of indirect schema translation

Our methodology of indirect schema translation from a relational schema to XSD, through an EER model and an XSD Graph, consists of three processes. The first process extracts all features of the relational schema into an EER model. The second process maps the conceptual schema from the EER model to an XSD Graph. The third process translates the XSD Graph into XSD.
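To make the three-process flow concrete, here is a heavily simplified Python sketch. It is not the paper's rule set, which this excerpt does not reproduce: the `Product` and `Part` relations, the dictionary layout, and every function name are hypothetical, foreign keys are assumed to be acyclic, and each foreign key is treated as a plain parent-to-child relationship with no other semantics recovered.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def reverse_engineer(relations):
    """Process 1 (simplified): relations become entities, foreign keys become relationships."""
    entities = {name: list(spec["columns"]) for name, spec in relations.items()}
    relationships = [
        (parent, name)  # (parent entity, child entity)
        for name, spec in relations.items()
        for parent in spec.get("foreign_keys", {}).values()
    ]
    return entities, relationships


def to_xsd_graph(entities, relationships):
    """Process 2 (simplified): arrange entities as a tree, children nested under parents."""
    children = {name: [] for name in entities}
    for parent, child in relationships:
        children[parent].append(child)
    has_parent = {child for _, child in relationships}
    roots = [name for name in entities if name not in has_parent]
    return roots, children


def to_xsd(entities, roots, children):
    """Process 3 (simplified): emit xs:element and xs:complexType declarations."""
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")

    def emit(container, name, nested):
        attrs = {"name": name}
        if nested:  # occurrence constraints are only allowed on nested elements
            attrs.update(minOccurs="0", maxOccurs="unbounded")
        element = ET.SubElement(container, f"{{{XS}}}element", attrs)
        ctype = ET.SubElement(element, f"{{{XS}}}complexType")
        sequence = ET.SubElement(ctype, f"{{{XS}}}sequence")
        for child in children[name]:
            emit(sequence, child, nested=True)
        for column in entities[name]:  # columns become string-typed attributes
            ET.SubElement(ctype, f"{{{XS}}}attribute",
                          {"name": column, "type": "xs:string"})

    for root in roots:
        emit(schema, root, nested=False)
    return ET.tostring(schema, encoding="unicode")


def translate(relations):
    """Chain the three simplified processes."""
    entities, relationships = reverse_engineer(relations)
    roots, children = to_xsd_graph(entities, relationships)
    return to_xsd(entities, roots, children)


# Hypothetical input: a Part row refers to its Product through product_no.
relations = {
    "Product": {"columns": ["product_no", "name"], "foreign_keys": {}},
    "Part": {"columns": ["part_no", "product_no"],
             "foreign_keys": {"product_no": "Product"}},
}
print(translate(relations))
```

The intermediate values play the roles of the EER model and the XSD Graph only in a loose sense; a faithful implementation would also need to recover participation, cardinality, generalization, and the other semantics listed in the abstract.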

Process 1: Reverse Engineering—Relational Schema to EER Model

Direct translation is a logical schema translation from relational into XML. In the

Case study for indirect schema translation

The case study, with its XSD Graph and XML schema definition, focuses on the products of a factory. The products can be categorized at different levels. Each product is composed of many small parts and has many features. They have their own values. In the relational database, a relational schema is created and sample data are loaded into the physical repository. We employ the XSD Graph to recapture data semantics from the relational database. The relational schema is mapped into an XSD Graph. According to

Relational schema→XSD

The General and Semantic Rules are applied in this prototype. The General Rules outline the scope of application and identify the information that must be completed during the transformation. The Semantic Rules recapture data semantics.

Relational database

The following information extracted from the relational schema is stored for transforming the basic EER entities into XML elements.

Supplier

Item_no | Name | Address
1 | ABC Company | Flat H, 3 Main Road, HK
2 | Whitehouse Company | Flat R, 34 Main Road East, HK
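As an illustration of how rows like the Supplier sample above could be carried over into XML elements, the following Python sketch builds one element per row and one child element per column. The `Database` wrapper element, the function name, and the column-to-child-element layout are assumptions made for this example; the excerpt does not show the element structure the prototype actually produces.

```python
import xml.etree.ElementTree as ET

# The two Supplier rows from the sample data, with values kept as strings.
supplier_rows = [
    {"Item_no": "1", "Name": "ABC Company",
     "Address": "Flat H, 3 Main Road, HK"},
    {"Item_no": "2", "Name": "Whitehouse Company",
     "Address": "Flat R, 34 Main Road East, HK"},
]


def rows_to_xml(table_name, rows, root_name="Database"):
    """Build one <table_name> element per row, with one child element per column."""
    root = ET.Element(root_name)  # hypothetical wrapper element
    for row in rows:
        record = ET.SubElement(root, table_name)
        for column, value in row.items():
            ET.SubElement(record, column).text = value
    return root


document = rows_to_xml("Supplier", supplier_rows)
print(ET.tostring(document, encoding="unicode"))
```

Column names are used directly as element names here, which only works when they are valid XML names, as `Item_no`, `Name`, and `Address` are.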

Catalogitem

Item_no

Conclusion

We validated the results by implementing the source relational schema in ‘foxpro’ and the target XSD in ‘xmlspy’; our methodology is feasible because both produce the same result. We conclude that seven data semantics can be converted from the relational model to the XML model: aggregation, generalization/isa, participation, cardinality, categorization, N-ary, and unary. Although there is no key concept in the XML model, we can use ‘complexType’ and ‘element and

References (27)

  • J. Fong

    Converting relational database into XML documents with DOM

    Information and Software Technology

    (2003)
  • C. Biancheri et al.

    EIHA?!?: Deploying Web and Wap Services Using XML Technology

    SIGMOD Record

    (2001)
  • G. Booch

    The Unified Modeling Language User Guide

    (1999)
  • W. Ng

    Maintaining Consistency of Integrated XML Trees

    (2002)
  • M.W. Vincent et al.

    Functional Dependencies for XML

    (2003)
  • K. Yue

    Constraint Preserving XML Updating

    (2003)
  • R. Wyke et al.

    XML Schema Essentials

    (2002)
  • Nicolle

    XML Integration and toolkit for B2B applications

    Journal of Database Management

    (2003)
  • E.V.D. Vlist, Using W3C XML Schema, http://www.xml.com/pub/a/2000/11/29/sche, October...
  • A. Zhou et al.

    VXMLR: A Visual XML-Relational Database System

    (2001)
  • M. Fernández et al.

    SilkRoute: trading between relations and XML

    Computer Networks

    (2000)
  • M. Yoshikawa et al.

    XRel: a path-based approach to storage and retrieval of XML documents using relational databases

    ACM Transactions on Internet Technology

    (2001)
  • J. Shanmugasundaram et al.

    Efficiently Publishing Relational Data as XML Documents

    (2000)