Elsevier

Computer Networks

Volume 31, Issues 11–16, 17 May 1999, Pages 1171-1187

XML-GL: a graphical language for querying and restructuring XML documents¹

https://doi.org/10.1016/S1389-1286(99)00014-6

Abstract

The growing acceptance of XML as a standard for semi-structured documents on the Web opens up challenging opportunities for Web query languages. In this paper we introduce XML-GL, a graphical query language for XML documents. The use of a visual formalism for representing both the content of XML documents (and of their DTDs) and the syntax and semantics of queries enables an intuitive expression of queries, even when they are rather complex. XML-GL is inspired by G-log, a general purpose, logic-based language for querying structured and semi-structured data. The paper presents the basic capabilities of XML-GL through a sequence of examples of increasing complexity.

Section snippets

Introduction and motivations

XML [19] is a recent recommendation of the World Wide Web Consortium for a meta-language to define mark-up for content publishing on the Web. The design goals of XML are driven by five years of experience with HTML as a content description language, which has exposed several inadequacies:

  • the HTML tag set is fixed, and its extension to cover new application requirements either breaks the standard or demands a long standardization process.

  • HTML mark-up intermixes structural and visual

The XML-GL data model

An XML document can comply with a Document Type Definition (DTD), which specifies the types of mark-up elements that can appear in the document, their attributes, and their containment relationships. If an XML document adheres to a DTD, it is said to be valid. If an XML document lacks a DTD but respects the syntactic rules for tag placement, it is said to be well formed.
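
The difference between a well-formed document and one that breaks the rules for tag placement can be illustrated with a minimal Python sketch. The helper name `is_well_formed` and the two sample documents are assumptions made for illustration; they do not come from the paper. The standard-library parser used here checks well-formedness only. Checking validity against a DTD would need a validating parser, which this sketch does not include.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as XML, i.e. it respects the syntactic rules for tag placement."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents (not taken from the paper).
GOOD = "<book><title>XML-GL</title><author>Ceri</author></book>"
BAD = "<book><title>XML-GL</book></title>"  # tags closed in the wrong order

if __name__ == "__main__":
    print(is_well_formed(GOOD))
    print(is_well_formed(BAD))
```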

For coherence with the visual nature of XML-GL, we introduce an explicit data model for XML documents, called XML-GDM (XML

Query language

XML-GL is a query language for XML-GDM data. An XML-GL query can be applied either to a single XML document or to a set of documents, e.g., those composing a Web site. Executing a query transforms the source XML document(s) into a new XML document, which is the query result. An XML-GL query consists of four parts:

  1. The extract part identifies the scope of the query, by indicating both the target documents and the target elements inside
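
The idea that a query selects elements from a source document and produces a new XML document can be approximated in text form. The following Python sketch is only an analogy built on the standard library, not XML-GL itself: the function name `run_query`, the element names, and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document (not from the paper).
SOURCE = """
<orders>
  <order id="1"><item>pen</item><price>2</price></order>
  <order id="2"><item>desk</item><price>150</price></order>
  <order id="3"><item>lamp</item><price>40</price></order>
</orders>
"""


def run_query(source_xml: str, min_price: int) -> ET.Element:
    """Select <order> elements priced at or above `min_price` and build a new result document."""
    root = ET.fromstring(source_xml)
    result = ET.Element("result")
    # Scope of the query: <order> elements that are children of the root.
    for order in root.findall("order"):
        # Selection condition on the content of the <price> child.
        if int(order.findtext("price")) >= min_price:
            # Construct new content from the extracted data.
            entry = ET.SubElement(result, "expensive", id=order.get("id"))
            entry.text = order.findtext("item")
    return result


if __name__ == "__main__":
    print(ET.tostring(run_query(SOURCE, 40), encoding="unicode"))
```

The source document is left untouched and the result is a separate tree, which mirrors the description of a query as a transformation into a new document.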

Related work

The huge amount of data published via the World Wide Web has led to a number of research efforts on techniques to index, query and restructure the contents of Web sites. In this section we provide a brief overview of related work on XML query languages and, more generally, on query languages for the Web (see also [7]).

A considerable amount of research has been done on how to complement keyword-based searching with database-style support for querying the Web. Several projects addressed this problem,

Conclusions

XML-GL is a sophisticated, but intuitive, visual language for querying XML data sources. It draws its unique features from an original combination of orthogonal, natural primitives for visualizing DTDs and documents, extracting their content, producing new content from extracted data, and formatting query results in complex ways. The use of a visual interface and language for querying XML-based Web documents seems very appealing.

Our research activity will concentrate on the following

References (19)

  • D. Brickley, R. Guha and A. Layman, W3C RDF Schemas (working draft), October 1998,...
  • S. Ceri, S. Comai, E. Damiani, P. Fraternali, S. Paraboschi and L. Tanca, XML-GL: a query language for XML documents,...
  • S. Comai, E. Damiani, R. Posenato and L. Tanca, A schema-based approach to modeling and querying WWW data, in: Proc....
  • A. Cortesi, A. Dovier, E. Quintarelli and L. Tanca, Operational and abstract semantics of a query language for...
  • S.J. DeRose, XQuery: a unified syntax for linking and querying general XML documents, in: Query Languages...
  • A. Deutsch, M. Fernandez, D. Florescu, A. Levy and D. Suciu, XML-QL: a query language for XML, in: Proc. QL'98 — The...
  • D. Florescu, A. Levy and A. Mendelzon, Database techniques for the World-Wide Web: a survey, ACM Sigmod Record 27 (3)...
  • H. Ishikawa, K. Kubota and Y. Kanemasa, XQL: a query language for XML data, in: Query Languages 98 (World-Wide Web...
  • D. Konopnicki and O. Shmueli, W3QL: a query system for the World Wide Web, in: Proc. 21st Int. Conf. on Very Large...
There are more references available in the full text version of this article.


Stefano Ceri is a professor at the Dipartimento di Elettronica e Informazione, Politecnico di Milano; he was a visiting professor at the Computer Science Department of Stanford University between 1983 and 1990. His research interests are focused on extending database technology to incorporate data distribution, deductive rules, active rules, and object-orientation; he is also currently interested in the integration between Web and database technologies. He is the author of several books, including The Art and Craft of Computing (Addison-Wesley, 1997), Advanced Database Systems (Morgan Kaufmann, 1997), and Active Database Systems (Morgan Kaufmann, 1995).

Sara Comai received her Laurea degree in Ingegneria Gestionale in 1996 from Politecnico di Milano (Italy). Since 1997 she has been a Ph.D. student in Ingegneria Informatica e Automatica at the same university. Her research interests are mainly in the areas of active databases and semistructured information representation and processing.

Ernesto Damiani holds a Laurea degree in Ingegneria Elettronica from Università di Pavia and a Ph.D. degree in Computer Science from Università di Milano. He is currently an assistant professor at the Crema campus of Università di Milano, and a Visiting Lecturer at the Computer Science Department of LaTrobe University in Melbourne, Australia. His research interests include distributed and object-oriented systems, semi-structured information processing and soft computing.

Piero Fraternali is an associate professor at the Dipartimento di Elettronica e Informazione of Politecnico di Milano. He received the Laurea Degree in Ingegneria Elettronica in 1989, and a Ph.D. in Ingegneria Informatica in 1994, both from Politecnico di Milano. His main research interest is currently in the area of the integration of Web and databases. His research also focuses on active databases, object orientation, and software engineering methodologies. He is the author, with Stefano Ceri, of the book Designing Database Applications with Objects and Rules: The IDEA Methodology (Addison-Wesley, 1997).

Stefano Paraboschi is an associate professor at the Dipartimento di Elettronica e Informazione of Politecnico di Milano. He received the Laurea Degree in Ingegneria Elettronica in 1990, and a Ph.D. in Ingegneria Informatica in 1994, both from Politecnico di Milano. His main research interests are in the area of databases, with a focus on active databases, data warehouses, and the construction of data-intensive Web sites. He is the author, together with Paolo Atzeni, Stefano Ceri, and Riccardo Torlone, of the book Database Systems: Concepts, Languages and Architectures (McGraw-Hill, 1999).

Letizia Tanca is a professor at the Dipartimento di Elettronica e Informazione, Politecnico di Milano; she was a professor at Università di Verona between 1995 and 1998. Her research interests concern advanced database languages and systems, and currently focus on query languages for the Web, graphical query languages, and active database systems. She is the author, with Stefano Ceri and Georg Gottlob, of the book Logic Programming and Databases (Springer-Verlag, 1990).

¹ The work presented in the paper has been supported by Esprit Project nr. 28771 `W3I3', and MURST project `Interdata'.

² E-mail: {ceri,comai,fraterna,parabosc,tanca}@elet.polimi.it

³ E-mail: [email protected]
