2004 | Book

The Semantic Web – ISWC 2004

Third International Semantic Web Conference, Hiroshima, Japan, November 7-11, 2004. Proceedings

Edited by: Sheila A. McIlraith, Dimitris Plexousakis, Frank van Harmelen

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science

About this book

The 3rd International Semantic Web Conference (ISWC 2004) was held November 7–11, 2004 in Hiroshima, Japan. If it is true what the proverb says: “Once by accident, twice by habit, three times by tradition,” then this third ISWC did indeed firmly establish a tradition. After the overwhelming interest in last year’s conference at Sanibel Island, Florida, this year’s conference showed that the Semantic Web is not just a one-day wonder, but has established itself firmly on the research agenda. At a time when special interest meetings with a Semantic Web theme are springing up at major conferences in numerous areas (ACL, VLDB, ECAI, AAAI, ECML, WWW, to name but a few), the ISWC series has established itself as the primary venue for Semantic Web research. Response to the call for papers for the conference continued to be strong. We solicited submissions to three tracks of the conference: the research track, the industrial track, and the poster track. The research track, the premier venue for basic research on the Semantic Web, received 205 submissions, of which 48 were accepted for publication. Each submission was evaluated by three program committee members whose reviews were coordinated by members of the senior program committee. Final decisions were made by the program co-chairs in consultation with the conference chair and the senior program committee. The industrial track, soliciting papers describing industrial research on the Semantic Web, received 22 submissions, of which 7 were accepted for publication.

Table of Contents

Invited Papers

How to Build Google2Google – An (Incomplete) Recipe –

This talk explores aspects relevant for peer-to-peer search infrastructures, which we think are better suited to semantic web search than centralized approaches. It does so in the form of an (incomplete) cookbook recipe, listing necessary ingredients for putting together a distributed search infrastructure. The reader has to be aware, though, that many of these ingredients are research questions rather than solutions, and that it needs quite a few more research papers on these aspects before we can really cook and serve the final infrastructure. We’ll include appropriate references as examples for the aspects discussed (with some bias to our own work at L3S), though a complete literature overview would go well beyond cookbook recipe length limits.

Small Can Be Beautiful in the Semantic Web

In 1984, Peter Patel-Schneider published a paper [1] entitled Small can be Beautiful in Knowledge Representation in which he advocated for limiting the expressive power of knowledge representation formalisms in order to guarantee good computational properties and thus make knowledge-based systems usable as part of larger systems. In this paper, I aim at showing that the same argument holds for the Semantic Web: if we want to give a chance for the Semantic Web to scale up and to be broadly used, we have to limit the expressive power of the ontologies serving as semantic marking up of resources. In addition, due to the scale of the Web and the disparity of its users, it is unavoidable to have to deal with distributed heterogeneous ontologies. In this paper, I will argue that a peer-to-peer infrastructure enriched with simple distributed ontologies (e.g., taxonomies) reconciled through mappings is appropriate and scalable for supporting the future Semantic Web.

Data Semantics

A Method for Converting Thesauri to RDF/OWL

This paper describes a method for converting existing thesauri and related resources from their native format to RDF(S) and OWL. The method identifies four steps in the conversion process. In each step, decisions have to be taken with respect to the syntax or semantics of the resulting representation. Each step is supported through a number of guidelines. The method is illustrated through conversions of two large thesauri: MeSH and WordNet.
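
To make the flavour of such a conversion concrete, here is a minimal Python sketch that turns a toy broader/narrower thesaurus into RDF(S) triples. The term list, the URI scheme, and the decision to map narrower-term links to rdfs:subClassOf are illustrative assumptions, not the paper's actual guidelines.

    # Minimal sketch: convert a broader/narrower thesaurus into RDF(S) triples.
    # The input structure and URI scheme are illustrative, not the paper's own.

    THESAURUS = {
        # term: list of narrower terms
        "Anatomy": ["Heart", "Lung"],
        "Heart": [],
        "Lung": [],
    }

    BASE = "http://example.org/thesaurus#"
    RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
    RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

    def to_uri(term):
        return BASE + term.replace(" ", "_")

    def convert(thesaurus):
        """Yield (subject, predicate, object) triples; one modelling decision
        is made explicit here: narrower-term links become rdfs:subClassOf."""
        for term, narrower in thesaurus.items():
            yield (to_uri(term), RDFS_LABEL, term)
            for nt in narrower:
                yield (to_uri(nt), RDFS_SUBCLASS, to_uri(term))

    for triple in convert(THESAURUS):
        print(triple)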

Contexts for the Semantic Web

A central theme of the Semantic Web is that programs should be able to easily aggregate data from different sources. Unfortunately, even if two sites provide their data using the same data model and vocabulary, subtle differences in their use of terms and in the assumptions they make pose challenges for aggregation. Experiences with the TAP project reveal some of the phenomena that pose obstacles to a simplistic model of aggregation. Similar experiences have been reported by AI projects such as Cyc, which has led to the development and use of various context mechanisms. In this paper we report on some of the problems with aggregating independently published data and propose a context mechanism to handle some of these problems. We briefly survey the context mechanisms developed in AI and contrast them with the requirements of a context mechanism for the Semantic Web. Finally, we present a context mechanism for the Semantic Web that is adequate to handle the aggregation tasks, yet simple from both computational and model theoretic perspectives.

Bipartite Graphs as Intermediate Model for RDF

RDF Graphs are sets of assertions in the form of subject-predicate-object triples of information resources. Although for simple examples they can be understood intuitively as directed labeled graphs, this representation does not scale well for more complex cases, particularly regarding the central notion of connectivity of resources. We argue in this paper that there is a need for an intermediate representation of RDF to enable the application of well-established methods from graph theory. We introduce the concept of RDF Bipartite Graph and show its advantages as intermediate model between the abstract triple syntax and data structures used by applications. In the light of this model we explore the issues of transformation costs, data/schema-structure, and the notion of RDF connectivity.
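
As a rough illustration of the idea, the sketch below (plain Python, invented toy data) builds one plausible bipartite encoding: each triple becomes a statement node linked to its subject, predicate and object values. The paper's exact construction may differ in detail.

    # Sketch: an RDF graph as a bipartite graph. Each triple becomes a
    # "statement" node on one side, linked to the subject, predicate and
    # object values on the other side.

    triples = [
        ("ex:alice", "foaf:knows", "ex:bob"),
        ("ex:bob",   "foaf:name",  '"Bob"'),
    ]

    statement_nodes = []   # one partition
    value_nodes = set()    # the other partition
    edges = []             # edges only cross between the two partitions

    for i, (s, p, o) in enumerate(triples):
        st = "stmt%d" % i
        statement_nodes.append(st)
        for role, v in (("subj", s), ("pred", p), ("obj", o)):
            value_nodes.add(v)
            edges.append((st, v, role))   # edge labelled with the role

    # Connectivity of resources can now be studied with standard graph
    # algorithms over (statement_nodes | value_nodes, edges).
    print(edges)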

A Model Theoretic Semantics for Ontology Versioning

We show that the Semantic Web needs a formal semantics for the various kinds of links between ontologies and other documents. We provide a model theoretic semantics that takes into account ontology extension and ontology versioning. Since the Web is the product of a diverse community, as opposed to a single agent, this semantics accommodates different viewpoints by having different entailment relations for different ontology perspectives. We discuss how this theory can be practically applied to RDF and OWL and provide a theorem that shows how to compute perspective-based entailment using existing logical reasoners. We illustrate these concepts using examples and conclude with a discussion of future work.

Extending the RDFS Entailment Lemma

We complement the RDF semantics specification of the W3C by proving decidability of RDFS entailment. Furthermore, we show completeness and decidability of entailment for RDFS extended with datatypes and a property-related subset of OWL. The RDF semantics specification provides a complete set of entailment rules for reasoning with RDFS, but does not prove decidability of RDFS entailment: the closure graphs used in the completeness proof are infinite for finite RDF graphs. We define partial closure graphs, which can be taken to be finite for finite RDF graphs, which can be computed in polynomial time, and which are sufficient to decide RDFS entailment. We consider the extension of RDFS with datatypes and a property-related fragment of OWL: FunctionalProperty, InverseFunctionalProperty, sameAs, SymmetricProperty, TransitiveProperty, and inverseOf. In order to obtain a complete set of simple entailment rules, the semantics that we use for these extensions is in line with the ‘if-semantics’ of RDFS, and weaker than the ‘iff-semantics’ defining D-entailment and OWL (DL or Full) entailment. Classes can be used as instances, the use of FunctionalProperty and TransitiveProperty is not restricted to obtain decidability, and a partial closure that is sufficient for deciding entailment can be computed in polynomial time.
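
A minimal sketch of the kind of rule-based reasoning involved: two of the W3C entailment rules (rdfs9 and rdfs11) applied to a toy graph until a fixed point is reached. The full RDFS rule set is larger, and the partial-closure machinery of the paper is not reproduced here.

    # Sketch: fixed-point application of two RDFS entailment rules
    # (rdfs11: subClassOf transitivity; rdfs9: type propagation).

    TYPE = "rdf:type"
    SUB = "rdfs:subClassOf"

    graph = {
        ("ex:tweety", TYPE, "ex:Penguin"),
        ("ex:Penguin", SUB, "ex:Bird"),
        ("ex:Bird", SUB, "ex:Animal"),
    }

    def closure(g):
        g = set(g)
        changed = True
        while changed:
            changed = False
            new = set()
            for (a, p1, b) in g:
                for (c, p2, d) in g:
                    if p1 == SUB and p2 == SUB and b == c:
                        new.add((a, SUB, d))      # rdfs11
                    if p1 == TYPE and p2 == SUB and b == c:
                        new.add((a, TYPE, d))     # rdfs9
            if not new <= g:
                g |= new
                changed = True
        return g

    for t in sorted(closure(graph)):
        print(t)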

Using Semantic Web Technologies for Representing E-science Provenance

Life science researchers increasingly rely on the web as a primary source of data, forcing them to apply the same rigor to its use as to an experiment in the laboratory. The myGrid project is developing the use of workflows to explicitly capture web-based procedures, and provenance to describe how and why results were produced. Experience within myGrid has shown that this provenance metadata is formed from a complex web of heterogeneous resources that impact on the production of a result. Therefore we have explored the use of Semantic Web technologies such as RDF and ontologies to support its representation, and used existing initiatives such as Jena and LSID to generate and store such material. The effective presentation of complex RDF graphs is challenging. Haystack has been used to provide multiple views of provenance metadata that can be further annotated. This work therefore forms a case study showing how existing Semantic Web tools can effectively support the emerging requirements of life science research.

P2P Systems

GridVine: Building Internet-Scale Semantic Overlay Networks

This paper addresses the problem of building scalable semantic overlay networks. Our approach follows the principle of data independence by separating a logical layer, the semantic overlay for managing and mapping data and metadata schemas, from a physical layer consisting of a structured peer-to-peer overlay network for efficient routing of messages. The physical layer is used to implement various functions at the logical layer, including attribute-based search, schema management and schema mapping management. The separation of a physical from a logical layer allows us to process logical operations in the semantic overlay using different physical execution strategies. In particular we identify iterative and recursive strategies for the traversal of semantic overlay networks as two important alternatives. At the logical layer we support semantic interoperability through schema inheritance and Semantic Gossiping. Thus our system provides a complete solution to the implementation of semantic overlay networks supporting both scalability and interoperability.

Bibster – A Semantics-Based Bibliographic Peer-to-Peer System

This paper describes the design and implementation of Bibster, a Peer-to-Peer system for exchanging bibliographic data among researchers. Bibster exploits ontologies in data storage, query formulation, query routing and answer presentation: When bibliographic entries are made available for use in Bibster, they are structured and classified according to two different ontologies. This ontological structure is then exploited to help users formulate their queries. Subsequently, the ontologies are used to improve query routing across the Peer-to-Peer network. Finally, the ontologies are used to post-process the returned answers in order to do duplicate detection. The paper describes each of these ontology-based aspects of Bibster. Bibster is a fully implemented open source solution built on top of the JXTA platform.

Top-k Query Evaluation for Schema-Based Peer-to-Peer Networks

Increasing the number of peers in a peer-to-peer network usually increases the number of answers to a given query as well. While having more answers is nice in principle, users are not interested in arbitrarily large and unordered answer sets, but rather in a small set of “best” answers. Inspired by the success of ranking algorithms in Web search engines and top-k query evaluation algorithms in databases, we propose a decentralized top-k query evaluation algorithm for peer-to-peer networks which makes use of local rankings, rank merging and optimized routing based on peer ranks, and minimizes both answer set size and network traffic among peers. As our algorithm is based on dynamically collected query statistics only, no continuous index update processes are necessary, allowing it to scale easily to large numbers of peers.
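
The following sketch shows only the rank-merging step of such an algorithm, under simple assumptions: each peer returns a locally ranked list, duplicate answers are merged by keeping the maximum score, and the global top-k is selected. The routing and query-statistics aspects of the proposed algorithm are not modelled.

    # Sketch: merging locally ranked answer lists from several peers and
    # keeping only the global top-k. Scores and merge policy are
    # illustrative assumptions.

    import heapq

    peer_results = {
        "peer1": [("doc_a", 0.9), ("doc_b", 0.7)],
        "peer2": [("doc_b", 0.8), ("doc_c", 0.6)],
        "peer3": [("doc_d", 0.5)],
    }

    def top_k(results_by_peer, k):
        best = {}
        for answers in results_by_peer.values():
            for doc, score in answers:
                best[doc] = max(score, best.get(doc, 0.0))  # duplicate merge
        return heapq.nlargest(k, best.items(), key=lambda kv: kv[1])

    print(top_k(peer_results, 2))   # -> [('doc_a', 0.9), ('doc_b', 0.8)]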

Semantic Web Mining

Learning Meta-descriptions of the FOAF Network

We argue that in a distributed context, such as the Semantic Web, ontology engineers and data creators often cannot control (or even imagine) the possible uses their data or ontologies might have. Therefore ontologies are unlikely to identify every useful or interesting classification possible in a problem domain; for example, these might be of a personalised nature and only appropriate for a certain user in a certain context, or they might be of a different granularity than the initial scope of the ontology. We argue that machine learning techniques will be essential within the Semantic Web context to allow these unspecified classifications to be identified. In this paper we explore the application of machine learning methods to FOAF, highlighting the challenges posed by the characteristics of such data. Specifically, we use clustering to identify classes of people and inductive logic programming (ILP) to learn descriptions of these groups. We argue that these descriptions constitute re-usable, first class knowledge that is neither explicitly stated nor deducible from the input data. These new descriptions can be represented as simple OWL class restrictions or more sophisticated descriptions using SWRL. These are then suitable either for incorporation into future versions of ontologies or for on-the-fly use for personalisation tasks.

From Tables to Frames

Turning the current Web into a Semantic Web requires automatic approaches for annotation of existing data since manual approaches will not scale in general. We here present an approach for automatic generation of F-Logic frames out of tables which subsequently supports the automatic population of ontologies from table-like structures. The approach consists of a methodology, an accompanying implementation and a thorough evaluation. It is based on a grounded cognitive table model which is stepwise instantiated by our methodology.
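
A much-simplified sketch of the population step, assuming the most basic table model (the first row holds attribute names, each further row describes one instance); frames are represented as plain dictionaries rather than F-Logic, and the table content is invented.

    # Sketch: populating instances from a table-like structure. The
    # paper's cognitive table model and F-Logic output are richer.

    table = [
        ["Country", "Capital", "Population"],
        ["Japan",   "Tokyo",   "127M"],
        ["Greece",  "Athens",  "11M"],
    ]

    def table_to_frames(rows, concept="Country"):
        header, *data = rows
        frames = []
        for row in data:
            frame = {"_concept": concept}   # instance of an assumed concept
            frame.update(zip(header, row))  # column headers become slots
            frames.append(frame)
        return frames

    for f in table_to_frames(table):
        print(f)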

Tools and Methodologies for Web Agents

The Specification of Agent Behavior by Ordinary People: A Case Study

The development of intelligent agents is a key part of the Semantic Web vision, but how does an ordinary person tell an agent what to do? One approach to this problem is to use RDF templates that are authored once but then instantiated many times by ordinary users. This approach, however, raises a number of challenges. For instance, how can templates concisely represent a broad range of potential uses, yet ensure that each possible instantiation will function properly? And how does the agent explain its actions to the humans involved? This paper addresses these challenges in the context of a case study carried out on our fully-deployed system for semantic email agents. We describe how high-level features of our template language enable the concise specification of flexible goals. In response to the first question, we show that it is possible to verify, in polynomial time, that a given template will always produce a valid instantiation. Second, we show how to automatically generate explanations for the agent’s actions, and identify cases where explanations can be computed in polynomial time. These results both improve the usefulness of semantic email and suggest general issues and techniques that may be applicable in other Semantic Web systems.

User Interfaces and Visualization

Visual Modeling of OWL DL Ontologies Using UML

This paper introduces a visual, UML-based notation for OWL ontologies. We provide a standard MOF2 compliant metamodel which captures the language primitives offered by OWL DL. Similarly, we devise a UML profile which allows OWL ontologies to be modeled visually in a notation that is close to the UML notation. This makes it possible to develop ontologies using UML tools. Throughout the paper, the significant differences to some earlier proposals for a visual, UML-based notation for ontologies are discussed.

What Would It Mean to Blog on the Semantic Web?

The phenomenon known as Web logging (“blogging”) has helped realize an initial goal of the Web: to turn Web content consumers (i.e., end users) into Web content producers. As the Semantic Web unfolds, we feel there are two questions worth posing: (1) do blog entries have semantic structure that can be usefully captured and exploited? (2) is blogging a natural way to encourage growth of the Semantic Web? We explore empirical evidence for answering these questions in the affirmative and propose means to bring blogging into the mainstream of the Semantic Web, including ontologies that extend the RSS 1.0 specification and an XSL transform for handling RSS 0.9x/2.0 files. To demonstrate the validity of our approach we have constructed a semantic blogging environment based on Haystack. We argue that with tools such as Haystack, semantic blogging will be an important paradigm by which metadata authoring will occur in the future.

The Protégé OWL Plugin: An Open Development Environment for Semantic Web Applications

We introduce the OWL Plugin, a Semantic Web extension of the Protégé ontology development platform. The OWL Plugin can be used to edit ontologies in the Web Ontology Language (OWL), to access description logic reasoners, and to acquire instances for semantic markup. In many of these features, the OWL Plugin has created and facilitated new practices for building Semantic Web contents, often driven by the needs of and feedback from our users. Furthermore, Protégé’s flexible open-source platform means that it is easy to integrate custom-tailored components to build real-world applications. This document describes the architecture of the OWL Plugin, walks through its most important features, and discusses some of our design decisions.

OntoTrack: Combining Browsing and Editing with Reasoning and Explaining for OWL Lite Ontologies

OntoTrack is a new browsing and editing “in-one-view” ontology authoring tool that combines a hierarchical graphical layout and instant reasoning feedback for (the most rational fraction of) OWL Lite. OntoTrack provides an animated and zoomable view with context sensitive features like click-able miniature branches or selective detail views together with drag-and-drop editing. Each editing step is instantly synchronized with an external reasoner in order to provide appropriate graphical feedback about relevant modeling consequences. The most recent feature of OntoTrack is an on demand textual explanation for subsumption and equivalence between or unsatisfiability of classes. This paper describes the key features of the current implementation and discusses future work as well as some development issues.

Tracking Changes During Ontology Evolution

As ontology development becomes a collaborative process, developers face the problem of maintaining versions of ontologies akin to maintaining versions of software code or versions of documents in large projects. Traditional versioning systems enable users to compare versions, examine changes, and accept or reject changes. However, while versioning systems usually treat software code and text documents as text files, a versioning system for ontologies must compare and present structural changes rather than changes in text representation of ontologies. In this paper, we present the PromptDiff ontology-versioning environment, which addresses these challenges. PromptDiff includes an efficient version-comparison algorithm that produces a structural diff between ontologies. The results are presented through an intuitive user interface for analyzing the changes, which enables users to view concepts and groups of concepts that were added, deleted, and moved, distinguished by their appearance and with direct access to additional information characterizing the change. The users can then act on the changes, accepting or rejecting them. We present results of a pilot user study that demonstrate the effectiveness of the tool for change management. We discuss design principles for an end-to-end ontology-versioning environment and position ontology versioning as a component in a general ontology-management framework.
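
The heart of any such tool is a structural comparison of versions. The sketch below reduces this to class names and their parents, reporting additions, deletions and moves; PromptDiff's heuristics (e.g. rename detection) go well beyond this set comparison, and the example hierarchies are invented.

    # Sketch: a structural diff between two ontology versions, reduced
    # to class name -> parent maps.

    v1 = {"Vehicle": None, "Car": "Vehicle", "Bike": "Vehicle"}
    v2 = {"Vehicle": None, "Car": "Vehicle", "Truck": "Vehicle",
          "Bike": "Car"}   # Bike moved under Car

    def diff(old, new):
        added   = sorted(set(new) - set(old))
        deleted = sorted(set(old) - set(new))
        moved   = sorted(c for c in set(old) & set(new) if old[c] != new[c])
        return {"added": added, "deleted": deleted, "moved": moved}

    print(diff(v1, v2))
    # {'added': ['Truck'], 'deleted': [], 'moved': ['Bike']}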

Large Scale Knowledge Management

An Evaluation of Knowledge Base Systems for Large OWL Datasets

In this paper, we present an evaluation of four knowledge base systems (KBS) with respect to use in large OWL applications. To our knowledge, no experiment has been done with the scale of data used here. The smallest dataset used consists of 15 OWL files totaling 8MB, while the largest dataset consists of 999 files totaling 583MB. We evaluated two memory-based systems (OWLJessKB and memory-based Sesame) and two systems with persistent storage (database-based Sesame and DLDB-OWL). We describe how we have performed the evaluation and what factors we have considered in it. We show the results of the experiment and discuss the performance of each system. In particular, we have concluded that existing systems need to place a greater emphasis on scalability.

Structure-Based Partitioning of Large Concept Hierarchies

The increasing awareness of the benefits of ontologies for information processing has led to the creation of a number of large ontologies about real-world domains. The size of these ontologies and their monolithic character cause serious problems in handling them. In other areas, e.g. software engineering, these problems are tackled by partitioning monolithic entities into sets of meaningful and mostly self-contained modules. In this paper, we suggest a similar approach for ontologies. We propose a method for automatically partitioning large ontologies into smaller modules based on the structure of the class hierarchy. We show that the structure-based method performs surprisingly well on real-world ontologies. We support this claim by experiments carried out on real-world ontologies including SUMO and the NCI cancer ontology. The results of these experiments are available online at http://swserver.cs.vu.nl/partitioning/.
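
A toy illustration of the kind of output such a method produces, using the crude rule "one module per direct child of the root"; the paper's structural criteria are considerably richer, and the hierarchy below is invented.

    # Sketch: partition a class hierarchy into modules by the top-level
    # branch each class belongs to.

    hierarchy = {            # child -> parent
        "Process": "Thing", "Object": "Thing",
        "ChemicalProcess": "Process", "PhysicalObject": "Object",
        "Rock": "PhysicalObject",
    }

    def partition(child_to_parent, root="Thing"):
        def top_branch(c):
            while child_to_parent.get(c) not in (None, root):
                c = child_to_parent[c]
            return c
        modules = {}
        for c in child_to_parent:
            modules.setdefault(top_branch(c), []).append(c)
        return modules

    print(partition(hierarchy))
    # {'Process': ['Process', 'ChemicalProcess'],
    #  'Object': ['Object', 'PhysicalObject', 'Rock']}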

Semantic Web Services

Semantic Web Service Interaction Protocols: An Ontological Approach

A central requirement for achieving the vision of run-time discovery and dynamic composition of services is the provision of appropriate descriptions of the operation of a service, that is, how the service interacts with agents or other services. In this paper, we use experience gained through the development of real-life Grid applications to produce a set of requirements for such descriptions and then attempt to match those requirements against the offerings of existing work, such as OWL-S [1] and IRS-II [2]. Based on this analysis we identify which requirements are not addressed by current research and, in response, produce a model for describing the interaction protocol of a service. The main contributions of this model are the ability to describe the interactions of multiple parties with respect to a single service, to distinguish between interactions initiated by the service itself and interactions initiated by clients or other cooperating services, and to capture within the description service state changes relevant to interacting parties that result either from internal service events or from interactions. The aim of the model is not to replace existing work, since it focuses only on the description of the interaction protocol of a service, but to inform the further development of such work.

ASSAM: A Tool for Semi-automatically Annotating Semantic Web Services

The semantic Web Services vision requires that each service be annotated with semantic metadata. Manually creating such metadata is tedious and error-prone, and many software engineers, accustomed to tools that automatically generate WSDL, might not want to invest the additional effort. We therefore propose ASSAM, a tool that assists a user in creating semantic metadata for Web Services. ASSAM is intended for service consumers who want to integrate a number of services and therefore must annotate them according to some shared ontology. ASSAM is also relevant for service producers who have deployed a Web Service and want to make it compatible with an existing ontology. ASSAM’s capabilities to automatically create semantic metadata are supported by two machine learning algorithms. First, we have developed an iterative relational classification algorithm for semantically classifying Web Services, their operations, and input and output messages. Second, to aggregate the data returned by multiple semantically related Web Services, we have developed a schema mapping algorithm that is based on an ensemble of string distance metrics.

Information Gathering During Planning for Web Service Composition

Hierarchical Task-Network (HTN) based planning techniques have been applied to the problem of composing Web Services, especially when described using the OWL-S service ontologies. Many of the existing Web Services are either exclusively information providing or crucially depend on information-providing services. Thus, many interesting service compositions involve collecting information either during execution or during the composition process itself. In this paper, we focus on the latter issue. In particular, we present ENQUIRER, an HTN-planning algorithm designed for planning domains in which the information about the initial state of the world may not be complete, but it is discoverable through plan-time information-gathering queries. We have shown that ENQUIRER is sound and complete, and derived several mathematical relationships among the amount of available information, the likelihood of the planner finding a plan, and the quality of the plan found. We have performed experimental tests that confirmed our theoretical results and that demonstrated how ENQUIRER can be used in Web Service composition.

Applying Semantic Web Services to Bioinformatics: Experiences Gained, Lessons Learnt

We have seen an increasing amount of interest in the application of Semantic Web technologies to Web services. The aim is to support automated discovery and composition of the services allowing seamless and transparent interoperability. In this paper we discuss three projects that are applying such technologies to bioinformatics: myGrid, MOBY-Services and Semantic-MOBY. Through an examination of the differences and similarities between the solutions produced, we highlight some of the practical difficulties in developing Semantic Web services and suggest that the experiences with these projects have implications for the development of Semantic Web services as a whole.

Automating Scientific Experiments on the Semantic Grid

We present a framework to facilitate automated synthesis of scientific experiment workflows in Semantic Grids based on high-level goal specification. Our framework has two main features which distinguish it from other work in this area. First, we propose a dynamic and adaptive mechanism for automating the construction of experiment workflows. Second, we distinguish between different levels of abstraction of loosely coupled experiment workflows to facilitate reuse and sharing of experiments. We illustrate our framework using a real world application scenario in the physics domain involving the detection of gravitational waves from astrophysical sources.

Automated Composition of Semantic Web Services into Executable Processes

Different planning techniques have been applied to the problem of automated composition of web services. However, in realistic cases, this planning problem is far from trivial: the planner needs to deal with the nondeterministic behavior of web services, the partial observability of their internal status, and with complex goals expressing temporal conditions and preference requirements. We propose a planning technique for the automated composition of web services described in OWL-S process models, which can deal effectively with nondeterminism, partial observability, and complex goals. The technique allows for the synthesis of plans that encode compositions of web services with the usual programming constructs, like conditionals and iterations. The generated plans can thus be translated into executable processes, e.g., BPEL4WS programs. We have implemented our solution in a planner and carried out preliminary experimental evaluations that show the potential of our approach, and the performance gain of automating the composition at the semantic level compared with automated composition at the level of executable processes.

A Conceptual Architecture for Semantic Web Services

In this paper, we present an abstract conceptual architecture for semantic web services. We define requirements on the architecture by analyzing a set of case studies developed as part of the EU Semantic Web-enabled Web Services project. The architecture is developed as a refinement and extension of the W3C Web Services Architecture. We assess our architecture against the requirements, and provide an analysis of OWL-S.

From Software APIs to Web Service Ontologies: A Semi-automatic Extraction Method

Successful employment of semantic web services depends on the availability of high quality ontologies to describe the domains of these services. As always, building such ontologies is difficult and costly, thus hampering web service deployment. Our hypothesis is that since the functionality offered by a web service is reflected by the underlying software, domain ontologies could be built by analyzing the documentation of that software. We verify this hypothesis in the domain of RDF ontology storage tools. We implemented and fine-tuned a semi-automatic method to extract domain ontologies from software documentation. The quality of the extracted ontologies was verified against a high quality hand-built ontology of the same domain. Despite the low linguistic quality of the corpus, our method extracts a considerable amount of information for a domain ontology.

Applying KAoS Services to Ensure Policy Compliance for Semantic Web Services Workflow Composition and Enactment

In this paper we describe our experience in applying KAoS services to ensure policy compliance for Semantic Web Services workflow composition and enactment. We are developing these capabilities within the context of two applications: Coalition Search and Rescue (CoSAR-TS) and Semantic Firewall (SFW). We describe how this work has uncovered requirements for increasing the expressivity of policy beyond what can be done with description logic (e.g., role-value-maps), and how we are extending our representation and reasoning mechanisms in a carefully controlled manner to that end. Since KAoS employs OWL for policy representation, it fits naturally with the use of OWL-S workflow descriptions generated by the AIAI I-X planning system in the CoSAR-TS application. The advanced reasoning mechanisms of KAoS are based on the JTP inference engine and enable the analysis of classes and instances of processes from a policy perspective. As the result of analysis, KAoS concludes whether a particular workflow step is allowed by policy and whether the performance of this step would incur additional policy-generated obligations. Issues in the representation of processes within OWL-S are described. Besides what is done during workflow composition, aspects of policy compliance can be checked at runtime when a workflow is enacted. We illustrate these capabilities through two application examples. Finally, we outline plans for future work.

Inference

Knowledge-Intensive Induction of Terminologies from Metadata

We focus on the induction and revision of terminologies from metadata. Following a Machine Learning approach, this setting can be cast as a search problem to be solved employing operators that traverse the search space expressed in a structural representation, aiming at correct concept definitions. The progressive refinement of such definitions in a terminology is driven by the available extensional knowledge (metadata). A knowledge-intensive inductive approach to this task is presented that can deal with the expressive Semantic Web representations based on Description Logics, which are endowed with well-founded reasoning capabilities. The core inferential mechanism, based on multilevel counterfactuals, can be used for either inducing new concept descriptions or refining existing (incorrect) ones. The soundness of the approach and its applicability are also proved and discussed.

Inferring Data Transformation Rules to Integrate Semantic Web Services

OWL-S allows selecting, composing and invoking Web Services at different levels of abstraction: selection uses high level abstract descriptions, invocation uses low level grounding ones, while composition needs to consider both high and low level descriptions. In our setting, two Web Services are to be composed so that output from the upstream one is used to create input for the downstream one. These Web Services may have different data models but are related to each other through high and low level descriptions. Correspondences must be found between components of the upstream data type and the downstream ones. Low level data transformation functions may be required (e.g. unit conversions, data type conversions). The components may be arranged in different XML tree structures. Thus, multiple data transformations are necessary: reshaping the message tree, matching leaves by corresponding types, translating through ontologies, and calling conversion functions. Our prototype compiles these transformations into a set of data transformation rules, using our tableau-based $\mathcal{ALC}$ Description Logic reasoner to reason over the given OWL-S and WSDL descriptions, as well as the related ontologies. A resolution-based inference mechanism for running these rules is embedded in an inference queue that conducts data from the upstream to the downstream service, running the rules to perform the data transformation in the process.
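
As a rough illustration of the final stage only, the sketch below applies a chain of low-level transformation rules (datatype and unit conversion) to an upstream message. The rules and data shapes are invented; in the paper such rules are derived from OWL-S/WSDL descriptions by a DL reasoner rather than hard-coded.

    # Sketch: chaining low-level data transformation rules between an
    # upstream output and a downstream input.

    def miles_to_km(v):          # unit conversion
        return v * 1.609344

    def str_to_float(v):         # datatype conversion
        return float(v)

    # A compiled "plan": apply rules in order to reshape upstream output.
    rules = [("distance", str_to_float), ("distance", miles_to_km)]

    upstream_output = {"distance": "12.5"}

    def transform(message, rules):
        out = dict(message)
        for field, fn in rules:
            out[field] = fn(out[field])
        return out

    print(transform(upstream_output, rules))
    # -> {'distance': 20.1168} (up to float rounding)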

Using Vampire to Reason with OWL

OWL DL corresponds to a Description Logic (DL) that is a fragment of classical first-order predicate logic (FOL). Therefore, the standard methods of automated reasoning for full FOL can potentially be used instead of dedicated DL reasoners to solve OWL DL reasoning tasks. In this paper we report on some experiments designed to explore the feasibility of using existing general-purpose FOL provers to reason with OWL DL. We also extend our approach to SWRL, a proposed rule language extension to OWL.
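
For a feel of what such a translation looks like, here is a sketch of the standard mapping of two DL axiom patterns to first-order formulas. The output is generic FOL text for readability; a prover such as Vampire actually consumes the TPTP format, and the example axioms are invented.

    # Sketch: the standard translation of DL axioms into first-order
    # formulas of the kind a FOL prover consumes.

    def subclass_to_fol(c, d):
        # C subClassOf D  becomes  forall x. C(x) -> D(x)
        return "forall x. %s(x) -> %s(x)" % (c, d)

    def domain_to_fol(p, c):
        # domain(P) = C  becomes  forall x, y. P(x, y) -> C(x)
        return "forall x, y. %s(x, y) -> %s(x)" % (p, c)

    print(subclass_to_fol("Student", "Person"))
    print(domain_to_fol("enrolledIn", "Student"))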

Searching and Querying

Generating On the Fly Queries for the Semantic Web: The ICS-FORTH Graphical RQL Interface (GRQL)

Building user-friendly GUIs for browsing and filtering RDF/S description bases while exploiting in a transparent way the expressiveness of declarative query/view languages is vital for various Semantic Web applications (e.g., e-learning, e-science). In this paper we present a novel interface, called GRQL, which relies on the full power of the RDF/S data model for constructing on the fly queries expressed in RQL. More precisely, a user can navigate graphically through the individual RDF/S class and property definitions and generate transparently the RQL path expressions required to access the resources of interest. These expressions accurately capture the meaning of the user’s navigation steps through the class (or property) subsumption and/or associations. Additionally, users can enrich the generated queries with filtering conditions on the attributes of the currently visited class while they can easily specify the resource’s class(es) appearing in the query result. To the best of our knowledge, GRQL is the first application-independent GUI able to generate a unique RQL query which captures the cumulative effect of an entire user navigation session.

A Comparison of RDF Query Languages

The purpose of this paper is to provide a rigorous comparison of six query languages for RDF. We outline and categorize features that any RDF query language should provide and compare the individual languages along these features. We describe several practical usage examples for RDF queries and conclude with a comparison of the expressiveness of the particular query languages. The use cases, sample data and queries for the respective languages are available on the web [6].

Information Retrieval Support for Ontology Construction and Use

Information retrieval can contribute towards the construction of ontologies and the effective usage of ontologies. We use collocation-based keyword extraction to suggest new concepts, and study the generation of hyperlinks to automate the population of ontologies with instances. We evaluate our methods within the setting of a digital library project, using information retrieval evaluation methodology. Within the same setting we study retrieval methods that complement the navigational support offered by the semantic relations in most ontologies to help users explore the ontology.

Rules-By-Example – A Novel Approach to Semantic Indexing and Querying of Images

Images represent a key source of information in many domains and the ability to exploit them through their discovery, analysis and integration by services and agents on the Semantic Web is a challenging and significant problem. To date the semantic indexing of images has concentrated on applying machine-learning techniques to a set of manually-annotated images in order to automatically label images with keywords. In this paper we propose a new hybrid, user-assisted approach, Rules-By-Example (RBE), which is based on a combination of RuleML and Query-By-Example. Our RBE user interface enables domain-experts to graphically define domain-specific rules that can infer high-level semantic descriptions of images from combinations of low-level visual features (e.g., color, texture, shape, size of regions) which have been specified through examples. Using these rules, the system is able to analyze the visual features of any given image from this domain and generate semantically meaningful labels, using terms defined in the domain-specific ontology. We believe that this approach, in combination with traditional solutions, will enable faster, more flexible, cost-effective and accurate semantic indexing of images and hence maximize their potential for discovery, re-use, integration and processing by Semantic Web services, tools and agents.

Query Answering for OWL-DL with Rules

Both OWL-DL and function-free Horn rules are decidable logics with interesting, yet orthogonal expressive power: from the rules perspective, OWL-DL is restricted to tree-like rules, but provides both existentially and universally quantified variables and full, monotonic negation. From the description logic perspective, rules are restricted to universal quantification, but allow for the interaction of variables in arbitrary ways. Clearly, a combination of OWL-DL and rules is desirable for building Semantic Web ontologies, and several such combinations have already been discussed. However, such a combination might easily lead to the undecidability of interesting reasoning problems. Here, we present a decidable such combination which is, to the best of our knowledge, more general than similar decidable combinations proposed so far. Decidability is obtained by restricting rules to so-called DL-safe ones, requiring each variable in a rule to occur in a non-DL-atom in the rule body. We show that query answering in such a combined logic is decidable, and we discuss its expressive power by means of a non-trivial example. Finally, we present an algorithm for query answering in $\mathcal{SHIQ}(\mathbf{D})$ extended with DL-safe rules based on the reduction to disjunctive datalog.
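
The DL-safety condition itself is easy to operationalize. The sketch below checks it for one rule body under an invented atom representation: every variable must occur in at least one non-DL atom.

    # Sketch: checking the DL-safety condition from the abstract. An
    # atom is (predicate, args, is_dl_atom); representation is invented.

    rule_body = [
        ("Person", ("x",), True),          # DL atom
        ("hasParent", ("x", "y"), True),   # DL atom
        ("O", ("x",), False),              # non-DL atom
        ("O", ("y",), False),              # non-DL atom
    ]

    def is_dl_safe(body):
        variables = {a for (_, args, _) in body for a in args
                     if a.islower()}        # crude variable test
        covered = {a for (_, args, dl) in body if not dl for a in args}
        return variables <= covered

    print(is_dl_safe(rule_body))   # True: both x and y occur in O-atoms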

Semantic Web Middleware

A Semantic Web Resource Protocol: XPointer and HTTP

Semantic Web resources — that is, knowledge representation formalisms existing in a distributed hypermedia system — require different addressing and processing models and capacities than the typical kinds of World Wide Web resources. We describe an approach to building a Semantic Web resource protocol — a scalable, extensible logical addressing scheme and transport protocol — by using and extending existing specifications and technologies. We introduce XPointer and some infrequently used, but useful features of HTTP/1.1, in order to support addressing and server side processing of resource and subresource operations. We consider applications of the XPointer Framework for use in the Semantic Web, particularly for RDF and OWL resources and subresources. We describe two initial implementations: filtering of RSS resources by date and item range; RDF subresource selection using RDQL. Finally, we describe possible application to the problem of OWL imports.

On the Emergent Semantic Web and Overlooked Issues

The emergent Semantic Web, despite being in its infancy, has already received a lot of attention from academia and industry. This has resulted in an abundance of prototype systems and discussion, most of which centres on the underlying infrastructure. However, when we critically review the work done to date, we realise that there is little discussion with respect to the vision of the Semantic Web. In particular, there is an observed dearth of discussion on how to deliver knowledge sharing in an environment such as the Semantic Web in an effective and efficient manner. There are many overlooked issues, ranging from agents and trust to hidden assumptions made with respect to knowledge representation and robust reasoning in a distributed environment. These issues could potentially hinder further development if not considered at the early stages of designing Semantic Web systems. In this perspectives paper, we aim to help engineers and practitioners of the Semantic Web by raising awareness of these issues.

Metadata-Driven Personal Knowledge Publishing

We propose a personal knowledge publishing system called Semblog, realized by integrating Semantic Web techniques with Weblog tools. The Semblog suite provides an integrated environment for gathering, authoring, and publishing information and for building human relationships seamlessly, enabling people to exchange information and knowledge in an easy and casual fashion. We use a lightweight metadata format such as RSS to activate the information flow and its associated activities. We define three levels of interest for information gathering and publishing, namely “check”, “clip” and “post”, and provide suitable ways to distribute information depending on the interest level. The Semblog platform consists of two types of applications: an extended content aggregator and information retrieval/recommendation applications. We also design a new metadata module to define a personal ontology that realizes semantic relations among people and Weblog sites.

Integration and Interoperability

An Extensible Directory Enabling Efficient Semantic Web Service Integration

In an open environment populated by large numbers of heterogeneous information services, integration is a major challenge. In such a setting, the efficient coupling between directory-based service discovery and service composition engines is crucial. In this paper we present a directory service that offers specific functionality in order to enable efficient service integration. The directory implementation relies on a compact numerical encoding of service parameters and on a multidimensional index structure. It supports isolated service integration sessions providing a consistent view of the directory data. During a session a client may issue multiple queries to the directory and retrieve the results incrementally. In order to optimize the interaction of the directory with different service composition algorithms, the directory supports custom ranking functions that are dynamically installed with the aid of mobile code. The ranking functions are written in Java, but the directory service imposes severe restrictions on the programming model in order to protect itself against malicious or erroneous code (e.g., denial-of-service attacks). With the aid of user-defined ranking functions, application-specific ordering heuristics can be deployed directly. Experiments on randomly generated problems show that they reduce the number of query results that have to be transmitted to the client by up to a factor of five.

Working with Multiple Ontologies on the Semantic Web

The standardization of the second generation Web Ontology Language, OWL, leaves a crucial issue for Web-based ontologies unsatisfactorily resolved: how to represent and reason with multiple distinct, but linked, ontologies. OWL provides the owl:imports construct which, roughly, allows Web ontologies to include other Web ontologies, but only by merging all the linked ontologies into a single logical “space.” Recent work on multidimensional logics, fusions and other combinations of modal logics, distributed and contextual logics, and the like have tried to find formalisms wherein knowledge bases (and their logic) are kept more distinct but yet affect each other. These formalisms have various degrees of robustness in their computational complexity, their modularity, their expressivity, and their intuitiveness to modelers. In this paper, we explore a family of such formalisms, grounded in $\mathcal{E}$-connections as extensions to OWL, with emphasis on a novel sub-formalism that seems very straightforward to implement on existing tableau OWL reasoners, as witnessed by our implementation of this formalism in the OWL reasoner Pellet. We discuss how to integrate those formalisms into OWL, as well as some of the issues that modelers have to face when using such formalisms in the context of a large number of heterogeneous, independently developed, richly interconnected ontologies that we expect to be the norm on the Semantic Web.

Opening Up Magpie via Semantic Services

Magpie is a suite of tools supporting a ‘zero-cost’ approach to semantic web browsing: it avoids the need for manual annotation by automatically associating an ontology-based semantic layer to web resources. An important aspect of Magpie, which differentiates it from superficially similar hypermedia systems, is that the association between items on a web page and semantic concepts is not merely a mechanism for dynamic linking, but it is the enabling condition for locating services and making them available to a user. These services can be manually activated by a user (pull services), or opportunistically triggered when the appropriate web entities are encountered during a browsing session (push services). In this paper we analyze Magpie from the perspective of building semantic web applications and we note that earlier implementations did not fulfill the criterion of “open as to services”, which is a key aspect of the emerging semantic web. For this reason, in the past twelve months we have carried out a radical redesign of Magpie, resulting in a novel architecture, which is open both with respect to ontologies and semantic web services. This new architecture goes beyond the idea of merely providing support for semantic web browsing and can be seen as a software framework for designing and implementing semantic web applications.

Ontologies

Towards a Symptom Ontology for Semantic Web Applications

As the use of Semantic Web ontologies continues to expand there is a growing need for tools that can validate ontological consistency and provide guidance in the correction of detected defects and errors. A number of tools already exist as evidenced by the ten systems participating in the W3C’s evaluation of the OWL Test Cases. For the most part, these first generation tools focus on experimental approaches to consistency checking, while minimal attention is paid to how the results will be used or how the systems might interoperate. For this reason very few of these systems produce results in a machine-readable format (for example as OWL annotations) and there is no shared notion across the tools of how to identify and describe what it is that makes a specific ontology or annotation inconsistent. In this paper we propose the development of a Symptom Ontology for the Semantic Web that would serve as a common language for identifying and describing semantic errors and warnings that may be indicative of inconsistencies in ontologies and annotations; we refer to such errors and warnings as symptoms. We offer the symptom ontology currently used by the ConsVISor consistency-checking tool, as the starting point for a discussion on the desirable characteristics of such an ontology. Included among these characteristics are 1) a hierarchy of common symptoms, 2) clear associations between specific symptoms and the axioms of the languages they violate and 3) a means for relating individual symptoms back to the specific constructs in the input file(s) through which they were implicated. We conclude with a number of suggestions for future directions of this work including its extension to syntactic symptoms.

Patching Syntax in OWL Ontologies

An analysis of OWL ontologies represented in RDF/XML on the Web shows that a majority are OWL Full. In many cases this may not be through a desire to use the expressivity provided by OWL Full, but is rather due to syntactic errors or accidental misuse of the vocabulary. We present a “rogues gallery” of common errors encountered, and describe how robust parsers that attempt to cope with such errors can be produced.

QOM – Quick Ontology Mapping

(Semi-)automatic mapping – also called (semi-)automatic alignment – of ontologies is a core task to achieve interoperability when two agents or services use different ontologies. In the existing literature, the focus has so far been on improving the quality of mapping results. We here consider QOM, Quick Ontology Mapping, as a way to trade off between effectiveness (i.e. quality) and efficiency of the mapping generation algorithms. We show that QOM has lower run-time complexity than existing prominent approaches. Then, we show in experiments that this theoretical investigation translates into practical benefits. While QOM gives up some of the possibilities for producing high-quality results in favor of efficiency, our experiments show that this loss of quality is marginal.
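
To illustrate just one cheap ingredient of such mapping algorithms, the sketch below proposes candidate mappings from label string similarity, with difflib as a stand-in similarity measure; QOM combines many heuristics and prunes candidate pairs for efficiency, and the label sets here are invented.

    # Sketch: candidate ontology mappings from label similarity only.

    from difflib import SequenceMatcher

    onto1 = ["Author", "Publication", "Conference"]
    onto2 = ["Writer", "Publications", "Meeting"]

    def best_mappings(labels1, labels2, threshold=0.6):
        pairs = []
        for a in labels1:
            scored = [(SequenceMatcher(None, a.lower(), b.lower()).ratio(), b)
                      for b in labels2]
            score, b = max(scored)
            if score >= threshold:
                pairs.append((a, b, round(score, 2)))
        return pairs

    print(best_mappings(onto1, onto2))
    # e.g. [('Publication', 'Publications', 0.96)]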

An API for Ontology Alignment

Ontologies are seen as the solution to data heterogeneity on the web. However, the available ontologies are themselves a source of heterogeneity. This can be overcome by aligning ontologies, or finding the correspondence between their components. These alignments deserve to be treated as objects: they can be referenced on the web as such, be completed by an algorithm that improves a particular alignment, be compared with other alignments and be transformed into a set of axioms or a translation program. We present here a format for expressing alignments in RDF, so that they can be published on the web. Then we propose an implementation of this format as an Alignment API, which can be seen as an extension of the OWL API and shares some design goals with it. We show how this API can be used for effectively aligning ontologies and completing partial alignments, thresholding alignments or generating axioms and transformations.

Specifying Ontology Views by Traversal

One of the original motivations behind ontology research was the belief that ontologies can help with reuse in knowledge representation. However, many of the ontologies that are developed with reuse in mind, such as standard reference ontologies and controlled terminologies, are extremely large, while the users often need to reuse only a small part of these resources in their work. Specifying various views of an ontology enables users to limit the set of concepts that they see. In this paper, we develop the concept of a Traversal View, a view where a user specifies the central concept or concepts of interest, the relationships to traverse to find other concepts to include in the view, and the depth of the traversal. For example, given a large ontology of anatomy, a user may use a Traversal View to extract the concept Heart together with the organs and organ parts that surround or are contained in the heart. We define the notion of Traversal Views formally, discuss their properties, present a strategy for maintaining the view through ontology evolution and describe our tool for defining and extracting Traversal Views.
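
A minimal sketch of the extraction step: starting from a central concept, follow the selected relationships breadth-first up to the given depth. The graph encoding and relation names are illustrative, not the paper's.

    # Sketch: extracting a Traversal View by depth-limited BFS.

    from collections import deque

    # edges[(concept, relation)] -> list of related concepts
    edges = {
        ("Heart", "contains"): ["LeftVentricle", "RightVentricle"],
        ("Heart", "surroundedBy"): ["Pericardium"],
        ("LeftVentricle", "contains"): ["MitralValve"],
    }

    def traversal_view(start, relations, depth):
        view, frontier = {start}, deque([(start, 0)])
        while frontier:
            concept, d = frontier.popleft()
            if d == depth:
                continue
            for rel in relations:
                for nxt in edges.get((concept, rel), []):
                    if nxt not in view:
                        view.add(nxt)
                        frontier.append((nxt, d + 1))
        return view

    print(traversal_view("Heart", ["contains", "surroundedBy"], 2))
    # {'Heart', 'LeftVentricle', 'RightVentricle', 'Pericardium', 'MitralValve'}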

Automatic Generation of Ontology for Scholarly Semantic Web

The Semantic Web provides a knowledge-based environment that enables information to be shared and retrieved effectively. In this research, we propose the Scholarly Semantic Web for the sharing, reuse and management of scholarly information. To support the Scholarly Semantic Web, we need to construct ontologies from data, which is a tedious and difficult task. To generate ontologies automatically, Formal Concept Analysis (FCA) is an effective technique that can formally abstract data as conceptual structures. To enable FCA to deal with uncertainty in data and interpret the concept hierarchy reasonably, we propose to incorporate fuzzy logic into FCA for the automatic generation of ontologies. The proposed new framework is known as Fuzzy Formal Concept Analysis (FFCA). In this paper, we discuss the Scholarly Semantic Web and the ontology generation process based on the FFCA framework. In addition, the performance of the FFCA framework for ontology generation is evaluated and presented.
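
The following sketch shows the fuzzy derivation operator at the core of such a framework, under invented context values: the intent of a set of objects assigns each attribute its minimum membership degree over those objects, and a threshold prunes weak attributes.

    # Sketch: fuzzy intent derivation in the spirit of FFCA.

    # fuzzy context: membership degree of (document, topic)
    context = {
        ("d1", "SemanticWeb"): 0.9, ("d1", "Ontology"): 0.8,
        ("d2", "SemanticWeb"): 0.7, ("d2", "Ontology"): 0.3,
    }
    topics = ["SemanticWeb", "Ontology"]

    def fuzzy_intent(objects, threshold=0.5):
        intent = {}
        for t in topics:
            degree = min(context.get((o, t), 0.0) for o in objects)
            if degree >= threshold:
                intent[t] = degree
        return intent

    print(fuzzy_intent(["d1", "d2"]))   # {'SemanticWeb': 0.7}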

Industrial Track

Querying Real World Services Through the Semantic Web

We propose a framework for querying real world services using metadata of Web pages and ontologies, and implement a prototype system using standardized technologies developed by the Semantic Web community. The metadata and the ontologies enable systems to infer related classes while modifying queries and retrieving approximate results, and help them smooth over obstacles in interaction while querying. The standardized technologies help us to describe the metadata and the ontologies formally in RDF (Resource Description Framework) and OWL (Web Ontology Language), and enable us to share them on the network. In this paper, we illustrate details of the framework and the prototype system, demonstrate how it works using practical data in Kyoto, Japan, and discuss requirements for actualizing this framework on the network.

Public Deployment of Semantic Service Matchmaker with UDDI Business Registry

This paper reports on a half-year public deployment of a semantic service matchmaker with a UDDI registry. UDDI is a standard registry for Web Services, but considered as a search engine its functionality is limited to keyword search. The Matchmaker was therefore developed to enhance UDDI with service capability matching. We have deployed the Matchmaker at one of the four official UDDI registries, operated by NTT-Communications, since September 2003. In this paper, we first introduce the Matchmaker with UDDI and illustrate client tools which lower the barrier to using semantics. Then, we evaluate this deployment by benchmarks in terms of performance and functionality. Finally, we discuss user requirements obtained in two ways: a questionnaire and observation of search behaviour.

SemanticOrganizer: A Customizable Semantic Repository for Distributed NASA Project Teams

SemanticOrganizer is a collaborative knowledge management system designed to support distributed NASA projects, including multidisciplinary teams of scientists, engineers, and accident investigators. The system provides a customizable, semantically structured information repository that stores work products relevant to multiple projects of differing types. SemanticOrganizer is one of the earliest and largest semantic web applications deployed at NASA to date, and has been used in varying contexts ranging from the investigation of Space Shuttle Columbia’s accident to the search for life on other planets. Although the underlying repository employs a single unified ontology, access control and ontology customization mechanisms make the repository contents appear different for each project team. This paper describes SemanticOrganizer, its customization facilities, and a sampling of its applications. The paper also summarizes some key lessons learned from building and fielding a successful semantic web application across a wide-ranging set of domains with disparate users.

SWS for Financial Overdrawn Alerting

In this paper, we present a Notification Agent designed and implemented using Semantic Web Services. The Notification Agent manages alerts when critical financial situations arise by discovering and selecting notification services. This agent applies open research results in Semantic Web Services technologies, including on-the-fly composition based on a finite state machine and automatic discovery of semantic services. Financial domain ontologies, based on the IFX financial standard, have been constructed and extended for building agent systems using the OWL and OWL-S standards (as well as other approaches such as DL or F-Logic). This agent is to be offered through integrated online aggregation systems in commercial financial organizations.

OntoViews – A Tool for Creating Semantic Web Portals

This paper presents OntoViews, a semantic web portal tool for publishing RDF content on the web. OntoViews provides the portal designer with a content-based search engine server, Ontogator, and a link recommendation system server, Ontodella. The user interface is created by combining these servers with the Apache Cocoon framework. From the end-user’s viewpoint, the key idea of OntoViews is to combine the multi-facet search paradigm, developed within the information retrieval research community, with semantic web RDFS ontologies, and to extend the search service with a semantic browsing facility based on ontological reasoning. OntoViews is presented from the viewpoints of the end-user, architecture, and implementation. The implementation described is modular, easily modified and extended, and provides a good practical basis for creating semantic portals on the web. As a proof of concept, the application of OntoViews to a deployed semantic web portal is discussed.

Applying Semantic Web Technology to the Life Cycle Support of Complex Engineering Assets

Complex engineering assets, such as ships and aircraft, are designed to be in-service for many years. Over its life, the support of such an asset costs an organization many times more than the original cost of the asset itself. An industry/government initiative has resulted in an International Standard information model aimed at satisfying three significant business requirements for owners of these assets: 1) reducing the cost of total ownership of such assets, 2) protecting the investment in product data through life, and 3) increasing the use of the asset to deliver enhanced business performance. This standard, called Product Life Cycle Support (PLCS), defines a domain-specific, but flexible, information model designed to be tailored by using organizations through the use of Reference Data. This paper describes the approach used to take advantage of the Web Ontology Language (OWL) in the definition of Reference Data and how it is being applied in pilot projects. The use of Semantic Web technology for Reference Data is a first step towards the application of that technology in the Life Cycle Support domain. The relationship between the information model (and its modelling language, EXPRESS) and OWL is also explored.

ORIENT: Integrate Ontology Engineering into Industry Tooling Environment

ORIENT is a project to develop an ontology engineering tool that integrates into existing industry tooling environments – the Eclipse platform and the WebSphere Studio development tools family. This paper describes how two important issues are addressed during the project, namely tool integration and scalability. We show how ORIENT morphs into the Eclipse platform and achieves UI and data level integration with the Eclipse platform and other modelling tools. We also describe how we implemented a scalable RDF(S) storage, query, manipulation and inference mechanism on top of a relational database. In particular, we report the empirical performance of our RDF(S) closure inference algorithm on a DB2 database.

Metadata
Title
The Semantic Web – ISWC 2004
Edited by
Sheila A. McIlraith
Dimitris Plexousakis
Frank van Harmelen
Copyright Year
2004
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-30475-3
Print ISBN
978-3-540-23798-3
DOI
https://doi.org/10.1007/b102467