
2004 | Book

The Semantic Web: Research and Applications

First European Semantic Web Symposium, ESWS 2004 Heraklion, Crete, Greece, May 10-12, 2004. Proceedings

Edited by: Christoph J. Bussler, John Davies, Dieter Fensel, Rudi Studer

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the First European Semantic Web Symposium, ESWS 2004, held in Heraklion, Crete, Greece in May 2004.

The 33 revised full papers presented were carefully reviewed and selected from 79 submissions. The papers are organized in topical sections on ontology engineering, ontology matching and mapping, ontology-based querying, ontology merging and population, infrastructure, semantic web services, service discovery and composition, data for the semantic web, knowledge representation, applications, content management, and information management and integration.

Table of contents

Frontmatter

Ontology Engineering

Towards On-the-Fly Ontology Construction – Focusing on Ontology Quality Improvement

In order to realize on-the-fly ontology construction for the Semantic Web, this paper proposes DODDLE-R, a support environment for user-centered ontology development. It consists of two main parts: a pre-processing part and a quality improvement part. The pre-processing part generates a prototype ontology semi-automatically, and the quality improvement part supports its interactive refinement. Since we believe that careful construction of ontologies from the preliminary phase is more efficient than attempting to generate ontologies fully automatically (which may cause too many modifications by hand), the quality improvement part plays a significant role in DODDLE-R. Through interactive support for improving the quality of the prototype ontology, an OWL-Lite level ontology, which consists of taxonomic relationships (class – subclass relationships) and non-taxonomic relationships (defined as properties), is constructed efficiently.

Naoki Sugiura, Yoshihiro Shigeta, Naoki Fukuta, Noriaki Izumi, Takahira Yamaguchi
OntoEdit Empowering SWAP: a Case Study in Supporting DIstributed, Loosely-Controlled and evolvInG Engineering of oNTologies (DILIGENT)

Knowledge management solutions relying on central repositories sometimes have not met expectations, since users often create knowledge ad hoc using their individual vocabulary and their own decentralized IT infrastructure (e.g., their laptops). To improve knowledge management for such decentralized and individualized knowledge work, it is necessary, first, to provide a corresponding IT infrastructure and, second, to deal with the harmonization of different vocabularies/ontologies. In this paper, we briefly sketch the technical peer-to-peer platform that we have built, but then focus on the harmonization of the participating ontologies. The objective of this harmonization is to avoid the worst incongruencies by having users share a core ontology that they can expand for local use according to their will and individual needs. The task that then needs to be solved is one of distributed, loosely-controlled and evolving engineering of ontologies. We have performed a case study along these lines. To support the ontology engineering process in the case study, we have furthermore extended the existing ontology engineering environment, OntoEdit. The case study process and the extended tool are presented in this paper.

Sofia Pinto, Steffen Staab, York Sure, Christoph Tempich
A Protégé Plug-In for Ontology Extraction from Text Based on Linguistic Analysis

In this paper we describe a plug-in (OntoLT) for the widely used Protégé ontology development tool that supports the interactive extraction and/or extension of ontologies from text. The OntoLT approach provides an environment for the integration of linguistic analysis in ontology engineering through the definition of mapping rules that map linguistic entities in annotated text collections to concept and attribute candidates (i.e. Protégé classes and slots). The paper explains this approach in more detail and discusses some initial experiments on deriving a shallow ontology for the neurology domain from a corresponding collection of neurological scientific abstracts.

Paul Buitelaar, Daniel Olejnik, Michael Sintek

Ontology Matching and Mapping

Formal Support for Representing and Automating Semantic Interoperability

We discuss approaches to semantic heterogeneity and propose a formalisation of semantic interoperability based on the Barwise-Seligman theory of information flow. We argue for a theoretical framework that favours the analysis and implementation of semantic interoperability scenarios relative to particular understandings of semantics. We present an example case of such a scenario where our framework has been applied as well as variations of it in the domain of ontology mapping.

Yannis Kalfoglou, Marco Schorlemmer
S-Match: an Algorithm and an Implementation of Semantic Matching

We think of Match as an operator which takes two graph-like structures (e.g., conceptual hierarchies or ontologies) and produces a mapping between those nodes of the two graphs that correspond semantically to each other. Semantic matching is a novel approach where semantic correspondences are discovered by computing, and returning as a result, the semantic information implicitly or explicitly codified in the labels of nodes and arcs. In this paper we present an algorithm implementing semantic matching, and we discuss its implementation within the S-Match system. We also test S-Match against three state-of-the-art matching systems. The results, though preliminary, look promising, in particular with respect to precision and recall.

Fausto Giunchiglia, Pavel Shvaiko, Mikalai Yatskevich
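To make the idea above concrete, here is a hypothetical sketch of label-based semantic matching in the spirit of S-Match: node labels from two small hierarchies are compared via a toy synonym table standing in for WordNet senses. All names and data are illustrative assumptions, not the actual S-Match implementation.

```python
# Toy synonym table (in S-Match, senses come from a lexical resource).
SYNONYMS = {
    "image": {"image", "picture", "photo"},
    "picture": {"image", "picture", "photo"},
    "auto": {"auto", "car"},
    "car": {"auto", "car"},
}

def senses(label):
    """Return the set of senses for a label (the label itself if unknown)."""
    return SYNONYMS.get(label.lower(), {label.lower()})

def match(tree_a, tree_b):
    """Return semantic correspondences between node labels of two trees.

    '=' means the sense sets coincide (equivalence); '~' means they
    merely overlap (partial relatedness).
    """
    mapping = []
    for a in tree_a:
        for b in tree_b:
            sa, sb = senses(a), senses(b)
            if sa == sb:
                mapping.append((a, "=", b))
            elif sa & sb:
                mapping.append((a, "~", b))
    return mapping

pairs = match(["Image", "Car", "Europe"], ["Picture", "Auto", "Asia"])
print(pairs)
```

The real system additionally exploits the graph structure and description-logic reasoning; this sketch shows only the label-comparison step.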
Ontology Mapping – An Integrated Approach

Ontology mapping is important when working with more than one ontology. Typically, similarity considerations are the basis for this. In this paper, an approach to integrate various similarity methods is presented. In brief, we determine similarity through rules which have been encoded by ontology experts. These rules are then combined into one overall result. Several small boosting measures are added. All this is thoroughly evaluated with very promising results.

Marc Ehrig, York Sure
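As an illustration of the rule-combination idea in the abstract above, the sketch below combines two toy similarity rules into one weighted score. The specific rules, weights, and data are assumptions for illustration, not the rules from the paper.

```python
def label_similarity(a, b):
    """Rule 1: normalized-label equality (1.0 if labels match, else 0.0)."""
    return 1.0 if a["label"].lower() == b["label"].lower() else 0.0

def property_overlap(a, b):
    """Rule 2: Jaccard overlap of property names."""
    pa, pb = set(a["properties"]), set(b["properties"])
    return len(pa & pb) / len(pa | pb) if pa | pb else 0.0

def combined_similarity(a, b, weights=(0.6, 0.4)):
    """Weighted combination of the individual rules into one overall result."""
    return weights[0] * label_similarity(a, b) + weights[1] * property_overlap(a, b)

e1 = {"label": "Car", "properties": ["speed", "color"]}
e2 = {"label": "car", "properties": ["speed", "owner"]}
score = combined_similarity(e1, e2)
print(round(score, 2))  # → 0.73
```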

Ontology-Based Querying

Application of Ontology Techniques to View-Based Semantic Search and Browsing

We show how the benefits of the view-based search method, developed within the information retrieval community, can be extended with ontology-based search, developed within the Semantic Web community, and with semantic recommendations. As a proof of the concept, we have implemented an ontology- and view-based search engine and recommendation system Ontogator for RDF(S) repositories. Ontogator is innovative in two ways. Firstly, the RDFS-based ontologies used for annotating metadata are used in the user interface to facilitate view-based information retrieval. The views provide the user with an overview of the repository contents and a vocabulary for expressing search queries. Secondly, a semantic browsing function is provided by a recommender system. This system enriches instance level metadata by ontologies and provides the user with links to semantically related relevant resources. The semantic linkage is specified in terms of logical rules. To illustrate and discuss the ideas, a deployed application of Ontogator to a photo repository of the Helsinki University Museum is presented.

Eero Hyvönen, Samppa Saarela, Kim Viljanen
Active Ontologies for Data Source Queries

In this paper we describe the work that was done in the Corporate Ontology Grid (COG) project on the querying of existing legacy data sources from the automotive industry using ontology technology and a conceptual ontology query language. We describe the conceptual ontology query language developed by Unicorn, the querying support provided by the Unicorn Workbench, and describe the use of these queries in the run-time architecture built in the COG project.

Jos de Bruijn, Holger Lausen

Ontology Merging and Population

Knowledge Discovery in an Agents Environment

We describe work undertaken to investigate automated querying of simple forms of ontology by software agents to acquire the semantics of metadata terms. Individual terms as well as whole vocabularies can be investigated by agents through a software interface and by humans through an interactive web-based interface. The server supports discovery, sharing and re-use of vocabularies and specific terms, facilitating machine interpretation of semantics and convergence of ontologies in specific domains. Exposure, and hence alignment through ontological engineering should lead to an improvement in interoperability of systems in particular sectors such as education, cultural heritage and publishing.

Manjula Patel, Monica Duke
The HCONE Approach to Ontology Merging

Existing efforts on ontology mapping, alignment and merging vary from methodological and theoretical frameworks to methods and tools that support the semi-automatic coordination of ontologies. However, only the latest research efforts “touch” on the mapping/merging of ontologies using the whole breadth of available knowledge. This paper aims to thoroughly describe the HCONE approach to ontology merging. The approach described is based on (a) capturing the intended informal interpretations of concepts by mapping them to WordNet senses using lexical semantic indexing, and (b) exploiting the formal semantics of concepts’ definitions by means of description logics’ reasoning services.

Konstantinos Kotis, George A. Vouros
Question Answering Towards Automatic Augmentations of Ontology Instances

Ontology instances are typically stored as triples which associate two named entities with a pre-defined relational description. Sometimes such triples can be incomplete in that one entity is known but the other entity is missing. The automatic discovery of the missing values is closely related to relation extraction systems that extract binary relations between two identified entities. Relation extraction systems rely on the availability of accurately named entities, in that mislabelled entities can decrease the number of relations correctly identified. Although recent results demonstrate over 80% accuracy for recognising named entities, performance decreases rapidly when input texts have less consistent patterns. This paper presents OntotripleQA, an application of question-answering techniques to relation extraction that reduces the reliance on named entities and takes into account other assessments when evaluating potential relations. Not only does this increase the number of relations extracted, but it also improves the accuracy of extracting relations by considering features which are not extractable with only comparisons of the named entities. A small dataset was collected to test the proposed approach, and the experiment demonstrates that it is effective on sentences from Web documents with an accuracy of 68% on average.

Sanghee Kim, Paul Lewis, Kirk Martinez, Simon Goodall
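The core idea above can be sketched in a few lines: an ontology instance is a triple of two entities and a relation, a triple may have one slot missing, and the missing slot can be phrased as a question for a QA component to answer. This is an illustrative toy, not the OntotripleQA system; the templates and relation names are assumptions.

```python
# Hypothetical question templates per relation (assumed for illustration).
QUESTION_TEMPLATES = {
    "painted_by": "Who painted {subject}?",
    "located_in": "Where is {subject} located?",
}

def missing_slot(triple):
    """Return which slot of (subject, relation, obj) is missing, if any."""
    subject, relation, obj = triple
    if obj is None:
        return "object"
    if subject is None:
        return "subject"
    return None

def to_question(triple):
    """Render an incomplete triple as a question for a QA component."""
    subject, relation, obj = triple
    if missing_slot(triple) == "object" and relation in QUESTION_TEMPLATES:
        return QUESTION_TEMPLATES[relation].format(subject=subject)
    return None

q = to_question(("Mona Lisa", "painted_by", None))
print(q)  # → Who painted Mona Lisa?
```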

Infrastructure

The SCAM Framework: Helping Semantic Web Applications to Store and Access Metadata

In this paper we discuss the design of the SCAM framework, which aims to simplify the storage and access of metadata for a variety of different applications that can be built on top of it. A basic design principle of SCAM is the aggregation of metadata into two kinds of sets of different granularity (SCAM records and SCAM contexts). These sets correspond to the typical access needs of an application with regard to metadata, and they constitute the foundation upon which access control is provided.

Matthias Palmér, Ambjörn Naeve, Fredrik Paulsson
Publish/Subscribe for RDF-based P2P Networks

Publish/subscribe systems are an alternative to query-based systems in cases where the same information is asked for over and over, and where clients want to get updated answers for the same query over a period of time. Recent publish/subscribe systems such as P2P-DIET have introduced this paradigm in the P2P context. In this paper we build on the experience gained with P2P-DIET and the Edutella P2P infrastructure and present the first implementation of a P2P publish/subscribe system supporting metadata and a query language based on RDF. We formally define the basic concepts of our system and present detailed protocols for its operation. Our work utilizes the latest ideas in query processing for RDF data, P2P indexing and routing research.

Paul-Alexandru Chirita, Stratos Idreos, Manolis Koubarakis, Wolfgang Nejdl
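The publish/subscribe idea for RDF described above can be sketched minimally: clients register triple patterns (with wildcards), and every published triple is forwarded to the subscribers whose pattern it matches. This is an illustrative toy in a single process, not the distributed P2P-DIET/Edutella protocol.

```python
class PubSub:
    def __init__(self):
        self.subscriptions = []  # list of (pattern, callback) pairs

    def subscribe(self, pattern, callback):
        """Register an (s, p, o) pattern; None acts as a wildcard."""
        self.subscriptions.append((pattern, callback))

    def publish(self, triple):
        """Deliver a published triple to all subscribers whose pattern matches."""
        for pattern, callback in self.subscriptions:
            if all(p is None or p == t for p, t in zip(pattern, triple)):
                callback(triple)

bus = PubSub()
received = []
# Subscribe to every triple whose predicate is dc:title.
bus.subscribe((None, "dc:title", None), received.append)
bus.publish(("doc1", "dc:title", "Semantic Web"))
bus.publish(("doc1", "dc:creator", "Alice"))
print(received)
```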
Streaming OWL DL

A triple based approach to syntactically recognizing OWL DL files is described in detail. An incremental runtime refinement algorithm and two compile time algorithms are given. Many aspects of OWL DL syntax are addressed. These techniques are combined into a streaming OWL recogniser. This shows a threefold time and space improvement over abstract syntax tree based approaches.

Jeremy J. Carroll

Semantic Web Services

Mathematics on the (Semantic) NET

Although web service technology is becoming more prevalent, the mechanisms for advertising and discovering web services are still at a rudimentary stage. WSDL provides information about service name and parameters for the purpose of invocation. UDDI provides a set of WSDL documents matching keywords in a query. The aim of the Mathematics On the NET (MONET) project is to deliver a proof-of-concept demonstration of a framework for mathematical web services which uses semantic web technologies to broker between user requirements and deployed services. This requires mechanisms for describing mathematical objects and properties so that a piece of software can evaluate the applicability of a particular service to a given problem. Thus we describe our Mathematical Service Description Language (MSDL), with its ontological grounding in OpenMath, and outline its role in service brokerage and service composition within MONET. We believe similar issues arise in many other (scientific) domains, and the leverage obtained here, through the formal background of mathematics, suggests a road-map for the development of similar domain-specific service description languages.

Olga Caprotti, James H. Davenport, Mike Dewar, Julian Padget
Approaches to Semantic Web Services: an Overview and Comparisons

The next Web generation promises to deliver Semantic Web Services (SWS): services that are self-described and amenable to automated discovery, composition and invocation. A prerequisite to this, however, is the emergence and evolution of the Semantic Web, which provides the infrastructure for the semantic interoperability of Web Services. Web Services will be augmented with rich formal descriptions of their capabilities, such that they can be utilized by applications or other services without human assistance or highly constrained agreements on interfaces or protocols. Thus, Semantic Web Services have the potential to change the way knowledge and business services are consumed and provided on the Web. In this paper, we survey the state of the art of current enabling technologies for Semantic Web Services. In addition, we characterize the infrastructure of Semantic Web Services along three orthogonal dimensions: activities, architecture and service ontology. Further, we examine and contrast three current approaches to SWS according to the proposed dimensions.

Liliana Cabral, John Domingue, Enrico Motta, Terry Payne, Farshad Hakimpour
OWL-S Semantics of Security Web Services: a Case Study

The power of Web services (WS) technology lies in the fact that it takes integration to a new level. With the increasing amount of services available on the Web, solutions are needed that address security concerns of distributed Web service applications such as end-to-end service requirements for authentication, authorization, data integrity and confidentiality, and non-repudiation in the context of dynamic WS applications. Semantic Web technology and Semantic Web services (SWSs) promise to provide solutions to the challenges of dynamically composed service-based applications. We investigate the use of semantic annotations for security WS that can be used by matchmakers or composition tools to achieve security goals. In the long-term we aim at establishing a security framework for SWS applications that include security services, authentication and authorization protocols, and techniques to exchange and negotiate policies. In this paper, we report on the first step toward this larger vision: specification, design, and deployment of semantically well-defined security services.

Grit Denker, Son Nguyen, Andrew Ton

Service Discovery and Composition

Directory Services for Incremental Service Integration

In an open environment populated by heterogeneous information services, integration will be a major challenge. Even if the problem is similar to planning in some aspects, the number and the difference in specificity of services make existing techniques unsuitable and require a different approach. Our solution is to incrementally solve integration problems by using an interplay between service discovery and integration, along with a technique for composing specific partially matching services into more generic constructs. In this paper we present a directory system and a number of mechanisms designed to support incremental integration algorithms with partial matches for large numbers of service descriptions. We also report experiments on randomly generated composition problems which show that using partial matches can decrease the failure rate of an integration algorithm using only complete matches by up to 7 times, with no increase in the number of directory accesses required.

Ion Constantinescu, Walter Binder, Boi Faltings
A Framework for Automated Service Composition in Service-Oriented Architectures

Automated service composition refers to automating the entire process of composing a workflow. This involves automating the discovery and selection of the service, ensuring semantic and data type compatibility. We present a framework to facilitate automated service composition in Service-Oriented Architectures using Semantic Web technologies. The main objective of the framework is to support the discovery, selection, and composition of semantically-described heterogeneous services. Our framework has three main features which distinguish it from other work in this area. First, we propose a dynamic, adaptive, and highly fault-tolerant service discovery and composition algorithm. Second, we distinguish between different levels of granularity of loosely coupled workflows. Finally, our framework allows the user to specify and refine a high-level objective. In this paper, we describe the main components of our framework and describe a scenario in the genealogy domain.

Shalil Majithia, David W. Walker, W. A. Gray
Reusing Petri Nets Through the Semantic Web

The paper presents a Petri net ontology that should enable sharing Petri nets on the Semantic Web. Previous work on formal methods for representing Petri nets has mainly defined tool-specific Petri net descriptions (i.e. metamodels) or formats for Petri net model interchange (i.e. syntax). However, such efforts do not provide a model description suitable for using Petri nets on the Semantic Web. This paper uses the Petri net UML model as a starting point for implementing the Petri net ontology. The UML model is then refined using the Protégé ontology development tool and the Ontology UML profile. The resulting Petri net models are represented on the Semantic Web using the XML-based ontology representation languages Resource Description Framework (RDF) and Web Ontology Language (OWL). We implemented a Petri net software tool as well as tools for the Petri net Semantic Web infrastructure.

Dragan Gašević, Vladan Devedžić

Data for the Semantic Web

Methods for Porting Resources to the Semantic Web

Ontologies will play a central role in the development of the Semantic Web. It is unrealistic to assume that such ontologies will be developed from scratch. Rather, we assume that existing resources such as thesauri and lexical databases will be reused in the development of ontologies for the Semantic Web. In this paper we describe a method for converting existing source material to a representation that is compatible with Semantic Web languages such as RDF(S) and OWL. The method is illustrated with three case studies: converting WordNet, AAT and MeSH to RDF(S) and OWL.

Bob Wielinga, Jan Wielemaker, Guus Schreiber, Mark van Assem
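The kind of conversion described above can be sketched in miniature: a thesaurus entry (term, broader term, synonyms) is mapped onto RDFS-style triples (class, subclass link, labels). The namespace, input record, and mapping choices below are illustrative assumptions, not the actual WordNet/AAT/MeSH conversions from the paper.

```python
EX = "http://example.org/thesaurus#"  # hypothetical namespace

def term_to_triples(term, broader=None, synonyms=()):
    """Map one thesaurus term to a list of (subject, predicate, object) triples."""
    uri = EX + term.replace(" ", "_")
    triples = [
        (uri, "rdf:type", "rdfs:Class"),   # the term becomes a class
        (uri, "rdfs:label", term),          # its preferred label
    ]
    if broader:
        # broader-term link becomes an rdfs:subClassOf relation
        triples.append((uri, "rdfs:subClassOf", EX + broader.replace(" ", "_")))
    for s in synonyms:
        # synonyms become additional labels
        triples.append((uri, "rdfs:label", s))
    return triples

triples = term_to_triples("watercolor", broader="painting",
                          synonyms=["aquarelle"])
for t in triples:
    print(t)
```

The paper discusses subtler choices (e.g. whether thesaurus terms should become classes or instances); this sketch fixes one such choice for brevity.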
Learning to Harvest Information for the Semantic Web

In this paper we describe a methodology for harvesting information from large distributed repositories (e.g. large Web sites) with minimum user intervention. The methodology is based on a combination of information extraction, information integration and machine learning techniques. Learning is seeded by extracting information from structured sources (e.g. databases and digital libraries) or a user-defined lexicon. Retrieved information is then used to partially annotate documents. Annotated documents are used to bootstrap learning for simple Information Extraction (IE) methodologies, which in turn produce more annotations for more documents, which are then used to train more complex IE engines, and so on. In this paper we describe the methodology and its implementation in the Armadillo system, compare it with the current state of the art, and describe the details of an implemented application. Finally we draw some conclusions and highlight some challenges and future work.

Fabio Ciravegna, Sam Chapman, Alexiei Dingli, Yorick Wilks
Reverse Engineering of Relational Databases to Ontologies

A majority of the work on reverse engineering has been done on extracting entity-relationship and object models from relational databases. Only a few approaches consider ontologies as the target for reverse engineering. Moreover, the existing approaches can extract only a small subset of the semantics embedded within a relational database, or they require much user interaction for semantic annotation. In our opinion, the source of these problems is that the primary focus has been on analyzing key correlations; data and attribute correlations are rarely considered and thus have received little or no analysis. As an attempt to resolve these problems, we propose a novel approach based on an analysis of key, data and attribute correlations, as well as their combination. Our approach can be applied to migrating data-intensive Web pages, which are usually based on relational databases, to the ontology-based Semantic Web.

Irina Astrova
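The basic key-analysis step that the abstract contrasts with its richer approach can be sketched as follows: each table becomes a class, each non-key column a datatype property, and each foreign key an object property between the corresponding classes. This is an illustrative sketch of that baseline, not Astrova's algorithm, and the schema is made up.

```python
def schema_to_ontology(tables):
    """Derive (classes, properties) from a schema description.

    tables: {table_name: {"columns": [...], "fks": {column: target_table}}}
    Properties are (name, domain_class, range) triples.
    """
    classes, properties = [], []
    for name, meta in tables.items():
        classes.append(name.capitalize())  # table -> class
        for col in meta["columns"]:
            if col in meta["fks"]:
                # foreign key -> object property linking two classes
                properties.append((col, name.capitalize(),
                                   meta["fks"][col].capitalize()))
            elif col != "id":
                # plain column -> datatype property
                properties.append((col, name.capitalize(), "Literal"))
    return classes, properties

schema = {
    "painting": {"columns": ["id", "title", "artist_id"],
                 "fks": {"artist_id": "artist"}},
    "artist": {"columns": ["id", "name"], "fks": {}},
}
classes, props = schema_to_ontology(schema)
print(classes)
print(props)
```

The paper's point is precisely that key analysis alone misses much of the embedded semantics; data and attribute correlations are needed on top of a baseline like this.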

Knowledge Representation

No Registration Needed: How to Use Declarative Policies and Negotiation to Access Sensitive Resources on the Semantic Web

Gaining access to sensitive resources on the Web usually involves an explicit registration step, where the client has to provide a predetermined set of information to the server. The registration process yields a login/password combination, a cookie, or something similar that can be used to access the sensitive resources. In this paper we show how an explicit registration step can be avoided on the Semantic Web by using appropriate semantic annotations, rule-oriented access control policies, and automated trust negotiation. After presenting the PeerTrust language for policies and trust negotiation, we describe our implementation of implicit registration and authentication that runs under the Java-based MINERVA Prolog engine. The implementation includes a PeerTrust policy applet and evaluator, facilities to import local metadata, policies and credentials, and secure communication channels between all parties.

Rita Gavriloaie, Wolfgang Nejdl, Daniel Olmedilla, Kent E. Seamons, Marianne Winslett
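The policy-based access idea in the abstract above can be reduced to a toy form: a resource's policy names the credentials a requester must disclose, and access is granted once the offered credentials satisfy the policy. This is an illustrative sketch, not the PeerTrust language (which is rule-oriented and supports iterated negotiation); all names are assumptions.

```python
# Hypothetical policies: resource -> set of required credentials.
POLICIES = {
    "course_material": {"student_credential"},
    "grade_database": {"student_credential", "id_card"},
}

def negotiate(resource, offered_credentials):
    """Grant access iff every credential required by the policy is offered.

    Returns (granted, missing) so a negotiation loop could ask the
    requester to disclose the missing credentials and retry.
    """
    required = POLICIES.get(resource, set())
    missing = required - set(offered_credentials)
    return (len(missing) == 0, missing)

granted, missing = negotiate("grade_database", {"student_credential"})
print(granted, missing)  # access denied until the id_card is disclosed
```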
Semantic Annotation Support in the Absence of Consensus

We are interested in the annotation of knowledge which does not necessarily require a consensus. Scholarly debate is an example of such a category of knowledge, where disagreement and contest are widespread and desirable, and unlike many Semantic Web approaches, we are interested in the capture and the compilation of these conflicting viewpoints and perspectives. The Scholarly Ontologies project provides the underlying formalism to represent this meta-knowledge, and we will look at ways to lighten the burden of its creation. After having described some particularities of this kind of knowledge, we introduce ClaimSpotter, our approach to support its ‘capture’, based on the elicitation of a number of recommendations which are presented to our annotators (or analysts) for consideration, and give some elements of evaluation.

Bertrand Sereno, Victoria Uren, Simon Buckingham Shum, Enrico Motta
Uncertainty in Knowledge Provenance

Knowledge Provenance is an approach to determining the origin and validity of knowledge/information on the web by means of modeling and maintaining information sources and dependencies, as well as trust structures. This paper constructs an uncertainty-oriented Knowledge Provenance model to address the provenance problem with uncertain truth values and uncertain trust relationships by using information theory and probability theory. This proposed model could be used for both people and web applications to determine the validity of web information in a world where information is uncertain.

Jingwei Huang, Mark S. Fox

Applications

Collaborative Semantic Web Browsing with Magpie

Web browsing is often a collaborative activity. Users involved in a joint information gathering exercise will wish to share knowledge about the web pages visited and the contents found. Magpie is a suite of tools supporting the interpretation of web pages and semantically enriched web browsing. By automatically associating an ontology-based semantic layer to web resources, Magpie allows relevant services to be invoked as well as remotely triggered within a standard web browser. In this paper we describe how Magpie trigger services can provide semantic support to collaborative browsing activities.

John Domingue, Martin Dzbor, Enrico Motta
Toward a Framework for Semantic Organizational Information Portal

Information Portals have gathered a lot of attention among organizations interested in a single point of access to their information and services. But developing portals from scratch is often too expensive, so many vendors have proposed frameworks to make it affordable. Nevertheless, the frameworks the market offers seem stuck in a simplicity vs. flexibility trade-off imposed by the Web technologies they are built with. We therefore believe that a technology change is required and that Semantic Web technology can play a key role in developing a new, Semantic, generation of simpler and, at the same time, more flexible frameworks for Organizational Information Portals.

Emanuele Della Valle, Maurizio Brioschi
CS AKTiveSpace: Building a Semantic Web Application

In this paper we reflect on the lessons learned from deploying the award-winning [1] Semantic Web application CS AKTiveSpace. We look at issues in service orientation and modularisation, harvesting, and interaction design for supporting this 10-million-triple application. We consider next steps for the application, based on these lessons, and propose a strategy for expanding and improving the services afforded by the application.

Hugh Glaser, Harith Alani, Les Carr, Sam Chapman, Fabio Ciravegna, Alexiei Dingli, Nicholas Gibbins, Stephen Harris, m. c. schraefel, Nigel Shadbolt

Content Management

Cultural Heritage and the Semantic Web

Online cultural archives represent vast amounts of interesting and useful information. During the last decades, huge amounts of literary works have been scanned to provide better access for Humanities researchers and teachers. Whereas the problem 20 years ago was one of scarcity of information (precious originals that could only be consulted in major libraries), today's problem is information overload: many online databases and CD collections are available, each with their own search forms and attributes. This makes it cumbersome for users to find relevant information. In this paper, we describe a case study of how Semantic Web technologies can be used to disclose cultural heritage information in a scalable way. We present an ontology of Humanities, a semi-automatic annotation tool, and an application to exploit the annotated content. This tool, positioned somewhere in the middle between a basic editor and a fully automatic wrapper, helps annotators perform heavy knowledge acquisition tasks in a more efficient and secure way.

V. R. Benjamins, J. Contreras, M. Blázquez, J. M. Dodero, A. Garcia, E. Navas, F. Hernandez, C. Wert
Neptuno: Semantic Web Technologies for a Digital Newspaper Archive

Newspaper archives are a fundamental working tool for editorial teams. Their exploitation in digital format through the web, and the provision of technology to make this possible, are also important businesses today. The volume of archive contents, and the complexity of the human teams that create and maintain them, give rise to diverse management difficulties. We propose the introduction of emergent semantic-based technologies to improve the processes of creation, maintenance, and exploitation of a newspaper's digital archive. We describe a platform based on these technologies, which consists of a) a knowledge base associated with the newspaper archive, based on an ontology for the description of journalistic information, b) a semantic search module, and c) a module for content browsing and visualisation based on ontologies.

P. Castells, F. Perdrix, E. Pulido, M. Rico, R. Benjamins, J. Contreras, J. Lorés

Information Management and Integration

MIKSI – A Semantic and Service Oriented Integration Platform

The MIKSI platform provides a novel information and workflow infrastructure for common tasks of marketing and public relations (e.g. producing and sending press releases). MIKSI is based on a service-oriented architecture using web service technology for communication and data exchange. The process flow is implemented in BPEL [1] using RDF [12] for message exchange. The underlying data model consists of RDF(S) repositories, e.g. event and address data, as demonstrated in the first prototype. This paper presents the MIKSI platform with emphasis on its architecture, business processes, semantic data model and interfaces, tools, and other technical issues.

Alexander Wahler, Bernhard Schreder, Aleksandar Balaban, Juan Miguel Gomez, Klaus Niederacher
Semantic Web Technologies for Economic and Financial Information Management

The field of economy and finance is a conceptually rich domain where information is complex, huge in volume, and a highly valuable business product by itself. Novel management techniques are required for economic and financial information in order to enable efficient generation, management and consumption of large and complex information resources. Following this direction, we have developed an ontology-based platform that provides a) the integration of contents and semantics in a knowledge base that provides a conceptual view on low-level contents, b) an adaptive hypermedia-based knowledge visualization and navigation system, and c) semantic search facilities. We have developed, as the basis of this platform, an ontology for the domain of economic and financial information.

Pablo Castells, Borja Foncillas, Rubén Lara, Mariano Rico, Juan Luis Alonso
Backmatter
Metadata
Title
The Semantic Web: Research and Applications
Edited by
Christoph J. Bussler
John Davies
Dieter Fensel
Rudi Studer
Copyright year
2004
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-25956-5
Print ISBN
978-3-540-21999-6
DOI
https://doi.org/10.1007/b97867