About this book

After years of mostly theoretical research, Semantic Web Technologies are now reaching out into application areas like bioinformatics, eCommerce, eGovernment, or Social Webs. Applications like genomic ontologies, semantic web services, automated catalogue alignment, ontology matching, or blogs and social networks are constantly increasing, often driven or at least backed up by companies like Google, Amazon, YouTube, Facebook, LinkedIn and others. The need to leverage the potential of combining information in a meaningful way in order to be able to benefit from the Web will create further demand for and interest in Semantic Web research.

This movement, based on the growing maturity of related research results, necessitates a reliable reference source from which beginners to the field can draw a first basic knowledge of the main underlying technologies as well as state-of-the-art application areas. This handbook, put together by three leading authorities in the field, and supported by an advisory board of highly reputed researchers, fulfils exactly this need. It is the first dedicated reference work in this field, collecting contributions about both the technical foundations of the Semantic Web as well as their main usage in other scientific fields like life sciences, engineering, business, or education.



Foundations and Technologies

1. Introduction to the Semantic Web Technologies

Without Abstract
John Domingue, Dieter Fensel, James A. Hendler

2. Semantic Web Architecture

The Semantic Web extends the existing Web, adding a multitude of language standards and software components to give humans and machines direct access to data. The chapter starts with deriving the architecture of the Semantic Web as a whole from first principles, followed by a presentation of Web standards underpinning the Semantic Web that are used for data publishing, querying, and reasoning. Further, the chapter identifies functional software components required to implement capabilities and behavior in applications that publish and consume Semantic Web content.
Andreas Harth, Maciej Janik, Steffen Staab

3. Semantic Annotations and Retrieval: Manual, Semiautomatic, and Automatic Generation

The semantic annotation of textual Web content is key for the success of the Semantic Web. This entry reviews key approaches and state-of-the-art systems, as well as drawing conclusions on outstanding challenges and future work.
First, the problem of semantic annotation is defined and distinguished from other related research fields. Manual annotation tools are discussed next in the context of key requirements, such as support for diverse document formats, multiple ontologies, and collaborative, Web-based annotation.
Next, the entry discusses ontology-oriented, semiautomatic, and automatic systems, which typically target ontologies as their output format, but do not use them as a knowledge resource during semantic analysis. Then a number of more advanced ontology-based semantic annotation approaches are presented and compared to one another. Particular emphasis is on scalability (i.e., the ability to process millions of documents) and customization (i.e., how easy it is to adapt these systems to new domains and/or ontologies).
The semantic retrieval of documents enables users to find all documents that mention one or more instances from the ontology and/or relations. The queries can also mix free-text keywords, not just the annotations. Here different types of retrieval tools are reviewed, some of which provide document browsing functionality as well as search refinement capabilities. The entry then provides in-depth examples of three semantic annotation applications: the GATE framework, News Collector, and large-scale patent processing. Future issues to be addressed are making use of linked data, dealing with large-scale, highly ambiguous ontologies, multilinguality, lexicalization of ontologies, and from an implementational perspective, semantic annotation as a service.
Kalina Bontcheva, Hamish Cunningham

4. Semantic Annotation and Retrieval: RDF

The Resource Description Framework (RDF) is the de facto standard for metadata on the Web. Both RDF and its schema language RDFS are recommended by W3C for interlinking resources on the Web and for fostering interoperability among distributed data sources. For this purpose, RDF relies on URIs for identifying resources, and constitutes a graph-based data model for linking such resources. To this end, RDF provides the fundamental building blocks for the graph-based data structures that are leveraged by the Semantic Web. More recently, RDF has grown beyond its role as the base layer of the Semantic Web: it now provides the principal data model for almost all data-minded protocols and formats that are promoted and standardized by W3C.
Fabien L. Gandon, Reto Krummenacher, Sung-Kook Han, Ioan Toma
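
The graph-based data model described in the RDF abstract above can be illustrated with a minimal sketch. It assumes nothing beyond the Python standard library; the `http://example.org/` URIs and the helper name `outgoing_edges` are hypothetical and are not taken from the chapter.

```python
from collections import defaultdict

# Hypothetical namespace used only for this illustration.
EX = "http://example.org/"

# An RDF graph is a set of (subject, predicate, object) triples.
# Resources are identified by URIs; the last object here is a plain literal.
triples = {
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "bob", EX + "knows", EX + "carol"),
    (EX + "alice", EX + "name", "Alice"),
}


def outgoing_edges(graph):
    """Index triples by subject: subject -> list of (predicate, object)."""
    index = defaultdict(list)
    for s, p, o in sorted(graph):
        index[s].append((p, o))
    return dict(index)


edges = outgoing_edges(triples)
for predicate, obj in edges[EX + "alice"]:
    print(predicate, "->", obj)
```

Each triple is one labeled edge, so linking two data sources amounts to taking the union of their triple sets, provided both use the same URIs for the same resources.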

5. Semantic Annotation and Retrieval: Web of Hypertext – RDFa and Microformats

For the Semantic Web to succeed in a major public setting, it needs to leverage the existing Web. Two major technologies have emerged over the last few years to bridge the vast, existing “clickable” Web and the new Semantic Web: Microformats and RDFa. Both allow authors to embed extra information within (X)HTML to mark up the structure, not just the visual presentation, of the information they publish. In this chapter, both approaches are explained, exploring their strengths and weaknesses, providing example applications, and touching on future considerations.
Ben Adida, Mark Birbeck, Ivan Herman

6. Semantic Annotation and Retrieval: Web of Data

The World Wide Web has radically altered the way knowledge is shared by lowering the barrier to publishing and accessing documents. Recent initiatives have extended the principles and architecture of the Web to sharing and accessing data, resulting in an interconnected, global data space – the Web of Data. This chapter reviews the historical and scientific context to the emergence of this Web of Data, before detailing the technologies and principles on which it is based. Applications that have been developed and deployed on the Web of Data are reviewed, as are key resources in the field. The chapter concludes with a discussion of current research challenges in areas such as user interfaces, data fusion, link maintenance, trust, and privacy.
Tom Heath, Christian Bizer

7. Storing the Semantic Web: Repositories

Semantic repositories are database management systems capable of handling structured data while taking its semantics into consideration. The Semantic Web represents the next-generation Web of Data, where information is published and interlinked in a way that lets both humans and machines exploit its structure and meaning. To foster the realization of the Semantic Web, the World Wide Web Consortium (W3C) developed a series of metadata, ontology, and query languages for it. Following the enthusiasm about the Semantic Web and the wide adoption of the related standards, most semantic repositories today are database engines that deal with data represented in RDF, support SPARQL queries, and can interpret schemas and ontologies represented in RDFS and OWL. Naturally, such engines take the role of Web servers of the Semantic Web.
This chapter starts with an introduction to semantic repositories and a discussion of their links to several other technology trends, including relational databases, column-stores, and expert systems. As the most distinguishing quality of semantic repositories is reasoning, an overview of the strategies for the integration of inference in the data management life cycle is presented. An overall view of the mechanics of the engines is provided from the perspective of a conceptual framework that reveals all their tasks and activities (e.g., storage and retrieval) along with the factors that impact their performance (e.g., data size and complexity). A review of several design issues, including distribution, serves as a basis for understanding the different implementation approaches and their implications for the performance of semantic repositories. Several of the most popular benchmarks and datasets, which are often used as measuring sticks for the performance of the engines, and a few of the outstanding semantic repositories are presented along with the best published evaluation results.
The advantages and the typical applications of semantic repositories are presented focusing on two usage scenarios: reasoning with and the management of linked data (a popular trend in the Semantic Web) and enterprise data integration. The chapter ends with some considerations regarding the future development of semantic repositories and design topics like adaptive indexing and interoperability patterns.
Atanas Kiryakov, Mariana Damova

8. Querying the Semantic Web: SPARQL

SPARQL – SPARQL Protocol and RDF Query Language – is the language, proposed by W3C, for querying RDF data published on the Web, whether stored natively or viewed via middleware. SPARQL offers a syntactically SQL-like language for querying RDF graphs via pattern matching, as well as a simple communication protocol that can be used by clients for issuing SPARQL queries against endpoints.
The first section provides the reader with a scientific and technical overview of the SPARQL query language. Basic concepts, such as the notions of triple and graph patterns are presented first. The section, then, shows how to write simple queries, and progressively introduces the reader to the full expressive power of SPARQL.
The second section illustrates some examples of applications, progressing in a quasi-chronological order. It starts with “early days” applications, when RDF data were lacking and Semantic Web practitioners applied semantic technologies to bibliographic and conference data. Next, it moves on to “first uptakes” in the area of bioscience, which can be considered the earliest science to adopt Semantic Web technologies. This section concludes with the presentation of some large applications, showing SPARQL queries that nowadays can be issued against interlinked RDF repositories about music and about governmental data.
The third section is dedicated to SPARQL implementations, in particular to those that are most used and widely deployed. It also discusses the standard compliance of the implementations, based upon the W3C test suite.
Finally, the fourth section discusses some of the issues that characterize the current development of SPARQL. It presents foreseen extensions to the query language, in particular a proposal for remotely updating RDF graphs, four vocabularies for describing SPARQL endpoints, the behavior of SPARQL under different entailment regimes, three approaches for querying the entire Semantic Web with SPARQL, and three proposals for extending SPARQL to the management of streams of rapidly flowing information.
Emanuele Della Valle, Stefano Ceri
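
The idea of querying RDF graphs via pattern matching, mentioned in the SPARQL abstract above, can be sketched in a few lines. This is a simplified illustration, not SPARQL syntax and not how any particular engine works: the function names `match_pattern` and `query`, the short resource names, and the convention that strings starting with `?` are variables are all assumptions made for this example.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.

    Returns the extended binding, or None if the match fails.
    """
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant must match the triple exactly.
            return None
    return result


def query(patterns, graph):
    """Evaluate a list of triple patterns by joining them left to right."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical data: short names stand in for full URIs.
graph = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "name", "Bob"),
]

# "Whom does alice know, and what is that person's name?"
results = query([("alice", "knows", "?x"), ("?x", "name", "?n")], graph)
print(results)
```

The shared variable `?x` is what joins the two patterns: only bindings that satisfy both triples survive.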

9. KR and Reasoning on the Semantic Web: OWL

OWL is the ontology language recommended by the W3C. OWL is heavily based on the knowledge representation languages called Description Logics, which provide the basic representation features of OWL. OWL also includes facilities that integrate it into the mainstream of the Web, including use of Internationalized Resource Identifiers (IRIs) as names, XML Schema datatypes, and ontologies as Web documents, which can then import other OWL ontologies over the Web. Because OWL is based on Description Logics, its constructs have a well-defined meaning and there are tools that effectively perform inference within OWL, enabling the discovery of information that is not explicitly stated in OWL ontologies.
Ian Horrocks, Peter F. Patel-Schneider
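
As a rough illustration of what "discovery of information that is not explicitly stated" means in the OWL abstract above, the sketch below derives class memberships from a subclass hierarchy. It is far simpler than Description Logic reasoning and does not use OWL syntax; the class names, the individual `tom`, and the function `infer_types` are hypothetical.

```python
def infer_types(subclass_of, instance_of):
    """Forward-chain to a fixpoint.

    Rule applied: if x is an instance of C and C is a subclass of D,
    then x is also an instance of D.
    """
    inferred = set(instance_of)
    changed = True
    while changed:
        changed = False
        for individual, cls in list(inferred):
            for sub, sup in subclass_of:
                if cls == sub and (individual, sup) not in inferred:
                    inferred.add((individual, sup))
                    changed = True
    return inferred


# Explicitly stated facts (hypothetical example ontology).
subclass_of = {("Cat", "Mammal"), ("Mammal", "Animal")}
instance_of = {("tom", "Cat")}

facts = infer_types(subclass_of, instance_of)
print(sorted(facts))
```

Only `tom` being a `Cat` is stated; that `tom` is also a `Mammal` and an `Animal` follows from the hierarchy.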

10. KR and Reasoning on the Semantic Web: RIF

Rule Interchange Format (RIF) is a suite of W3C standards designed to facilitate rule exchange among different and dissimilar rule engines, especially among Web-enabled engines. Following on the heels of the earlier Semantic Web standards, RDF and OWL, RIF aims to revolutionize the field of Web application development and create infrastructure for truly intelligent Web applications. The goal of this chapter is to provide an overview of RIF, especially the syntax and semantics of its logic-based dialects. As an illustration, it is shown how RIF can be used to build a sophisticated distributed application for the procurement of mobile services, which heavily relies on rule-based reasoning. This chapter also discusses the limitations of RIF’s Basic Logic Dialect (RIF-BLD); in particular, where it falls short of the requirements for complex applications, such as above, and shows how RIF’s Framework for Logic Dialects (RIF-FLD) solves these problems by providing a general framework for designing more expressive dialects.
Michael Kifer

11. KR and Reasoning on the Semantic Web: Web-Scale Reasoning

Reasoning is a key element of the Semantic Web. For the Semantic Web to scale, it is required that reasoning also scales. This chapter focuses on two approaches to achieve this: The first deals with increasing the computational power available for a given task by harnessing distributed resources. These distributed resources refer to peer-to-peer networks, federated data stores, or cluster-based computing. The second deals with containing the set of axioms that need to be considered for a given task. This can be achieved by using intelligent selection strategies and limiting the scope of statements. The former is exemplified by methods substituting expensive web-scale reasoning with the cheaper application of heuristics while the latter by methods to control the quality of the provided axioms. Finally, future issues concerning information centralization and logics vs information retrieval-based methods, metrics, and benchmarking are considered.
Spyros Kotoulas, Frank van Harmelen, Jesse Weaver

12. Social Semantic Web

The Social Web has captured the attention of millions of users as well as billions of dollars in investment and acquisition. As more social websites form around the connections between people and their objects of interest, and as these “object-centered networks” grow bigger and more diverse, more intuitive methods are needed for representing and navigating content both within and across social websites. Also, to better enable user access to multiple sites and ultimately to content-creation facilities on the Web, interoperability among social websites is required in terms of both the content objects and the person-to-person networks expressed on each site. Semantic Web representation mechanisms are ideally suited to describing people and the objects that link them together, recording and representing the heterogeneous ties involved. The Semantic Web is also a useful platform for performing operations on diverse, distributed person- and object-related data. In the other direction, object-centered networks and user-centric services for generating collaborative content can serve as rich data sources for Semantic Web applications. This chapter will give an overview of the “Social Semantic Web,” where semantic technologies are being leveraged to overcome the aforementioned limitations in a variety of Social Web application areas.
John G. Breslin, Alexandre Passant, Denny Vrandečić

13. Ontologies and the Semantic Web

Ontologies have become a prominent topic in Computer Science where they serve as explicit conceptual knowledge models that make domain knowledge available to information systems. They play a key role in the vision of the Semantic Web where they provide the semantic vocabulary used to annotate websites in a way meaningful for machine interpretation. As studied in the context of information systems, ontologies borrow from the fields of symbolic knowledge representation in Artificial Intelligence, from formal logic and automated reasoning and from conceptual modeling in Software Engineering, while also building on Web-enabling features and standards.
Although in Computer Science ontologies are a rather new field of study, certain accomplishments can already be reported from the current situation in ontology research. Web-compliant ontology languages based on a thoroughly understood theory of underlying knowledge representation formalisms have been and are being standardized for their widespread use across the Web. Methodological aspects about the engineering of ontologies are being studied, concerning both their manual construction and (semi)automated generation. Initiatives on “linked open data” for collaborative maintenance and evolution of community knowledge based on ontologies emerge, and the first semantic applications of Web-based ontology technology are successfully positioned in areas like semantic search, information integration, or Web community portals.
This chapter will present ontologies as one of the major cornerstones of Semantic Web technology. It will first explain the notion of formal ontologies in Computer Science and will discuss the range of concrete knowledge models usually subsumed under this label. Next, the chapter surveys ontology engineering methods and tools, both for manual ontology construction and for the automated learning of ontologies from text. Finally, different kinds of usage of ontologies are presented and their benefits in various application scenarios illustrated.
Stephan Grimm, Andreas Abecker, Johanna Völker, Rudi Studer

14. Future Trends

This volume has introduced the foundations and technologies which make up the Semantic Web. Mostly the discussion has been on the state of the art, but what developments can be expected next in semantic technologies? What social and technological trends will spur and enable the next generation of semantic technology? Which application areas can one expect to gain most added value from implementation of semantic technologies in the next 15 years? With expert eyes on the crystal ball, this final chapter of the first volume will outline these future trends.
Lyndon Nixon, Raphael Volz, Fabio Ciravegna, Rudi Studer

Semantic Web Applications

15. Semantic Technology Adoption: A Business Perspective

In the past decade, significant budgets have been invested in the development of Semantic Technologies. In this chapter, relevant factors are laid out for semantic technology adoption and a framework is provided for describing and understanding the value proposition of semantic technology from a business point of view. An overview is given of the current semantic offering in the marketplace, and this is interpreted in terms of the framework. Finally, three adoption horizons are introduced for when semantic technology can be expected to reach the mainstream market and what role it will play in organizations. Overall, the conclusion is that there is some uptake of the technology, but that the mainstream market has not yet been reached. This is partly because industries lack an understanding of the situations in which semantic technologies can add value to their business. This is the reason for introducing a new framework that is aimed at enabling industries to more easily relate semantic technologies to their business.
V. Richard Benjamins, Mark Radoff, Mike Davis, Mark Greaves, Rose Lockwood, Jesús Contreras

16. Semantic Web Search Engines

The last couple of years have seen an increasing growth in the amount of Semantic Web data made available, and exploitable, on the Web. Compared to the Web, one unique feature of the Semantic Web is its friendly interface with software programs. In order to better serve human users with software programs, supporting infrastructures for finding and selecting the distributed online Semantic Web data are needed. A number of Semantic Web search engines have emerged recently. These systems are based on different design principles and provide different levels of support for users and/or applications. In this chapter, a survey of these Semantic Web search engines is presented, together with the detailed description of the design of two prominent systems: Swoogle and Watson. The way these systems are used to enable domain applications and support cutting-edge research on Semantic Web technologies is also discussed. In particular, this chapter includes examples of a new generation of semantic applications that, thanks to Semantic Web search engines, exploit online knowledge at runtime, without the need for laborious acquisition in specific domains. In addition, through collecting large amounts of semantic content online, Semantic Web search engines such as Watson and Swoogle allow researchers to better understand how knowledge is formally published online and how Semantic Web technologies are used. In other terms, by mining the collected semantic documents, it becomes possible to get an overview and explore the Semantic Web landscape today.
The first section below (Sect. 16.1) presents a general overview of the area, including the main challenges, related systems, as well as an abstract specification of what is called Semantic Web search engines. It also includes a detailed overview of the two systems more specifically considered as case studies, Swoogle (Sect. 16.1.4) and Watson (Sect. 16.1.5). Section 16.2 shows how these systems are currently being used and applied, both as development platforms to make possible the realization of applications exploiting Semantic Web content (Sect. 16.2.1), and as research platforms, allowing one to better understand the content of the Semantic Web, how knowledge is published online and how it is structured. Finally, Sect. 16.3 briefly introduces other resources to be considered in the area of Semantic Web search engines, and Sect. 16.4 concludes the chapter.
Mathieu d’Aquin, Li Ding, Enrico Motta

17. eScience

This chapter looks into how the use of semantic technologies can provide support to common needs in eScience projects, including data-intensive science, facilitating experiment knowledge reuse and recycle among scientists, lowering the barriers of knowledge exchange for interdisciplinary research, and bridging the gap between data from different sources and the gap between data sharing and digital scholarly publication. To illustrate this, we describe a set of pioneering semantic eScience projects that cover a diversity of application domains including bioinformatics, biology, chemistry, physics, environmental science, and astronomy, and we summarize some of the open issues and future lines of research and development in this area.
Jun Zhao, Oscar Corcho, Paolo Missier, Khalid Belhajjame, David Newmann, David de Roure, Carole A. Goble

18. Knowledge Management in Large Organizations

This chapter provides an overview of the knowledge management (KM) problems, and opportunities, faced by large organizations, and indeed also shared by some smaller organizations. The chapter shows how semantic technologies can make a contribution. It looks at the key application areas: finding and organizing information; sharing knowledge; supporting processes, in particular informal processes; information integration; extracting knowledge from unstructured information; and finally sharing and reusing knowledge across organizations. In each application area, the chapter describes some solutions, either currently available or being researched. This is done to provide examples of what is possible rather than to provide a comprehensive list. The chapter also describes some of the technologies which contribute to these solutions; for example, text mining for analyzing documents or text within documents; and natural language processing for analyzing language itself and, for example, identifying named entities. Most fundamentally, the use of ontologies as a form of knowledge representation underlies everything talked about in the chapter. Ontologies offer great expressive power; they provide enormous flexibility, with the ability to evolve dynamically unlike database schema; and they make possible machine reasoning. The chapter concludes by identifying the key trends and describing the key challenges to be faced in the development of more powerful tools to support knowledge work.
John Davies, Paul Warren

19. eBusiness

Integrating Semantic Web concepts into the domain of eBusiness is a hot topic. However, most of the efforts so far have concentrated on improving B2C (business-to-consumer) eCommerce applications through the semantic enrichment of information. With the growing importance of Service-Oriented Architectures (SOA), companies started to move into the area of Electronic Data Interchange (EDI), where applications exchange their business information semiautomatically. This B2B (business-to-business) electronic commerce is driven by aligning the internal business processes of companies to publicly available business processes. In doing so, companies often do not consider the economic drivers of their business processes, which leads to incompatibilities between management, administration, and technical layers. This chapter covers the two major domains of eBusiness/eCommerce, namely B2B and B2C. In the first, a model-driven approach toward B2B IT solutions is introduced, covering semantic aspects dealing with business models, business process models, and business document models. In the second application domain, the basic concepts of Semantic Web in the area of B2C eCommerce are examined using a representative example from the eTourism domain.
Christoph Grün, Christian Huemer, Philipp Liegl, Dieter Mayrhofer, Thomas Motal, Rainer Schuster, Hannes Werthner, Marco Zapletal

20. eGovernment

The use of the Semantic Web (SW) in eGovernment is reviewed. The challenges for the introduction of SW technology in eGovernment are surveyed from the point of view both of the SW as a new technology that has yet to reach its full potential, and of eGovernment as a complex digital application with many constraints, competing interests, and drivers, and a large and heterogeneous user base of citizens. The spread of SW technology through eGovernment is reviewed, looking at a number of international initiatives, and it is argued that pragmatic considerations stemming from the institutional context are as important as technical innovation. To illustrate these points, the chapter looks in detail at recent efforts by the UK government to represent and release public-sector information in order to support integration of heterogeneous information sources by both the government and the citizen. Two projects are focused on. AKTive PSI was a proof of concept, in which information was re-represented in RDF and made available against specially created ontologies, adding significant value to previously existing databases. Steps in the management of the project are described, to demonstrate how problems of perception can be overcome with relatively little overhead. Secondly, the project is discussed, showing the technical means by which it has exploited the growth of the Web of linked data to facilitate re-representation and integration of information from diverse and heterogeneous sources. Drawing on experience in the policy and organizational challenges of deploying SW capabilities at national scales are discussed as well as the prospects for the future.
Nigel Shadbolt, Kieron O’Hara, Manuel Salvadores, Harith Alani

21. Multimedia, Broadcasting, and eCulture

This chapter turns to the application of semantic technologies to areas where text is not dominant, but rather audiovisual content in the form of images, 3D objects, audio, and video/television. Non-textual digital content raises new challenges for semantic technology in terms of capturing the meaning of that content and expressing it in the form of semantic annotation. Where such annotations are available in combination with expressive ontologies describing the target domain, such as television and cultural heritage, new and exciting possibilities arise for multimedia applications.
Lyndon Nixon, Stamatia Dasiopoulou, Jean-Pierre Evain, Eero Hyvönen, Ioannis Kompatsiaris, Raphaël Troncy

22. Semantic Web Services

In recent years, service-orientation has increasingly been adopted as one of the main approaches for developing complex distributed systems from reusable components called services. Realizing the potential benefits of this software engineering approach requires semiautomated and automated techniques as well as tools for searching or locating services, selecting the suitable ones, composing them into complex processes, resolving heterogeneity issues through process and data mediation, and reducing other tedious yet recurrent tasks with minimal manual effort. Just as semantics has brought significant benefits to search, integration, and analysis of data, it is also seen as a key to achieving a greater level of automation to service-orientation. This has led to research and development, as well as standardization efforts on Semantic Web Services. Activities related to Semantic Web Services have involved developing conceptual models or ontologies, algorithms, and engines that could support machines in semiautomatically or automatically discovering, selecting, composing, orchestrating, mediating, and executing services. This chapter provides an overview of the area after nearly a decade of research. The chapter presents the main principles and conceptual models proposed thus far, including OWL-S, Web Service Modeling Ontology (WSMO), and Semantic Annotations for WSDL (SAWSDL)/Managing End-to-End Operations-Semantics (METEOR-S), as well as recent approaches that provide lighter solutions and bring support for the increasingly popular Web APIs and RESTful services, like SA-REST, WSMO-Lite, and MicroWSMO. The chapter also describes the main engines and frameworks developed by the research community, including discovery engines, composition engines, and even integrated frameworks that are able to use these semantic descriptions of services to support some of the typical activities related to services and service-based applications. 
Next, the ideas and techniques described are illustrated through two use cases that integrate Semantic Web Services technologies within real-world applications. Finally, a set of key resources that would allow the reader to reach a greater understanding of the field is provided, and the main issues that will drive the future of Semantic Web Services (SWS) are outlined.
Carlos Pedrinaci, John Domingue, Amit P. Sheth

