Semantic community Web portals

https://doi.org/10.1016/S1389-1286(00)00039-6

Abstract

Community Web portals serve the information needs of particular communities on the Web. We discuss here how a comprehensive and flexible strategy for building and maintaining a high-value community Web portal has been conceived and implemented. The strategy includes collaborative information provisioning by the community members. It is based on an ontology as a semantic backbone for accessing information on the portal, for contributing information, as well as for developing and maintaining the portal. We have also implemented a set of ontology-based tools that have facilitated the construction of our showcase, the community Web portal of the knowledge acquisition community.

Introduction

One of the major strengths of the Web is that virtually everyone who owns a computer may contribute high-value information; the real challenge is to ensure that valuable information can actually be found. Obviously, this challenge cannot be met by centralized services alone, since the coverage of even the most powerful crawling and indexing machines has shrunk in recent years as a percentage of the number of Web pages available on the Web. This means that a proper solution to this dilemma should rather be sought along a principal paradigm of the World Wide Web, viz. self-organization.

Self-organization does not necessarily mean automatic, machine-driven organization. Rather, from the very beginning communities of interest have formed on the Web that covered what they deemed to be of interest to their group of users in what we here call community Web portals. Community Web portals are similar to Yahoo and its like in their goal of presenting a structured view onto the Web; however, they differ in the way knowledge is provided, namely in a collaborative process with only few resources (manpower, money) for maintaining and editing the portal. Another major distinction is that community Web portals number in the millions, since a large percentage, if not the majority, of Web or intranet sites is maintained not by a central department, but rather by a community of users. Strangely enough, technology for supporting communities of interest has not quite kept up with the complexity of the task of managing community Web portals. A few years ago such a community of interest would have had comparatively few sources of information to consider; hence, the overall complexity of managing this task was low. Now, with so many more people participating, a community portal of only modest size may easily reach the point where it appears to be more a jungle of interest than a convenient portal to start from.

This problem gave us reason to reconsider the techniques for managing community Web portals. We observed that a successful Web portal weaves loose pieces of information into a coherent presentation adequate for sharing knowledge with the user. On the conceptual, knowledge-sharing level we have found that Davenport and Prusak's maxim [6], "people can't share knowledge if they don't speak a common language", is utterly crucial for the case of community Web portals. The only difference from Davenport and Prusak's view derives from the fact that knowledge need not only be shared between people, but also between people and machines.

At this point, ontologies and intelligent reasoning come in as key technologies that allow knowledge sharing at a conceptually concise and elaborate level. These AI techniques support core concerns of the 'Semantic Web' (cf. [3]). In this view, information on the Web is not restricted to HTML; information may also be formal and, thus, machine-understandable. The combination of the two may be accounted for by an explicit model of knowledge structures in an ontology. The ontology formally represents the common knowledge and interests that people share within their community. It is used to support the major tasks of a portal, viz. accessing the portal through manifold, dynamic, conceptually plausible views on the information of interest to a particular community, and providing information in a number of ways that reflect the different types of information resources held by the individuals.
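To make the role of the ontology more concrete, the following is a minimal, hypothetical sketch in Python of how a small ontology fragment and a few community-contributed facts might be represented. The concept, relation and instance names are invented for illustration only; they do not reproduce the actual KA2 ontology, nor the F-Logic notation used by the system described in the paper.

```python
# A minimal, hypothetical ontology fragment for a research community.
# Names are illustrative, not the actual KA2 ontology or F-Logic syntax.

ONTOLOGY = {
    "concepts": {
        "Person":      {"subclass_of": None},
        "Researcher":  {"subclass_of": "Person"},
        "Project":     {"subclass_of": None},
        "Publication": {"subclass_of": None},
    },
    "relations": {                                   # relation: (domain, range)
        "worksOn":        ("Researcher", "Project"),
        "authorOf":       ("Researcher", "Publication"),
        "cooperatesWith": ("Researcher", "Researcher"),
    },
}

# Facts contributed by community members, stored as (subject, relation, object).
FACTS = [
    ("alice", "instanceOf", "Researcher"),
    ("ka2",   "instanceOf", "Project"),
    ("alice", "worksOn",    "ka2"),
]

def is_a(concept, ancestor):
    """Follow subclass_of links, so that a Researcher also counts as a Person."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = ONTOLOGY["concepts"][concept]["subclass_of"]
    return False

print(is_a("Researcher", "Person"))  # True
```

Even this toy fragment shows the two roles the ontology plays: it fixes a shared vocabulary for human contributors, and it gives a machine-processable structure against which contributed facts can be checked and queried.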

Following these principles when putting the semantic community Web portal into practice entails a range of subtasks that must be accounted for and requires a set of tools that support the accomplishment of these tasks. In Section 2 we discuss requirements that we have derived from a particular application scenario, the KA2 portal, which also serves as our testbed for development. Section 3 describes how ontologies are created and used for structuring information and thus constitutes the conceptual cornerstone of our community Web portal. We proceed with the actual application of ontologies for the purposes of accessing the KA2 portal by navigating and querying explicit and implicit information through conceptual views on, and rules in, the ontology (Section 4). Section 5 covers the information-provisioning part of the community Web portal, considering problems like information gathering and integration. Then, we describe the engineering process for our approach (Section 6) and present the overall architecture of our system (Section 7). Before we conclude with a summary of experiences and further work, we compare our work with related approaches (Section 8).

Section snippets

Requirements for a community Web portal – the KA2 example

Examples of community Web portals abound. In fact, one finds portals that succeed very well with regard to some of the requirements we describe in this section. For instance, MathNet introduces knowledge sharing through a database relying on Dublin Core metadata. Another example, RiboWeb [1], offers means to navigate a knowledge base about ribosomes. However, these approaches lack an integrated concept covering all phases of a community Web portal, viz. information

Structuring the community Web

Let us now summarize the principal stipulations we have found so far. We need:

  • a conceptual structure for presenting information to the user,

  • support for integrating information of different granularities stored in various formats,

  • comprehensive tool support for providing information and for developing and maintaining the portal, and

  • a methodology for implementing the portal.

In particular, we need an explicit structuring mechanism that pervades the portal and reaches from development and maintenance, over

Accessing the community Web portal

A major requirement from Section 2 has been that navigation and querying of the community Web portal need to be conceptually founded, because only then can a structured, richly interwoven presentation be compiled on the fly. In fact, we elaborate in this section how a semantic underpinning, like the KA2 ontology described above, lets us define a multitude of views that dynamically arrange information. Thus, our system may provide the rich interlinking that is most adequate for the individual
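As a rough illustration of what such conceptually founded views and rules could look like, consider the following Python sketch. It is a hedged example: the fact base, the rule and the view are invented for illustration, whereas the actual system expresses views and rules declaratively (in F-Logic) and evaluates them with an inference engine. The sketch derives implicit cooperation facts from shared project membership and exposes a dynamic per-project view.

```python
from itertools import combinations

# Explicit facts as (subject, relation, object) triples; names are illustrative.
FACTS = {
    ("alice", "worksOn", "ka2"),
    ("bob",   "worksOn", "ka2"),
    ("carol", "worksOn", "onto-tools"),
}

def derive_cooperation(facts):
    """Toy rule: two researchers working on the same project cooperate.
    This mimics, in plain Python, the kind of rule an inference engine evaluates."""
    by_project = {}
    for s, p, o in facts:
        if p == "worksOn":
            by_project.setdefault(o, set()).add(s)
    derived = set()
    for members in by_project.values():
        for a, b in combinations(sorted(members), 2):
            derived.add((a, "cooperatesWith", b))
            derived.add((b, "cooperatesWith", a))
    return facts | derived

def project_view(facts, project):
    """A dynamic 'view': all researchers associated with a given project."""
    return sorted(s for s, p, o in facts if p == "worksOn" and o == project)

all_facts = derive_cooperation(FACTS)
print(project_view(all_facts, "ka2"))                            # ['alice', 'bob']
print(sorted(f for f in all_facts if f[1] == "cooperatesWith"))  # derived facts
```

The point of the sketch is that the presentation layer never touches raw pages: it asks conceptual questions ("who works on this project?", "who cooperates with whom?") whose answers may combine explicit and derived facts.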

Providing information

'One method fits all' does not meet the requirements we have sketched above for the information-provisioning part of community Web portals. What one rather needs is a set of methods and tools that may account for the diversity of information sources of potential interest to the community portal. While these methods and tools need to obey different syntactic mechanisms, coherent integration of information is only possible with a conceptual basis that may sort loose pieces of information into a
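To illustrate the kind of conceptual integration meant here, the sketch below maps records from two differently structured sources onto the same ontological relations, so that all contributions end up in one uniform fact base. This is a hedged example: the source formats, field names and mapping functions are invented for illustration and are not the paper's actual wrappers or annotation tools.

```python
# Hypothetical records as they might arrive from two differently structured sources.
HTML_METADATA = [  # e.g. metadata extracted from annotated home pages
    {"page": "http://example.org/~alice", "name": "alice", "project": "ka2"},
]
LEGACY_DB_ROWS = [  # e.g. rows exported from a legacy member database
    ("bob", "KA2"),
]

def from_html_metadata(records):
    """Map annotated-page metadata onto ontology relations (triples)."""
    for r in records:
        yield (r["name"], "worksOn", r["project"].lower())
        yield (r["name"], "homepage", r["page"])

def from_legacy_db(rows):
    """Map database rows onto the same ontology relations."""
    for name, project in rows:
        yield (name, "worksOn", project.lower())

# Regardless of source syntax, everything ends up in one conceptual fact base.
fact_base = set(from_html_metadata(HTML_METADATA)) | set(from_legacy_db(LEGACY_DB_ROWS))
print(sorted(fact_base))
```

The syntactic machinery differs per source, but the target vocabulary is the ontology; that is what makes later access and reasoning uniform.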

The development and maintenance process

Even with the methodological and tool support we have described so far, developing a Web portal for a community of non-trivial size remains a complex task. Strictly ad-hoc rapid prototyping approaches easily doom the construction to failure or lead to unsatisfactory results. Hence, we have thought about a more principled approach towards the development process that serves as a means for documenting development, as well as for communicating principal structures to co-developers and

The system architecture

This section summarizes the major components of our system. An overall view of our system is depicted in Fig. 6, which includes the core modules for accessing and maintaining a community Web portal:

  • Providing information in our community Web portal has already been introduced in Section 5. In our approach we distinguish between metadata-based, wrapper-based and fact-based information. Metadata-based information (such as HTML-A, Word-A, Excel-A, RDF, XML) is collected from the Web using a fact

Related work

This section positions our work in the context of existing Web portals like Yahoo and Netscape and also relates our work to other technologies that are or could be deployed for the construction of community Web portals.

One of the well-established Web portals on the Web is Yahoo, a manually maintained Web index. Yahoo allows information seekers to retrieve Web documents by navigating a tree-like taxonomy of topics. Each Web document indexed by Yahoo is classified manually
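For contrast, navigating such a manually maintained topic taxonomy amounts to a simple tree walk. The following generic Python sketch illustrates the principle; the categories and documents are invented and do not reflect Yahoo's actual data model.

```python
# A generic sketch of a topic taxonomy; categories and documents are invented.
TAXONOMY = {
    "Science": ["Computer Science", "Mathematics"],
    "Computer Science": ["Artificial Intelligence"],
    "Artificial Intelligence": [],
}
DOCUMENTS = {
    "Artificial Intelligence": ["http://example.org/ka2-portal"],
}

def documents_under(topic):
    """Collect documents classified under a topic or any of its subtopics."""
    docs = list(DOCUMENTS.get(topic, []))
    for sub in TAXONOMY.get(topic, []):
        docs.extend(documents_under(sub))
    return docs

print(documents_under("Science"))
```

The taxonomy offers exactly one access path per document, whereas the ontology-based approach described above can generate many conceptual views over the same facts.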

Conclusion

We have demonstrated in this paper how a community may build a community Web portal. The portal is centered around an ontology that structures information for the purposes of presentation and provisioning of information, as well as for the development and maintenance of the portal. We have described a small-scale example, the KA2 portal, that illustrates some of our techniques and methods. In particular, we have developed a set of ontology-based tools that allow us to present multiple views onto


References (30)

  • R. Altmann, M. Bada, X. Chai, M.W. Carillo, R. Chen and N. Abernethy, RiboWeb: an ontology-based system for...
  • R. Benjamins, D. Fensel and S. Decker, KA2: building ontologies for the Internet: a midterm report, Int. J. Human...
  • T. Berners-Lee, Weaving the Web, Harper, New York,...
  • V. Chaudri, A. Farquhar, R. Fikes, P. Karp and J. Rice, OKBC: a programmatic foundation for knowledge base...
  • W. Dalitz, M. Grötschel and J. Lügger, Information services for mathematics in the Internet (Math-Net), in: A. Sydow...
  • T. Davenport and L. Prusak, Working Knowledge: How Organizations Manage What They Know, Harvard Business School Press,...
  • S. Decker, D. Brickley, J. Saarela and J. Angele, A Query and Inference Service for RDF, in: Proc. of the W3C Query...
  • S. Decker, M. Erdmann, D. Fensel and R. Studer, Ontobroker: ontology based access to distributed and semi-structured...
  • A. Deutsch, M. Fernandez, D. Florescu, A. Levy and D. Suciu, A query language for XML, in: Proc. of the 8th...
  • M. Erdmann and R. Studer, Ontologies as conceptual models for XML documents, in: Proc. of the 12th International...
  • D. Fensel, S. Decker, M. Erdmann and R. Studer, Ontobroker: the very high idea, in: Proc. of the 11th International...
  • M. Fernandez, D. Florescu, J. Kang and A. Levy, Catching the boat with Strudel: experiences with a Web-site management...
  • P. Fröhlich, W. Neijdl and M. Wolpers, KBS-Hyperbook — an open hyperbook system for education, in: Proc. of the 10th...
  • T.R. Gruber, A translation approach to portable ontology specifications, Knowledge Acquisition 6 (2) (1993)...
  • M. Kesseler, A schema based approach to HTML authoring, in: Proc. of the 4th International World Wide Web Conference...

    Steffen Staab is assistant professor for Applied Computer Science at Karlsruhe University. He has published in the fields of computational linguistics, information extraction, knowledge representation and reasoning, knowledge management, knowledge discovery, and intelligent systems for the Web. Steffen studied computer science and computational linguistics between 1990 and 1998, earning an M.S.E. from the University of Pennsylvania during a Fulbright scholarship and a Dr. rer. nat. from Freiburg University during a scholarship with Freiburg's graduate program in Cognitive Science. Since then, he has also been working as a consultant for knowledge management at Fraunhofer IAO and at the start-up company Ontoprise.

    Jürgen Angele received the diploma degree in Computer Science in 1985 from the University of Karlsruhe. From 1985 to 1989 he worked for the companies AEG, Konstanz, and SEMA GROUP, Ulm, Germany. From 1989 to 1994 he was a research and teaching assistant at the University of Karlsruhe. He did research on the operationalization of the knowledge acquisition language KARL, which led to a Ph.D. from the University of Karlsruhe in 1993. In 1994 he became a full professor in applied computer science at the University of Applied Sciences, Braunschweig, Germany. In 1999 he cofounded the company Ontoprise together with S. Decker, H.-P. Schnurr, S. Staab, and R. Studer and has been CEO of Ontoprise since then. His interests lie in the development of knowledge management tools and systems, including innovative applications of knowledge-based systems to the World Wide Web.

    Stefan Decker is working as a PostDoc at Stanford's Infolab together with Prof. Gio Wiederhold in the Scalable Knowledge Composition project on ontology articulations. He has published in the fields of ontologies, information extraction, knowledge representation and reasoning, knowledge management, problem solving methods and intelligent systems for the Web. He is one of the designers and implementers of the Ontobroker system. Stefan Decker studied computer science and mathematics at the University of Kaiserslautern and finished his studies with the best possible result in 1995. From 1995 to 1999 he did his Ph.D. studies at the University of Karlsruhe, where he worked on the Ontobroker project.

    Michael Erdmann gained his M.Sc. in Computer Science from the University of Koblenz (Germany) in 1995. Since October 1995 he has been working as a junior researcher at the University of Karlsruhe (Germany). He is a member of the Ontobroker project team and currently engaged in finishing his Ph.D. on the relationship between semantic knowledge modeling with ontologies and XML.

    Andreas Hotho is a Ph.D. student at the Institute of Applied Computer Science and Formal Description Methods at Karlsruhe University. He earned his M.Sc. in information systems from the University of Braunschweig, Germany, in 1998. His research interests include the application of data mining techniques to very large databases and intelligent Web applications.

    Alexander Maedche is a Ph.D. candidate at the Institute of Applied Computer Science and Formal Description Methods at Karlsruhe University. He received his diploma in Industrial Engineering (computer science, operations research) in 1999 from Karlsruhe University. His research interests include ontology engineering, machine learning, data and text mining and ontology-based applications.

    Hans-Peter Schnurr is a Ph.D. candidate at the Institute of Applied Computer Science and Formal Description Methods at Karlsruhe University. He received his diploma in Industrial Engineering in 1995 from Karlsruhe University. Between 1995 and 1998, Hans-Peter worked as a researcher and practice analyst at McKinsey and Company; he is a co-founder of the start-up company Ontoprise, a knowledge management solutions provider. His current research interests include knowledge management methodologies and applications, ontology engineering and ontology-based applications.

    Rudi Studer obtained a diploma in Computer Science at the University of Stuttgart in 1975. In 1982 he was awarded a Doctor's degree in mathematics and computer science at the University of Stuttgart, and in 1985 he obtained his Habilitation in Computer Science at the University of Stuttgart. From January 1977 to June 1985 he worked as a research scientist at the University of Stuttgart. From July 1985 to October 1989 he was project leader and manager at the Scientific Center of IBM Germany. Since November 1989 he has been full professor in Applied Computer Science at the University of Karlsruhe. His research interests include knowledge management, intelligent Web applications, knowledge engineering and knowledge discovery. He is co-founder and member of the scientific advisory board of the knowledge management start-up company Ontoprise.

    York Sure is a Ph.D. candidate at the Institute of Applied Computer Science and Formal Description Methods at Karlsruhe University. He received his diploma in industrial engineering in 1999 from Karlsruhe University. His current research interests include knowledge management, ontology merging and mapping, ontology engineering and ontology-based applications.

    1. http://www.aifb.uni-karlsruhe.de/WBS/
    2. http://www.ontoprise.de
