
About this book

Intelligent Integration of Information presents a collection of chapters bringing the science of intelligent integration forward. The focus on integration defines tasks that increase the value of information when information from multiple sources is accessed, related, and combined.
This contributed volume has also been published as a special double issue of the Journal of Intelligent Information Systems (JIIS), Volume 6:2/3.



Foreword: Intelligent Integration of Information

This issue of JIIS is dedicated to topics in Intelligent Integration of Information (I3). I3 represents a field in Information Systems that closely parallels the objectives of this journal. The focus on Integration defines tasks that increase the value of information when information from multiple sources is accessed, related, and combined. The problem to be addressed in this context is that integration over the ever-expanding number of resources available on-line leads to what customers perceive as information overload. In actuality, the customers experience data overload, making it nearly impossible to extract information sufficiently relevant to lead to decisions and action out of a haystack of data.
Gio Wiederhold

Query Reformulation for Dynamic Information Integration

The standard approach to integrating heterogeneous information sources is to build a global schema that relates all of the information in the different sources, and to pose queries directly against it. The problem is that schema integration is usually difficult, and as soon as any of the information sources change or a new source is added, the process may have to be repeated.
The SIMS system uses an alternative approach. A domain model of the application domain is created, establishing a fixed vocabulary for describing data sets in the domain. Using this language, each available information source is described. Queries to SIMS against the collection of available information sources are posed using terms from the domain model, and reformulation operators are employed to dynamically select an appropriate set of information sources and to determine how to integrate the available information to satisfy a query. This approach results in a system that is more flexible than existing ones, more easily scalable, and able to respond dynamically to newly available or unexpectedly missing information sources.
This paper describes the query reformulation process in SIMS and the operators used in it. We provide precise definitions of the reformulation operators and explain the rationale behind choosing the specific ones SIMS uses. We have demonstrated the feasibility and effectiveness of this approach by applying SIMS in the domains of transportation planning and medical trauma care.
Yigal Arens, Craig A. Knoblock, Wei-Min Shen
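
The source-selection idea in the abstract above can be illustrated with a minimal sketch. It is not the SIMS reformulation operators themselves, which the abstract does not spell out: it assumes each source is described by one domain-model concept and a set of attributes, and it uses a simple greedy cover. The names `Source`, `select_sources`, and the example sources are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical description of one information source in domain-model terms."""
    name: str
    concept: str            # domain-model concept the source holds data for
    attributes: frozenset   # domain-model attributes the source can supply
    available: bool = True


def select_sources(concept, wanted, sources):
    """Greedily pick available sources for `concept` until `wanted` is covered."""
    remaining = set(wanted)
    candidates = [s for s in sources if s.available and s.concept == concept]
    plan = []
    while remaining:
        # Choose the source that supplies the most still-missing attributes.
        best = max(candidates, key=lambda s: len(remaining & s.attributes), default=None)
        if best is None or not (remaining & best.attributes):
            raise LookupError(f"no available source supplies {sorted(remaining)}")
        plan.append(best.name)
        remaining -= best.attributes
    return plan


# Assumed example data: three sources describing seaports.
sources = [
    Source("ports_db", "Seaport", frozenset({"name", "depth"})),
    Source("geo_db", "Seaport", frozenset({"name", "location"})),
    Source("backup_geo", "Seaport", frozenset({"location"})),
]
plan = select_sources("Seaport", {"name", "depth", "location"}, sources)
```

Because the query names only domain-model terms, marking `geo_db` as unavailable and calling `select_sources` again yields a different plan without changing the query, which is the kind of flexibility the abstract attributes to this approach.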

Information Mediation in Cyberspace: Scalable Methods for Declarative Information Networks

An end-to-end discussion, from logical architecture to implementation, of issues and design decisions in declarative information networks is presented. A declarative information network is defined to be a dynamic and decentralized structure where value-added services are declared and applied as mediators in a scalable and controlled manner. A primary result is the need to adopt dynamically linked ontologies as the semantic basis for knowledge sharing in scalable networks. It is shown that data mining techniques provide a promising basis upon which to explore and develop this result. Our prototype system, entitled Mystique, is described in terms of KQML, distributed object management, and distributed agent execution. An example shows how we map our architecture into the World Wide Web (WWW) and transform the appearance of the WWW into an intelligently integrated and multi-subject distributed information network.
Son Dao, Brad Perry

An Approach to Information Mediation in the Industrial Domain

This paper presents an approach to information mediation in the industrial domain resulting from the NIIIP infrastructure. In this environment, representations of computing resources normally managed behind distinct organizational “firewalls” are shared by members of a Virtual Enterprise. Issues of descriptive heterogeneity, multiple sources of control, and semantic mismatch are addressed by providing a common representation of both physical resources and natural language tokens as linked objects. The focus of this paper is on the algorithm or “stopping rule” that causes the mediation portion of the system to be invoked to learn to resolve object/action level conflicts by adding high-level abstractions in the form of “triplet tokens” to the Virtual Enterprise’s Knowledge Base Management System.
Art Goldschmidt

NCL: A Common Language for Achieving Rule-Based Interoperability among Heterogeneous Systems

For achieving interoperability among heterogeneous computing systems, the Object Management Group (OMG) has adopted the Common Object Request Broker Architecture (CORBA) and the use of an Interface Definition Language (IDL) for specifying object properties and operations which encapsulate the data and programs of heterogeneous systems. This paper describes a common language which is an enhancement of IDL to include: 1) the semantic richness of EXPRESS, an information modeling language adopted by the ISO/STEP community for achieving product model and data exchange; and 2) the extensibility features and knowledge rule specification offered by the Object-oriented Semantic Association Model (OSAM*). This common language, named the NIIIP Common Language (NCL), is a part of the R&D effort of a project entitled the National Industrial Information Infrastructure Protocols (NIIIP). The design of NCL is standards-based, incorporating the semantic features of the two standard languages, IDL and EXPRESS, and conforming to their syntaxes as much as possible. It is an extensible language which supports the addition of new class, constraint, and association types to the language and its underlying object model in order to satisfy the diverse modeling needs of virtual enterprises. The language also contains a high-level rule specification component. Rules in NCL can be used for defining and enforcing integrity and security constraints, government or enterprise policies and regulations, and other types of semantic constraints that are local or global to heterogeneous systems. In this paper, we shall show how such a modeling language and its supporting KBMS functions can be used to achieve rule-based interoperability in an active heterogeneous system as an enhancement to OMG’s CORBA.
Stanley Y. W. Su, Herman Lam, Tsae-Feng Yu, Javier A. Arroyo-Figueroa, Zhidong Yang, Sooha Lee

Generating Data Integration Mediators that Use Materialization

This paper presents a framework for data integration that is based on using “Squirrel integration mediators” that use materialization to support integrated views over multiple databases. These mediators generalize techniques from active databases to provide incremental propagation of updates to the materialized views. A framework based on “View Decomposition Plans” for optimizing the support of materialized integrated views is introduced. The paper describes the Squirrel mediator generator currently under development, which can generate the mediators based on high-level specifications.
The integration of information by Squirrel mediators is expressed primarily through an extended version of a standard query language, which can refer to data from multiple information sources. In addition to materializing an integrated view of data, these mediators can monitor conditions that span multiple sources. The Squirrel framework also provides efficient support for the problem of “object matching”, that is, determining when object representations (e.g., OIDs) in different databases correspond to the same object-in-the-world, even if a universal key is not available.
To establish a context for the research, the paper presents a taxonomy that surveys a broad variety of approaches to supporting and maintaining integrated views.
Gang Zhou, Richard Hull, Roger King
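
Incremental propagation of updates to a materialized view, as mentioned in the abstract above, can be sketched as follows. This is a generic delta-maintenance illustration for a two-source equi-join, not the Squirrel mediator design or its View Decomposition Plans; the class `MaterializedJoin` and its methods are hypothetical names, and the mediator is assumed to keep local copies of both inputs.

```python
class MaterializedJoin:
    """Materialized view of R join S on a shared key, maintained from deltas."""

    def __init__(self):
        self.r = {}        # key -> set of values seen from source R
        self.s = {}        # key -> set of values seen from source S
        self.view = set()  # materialized tuples (key, r_value, s_value)

    def insert_r(self, key, value):
        """Apply an insert from R; only the new tuple is joined against S."""
        self.r.setdefault(key, set()).add(value)
        delta = {(key, value, sv) for sv in self.s.get(key, ())}
        self.view |= delta
        return delta

    def insert_s(self, key, value):
        """Apply an insert from S; only the new tuple is joined against R."""
        self.s.setdefault(key, set()).add(value)
        delta = {(key, rv, value) for rv in self.r.get(key, ())}
        self.view |= delta
        return delta

    def delete_r(self, key, value):
        """Apply a delete from R by removing exactly the tuples it contributed."""
        self.r.get(key, set()).discard(value)
        delta = {(key, value, sv) for sv in self.s.get(key, ())}
        self.view -= delta
        return delta

    def recompute(self):
        """Full recomputation, used here only to check the incremental result."""
        return {(k, rv, sv)
                for k, rvs in self.r.items()
                for rv in rvs
                for sv in self.s.get(k, ())}
```

Each update touches only the tuples that share its key, so the cost depends on the size of the change rather than the size of the sources; `recompute` is there to confirm that the maintained view matches a from-scratch join.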

CoBase: A Scalable and Extensible Cooperative Information System

A new generation of information systems that integrates knowledge base technology with database systems is presented for providing cooperative (approximate, conceptual, and associative) query answering. Based on the database schema and application characteristics, data are organized into Type Abstraction Hierarchies (TAHs). The higher levels of the hierarchy provide a more abstract data representation than the lower levels. Generalization (moving up in the hierarchy), specialization (moving down the hierarchy), and association (moving between hierarchies) are the three key operations in deriving cooperative query answers for the user. Based on the context, the TAHs can be constructed automatically from databases. An intelligent dictionary/directory in the system lists the location and characteristics (e.g., context and user type) of the TAHs. CoBase also has a relaxation manager to provide control for query relaxations. In addition, an explanation system is included to describe the relaxation and association processes and to report the quality of the relaxed answers. CoBase uses a mediator architecture to provide scalability and extensibility. Each cooperative module, such as relaxation, association, explanation, and TAH management, is implemented as a mediator. Further, an intelligent directory mediator is provided to direct mediator requests to the appropriate service mediators. Mediators communicate with each other via KQML. The GUI includes a map server which allows users to specify queries graphically and incrementally on the map, greatly improving querying capabilities. CoBase has been demonstrated to answer imprecise queries for transportation and logistic planning applications. Currently, we are applying the CoBase methodology to the matching of medical image (X-ray, MRI) features and to the approximate matching of emitter signals in electronic warfare applications.
Wesley W. Chu, Hua Yang, Kuorong Chiang, Michael Minock, Gladys Chow, Chris Larson
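
The generalization and specialization operations described in the abstract above can be sketched on a toy hierarchy. This is only an illustration of the idea: it assumes a TAH is a simple tree given as a child-to-parent mapping, and the class name, method names, and place names are hypothetical rather than taken from CoBase.

```python
from collections import defaultdict


class TypeAbstractionHierarchy:
    """Toy TAH: a tree given as a child -> parent mapping (an assumption)."""

    def __init__(self, parent_of):
        self.parent_of = dict(parent_of)
        self.children_of = defaultdict(list)
        for child, parent in self.parent_of.items():
            self.children_of[parent].append(child)

    def leaves(self, node):
        """Specialization: all concrete values found below `node`."""
        kids = self.children_of.get(node)
        if not kids:
            return {node}
        result = set()
        for kid in kids:
            result |= self.leaves(kid)
        return result

    def relax(self, value, levels=1):
        """Generalize `value` by moving up `levels` steps, then specialize again."""
        node = value
        for _ in range(levels):
            node = self.parent_of.get(node, node)  # stay at the root if reached
        return self.leaves(node)


# Assumed example hierarchy over locations.
tah = TypeAbstractionHierarchy({
    "Tunis": "North Africa",
    "Bizerte": "North Africa",
    "Lagos": "West Africa",
    "North Africa": "Africa",
    "West Africa": "Africa",
})
nearby = tah.relax("Tunis", levels=1)
```

A query that finds no exact answer for `"Tunis"` could be retried with the values in `nearby`, and relaxed by a further level if that still returns nothing.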

Integrating Information via Matchmaking

Trends such as the massive increase in information available via electronic networks, the use of on-line product data by distributed concurrent engineering teams, and dynamic supply chain integration for electronic commerce are placing severe burdens on traditional methods of information sharing and retrieval. Sources of information are far too numerous and dynamic to be found via traditional information retrieval methods, and potential consumers are seeing increased need for automatic notification services. Matchmaking is an approach based on emerging information integration technologies whereby potential producers and consumers of information send messages describing their information capabilities and needs. These descriptions, represented in rich, machine-interpretable description languages, are unified by the matchmaker to identify potential matches. Based on the matches, a variety of information brokering services are performed. We introduce matchmaking, and argue that it permits large numbers of dynamic consumers and providers, operating on rapidly-changing data, to share information more effectively than via traditional methods. Two matchmakers are described: the SHADE matchmaker, which operates over logic-based and structured text languages, and the COINS matchmaker, which operates over free text. These matchmakers have been used for a variety of applications, most significantly in the domains of engineering and electronic commerce. We describe our experiences with the SHADE and COINS matchmakers, and we outline the major observed benefits and problems of matchmaking.
Daniel Kuokka, Larry Harada
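
The matchmaking pattern in the abstract above, where providers advertise capabilities, consumers state needs, and the matchmaker notifies each side of matches, can be sketched in a few lines. This sketch replaces the rich description languages and unification used by SHADE and COINS with plain keyword sets and a subset test, which is an assumption made for brevity; `Matchmaker`, `advertise`, and `request` are hypothetical names.

```python
class Matchmaker:
    """Keyword-set matchmaker: a need matches an offer that contains all its terms."""

    def __init__(self):
        self.offers = {}  # provider -> frozenset of advertised terms
        self.needs = {}   # consumer -> frozenset of requested terms

    def advertise(self, provider, terms):
        """Record an advertisement; return consumers to notify about it."""
        offer = frozenset(terms)
        self.offers[provider] = offer
        return sorted(c for c, need in self.needs.items() if need <= offer)

    def request(self, consumer, terms):
        """Record a standing request; return providers that already satisfy it."""
        need = frozenset(terms)
        self.needs[consumer] = need
        return sorted(p for p, offer in self.offers.items() if need <= offer)
```

Because requests are stored rather than answered once, a consumer whose need is unmet today is returned by `advertise` when a suitable provider appears later, which corresponds to the automatic notification service the abstract mentions.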

Glossary: Intelligent Integration of Information

This glossary and its base vocabulary were initially established during the Intelligent Integration of Information (I3) Architecture Meeting in Boulder, CO, in November 1994, sponsored by ARPA and organized by Roger King and Richard Hull of the University of Colorado. It was subsequently refined during an I3 architecture meeting held in January 1995, organized by Mike Genesereth of Stanford University. It has received inputs from many members of the community, although closure on the I3 architecture specification itself has not yet been achieved. The architecture document is currently being maintained at George Mason University, as 〈http://isse.gmu.edu//I3_Arch/index.html〉. Related material on I3 technology can be found in files of the Stanford Logic Group 〈http://logic.stanford.edu/...〉.
Gio Wiederhold

