nach oben

2004 | Buch

Kapitel lesen Erstes Kapitel lesen

Foundations of Information and Knowledge Systems

Third International Symposium, FoIKS 2004 Wilheminenburg Castle, Austria, February 17-20, 2004 Proceedings

herausgegeben von: Dietmar Seipel, José María Turull-Torres

Verlag: Springer Berlin Heidelberg

Buchreihe : Lecture Notes in Computer Science

Enthalten in: Professional Book Archive

Einloggen, um Zugang zu erhalten

Inhaltsverzeichnis

Frontmatter

Invited Talks

Hypergraph Transversals

Abstract

Hypergraph Transversals have been studied in Mathematics for a long time, cf. [2]. Generating minimal transversals of a hypergraph is an important problem which has many applications in Computer Science, especially in database Theory, Logic, and AI. We briefly survey some results on problems which are known to be related to computing the transversal hypergraph, where we focus on problems in database theory, propositional Logic and AI (for a more detailed survey and further references cf. [10]).

Georg Gottlob

Abstract State Machines: An Overview of the Project

Abstract

This is an extended abstract of an invited talk at the Third International Symposium on Foundations of Information and Knowledge Systems to be held in Vienna, Austria, in February 2004.

We quickly survey the ASM project, from its foundational roots to industrial applications.

Yuri Gurevich

Regular Papers

Database Repair by Signed Formulae

Abstract

We introduce a simple and practically efficient method for repairing inconsistent databases. The idea is to properly represent the underlying problem, and then use off-the-shelf applications for efficiently computing the corresponding solutions.

Given a possibly inconsistent database, we represent the possible ways to restore its consistency in terms of signed formulae. Then we show how the ‘signed theory’ that is obtained can be used by a variety of computational models for processing quantified Boolean formulae, or by constraint logic program solvers, in order to rapidly and efficiently compute desired solutions, i.e., consistent repairs of the database.

Ofer Arieli, Marc Denecker, Bert Van Nuffelen, Maurice Bruynooghe

Simplification of Integrity Constraints for Data Integration

Abstract

When two or more databases are combined into a global one, integrity may be violated even when each database is consistent with its own local integrity constraints. Efficient methods for checking global integrity in data integration systems are called for: answers to queries can then be trusted, because either the global database is known to be consistent or suitable actions have been taken to provide consistent views. The present work generalizes simplification techniques for integrity checking in traditional databases to the combined case. Knowledge of local consistency is employed, perhaps together with given a priori constraints on the combination, so that only a minimal number of tuples needs to be considered. Combination from scratch, integration of a new source, and absorption of local updates are dealt with for both the local-as-view and global-as-view approaches to data integration.

Henning Christiansen, Davide Martinenghi

On the Security of Individual Data

Abstract

We will consider the following problem in this paper:

Assume there are n numeric data {x ₁,x ₂,...,x _n} (like salaries of n individuals) stored in a database and some subsums of these numbers are disclosed by making public or just available for persons not eligible to learn the original data. Our motivating question is: at most how many of these subsums may be disclosed such that none of the numbers x ₁,x ₂,...,x _n can be uniquely determined from these sums. These types of problems arise in the cases when certain tasks concerning a database are done by subcontractors who are not eligible to learn the elements of the database, but naturally should be given some data to fulfill there task. In database theory such examples are called statistical databases as they are used for statistical purposes and no individual data are supposed to be obtained using a restricted list of SUM queries. This problem was originally introduced by Chin and Ozsoyoglu [1], originally solved by Miller et al. [5] and revisited by Griggs [4].

It turned out [5] that the problem is equivalent to the following question: If there are n real, non-zero numbers X={x ₁,x ₂,...,x _n} given, what is the maximum number of 0 subsums of it, that is, what is the maximum number of the subsets of X whose elements sum up to 0. This approach, together with the Sperner theorem shows that no more than \(\left(\begin{array}{c} n \\ n/2 \\ \end{array}\right)\) sub-sums of a given set of secure data may be disclosed without disclosing at least one of the data, which upper bound is sharp as well.

However, it is natural to assume that the disclosed sub-sums of the original elements of the database will contain only a limited number of elements, say at most k (in the applications databases are usually huge, while the number of operations is in most of the cases limited). We have now the same question: at most how many of these subsums of at most k members may be given such that none of the numbers x ₁,x ₂,...,x _n can be uniquely determined from these sums. The main result of this paper gives an upper bound on this number, which turns out to be sharp if we allow subsums of only k or k-1 members and asymptotically sharp in case of subsums of at most k members.

János Demetrovics, Gyula O. H. Katona, Dezső Miklós

Implementing Ordered Choice Logic Programming Using Answer Set Solvers

Abstract

Ordered Choice Logic Programming (OCLP) allows for dynamic preference-based decision-making with multiple alternatives without the need for any form of negation. This complete absence of negation does not weaken the language as both forms (classical and as-failure) can be intuitively simulated in the language and eliminated using a simple pre-processor, making it also an easy language for users less familiar with logic programming. The semantics of the language is based on the preference between alternatives, yielding both a skeptical and a credulous approach. In this paper we demonstrate how OCLPs can be translated to semi-negative logic programs such that, depending on the transformation, the answer sets of the latter correspond with the skeptical/credulous answer sets of the former. By providing such a mapping, we have a mechanism for implementing OCLP using answer set solvers like Smodels or dlv. We end with a discussion of the complexity of our system and the reasoning tasks it can perform.

Marina De Vos

Skyline Cardinality for Relational Processing

How Many Vectors Are Maximal?

Abstract

The skyline clause—also called the Pareto clause—recently has been proposed as an extension to SQL. It selects the tuples that are Pareto optimal with respect to a set of designated skyline attributes. This is the maximal vector problem in a relational context, but it represents a powerful extension to SQL which allows for the natural expression of on-line analytic processing (OLAP) queries and preferences in queries.

Cardinality estimation of skyline sets is the focus in this work. A better understanding of skyline cardinality—and other properties of the skyline—is useful for better design of skyline algorithms, is necessary to extend a query optimizer’s cost model to accommodate skyline queries, and helps to understand better how to use skyline effectively for OLAP and preference queries.

Within a basic model with assumptions of sparseness of values on attributes’ domains and statistical independence across attributes, we establish the expected skyline cardinality for skyline queries. While asymptotic bounds have been previously established, they are not widely known nor applied in skyline work. We show concrete estimates, as would be needed in a cost model, and consider the nature of the distribution of skyline. We next establish the effects on skyline cardinality as the constraints on our basic model are relaxed. Some of the results are quite counter-intuitive, and understanding these is critical to skyline’s use in OLAP and preference queries. We consider when attributes’ values repeat on their domains, and show the number of skyline is diminished. We consider the effects of having Zipfian distributions on the attributes’ domains, and generalize the expectation for other distributions. Last, we consider the ramifications of correlation across the attributes.

Parke Godfrey

Query Answering and Containment for Regular Path Queries under Distortions

Abstract

We give a general framework for approximate query processing in semistructured databases. We focus on regular path queries, which are the integral part of most of the query languages for semistructured databases. To enable approximations, we allow the regular path queries to be distorted. The distortions are expressed in the system by using weighted regular expressions, which correspond to weighted regular transducers. After defining the notion of weighted approximate answers we show how to compute them in order of their proximity to the query. In the new approximate setting, query containment has to be redefined in order to take into account the quantitative proximity information in the query answers. For this, we define approximate containment, and its variants k-containment and reliable containment. Then, we give an optimal algorithm for deciding the k-containment. Regarding the reliable approximate containment, we show that it is polynomial time equivalent to the notorious limitedness problem in distance automata.

Gösta Grahne, Alex Thomo

Weak Functional Dependencies in Higher-Order Datamodels

–The Case of the Union Constructor–

Abstract

We present an axiomatisation for weak functional dependencies, i.e. disjunctions of functional dependencies, in the presence of several constructors for complex values. These constructors are the tuple constructor, the set-constructor, an optionality constructor, and a union constructor. The theory is smooth and rather uniform, if the union-constructor is absent. Its presence, however, complicates all results and proofs significantly. The reason for this is that the union-constructor comes alomg with non-trivial restructuring rules. In particular, if the union-constructor is absent, a subset of the rules is complete for the implication of ordinary functional dependencies, but this does not hold, if the union constructor is present.

Sven Hartmann, Sebastian Link, Klaus-Dieter Schewe

Reasoning about Functional and Multi-valued Dependencies in the Presence of Lists

Abstract

Nested lists are used as a data structure whenever order matters. List types are therefore supported by many advanced data models such as genomic sequence, deductive and object-oriented data models including XML.

It is studied what impact the presence of the finite list type has on the two most important classes of relational dependencies. A finite axiomatisation of functional and multi-valued dependencies in databases supporting base, record and finite list types is presented. In order to capture different data models at a time, an abstract algebraic approach based on nested attributes and subtyping is taken. This algebraic framework together with a new inference rule allowing to derive non-trivial functional dependencies from multi-valued dependencies make the generalisation of the well-known theory from the relational data model natural.

Sven Hartmann, Sebastian Link, Klaus-Dieter Schewe

The Relative Complexity of Updates for a Class of Database Views

Abstract

It is well known that the complexity of testing the correctness of an arbitrary update to a database view can be far greater than the complexity of testing a corresponding update to the main schema. However, views are generally managed according to some protocol which limits the admissible updates to a subset of all possible changes. The question thus arises as to whether there is a more tractable relationship between these two complexities in the presence of such a protocol. In this paper, this question is answered in the affirmative for closed update strategies, which are based upon the constant-complement approach of Bancilhon and Spyratos. Working within a very general framework which is independent of any particular data model, but which recaptures relational schemata constrained by so-called equality-generating dependencies (EGDs), (which include functional dependencies (FDs)), it is shown that the complexity of testing the correctness of a view update which follows a closed update strategy is no greater than that of testing a corresponding update to the main schema. In particular, if the main schema is relational and constrained by FDs, then there exists a set of FDs on the view, against which any candidate update may be tested for correctness. This holds even though the entire view may not be finitely axiomatizable, much less constrained by FDs alone.

Stephen J. Hegner

Equivalence of OLAP Dimension Schemas

Abstract

Dimension schemas are abstract models of the data hierarchies that populate OLAP warehouses. Although there is abundant work on schema equivalence in a variety of data models, these works do not cover dimension schemas. In this paper we propose a notion of equivalence that allows to compare dimension schemas with respect to their information capacity. The proposed notion is intended to capture dimension schema equivalence in the context of OLAP schema restructuring. We offer characterizations of schema equivalence in terms of graph and schema isomorphisms, and present algorithms for testing it in well known classes of OLAP dimension schemas. Our results also permit to compare the expressiveness of different known classes of dimension schemas.

Carlos A. Hurtado, Claudio Gutiérrez

A New Approach to Belief Modeling

Abstract

It has been shown that, despite the differences in approach and interpretation, all belief function based models without the so-called dynamic component lead essentially to mathematically equivalent theories – at least in the finite case. In this paper, we first argue that at the logical level these models seem to share a common formal framework and various interpretations just come at the epistemic level. We then introduce a framework for belief modeling formally based on Dempster’s structure with adopting Smets’ view of the origin of beliefs. It is shown that the proposed model is more general than previous models, and may provide a suitable unified framework for belief modeling.

V. N. Huynh, Y. Nakamori, T. Murai, T. B. Ho

Computer-Oriented Calculi of Sequent Trees

Abstract

The problem of construction of a computer-oriented technique for inference search based on a certain sequent formalism for first-order classical logic with equality is solved. For this, special calculi of so-called sequent trees are constructed. The following features are inherent to the tree calculi: (i) preliminary skolemization is used for increasing their proof search efficiency with the help of a technique for finding the most general simultaneous unifier, (ii) every calculus is completely induced by a correspondent sequent calculus, (iii) any transformation of a sequent tree is defined by an appropriate (propositional) rule, and what’s more, (iv) certain kinds of the paramodulation rule are added to the sequent tree calculi. For all the calculi, some results about their soundness and completeness are given. Note that an approach under consideration can give a possibility to incorporate the proposed paramodulution technique into, for example, different modifications of the model elimination method, of goal-oriented sequent calculi, and of the tableaux method.

Alexander Lyaletski

On Updates of Logic Programs: A Properties-Based Approach

Abstract

We have studied the update operator ⊕ defined in [4] without tautologies and we have observed that satisfies an interesting property. This property is similar to one postulate proposed by AGM but, in this case for nonmonotonic logic and that we called WIS. Also, we consider other five additional basic properties about update programs and we show that ⊕ satisfies them. So, this work continues the analysis about the AGM postulates with respect to operator ⊕ under the refinated view that includes knowledge and beliefs that we began in a recent previous paper and that satisfies the WIS property for closed programs under tautologies.

Mauricio Osorio, Fernando Zacarías

Minimal Keys in Higher-Order Datamodels

Abstract

We study keys in higher-order datamodels. We show that they are equivalent with certain ideals. Based on that we introduce an ordering between key sets, and investigate systems of minimal keys. We give a sufficient condition for a Sperner-family of SHL-ideals being system of minimal keys, and give lower and upper bounds for the size of the smallest Armstrong-instance.

Attila Sali

Similarity Relational Calculus and Its Reduction to a Similarity Algebra

Abstract

Traditional database query languages are based on set theory and crisp logic. Many applications, however, need similarity or retrieval-like queries producing results with truth values from the interval [0,1]. Such truth values can be regarded as continuous membership values of tuples expressing how strongly a query is matched. Formulating queries by applying existing similarity relational algebras means to express the user’s need in a procedural manner. In order to support a declarative way of formulating queries, we generalize the classical relational domain calculus by incorporating fuzzy operations and user weights. Besides defining syntax and semantics we show how to map any calculus expression onto a corresponding similarity algebra expression. In this way, we present a theoretical foundation for a declarative query language combining retrieval functionality and traditional relational databases.

Ingo Schmitt, Nadine Schulz

Challenges in Fixpoint Computation with Multisets

Abstract

Uncertainty management has been a challenging issue in AI and database research. Logic database programming with its declarative advantage and its top-down and bottom-up query processing techniques has been an attractive formalism for representing and manipulating uncertain information, and numerous frameworks with uncertainty has been proposed. These proposals address fundamental issues of modeling, semantics, query processing and optimization, however, one important issue which remains unaddressed is efficient implementation of such frameworks. In this paper, we illustrate that the standard semi-naive evaluation method does not have a counterpart in general in these frameworks. We then propose a desired semi-naive algorithm, which extends the corresponding standard method, and establish its equivalence with the naive method with uncertainty. We implemented the algorithm and conducted numerous tests. Our experimental results indicate that the proposed technique is practical and supports efficient fixpoint computation with uncertainty. We believe that the method is also useful in a more general context of fixpoint computation with aggregations.

Nematollaah Shiri, Zhi Hong Zheng

Towards a Generalized Interaction Scheme for Information Access

Abstract

We introduce the formal framework of a generalized interaction scheme for information access between users and information sources. Within this framework we describe an interaction manager which supports more complex interaction schemes than those that are supported by existing systems, including: query by example, answer enlargement/reduction, query relaxation/restriction, index relaxation/contraction, “relevance” feedback, and adaptation facilities. We give the foundations of this interaction manager from a mathematical point of view, in terms of an abstract view of an information source.

Yannis Tzitzikas, Carlo Meghini, Nicolas Spyratos

Plan Databases: Model and Algebra

Abstract

Despite the fact that thousands of applications manipulate plans, there has been no work to date on managing large databases of plans. In this paper, we first propose a formal model of plan databases. We describe important notions of consistency and coherence for such databases. We then propose a set of operators similar to the relational algebra to query such databases of plans.

Fusun Yaman, Sibel Adali, Dana Nau, Maria L. Sapino, V. S. Subrahmanian

Backmatter

Titel: Foundations of Information and Knowledge Systems
herausgegeben von: Dietmar Seipel
José María Turull-Torres
Verlag: Springer Berlin Heidelberg
Electronic ISBN: 978-3-540-24627-5
Print ISBN: 978-3-540-20965-2
DOI: https://doi.org/10.1007/b95342

Springer Professional

Inhaltsverzeichnis

Frontmatter

Invited Talks

Hypergraph Transversals

Abstract State Machines: An Overview of the Project

Regular Papers

Database Repair by Signed Formulae

Simplification of Integrity Constraints for Data Integration

On the Security of Individual Data

Implementing Ordered Choice Logic Programming Using Answer Set Solvers

Skyline Cardinality for Relational Processing

Query Answering and Containment for Regular Path Queries under Distortions

Weak Functional Dependencies in Higher-Order Datamodels

Reasoning about Functional and Multi-valued Dependencies in the Presence of Lists

The Relative Complexity of Updates for a Class of Database Views

Equivalence of OLAP Dimension Schemas

A New Approach to Belief Modeling

Computer-Oriented Calculi of Sequent Trees

On Updates of Logic Programs: A Properties-Based Approach

Minimal Keys in Higher-Order Datamodels

Similarity Relational Calculus and Its Reduction to a Similarity Algebra

Challenges in Fixpoint Computation with Multisets

Towards a Generalized Interaction Scheme for Information Access

Plan Databases: Model and Algebra

Backmatter