
About This Book

Managing uncertainty and inconsistency has been extensively explored in Artificial Intelligence over a number of years. Now with the advent of massive amounts of data and knowledge from distributed, heterogeneous, and potentially conflicting sources, there is interest in developing and applying formalisms for uncertainty and inconsistency widely in systems that need to better manage this data and knowledge. The annual International Conference on Scalable Uncertainty Management (SUM) has grown out of this wide-ranging interest in managing uncertainty and inconsistency in databases, the Web, the Semantic Web, and AI. It aims at bringing together all those interested in the management of large volumes of uncertainty and inconsistency, irrespective of whether they are in databases, the Web, the Semantic Web, or in AI, as well as in other areas such as information retrieval, risk analysis, and computer vision, where significant computational efforts are needed. After a promising First International Conference on Scalable Uncertainty Management was held in Washington DC, USA in 2007, the conference series has been successfully held in Napoli, Italy, in 2008, and again in Washington DC, USA, in 2009.



Invited Talks

Markov Chain Monte Carlo and Databases


Several currently ongoing research efforts aim to combine Markov Chain Monte Carlo (MCMC) methods with database management systems. The goal is to scale up the management of uncertain data in contexts where only MCMC is known to be applicable or where the range and flexibility of MCMC provide a compelling proposition for powerful and interesting systems. This talk surveys recent work in this area and identifies open research challenges.

Christoph Koch
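As a rough, self-contained illustration of the idea (not of any particular system surveyed in the talk), the sketch below runs a single-flip Metropolis sampler over the possible worlds of a tuple-independent uncertain relation to estimate the probability that a Boolean query holds; the relation, its confidence values, and the query are invented for the example.

```python
import random

# Tuple-independent uncertain relation: tuple id -> marginal probability
# (hypothetical data for illustration)
PROBS = {"t1": 0.9, "t2": 0.5, "t3": 0.2}

def mcmc_query_prob(query, probs, steps=20000, seed=0):
    """Estimate P(query) with a single-flip Metropolis sampler whose
    stationary distribution is the independent-tuple distribution."""
    rng = random.Random(seed)
    world = {t: rng.random() < p for t, p in probs.items()}
    tuples = list(probs)
    hits = 0
    for _ in range(steps):
        t = rng.choice(tuples)  # propose flipping one tuple in/out
        p = probs[t]
        ratio = p / (1 - p) if not world[t] else (1 - p) / p
        if rng.random() < min(1.0, ratio):
            world[t] = not world[t]
        if query(world):
            hits += 1
    return hits / steps

# Boolean query: does the sampled world contain t1 or t2?
# Exact answer: 1 - 0.1 * 0.5 = 0.95
est = mcmc_query_prob(lambda w: w["t1"] or w["t2"], PROBS)
```

Real MCMC-in-database systems work over far richer models (e.g., factor graphs inside the DBMS); this toy only shows why sampling gives scalable approximate confidences where exact inference does not.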

Answer Set Programming, the Solving Paradigm for Knowledge Representation and Reasoning

Answer Set Programming (ASP; [1,2,3,4]) is a declarative problem solving approach, combining a rich yet simple modeling language with high-performance solving capacities. ASP is particularly suited for modeling problems in the area of Knowledge Representation and Reasoning involving incomplete, inconsistent, and changing information. From a formal perspective, ASP allows for solving all search problems in NP (and NP^NP) in a uniform way (being more compact than SAT). Applications of ASP include automatic synthesis of multiprocessor systems, decision support systems for NASA shuttle controllers, reasoning tools in systems biology, and many more. The versatility of ASP is also reflected by the ASP solver clasp [5,6,7], developed at the University of Potsdam, which won first places at ASP'09, PB'09, and SAT'09.

The talk will give an overview of ASP, its modeling language and solving methodology, and portray some of its applications.

Torsten Schaub
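To make the stable-model semantics concrete, the following brute-force sketch enumerates the answer sets of a tiny ground normal program via the Gelfond-Lifschitz reduct; it illustrates only the definition and says nothing about how a solver like clasp actually searches.

```python
from itertools import combinations

def reduct(rules, candidate):
    """Gelfond-Lifschitz reduct: drop rules blocked by the candidate set,
    strip negative bodies from the remaining rules."""
    return [(head, pos) for head, pos, neg in rules
            if not set(neg) & candidate]

def least_model(pos_rules):
    """Least model of a negation-free program (naive fixpoint)."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in pos_rules:
            if set(pos) <= model and head not in model:
                model.add(head)
                changed = True
    return model

def answer_sets(rules, atoms):
    """Enumerate all stable models by checking every candidate set."""
    result = []
    for r in range(len(atoms) + 1):
        for cand in combinations(sorted(atoms), r):
            cand = set(cand)
            if least_model(reduct(rules, cand)) == cand:
                result.append(cand)
    return result

# p :- not q.   q :- not p.   (rules as (head, positive body, negative body))
rules = [("p", [], ["q"]), ("q", [], ["p"])]
models = answer_sets(rules, {"p", "q"})
```

The two answer sets {p} and {q} show the characteristic nondeterminism that ASP exploits for encoding search problems.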

Discussant Contributions

Graphical and Logical-Based Representations of Uncertain Information in a Possibility Theory Framework

Developing efficient approaches for reasoning under uncertainty is an important issue in many applications. Several graphical [3] and logic-based methods have been proposed for reasoning with incomplete information in various uncertainty theory frameworks. This paper focuses on possibility theory, which is a convenient uncertainty framework for representing different kinds of prioritized pieces of information. It provides a brief overview of the main compact representation formats, and their associated inference tools, that exist in the possibility theory framework.

Salem Benferhat
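As a minimal worked example of the basic possibilistic notions underlying these representation formats, the sketch below computes possibility and necessity degrees from a normalized possibility distribution; the weather states and degrees are invented.

```python
def possibility(pi, event):
    """Pi(A) = max of pi over the worlds in A."""
    return max((pi[w] for w in event), default=0.0)

def necessity(pi, event):
    """N(A) = 1 - Pi(complement of A): the degree to which A is certain."""
    return 1.0 - possibility(pi, set(pi) - set(event))

# Normalized possibility distribution (invented example): max degree is 1
pi = {"sunny": 1.0, "cloudy": 0.7, "rainy": 0.3}
```

Necessity grades certainty while possibility grades mere consistency, which is what lets possibilistic formats encode priorities among pieces of information.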

Probabilistic Data: A Tiny Survey

In this survey, we visit existing projects and proposals for uncertain data, all of which support probabilistic handling of confidence scores.

Several models for uncertain data have been proposed over the years. Initial efforts all focused on relational data [3], and current efforts continue in the relational setting [10,4,5,7,2]. With relational data models, two methods of associating confidences with data are commonly used.

Ander de Keijzer
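Assuming the common tuple-level model, in which each tuple carries an independent confidence score (one of the ways of associating confidences alluded to above), the probability that a selection returns at least one tuple has a simple closed form:

```python
def prob_nonempty(confidences):
    """Probability that a selection returns at least one tuple, assuming
    independent tuple-level confidence scores: 1 - prod(1 - p_i)."""
    q = 1.0
    for p in confidences:
        q *= 1.0 - p
    return 1.0 - q
```

For two matching tuples with confidences 0.9 and 0.5, the result is 1 - 0.1 * 0.5 = 0.95; attribute-level models require per-value distributions instead.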

The Role of Epistemic Uncertainty in Risk Analysis

The notion of uncertainty has been a controversial issue for a long time. In particular, the prominence of probability theory in the scientific arena has blurred some distinctions present from its inception, namely between uncertainty due to the variability of physical phenomena and uncertainty due to a lack of information. The Bayesian school claims that, whatever its origin, uncertainty can be modeled by single probability distributions [18].

Didier Dubois

Uncertainty in Clustering and Classification

Clustering and classification are among the most important problem tasks in the realm of data analysis, data mining and machine learning. In fact, while clustering can be seen as the most popular representative of unsupervised learning, classification (together with regression) is arguably the most frequently considered task in supervised learning. Even though the literature on clustering and classification abounds, interest in these topics seems undiminished, both from a research and an application point of view.

Eyke Hüllermeier

Information Fusion

Merging information coming from different sources is an important issue in various domains of computer science, such as knowledge representation for artificial intelligence, decision making, and databases. The aim of fusion is to obtain a global point of view by exploiting the complementarity between sources, resolving existing conflicts, and reducing possible redundancies.

Odile Papini

Use of the Domination Property for Interval Valued Digital Signal Processing

The imprecise probability framework is usually dedicated to decision processes. In recent work, we have shown that this framework can also be used to compute an interval-valued signal containing all outputs of processes involving a coherent family of conventional linear filters. This approach is based on a very straightforward extension of the expectation operator involving appropriate concave capacities.

Olivier Strauss

Regular Contributions

Managing Lineage and Uncertainty under a Data Exchange Setting

We present a data exchange framework that is capable of exchanging uncertain data with lineage and giving meaningful certain answers to queries posed on the target schema. The data are stored in a database with uncertainty and lineage (ULDB), which represents a set of possible instances that are databases with lineage (LDBs). Hence we first need to revisit all the notions related to data exchange for the case of LDBs. Producing all possible instances of a ULDB, as the semantics of certain answers would indicate, is exponential. We present a more efficient approach: a u-chase algorithm that extends the known chase procedure of traditional data exchange, and we show that it can be used to correctly compute certain answers for conjunctive queries in PTIME for a set of weakly acyclic tuple-generating dependencies. We further show that if we allow equality-generating dependencies in the set of constraints, then computing certain answers for conjunctive queries becomes NP-hard.

Foto N. Afrati, Angelos Vasilakopoulos

A Formal Analysis of Logic-Based Argumentation Systems

Dung’s abstract argumentation model consists of a set of arguments and a binary relation encoding attacks among arguments. Different acceptability semantics have been defined for evaluating the arguments. What is worth noticing is that the model completely abstracts from the applications to which it can be applied. Thus, it is not clear what results each semantics can return in a given application. This paper answers this question. For that purpose, we start by grounding the model in a real application. That is, we assume that we have an inconsistent knowledge base (KB) containing formulas of an abstract monotonic logic. From this base, we show how to define arguments. Then, we characterize the different semantics in terms of the subsets of the KB that are returned by each extension. We show a full correspondence between maximal consistent subbases of a KB and maximal conflict-free sets of arguments. We show also that stable and preferred extensions choose some consistent subbases of a base. Finally, we investigate the results of three argumentation systems that use well-known attack relations.

Leila Amgoud, Philippe Besnard

Handling Inconsistency with Preference-Based Argumentation

Argumentation is a promising approach for handling inconsistent knowledge bases, based on the justification of plausible conclusions by arguments. Due to inconsistency, arguments may be attacked by counterarguments. The problem is thus to evaluate the arguments in order to select the most acceptable ones.

The aim of this paper is to build a bridge between the argumentation-based and the coherence-based approaches for handling inconsistency. We are particularly interested in the case where priorities between the formulas of an inconsistent knowledge base are available. For that purpose, we use the rich preference-based argumentation framework (PAF) we proposed in earlier work. A rich PAF has two main advantages: i) it overcomes the limits of existing PAFs, and ii) it encodes two different roles of preferences between arguments (handling critical attacks and refining the evaluation of arguments). We show that there exist full correspondences between particular cases of these PAFs and two well-known coherence-based approaches, namely preferred sub-theories and democratic sub-theories.

Leila Amgoud, Srdjan Vesic

A Possibility Theory-Oriented Discussion of Conceptual Pattern Structures

A fruitful analogy between possibility theory and formal concept analysis has recently helped show the interest of introducing new operators in the latter setting. In particular, another Galois connection, distinct from the classical one that defines formal concepts, has been laid bare, which allows for the decomposition of a formal context into sub-contexts when possible. This paper pursues a similar investigation by considering pattern structures, which are known to offer a generalization of formal concept analysis. The new operators as well as the other Galois connection are introduced in this framework, where an object is associated with a structured description rather than just with its set of properties. The description may take many different forms. In this paper, we focus on two important particular cases, namely ordered lists of intervals and propositional knowledge bases, which both allow for incomplete descriptions. They are then extended to fuzzy and uncertain descriptions by introducing fuzzy intervals and possibilistic logic bases, respectively, in these two settings.

Zainab Assaghir, Mehdi Kaytoue, Henri Prade

DK-BKM: Decremental K Belief K-Modes Method

This paper deals with dynamic clustering under uncertainty by developing a decremental K Belief K-modes method (DK-BKM). Our DK-BKM clustering method tackles the problem of decreasing the number of clusters in an uncertain context using the Transferable Belief Model (TBM).

The proposed approach generalizes the belief K-modes method (BKM) to a dynamic environment. Thus, the DK-BKM method provides a new clustering technique that handles uncertain categorical attribute values of dataset objects while allowing the number of clusters to change dynamically. The dissimilarity measure concept allows us to update the partition without performing complete reclustering. Experimental results of this dynamic approach show good performance on well-known benchmark datasets.

Sarra Ben Hariz, Zied Elouedi
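The K-modes family rests on a simple matching dissimilarity between categorical objects and cluster modes; the crisp core of that measure (without the belief-function refinement that BKM and DK-BKM add on top) can be sketched as follows, with invented attribute values.

```python
def matching_dissimilarity(obj, mode):
    """Classical K-modes dissimilarity: the number of categorical
    attributes on which an object and a cluster mode disagree."""
    return sum(1 for a, b in zip(obj, mode) if a != b)

# Invented objects over attributes (color, shape, size)
d = matching_dissimilarity(("red", "round", "small"),
                           ("red", "square", "small"))
```

Updating only these distances when a cluster is removed is what lets a decremental method reassign objects without reclustering from scratch.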

On the Use of Fuzzy Cardinalities for Reducing Plethoric Answers to Fuzzy Queries

Retrieving data from large-scale databases sometimes leads to plethoric answers, especially when queries are under-specified. To overcome this problem, we propose to strengthen the initial query with additional predicates. These predicates are selected from a set of predefined ones, principally according to their degree of semantic correlation with the initial query, in order to avoid an excessive modification of its initial scope. Based on the size of the initial answer set and the number of results expected by the user, fuzzy cardinalities are used to assess the reduction capability of these correlated predefined predicates.

Patrick Bosc, Allel Hadjali, Olivier Pivert, Grégory Smits
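As background on the fuzzy cardinalities used above, here is a sketch of two classical definitions, the scalar sigma-count and the FGCount fuzzy integer; which variant the paper uses is not stated here, and the membership values are invented.

```python
def sigma_count(memberships):
    """Scalar fuzzy cardinality: the sum of membership degrees."""
    return sum(memberships)

def fgcount(memberships):
    """FGCount: the degree to which the fuzzy set contains at least i
    elements is the i-th largest membership degree."""
    ranked = sorted(memberships, reverse=True)
    return {i + 1: mu for i, mu in enumerate(ranked)}

mu = [0.9, 0.4, 0.7]  # invented satisfaction degrees of three tuples
```

A predicate whose fuzzy cardinality over the current answer set is close to the user's expected result count is a good candidate for strengthening the query.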

From Bayesian Classifiers to Possibilistic Classifiers for Numerical Data

Naïve Bayesian classifiers are well-known for their simplicity and efficiency. They rely on independence hypotheses, together with a normality assumption, which may be too demanding when dealing with numerical data. Possibility distributions are more suitable for representing poor data. This paper investigates two kinds of possibilistic elicitation methods to be embedded into possibilistic naïve classifiers. The first is derived from a probability-possibility transformation of Gaussian distributions (or mixtures of them), which introduces some further tolerance. The second is based on a direct interpretation of data in fuzzy histogram or possibilistic formats that exploit an idea of proximity between attribute values in different ways. Besides, possibilistic classifiers may be allowed to leave the classification open between several classes in case of insufficient information for choosing one (which may be of interest when the number of classes is large). The reported experiments demonstrate the potential of possibilistic classifiers.

Myriam Bounhas, Khaled Mellouli, Henri Prade, Mathieu Serrurier
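One standard probability-possibility transformation for a symmetric unimodal density such as a Gaussian sets pi(x) = 2 * min(F(x), 1 - F(x)), where F is the cumulative distribution function; whether this exact variant is the one used in the paper is an assumption, but it illustrates the first kind of elicitation method.

```python
from math import erf, sqrt

def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def possibility_from_gaussian(x, mu, sigma):
    """pi(x) = 2 * min(F(x), 1 - F(x)): the possibility degree is the
    probability mass outside the tightest symmetric interval around the
    mode that still contains x."""
    F = gaussian_cdf(x, mu, sigma)
    return 2.0 * min(F, 1.0 - F)
```

The mode gets possibility 1, and a point 1.96 standard deviations out gets roughly 0.05, which is the extra tolerance the abstract mentions compared to the raw density.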

Plausibility of Information Reported by Successive Sources

This paper deals with the evaluation of a piece of information successively reported by several sources. It describes a model based on Dempster-Shafer theory in which the evaluation of a reported piece of information is defined by a plausibility degree. This value depends on the degrees to which the sources are correct and the degrees to which they are wrong.

Laurence Cholvy
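The mechanics can be illustrated with plain Shafer discounting, applied repeatedly as a report passes through sources of given reliabilities; note this is only a simplified stand-in for the paper's model, which distinguishes the degrees to which a source is correct from those to which it is wrong.

```python
OMEGA = "omega"  # label for the whole frame (total ignorance)

def discount(mass, r):
    """Shafer discounting: keep each focal mass with reliability r,
    transfer the remaining 1 - r to the whole frame."""
    out = {a: r * m for a, m in mass.items() if a != OMEGA}
    out[OMEGA] = 1.0 - r + r * mass.get(OMEGA, 0.0)
    return out

# A report 'a' passes through two sources with reliabilities 0.9 and 0.8
m = {"a": 1.0}
for r in (0.9, 0.8):
    m = discount(m, r)
# Bel(a) = m["a"]; Pl(not a) = m[OMEGA]
```

After two sources the belief in the report drops to 0.72, showing how successive reporting erodes support without ever contradicting the report outright.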

Combining Semantic Web Search with the Power of Inductive Reasoning

With the introduction of the Semantic Web as a future substitute for the Web, the key task for the Web, namely Web search, is evolving towards some novel form of Semantic Web search. A very promising recent approach to Semantic Web search is based on combining standard Web pages and search queries with ontological background knowledge, and using standard Web search engines as the main inference engine of Semantic Web search. In this paper, we continue this line of research. We propose to further enhance this approach by the use of inductive reasoning. This increases the robustness of Semantic Web search, as it adds the important ability to handle inconsistencies, noise, and incompleteness, which are all very likely to occur in distributed and heterogeneous environments such as the Web. In particular, inductive reasoning allows new knowledge, which is not logically deducible, to be inferred from training individuals. We also report on a prototype implementation of the new approach and its experimental evaluation.

Claudia d’Amato, Nicola Fanizzi, Bettina Fazzinga, Georg Gottlob, Thomas Lukasiewicz

Evaluating Trust from Past Assessments with Imprecise Probabilities: Comparing Two Approaches

In this paper, we consider a trust system where the trust in an agent is evaluated from past assessments made by other agents. We consider that trust is evaluated by values given on a finite scale. To model an agent's trustworthiness, we propose to build imprecise probabilistic models from these assessments. More precisely, we propose to derive probability intervals (i.e., bounds on singletons) using two different approaches: Goodman's multinomial confidence regions and the imprecise Dirichlet model (IDM). We then use these models for two purposes: (1) evaluating the chances that a future assessment will take particular values, and (2) computing an interval summarizing the agent's trustworthiness, possibly fuzzifying this interval by letting the confidence value vary over the unit interval. We also give some elements of comparison between the two approaches.

Sebastien Destercke
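The IDM bounds mentioned above have a simple closed form: with counts n_i observed out of N past assessments and hyperparameter s, P(category i) lies in [n_i/(N+s), (n_i+s)/(N+s)]. A sketch (the trust scale and counts are invented):

```python
def idm_intervals(counts, s=2.0):
    """Imprecise Dirichlet model: P(category i) lies in
    [n_i / (N + s), (n_i + s) / (N + s)] given observed counts n_i."""
    N = sum(counts.values())
    return {k: (n / (N + s), (n + s) / (N + s)) for k, n in counts.items()}

# Invented past assessments of an agent on a three-level trust scale
iv = idm_intervals({"low": 1, "medium": 3, "high": 6}, s=2.0)
```

With 10 assessments and s = 2, the chance of a "high" rating is bracketed by [0.5, 0.667]; the interval widths shrink as more assessments accumulate, which is exactly what makes the model attractive for trust evaluation.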

Range-Consistent Answers of Aggregate Queries under Aggregate Constraints

A framework for computing range-consistent answers of aggregate queries in the presence of aggregate constraints is introduced. The range-consistent answer of an aggregate query is the narrowest interval containing all the answers of the query evaluated on every possible repaired database. A wide form of aggregate constraints is considered, consisting of linear inequalities on aggregate-sum functions. In this setting, three types of aggregate queries are investigated. Our approach computes consistent answers by solving Integer Linear Programming (ILP) problem instances, thus enabling well-established techniques for ILP resolution to be exploited.

Sergio Flesca, Filippo Furfaro, Francesco Parisi

Characterization, Propagation and Analysis of Aleatory and Epistemic Uncertainty in the 2008 Performance Assessment for the Proposed Repository for High-Level Radioactive Waste at Yucca Mountain, Nevada

The 2008 performance assessment (PA) for the proposed repository for high-level radioactive waste at Yucca Mountain (YM), Nevada, illustrates the conceptual structure of risk assessments for complex systems. The 2008 YM PA is based on the following three conceptual entities: a probability space that characterizes aleatory uncertainty; a function that predicts consequences for individual elements of the sample space for aleatory uncertainty; and a probability space that characterizes epistemic uncertainty. These entities and their use in the characterization, propagation and analysis of aleatory and epistemic uncertainty are described and illustrated with results from the 2008 YM PA.

Clifford W. Hansen, Jon C. Helton, Cédric J. Sallaberry

Comparing Evidential Graphical Models for Imprecise Reliability

This paper presents a comparison of two evidential networks applied to the reliability study of complex systems with uncertain knowledge. The comparison covers several aspects: in particular, the original structure, the graphical structures used for inference, the message-passing schemes, the storage efficiency, the computational efficiency, and the exactness of the results are studied.

Wafa Laâmari, Boutheina Ben Yaghlane, Christophe Simon

Imprecise Bipolar Belief Measures Based on Partial Knowledge from Agent Dialogues

Valuation pairs [6], [5] are proposed as a possible framework for representing assertability conventions applied to sentences from a propositional logic language. This approach is then extended by introducing imprecise valuation pairs to represent partially or imprecisely defined assertability conventions. Allowing for uncertainty then naturally results in lower and upper bipolar belief measures on the sentences of the language. The potential of this extended framework is illustrated by its application to an idealised dialogue between agents, each making assertions and condemnations based on their individual valuation pairs.

Jonathan Lawry

Kriging with Ill-Known Variogram and Data

Kriging consists of estimating or predicting a spatial phenomenon at non-sampled locations from an estimated random function. Although the information necessary to properly select a unique random function model seems to be partially lacking, geostatistics in general, and the kriging methodology in particular, does not account for the incompleteness of the information that seems to pervade the procedure. On the one hand, the collected data may be tainted with errors that can be modelled by intervals or fuzzy intervals. On the other hand, the choice of parameter values for the theoretical variogram, an essential step, involves some degrees of freedom that are seldom acknowledged. In this paper we propose to account for the epistemic uncertainty pervading the variogram parameters, and possibly the data set, by means of fuzzy interval uncertainty, and we lay bare its impact on the kriging results, improving on previous attempts by Bardossy and colleagues in the late 1980s.

Kevin Loquin, Didier Dubois

Event Modelling and Reasoning with Uncertain Information for Distributed Sensor Networks

CCTV and sensor-based surveillance systems are now part of our daily lives, due to advances in telecommunications technology and the demand for better security. The analysis of sensor data produces semantically rich events describing the activities and behaviours of monitored objects. Three issues are usually associated with event descriptions. First, data may be collected from multiple sources (e.g., sensors, CCTVs, speedometers). Second, descriptions of these data can be poor, inaccurate or uncertain when they are gathered from unreliable sensors or generated by imperfect analysis algorithms. Third, in such systems, there is a need to incorporate domain-specific knowledge, e.g., criminal statistics about certain areas or patterns, when making inferences. However, in the literature, these three phenomena are seldom considered in CCTV-based event composition models. To overcome these weaknesses, we propose a general event modelling and reasoning model that can represent and reason with events from multiple sources, including domain knowledge, integrating Dempster-Shafer theory for dealing with uncertainty and incompleteness. We introduce the notion of an event cluster to represent uncertain and incomplete events induced from an observation. Event clusters are then used in the merging and inference process. Furthermore, we provide a method, based on evidential mapping techniques, to calculate the mass values of events.

Jianbing Ma, Weiru Liu, Paul Miller

Uncertainty in Decision Tree Classifiers

One of the current challenges in the field of data mining is to develop techniques for analyzing uncertain data. Among such techniques, this paper focuses on decision tree classifiers. In particular, we introduce a new data structure that can represent multiple decision trees generated from uncertain datasets.

Matteo Magnani, Danilo Montesi

Efficient Policy-Based Inconsistency Management in Relational Knowledge Bases

Real-world databases are frequently inconsistent. Even though the users who work with a body of data are far more familiar not only with that data but also with their own jobs, the risks they are willing to take, and the inferences they are willing to make from inconsistent data, most DBMSs force them to use the policy embedded in the DBMS. Inconsistency management policies (IMPs) were introduced so that users can apply policies that they deem appropriate for data they know and understand better than anyone else. In this paper, we develop an efficient "cluster table" method to implement IMPs and show that using cluster tables instead of a standard DBMS index is far more efficient when less than about 3% of a table is involved in an inconsistency (which is hopefully the case in most real-world DBs), while standard DBMS indexes perform better when the amount of inconsistency in a database is over 3%.

Maria Vanina Martinez, Francesco Parisi, Andrea Pugliese, Gerardo I. Simari, V. S. Subrahmanian

Modelling Probabilistic Inference Networks and Classification in Probabilistic Datalog

Probabilistic Graphical Models (PGMs) are a well-established approach for modelling uncertain knowledge and reasoning. Since we focus on inference, this paper explores Probabilistic Inference Networks (PINs), which are a special case of PGMs. PINs, commonly referred to as Bayesian networks, are used in Information Retrieval to model tasks such as classification and ad-hoc retrieval.

Intuitively, a probabilistic logical framework such as Probabilistic Datalog (PDatalog) should provide the expressiveness required to model PINs. However, this modelling turned out to be more challenging than expected, requiring the expressiveness of PDatalog to be extended. Also, for IR and when modelling more general tasks, it turned out that first-generation PDatalog has expressiveness and scalability bottlenecks. Therefore, this paper makes a case for second-generation PDatalog, which supports the modelling of PINs. In addition, the paper reports on the implementation of a particular PIN application, Bayesian classifiers, to investigate and demonstrate the feasibility of the proposed approach.

Miguel Martinez-Alvarez, Thomas Roelleke

Handling Dirty Databases: From User Warning to Data Cleaning — Towards an Interactive Approach

One can conceive many reasonable ways of characterizing how dirty a database is with respect to a set of integrity constraints (e.g., functional dependencies). However, dirtiness measures, as good as they can be, are difficult to interpret for an end-user and do not give the database administrator much hint about how to clean the base. This paper discusses these aspects and proposes some methods aimed at either helping the user or the administrator overcome the limitations of dirtiness measures when it comes to handling dirty databases.

Olivier Pivert, Henri Prade

Disjunctive Fuzzy Logic Programs with Fuzzy Answer Set Semantics

Reasoning under fuzzy uncertainty arises in many applications, including planning and scheduling in fuzzy environments. In many real-world applications, it is necessary to define fuzzy uncertainty over qualitative uncertainty, where fuzzy values are assigned over the possible outcomes of qualitative uncertainty. However, current fuzzy logic programming frameworks support reasoning only under fuzzy uncertainty. Moreover, disjunctive logic programs, although used for reasoning under qualitative uncertainty, cannot be used for reasoning with fuzzy uncertainty. In this paper we combine extended and normal fuzzy logic programs [30,23], for reasoning under fuzzy uncertainty, with disjunctive logic programs [7,4], for reasoning under qualitative uncertainty, in a unified logic programming framework, namely extended and normal disjunctive fuzzy logic programs. This allows one to directly and intuitively represent and reason in the presence of both fuzzy and qualitative uncertainty. The syntax and semantics of extended and normal disjunctive fuzzy logic programs naturally extend and subsume the syntax and semantics of extended and normal fuzzy logic programs [30,23] and disjunctive logic programs [7,4]. Moreover, we show that extended and normal disjunctive fuzzy logic programs can be intuitively used for representing and reasoning about scheduling with fuzzy preferences.

Emad Saad

Cost-Based Query Answering in Action Probabilistic Logic Programs

Action-probabilistic logic programs (ap-programs), a class of probabilistic logic programs, have been applied during the last few years for modeling the behaviors of entities. Rules in ap-programs have the form "If the environment in which entity E operates satisfies certain conditions, then the probability that E will take some action a lies between given lower and upper bounds". Given an ap-program, we have previously addressed the problem of deciding whether there is a way to change the environment (subject to some constraints) so that the probability that entity E takes some action (or combination of actions) is maximized. In this work we tackle a related problem, in which we are interested in reasoning about the expected reactions of the modeled entity when the environment is changed. Therefore, rather than merely deciding whether there is a way to obtain the desired outcome, we wish to find the best way to do so, given the costs of possible outcomes. This is called the Cost-based Query Answering Problem (CBQA). We first formally define and study an exact (intractable) approach to CBQA, and then propose a more efficient algorithm for a specific subclass of ap-programs that builds on past work on a basic version of this problem.

Gerardo I. Simari, John P. Dickerson, V. S. Subrahmanian

Clustering Fuzzy Data Using the Fuzzy EM Algorithm

In this article, we address the problem of clustering imprecise data using finite mixtures of Gaussians. We propose to estimate the parameters of the mixture model using the fuzzy EM algorithm. This extension of the EM algorithm allows us to handle imprecise data represented by fuzzy numbers. First, we briefly recall the principle of the fuzzy EM algorithm. Then, we provide the update equations for the parameters of a Gaussian mixture model for fuzzy data. Experiments carried out on synthetic and real data demonstrate the effectiveness of our approach for clustering data that are only imprecisely known.

Benjamin Quost, Thierry Denœux

Combining Multi-resolution Evidence for Georeferencing Flickr Images

We explore the task of determining the geographic location of photos on Flickr, using combined evidence from Naive Bayes classifiers that are trained at different spatial resolutions. In particular, we estimate the location of Flickr photos, based on their tags, at four different scales, ranging from a city-level granularity to fine-grained intra-city areas. Using Dempster-Shafer’s evidence theory, we combine the output of the different classifiers into a single mass assignment. We demonstrate experimentally that the induced belief and plausibility measures are useful to determine whether there is sufficient evidence to classify the photo at a given granularity. Thus an adaptive method is obtained, by which photos are georeferenced at the most appropriate resolution.

Olivier Van Laere, Steven Schockaert, Bart Dhoedt
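The combination step relies on Dempster's rule of combination; here is a minimal sketch for two mass assignments coming from classifiers at different spatial resolutions (the locations and mass values are invented for illustration).

```python
def dempster_combine(m1, m2):
    """Dempster's rule: conjunctive combination of two mass functions
    (focal sets as frozensets), normalized by the conflict mass."""
    combined, conflict = {}, 0.0
    for A, a in m1.items():
        for B, b in m2.items():
            C = A & B
            if C:
                combined[C] = combined.get(C, 0.0) + a * b
            else:
                conflict += a * b
    return {C: v / (1.0 - conflict) for C, v in combined.items()}

# Invented outputs: a city-level and an intra-city-level classifier
m1 = {frozenset({"Paris"}): 0.6, frozenset({"Paris", "Lyon"}): 0.4}
m2 = {frozenset({"Paris"}): 0.5, frozenset({"Paris", "Lyon"}): 0.5}
m = dempster_combine(m1, m2)
```

Belief and plausibility read off the combined masses then indicate whether the evidence suffices to commit to the finer granularity, as the abstract describes.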

A Structure-Based Similarity Spreading Approach for Ontology Matching

Most of the frequently used ontology mapping methods to date are based on linguistic information implied in ontologies. However, the same concepts in different ontologies can carry different semantics in their respective contexts, so mapping relationships cannot be recognized solely from linguistic information. Discovering and utilizing the structural information in an ontology is also very important. In this paper, we propose a structure-based similarity spreading method for ontology matching which consists of three steps. We first select centroid concepts from both ontologies using similarities between entities based on their linguistic information. Second, we partition each ontology, based on the set of centroid concepts recognized in it, using a clustering method. Third, we use a similarity spreading method to update the similarities between entities from the two ontologies and apply a greedy matching method to establish the final mapping results. The experimental results demonstrate that our approach is very effective and obtains much better results compared to other similarity-based and similarity-flooding-based algorithms.

Ying Wang, Weiru Liu, David A. Bell

Risk Modeling for Decision Support

Decision-makers require tools to aid them in risky situations. Fundamental to this is the need to model the uncertainty associated with a course of action, an alternative's uncertainty profile. In addition, we need to model the responsible agent's decision function, their attitude with respect to different uncertain risky situations. In the real world, both kinds of information are ill-defined and imprecise. Here we look at some techniques arising from the modern technologies of computational intelligence and soft computing. The use of fuzzy rule-based formulations to model decision functions is investigated. We discuss the role of perception-based granular probability distributions as a means of modeling the uncertainty profiles of the alternatives. We suggest that a more intuitive and human-friendly way of describing uncertainty profiles is in terms of a perception-based granular cumulative probability distribution function. We show how these perception-based granular cumulative probability distributions can be expressed in terms of a fuzzy rule-based model.

Ronald R. Yager

