
2010 | Book

Modeling Decisions for Artificial Intelligence

7th International Conference, MDAI 2010, Perpignan, France, October 27-29, 2010. Proceedings

Editors: Vicenç Torra, Yasuo Narukawa, Marc Daumas

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the proceedings of the 7th International Conference on Modeling Decisions for Artificial Intelligence, MDAI 2010, held in Perpignan, France, in October 2010.

The 25 papers presented were carefully reviewed and selected from 43 submissions. The volume also contains extended abstracts of the three invited papers. The topics covered are aggregation operators and decision making; clustering and similarity; computational intelligence; and data privacy.

Table of contents

Frontmatter

Invited Papers

Relationships between Qualitative and Quantitative Scales for Aggregation Operations: The Example of Sugeno Integrals
Abstract
In decision applications, especially multicriteria decision-making, numerical approaches are often questionable because it is hard to elicit numerical values quantifying preference, criteria importance or uncertainty. More often than not, multicriteria decision-making methods come down to number-crunching recipes with debatable foundations. One way out of this difficulty is to adopt a qualitative approach where only maximum and minimum are used. Such methods enjoy a property of scale invariance that ensures their robustness. One of the most sophisticated aggregation operations making sense on qualitative scales is the Sugeno integral. It is not purely ordinal as it assumes commensurability between preference intensity and criteria importance or, similarly, utility and uncertainty. However, since absolute qualitative value scales must have few levels so as to remain cognitively plausible, there are as many classes of equivalent decisions as value levels. Hence this approach suffers from a lack of discrimination power. In particular, qualitative aggregations such as Sugeno integrals cannot be strictly increasing and violate the strict Pareto property. In this talk, we report results obtained when trying to increase the discrimination power of Sugeno integrals, generalizing such refinements of the minimum and maximum as leximin and leximax. The representation of leximin and leximax by sums of numbers of different orders of magnitude (forming a super-increasing sequence) can be generalized to weighted max and min (yielding a “big-stepped” weighted average) and Sugeno integral (yielding a “big-stepped” Choquet integral). This methodology also requires qualitative monotonic set-functions to be refined by numerical set-functions, and we show they can always be belief or plausibility functions in the sense of Shafer.
Didier Dubois
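For reference, the Sugeno integral of a qualitative profile \((f_1, \ldots, f_n)\) with respect to a monotonic set function (capacity) \(\mu\) on the set of criteria \(N = \{1, \ldots, n\}\) can be written, in its standard form, as

\[
S_\mu(f) \;=\; \max_{A \subseteq N} \min\Bigl(\mu(A),\, \min_{i \in A} f_i\Bigr) \;=\; \max_{i=1}^{n} \min\bigl(f_{(i)},\, \mu(A_{(i)})\bigr),
\]

where \(f_{(1)} \le \cdots \le f_{(n)}\) is the profile rearranged in increasing order and \(A_{(i)}\) is the set of criteria whose score is at least \(f_{(i)}\). Only min and max appear, which is the scale invariance the abstract appeals to.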
User Privacy in Web Search
Abstract
Web search engines gather a lot of information on the preferences and interests of users. They actually gather enough information to create detailed user profiles which might enable re-identification of the individuals to whom those profiles correspond, e.g. thanks to so-called vanity queries or to linkage of several queries known to have been submitted by the same user. In this way, a broadly used search engine like Google becomes a “big brother” in the purest Orwellian style.
Josep Domingo-Ferrer
A Bibliometric Index Based on Collaboration Distances
Abstract
The h-index by Hirsch [1] has recently earned a lot of popularity in bibliometrics, being echoed in Nature and implemented in the Web of Science bibliometric database. Previous indicators were the total number of papers or the total number of citations. Following the widely accepted idea that not all papers should count equally, the h-index counts only those papers that are significant enough according to their number of citations. However, as for qualifying the significance of citations, beyond the exclusion of self-citations in recent proposals [2,3,4,5,6], the fact that not all citations should count equally has remained unaddressed, with the exception of [7]. The h-index can be described in terms of a pool of evaluated objects (papers), a quality function on the evaluated objects (citations received by each paper) and a sentencing line crossing the origin (y = x). When the evaluated objects are ordered by decreasing quality, then the intersection of the sentencing line with the graph of the quality function yields the index value.
Maria Bras-Amorós, Josep Domingo-Ferrer, Vicenç Torra
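As a concrete illustration of the sentencing-line description above, here is a minimal sketch of the h-index computation (function name and data are illustrative, not taken from the paper):

```python
def h_index(citations):
    """h-index: largest h such that at least h papers have >= h citations.

    Geometrically: sort papers by decreasing citations and intersect the
    resulting quality graph with the sentencing line y = x.
    """
    ranked = sorted(citations, reverse=True)  # decreasing quality
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:   # still on or above the line y = x
            h = rank
        else:
            break
    return h

# Example: 6 papers with these citation counts -> h = 3
print(h_index([10, 8, 5, 3, 2, 0]))  # prints 3
```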

Regular Papers

Aggregation Operators and Decision Making

Measuring the Influence of the kth Largest Variable on Functions over the Unit Hypercube
Abstract
By considering a least squares approximation of a given square integrable function \(f : [0,1]^n \rightarrow \mathbb{R}\) by a shifted L-statistic function (a shifted linear combination of order statistics), we define an index which measures the global influence of the kth largest variable on f. We show that this influence index has appealing properties and we interpret it as an average value of the difference quotient of f in the direction of the kth largest variable or, under certain natural conditions on f, as an average value of the derivative of f in the direction of the kth largest variable. We also discuss a few applications of this index in statistics and aggregation theory.
Jean-Luc Marichal, Pierre Mathonet
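As a sketch of the approximation class mentioned above (notation assumed here, not taken from the paper), a shifted L-statistic function is an affine combination of the order statistics of the input:

\[
g(x_1, \ldots, x_n) \;=\; c_0 + \sum_{k=1}^{n} c_k\, x_{(k)},
\]

where \(x_{(1)} \le \cdots \le x_{(n)}\) denote the inputs rearranged in increasing order. The influence index of the kth largest variable is then read off from the best least squares approximation of \(f\) within this class.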
Measuring the Interactions among Variables of Functions over the Unit Hypercube
Abstract
By considering a least squares approximation of a given square integrable function \(f : [0,1]^n \rightarrow \mathbb{R}\) by a multilinear polynomial of a specified degree, we define an index which measures the overall interaction among variables of f. This definition extends the concept of the Banzhaf interaction index introduced in cooperative game theory. Our approach is partly inspired by multilinear regression analysis, where interactions among the independent variables are taken into consideration. We show that this interaction index has appealing properties which naturally generalize the properties of the Banzhaf interaction index. In particular, we interpret this index as an expected value of the difference quotients of f or, under certain natural conditions on f, as an expected value of the derivatives of f. These interpretations show a strong analogy between the introduced interaction index and the overall importance index defined by Grabisch and Labreuche [7]. Finally, we discuss a few applications of the interaction index.
Jean-Luc Marichal, Pierre Mathonet
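For orientation, a multilinear polynomial of degree at most \(k\) on \([0,1]^n\) has the form (notation assumed here)

\[
g(x_1, \ldots, x_n) \;=\; \sum_{\substack{S \subseteq \{1,\ldots,n\} \\ |S| \le k}} a_S \prod_{i \in S} x_i,
\]

and, loosely speaking, the interaction index among a group of variables is obtained from the coefficients of the best least squares approximation of \(f\) in this class.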
Weighted Quasi-arithmetic Means and Conditional Expectations
Abstract
In this paper, the weighted quasi-arithmetic means are discussed from the viewpoint of utility functions and background risks in economics, and they are represented by weighting functions and conditional expectations. Using these representations, an index for background risks in stochastic environments is derived through the weighted quasi-arithmetic means. The first-order stochastic dominance and the risk premium are demonstrated using the weighted quasi-arithmetic means and the aggregated mean ratios, and they are characterized by the background risk index. Finally, examples of the weighted quasi-arithmetic mean and the aggregated mean ratio for various typical utility functions are given.
Yuji Yoshida
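For reference, the weighted quasi-arithmetic mean generated by a continuous strictly monotonic function \(\varphi\) and weights \(w_i \ge 0\) with \(\sum_{i=1}^{n} w_i = 1\) is

\[
M_\varphi(x_1, \ldots, x_n) \;=\; \varphi^{-1}\Bigl(\sum_{i=1}^{n} w_i\, \varphi(x_i)\Bigr);
\]

\(\varphi(x) = x\) yields the weighted arithmetic mean and \(\varphi(x) = \log x\) the weighted geometric mean. In the setting of the paper, \(\varphi\) plays the role of a utility function.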
Modelling Group Decision Making Problems in Changeable Conditions
Abstract
The aim of this paper is to present a new group decision making model with two important characteristics: i) we apply mobile technologies in the decision process, and ii) the set of alternatives is not constant through time. We implement a prototype of a mobile decision support system based on changeable sets of alternatives. Using their mobile devices (such as mobile phones or PDAs), experts can provide and receive information anywhere and at any time. The prototype also incorporates a new system to manage the alternatives, lending more realism to decision processes by allowing a changeable set of alternatives and focusing the discussion on a subset of them that changes at each stage of the process.
Ignacio J. Pérez, Sergio Alonso, Francisco J. Cabrerizo, Enrique Herrera-Viedma
Individual Opinions-Based Judgment Aggregation Procedures
Abstract
Judgment aggregation is a recent formal discipline that studies how to aggregate individual judgments on logically connected propositions to form collective decisions on the same propositions. Despite the apparent simplicity of the problem, the aggregation of individual judgments can result in an inconsistent outcome, a problem that seriously troubles this research field. Expert panels, legal courts, boards, and councils are only some examples of group decision situations that are confronted with such aggregation problems. So far, the framework and procedures considered in the literature are idealized. Our goal is to enrich standard judgment aggregation by allowing the individuals to agree or disagree on the decision rule. Moreover, the group members have the possibility to abstain or express neutral judgments. This provides a more realistic framework and, at the same time, permits the definition of an aggregation procedure that escapes the inconsistent group outcome.
Farah Benamara, Souhila Kaci, Gabriella Pigozzi
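The inconsistency mentioned above is the classical discursive dilemma; here is a minimal worked example with three individuals judging two premises p and q and the conclusion p ∧ q:

Judge 1: p = yes, q = yes, p ∧ q = yes
Judge 2: p = yes, q = no, p ∧ q = no
Judge 3: p = no, q = yes, p ∧ q = no
Majority: p = yes, q = yes, p ∧ q = no

Proposition-wise majority voting accepts both premises but rejects the conclusion, so the collective judgment set is logically inconsistent even though every individual judgment set is consistent.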
Aggregation of Bounded Fuzzy Natural Number-Valued Multisets
Abstract
Multisets (also called bags) are set-like structures in which an element can appear more than once. Recently, several generalizations of this concept have been studied. In this article we deal with a new extension of this concept, the bounded fuzzy natural number-valued multisets. On this kind of bag, a bounded distributive lattice structure is presented and a partial order is defined. Moreover, we study aggregation operations (t-norms and t-conorms) and we provide two methods for their construction.
Jaume Casasnovas, J. Vicente Riera
Sugeno Utility Functions I: Axiomatizations
Abstract
In this paper we consider a multicriteria aggregation model where local utility functions of different sorts are aggregated using Sugeno integrals, and which we refer to as Sugeno utility functions. We propose a general approach to study such functions via the notion of pseudo-Sugeno integral (or, equivalently, pseudo-polynomial function), which naturally generalizes that of Sugeno integral, and provide several axiomatizations for this class of functions.
Miguel Couceiro, Tamás Waldhauser
Sugeno Utility Functions II: Factorizations
Abstract
In this paper we address and solve the problem posed in the companion paper [3] of factorizing an overall utility function as a composition \(q(\varphi_1(x_1), \ldots, \varphi_n(x_n))\) of a Sugeno integral q with local utility functions \(\varphi_i\), if such a factorization exists.
Miguel Couceiro, Tamás Waldhauser
Managing Information Fusion with Formal Concept Analysis
Abstract
The main problem addressed in this paper is the merging of numerical information provided by several sources (databases, experts...). Merging pieces of information into an interpretable and useful format is a tricky task even when an information fusion method is chosen. Fusion results may not be in a suitable form for use in decision analysis. This is generally due to the fact that information sources are heterogeneous and provide inconsistent information, which may lead to imprecise results. In this paper, we propose the use of Formal Concept Analysis, and more specifically pattern structures, for organizing the results of fusion methods. This allows us to associate any subset of sources with its information fusion result. Once a fusion operator is chosen, a concept lattice is built. With examples throughout this paper, we show that this concept lattice gives an interesting classification of fusion results. When the global fusion result is too imprecise, the method enables the users to identify which maximal subsets of sources would support a more precise and useful result. Instead of providing a unique fusion result, the method yields a structured view of partial results labelled by subsets of sources. Finally, an experiment on a real-world application has been carried out for decision aid in agricultural practices.
Zainab Assaghir, Mehdi Kaytoue, Amedeo Napoli, Henri Prade

Clustering and Similarity

Indefinite Kernel Fuzzy c-Means Clustering Algorithms
Abstract
This paper proposes two types of kernel fuzzy c-means algorithms with an indefinite kernel. Both algorithms are based on the fact that the relational fuzzy c-means algorithm is a special case of the kernel fuzzy c-means algorithm. The first proposed algorithm adaptively updates the indefinite kernel matrix such that the dissimilarity between each datum and each cluster center in the feature space is non-negative, instead of subtracting the minimal eigenvalue of the given kernel matrix as a preprocessing step. This derivation follows the manner in which the non-Euclidean relational fuzzy c-means algorithm is derived from the original relational fuzzy c-means one. The second proposed method produces the memberships by solving an optimization problem in which the constraint of non-negative memberships is added to that of K-sFCM. This derivation follows the manner in which the non-Euclidean fuzzy relational clustering algorithm is derived from the original relational fuzzy c-means one. The proposed algorithms are discussed through a numerical example.
Yuchi Kanzawa, Yasunori Endo, Sadaaki Miyamoto
Algorithms in Sequential Fuzzy Regression Models Based on Least Absolute Deviations
Abstract
The method of fuzzy c-regression models is known to be useful in real applications, but it has two drawbacks. First, the results depend strongly on the predefined number of clusters. Second, the method of least squares is frequently sensitive to outliers and noise. To avoid these drawbacks, we apply a method of sequentially extracting one cluster at a time, using a noise-detection technique, to fuzzy c-regression models, which enables an automatic determination of the number of clusters. Moreover, the regression models are based on least absolute deviations (FCRMLAD), which are known to be robust to noise. We show the effectiveness of the proposed method using numerical examples.
Hengjin Tang, Sadaaki Miyamoto
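As a sketch of the objective behind the robustness claim (notation assumed here), where standard fuzzy c-regression models minimize squared residuals, the least absolute deviations variant minimizes

\[
J \;=\; \sum_{i=1}^{c} \sum_{k=1}^{N} (u_{ik})^m \,\bigl| y_k - f_i(\mathbf{x}_k) \bigr|,
\]

with memberships \(u_{ik}\), fuzzifier \(m > 1\) and clusterwise regression models \(f_i\). The absolute value keeps large residuals from dominating the fit, which is what makes the method less sensitive to outliers.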
A Generalized Approach to the Suppressed Fuzzy c-Means Algorithm
Abstract
Suppressed fuzzy c-means (s-FCM) clustering was introduced with the intention of combining the higher convergence speed of hard c-means (HCM) clustering with the finer partition quality of the fuzzy c-means (FCM) algorithm. Suppression modifies the FCM iteration by creating a competition among clusters: lower degrees of membership are reduced via multiplication by a previously set constant suppression rate, while the largest fuzzy membership grows by swallowing all the suppressed parts of the small ones. Suppressing the FCM algorithm was found successful in terms of accuracy and working time. In this paper we introduce some generalized formulations of the suppression rule, leading to an infinite number of new clustering algorithms. Based on a large number of numerical tests performed in a multidimensional environment, some generalized forms of suppression proved to give more accurate partitions than FCM and s-FCM.
László Szilágyi, Sándor M. Szilágyi, Csilla Kiss
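A minimal sketch of the suppression rule described above, as applied to one datum's membership vector after a standard FCM update (array names are illustrative; the paper's generalizations replace the constant rate alpha with other rules):

```python
import numpy as np

def suppress(u, alpha):
    """Suppressed-FCM rule for one datum's membership vector u (sums to 1).

    Non-winner memberships are multiplied by the suppression rate alpha
    (0 <= alpha <= 1); the winner absorbs everything that was suppressed,
    so the vector still sums to 1.  alpha = 1 recovers plain FCM,
    alpha = 0 recovers hard c-means.
    """
    u = np.asarray(u, dtype=float)
    w = np.argmax(u)                    # winner cluster
    v = alpha * u                       # suppress all memberships ...
    v[w] = 1.0 - alpha * (1.0 - u[w])   # ... and let the winner swallow the rest
    return v

print(suppress([0.5, 0.3, 0.2], alpha=0.4))  # [0.8, 0.12, 0.08]
```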
Semi-supervised Agglomerative Hierarchical Clustering Using Clusterwise Tolerance Based Pairwise Constraints
Abstract
Recently, semi-supervised clustering has attracted much attention and has been discussed in many studies. In semi-supervised clustering, pairwise constraints, that is, must-link and cannot-link constraints, are frequently used in order to improve clustering results by exploiting prior knowledge or information. In this paper, we propose a clusterwise tolerance based pairwise constraint. In addition, we propose semi-supervised agglomerative hierarchical clustering algorithms with the centroid method based on it. Moreover, we show the effectiveness of the proposed method through numerical examples.
Yukihiro Hamasuna, Yasunori Endo, Sadaaki Miyamoto

Computational Intelligence

Gallbladder Segmentation in 2-D Ultrasound Images Using Deformable Contour Methods
Abstract
Segmenting the gallbladder from an ultrasonography (US) image allows background elements which are immaterial in the diagnostic process to be eliminated. In this project, several active contour models were used to extract the shape of the gallbladder, both for cases free of lesions and for those showing specific disease entities, namely lithiasis, polyps, and anatomical changes such as folds or turns of the gallbladder. First, a histogram normalization transformation was applied to improve the contrast of the US images. The approximate edge of the gallbladder was found by applying one of the active contour models: the motion equation, a center-point model or a balloon model. The areas delimited by the determined contours were then summed to approximate the shape of the gallbladder more exactly. Finally, the fragment of the image located outside the gallbladder contour was eliminated. The tests conducted have shown that for the 220 US images of the gallbladder, the area error rate (AER) amounted to 16.4%.
Marcin Ciecholewski
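For context, deformable (active) contour methods of the kind listed above evolve a curve \(v(s) = (x(s), y(s))\) to minimize an energy of the classical snake form (this is the generic formulation, not necessarily the exact one used in the paper):

\[
E \;=\; \int_0^1 \Bigl( \tfrac{1}{2}\bigl(\alpha\, |v'(s)|^2 + \beta\, |v''(s)|^2\bigr) + E_{\mathrm{ext}}\bigl(v(s)\bigr) \Bigr)\, ds,
\]

where the internal terms control elasticity and rigidity and \(E_{\mathrm{ext}}\) attracts the curve to image edges; a balloon model adds an inflation force that pushes the contour outward where edge information is weak.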
Pattern Mining on Stars with FP-Growth
Abstract
Most existing data mining (DM) approaches look for patterns in a single table. Multi-relational DM approaches, on the other hand, look for patterns that involve multiple tables. In recent years, the most common DM techniques have been extended to the multi-relational case, but few are dedicated to star schemas. These schemas are composed of a central fact table linking a set of dimension tables, and joining all the tables before mining may not be a feasible solution. This work proposes a method for frequent pattern mining in a star schema based on FP-Growth. It does not materialize the entire join between the tables. Instead, it constructs an FP-Tree for each dimension and then combines them to form a super FP-Tree, which serves as input to FP-Growth.
Andreia Silva, Cláudia Antunes
A Computational Intelligence Based Framework for One-Subsequence-Ahead Forecasting of Nonstationary Time Series
Abstract
This paper proposes a mix of noise filtering, fuzzy clustering, neural mapping and predictive techniques for one-subsequence-ahead forecasting of nonstationary time series. Optionally, we may start by de-noising the time series through wavelet decomposition. A non-overlapping subsequence time series clustering procedure with a sliding window is addressed next, using a lower bound of the Dynamic Time Warping distance as the dissimilarity measure when applying the Fuzzy C-Means algorithm. Afterwards, the subsequence time series transition function is learned by neural mapping: for each subsequence time series, the degrees to which it belongs to the c cluster prototypes are derived when the p·c membership degrees of the previous p subsequences are presented as inputs to the neural network. Finally, this transition function is applied to forecasting the time series one subsequence ahead, as a weighted mean of the c cluster prototypes to which it belongs, and the S&P 500 data are used for testing.
Vasile Georgescu
Non-hierarchical Clustering of Decision Tables toward Rough Set-Based Group Decision Aid
Abstract
In order to analyze the distribution of mind-sets (collections of evaluations) in a group, hierarchical clustering of decision tables has been examined. That method yields clusters of mind-sets, but the clusters are not always optimal with respect to some criterion. In this paper, we develop non-hierarchical clustering techniques for decision tables. In order to treat positive and negative evaluations of a common profile, we use a vector of rough membership values to represent an individual's opinion about a profile. Using rough membership values, we develop a K-means method as well as fuzzy c-means methods for clustering decision tables. We examine the proposed methods by clustering real-world decision tables.
Masahiro Inuiguchi, Ryuta Enomoto, Yoshifumi Kusunoki
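The rough membership value mentioned above is, in its standard form,

\[
\mu_X(x) \;=\; \frac{\bigl|[x]_R \cap X\bigr|}{\bigl|[x]_R\bigr|},
\]

the fraction of the objects indiscernible from \(x\) (its equivalence class \([x]_R\)) that belong to the set \(X\). This maps positive and negative evaluations of a common profile to comparable values in \([0,1]\), which is what makes K-means and fuzzy c-means applicable to decision tables.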
Revisiting Natural Actor-Critics with Value Function Approximation
Abstract
Actor-critic architectures have become popular during the last decade in the field of reinforcement learning because of the introduction of the policy gradient with function approximation theorem. It allows actor-critic architectures to be rationally combined with value function approximation, and therefore large-scale problems to be addressed. Recent research has led to the replacement of the policy gradient by a natural policy gradient, improving the efficiency of the corresponding algorithms. However, a common drawback of these approaches is that they require the manipulation of the so-called advantage function, which does not satisfy any Bellman equation. Consequently, the derivation of actor-critic algorithms is not straightforward. In this paper, we re-derive the theorems in a way that allows reasoning directly with the state-action value function (or Q-function) and thus relying on the Bellman equation again. Consequently, new forms of critics can easily be integrated in the actor-critic framework.
Matthieu Geist, Olivier Pietquin
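The drawback described above, in formulas: the advantage function is defined from the state-action value function \(Q^\pi\) and the state value function \(V^\pi\) as

\[
A^\pi(s, a) \;=\; Q^\pi(s, a) - V^\pi(s),
\]

and while \(Q^\pi\) satisfies the Bellman equation \(Q^\pi(s,a) = \mathbb{E}\bigl[r(s,a) + \gamma\, Q^\pi(s', a')\bigr]\), the difference \(A^\pi\) satisfies no Bellman equation of its own; re-deriving the theorems directly in terms of \(Q^\pi\) restores that recursive structure.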
A Cost-Continuity Model for Web Search
Abstract
In this paper we present and empirically evaluate a ‘continuity-cost model’ for Internet query sessions made by users. We study the relation between different ‘cost factors’ of a user query session, the user's continuity in that query session, and the order of the query in the session. We define cost indicators from the available query log data, which are studied in relation to continuity and to the order/number of the query (1st, 2nd, 3rd, ...). One of our hypotheses is that cost related factors will reflect the step-by-step nature of the query session process. We use descriptive statistics together with rule induction to identify the most relevant factors and observable trends, and produce three classifier data models, one for each ‘query number’, using the ‘continuity flag’ as the classifier label. Using the cost factors, we identify trends relating continuity/query number to user behavior, and we can use that information, for example, to make decisions about caching and query recommendation.
David F. Nettleton, Joan Codina
An Enhanced Framework of Subjective Logic for Semantic Document Analysis
Abstract
Unlike propositional logic, which works on the truth or falsity of statements, human judgements are subjective in nature and carry a certain degree of uncertainty. Two different people will analyse and interpret a document in two different ways based on their background and current focus. In this paper we present an enhanced framework of subjective logic for automated single-document analysis where each sentence in the document represents a proposition, and ‘opinions’ are constructed about this proposition to capture the degree of uncertainty associated with it. The ‘opinion’ about a sentence determines the significance of that sentence in a document. The input arguments are built automatically from a document in the form of evidence; they are then analyzed based on the subjective logic parameters. Two different approaches are described here. The first utilises the “bag of words” concept. However, this approach tends to miss the underlying semantic meaning of the context, so we further enhance it into the latter approach, which incorporates semantic information of the context by extending the basic definitions of subjective logic.
Sukanya Manna, B. Sumudu. U. Mendis, Tom Gedeon
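For readers unfamiliar with the formalism: in subjective logic, an opinion about a proposition \(x\) is a tuple \(\omega_x = (b_x, d_x, u_x, a_x)\) of belief, disbelief, uncertainty and base rate, with

\[
b_x + d_x + u_x = 1, \qquad \mathrm{E}(\omega_x) = b_x + a_x u_x,
\]

where \(\mathrm{E}(\omega_x)\) is the probability expectation of the opinion. In the paper, such opinions are built about sentences from evidence extracted automatically from the document.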

Data Privacy

Ontology-Based Anonymization of Categorical Values
Abstract
The analysis of sensitive data requires a proper anonymization of values in order to preserve the privacy of individuals. Information loss should be minimized during the masking process in order to enable a proper exploitation of the data. Even though several masking methods have been designed for numerical data, very few of them deal with categorical (textual) information. In this case, the quality of the anonymized dataset is closely related to the preservation of semantics, a dimension which is commonly neglected or only shallowly considered in related works. In this paper, a new masking method for unbounded categorical attributes is proposed. It relies on the knowledge modeled in ontologies in order to semantically interpret the input data and perform data transformations aiming to minimize the loss of semantic content. In contrast to exhaustive methods based on simple hierarchical structures, our approach relies on a set of heuristics in order to guide and optimize the masking process, ensuring its scalability when dealing with large and heterogeneous datasets and wide ontologies. The evaluation performed over real textual data suggests that our method is able to produce anonymized datasets which significantly preserve data semantics in comparison to approaches based on data distribution metrics.
Sergio Martínez, David Sánchez, Aida Valls
Rational Privacy Disclosure in Social Networks
Abstract
Social networking web sites, or social networks (SNs) for short, have become an important web service with a broad range of applications. In an SN, a user publishes and shares information and services. We propose a utility function to measure the rational benefit derived by a user from her participation in an SN, in terms of information acquired vs information provided. We show that independently and selfishly maximizing this utility leads users to “free-riding”, i.e. getting information about other users while offering no information about themselves. This results in SN shutdown (no functionality). We then propose protocols to achieve a correlated equilibrium between users, in which they coordinate their disclosures with a view to jointly maximizing their utilities. The proposed protocols can be used to assist an SN user in making rational decisions regarding which of her attributes she reveals to other users.
Josep Domingo-Ferrer
Towards Semantic Microaggregation of Categorical Data for Confidential Documents
Abstract
In the data privacy context, and specifically in statistical disclosure control techniques, microaggregation is a well-known microdata protection method ensuring the confidentiality of each individual. In this paper, we propose a new microaggregation approach to deal with semantic sets of categorical data, such as text documents. The method relies on the WordNet framework, which provides a complete taxonomy of semantic relationships between words. This extension aims to ensure the confidentiality of text documents while at the same time preserving their general meaning. We apply some measures relying on information loss to evaluate the quality of the protection method.
Daniel Abril, Guillermo Navarro-Arribas, Vicenç Torra
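A minimal sketch of the kind of WordNet-based semantic relatedness such a method can rely on, using NLTK's WordNet interface (this illustrates the general idea only; the paper's actual microaggregation steps are not reproduced here):

```python
# Requires: pip install nltk, then nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def semantic_distance(word1, word2):
    """Crude semantic distance: 1 minus the best path similarity between
    any pair of synsets of the two words (0 = synonymous, 1 = unrelated)."""
    best = 0.0
    for s1 in wn.synsets(word1):
        for s2 in wn.synsets(word2):
            sim = s1.path_similarity(s2)  # None when no path exists
            if sim is not None and sim > best:
                best = sim
    return 1.0 - best

# Terms sharing close hypernyms end up nearer in the taxonomy:
print(semantic_distance("car", "truck") < semantic_distance("car", "banana"))
```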
Using Classification Methods to Evaluate Attribute Disclosure Risk
Abstract
Statistical Disclosure Control protection methods perturb the non-confidential attributes of an original dataset and publish the perturbed results along with the values of the confidential attributes. Traditionally, such a method is considered to achieve a good privacy level if attackers who try to link an original record with its perturbed counterpart have a low success probability. Another view has lately been gaining popularity: protection methods should resist not only record re-identification attacks, but also attacks that try to guess the true value of some confidential attribute of some original record(s). This is known as attribute disclosure risk.
In this paper we propose a quite simple strategy to estimate the attribute disclosure risk suffered by a protection method: using a classifier, constructed from the protected (public) dataset, to predict the attribute values of some original record. After defining this approach in detail, we describe some experiments that show the power and danger of the approach: very popular protection methods suffer from very high attribute disclosure risk values.
Jordi Nin, Javier Herranz, Vicenç Torra
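A minimal sketch of the proposed strategy, under assumed array names (X_protected holds the perturbed non-confidential attributes, y_confidential the published confidential values, and X_original, y_true the original records the attacker targets); any classifier can stand in for the decision tree used here:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def attribute_disclosure_risk(X_protected, y_confidential, X_original, y_true):
    """Estimate the attribute disclosure risk of a protection method.

    Train a classifier on the published (protected) dataset to predict the
    confidential attribute, then measure how often it guesses the true
    confidential value of the original records.  High accuracy means high
    attribute disclosure risk.
    """
    clf = DecisionTreeClassifier().fit(X_protected, y_confidential)
    guesses = clf.predict(X_original)
    return np.mean(guesses == y_true)
```

According to the abstract, this simple estimate turns out to be very high for several popular protection methods.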
A Misleading Attack against Semi-supervised Learning for Intrusion Detection
Abstract
Machine learning has become a popular method for intrusion detection because of its ability to adapt to changing situations. Owing to the lack of high-quality labeled instances, some researchers have focused on semi-supervised learning, which utilizes unlabeled instances to enhance classification. But involving the unlabeled instances in the learning process also introduces a vulnerability: attackers can generate fake unlabeled instances to mislead the final classifier so that a few intrusions cannot be detected. In this paper we show how attackers can influence the semi-supervised classifier by constructing unlabeled instances, and we propose a possible defence method based on active learning. Experiments show that the misleading attack can reduce the accuracy of the semi-supervised learning method, and that the presented defence method can obtain higher accuracy than the original semi-supervised learner under the proposed attack.
Fangzhou Zhu, Jun Long, Wentao Zhao, Zhiping Cai
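A toy sketch of the attack idea on a generic semi-supervised learner (here scikit-learn's label propagation stands in for the paper's method; the data, the placement of the attack points, and all names are illustrative, and how much the accuracy drops depends on that placement):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.semi_supervised import LabelPropagation

rng = np.random.RandomState(0)
X, y = make_classification(n_samples=300, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# Only a few labels are known; the rest are marked unlabeled (-1).
y_semi = np.full_like(y, -1)
labeled = rng.choice(len(y), size=20, replace=False)
y_semi[labeled] = y[labeled]

clean = LabelPropagation().fit(X, y_semi)

# Attacker injects fake unlabeled instances clustered inside class-1
# territory, aiming to drag the propagated labels so that some
# intrusions end up on the wrong side of the decision boundary.
X_fake = X[y == 1].mean(axis=0) + 0.1 * rng.randn(50, 2)
X_atk = np.vstack([X, X_fake])
y_atk = np.concatenate([y_semi, np.full(50, -1)])

attacked = LabelPropagation().fit(X_atk, y_atk)
print("clean accuracy:   ", clean.score(X, y))
print("attacked accuracy:", attacked.score(X, y))
```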
Backmatter
Metadata
Title
Modeling Decisions for Artificial Intelligence
Editors
Vicenç Torra
Yasuo Narukawa
Marc Daumas
Copyright year
2010
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-16292-3
Print ISBN
978-3-642-16291-6
DOI
https://doi.org/10.1007/978-3-642-16292-3