
2004 | Book

Rough Sets and Current Trends in Computing

4th International Conference, RSCTC 2004, Uppsala, Sweden, June 1-5, 2004. Proceedings

Edited by: Shusaku Tsumoto, Roman Słowiński, Jan Komorowski, Jerzy W. Grzymała-Busse

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

In recent years rough set theory has attracted the attention of many researchers and practitioners all over the world, who have contributed essentially to its development and applications. We are observing a growing research interest in the foundations of rough sets, including the various logical, mathematical and philosophical aspects of rough sets. Some relationships have already been established between rough sets and other approaches, and also with a wide range of hybrid systems. As a result, rough sets are linked with decision system modeling and analysis of complex systems, fuzzy sets, neural networks, evolutionary computing, data mining and knowledge discovery, pattern recognition, machine learning, and approximate reasoning. In particular, rough sets are used in probabilistic reasoning, granular computing (including information granule calculi based on rough mereology), intelligent control, intelligent agent modeling, identification of autonomous systems, and process specification. Methods based on rough set theory alone or in combination with other approaches have found a wide range of applications in such areas as: acoustics, bioinformatics, business and finance, chemistry, computer engineering (e.g., data compression, digital image processing, digital signal processing, parallel and distributed computer systems, sensor fusion, fractal engineering), decision analysis and systems, economics, electrical engineering (e.g., control, signal analysis, power systems), environmental studies, informatics, medicine, molecular biology, musicology, neurology, robotics, social science, software engineering, spatial visualization, Web engineering, and Web mining.

Table of contents

Frontmatter

Plenary Papers

Decision Networks

A decision network is a finite, directed acyclic graph whose nodes represent logical formulas and whose branches are interpreted as decision rules. Every path in the graph represents a chain of decision rules describing a compound decision. Some properties of decision networks will be given, and a simple example will illustrate the presented ideas and show possible applications.

Zdzisław Pawlak
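
By way of illustration, here is a minimal sketch (our own toy construction, not Pawlak's exact formalism) of a decision network as a directed acyclic graph whose edges act as decision rules; every root-to-leaf path is then a chain of rules describing a compound decision. All node labels are hypothetical.

```python
# A toy decision network: nodes are logical formulas (here just labels),
# and directed edges play the role of decision rules "premise -> conclusion".
edges = {
    "age>40": ["risk=medium", "risk=high"],
    "risk=medium": ["decision=review"],
    "risk=high": ["decision=reject"],
}

def paths(node):
    """Enumerate all rule chains (paths) starting from a given node."""
    succs = edges.get(node, [])
    if not succs:                       # leaf: the chain ends here
        return [[node]]
    return [[node] + p for s in succs for p in paths(s)]

# Each path is a chain of decision rules describing a compound decision.
for p in paths("age>40"):
    print(" -> ".join(p))
```
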
Toward Rough Set Foundations. Mereological Approach

In this semi-plenary lecture, we discuss rough inclusions defined in Rough Mereology, a joint idea with Andrzej Skowron, as a basis for models of rough set theory. We demonstrate that the mereological theory of rough sets extends and generalizes rough set theory formulated in the naive set-theoretic framework.

Lech Polkowski
Generalizations of Rough Sets: From Crisp to Fuzzy Cases

Rough sets can be interpreted in two ways: as classification of objects and as approximation of a set. In this paper, we discuss the differences and similarities of generalized rough sets based on these two interpretations. We describe the relations between generalized rough sets and the types of decision rules extracted. Moreover, we extend the discussion to fuzzy rough sets. Throughout the paper, the relations among generalized crisp rough sets and fuzzy rough sets are clarified, and two different directions of application in rule extraction are suggested.

Masahiro Inuiguchi
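
For readers new to the area, both interpretations rest on the same pair of operators: the lower approximation collects equivalence classes fully contained in the target set, the upper approximation those merely intersecting it. A minimal sketch of the classical indiscernibility-based approximations, with toy attribute data of our own choosing:

```python
# Classical rough approximations over an indiscernibility relation.
# Objects with identical attribute tuples fall into one equivalence class.
from collections import defaultdict

data = {          # object -> attribute tuple (toy data)
    "o1": ("red", "round"), "o2": ("red", "round"),
    "o3": ("blue", "round"), "o4": ("blue", "square"),
}
target = {"o1", "o3"}   # the set X to approximate

classes = defaultdict(set)
for obj, attrs in data.items():
    classes[attrs].add(obj)            # equivalence classes of IND

lower = set().union(*[c for c in classes.values() if c <= target] or [set()])
upper = set().union(*[c for c in classes.values() if c & target] or [set()])
print(lower)   # {'o3'}: classes fully inside X
print(upper)   # {'o1', 'o2', 'o3'}: classes intersecting X
```
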

Theory

Investigation about Time Monotonicity of Similarity and Preclusive Rough Approximations in Incomplete Information Systems

Starting from an incomplete information system, we add information in two different ways: by increasing the number of known values and by increasing the number of attributes. The behavior of the similarity and preclusive rough approximations is studied in both cases.

Gianpiero Cattaneo, Davide Ciucci
The Ordered Set of Rough Sets

We study the ordered set of rough sets determined by relations which are not necessarily reflexive, symmetric, or transitive. We show that for tolerances and transitive binary relations the set of rough sets is not necessarily even a semilattice. We also prove that the set of rough sets determined by a symmetric and transitive binary relation forms a complete Stone lattice. Furthermore, for the ordered sets of rough sets that are not necessarily lattices we present some possible canonical completions.

Jouni Järvinen
A Comparative Study of Formal Concept Analysis and Rough Set Theory in Data Analysis

The theory of rough sets and formal concept analysis are compared in a common framework based on formal contexts, in which different concept lattices can be constructed. Formal concept analysis focuses on concepts definable by conjunctions of properties, while rough set theory focuses on concepts definable by disjunctions of properties. The two produce different types of rules summarizing the knowledge embedded in data.

Yiyu Yao
Structure of Rough Approximations Based on Molecular Lattices

Generalizing the rough set model is an important aspect of rough set theory research, and it is very helpful in perfecting the theory. Developing rough set theory using algebraic systems has received great attention, and some researchers have reported significant developments. However, the base algebraic systems on which approximation operators are defined have been confined to special Boolean algebras, including set algebras and atomic Boolean lattices. This paper introduces molecular lattices as the base algebraic system. Based on the molecules of a molecular lattice, a mapping called a meta-mapping is defined. Consequently, approximation operators, which are more general and abstract than the approximation operators reported in earlier papers, are defined within the framework of molecular lattices. The properties of the approximations are also studied.

Jian-Hua Dai
Rough Approximations under Level Fuzzy Sets

The combination of fuzzy set and rough set theories leads to various models. Functional and set approaches are two categories based on different fuzzy representations. In this paper, we study rough approximations based on the notion of level fuzzy sets. Two rough approximation models, namely the α-level rough set and the β-level rough set, are proposed. We show that the β-level fuzzy rough set model can approximate a fuzzy set at different precisions.

W. -N. Liu, JingTao Yao, Yiyu Yao
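
A rough illustration of the level-set idea (our own toy construction, not the authors' exact α- and β-level models): an α-cut turns a fuzzy set into a crisp set, and varying the level changes the crisp set that is subsequently approximated.

```python
# Level (alpha-cut) of a fuzzy set: keep objects with membership >= alpha.
mu = {"o1": 0.9, "o2": 0.6, "o3": 0.3, "o4": 0.05}   # toy fuzzy membership

def alpha_cut(mu, alpha):
    return {x for x, m in mu.items() if m >= alpha}

# Different levels yield different crisp sets, hence rough approximations
# at different "precisions", as the abstract suggests for level models.
for alpha in (0.2, 0.5, 0.8):
    print(alpha, sorted(alpha_cut(mu, alpha)))
```
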
Fuzzy-Rough Modus Ponens and Modus Tollens as a Basis for Approximate Reasoning

We have proposed a fuzzy rough set approach without using any fuzzy logical connectives to extract gradual decision rules from decision tables. In this paper, we discuss the use of these gradual decision rules within modus ponens and modus tollens inference patterns. We discuss the difference and similarity between modus ponens and modus tollens and, moreover, we generalize them to formalize approximate reasoning based on the extracted gradual decision rules. We demonstrate that approximate reasoning can be performed by manipulation of modifier functions associated with the gradual decision rules.

Masahiro Inuiguchi, Salvatore Greco, Roman Słowiński

Logic and Rough Sets

Rough Truth, Consequence, Consistency and Belief Revision

The article aims at revisiting the notion of rough truth proposed by Pawlak in 1987 [11] and investigating some of its ‘logical’ consequences. We focus on the formal deductive apparatus $\mathcal{L}_\mathcal{R}$ [12], which is sound and complete with respect to a semantics based on rough truth (extended to rough validity). The notion of rough consequence [4] is used in a modified form to formulate $\mathcal{L}_\mathcal{R}$. The system has some desirable features of ‘rough’ reasoning – e.g. roughly true propositions can be derived from roughly true premisses in an information system. Further, rough consistency [4] is used to prove completeness. These properties of $\mathcal{L}_\mathcal{R}$ motivate us to use it for a proposal of rough belief change. During change, the operative constraints on a system of beliefs are rough consistency preservation and deductive closure with respect to $\mathcal{L}_\mathcal{R}$. Following the AGM [1] line, postulates for defining revision and contraction functions are presented. Interrelationships of these functions are also proved.

Mohua Banerjee
A Note on Ziarko’s Variable Precision Rough Set Model and Nonmonotonic Reasoning

Granular reasoning is a way of reasoning using granularized possible worlds and lower approximation in rough set theory. However, it can deal only with monotonic reasoning. The extended lower approximation in Ziarko’s variable precision rough set model is therefore introduced to describe nonmonotonic reasoning.

Tetsuya Murai, Masayuki Sanada, Y. Kudo, Mineichi Kudo
Fuzzy Reasoning Based on Propositional Modal Logic

In order to deal with some vague assertions more efficiently, fuzzy modal logics have been discussed by many researchers. This paper introduces the notion of fuzzy assertion based on propositional modal logic. As an extension of the traditional semantics of modal logics, a fuzzy Kripke semantics is considered, the formal system of fuzzy reasoning based on propositional modal logic is established, and the properties concerning the satisfiability of the reasoning system are discussed.

Zaiyue Zhang, Yuefei Sui, Cungen Cao

Granular Computing

Approximation Spaces and Information Granulation

We discuss approximation spaces in the granular computing framework. Such approximation spaces generalise the approaches to concept approximation existing in rough set theory. Approximation spaces are constructed as higher-level information granules and are obtained as the result of complex modelling. We present illustrative examples of modelling approximation spaces, including approximation spaces for function approximation, inducing concept approximation, and some other information granule approximations. In modelling such approximation spaces we make the important assumption that not only objects but also more complex information granules involved in approximations are perceived using only partial information about them.

Andrzej Skowron, Roman Swiniarski, Piotr Synak
Granular Language and Its Applications in Problem Solving

In AI problem solving, a global problem is decomposed into several local ones, and ultimately into elementary sub-problems that can be solved directly; the solutions of the various sub-problems are then amalgamated into an answer to the global problem. In practice, the problem to be solved is represented by a logical formula, which is decomposed into several sub-formulas, and ultimately into propositions or predicates. The solutions of these sub-formulas are amalgamated into an answer to the global formula. If the formula is described as a granulation, then it is collapsed into several sub-granules; accordingly, a collapsing algorithm and rules of amalgamation are defined. In order to realize this idea, a granular language and its calculus are discussed.

Qing Liu
Belief Reasoning, Revision and Fusion by Matrix Algebra

Representation of belief states is an important issue for knowledge-based systems. In this paper, we develop a matrix representation for ordered belief states and show that belief reasoning, revision and fusion can all be interpreted as operations of matrix algebra. Thus, the matrix representation can serve as the basis of an algebraic semantics for belief logic.

Churn-Jung Liau

Rough and Fuzzy Relations

On the Correspondence between Approximations and Similarity

This paper focuses on the use and interpretation of approximate databases where both rough sets and indiscernibility partitions are generalized and replaced by approximate relations and similarity spaces. Similarity spaces are used to define neighborhoods around individuals and these in turn are used to define approximate sets and relations. There is a wide spectrum of choice as to what properties the similarity relation should have and how this affects the properties of approximate relations in the database. In order to make this interaction precise, we propose a technique which permits specification of both approximation and similarity constraints on approximate databases and automatic translation between them. This technique provides great insight into the relation between similarity and approximation and is similar to that used in modal correspondence theory. In order to automate the translations, quantifier elimination techniques are used.

Patrick Doherty, Andrzej Szałas
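
A minimal sketch of the neighborhood construction described above, under our simplifying assumption of a finite universe and a merely reflexive similarity relation; neighborhoods then induce approximate sets:

```python
# Neighborhoods from a similarity relation, and the approximations they induce.
# sim need not be symmetric or transitive; only reflexivity is assumed here.
U = ["a", "b", "c", "d"]
sim = {("a", "a"), ("b", "b"), ("c", "c"), ("d", "d"),
       ("a", "b"), ("b", "a"), ("b", "c")}           # toy similarity space

def neighborhood(x):
    return {y for y in U if (x, y) in sim}

X = {"a", "b"}
lower = {x for x in U if neighborhood(x) <= X}       # surely in X
upper = {x for x in U if neighborhood(x) & X}        # possibly in X
print(lower, upper)   # {'a'} {'a', 'b'}
```
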
Toward Rough Knowledge Bases with Quantitative Measures

We present a language for defining new rough relations from given decision tables and we show how to query relations defined in this way. The language provides a uniform formalism for expressing rough data together with background knowledge, and for capturing well-known techniques such as the variable precision rough set model. Its essential feature is the use of quantitative measures, such as support, strength and accuracy.

Aida Vitória, Carlos Viegas Damásio, Jan Małuszyński
Considering Semantic Ambiguity and Indistinguishability for Values of Membership Attribute in Possibility-Based Fuzzy Relational Models

A possibility-based fuzzy relational model is proposed that considers semantic ambiguity and indistinguishability for the values of membership attributes. In order to eliminate the semantic ambiguity, a membership attribute is attached to every attribute, which clarifies where each value of a membership attribute comes from. What the values of membership attributes mean depends on the property of those attributes. In order to eliminate the indistinguishability of membership attribute values, these values are expressed by possibility distributions on the interval [0,1]. This clarifies what effect an imprecise data value allowed for an attribute has on the corresponding value of the membership attribute. Therefore, there is no semantic ambiguity and no indistinguishability for the values of membership attributes in the possibility-based fuzzy relational model.

Michinori Nakata

Foundations of Data Mining

Research on Integrating ORDBMS and Rough Set Theory

Integrating rough set theory with an ORDBMS to implement data mining inside the ORDBMS is a new application area. A suitable rough set algebra must be designed to implement tight coupling between rough sets and the ORDBMS. The equivalence-matrix algebra does not meet this requirement and must be extended. We extend the equivalence-matrix algebra to define the lower and upper approximation, relative core and reduction. A prototype system, called RSORDMS, has been designed and implemented in our research, adding data mining capabilities to the ORDBMS while preserving its traditional power. In the prototype system, a technique is adopted that combines rough sets with SQL to make data cleaning and core computation very fast, as confirmed by experiments. The whole prototype system performs well.

HuiQin Sun, Zhang Xiong, Ye Wang
Feature Subset Selection Based on Relative Dependency between Attributes

Feature subset selection is an important component of knowledge discovery and data mining systems, helping to reduce data dimensionality. Rough set theory provides a mechanism for selecting feature subsets. In the rough set community, most feature subset selection algorithms are attribute reduct-oriented; that is, they find minimum reducts of the conditional attributes of a decision table. Two main approaches to finding attribute reducts are categorized as discernibility function-based and attribute dependency-based. These algorithms, however, suffer from intensive computation of either discernibility functions (for the former) or positive regions (for the latter). In this paper, we propose a new concept, called relative attribute dependency, and present a sufficient and necessary condition for a minimum conditional attribute reduct of a decision table, expressed in terms of the relative attribute dependency. The relative attribute dependency can be calculated by counting the distinct rows of the sub-decision table, instead of generating discernibility functions or positive regions, so the computational efficiency of finding minimum reducts is greatly improved. We develop two algorithms for finding minimum reducts of the conditional attributes: a brute-force algorithm and a heuristic algorithm using attribute entropy as the heuristic function. We also illustrate the results of these algorithms with an example.

Jianchao Han, Xiaohua Hu, Tsau Young Lin
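
The counting shortcut lends itself to a few lines of code. Assuming, as we read the abstract, that the relative dependency of an attribute subset B is the ratio of the number of distinct rows projected on B to the number projected on B together with the decision attribute, a hedged sketch with toy data:

```python
# Relative attribute dependency by counting distinct projected rows.
# rows: list of dicts; conditional attributes "a", "b"; decision attribute "d".
rows = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "yes"},
    {"a": 0, "b": 1, "d": "no"},
    {"a": 0, "b": 1, "d": "no"},
]

def distinct(rows, attrs):
    """Number of distinct rows after projecting on the given attributes."""
    return len({tuple(r[a] for a in attrs) for r in rows})

def relative_dependency(rows, B, d="d"):
    # |distinct rows on B| / |distinct rows on B plus the decision attribute|
    return distinct(rows, B) / distinct(rows, B + [d])

print(relative_dependency(rows, ["a", "b"]))   # 1.0: {a, b} determines d
print(relative_dependency(rows, ["b"]))        # 2/3: b alone does not
```
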
Granular Computing on Extensional Functional Dependencies for Information System

In this paper, a new approach to discovering extensional functional dependencies in information systems is presented, based on information granules and their bit representations. First, the principles of information granules, granular computing and the machine-oriented model for data mining are investigated. In addition, the approach to identifying classical functional dependencies, identity dependencies and partial dependencies is discussed, and some conclusions on extensional functional dependencies are obtained. Since the information granules are represented as bits, the data format is close to the internal representation of the computer; hence, the patterns contained in the information system can be mined directly.

Qiusheng An, Junyi Shen
Greedy Algorithm for Decision Tree Construction in Context of Knowledge Discovery Problems

In the paper a greedy algorithm for minimization of decision tree depth is considered and bounds on the algorithm’s precision are discussed. Under some natural assumptions on the class NP and on the class of tables considered, this algorithm is, apparently, close to the best approximate polynomial algorithms for minimization of decision tree depth. Unfortunately, the performance ratio of this algorithm grows almost as the natural logarithm of the number of rows in the table. Besides the usual greedy algorithm, we study a greedy algorithm with threshold, which constructs approximate decision trees. Such an approach is fully admissible if we view decision trees as a means of knowledge representation. We obtain upper bounds on the depth of decision trees constructed by these algorithms which are independent of the number of rows in the table.

Mikhail Ju. Moshkov
GAMInG – A Framework for Generalization of Association Mining via Information Granulation

Rather than finding new association-mining types one at a time, in this paper we propose a framework, called Generalization of Association Mining via Information Granulation (GAMInG), based on which new association-mining types, capable of discovering new patterns hidden in data, can be systematically defined.

Ying Xie, Vijay V. Raghavan
Mining Un-interpreted Generalized Association Rules by Linear Inequalities: Descriptive/Deductive Data Mining Approach

In the spirit of descriptive statistical methods, data mining is viewed here as a deductive science: no inductive generalizations or predictive assertions are made. We call this approach descriptive/deductive data mining (DDM) to stress the spirit of descriptive statistical methods and the role of mathematical deduction. Such a seemingly restrictive methodology, somewhat surprisingly, turns out to be quite far-reaching. Previously, we observed at ICDM02 that (1) isomorphic relations have isomorphic patterns (classical association rules); this observation implies, from a data mining perspective, that relations and patterns are syntactic in nature. We also reported that (2) attributes or features (including un-interpreted ones) of a given relation can be enumerated mathematically, though in intractable time. In this paper, we prove that (3) generalized association rules (including un-interpreted rules) can be discovered by solving a finite set of integral linear inequalities within polynomial time.

Tsau Young Lin
A Graded Applicability of Rules

We address the problem of rough applicability of rules within the framework of approximation spaces. The graded applicability of a rule for an object of an approximation space, introduced here, is viewed as a fundamental form of rough applicability. It is based on the graded meaning of a set of formulas, defined in our previous works. The notion of graded applicability enjoys a number of interesting properties and is useful, in our opinion, in modeling rule-based complex systems such as multi-agent systems.

Anna Gomolińska
On the Degree of Independence of a Contingency Matrix

A contingency table summarizes the conditional frequencies of two attributes and shows how these two attributes are dependent on each other. This table is thus a fundamental tool for pattern discovery with conditional probabilities, such as rule discovery. In this paper, a contingency table is interpreted from the viewpoint of statistical independence and granular computing. The first important observation is that a contingency table compares two attributes with respect to the number of their equivalence classes. For example, an n × n table compares two attributes with the same granularity, while an m × n (m ≥ n) table compares two attributes with different granularities. The second important observation is that matrix algebra is a key tool for the analysis of this table. In particular, the rank, as a degree of independence, plays a very important role in evaluating the degree of statistical independence. Relations between the rank and the degree of dependence are also investigated.

Shoji Hirano, Shusaku Tsumoto
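
The rank observation is easy to verify numerically: a contingency table of two statistically independent attributes is (up to sample noise) the outer product of its marginals, hence rank 1, while dependence raises the rank. A small sketch with made-up counts:

```python
import numpy as np

# Independent case: the table is the outer product of its marginals (rank 1).
row_marginal = np.array([0.6, 0.4])
col_marginal = np.array([0.5, 0.3, 0.2])
independent = 100 * np.outer(row_marginal, col_marginal)

# Dependent case: a diagonal-heavy table cannot be factored this way.
dependent = np.array([[30.0, 5.0, 5.0],
                      [5.0, 30.0, 25.0]])

print(np.linalg.matrix_rank(independent))  # 1 -> statistical independence
print(np.linalg.matrix_rank(dependent))    # 2 -> some dependence
```
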
K Nearest Neighbor Classification with Local Induction of the Simple Value Difference Metric

The classical k nearest neighbor (k-nn) classification assumes that a fixed global metric is defined and searching for nearest neighbors is always based on this global metric. In the paper we present a model with local induction of a metric. Any test object induces a local metric from the neighborhood of this object and selects k nearest neighbors according to this locally induced metric. To induce both the global and the local metric we use the weighted Simple Value Difference Metric (SVDM). The experimental results show that the proposed classification model with local induction of a metric reduces classification error up to several times in comparison to the classical k-nn method.

Andrzej Skowron, Arkadiusz Wojna
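
A compressed sketch of the local-metric idea, using our own simplification of a value difference metric in which the distance between two attribute values is the L1 distance between the decision distributions they induce, estimated either globally or from a neighborhood only (the sample split below is hypothetical, not the paper's procedure):

```python
from collections import Counter

# Training data: (attribute value, decision). One symbolic attribute for brevity.
train = [("a", 0), ("a", 0), ("b", 0), ("b", 1), ("c", 1), ("c", 1)]

def vdm(sample):
    """Value difference: L1 distance between per-value decision distributions."""
    dist = {}
    for v in {x for x, _ in sample}:
        decs = [d for x, d in sample if x == v]
        c = Counter(decs)
        dist[v] = {d: c[d] / len(decs) for d in c}
    def delta(v, w):
        keys = set(dist.get(v, {})) | set(dist.get(w, {}))
        return sum(abs(dist.get(v, {}).get(d, 0) - dist.get(w, {}).get(d, 0))
                   for d in keys)
    return delta

global_metric = vdm(train)          # induced once from all training data
local_sample = train[:4]            # pretend: neighborhood of a test object
local_metric = vdm(local_sample)    # re-induced locally, as the abstract describes

print(global_metric("a", "c"))      # 2.0 under the global estimate
print(local_metric("a", "b"))       # 1.0 under the local estimate
```
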
A Note on the Regularization Algorithm

The Regularization Algorithm (also called Regularization Network) is a technique for solving problems of learning from examples – in particular, the problem of approximating a multivariate function from sparse data. We analyze the behavior of the Regularization Algorithm for a regularization parameter equal to zero. We propose an approximate version of the algorithm in order to overcome the computational cost for large data sets. We give a proof of convergence and an estimate of the approximation error.

Wojciech Jaworski

Incomplete Information Systems

Characteristic Relations for Incomplete Data: A Generalization of the Indiscernibility Relation

This paper shows that attribute-value pair blocks, used for many years in rule induction, may be used as well for computing indiscernibility relations for completely specified decision tables. Much more importantly, for incompletely specified decision tables, i.e., for data with missing attribute values, the same idea of attribute-value pair blocks is a convenient tool to compute characteristic sets, a generalization of equivalence classes of the indiscernibility relation, and also characteristic relations, a generalization of the indiscernibility relation. For incompletely specified decision tables there are three different ways lower and upper approximations may be defined: singleton, subset and concept. Finally, it is shown that, for a given incomplete data set, the set of all characteristic relations for the set of all congruent decision tables is a lattice.

Jerzy W. Grzymała-Busse
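
A hedged sketch of attribute-value blocks and characteristic sets, following our reading of the definitions for data with lost values marked "?": the block of an attribute-value pair collects the cases with that value, and the characteristic set of a case intersects the blocks of its known values.

```python
# Attribute-value blocks and characteristic sets for incomplete data.
# "?" marks a lost value: a case with "?" on attribute a joins no block of a.
cases = {
    1: {"temp": "high", "cough": "yes"},
    2: {"temp": "high", "cough": "?"},
    3: {"temp": "low",  "cough": "yes"},
}

def block(attr, val):
    return {c for c, row in cases.items() if row[attr] == val}

def characteristic_set(c):
    ks = set(cases)                   # start from the whole universe
    for attr, val in cases[c].items():
        if val != "?":                # lost values impose no constraint
            ks &= block(attr, val)
    return ks

for c in cases:
    print(c, characteristic_set(c))   # 1 {1}, 2 {1, 2}, 3 {3}
```
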
Data Decomposition and Decision Rule Joining for Classification of Data with Missing Values

In this paper we present a new approach to handling incomplete information and reducing classifier complexity. We describe a method, called D3RJ, that performs data decomposition and decision rule joining to avoid the need to reason with missing attribute values. As a consequence, a more complex reasoning process is needed than in known algorithms for decision rule induction. The original incomplete data table is decomposed into sub-tables without missing values, and methods for decision rule induction are applied to these sets. Finally, an algorithm for decision rule joining is used to obtain the final rule set from the partial rule sets. Using the D3RJ method it is possible to obtain a smaller set of rules and, in turn, better classification accuracy than with standard decision rule induction methods. We provide an empirical evaluation of the accuracy and model size of the D3RJ method on data with missing values of natural origin.

Rafał Latkowski, Michał Mikołajczyk

Interestingness

Bayesian Confirmation Measures within Rough Set Approach

Bayesian confirmation theory considers a variety of non-equivalent confirmation measures quantifying the degree to which a piece of evidence supports a hypothesis. In this paper, we apply some of the most relevant confirmation measures within the rough set approach. Moreover, we discuss interesting properties of these confirmation measures and we propose a new property of monotonicity that is particularly relevant within the rough set approach. The main result of this paper states which of the confirmation measures considered in the literature have the desirable properties from the viewpoint of the rough set approach.

Salvatore Greco, Zdzisław Pawlak, Roman Słowiński
Discovering Maximal Potentially Useful Association Rules Based on Probability Logic

Apriori-like algorithms are widely used for mining support-based association rules. In such applications, the underlying assumption is that rules over frequent itemsets are useful or interesting to the users. However, in many applications infrequent events may be of interest, or the frequency of events may bear no relationship to their interestingness to the user. Apriori-like algorithms do not offer efficient methods for discovering interesting infrequent itemsets. In this paper, we present a new model of Knowledge Discovery in Databases (KDD) based on probability logic and develop the notion of Maximal Potentially Useful (MaxPUF) patterns, leading to a new class of association rules called maximal potentially useful (MaxPUF) association rules: a set of high-confidence rules that are most informative and potentially useful. MaxPUF association rules are defined independently of a support constraint, and are therefore suitable for applications in which both frequent and infrequent itemsets may be of interest. We develop an efficient algorithm to discover MaxPUF association rules. The efficiency and effectiveness of our approach is validated by experiments based on weather data collected at the Clay Center, Nebraska, USA from 1959 to 1999.

Jitender Deogun, Liying Jiang, Vijay V. Raghavan
Semantics and Syntactic Patterns in Data

This paper examines the semantic and syntactic views of classical association rule mining. A relational table is considered as a (knowledge) representation of a universe (= the set of real-world entities). A pattern is said to be realizable if there is a real-world phenomenon corresponding to it. The central two issues are: Why do unrealizable data patterns appear, and how can they be pruned away? For this purpose, the semantics of the original schema are considered, and semantics is included in the knowledge representation of the universe. Based on model theory, two new relational structures, functions and binary relations, are added to represent some additional semantics of the given universe. Association rule mining based on such additional semantics is considered.

Eric Louie, Tsau Young Lin

Multiagents and Information Systems

Dialogue in Rough Context

Two agents Ag1 and Ag2 confront each other with their own perspectives represented by approximation spaces (U,R1) and (U,R2) [3]. They enter into a dialogue (negotiation) over either the extension of the same ‘concept’ or over two pieces of information or beliefs, A and B (subsets of U), the first for Ag1 and the second for Ag2. A combined approximation space (U,R) emerges out of the superimposition of the equivalence classes due to R1 and R2. Each agent performs some specified operations one at a time; after an operation by an agent, the turn passes to the co-agent. Rounds and effects of rounds are then defined, and a dialogue is a sequence of rounds. There are certain rules of the game that depend on the three approximation spaces. The result of a dialogue after n rounds starting with the initial sets A, B is a pair (A_n, B_n), where A_n and B_n are supersets of A and B respectively. A dialogue is characterised depending on the various kinds of overlap of the sets A_n and B_n and their lower and upper approximations. It is satisfactory if the sets A_n and B_n turn out to be roughly equal with respect to the approximation space (U,R). Dialogues of lower satisfaction are not altogether rejected; this latter type generalizes the notion of Belief-Merging [2]. Some preliminary observations are made and future directions of work are indicated.

Mihir K. Chakraborty, Mohua Banerjee
Constrained Sums of Information Systems

We study properties of infomorphisms between information systems. In particular, we interpret infomorphisms between information systems in terms of sums with constraints (constrained sums, for short) that are some operations on information systems. Applications of approximation spaces, used in rough set theory, to study properties of infomorphisms are included.

Andrzej Skowron, Jarosław Stepaniuk
Defeasible Deontic Control for Discrete Events Based on EVALPSN

We have developed an annotated logic program called EVALPSN (Extended Vector Annotated Logic Program with Strong Negation), which can deal with defeasible deontic reasoning and some kinds of contradiction, and have applied EVALPSN to automatic safety verification, traffic signal control, robot action control, etc. Generally, discrete event control can be represented by deontic rules, such as “it is forbidden for both the cat and the mouse to occupy the same room simultaneously”, and must deal with contradiction to avoid unexpected system states. We show that such discrete event control can be easily formalized in EVALPSN. In this paper, we introduce the application of EVALPSN to discrete event control, taking the famous Cat and Mouse example.

Kazumi Nakamatsu, Hayato Komaba, Atsuyuki Suzuki, Chung-Lun Lie, Sheng-Luen Chung

Fuzzy Logic and Modeling

Rough Set Based Fuzzy Modeling by Occupancy Degree and Optimal Partition of Projection

The rough set theory suggested by Pawlak has the property that it can represent the degree of consistency between condition and decision attributes of data pairs that carry no linguistic information. In this paper, using this ability of rough set theory, we define a measure called occupancy degree, which represents the degree of consistency of premise and consequent variables in fuzzy rules describing experimental data pairs. We also propose a method by which we partition the data projected on the input space and find an optimal fuzzy rule table and membership functions of the input and output variables from data without preliminary linguistic information.

Chang-Woo Park, Young-Wan Cho, Jun-Hyuk Choi, Ha-Gyeong Sung
A Novel High Performance Fuzzy Controller Applied to Traffic Control of ATM Networks

In this paper, we design a fuzzy logic based intelligent high-performance traffic controller for Asynchronous Transfer Mode (ATM) networks. In the proposed fuzzy traffic controller, the actual mean cell rate of the traffic source is estimated and the traffic controller is adjusted so that its loss load equals the generated excess load. Simulation results show that the proposed fuzzy traffic controller can outperform traditional Usage Parameter Control (UPC) mechanisms.

Mahdi Jalili-Kharaajoo
Design of a Speed Drive Based on Fuzzy Logic for a Dual Three-Phase Induction Motor

This paper is devoted to a fuzzy speed controller for a multiphase induction machine. In the proposed approach, using a simple and flexible fuzzy controller structure, worthy results are obtained, such as high accuracy in speed regulation under all conditions, fast reaction to variations with minimum offset, minimum pulsation in the output torque, and minimum distortion in the phase current.

Mahdi Jalili-Kharaajoo

Rough Classification

Rough Set Theory Analysis on Decision Subdivision

The degree of subdivision of the decision attribute values influences the accuracy of approximation classification, the approximation quality of rules, the core attributes and the information entropy in decision systems based on rough set theory. The finer the discretization of the decision attributes of a decision table, the lower the accuracy of approximation classification, the approximation quality of rules, and the information entropy on any condition attribute set. Meanwhile, if the values of the decision attributes are divided into finer values, then the core attribute set obtained from the finer decision table must include the core attribute set obtained from the previous decision table. These conclusions are proved theoretically. The degree of discretization of decision attributes should therefore be chosen properly. This research is helpful for attribute reduction and for enhancing the confidence of decision rules.

Jiucheng Xu, Junyi Shen, Guoyin Wang
Rough Set Methods in Approximation of Hierarchical Concepts

Many learning methods ignore domain knowledge in the synthesis of concept approximations. We propose to use hierarchical schemes for learning approximations of complex concepts from experimental data, using inference diagrams based on domain knowledge. Our solution is based on the rough set and rough mereological approaches. The effectiveness of the proposed approach is evaluated on artificial data sets generated by a road traffic simulator.

Jan G. Bazan, Sinh Hoa Nguyen, Hung Son Nguyen, Andrzej Skowron
Classifiers Based on Two-Layered Learning

In this paper we present an exemplary classifier (classification algorithm) based on two-layered learning. In the first layer of learning, a collection of classifiers is induced from a part of the original training data set. In the second layer, classifiers are induced using patterns extracted from the already constructed classifiers on the basis of their performance on the remaining part of the training data. We report results of experiments performed on the following data sets, well known from the literature: diabetes, heart disease, australian credit (see [5]) and lymphography (see [4]). We compare the standard rough set method for inducing classifiers (see [1] for more details), based on minimal consistent decision rules (see [6]), with the classifier based on two-layered learning.

Jan G. Bazan
Rough Fuzzy Integrals for Information Fusion and Classification

This paper presents two fuzzy integrals extended to rough uncertainty, i.e. the rough upper and lower fuzzy integrals, and their extended properties are given. Furthermore, these two integrals are applied to information fusion and classification processes for rough features, and the corresponding extended models are proposed. These types of integrals generalize fuzzy integrals and enlarge their domain of application in fusion and classification under rough uncertainty. Examples show that they fuse or classify objects with rough features fairly effectively, in cases that existing methods based on real numbers cannot handle.

Tao Guan, Boqin Feng

Rough Sets and Probabilities

Towards Jointree Propagation with Conditional Probability Distributions

In this paper, we suggest a novel approach to jointree computation. Unlike all previous jointree methods, we propose that jointree computation should use conditional probability distributions rather than potentials. One salient feature of this approach is that the exact form of the messages to be transmitted throughout the network can be identified a priori. Consequently, irrelevant messages can be ignored, while relevant messages can be computed more efficiently. We discuss four advantages of our jointree propagation method.

Cory J. Butz, Hong Yao, Howard J. Hamilton
Condition Class Classification Stability in RST due to Continuous Value Discretisation

Rough Set Theory (RST) is a nascent technique for object classification, where each object in an information system is characterised and classified by a number of condition and decision attributes respectively. A level of continuous value discretisation (CVD) is often employed to reduce the possibly large granularity of the information system. This paper considers the effect of CVD on the association between condition and decision classes in RST. Moreover, the stability of the classification of the objects in the condition classes is investigated. Novel measures are introduced to describe the association of objects (condition classes) with the different decision classes.

Malcolm J. Beynon
The Rough Bayesian Model for Distributed Decision Systems

The article presents a new approach to understanding the concepts of rough set theory, based on the inverse probabilities derivable from distributed decision systems. The Rough Bayesian model – a novel probabilistic extension of rough sets related to the Bayes factor and Bayesian methods of statistical hypothesis testing – is proposed. Advantages of the Rough Bayesian model are illustrated by examples.

Dominik Ślȩzak

Variable Precision Rough Set Model

On Learnability of Decision Tables

The article explores the learnability of decision tables acquired from data within the frameworks of the rough set and variable precision rough set models. Measures of learning problem complexity and of learned table domain coverage are proposed. Several methods for enhancing the learnability of decision tables are discussed, including a new technique based on value reducts.

Wojciech Ziarko
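
For context, the variable precision model relaxes set inclusion to majority inclusion controlled by a precision threshold. A minimal sketch in the conditional-probability formulation, with toy equivalence classes of our own:

```python
# Variable precision rough sets: beta-lower approximation via majority inclusion.
# An equivalence class is accepted if at least a fraction beta of it lies in X.
classes = [{"o1", "o2", "o3"}, {"o4", "o5"}, {"o6"}]
X = {"o1", "o2", "o4", "o6"}

def beta_lower(classes, X, beta):
    return set().union(*[c for c in classes
                         if len(c & X) / len(c) >= beta] or [set()])

print(beta_lower(classes, X, 1.0))   # {'o6'}: classical lower approximation
print(beta_lower(classes, X, 0.6))   # {'o1', 'o2', 'o3', 'o6'}: relaxed
```
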
Remarks on Approximation Quality in Variable Precision Fuzzy Rough Sets Model

In this paper some properties of the variable precision fuzzy rough sets model are considered. A new way of determining the positive region of a classification is proposed, which is useful for evaluating approximation quality in variable precision fuzzy or crisp rough set applications. The notions of the fuzzy rough weighted mean u-lower and l-upper approximations are discussed. Fuzzy rough approximations are evaluated based on selected R-implicators.

Alicja Mieszkowicz-Rolka, Leszek Rolka
The Elucidation of an Iterative Procedure to β-Reduct Selection in the Variable Precision Rough Sets Model

One area of study in rough set theory is the ability to select a subset of the condition attributes which adequately describes an information system. For the variable precision rough sets model (VPRS), the associated β-reduct selection process is compounded by a β value defining the VPRS majority inclusion relation for object classification. This paper investigates the role of an iterative procedure in the necessary β-reduct selection process.

Malcolm J. Beynon

Spatial Reasoning

A Logic-Based Framework for Qualitative Spatial Reasoning in Mobile GIS Environment

Mobile computing technology has grown rapidly in the past decade; however, there still exist important constraints that complicate work with a mobile information system. The limited resources of mobile computing restrict some features that are available in traditional computing technology. In this article we suggest an idea based on space and time partitioning in order to provide a paradigm that treats moving objects in a mobile GIS environment. A logic-based framework for representing and reasoning about qualitative spatial relations over moving agents in space and time is proposed. We motivate the use of the influenceability relation as the primary relation and show how a logical calculus can be built up from this basic concept. From this primary influenceability relation we derive the connection relation as a basis for topological relations, and a kind of time order as a basis for time. The framework finds applications in intelligent transportation systems (ITS) and other mobile autonomous navigation systems.

Mohammad Reza Malek
Spatial Object Modeling in Intuitionistic Fuzzy Topological Spaces

In geoinformation systems (GIS) there is a need to model spatial regions with indeterminate boundaries and under uncertainty. Although fuzzy logic methods are of great interest in many GIS applications, traditional fuzzy logic has two important deficiencies: first, to apply fuzzy logic we need to assign a crisp membership function to every property and every value; second, it does not distinguish between the situation in which there is no knowledge about a certain statement and the situation in which the belief in the statement in favor and against is the same. In order to solve these problems, we are motivated to use intuitionistic fuzzy logic. This paper gives fundamental concepts and properties of an intuitionistic fuzzy spatial region. We provide a theoretical framework for both dominant ontologies used in GIS, namely point-set topology and the Region Connection Calculus.

Mohammad Reza Malek
Rough Spatial Interpretation

Rough sets are a new approach to uncertainties in spatial analysis. In this paper, we complete three pieces of work under the umbrella of rough space. First, a set of simplified rough symbols is extended on the basis of existing rough symbols, in terms of rough interpretation and specialized indication. Second, the rough spatial entity is proposed in order to study the real world as it is, without forcing uncertainties into a crisp set. Third, rough spatial topological relationships are studied by using rough matrices and their figures. The relationships are divided into three types: crisp entity and crisp entity (CC), rough entity and crisp entity (RC), and rough entity and rough entity (RR); a universal intersection equation is further developed. Finally, the rough membership function is extended with the gray scale in our case study, and maximum and minimum maps of a river thematic classification are generated via the rough membership function and the rough relationships.

Shuliang Wang, Hanning Yuan, Guoqing Chen, Deren Li, Wenzhong Shi

Reduction

A Scalable Rough Set Knowledge Reduction Algorithm

Knowledge reduction algorithms based on rough sets play an important role in KDD because of their advantage in dealing with uncertain data. However, it is hard for classical rough set knowledge reduction algorithms to deal with huge data sets. A structure called the Class Distribution List (CDL) is presented in this paper to express the distribution of all attribute values in the whole sample space. With database technology, a CDL can be generated by classifying the original data sets. A group of rough-set-based knowledge reduction algorithms is then revised using the CDL. This method can process huge data sets directly. As a framework, the CDL method can also be used in other rough set algorithms to improve their scalability without decreasing their accuracy. The efficiency of our algorithms is demonstrated by simulation experiments.

Zhengren Qin, Guoyin Wang, Yu Wu, Xiaorong Xue
Tree-Like Parallelization of Reduct and Construct Computation

The paper addresses the problem of parallel computing in reduct/construct generation. Reducts are subsets of attributes that may be successfully applied in information/decision table analysis. Constructs, defined in a similar way, represent a notion that is a kind of generalization of the reduct: they ensure both discernibility between pairs of objects belonging to different classes (in which they follow reducts) and similarity between pairs of objects belonging to the same class (which is not the case with reducts). Unfortunately, exhaustive sets of minimal constructs, like sets of minimal reducts, are NP-hard to generate. To speed up the computations, the original task is decomposed into multiple subtasks that are executed in parallel. The paper presents a so-called constrained tree-like model of parallelization of this task and illustrates the practical behaviour of the algorithm in a computational experiment.

Robert Susmaga
Heuristically Fast Finding of the Shortest Reducts

It is well known that finding the shortest reduct is NP-hard. In this paper, we present a fast heuristic way of finding shortest relative reducts by exploiting the specific nature of the input data. As a byproduct, we show that the shortest relative reducts can be found in polynomial time, provided we know a priori that the length of the shortest reducts is bounded by a constant, that is, independent of the column size n. It should be noted that there are heuristic factors in the algorithm (“speeds” are not guaranteed), but the results, namely the shortest reducts found, are exact answers.

Tsau Young Lin, Ping Yin
Study on Reduct and Core Computation in Incompatible Information Systems

Reduct and core computation is one of the key problems in rough set theory owing to its extensive applications in data mining. Much attention has so far been paid to it in compatible information systems. In practice, however, many information systems are incompatible because of noise or incomplete data. In this paper, reduct and core computation for incompatible information systems is studied from the algebraic point of view. A new method to construct the discernibility matrix is proposed, which generalizes the methods presented by Hu (1995), Ye (2002) and Qing (2003). Moreover, the results are also suitable for compatible information systems.

Tian-rui Li, Ke-yun Qing, Ning Yang, Yang Xu
The Part Reductions in Information Systems

In this paper the notion of part reduction is proposed for information systems, to describe the minimal description of a definable set by attributes of the given information system. The part reduction can provide a more optimal description of a single decision class than the existing reductions and relative reductions. It is proven that the core of a reduction or relative reduction can be expressed as the union of the cores of part reductions. This provides a deeper insight into the classical reductions and relative reductions of information systems, so that a unified framework for the reductions of information systems can be set up. The discernibility matrix method for computing reductions is also generalized to compute part reductions in information systems.

Chen Degang

Rule Induction

Rules from Belief Networks: A Rough Set Approach

A new version of the BeliefSEEKER software, incorporating some aspects of rough set theory, is discussed in this paper. The new version is capable of generating certain belief networks (for consistent data) and possible belief networks (for inconsistent data). Both types of networks can then be readily converted into respective sets of production rules, including both certain and possible rules. The new version, or broadly speaking the methodology, was tested by mining a melanoma database for the best descriptive attributes of the skin illness. It was found that both types of knowledge representation can be readily used for classification of melanocytic skin lesions.

Teresa Mroczek, Jerzy W. Grzymała-Busse, Zdzisław S. Hippe
The Bagging and n²-Classifiers Based on Rules Induced by MODLEM

An application of the rule induction algorithm MODLEM to construct multiple classifiers is studied. Two different such classifiers are considered: the bagging approach, where classifiers are generated from different samples of the learning set, and the n²-classifier, which is specialized for solving multiple-class learning problems. This paper reports results of an experimental comparison of these multiple classifiers and the single, MODLEM based, classifier on several data sets.

Jerzy Stefanowski
A Parallel Approximate Rule Extracting Algorithm Based on the Improved Discernibility Matrix

A parallel rule-extracting algorithm based on the improved discernibility matrix [2] is proposed; in this way, a large amount of raw data can be divided into small portions to be processed in parallel. A confidence factor is also introduced into the rule sets to obtain uncertain rules. The most important advantage of this algorithm is that it does not need to calculate the discernibility matrix corresponding to the overall data.

Liu Yong, Xu Congfu, Pan Yunhe
Decision Rules in Multivalued Decision Systems

The paper presents some notions from the area of decision system analysis defined for systems with multifunctions as attributes. Apart from retuned notions of indiscernibility relation, reduct and decision rule, which are natural generalizations of the respective classical notions, an algorithm for generating minimal decision rules in the considered type of decision systems is described. Moreover, we briefly compare these rules with the ones generated as for classical decision systems. An adapted confusion matrix is presented to show the output of classifying new objects into the respective decision classes. We also suggest, as an example, a kind of real-life data that is suitable for analysis by the presented algorithms.

Wojciech Rzasa, Artur Paluch, Zbigniew Suraj
Multicriteria Choice and Ranking Using Decision Rules Induced from Rough Approximation of Graded Preference Relations

The approach described in this paper can be applied to support multicriteria choice and ranking of actions when the input preferential information acquired from the decision maker is a graded pairwise comparison (or ranking) of reference actions. It is based on a decision-rule preference model induced from a rough approximation of the graded comprehensive preference relation among the reference actions. The set of decision rules applied to a new set of actions provides a fuzzy preference graph, which can be exploited by an extended fuzzy net flow score to build a final ranking.

Philippe Fortemps, Salvatore Greco, Roman Słowiński
Measuring the Expected Impact of Decision Rule Application

Decision rules induced from a data set allow one to particularize the relationships between condition and decision factors. Several indices can be used to characterize the most significant decision rules based on “historical” data, but they are not able to measure the impact that these rules (or strategies derived from them) will produce in the future. Thus, in this paper, a new methodology is introduced to quantify the impact that a strategy derived from decision rules may have on a real-life situation in the future. The utility of this approach is illustrated by an example.

Salvatore Greco, Benedetto Matarazzo, Nello Pappalardo, Roman Słowiński
Detection of Differences between Syntactic and Semantic Similarities

One of the most important problems with rule induction methods is that it is very difficult for domain experts to check the millions of rules generated from large datasets; discovery from these rules requires deep interpretation based on domain knowledge. Although several solutions have been proposed in studies on data mining and knowledge discovery, these studies have not focused on the similarities between the rules obtained. When one rule r1 has reasonable features and another rule r2 with high similarity to r1 includes unexpected factors, the relation between these rules can become a trigger for knowledge discovery. In this paper, we propose a visualization approach that shows the similarity relations between rules based on multidimensional scaling, which assigns a two-dimensional Cartesian coordinate to each data point from the information about the similarities between this data point and the others. We evaluated this method on two medical data sets; the experimental results show that knowledge useful for domain experts could be found.

Shoji Hirano, Shusaku Tsumoto
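
A sketch of the visualization step, using scikit-learn's MDS on a made-up rule dissimilarity matrix (the paper's own similarity measure between rules is not reproduced here):

```python
import numpy as np
from sklearn.manifold import MDS

# Made-up pairwise dissimilarities between four induced rules.
d = np.array([[0.0, 0.2, 0.9, 0.8],
              [0.2, 0.0, 0.8, 0.9],
              [0.9, 0.8, 0.0, 0.1],
              [0.8, 0.9, 0.1, 0.0]])

# Embed into 2D Cartesian coordinates so that similar rules land close
# together and rules with unexpected factors stand out visually.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(d)
print(coords)   # one (x, y) point per rule, ready for a scatter plot
```
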

Rough Sets and Neural Network

Processing of Musical Data Employing Rough Sets and Artificial Neural Networks

This paper presents system assumptions for automatic recognition of music and musical sounds. An overview of the MPEG-7 standard, focused on audio information description, is given. The paper discusses some problems in audio information analysis related to efficient MPEG-7-based applications. The effectiveness of the implemented low-level descriptors for automatic recognition of musical instruments is presented on the basis of experiments. A discussion of the influence of the choice of descriptors on the recognition score is included. Experiments are carried out based on a decision system employing Rough Sets and Artificial Neural Networks. Conclusions are also included.

Bozena Kostek, Piotr Szczuko, Pawel Zwan
Integration of Rough Set and Neural Network for Application of Generator Fault Diagnosis

In this paper, the integration of rough sets and neural networks for fault diagnosis is put forward and applied to generator fault diagnosis. First, rough set theory is used to reduce the attributes of the diagnosis system. Constructed in accordance with practical needs, the optimized decision attribute set acts as the input of the artificial neural network used for fault diagnosis. The method has been applied at the Fengman hydroelectric power station, which testified to the feasibility of integrating rough sets and neural networks. Given enough data, this method could be extended to other generators.

Wei-ji Su, Yu Su, Hai Zhao, Xiao-dan Zhang
Harnessing Classifier Networks – Towards Hierarchical Concept Construction

The process of construction and tuning of classifier networks is discussed. The idea of relating the basic inputs with the target classification concepts via the internal layers of intermediate concepts is explored. Intuitions and relationships to other approaches, as well as the illustrative examples are provided.

Dominik Ślezak, Marcin S. Szczuka, Jakub Wróblewski
Associative Historical Knowledge Extraction from the Structured Memory

For intelligent automatic information processing, an efficient memory structure and an intelligent knowledge extraction mechanism using this structure should be studied. Accordingly, we propose a structured memory with a mechanism for extracting historical knowledge. In the retrieval stage, the empirical historical factor affects the reaction of a certain class. The system is applied to estimating the purchasing degree from the type of customers’ tastes, the pattern of commodities and the evaluation of a company.

JeongYon Shim

Clustering

Utilizing Rough Sets and Multi-objective Genetic Algorithms for Automated Clustering

This paper addresses the problem of estimating the number of clusters for web data by using a multi-objective evolutionary algorithm, NSGA-II (non-dominated sorting genetic algorithm), to eliminate the assignment of subjective weights. Clustering algorithms in general need the number of clusters a priori, and it is usually hard for domain experts to estimate this number. By applying a heuristic, feasible solutions are chosen from among the existing solutions for each clustering, and the minimum number of clusters satisfying our heuristic can be chosen as the actual number of clusters. As a last stage, a domain expert can analyze the results and make a decision. For the experiments, we used a course web site from the Department of Computer Science at the University of Calgary.

Tansel Özyer, Reda Alhajj, Ken Barker
Towards Missing Data Imputation: A Study of Fuzzy K-means Clustering Method

In this paper, we present a missing data imputation method based on one of the most popular techniques in Knowledge Discovery in Databases (KDD), namely clustering. We combine the clustering method with soft computing, which tends to be more tolerant of imprecision and uncertainty, and apply a fuzzy clustering algorithm to deal with incomplete data. Our experiments show that the fuzzy imputation algorithm performs better than the basic clustering algorithm.

Dan Li, Jitender Deogun, William Spaulding, Bill Shuart
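
A condensed sketch of the general idea as we read it, not the authors' exact algorithm: cluster the complete cases, compute fuzzy memberships of an incomplete record from its observed features, and impute the missing feature as the membership-weighted mean of the centroids. All data below is made up.

```python
import numpy as np

# Toy complete data in two obvious groups; record "z" misses its 2nd feature.
complete = np.array([[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9]])
centroids = np.array([complete[:2].mean(axis=0), complete[2:].mean(axis=0)])

z_observed = np.array([4.8])          # only the first feature is known
m = 2.0                               # fuzzifier, as in fuzzy c-means

# Fuzzy c-means memberships of z to each centroid, on observed dimensions only.
d = np.abs(centroids[:, 0] - z_observed[0]) + 1e-12
u = 1.0 / np.sum((d[:, None] / d[None, :]) ** (2 / (m - 1)), axis=1)

# Impute the missing feature as the membership-weighted centroid mean.
imputed = np.sum(u * centroids[:, 1])
print(u, imputed)    # memberships ~ (0.004, 0.996); imputed value ~ 5.03
```
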
K-means Indiscernibility Relation over Pixels

This article presents a new form of indiscernibility relation based on K-means clustering of pixel values. The end result is a partitioning of a set of pixel values into bins that represent equivalence classes. The proposed approach makes it possible to introduce a form of upper and lower approximation specialized to sets of pixel values. This approach is particularly relevant to a special class of digital images of power line ceramic insulators. Until now, determining when a ceramic insulator needs to be replaced has relied on visual inspection. With the K-means indiscernibility relation, it is now possible to automate the detection of faulty ceramic insulators. The contribution of this article is an approach to classifying power line insulators based on rough set methods and K-means clustering in the analysis of digital images.

James F. Peters, Maciej Borkowski
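
The core construction in the abstract is small enough to sketch. Below is a minimal, hypothetical illustration in plain NumPy (our own function names, not the authors’ code): 1-D K-means partitions pixel values into bins, each bin is an equivalence class, and the lower and upper approximations of a target pixel set follow the usual rough set definitions.

    import numpy as np

    def kmeans_1d(values, k=3, iters=50):
        """Tiny 1-D K-means: returns centers and a bin label per pixel value."""
        centers = np.linspace(values.min(), values.max(), k)
        for _ in range(iters):
            labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = values[labels == j].mean()
        return centers, labels

    def approximations(labels, target_idx):
        """Lower/upper approximation of a set of pixels w.r.t. the bins.
        A bin (equivalence class) enters the lower approximation if it lies
        entirely inside the target set, the upper one if it intersects it."""
        target = set(target_idx)
        lower, upper = set(), set()
        for j in np.unique(labels):
            cls = set(np.flatnonzero(labels == j).tolist())
            if cls <= target:
                lower |= cls
            if cls & target:
                upper |= cls
        return lower, upper
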
A New Cluster Validity Function Based on the Modified Partition Fuzzy Degree

Cluster validity is an important topic in cluster analysis, often reduced to the determination of the optimal number of clusters. Most available cluster validity functions are limited to numeric data sets and are ineffective for categorical data sets. To address this, a new cluster validity function, the modified partition fuzzy degree, is presented in this paper. By combining the partition entropy and the partition fuzzy degree, the new cluster validity function can be applied to any data set with numeric or categorical attributes. The experimental results illustrate the effectiveness of the proposed cluster validity function.

Jie Li, Xinbo Gao, Li-cheng Jiao
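
For reference, the classical fuzzy validity indices that the abstract builds on can be written down; the modified partition fuzzy degree itself is defined in the paper, so only the standard partition coefficient and partition entropy are shown here:

    \[
      PC(U) \;=\; \frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{c} u_{ik}^{2},
      \qquad
      PE(U) \;=\; -\,\frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{c} u_{ik}\log u_{ik},
    \]

where \(u_{ik}\) is the membership of object \(i\) in cluster \(k\); a good partition into \(c\) clusters has a high \(PC\) and a low \(PE\).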

Data Mining

On the Evolution of Rough Set Exploration System

We present the next version (2.1) of the Rough Set Exploration System – a software tool featuring a library of methods and a graphical user interface supporting a variety of rough-set-based and related computations. Methods, features, and abilities of the implemented software are discussed and illustrated with examples in data analysis and decision support.

Jan G. Bazan, Marcin S. Szczuka, Arkadiusz Wojna, Marcin Wojnarski
Discovering Maximal Frequent Patterns in Sequence Groups

In this paper, we give a general treatment of several kinds of sequences, such as customer sequences, document sequences, and DNA sequences. Large collections of transaction, document, and genomic information have been accumulated in recent years, and latently embedded in them is potentially significant knowledge for exploitation in the retailing industry, in information retrieval, and in medicine and the pharmaceutical industry, respectively. The approach taken here to the distillation of such knowledge is to detect strings that appear frequently in sequences, either within a given sequence (e.g., for a particular customer, document, or patient) or across sequences (e.g., from different customers, documents, or patients sharing a particular transaction, information retrieval, or medical diagnosis, respectively).

J. W. Guan, David A. Bell, Dayou Liu
Fuzzy Taxonomic, Quantitative Database and Mining Generalized Association Rules

Mining association rules and related knowledge from databases has been a focus of recent data mining research. This paper addresses the issue of how to mine generalized association rules from quantitative databases with a fuzzy taxonomic structure, and a new fuzzy taxonomic quantitative database model is proposed to solve the problem. The new model is demonstrated to be effective on a real-world database.

Hong-bin Shen, Shi-tong Wang, Jie Yang
Pattern Mining for Time Series Based on Cloud Theory Pan-concept-tree

An important time series mining problem is finding significant patterns in large time series sets. Two limitations of previous work are poor scalability and weak robustness to noise. Here we introduce an algorithm that uses symbolic mapping based on a concept tree. The slope of each subsequence is chosen to describe the series data, and the numerical data are then transformed into low-dimensional symbols by cloud models. Owing to the characteristics of the cloud models, the loss of data caused by linear preprocessing is mitigated, and the method is more tolerant of local noise. Second, cloud Boolean calculation is used to automatically produce the basic concepts as the leaf nodes of a pan-concept-tree, which leads to hierarchical discovery of knowledge. Last, the probabilistic projection algorithm is adapted so that comparison among symbols can be carried out with less CPU time. Experiments show strong robustness and low time and space complexity.

Yingjun Weng, Zhongying Zhu
Using Rough Set Theory for Detecting the Interaction Terms in a Generalized Logit Model

Although the logit model has been a popular statistical tool for classification problems, it is hard to determine its interaction terms because searching the whole sample space is NP-hard. In this paper, we provide another viewpoint on interaction effects based on information granulation. We reduce the sample space of interaction effects using decision rules from rough set theory, and then use a stepwise selection procedure to select the significant interaction effects. Based on our results, the interaction terms are significant, and the logit model with interaction terms performs better than the other two models.

Chorng-Shyong Ong, Jih-Jeng Huang, Gwo-Hshiung Tzeng
Optimization of the ABCD Formula for Melanoma Diagnosis Using C4.5, a Data Mining System

Our main objective was to improve the diagnosis of melanoma by optimizing the ABCD formula used by dermatologists in melanoma identification. In our previous research, an attempt to optimize the ABCD formula using the LEM2 rule induction algorithm was successful. This time we decided to replace LEM2 with C4.5, a tree-generating data mining system. The final conclusion is that, most likely, for C4.5 the original ABCD formula is already optimal and no further improvement is possible.

Ron Andrews, Stanislaw Bajcar, Jerzy W. Grzymała-Busse, Zdzisław S. Hippe, Chris Whiteley
A Contribution to Decision Tree Construction Based on Rough Set Theory

In this paper, an algorithm for building a decision tree by comparing information gain (entropy) is introduced, and the process of producing a univariate decision tree is given as an example. Based on rough set theory, a method for constructing multivariate decision trees is then discussed. Using the algorithm, the complexity of the decision tree is decreased, its construction is optimized, and data mining rules can be built.

Xumin Liu, Houkuan Huang, Weixiang Xu
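
Since the univariate step in the abstract above is the textbook information gain criterion, a compact sketch may help (plain Python; the toy example in the final comment is hypothetical):

    import numpy as np
    from collections import Counter

    def entropy(labels):
        """Shannon entropy of a label sequence."""
        n = len(labels)
        return -sum((c / n) * np.log2(c / n) for c in Counter(labels).values())

    def information_gain(attr_values, labels):
        """Entropy reduction from splitting `labels` on `attr_values`."""
        n = len(labels)
        remainder = 0.0
        for v in set(attr_values):
            sub = [l for a, l in zip(attr_values, labels) if a == v]
            remainder += len(sub) / n * entropy(sub)
        return entropy(labels) - remainder

    # e.g. information_gain(['sunny', 'rain', 'rain'],
    #                       ['no', 'yes', 'yes'])  -> approx. 0.918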

Image and Signal Recognition

Domain Knowledge Approximation in Handwritten Digit Recognition

Pattern recognition systems operating on large sets of complex objects often have to deal with atypical samples that defy most traditional classifiers. Such samples can be handled with additional domain knowledge from a human expert. We propose a framework for transferring knowledge from the expert and incorporating it into the learning process of our recognition system using rough mereology methods. We also show how this knowledge acquisition can be conducted in an interactive manner, with a large dataset of handwritten digits as an example.

Tuan Trung Nguyen
An Automatic Analysis System for Firearm Identification Based on Ballistics Projectile

Characteristic markings on the cartridge and projectile of a bullet are produced when a gun is fired. Over thirty different features within these marks can be distinguished, which in combination produce a ”fingerprint” for identification of a firearm. Given a means of automatically analyzing features within such a firearm fingerprint, it will be possible to identify not only the type and model of a firearm, but also each individual weapon, as effectively as human fingerprint identification can be achieved. In this paper, a new analytic system based on the fast Fourier transform (FFT) is proposed for automatically identifying projectile specimens digitized with the line-scan imaging technique. Experimental results show that the proposed system analyzes and identifies projectile specimens efficiently and precisely.

Jun Kong, Dongguang Li, Chunnong Zhao
Granulation Based Image Texture Recognition

Granular computing is an enabling technology that cuts across a broad spectrum of disciplines and has become important to many areas of application. In this paper, we present our model of information granulation, which is well suited to image recognition. We then construct an image granule framework and present a granulation-based image texture recognition algorithm. We compare our algorithm with several others, and the results show that it is effective and efficient.

Zheng Zheng, Hong Hu, Zhongzhi Shi
Radar Emitter Signal Recognition Based on Resemblance Coefficient Features

A resemblance coefficient (RC) feature extraction approach for radar emitter signals is proposed. The definition and properties of the RC are given, the feature extraction algorithm based on the RC is described in detail, and the performance of RC features is analyzed. Neural network classifiers are designed. Theoretical analysis and simulation experiments on feature extraction and recognition of 9 typical radar emitter signals show that RC features are not sensitive to noise and that the average recognition accuracy reaches 99.33%, indicating that the proposed approach is effective.

Gexiang Zhang, Haina Rong, Weidong Jin, Laizhao Hu
Vehicle Tracking Using Image Processing Techniques

Real-time tracking of vehicles in image sequences is presented. Moving vehicles are segmented by differential imaging followed by morphological dilation, and are recognized and tracked using statistical moments. Straight lines on the moving vehicles are found with the help of the Radon transform, and the direction of a moving object is calculated from the orientation of these lines along the principal axes of the object. The direction of the moving object and its displacement across the image sequence are used to calculate its velocity.

Seung Hak Rhee, Seungjo Han, Pan koo Kim, Muhammad Bilal Ahmad, Jong An Park
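
The segmentation front end of the pipeline described above is standard and can be sketched in a few lines (NumPy/SciPy; the threshold value and function names are our illustrative assumptions, not the authors’ code):

    import numpy as np
    from scipy import ndimage

    def segment_moving(frame_prev, frame_curr, thresh=25):
        """Differential image followed by morphological dilation."""
        diff = np.abs(frame_curr.astype(int) - frame_prev.astype(int))
        mask = diff > thresh                        # binary motion mask
        mask = ndimage.binary_dilation(mask, iterations=2)
        labeled, n = ndimage.label(mask)            # connected vehicle blobs
        centroids = ndimage.center_of_mass(mask, labeled, range(1, n + 1))
        return labeled, centroids

    # A velocity estimate then follows from the displacement of a blob
    # centroid between consecutive frames divided by the frame interval.
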
Classification of Swallowing Sound Signals: A Rough Set Approach

This paper introduces an approach to classifying swallowing sound signals, using rough set methods, to detect patients at risk of aspiration (choking). An important contribution of a recent study on segmenting the waveform of swallowing sound signals has been the use of the waveform dimension (WD) to describe signal complexity and major changes in signal variance. Prior swallowing sound classification studies have considered neither discretization in the extraction of features from swallowing sound data tables nor the derivation of decision rules for classifying swallowing sounds. In the study reported in this paper, both discretization (quantization of real-valued attributes) and non-discretization have been used to achieve attribute reduction and decision rule derivation.

Lisa Lazareck, Sheela Ramanna
Emotional Temporal Difference Learning Based Multi-layer Perceptron Neural Network Application to a Prediction of Solar Activity

Nonlinear time series prediction has in recent years been the subject of many methodological and applied studies in the fields of system identification and nonlinear prediction. An important benchmark has been the prediction of solar activity; with the marked increase in the practical importance of space weather forecasting, its motivation now reaches far beyond methodological concerns. In this paper, we use a bounded rationality decision-making procedure, whose utility has been demonstrated in several identification and control tasks, for predicting sunspot numbers. An emotional temporal difference learning based multi-layer perceptron neural network is introduced and applied to the prediction task.

Farzan Rashidi, Mehran Rashidi

Information Retrieval

Musical Metadata Retrieval with Flow Graphs

The CDDB database available on the Internet is widely used for the retrieval of metadata associated with almost any CD record. The database is queried automatically each time a CD is copied on a computer with appropriate software installed. However, this database can also be used for searching musical recordings. To that end, an advanced query algorithm was prepared, employing the concept of inference rule derivation from flow graphs recently introduced by Pawlak. The search engine utilizes knowledge acquired in advance and stored in flow graphs in order to enable searching the CD records database. Experimental results showing the effectiveness of analyzing musical metadata with this method are presented in the paper.

Andrzej Czyzewski, Bozena Kostek
A Fuzzy-Rough Method for Concept-Based Document Expansion

In this paper, a novel fuzzy-rough hybridization approach is developed for concept-based document expansion to enhance the quality of text information retrieval. First, departing from the traditional document representation, a given set of text documents is represented by an incomplete information system. To discover the relevant keywords to be complemented, the weights of terms that do not occur in a document are considered missing instead of zero. Fuzzy sets are used to handle the real-valued weights in the term vectors. Rough sets are then used to extract the potentially associated keywords that convey a concept for text retrieval in this incomplete information system. Finally, by incorporating a nearest neighbor mechanism, the missing weights of the extracted keywords of a document are filled in from the corresponding weights of the most similar document. Thus, the documents in the original text dataset are expanded, whereas the total number of keywords is reduced. Experiments are conducted using part of the Reuters-21578 data. Since the concept-based method is able to identify and supplement potentially useful information in each document, the performance of information retrieval in terms of recall is greatly improved.

Yan Li, Simon Chi-Keung Shiu, Sankar Kumar Pal, James Nga-Kwok Liu
Use of Preference Relation for Text Categorization

The sudden expansion of the web and the use of the Internet have caused some research fields to regain (or even increase) their old popularity. Among them, text categorization aims at developing a classification system for assigning a number of predefined topic codes to documents based on the knowledge accumulated in the training process. In this paper, we investigate a text categorization method based on a steepest descent induction algorithm combined with a multi-level preference relation over retrieval output, which is especially suitable for inducing classifiers over non-exclusive data sets. Our framework enables us to define a threshold value for relevance in such a way that it becomes specific to each category. Furthermore, a cache memory per category, obtained when training the classifier, makes text categorization adaptive. We found that a cache memory based on 8-4-2 (positive-boundary-negative) examples yielded almost true classifiers over the Reuters-21578 data set.

Hayri Sever, Zafer Bolat, Vijay V. Raghavan

Decision Support

An Expert System for the Utilisation of the Variable Precision Rough Sets Model

The variable precision rough sets model (VPRS) is a development of the original rough set theory (RST) and allows for the partial (probabilistic) classification of objects. This paper introduces a prototype VPRS expert system. A number of processes for the identification of the VPRS-related β-reducts and their respective β intervals over the domain of the β parameter are included. Three data sets are utilised in the exposition of the expert system.

Malcolm J. Beynon, Benjamin Griffiths
Application of Decision Units in Knowledge Engineering

In this paper we present the idea of decision units, which allows us to divide a set of rules into subsets – decision units. The paper considers decision units in selected problems of knowledge engineering. We present our own rule-base verification method based on the decision unit concept. Decision units are simple and intuitive models describing relations in a rule knowledge base, forming a direct superstructure of the rule-base model. Their use supports the knowledge engineer at the level of technical design and base realization.

Roman Siminski, Alicja Wakulicz-Deja
Fuzzy Decision Support System with Rough Set Based Rules Generation Method

This paper presents a system that combines the advantages of rough set methods and fuzzy set methods to achieve better classification. Fuzzy set theory supports approximate reasoning, while rough set theory is responsible for data analysis and the process of automatic fuzzy rule generation. The system was designed as a typical knowledge-based system consisting of four main parts – rule extractor, knowledge base, inference engine, and user interface – and proves to be a useful tool in various decision problems and in fuzzy control.

Grzegorz Drwal, Marek Sikora
Approximate Petri Nets for Rule-Based Decision Making

This paper describes a new Petri net model called the approximate Petri net. Approximate Petri nets can be used for knowledge representation and approximate reasoning. The net model presented in the paper is defined on the basis of rough set theory, fuzzy Petri nets, and coloured Petri nets.

Barbara Fryc, Krzysztof Pancerz, Zbigniew Suraj

Adaptive and Optimization Methods

Adaptive Linear Market Value Functions for Targeted Marketing

This paper presents adaptive linear market value functions for identifying customers with potential market value in targeted marketing. The performance of these methods is compared with standard data mining methods such as simple Naive Bayes. Experiments on real-world data show that the proposed methods are efficient and effective.

Jiajin Huang, Ning Zhong, Chunnian Liu, Yiyu Yao
Using Markov Models to Define Proactive Action Plans for Users at Multi-viewpoint Websites

Deciding on the best action plan to be tailored to and carried out for each user is the key to personalization. This is a challenging task, since as many elements of the environment as possible have to be taken into account when making decisions: the type of user, actual behaviour, goals to fulfil. The difficulty is even greater when dealing with web users, where decisions have to be made on-line and no salespeople are involved. In this paper, we propose an approach that integrates user typologies and behaviour patterns in a multidepartmental organization to decide the best action plan to be carried out at each particular moment. The key idea is to detect changes in user behaviour by means of Behaviour Evolution Models (a combination of discrete Markov models). In addition, an agent-based architecture is proposed for the implementation of the whole method.

Ernestina Menasalvas, Socorro Millán, P. Gonzalez
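
A discrete Markov model of user behaviour, the building block of the Behaviour Evolution Models mentioned above, reduces to estimating a transition matrix from observed sessions. A minimal sketch (plain NumPy; the integer session encoding is our hypothetical assumption, not the authors’ representation):

    import numpy as np

    def fit_markov(sessions, n_states):
        """Maximum-likelihood first-order Markov transition matrix from
        user sessions (each session: a list of integer behaviour states)."""
        counts = np.zeros((n_states, n_states))
        for s in sessions:
            for a, b in zip(s, s[1:]):
                counts[a, b] += 1
        rows = counts.sum(axis=1, keepdims=True)
        # Rows with no observations fall back to a uniform distribution.
        return np.divide(counts, rows,
                         out=np.full_like(counts, 1.0 / n_states),
                         where=rows > 0)

    def most_likely_next(P, state):
        """Predict the most probable next behaviour state."""
        return int(P[state].argmax())
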
A Guaranteed Global Convergence Particle Swarm Optimizer

The standard particle swarm optimizer may prematurely converge on suboptimal solutions that are not even guaranteed to be local extrema. Based on an analysis of the standard PSO, a new particle swarm optimizer, called stochastic PSO (SPSO), is presented that is guaranteed to converge to the global optimum with probability one. The global convergence analysis is carried out using the results of F. Solis and R. Wets. Finally, several simulated examples show that SPSO is more efficient than the standard PSO.

Zhihua Cui, Jianchao Zeng
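
For context, the standard PSO that SPSO improves on uses the familiar velocity and position updates. A minimal sketch (plain NumPy; the search bounds, inertia weight, and acceleration constants are illustrative choices, and this is the baseline, not the authors’ SPSO):

    import numpy as np

    def pso(f, dim, n=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
        """Standard global-best PSO minimizing f over R^dim."""
        rng = np.random.default_rng(seed)
        x = rng.uniform(-5, 5, (n, dim))            # particle positions
        v = np.zeros((n, dim))                      # particle velocities
        pbest = x.copy()                            # personal bests
        pval = np.apply_along_axis(f, 1, x)
        g = pbest[pval.argmin()]                    # global best
        for _ in range(iters):
            r1, r2 = rng.random((2, n, dim))
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
            x = x + v
            fx = np.apply_along_axis(f, 1, x)
            better = fx < pval
            pbest[better], pval[better] = x[better], fx[better]
            g = pbest[pval.argmin()]
        return g, pval.min()

    # pso(lambda z: (z ** 2).sum(), dim=2)  -> a point near the origin
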
Adaptive Dynamic Clone Selection Algorithms

Based on the antibody clonal selection theory of immunology, a novel artificial immune system algorithm, the adaptive dynamic clonal selection algorithm, is put forward. The new algorithm integrates local search with global search, and probabilistic evolutionary search with stochastic search. Compared with an improved genetic algorithm and other clonal selection algorithms, the new algorithm prevents premature convergence more effectively and converges faster. Numerical experiments on function optimization indicate that the new algorithm is effective and useful.

Haifeng Du, Licheng Jiao, Maoguo Gong, Ruochen Liu
Multiobjective Optimization Based on Coevolutionary Algorithm

With the intrinsic properties of multiobjective optimization problems in mind, a multiobjective coevolutionary algorithm (MOCEA) is proposed. In MOCEA, a Pareto crossover operator and three coevolutionary operators are designed to maintain population diversity and increase the convergence rate. Moreover, a crowding distance is used to reduce the size of the nondominated set. Experimental results demonstrate that MOCEA can find better solutions at a low computational cost, and that the solutions it finds scatter uniformly over the entire Pareto front.

Jing Liu, Weicai Zhong, Li-cheng Jiao, Fang Liu
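
The crowding distance mentioned in the abstract is a standard NSGA-II-style density estimate. A minimal sketch of the usual formulation (plain NumPy; the paper’s own modified measure may differ):

    import numpy as np

    def crowding_distance(F):
        """NSGA-II-style crowding distance for an objective matrix F (n x m):
        for each objective, neighbours' objective gap is accumulated,
        and boundary points are kept by assigning them infinite distance."""
        n, m = F.shape
        d = np.zeros(n)
        for j in range(m):
            order = F[:, j].argsort()
            d[order[0]] = d[order[-1]] = np.inf
            span = float(F[order[-1], j] - F[order[0], j]) or 1.0
            d[order[1:-1]] += (F[order[2:], j] - F[order[:-2], j]) / span
        return d

Thinning a nondominated set then amounts to keeping the points with the largest crowding distance, which preserves spread along the front.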

Bioinformatics

Extracting Protein-Protein Interaction Sentences by Applying Rough Set Data Analysis

In this paper, we introduce a way to apply rough set data analysis to the problem of extracting protein-protein interaction sentences from the biomedical literature. Our approach builds on decision rules over protein names, interaction words, and their mutual positions in sentences. In order to broaden the set of potential interaction words, we develop a morphological model that generates spelling and inflection variants of the interaction words. We evaluate the performance of the proposed method on a hand-tagged dataset of 1894 sentences and show a precision-recall break-even performance of 79.8% using leave-one-out cross-validation.

Filip Ginter, Tapio Pahikkala, Sampo Pyysalo, Jorma Boberg, Jouni Järvinen, Tapio Salakoski
Feature Synthesis and Extraction for the Construction of Generalized Properties of Amino Acids

Amino acid similarity matrices are used for protein sequence comparison. It has been shown previously that they can be reconstructed from equivalence classes between amino acids. The goal of the current study is to propose an algorithm for identifying the properties that generate these equivalence classes. An approximate reasoning method for feature extraction and synthesis is developed to this end. It is shown that these equivalence classes are related to the amino acid properties that are important for the formation of protein structure. The algorithm presented in this study works best for bit-patterns, which are frequently encountered in bit-vector representations.

Witold R. Rudnicki, Jan Komorowski
Improvement of the Needleman-Wunsch Algorithm

This paper provides a novel approach to improving the Needleman-Wunsch algorithm, one of the most basic algorithms in protein and DNA sequence pairwise alignment. First, the problem with this algorithm is outlined; then related work is briefly reviewed; finally, a modified Needleman-Wunsch algorithm and experimental results are presented. The importance of this process lies in its ability to accurately preserve common patterns in the sequences being aligned.

Zhihua Du, Feng Lin
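
For readers who want the unmodified baseline: the classic Needleman-Wunsch dynamic program fills an (n+1) x (m+1) score table and, at each cell, takes the best of a (mis)match and two gap moves. A minimal sketch (plain Python; the scoring constants are illustrative):

    def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
        """Classic global-alignment DP; returns the optimal alignment score."""
        n, m = len(a), len(b)
        S = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            S[i][0] = i * gap                       # leading gaps in b
        for j in range(1, m + 1):
            S[0][j] = j * gap                       # leading gaps in a
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                s = match if a[i - 1] == b[j - 1] else mismatch
                S[i][j] = max(S[i - 1][j - 1] + s,  # (mis)match
                              S[i - 1][j] + gap,    # gap in b
                              S[i][j - 1] + gap)    # gap in a
        return S[n][m]

    # needleman_wunsch("GATTACA", "GCATGCU")  -> best global score
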
The Alignment of the Medical Subject Headings to the Gene Ontology and Its Application in Gene Annotation

The Gene Ontology (GO) is a controlled vocabulary used for the annotation of genes. Assigning such terms to uncategorized genes is time-consuming work and a recurring task in biomedicine. The biomedical citations of the literature database MEDLINE are indexed with terms from the Medical Subject Headings (MeSH). We studied whether MeSH terms from gene-related MEDLINE entries could be translated to GO and used automatically for annotation purposes. We explored three MeSH-to-GO alignments: pairing similar MeSH and GO term synonyms, indirect linking through MeSH’s associations with Enzyme Commission (EC) terms and the official EC2GO translation table, and using association analysis to indicate which MeSH and GO terms co-occur most frequently in existing annotations. We show that an alignment can be found and, despite inconsistency in the use of MeSH terms as MEDLINE indexes, we conclude that GO annotation prediction using this alignment is useful in the manual annotation of genes.

Henrik Tveit, Torulf Mollestad, Astrid Lægreid

Medical Applications

Rough Set Methodology in Clinical Practice: Controlled Hospital Trial of the MET System

Acute abdominal pain in childhood is a common but diagnostically challenging problem facing Emergency Department personnel. Experienced physicians use a combination of key clinical attributes to assess and triage a child, but residents and inexperienced physicians may lack this ability. In order to assist them, we used knowledge discovery techniques based on rough set theory to develop a clinical decision model to support triage. The model was implemented as a module for the Mobile Emergency Triage (MET) system – a clinical decision support system for the rapid emergency triage of children with acute pain. The abdominal pain module underwent a clinical trial in the Emergency Department of a teaching hospital. The trial allowed us to compare, in a prospective manner, the accuracy of the system with the triage performance of the physicians and residents. The preliminary results are very encouraging and demonstrate the validity of developing computer-based support tools for clinical triage.

Ken Farion, Wojtek Michalowski, Roman Słowiński, Szymon Wilk, Steven Rubin
An Automated Multi-spectral MRI Segmentation Algorithm Using Approximate Reducts

We introduce an automated multi-spectral MRI segmentation technique based on approximate reducts derived from the data mining paradigm of rough set theory. We utilized the T1, T2, and PD MRI images from the Simulated Brain Database as a ”gold standard” to train and test our segmentation algorithm. The results suggest that approximate reducts, used alone or in combination with other classification methods, may provide a novel and efficient approach to the segmentation of volumetric MRI data sets.

Sebastian Widz, Kenneth Revett, Dominik Ślezak
Rough Set-Based Classification of EEG-Signals to Detect Intraoperative Awareness: Comparison of Fuzzy and Crisp Discretization of Real Value Attributes

Automated classification of calculated EEG parameters has been shown to be a promising method for detection of intraoperative awareness. In the present study, rough set-based methods were employed to generate classification rules. For these methods, discrete attributes are required. We compared a crisp and a fuzzy discretization of the real parameter values. Fuzzy discretization transforms one real attribute value to several discrete values. By combining the different (discrete) values of all attributes, several sub-objects were produced from a single original object. Rule generation from a training set of objects and classification of a test set provided good classification rates of approximately 90% for both crisp and fuzzy discretization. Fuzzy discretization resulted in a simpler and smaller rule set than crisp discretization. Therefore, the simplicity of the resulting classifier justifies the higher computational effort caused by fuzzy discretization.

Michael Ningler, Gudrun Stockmanns, Gerhard Schneider, Oliver Dressler, Eberhard F. Kochs
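
The fuzzy discretization step described above – one real attribute value mapping to several discrete values with degrees – is commonly realized with overlapping triangular membership functions. A minimal sketch of that generic construction (plain Python; the paper’s actual membership functions may differ):

    def triangular_memberships(x, centers):
        """Membership of a real value x in overlapping triangular fuzzy
        sets centered at `centers` (sorted). A value typically hits two
        adjacent sets, yielding several discrete values per attribute."""
        mu = []
        for i, c in enumerate(centers):
            left = centers[i - 1] if i > 0 else c
            right = centers[i + 1] if i < len(centers) - 1 else c
            if x == c:
                mu.append(1.0)
            elif left < x < c:
                mu.append((x - left) / (c - left))
            elif c < x < right:
                mu.append((right - x) / (right - c))
            else:
                mu.append(0.0)
        return mu

    # triangular_memberships(0.3, [0.0, 0.5, 1.0]) -> approx. [0.4, 0.6, 0.0]
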
Fuzzy Logic-Based Modeling of the Biological Regulator of Blood Glucose

This paper proposes the use of fuzzy logic to design a system that models the biological regulator of blood glucose. The system consists of several fuzzy relations, each of them modeling a component of the biological glycemia regulator: pancreatic insulin production, net hepatic glucose balance, insulin-dependent and insulin-independent glucose uptake, and kidney function. A set of experiments has been carried out by simulating the proposed model, confirming that fuzzy control provides good results for the studied cases. The system could be a basis for developing artificial glycemia control mechanisms to be applied as a therapy for different pathologies, as well as for the development of simulators and monitors to aid diagnosis.

José-Luis Sánchez Romero, Francisco-Javier Ferrández Pastor, Antonio Soriano Payá, Juan-Manuel García Chamizo

Bibliography Project of International Rough Set Society

The Rough Set Database System: An Overview

The paper describes the ”Rough Sets Database System” (the RSDS system for short) for the creation of a bibliography on rough sets and their applications. This database is the most comprehensive online rough sets bibliography and is accessible at the following web-site address: http://rsds.wsiz.rzeszow.pl. The service has been developed in order to facilitate the creation of the rough sets bibliography for various types of publications. At the moment the bibliography contains over 1400 entries from more than 450 authors. It is possible to generate the bibliography in HTML or BibTeX format. In order to broaden the service contents, new data can be appended using a specially dedicated form; after appending data online, the database is updated automatically. If one prefers sending a data file to the database administrator, please be aware that the database is updated once a month. In the current version of the RSDS system, an abstract and keywords can be attached to each publication; as a natural consequence of this improvement, publications can be searched by keywords.

Zbigniew Suraj, Piotr Grochowalski
Backmatter
Metadata
Title
Rough Sets and Current Trends in Computing
Edited by
Shusaku Tsumoto
Roman Słowiński
Jan Komorowski
Jerzy W. Grzymała-Busse
Copyright Year
2004
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-25929-9
Print ISBN
978-3-540-22117-3
DOI
https://doi.org/10.1007/b97961