
2006 | Book

Rough Sets and Knowledge Technology

First International Conference, RSKT 2006, Chongqing, China, July 24-26, 2006. Proceedings

Edited by: Guo-Ying Wang, James F. Peters, Andrzej Skowron, Yiyu Yao

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This volume contains the papers selected for presentation at the First International Conference on Rough Sets and Knowledge Technology (RSKT 2006), organized in Chongqing, P. R. China, July 24-26, 2006. There were 503 submissions to RSKT 2006, not counting the 1 commemorative paper, 4 keynote papers and 10 plenary papers. Apart from these 15 commemorative and invited papers, 101 papers were accepted by RSKT 2006 and are included in this volume, for an acceptance rate of only 20%. On the basis of reviewer evaluation, these papers were divided into 43 regular oral presentation papers (each allotted 8 pages) and 58 short oral presentation papers (each allotted 6 pages). Each paper was reviewed by two to four referees. Since the introduction of rough sets in 1981 by Zdzisław Pawlak, many great advances in both the theory and its applications have been made. Rough set theory is closely related to knowledge technology in a variety of forms, such as knowledge discovery, approximate reasoning, intelligent and multiagent systems design, and knowledge-intensive computations that signal the emergence of a knowledge technology age. The essence of growth in cutting-edge, state-of-the-art and promising knowledge technologies is closely related to learning, pattern recognition, machine intelligence and the automation of acquisition, transformation, communication, exploration and exploitation of knowledge. A principal thrust of such technologies is the utilization of methodologies that facilitate knowledge processing.

Table of Contents

Frontmatter

Commemorative Paper

Some Contributions by Zdzisław Pawlak

This article celebrates the creative genius of Zdzisław Pawlak. He was with us only for a short time and, yet, when we look back at his accomplishments, we realize how greatly he has influenced us with his generous spirit and creative work in many areas such as approximate reasoning, intelligent systems research, computing models, mathematics (especially, rough set theory), molecular computing, pattern recognition, philosophy, art, and poetry. Pawlak’s contributions have far-reaching implications inasmuch as his works are fundamental in establishing new perspectives for scientific research in a wide spectrum of fields. His most widely recognized contribution is his brilliant approach to classifying objects with their attributes (features) and his introduction of approximation spaces, which establish the foundations of granular computing and provide an incisive approach to pattern recognition. This article attempts to give a vignette that highlights some of Pawlak’s remarkable accomplishments. This vignette is limited to a brief coverage of Pawlak’s work in rough set theory, molecular computing, philosophy, painting and poetry. Detailed coverage of these as well as other accomplishments by Pawlak is outside the scope of this commemorative article.

James F. Peters, Andrzej Skowron

Keynote Papers

Conflicts and Negotiations

Conflict analysis and resolution play an important role in business, governmental, political and legal disputes, labor-management negotiations, military operations and other areas. In this paper we show how conflict situations and their development can be represented and studied by means of conflict graphs. The introduced concepts are illustrated with the Middle East conflict.

Zdzisław Pawlak
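As background for the conflict-graph representation mentioned in the abstract above, here is a minimal sketch of pairwise conflict degrees in the spirit of Pawlak's voting model, where each agent votes -1 (against), 0 (neutral), or +1 (in favor) on each issue; the agents, issues, and votes below are hypothetical examples, not data from the paper.

```python
# A sketch of pairwise conflict degrees in the spirit of Pawlak's conflict
# model; agents, issues, and votes are hypothetical.
from itertools import combinations

votes = {            # agent -> vote per issue (-1 against, 0 neutral, +1 for)
    "A": [+1, -1, 0],
    "B": [-1, -1, +1],
    "C": [+1, 0, -1],
}

def conflict(vx: int, vy: int) -> float:
    """0 = allied (same non-neutral stance), 0.5 = at least one neutral, 1 = opposed."""
    p = vx * vy
    return 0.0 if p > 0 else (1.0 if p < 0 else 0.5)

def conflict_degree(x: str, y: str) -> float:
    """Average conflict over all issues; pairs above 0.5 form edges of the conflict graph."""
    pairs = list(zip(votes[x], votes[y]))
    return sum(conflict(a, b) for a, b in pairs) / len(pairs)

for x, y in combinations(votes, 2):
    print(x, y, conflict_degree(x, y))
```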
Hierarchical Machine Learning – A Learning Methodology Inspired by Human Intelligence

One of the basic characteristics of human problem solving, including learning, is the ability to conceptualize the world at different granularities and to translate easily from one abstraction level to the others, i.e., to deal with them hierarchically [1]. But computers generally solve problems at only one abstraction level. This is one of the reasons that human beings are superior to computers in problem solving and learning. In order to endow computers with this human ability, several mathematical models have been presented, such as fuzzy set and rough set theories [2, 3]. Based on these models, problem solving and machine learning can be handled in worlds of different grain sizes. We proposed a quotient space based model [4, 5] that can also deal with problems hierarchically. In the model, the world is represented by a semi-lattice composed of a set of quotient spaces; each of them represents the world at a certain grain size and is denoted by a triplet (X, F, f), where X is a domain, F is the structure of X, and f is the attribute of X.

In this talk, we will discuss hierarchical machine learning based on the proposed model. From the quotient space point of view, supervised learning (classification) can be regarded as finding a mapping from a low-level feature space to a high-level conceptual space, i.e., from a fine space to its quotient space (a coarse space) in the model. Since there is a big semantic gap between low-level feature spaces and conceptual spaces, finding the mapping is quite difficult and inefficient; for example, it generally needs a large number of training samples and a huge amount of computation. In order to reduce the computational complexity of machine learning, the characteristics of human learning are adopted. In human learning, people always use a multi-level learning strategy, including multi-level classifiers and multi-level features, instead of a one-level strategy, i.e., they learn in spaces of different grain sizes. We call this kind of machine learning hierarchical learning. Hierarchical learning is thus a powerful strategy for improving machine learning.
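To make the triplet notation concrete, here is a minimal sketch of a quotient space and the projection of an attribute to a coarser grain size; the domain, the equivalence used to form the quotient, and the attribute values are hypothetical, and averaging is just one possible rule for inducing f on the quotient.

```python
# A minimal sketch of a quotient space (X, F, f) and a coarser quotient space;
# the domain, partitioning key, and attribute are hypothetical examples.
from collections import defaultdict

X = list(range(12))                    # domain: 12 low-level items
f = {x: float(x % 4) for x in X}       # attribute function on X (hypothetical)

def quotient(domain, key):
    """Partition the domain by the equivalence induced by `key`,
    yielding the coarser domain [X]."""
    classes = defaultdict(list)
    for x in domain:
        classes[key(x)].append(x)
    return list(classes.values())

coarse = quotient(X, key=lambda x: x // 3)         # blocks of the partition
f_coarse = [sum(f[x] for x in block) / len(block)  # induced attribute on [X]
            for block in coarse]
print(coarse[0], f_coarse[0])
```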

Taking image retrieval as an example, we will show how to apply the hierarchical learning strategy to this field. Given a query (an image) by a user, the aim of image retrieval is to find a set of similar images from a collection of images. This is a typical classification problem and can be regarded as supervised learning. The first problem is how to represent an image so that similar images can be found from the collection precisely and completely. So far in image retrieval, an image has been represented in several forms of different grain size. The finest representation of an image is an n×n matrix, each of whose elements represents a pixel. Using this representation, retrieval precision will be high but robustness (recall) will be low: since it preserves the precise details of an image, it is sensitive to noise. Therefore, the pixel-based representation has rarely been used in image retrieval. The commonly used representation in image retrieval is the coarsest one, the so-called global visual features [6]. Here, an image is represented by a visual feature (a vector) such as color moments, color correlograms, wavelet transforms, Gabor transforms, etc. In the coarsest representations, most of the details of an image are lost, so retrieval precision decreases but robustness (recall) increases. The coarsest representations are suitable for seeking a class of similar images due to their robustness; therefore, global visual features have been widely used for image retrieval. In order to overcome the low precision introduced by the coarsest representations (global features), middle-grain-size representations of an image, such as region-based representations [7], have been presented recently. In such a representation, an image is partitioned into several consistent regions and each region is represented by a visual feature (a vector) extracted from it; the whole image is represented by a set of feature vectors. Since the region-based representation retains more details of an image than the global one, retrieval precision increases but robustness decreases. Therefore, the quality of image retrieval, including precision and recall, can be improved by using multi-level features. One of the strategies for hierarchical learning is to integrate features of different grain sizes, including the global, region-based, and pixel-based features.

One of the main goals of hierarchical learning is to reduce computational complexity. Based on the proposed model, we know that learning cost can be reduced by using a set of multi-level classifiers; such a set of classifiers composes a hierarchical learning framework. A set of experimental results in hand-written Chinese character recognition and image retrieval is given to verify the advantages of the approach.

Hierarchical learning, inspired by human learning, is one of the methodologies for improving the performance of machine learning.

Ling Zhang, Bo Zhang
Rough-Fuzzy Granulation, Rough Entropy and Image Segmentation

This talk has two parts. The first part describes how the concept of rough-fuzzy granulation can be used for the problem of case generation, with a varying reduced number of features, in a case-based reasoning system, and its application to multi-spectral image segmentation. Here the synergistic integration of the EM algorithm, minimal spanning trees and granular computing for efficient segmentation is described. The second part deals with a new definition of image entropy in a rough set theoretic framework, and its application to the problem of object extraction from images by minimizing both object and background roughness. Granules carry local information and reflect the inherent spatial relations of the image by treating the pixels of a window as indiscernible or homogeneous. Maximization of homogeneity in both object and background regions during their partitioning is achieved through maximization of rough entropy, thereby providing optimum results for object-background classification. The effect of granule size is also discussed.

Sankar K. Pal
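A minimal sketch of granule-based rough-entropy thresholding along the lines described above, assuming the common form RE(T) = (e/2)(−R_O ln R_O − R_B ln R_B) with roughness R = 1 − |lower|/|upper| counted over window granules; the image, granule size, and brute-force threshold search below are hypothetical illustration, not the paper's exact procedure.

```python
# A sketch of rough-entropy thresholding over pixel granules, assuming
# RE = (e/2)(-R_O ln R_O - R_B ln R_B) and roughness R = 1 - |lower|/|upper|.
import numpy as np

def rough_entropy(img: np.ndarray, T: int, g: int = 4) -> float:
    h, w = img.shape
    lo_o = up_o = lo_b = up_b = 0
    for i in range(0, h - h % g, g):
        for j in range(0, w - w % g, g):
            win = img[i:i + g, j:j + g]           # one granule (g x g window)
            if win.min() > T:   lo_o += 1         # granule wholly in object
            if win.max() > T:   up_o += 1         # granule touching object
            if win.max() <= T:  lo_b += 1         # granule wholly in background
            if win.min() <= T:  up_b += 1         # granule touching background
    r_o = 1 - lo_o / up_o if up_o else 0.0
    r_b = 1 - lo_b / up_b if up_b else 0.0
    ent = lambda r: -r * np.log(r) if r > 0 else 0.0
    return (np.e / 2) * (ent(r_o) + ent(r_b))

img = np.random.randint(0, 256, (64, 64))          # hypothetical gray image
best_T = max(range(1, 255), key=lambda T: rough_entropy(img, T))
print(best_T)
```

Maximizing RE over T drives both roughnesses down together, which is how maximizing homogeneity of object and background is realized.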
Towards Network Autonomy

The next generation Web technologies (in a broader sense than World Wide Web), as one of the ultimate goals in Web Intelligence (WI) research, will enable humans to go beyond the existing functionalities of online information search and knowledge queries and to gain from the Web practical wisdoms of living, working, and playing. This is a fundamental paradigm shift towards the so-called Wisdom Web, and presents new challenges as well as opportunities to computer scientists and practitioners.

Jiming Liu

Plenary Papers

A Roadmap from Rough Set Theory to Granular Computing

Granular Computing (GrC) operates with granules (generalized subsets) of data as pieces of basic knowledge. Rough Set Theory (RST) is a leading special case of the GrC approach. In this paper, we outline a roadmap that stepwise refines RST into GrC. A prime illustration is that GrC of symmetric binary relations is a complete topological RST on granular spaces, where the adjective complete means that the representation theory can fully reflect the structure theory.

Tsau Young Lin
Partition Dependencies in Hierarchies of Probabilistic Decision Tables

The article investigates probabilistic dependencies in hierarchies of probabilistic decision tables learned from data. They are expressed by the probabilistic generalization of Pawlak’s measure of the dependency between attributes and the certainty gain measure.

Wojciech Ziarko
Knowledge Theory and Artificial Intelligence

It is proved in the paper that knowledge plays a crucial role in the formation of intelligence. This is because intelligence must normally be activated from knowledge, and different categories of knowledge will thus lead to different categories of intelligence. On the other hand, knowledge itself should mainly come from information. Therefore, knowledge serves as a channel linking information and intelligence: without knowledge, information can hardly be transformed into intelligence. Even more interestingly, a unified theory of artificial intelligence can well be achieved if a comprehensive understanding of knowledge theory is reached.

Yixin Zhong
Applications of Knowledge Technologies to Sound and Vision Engineering

Sound and vision engineering, as an interdisciplinary branch of science, should quickly assimilate new methods and new technologies. Meanwhile, there exist some advanced and well-developed methods for analyzing and processing data or signals that are only occasionally applied to this domain of science. These methods emerged from the artificial intelligence approach to image and signal processing problems. In the paper, intelligent algorithms such as neural networks, fuzzy logic, genetic algorithms and the rough set method are presented with regard to their applications to sound and vision engineering. The paper includes a practical demonstration of results achieved by applying intelligent algorithms to: bi-modal speech recognition employing an NN-PCA algorithm, perceptually oriented noisy data processing methods, advanced sound acquisition, GA-based digital signal processing for telecommunication applications, and others.

Andrzej Czyzewski
A Rough Set Approach to Data with Missing Attribute Values

In this paper we discuss four kinds of missing attribute values: lost values (the values that were recorded but currently are unavailable), "do not care" conditions (the original values were irrelevant), restricted "do not care" conditions (similar to ordinary "do not care" conditions but interpreted differently; these missing attribute values may occur when the same data set contains both lost values and "do not care" conditions), and attribute-concept values (these missing attribute values may be replaced by any attribute value limited to the same concept). Throughout the entire paper the same calculus, based on computations of blocks of attribute-value pairs, is used. Incomplete data are characterized by characteristic relations, which in general are neither symmetric nor transitive. Lower and upper approximations are generalized for data with missing attribute values. Finally, some experiments on different interpretations of missing attribute values and different approximation definitions are cited.

Jerzy W. Grzymala-Busse
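A minimal sketch of the block-based calculus for two of the four interpretations, lost values ("?") and "do not care" conditions ("*"): a "*" case falls into every block of its attribute, a "?" case into none, and the characteristic set of a case is the intersection of the blocks of its specified values. The tiny table below is hypothetical.

```python
# Blocks of attribute-value pairs and characteristic sets for data with
# lost values "?" and "do not care" values "*"; the table is hypothetical.
U = ["c1", "c2", "c3", "c4"]
data = {                     # attribute -> {case: value}
    "Temp":  {"c1": "high", "c2": "?",   "c3": "high", "c4": "low"},
    "Cough": {"c1": "yes",  "c2": "yes", "c3": "*",    "c4": "no"},
}

def block(attr, value):
    """Cases matching (attr, value): '*' matches every value, '?' matches none."""
    col = data[attr]
    return {x for x in U if col[x] == value or col[x] == "*"}

def characteristic_set(x):
    """K(x): intersection of blocks for x's specified values;
    '?' and '*' impose no constraint on x itself."""
    K = set(U)
    for attr, col in data.items():
        if col[x] not in ("?", "*"):
            K &= block(attr, col[x])
    return K

print(characteristic_set("c1"))   # {'c1', 'c3'}
```

Collecting K(x) for all cases yields the characteristic relation, which, as the abstract notes, is generally neither symmetric nor transitive.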
Cognitive Neuroscience and Web Intelligence

Cognitive neuroscience is an interdisciplinary research field that studies the human information processing mechanism from both macro and micro views. Web intelligence is a new direction for scientific research and development of emerging web-based artificial intelligence technology.

As two related important research fields, cognitive neuroscience and web intelligence strongly support each other. Discoveries in cognitive neuroscience can suggest new human intelligence models and support web intelligence developments. Furthermore, web intelligence technology is useful for discovering more advanced models of human cognition.

In order to develop web intelligence systems that match human abilities, it is necessary to investigate the human cognitive mechanism systematically. The key issues are how to design psychological, functional Magnetic Resonance Imaging (fMRI) and electroencephalography (EEG) experiments for obtaining various data on the human cognitive mechanism, as well as how to analyze such data from multiple aspects to discover new models of human cognition.

In our studies, we propose a new methodology with a multi-step process, in which various psychological experiments, physiological measurements and data mining techniques are used cooperatively to investigate the human cognitive mechanism. This talk mainly introduces some of the cognitive neuroscience research and the related intelligent mechanical systems in my laboratory. The research covers vision, hearing, memory, language, attention, etc. More specifically, I will discuss the relationship between cognitive neuroscience and web intelligence using some examples.

Jinglong Wu
Cognitive Informatics and Contemporary Mathematics for Knowledge Manipulation

Although there are various ways to express entities, notions, relations, actions, and behaviors in natural languages, it is found in Cognitive Informatics (CI) that human and system behaviors may be classified into three basic categories known as to be, to have, and to do. All mathematical means and forms, in general, are an abstract and formal description of these three categories of system behaviors and their common rules. Taking this view, mathematical logic may be perceived as the abstract means for describing ‘to be,’ set theory for describing ‘to have,’ and algebras, particularly the process algebra, for describing ‘to do.’

This paper presents the latest development in a new transdisciplinary field known as CI. Three types of new mathematical structures, Concept Algebra (CA), System Algebra (SA), and Real-Time Process Algebra (RTPA), are created to enable rigorous treatment of knowledge representation and manipulation in terms of to be / to have / to do in a formal and coherent framework. A wide range of applications of the three knowledge algebras in the framework of CI has been identified in knowledge and software engineering.

Yingxu Wang
Rough Mereological Reasoning in Rough Set Theory: Recent Results and Problems

This article appears a couple of months after the death of Professor Zdzisław Pawlak, who created the theory of rough sets in 1982 as a vehicle to carry out Concept Approximation and, a fortiori, Decision Making, Data Mining, Knowledge Discovery and other activities.

At the roots of rough set theory was a deep knowledge of ideas going back to Frege, Russell, Łukasiewicz, Popper, and others.

Rough sets owe to this attitude their intrinsic clarity of ideas, elegant simplicity (not to be confused with easy triviality), and a fortiori a wide spectrum of applications.

Over the years, rough set theory has been enriched with new ideas.

One of these additions has been rough mereology, an attempt at introducing a regular form of tolerance relations on objects in an information system, in order to provide a more flexible scheme for relating objects than indiscernibility. The theory of mereology, proposed long ago (1916) by S. Leśniewski, proved a valuable source of inspiration. As a result, a more general theory has emerged, still far from completion.

Rough mereology, operating with so-called rough inclusions, allows for the definition of a class of logics that in turn have applications to distributed systems, perception analysis, granular computing, etc. In this article, we give a survey of the present state of the art in the area of the rough mereological theory of reasoning, as we know it, along with comments on some problems.

Lech Polkowski
Theoretical Study of Granular Computing

We propose a higher order logic called granular logic. This logic is introduced as a tool for investigating properties of granular computing. In particular, constants of this logic are of the form m(F), where F is a formula (e.g., a Boolean combination of descriptors) in a given information system. Truth values of granular formulas are discussed. The truth value of a given formula in a given model is defined by the degree to which the meaning of the formula in the model is close to the universe of objects. Our approach generalizes the rough truth concept introduced by Zdzisław Pawlak in 1987. We present an axiomatization of granular logic. Resolution reasoning in the axiomatic system is illustrated by examples, and the soundness of resolution is also proved.

Qing Liu, Hui Sun
Knowledge Discovery by Relation Approximation: A Rough Set Approach

In recent years, rough set theory [1] has attracted the attention of many researchers and practitioners all over the world, who have contributed essentially to its development and applications. With many practical and interesting applications, the rough set approach seems to be of fundamental importance to AI and the cognitive sciences, especially in the areas of machine learning, knowledge acquisition, decision analysis, knowledge discovery from databases, expert systems, inductive reasoning and pattern recognition [2].

Hung Son Nguyen

Rough Computing

Reduction-Based Approaches Towards Constructing Galois (Concept) Lattices

Galois (concept) lattices and formal concept analysis have proved useful in the resolution of many problems of theoretical and practical interest. Recent studies have put the emphasis on the need for both efficient and flexible algorithms to construct the lattice. In this paper, the concept of attribute reduction of a formal concept is proposed and its properties are discussed. The CL-Axiom and some equivalent conditions for an attribute subset to be a reduction of a formal concept are presented.

Jingyu Jin, Keyun Qin, Zheng Pei
A New Discernibility Matrix and Function

In this paper, we define a new discernibility matrix and function between two decision tables. They are extensions of Hu’s improved discernibility matrix and function, such that the reducts and cores of decision tables can be calculated from parts of them. The method of the new discernibility matrix and function may be applied to cases of large amounts of data and incremental data.

Dayong Deng, Houkuan Huang
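For orientation, here is a minimal sketch of the classical single-table discernibility matrix that such constructions extend: each entry for a pair of objects with different decisions lists the condition attributes that discern them, and reducts correspond to prime implicants of the conjunction of the entries' disjunctions. The toy table is hypothetical.

```python
# A background sketch of the classical discernibility matrix over one
# decision table; the paper's construction extends this to pairs of tables.
from itertools import combinations

objects = {                      # object -> (condition values, decision)
    "x1": (("a0", "b1"), "yes"),
    "x2": (("a1", "b1"), "no"),
    "x3": (("a0", "b0"), "no"),
}
attrs = ["a", "b"]

def discernibility_matrix(objs):
    """For each object pair with different decisions, record the set of
    condition attributes on which the two objects differ."""
    M = {}
    for (u, (cu, du)), (v, (cv, dv)) in combinations(objs.items(), 2):
        if du != dv:
            M[(u, v)] = {a for a, x, y in zip(attrs, cu, cv) if x != y}
    return M

print(discernibility_matrix(objects))
# {('x1', 'x2'): {'a'}, ('x1', 'x3'): {'b'}}  ->  reduct {a, b}, core {a, b}
```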
The Relationships Between Variable Precision Value and Knowledge Reduction Based on Variable Precision Rough Sets Model

The variable precision rough sets (VPRS) model is parametric, and there are many types of knowledge reduction. In the existing algorithms, β is introduced as prior knowledge, and in some applications it is not clear how to set this parameter. For that reason, it is necessary to seek an approach for estimating β from the decision table itself, avoiding the influence of an a priori β upon the result. By studying the relative discernibility measure of a decision table, the paper puts forward an algorithm for the threshold interval of the decision table’s relative discernibility: choosing β within this threshold interval as a substitute for prior knowledge yields knowledge reduction sets under a certain level of classification error, thus finally realizing self-determined knowledge reduction from the decision table based on VPRS.

Yusheng Cheng, Yousheng Zhang, Xuegang Hu
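A minimal sketch of how a given β acts in VPRS, assuming Ziarko's convention with admissible classification error β ∈ [0, 0.5): an equivalence class enters the β-lower approximation when its overlap with the concept is at least 1 − β, and the β-upper approximation when the overlap exceeds β. The partition and concept below are hypothetical.

```python
# VPRS beta-approximations, assuming Ziarko's convention with admissible
# classification error beta in [0, 0.5); partition and concept hypothetical.
def vprs_approximations(partition, X, beta):
    """beta-lower: classes included in X up to error beta;
    beta-upper: classes overlapping X by more than beta."""
    lower, upper = set(), set()
    for E in partition:
        overlap = len(E & X) / len(E)
        if overlap >= 1 - beta:
            lower |= E
        if overlap > beta:
            upper |= E
    return lower, upper

partition = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9}]   # equivalence classes
X = {1, 2, 3, 5, 7}                              # the concept to approximate
print(vprs_approximations(partition, X, beta=0.3))
```

With β = 0, the definitions collapse to Pawlak's classical lower and upper approximations, which is why the choice of β governs the resulting reducts.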
On Axiomatic Characterization of Approximation Operators Based on Atomic Boolean Algebras

In this paper, we focus on the extension of rough set theory in a lattice-theoretic setting. First we introduce the definition of generalized lower and upper approximation operators determined by mappings between two complete atomic Boolean algebras. Then we find the conditions which permit a given lattice-theoretic operator to represent an upper (or lower) approximation derived from a special mapping. Different sets of axioms on lattice-theoretic operators guarantee the existence of different types of mappings which produce the same operator.

Tongjun Li
Rough Set Attribute Reduction in Decision Systems

An important issue of knowledge discovery and data mining is the reduction of pattern dimensionality. In this paper, we investigate the attribute reduction in decision systems based on a congruence on the power set of attributes and present a method of determining congruence classifications. We can obtain the reducts of attributes in decision systems by using the classification. Moreover, we prove that the reducts obtained by the congruence classification coincide with the distribution reducts in decision systems.

Hongru Li, Wenxiu Zhang, Ping Xu, Hong Wang
A New Extension Model of Rough Sets Under Incomplete Information

The classical rough set theory based on complete information systems stems from the observation that objects with the same characteristics are indiscernible according to available information. With upper and lower approximations defined on an indiscernibility relation, it classifies objects into different equivalence classes. But in some cases such a rigid indiscernibility relation is far from applicable in the real world. Therefore, several generalizations of the rough set theory have been proposed, some of which extend the indiscernibility relation using more general similarity or tolerance relations. For example, Kryszkiewicz [4] studied a tolerance relation, and Stefanowski [7] explored a non-symmetric similarity relation and a valued tolerance relation. Unfortunately, all the extensions mentioned above have their inherent limitations. In this paper, after discussing several extension models based on rough sets for incomplete information, the concept of a constrained dissymmetrical similarity relation is introduced as a new extension of the rough set theory, and the upper and lower approximations defined on the constrained similarity relation are proposed as well. Furthermore, we present a comparison of the performance of these extended relations. Analysis of the results shows that this relation works effectively on incomplete information and generates a rational object classification.

Xuri Yin, Xiuyi Jia, Lin Shang
Applying Rough Sets to Data Tables Containing Possibilistic Information

Rough sets are applied to data tables containing possibilistic information. A family of weighted equivalence classes is obtained, in which each equivalence class is accompanied by a possibilistic degree to which it is an actual one. By using the family of weighted equivalence classes we can derive a lower approximation and an upper approximation. The lower approximation and the upper approximation coincide with those obtained from methods of possible worlds. Therefore, the method of weighted equivalence classes is justified.

Michinori Nakata, Hiroshi Sakai
Redundant Data Processing Based on Rough-Fuzzy Approach

In this paper, we use a fuzzy approach to deal with incomplete, imprecise, or even ill-defined databases. We use the concepts of rough sets to define equivalence classes encoding the input data and to eliminate redundant or insignificant attributes in data sets, and we incorporate the significance factor of the input features corresponding to the output pattern classification to constitute a class membership function, which enhances the mapping characteristic of each object in the input space belonging to a consequent class in the output space.

Huanglin Zeng, Hengyou Lan, Xiaohui Zeng
Further Study of the Fuzzy Reasoning Based on Propositional Modal Logic

The notion of the fuzzy assertion based on propositional modal logic is introduced and the properties of fuzzy reasoning based on fuzzy assertions are studied. As an extension of the traditional semantics of modal logics, the fuzzy Kripke semantics is considered and a formal fuzzy reasoning system based on fuzzy constraints is established. In order to decide whether a fuzzy assertion is a logical consequence of a set of fuzzy assertions, the notion of the educed set based on fuzzy constraints is introduced and the relation between fuzzy reasoning and the satisfiability of the educed set is revealed.

Zaiyue Zhang, Yuefei Sui, Cungen Cao
The M-Relative Reduct Problem

Since there may exist many relative reducts for a decision table, some attributes that are very important from the viewpoint of human experts may fail to be included in the relative reduct(s) computed by certain reduction algorithms. In this paper we present the concepts of the M-relative reduct and core, where M is a user-specified attribute set, to deal with this problem. M-relative reducts and cores can be obtained using M-discernibility matrices and functions. Their relationships with the traditional definitions of relative reduct and core are closely investigated.

Fan Min, Qihe Liu, Hao Tan, Leiting Chen
Rough Contexts and Rough-Valued Contexts

Formal Concept Analysis (FCA) is a method mainly used for the analysis of data, which identifies conceptual structures among data sets. Central to FCA is the notion of a formal context. In this paper, we introduce some extended formal contexts to FCA by means of methods from rough set theory. The definitions of formal concepts in these extended contexts and the basic properties of these extended contexts are also given.

Feng Jiang, Yuefei Sui, Cungen Cao
Combination Entropy and Combination Granulation in Incomplete Information System

Based on the intuitionistic knowledge content characteristic of information gain, the concepts of combination entropy CE(A) and combination granulation CG(A) in incomplete information systems are introduced, and some of their properties are given. Furthermore, the relationship between combination entropy and combination granulation is established. The corresponding concepts and properties in complete information systems are special instances of those introduced here. These results will be very helpful for understanding the essence of knowledge content and uncertainty measurement in incomplete information systems.

Yuhua Qian, Jiye Liang
An Extension of Pawlak’s Flow Graphs

In knowledge discovery, Pawlak’s flow graph is a new mathematical model and has some distinct advantages. However, the flow graph can not effectively deal with some situations, such as estimating consistence and removing redundant attributes. A primary reason is that it is a quantitative graph and requires the network to be steady. Therefore, we propose an extension of the flow graph which takes objects flowing in network as its basis to study the relations among the information in this paper. It not only has the capabilities of the flow graph, but also can implement some functions as well as decision table.

Jigui Sun, Huawen Liu, Huijie Zhang
Rough Sets and Brouwer-Zadeh Lattices

Many researchers study rough sets from the point of view of describing rough set pairs (a rough set pair is also called a rough set), i.e. <lower approximation set, upper approximation set>. In this paper, it is shown that the collection of all rough sets in an approximation space can be made into a distributive Brouwer-Zadeh lattice. The Brouwer-Zadeh lattice induced from an approximation space is called the rough Brouwer-Zadeh lattice. The rough top equation and rough bottom equation problems are studied in the framework of rough Brouwer-Zadeh lattices.

Jianhua Dai, Weidong Chen, Yunhe Pan
Covering-Based Generalized Rough Fuzzy Sets

This paper presents a general framework for the study of covering-based rough fuzzy sets in which a fuzzy set can be approximated by some elements in a covering of the universe of discourse. Some basic properties of the covering-based lower and upper approximation operators are examined. The concept of the reduction of a covering is also introduced. By employing the discrimination matrix of the covering, we provide an approach to finding the reduct of a covering of the universe. It is proved that the reduct of a covering is the minimal covering that generates the same covering-based fuzzy lower (or upper) approximation operator, so this concept also provides a technique for removing redundancy in data mining. Furthermore, it is shown that the covering-based fuzzy lower and upper approximations determine each other.

Tao Feng, Jusheng Mi, Weizhi Wu
Axiomatic Systems of Generalized Rough Sets

Rough set theory was proposed by Pawlak to deal with the vagueness and granularity in information systems that are characterized by insufficient, inconsistent, and incomplete data. Its successful applications have drawn attention from researchers in areas such as artificial intelligence, computational intelligence, data mining and machine learning. The classical rough set model is based on an equivalence relation on a set, but it has been extended to generalized models based on binary relations and coverings. This paper reviews and summarizes the axiomatic systems for classical rough sets, generalized rough sets based on binary relations, and generalized rough sets based on coverings.

William Zhu, Feiyue Wang
Rough-Sets-Based Combustion Status Diagnosis

In this paper, we propose a new method to diagnose the combustion status in a boiler. It is based on rough set theory and uses image characteristics of the combustion in the boiler. We introduce lightness threshold segmentation of the green channel with an improved polar coordinate method to reduce the effects of background radiation and to assure the integrity of the flame core. In the diagnosis, the weight coefficients of the condition attributes with respect to the decision attributes in the decision table are determined by the approximation set concept of rough set theory. Finally, an experiment was conducted with a group of on-site flame images obtained under different combustion statuses, and the experimental results were compared with the actual status. It shows that the method is feasible.

Gang Xie, Xuebin Liu, Lifei Wang, Keming Xie
Research on System Uncertainty Measures Based on Rough Set Theory

Due to various inherent uncertain factors, system uncertainty is an important intrinsic feature of decision information systems. It is important for data mining tasks to reasonably measure system uncertainty. Rough set theory is one of the most successful tools for measuring and handling uncertain information. Various methods based on rough set theory for measuring system uncertainty have been investigated. Their algebraic characteristics and quantitative relations are analyzed and disclosed in this paper. The results are helpful for selecting proper uncertainty measures or even developing new uncertainty measures for specific applications.

Jun Zhao, Guoyin Wang
Conflict Analysis and Information Systems: A Rough Set Approach

Conflict analysis and conflict resolution play an important role in negotiation during contract-management situations in government and industry. The problem to be solved is how to model conflict situations where there is uncertainty about agreement, neutrality and disagreement among agents in a conflict situation. The solution to this problem includes modeling a conflict situation relative to basic binary relations on a universe of agents, introducing a measure of the degree of conflict, and encapsulating a conflict situation in an information system. The basic approach to modeling conflict situations is illustrated in the context of contract negotiation during the initial phases of requirement negotiation for a systems engineering project. An example of a high-level requirements negotiation for an automated lighting system is presented. The contribution of this paper is a rough set based requirements determination model using a conflict relation for representing requirements agreements (or disagreements).

Andrzej Skowron, Sheela Ramanna, James F. Peters
A Novel Discretizer for Knowledge Discovery Approaches Based on Rough Sets

Knowledge discovery approaches based on rough sets have successful application in machine learning and data mining. As these approaches are good at dealing with discrete values, a discretizer is required when the approaches are applied to continuous attributes. In this paper, a novel adaptive discretizer based on a statistical distribution index is proposed to preprocess continuous valued attributes in an instance information system, so that the knowledge discovery approaches based on rough sets can reach a high decision accuracy. The experimental results on benchmark data sets show that the proposed discretizer is able to improve the decision accuracy.

Qingxiang Wu, Jianyong Cai, Girijesh Prasad, T. M. McGinnity, David Bell, Jiwen Guan
Function S-Rough Sets and Recognition of Financial Risk Laws

Recognition of financial risk (investment risk and profit risk) has attracted more and more attention from investors, because every investor is threatened by financial risk. The function S-rough set (function singular rough set) has a law characteristic, and the law has a heredity characteristic. Using function S-rough sets, this paper advances the recognition of financial risk laws and gives a recognition model and an application example. A function S-rough set is defined by an R-function equivalence class [u], where each u_i ∈ [u] is a function (or a law). The function S-rough set is the general form of the S-rough set (singular rough set), and the S-rough set is a special case of the function S-rough set. The results of this paper have many important applications.

Kaiquan Shi, Bingxue Yao
Knowledge Reduction in Incomplete Information Systems Based on Dempster-Shafer Theory of Evidence

Knowledge reduction is one of the main problems in the study of rough set theory. This paper deals with knowledge reduction in incomplete information systems based on the Dempster-Shafer theory of evidence. The concepts of plausibility and belief consistent sets, as well as plausibility and belief reducts, in incomplete information systems are introduced. It is proved that a plausibility consistent set in an incomplete information system must be a consistent set, and that an attribute set in an incomplete information system is a belief reduct if and only if it is a classical reduct.

Weizhi Wu, Jusheng Mi
Decision Rules Extraction Strategy Based on Bit Coded Discernibility Matrix

The rationality of a reduction approach for decision rules with a discernibility matrix is analyzed and proved theoretically, and a rule extraction strategy based on a bit-coded discernibility matrix is presented. By bit-coding the description of the discernibility matrix, the information is depicted by a series of binary codes, which makes the algorithm easy to implement on a computer. A hybrid rule extraction algorithm is then presented, in which attribute reduction and rule reduction work synchronously. The results of applying it to rule extraction for cement kiln operation show its efficiency and availability.

Yuxia Qiu, Keming Xie, Gang Xie
Attribute Set Dependence in Apriori-Like Reduct Computation

In the paper we propose a novel approach to finding rough set reducts in information systems. Our method combines an apriori-like scheme of space traversing with an efficient pruning condition based on attribute set dependence. Moreover, we discuss theoretical and implementational aspects of our pruning procedure.

Pawel Terlecki, Krzysztof Walczak
Some Methodological Remarks About Categorical Equivalences in the Abstract Approach to Roughness – Part I

The categorical equivalence of three different approaches to roughness is discussed: the one based on the notion of abstract rough approximation spaces, the second one based on the abstract topological notions of interior and closure, and the third one based on a very weak form of BZ lattice.

Gianpiero Cattaneo, Davide Ciucci
Some Methodological Remarks About Categorical Equivalences in the Abstract Approach to Roughness – Part II

In this paper, it is remarked that BZ lattice structures can recover several theoretical approaches to rough sets, englobing their individual richness in a unique structure. Rough sets based on a similarity relation are also considered, showing that the BZ lattice approach turns out to be even more useful, since it enables one to define another rough approximation, which is better than the corresponding similarity one.

Gianpiero Cattaneo, Davide Ciucci
Lower Bounds on Minimal Weight of Partial Reducts and Partial Decision Rules

In this paper greedy algorithms with weights for the construction of partial tests (partial superreducts) and partial decision rules are considered. Lower bounds on the minimal weight of partial reducts and partial decision rules, based on information about the operation of the greedy algorithm, are obtained.

Mikhail Ju. Moshkov, Marcin Piliszczuk, Beata Zielosko
On Reduct Construction Algorithms

This paper critically analyzes reduct construction methods at two levels. At a high level, one can abstract commonalities from the existing algorithms, and classify them into three basic groups based on the underlying control structures. At a low level, by adopting different heuristics or fitness functions for attribute selection, one is able to derive most of the existing algorithms. The analysis brings new insights into the problem of reduct construction, and provides guidelines for the design of new algorithms.

Yiyu Yao, Yan Zhao, Jue Wang
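A minimal sketch of the addition-type control structure with a pluggable fitness function, here instantiated with the dependency degree (size of the positive region); the toy decision table is hypothetical, and a practical algorithm would add a deletion pass to strip superfluous attributes from the resulting superreduct.

```python
# Greedy addition-style reduct construction with a pluggable fitness
# (dependency degree); the tiny decision table is hypothetical.
def blocks(objs, attrs):
    """Partition objects by their value tuples on `attrs`."""
    part = {}
    for x, row in objs.items():
        part.setdefault(tuple(row[a] for a in attrs), set()).add(x)
    return list(part.values())

def dependency(objs, attrs, decision):
    """Fraction of objects whose attrs-class is decision-pure (positive region)."""
    if not attrs:
        return 0.0
    pos = sum(len(B) for B in blocks(objs, attrs)
              if len({decision[x] for x in B}) == 1)
    return pos / len(objs)

def greedy_reduct(objs, all_attrs, decision):
    red, target = [], dependency(objs, all_attrs, decision)
    while dependency(objs, red, decision) < target:
        best = max((a for a in all_attrs if a not in red),
                   key=lambda a: dependency(objs, red + [a], decision))
        red.append(best)          # addition step driven by the fitness
    return red

objs = {"x1": {"a": 0, "b": 1}, "x2": {"a": 1, "b": 1}, "x3": {"a": 0, "b": 0}}
dec  = {"x1": "yes", "x2": "no", "x3": "yes"}
print(greedy_reduct(objs, ["a", "b"], dec))   # ['a']
```

Swapping the `dependency` function for entropy or another heuristic changes the algorithm while the control structure stays fixed, which is exactly the two-level view the paper develops.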
Association Reducts: Boolean Representation

We investigate association reducts, which extend previously studied information and decision reducts in capability of expressing compound, multi-attribute dependencies in data. We provide Boolean, discernibility-based representation for most informative association reducts.

Dominik Ślęzak
Notes on Rough Sets and Formal Concepts

We introduce a general framework to compare and combine Formal Concept Analysis and Rough Set Systems, and some mathematical properties and limits of application of some approaches are discussed.

Piero Pagliani

Evolutionary Computing

High Dimension Complex Functions Optimization Using Adaptive Particle Swarm Optimizer

Due to the existence of large numbers of local and global optima in high dimension complex functions, general particle swarm optimization methods converge slowly and are easily trapped in local optima. In this paper, an adaptive particle swarm optimizer with better search performance is proposed. It employs novel dynamic inertia weight curves and mutation of the global optimum to plan large-scale global search and refined local search as a whole, according to the fitness change of the swarm during optimization, in order to quicken convergence, avoid premature convergence, economize computational expense, and obtain the global optimum. We test the proposed algorithm and compare it with other published methods on several high dimension complex functions; the experimental results demonstrate that this revised algorithm converges rapidly to high quality solutions.

Kaiyou Lei, Yuhui Qiu, Xuefei Wang, He Yi
Adaptive Velocity Threshold Particle Swarm Optimization

Particle swarm optimization (PSO) is a new robust swarm intelligence technique which has exhibited good performance on well-known numerical test problems. Though many published improvements aim to increase its computational efficiency, much work remains to be done. Inspired by evolutionary programming theory, this paper proposes a new adaptive particle swarm optimization in which the velocity threshold changes dynamically during the course of a simulation. Seven benchmark functions are used to test the new algorithm, and the results show clearly that the new adaptive PSO leads to significantly better performance, although the performance improvements were found to be problem-dependent.

Zhihua Cui, Jianchao Zeng, Guoji Sun
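For reference, a minimal sketch of the canonical PSO update that both papers above modify: velocities blend inertia, a cognitive pull toward each particle's best, and a social pull toward the swarm's best, then are clamped by a velocity threshold. The fixed coefficients, bounds, and test function are hypothetical stand-ins for the adaptive schemes discussed.

```python
# Canonical PSO with a fixed inertia weight w and velocity threshold vmax;
# the adaptive variants above vary w and vmax during the run.
import random

def pso(f, dim=2, n=20, iters=100, w=0.7, c1=1.5, c2=1.5, vmax=0.5):
    X = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    V = [[0.0] * dim for _ in range(n)]
    P = [x[:] for x in X]                  # personal bests
    g = min(P, key=f)[:]                   # global best
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (P[i][d] - X[i][d])    # cognitive pull
                           + c2 * r2 * (g[d] - X[i][d]))      # social pull
                V[i][d] = max(-vmax, min(vmax, V[i][d]))      # velocity threshold
                X[i][d] += V[i][d]
            if f(X[i]) < f(P[i]):
                P[i] = X[i][:]
                if f(P[i]) < f(g):
                    g = P[i][:]
    return g

sphere = lambda x: sum(v * v for v in x)   # hypothetical test function
print(pso(sphere))
```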

Fuzzy Sets

Relationship Between Inclusion Measure and Entropy of Fuzzy Sets

Inclusion measure and entropy of fuzzy sets are two basic concepts in fuzzy set theory. In this paper, we investigate the relationship between the inclusion measure and the entropy of fuzzy sets in detail, propose two theorems showing that the inclusion measure and the entropy of fuzzy sets can be transformed into each other on the basis of their axiomatic definitions, and give some formulas to calculate the inclusion measure and entropy of fuzzy sets.

Wenyi Zeng, Qilei Feng, HongXing Li
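To fix ideas, a minimal sketch of one common inclusion measure (a sigma-count ratio) and one common entropy (Kosko's fuzziness measure) for finite fuzzy sets; these particular formulas are illustrative choices, not necessarily the axiomatizations used in the paper.

```python
# One illustrative inclusion measure and one illustrative fuzzy entropy.
def inclusion(A, B):
    """Degree to which A is contained in B: |A ∧ B| / |A| (sigma-counts)."""
    num = sum(min(a, b) for a, b in zip(A, B))
    den = sum(A)
    return num / den if den else 1.0

def entropy(A):
    """Kosko's fuzziness: |A ∧ A^c| / |A ∨ A^c|; 0 for crisp sets, maximal at 0.5."""
    comp = [1 - a for a in A]
    return (sum(min(a, c) for a, c in zip(A, comp))
            / sum(max(a, c) for a, c in zip(A, comp)))

A = [0.2, 0.8, 1.0, 0.0]
B = [0.5, 0.9, 1.0, 0.3]
print(inclusion(A, B), entropy(A))
```

Kosko's classical identity, that E(A) equals the degree to which A ∨ A^c is included in A ∧ A^c, is one concrete instance of the entropy-from-inclusion direction that the paper treats axiomatically.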
A General Model for Transforming Vague Sets into Fuzzy Sets

The relationship between vague sets and fuzzy sets is analyzed and the problem of transforming vague sets into fuzzy sets is studied in this paper. The transformation of a vague set into a fuzzy set is found to be a many-to-one mapping. A general model for transforming vague sets into fuzzy sets is proposed. The two transformation methods proposed by Fan Li in [1] are proved to be special cases of this general model.

Yong Liu, Guoyin Wang, Lin Feng
An Iterative Method for Quasi-Variational-Like Inclusions with Fuzzy Mappings

This paper presents an iterative method for solving a class of generalized quasi-variational-like inclusions with fuzzy mappings. The method employs step size controls that enable applications to problems where certain set-valued mappings do not always map to the empty set. The algorithm also adopts the recently introduced (H, η)-monotone concept, which unifies many known monotonicities and thus generalizes many existing results.

Yunzhi Zou, Nanjing Huang

Granular Computing

Application of Granular Computing in Knowledge Reduction

Skowron’s discernibility matrix is one of representative approaches in computing relative core and relative reducts, while redundant information is also involved. To decrease the complexity of computation, the idea of granular computing is applied to lower the rank of discernibility matrix. In addition, the absorptivity based on bit-vector computation is proposed to simplify computation of relative core and relative reducts.

Lai Wei, Duoqian Miao
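A minimal sketch of absorption on bit-vector encoded discernibility entries: each entry is a bitmask over attributes, and any entry that contains another as a subset is logically absorbed (since (a ∨ b) ∧ a ≡ a) and can be dropped before reduct computation. The masks below are hypothetical, and this shows the absorption idea only, not the paper's full rank-lowering scheme.

```python
# Absorption over bitmask-encoded discernibility entries; masks hypothetical.
def absorb(entries):
    """Keep only minimal entries: e is dropped if some kept f satisfies f ⊆ e."""
    kept = []
    for e in sorted(set(entries), key=lambda m: bin(m).count("1")):
        if not any(f & e == f for f in kept):   # f subset of e -> e absorbed
            kept.append(e)
    return kept

# attributes a, b, c, d encoded as bits 0..3
entries = [0b0011, 0b0001, 0b1011, 0b0110]      # {a,b}, {a}, {a,b,d}, {b,c}
print([bin(m) for m in absorb(entries)])        # {a} absorbs {a,b} and {a,b,d}
```

Bitmasks make both the subset test and the attribute-difference computation single machine-word operations, which is the efficiency argument behind bit-vector encodings.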
Advances in the Quotient Space Theory and Its Applications

The quotient space theory uses a triplet, consisting of the universe, its structure and its attributes, to describe a problem space, or simply a space. This paper improves the quotient space model so as to absorb the methods of rough sets. It also generalizes the false-preserving principle and the true-preserving principle to the probabilistic case. Some basic operations on quotient spaces are introduced. The significant properties of the fuzzy quotient space family are elaborated. The main applications of quotient space theory are discussed.

Li-Quan Zhao, Ling Zhang
The Measures Relationships Study of Three Soft Rules Based on Granular Computing

Granular computing is a new soft computing method. In this paper, the bit representation of granular computing and inclusion measures are used to analyze three kinds of soft rules, namely association rules, decision rules and extensional functional dependencies, and their measure relationships are studied as well. Concretely, some basic concepts are given, and the support and confidence of association rules, the degree of functional dependency of decision rules, and the degree of extensional functional dependencies are discussed respectively. The measure relationships among the three kinds of soft rules are investigated via inclusion measures and granular computing. As a consequence, a united model of these measures is established.

Qiusheng An, WenXiu Zhang

Neural Computing

A Generalized Neural Network Architecture Based on Distributed Signal Processing

In this paper, an unstructured neural network based on the mathematics of holographic storage is presented. While the holographic process is analyzed using distributed signal processing principles, the neural network architecture is adapted to the generalized support vector machine. This work is inspired by similarities between brain waves and the wave propagation and subsequent interference patterns seen in holograms. The mathematics needed to produce a general description of the holographic process is then analyzed. From this analysis it is shown how the holographic process can be used as an associative memory network. This aspect makes this neural network formation process particularly useful for control.

Askin Demirkol
Worm Harm Prediction Based on Segment Procedure Neural Networks

This paper deals with the application of segment procedure neural networks to predicting the harm status of the horsetail-pine worm. A novel procedure neural network is proposed to solve problems which are related to certain distinct segments of a procedure. It is shown that this model is a generalized form of the known procedure neural networks and owns all properties of the known model. This paper also presents learning algorithms for segment procedure neural networks. Horsetail-pine worm forecasting is a hard task for forest experts, but it is a typical segment procedure problem. In this paper a segment procedure neural network is applied to this issue, and some simulation experiment results are presented.

Jiuzhen Liang, Xiaohong Wu
Accidental Wow Defect Evaluation Using Sinusoidal Analysis Enhanced by Artificial Neural Networks

A method for evaluation of parasitic frequency modulation (wow) in archival audio is presented. The proposed approach utilizes sinusoidal components tracking as their variations correspond with the wow defect. The sinusoidal modeling procedures are used to extract the tonal components from severely distorted and significantly modulated audio signals. A prediction module based on neural networks is proposed to improve the tonal components tracking.

Andrzej Czyzewski, Bozena Kostek, Przemyslaw Maziewski, Lukasz Litwic
A Constructive Algorithm for Training Heterogeneous Neural Network Ensemble

This paper presents a new algorithm to construct a neural network ensemble (NNE) based on heterogeneous component neural networks with negative correlation learning. The constructive algorithm consists of two parts: a sub-algorithm to dynamically construct the best heterogeneous component neural networks with negative correlation learning (CBHNN), and a sub-algorithm to incrementally construct a heterogeneous NNE from the trained heterogeneous neural networks (CHNNE). The experimental results show that the heterogeneous NNE is better than the traditional homogeneous NNE method.

Xianghua Fu, Zhiqiang Wang, Boqin Feng

Machine Learning and KDD

Gene Regulatory Network Construction Using Dynamic Bayesian Network (DBN) with Structure Expectation Maximization (SEM)

Discovering gene relationships from gene expression data is a hot topic in the post-genomic era. In recent years, the Bayesian network has become a popular method to reconstruct gene regulatory networks due to its statistical nature. However, it is not suitable for analyzing time-series data and cannot deal with cycles in the gene regulatory network. In this paper we apply the dynamic Bayesian network to model gene relationships in order to overcome these difficulties. By incorporating the structural expectation maximization algorithm into the dynamic Bayesian network model, we develop a new method to learn the regulatory network from the S. cerevisiae cell cycle gene expression data. The experimental results demonstrate that the accuracy of our method outperforms previous work.

Yu Zhang, Zhidong Deng, Hongshan Jiang, Peifa Jia
Mining Biologically Significant Co-regulation Patterns from Microarray Data

In this paper, we propose a novel model, namely g-Cluster, to mine biologically significant co-regulated gene clusters. The proposed model can (1) discover extra co-expressed genes that cannot be found by current pattern/tendency-based methods, and (2) discover inverted relationship overlooked by pattern/tendency-based methods. We also design two tree-based algorithms to mine all qualified g-Clusters. The experimental results show: (1) our approaches are effective and efficient, and (2) our approaches can find an amount of co-regulated gene clusters missed by previous models, which are potentially of high biological significance.

Yuhai Zhao, Ying Yin, Guoren Wang
Fast Algorithm for Mining Global Frequent Itemsets Based on Distributed Database

There are some traditional algorithms for mining global frequent itemsets, most of which adopt Apriori-like frameworks. This results in a large number of candidate itemsets, frequent database scans and heavy communication traffic. To solve these problems, this paper proposes a fast algorithm for mining global frequent itemsets, namely the FMGFI algorithm. It can easily obtain the global frequency of any itemset from the local FP-trees and requires far less communication traffic thanks to top-down and bottom-up searching strategies. It effectively alleviates the existing problems of most algorithms for mining global frequent itemsets. Theoretical analysis and experimental results suggest that the FMGFI algorithm is fast and effective.

Bo He, Yue Wang, Wu Yang, Yuan Chen
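A minimal sketch of the underlying task only, not of FMGFI itself: with horizontally partitioned data, an itemset's global support is the sum of its local supports, so each site can count locally and only the counts need to be communicated. The partitions and threshold are hypothetical, and the brute-force enumeration stands in for the paper's FP-tree and pruning machinery.

```python
# Global frequency from local supports over horizontally partitioned data;
# partitions and threshold hypothetical, enumeration deliberately brute-force.
from itertools import combinations
from collections import Counter

sites = [
    [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}],   # transactions at site 1
    [{"b", "c"}, {"a", "b"}],                    # transactions at site 2
]
min_support = 0.4
total = sum(len(s) for s in sites)

counts = Counter()
for site in sites:            # in a real system each site counts locally
    for t in site:
        for k in range(1, len(t) + 1):
            for items in combinations(sorted(t), k):
                counts[items] += 1               # local supports add up globally

frequent = {i: c for i, c in counts.items() if c / total >= min_support}
print(frequent)
```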
A VPRSM Based Approach for Inducing Decision Trees

This paper presents a new approach for inducing decision trees based on the Variable Precision Rough Set Model (VPRSM). From the rough set theory point of view, in the process of inducing decision trees, some methods, such as information entropy based methods, emphasize the effect of class distribution: the more unbalanced the class distribution is, the more favorable it is. The rough set based approaches for inducing decision trees instead emphasize the effect of certainty: the more certain it is, the better. Two main concepts, the variable precision explicit region and the variable precision implicit region, and the process for inducing decision trees are introduced and discussed in the paper. A comparison between the presented approach and C4.5 on some data sets from the UCI Machine Learning Repository is also reported.

Shuqin Wang, Jinmao Wei, Junping You, Dayou Liu
Differential Evolution Fuzzy Clustering Algorithm Based on Kernel Methods

A new fuzzy clustering algorithm is proposed. Using kernel methods, this paper maps the data in the original space into a high-dimensional feature space in which a fuzzy dissimilarity matrix is constructed. This not only accurately reflects the differences of attributes among classes, but also maps the differences among samples in the high-dimensional feature space onto a two-dimensional plane. Exploiting the strong global search ability and fast convergence of Differential Evolution (DE) algorithms, the method optimizes the coordinates of the samples distributed randomly on the plane, realizing clustering for samples with arbitrarily distributed shapes. It not only overcomes the dependence of clustering validity on the spatial distribution of the samples, but also improves the flexibility of the clustering and the visualization of high-dimensional samples. Numerical experiments show the effectiveness of the proposed algorithm.

Libiao Zhang, Ming Ma, Xiaohua Liu, Caitang Sun, Miao Liu, Chunguang Zhou
Classification Rule Mining Based on Particle Swarm Optimization

The Particle Swarm Optimization (PSO) algorithm is a robust stochastic evolutionary algorithm based on the movement and intelligence of swarms. In this paper, a PSO-based algorithm for classification rule mining is presented. Compared with Ant-Miner and ESIA on public domain data sets, the proposed method achieved higher predictive accuracy and a much smaller rule list.

Ziqiang Wang, Xia Sun, Dexian Zhang
A Bottom-Up Distance-Based Index Tree for Metric Space

Similarity search is of importance in many new database applications; these operations can generally be referred to as similarity search in metric space. In this paper, a new index construction algorithm is proposed for similarity search in metric space. The new data structure, called the bu-tree (bottom-up tree), is based on constructing the index tree from the bottom up, rather than with the traditional top-down approaches. The construction algorithm of the bu-tree and the range search algorithm based on it are given, and updates to the bu-tree are also discussed. The experiments show that the bu-tree is better than the sa-tree in search efficiency, especially when the objects are not uniformly distributed or the query has low selectivity.

Bing Liu, Zhihui Wang, Xiaoming Yang, Wei Wang, Baile Shi
Subsequence Similarity Search Under Time Shifting

Time series data naturally arise in many application domains, and similarity search for time series under dynamic time shifting is prevailing. But most recent research has focused on full length similarity matching of two time series. In this paper a basic subsequence similarity search algorithm based on dynamic programming is proposed: for a given query time series, the algorithm can find the most similar subsequence in a long time series. Furthermore, two improved algorithms are also given; they reduce the amount of computation of the distance matrix for subsequence similarity search. Experiments on real and synthetic data sets show that the improved algorithms significantly reduce the computation amount and running time compared with the basic algorithm.

Bing Liu, Jianjun Xu, Zhihui Wang, Wei Wang, Baile Shi
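A minimal sketch of the basic dynamic programming recurrence for subsequence matching under time shifting (subsequence DTW): a free initial row lets the match start anywhere in the long series, and the best end point is read off the last row. The series below are hypothetical, and none of the paper's improvements for reducing the distance-matrix computation are shown.

```python
# Basic O(m*n) subsequence matching under dynamic time warping;
# the query may start and end anywhere in the long series.
def subsequence_dtw(query, series):
    m, n = len(query), len(series)
    INF = float("inf")
    D = [[INF] * (n + 1) for _ in range(m + 1)]
    D[0] = [0.0] * (n + 1)                 # free start: match may begin anywhere
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(query[i - 1] - series[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # query element repeats
                                 D[i][j - 1],      # series element repeats
                                 D[i - 1][j - 1])  # one-to-one match
    end = min(range(1, n + 1), key=lambda j: D[m][j])
    return D[m][end], end                  # best distance and end position

print(subsequence_dtw([1, 2, 3], [0, 1, 2, 3, 4, 2, 3]))   # (0.0, 4)
```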
Developing a Rule Evaluation Support Method Based on Objective Indices

In this paper, we present an evaluation of a rule evaluation support method for the post-processing of mined results with rule evaluation models based on objective indices. To reduce the cost of the rule evaluation task, which is one of the key procedures in data mining post-processing, we have developed a rule evaluation support method with rule evaluation models, which are obtained from objective indices of mined classification rules and the evaluations of a human expert for each rule. We then evaluate the performance of learning algorithms for constructing rule evaluation models on meningitis data mining as an actual problem and on five rulesets from five kinds of UCI datasets. With these results, we show the availability of our rule evaluation support method.

Hidenao Abe, Shusaku Tsumoto, Miho Ohsaki, Takahira Yamaguchi
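
Objective indices of the kind used as model inputs can be computed from a rule's contingency counts. A minimal sketch with three common indices (support, confidence, lift); whether these specific indices are among those used in the paper is an assumption:

    def rule_indices(n_lhs, n_rhs, n_both, n_total):
        # Counts: antecedent matches, consequent matches, both, all records.
        support = n_both / n_total
        confidence = n_both / n_lhs if n_lhs else 0.0
        expected = (n_lhs * n_rhs) / (n_total * n_total)
        lift = support / expected if expected else 0.0
        return {"support": support, "confidence": confidence, "lift": lift}

Vectors of such indices, paired with an expert's evaluation per rule, would form the training data for rule evaluation models.
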
Data Dimension Reduction Using Rough Sets for Support Vector Classifier

This paper proposes an application of rough sets as a data preprocessing front end for a support vector classifier (SVC). A novel multi-class support vector classification strategy based on a binary tree is also presented. The binary tree extends the pairwise discrimination capability of the SVC to the multi-class case naturally. Experimental results on benchmark datasets show that the proposed method can reduce computational complexity without decreasing classification accuracy, compared to an SVC without data preprocessing.

Genting Yan, Guangfu Ma, Liangkuan Zhu
A Comparison of Three Graph Partitioning Based Methods for Consensus Clustering

Consensus clustering refers to combining multiple clusterings over a common dataset into a consolidated, better one. This paper compares three graph partitioning based methods, which differ in how they summarize the clustering ensemble in a graph. They are evaluated in a series of experiments, where component clusterings are generated by tuning parameters that control their quality and resolution. Finally, the combination accuracy is analyzed as a function of the learning dynamics versus the number of clusterings involved.

Tianming Hu, Weiquan Zhao, Xiaoqiang Wang, Zhixiong Li
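
One standard way to summarize a clustering ensemble in a graph, assumed here for illustration, is the co-association matrix, whose entries record how often two objects are grouped together; the resulting weighted graph can then be handed to a partitioner:

    import numpy as np

    def co_association(labelings):
        # labelings: list of 1-D integer label arrays over the same n objects.
        n = len(labelings[0])
        C = np.zeros((n, n))
        for labels in labelings:
            labels = np.asarray(labels)
            C += (labels[:, None] == labels[None, :])
        return C / len(labelings)  # edge weights in [0, 1]
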
Feature Selection, Rule Extraction, and Score Model: Making ATC Competitive with SVM

Many studies have shown that association-based classification can achieve higher accuracy than traditional rule-based schemes. However, when applied to the text classification domain, the high dimensionality, the diversity of text data sets, and the class skew make classification tasks more complicated. In this study, we present a new method for associative text categorization tasks. First, we integrate feature selection into the rule pruning process rather than treating it as a separate preprocessing procedure. Second, we combine several techniques to extract rules efficiently. Third, a new score model is used to handle the problem caused by imbalanced class distribution. A series of experiments on various real text corpora indicates that, with these approaches, associative text classification (ATC) can achieve classification performance as competitive as that of well-known support vector machines (SVM).

Tieyun Qian, Yuanzhen Wang, Langgang Xiang, WeiHua Gong
Relevant Attribute Discovery in High Dimensional Data: Application to Breast Cancer Gene Expressions

In many domains, data objects are described in terms of a large number of features. The pipelined data mining approach introduced in [1], which uses two clustering algorithms in combination with rough sets and is extended here with genetic programming, is investigated with the purpose of discovering important subsets of attributes in high dimensional data. Their classification ability is described in terms of both collections of rules and analytic functions obtained by genetic programming (gene expression programming). The Leader and several k-means algorithms are used as procedures for attribute set simplification of the information systems later presented to rough set algorithms. Visual data mining techniques, including virtual reality, were used for inspecting results. The data mining process is set up using high-throughput distributed computing techniques. The approach was applied to breast cancer microarray data and led to subsets of genes with high discrimination power with respect to the decision classes.

Julio J. Valdés, Alan J. Barton
Credit Risk Evaluation with Least Square Support Vector Machine

Credit risk evaluation has been a major focus of the financial and banking industry due to recent financial crises and the regulatory concerns of Basel II. Recent studies have revealed that emerging artificial intelligence techniques are advantageous over statistical models for credit risk evaluation. In this study, we discuss the use of the least squares support vector machine (LSSVM) technique to design a credit risk evaluation system that discriminates good creditors from bad ones. Relative to Vapnik's support vector machine, the LSSVM transforms the quadratic programming problem into a set of linear equations, thus reducing the computational complexity. For illustration, a published consumer credit dataset is used to validate the effectiveness of the LSSVM.

Kin Keung Lai, Lean Yu, Ligang Zhou, Shouyang Wang
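
The reduction to a set of linear equations is the heart of the LSSVM. A minimal sketch of the standard Suykens-style training system with an RBF kernel, written from the textbook formulation rather than the authors' code:

    import numpy as np

    def lssvm_train(X, y, gamma=1.0, sigma=1.0):
        # Solve  [ 0      y^T          ] [b]   [0]
        #        [ y   Omega + I/gamma ] [a] = [1],  Omega_ij = y_i y_j K(x_i, x_j)
        n = len(y)
        sq = (X ** 2).sum(axis=1)
        K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / (2.0 * sigma ** 2))
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = y
        A[1:, 0] = y
        A[1:, 1:] = np.outer(y, y) * K + np.eye(n) / gamma
        sol = np.linalg.solve(A, np.r_[0.0, np.ones(n)])
        return sol[0], sol[1:]  # bias b and dual weights alpha
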
The Research of Sampling for Mining Frequent Itemsets

Efficiently mining frequent itemsets is the key step in extracting association rules from large-scale databases. Considering the restriction of min_support in mining association rules, a weighted sampling algorithm for mining frequent itemsets is proposed in this paper. First, a weight is assigned to each transaction. Then, according to the statistically optimal sample size of the database, a sample is extracted based on the weights. Under the algorithm, the sample includes a large proportion of transactions containing frequent itemsets with many items, so that the frequent itemsets mined from the sample are similar to those obtained from the original data. Furthermore, the algorithm can shrink the sample size while guaranteeing sample quality. Experiments verify its validity.

Xuegang Hu, Haitao Yu
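
A sketch of the weighted-sampling idea, assuming weights proportional to transaction length so that transactions rich in items (and hence in candidate itemsets) are favored; the paper's exact weighting scheme may differ:

    import numpy as np

    def weighted_sample(transactions, sample_size, seed=0):
        # Weight each transaction by its length, normalize to probabilities,
        # then draw the sample without replacement.
        rng = np.random.default_rng(seed)
        w = np.array([len(t) for t in transactions], dtype=float)
        idx = rng.choice(len(transactions), size=sample_size,
                         replace=False, p=w / w.sum())
        return [transactions[i] for i in idx]
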
ECPIA: An Email-Centric Personal Intelligent Assistant

In this paper, we describe ECPIA (Email-Centric Personal Intelligent Assistant), which provides a Web-based environment to support a major time sink of our daily lives: the processing of email. The system is designed on an agent-based infrastructure. In addition to the capabilities that an email client should provide, the novel features of ECPIA as a personal assistant include (1) user behavior analysis for publishing bulletins, making appointments, multi-filtering, and prioritizing emails; and (2) email management based on an ontology and multiple filtering agents for blocking junk mail.

Wenbin Li, Ning Zhong, Chunnian Liu
A Novel Fuzzy C-Means Clustering Algorithm

This paper proposes a novel fuzzy c-means clustering algorithm that treats attributes differently. Moreover, by analyzing the Hessian matrix of the new algorithm's objective function, we derive a rule for parameter selection. The experiments demonstrate the validity of the new algorithm and of the guideline for parameter selection.

Cuixia Li, Jian Yu
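
For reference, this is the classical fuzzy c-means update that the proposed algorithm modifies with attribute-specific treatment; one iteration of the standard scheme (not the paper's new objective) looks like:

    import numpy as np

    def fcm_step(X, centers, m=2.0):
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)); rows sum to 1.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Center update: weighted mean with weights u_ik^m.
        Um = U ** m
        return U, (Um.T @ X) / Um.sum(axis=0)[:, None]
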
Document Clustering Based on Modified Artificial Immune Network

The aiNet is an artificial immune system algorithm which exploits features of the natural immune system. In this paper, aiNet is modified by integrating K-means and Principal Component Analysis (PCA), and applied to the more complex task of document clustering. The results of using differently coded feature vectors for documents, binary and real, are compared. PCA is used as a way of reducing the dimension of the feature vectors. The results show that better clusterings are obtained by using aiNet with PCA and real feature vectors.

Lifang Xu, Hongwei Mo, Kejun Wang, Na Tang
A Novel Approach to Attribute Reduction in Concept Lattices

The concept lattice is an effective tool for data analysis and knowledge discovery. Since one of the key problems of knowledge discovery is knowledge reduction, it is necessary to find a simple and effective approach to it. In this paper, we develop a novel approach to attribute reduction by defining a partial relation and partial classes to generate concepts, and by introducing the notion of the meet-irreducible element in a concept lattice. Some properties of meet-irreducible elements are presented. Furthermore, we analyze the characteristics of attributes and obtain sufficient and necessary conditions for them. In addition, we illustrate that adopting partial classes to generate concepts, and the resulting approach to attribute reduction, are simpler and more convenient than current approaches.

Xia Wang, Jianmin Ma
Granule Sets Based Bilevel Decision Model

Bilevel decision making addresses the problem in which two levels of decision makers act and react in an uncooperative, sequential manner, each trying to optimize individual objectives under constraints. Such a bilevel optimization structure appears naturally in many aspects of planning, management, and policy making. Two kinds of bilevel decision models have been presented previously: traditional bilevel decision models and rule-set based bilevel decision models. Building on these, granule-set based bilevel decision models are developed in this paper. The new models can be viewed as extensions of the former two; they can describe more bilevel decision making problems and possess some new advantages. We also compare the three models and present some new topics in this research field.

Zheng Zheng, Qing He, Zhongzhi Shi
An Enhanced Support Vector Machine Model for Intrusion Detection

The design and implementation of intrusion detection systems remain an important research issue for maintaining proper network security. Support Vector Machines (SVM), as a classical pattern recognition tool, have been widely used for intrusion detection. However, conventional SVM methods do not account for the differing characteristics of features when building an intrusion detection system. We propose an enhanced SVM model with a weighted kernel function based on features of the training data. Rough set theory is adopted to perform the feature ranking and selection task of the new model. We evaluate the new model on the KDD dataset and the UNM dataset. The results suggest that the proposed model outperforms the conventional SVM in precision, computation time, and false negative rate.

JingTao Yao, Songlun Zhao, Lisa Fan
A Modified K-Means Clustering with a Density-Sensitive Distance Metric

K-Means clustering is by far the most widely used method for discovering clusters in data. It performs well on data with compact, super-spherical distributions, but tends to fail on data organized in more complex and unknown shapes. In this paper, we analyze in detail the characteristic properties of data clustering and propose a novel dissimilarity measure, named the density-sensitive distance metric, which can describe the distribution characteristics of data clusters. Using this dissimilarity measure, a density-sensitive K-Means clustering algorithm is given, which can identify complex non-convex clusters that the original K-Means algorithm cannot. The experimental results on both artificial data sets and real-world problems confirm the validity of the algorithm.

Ling Wang, Liefeng Bo, Licheng Jiao
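
One way to realize a density-sensitive metric of this kind, assumed here for illustration, is to stretch Euclidean edge lengths exponentially and take graph shortest paths, so that routes through dense regions (many short hops) come out shorter than direct long jumps across sparse regions:

    import numpy as np
    from scipy.sparse.csgraph import shortest_path

    def density_sensitive_distances(X, rho=2.0):
        # Edge weight rho**d - 1 grows super-linearly in the Euclidean
        # distance d, penalizing long jumps; shortest paths then prefer
        # chains of nearby points, i.e. paths through dense regions.
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
        return shortest_path(rho ** d - 1.0, directed=False)

The resulting distance matrix can replace the Euclidean distances inside a K-Means-style assignment step.
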
Swarm Intelligent Tuning of One-Class ν-SVM Parameters

The problem of kernel parameter selection for a one-class classifier, the ν-SVM, is studied. An improved constrained particle swarm optimization (PSO) is proposed to optimize the RBF kernel parameters of the ν-SVM, and two kinds of flexible RBF kernels are introduced. As a general-purpose swarm intelligence and global optimization tool, PSO does not require the classifier performance criterion to be differentiable or convex. In order to handle the parameter constraints involved in the ν-SVM, the improved constrained PSO uses a punishment term to provide constraint-violation information. Application studies on an artificial banana dataset demonstrate the efficiency of the proposed method.

Lei Xie
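
A generic sketch of penalty-based constrained PSO of the kind described above, in global-best form with an inertia weight; the coefficients and penalty are illustrative assumptions, and in this setting the fitness would wrap a cross-validated ν-SVM error:

    import numpy as np

    def pso_minimize(fitness, penalty, bounds, n_particles=20, iters=50, seed=0):
        # fitness/penalty map a parameter vector to a scalar; the punishment
        # term adds constraint-violation information to the objective.
        rng = np.random.default_rng(seed)
        lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
        x = rng.uniform(lo, hi, (n_particles, lo.size))
        v = np.zeros_like(x)
        f = np.array([fitness(p) + penalty(p) for p in x])
        pbest, pbest_f = x.copy(), f.copy()
        g = pbest[np.argmin(pbest_f)].copy()
        for _ in range(iters):
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
            x = x + v
            f = np.array([fitness(p) + penalty(p) for p in x])
            better = f < pbest_f
            pbest[better], pbest_f[better] = x[better], f[better]
            g = pbest[np.argmin(pbest_f)].copy()
        return g, float(pbest_f.min())
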
A Generalized Competitive Learning Algorithm on Gaussian Mixture with Automatic Model Selection

Derived from regularization theory, an adaptive entropy regularized likelihood (ERL) learning algorithm is presented for Gaussian mixture modeling, and is then proved to be, in fact, a generalized competitive learning algorithm. Simulation experiments demonstrate that the adaptive ERL learning algorithm can perform parameter estimation with automatic model selection for Gaussian mixtures, even when two or more Gaussians overlap to a high degree.

Zhiwu Lu, Xiaoqing Lu
The Generalization Performance of Learning Machine with NA Dependent Sequence

Generalization performance is a main concern of theoretical research in machine learning. This note focuses on a theoretical analysis of learning machines with negatively associated (NA) dependent input sequences. An explicit bound on the rate of uniform convergence of the empirical errors to their expected error for NA dependent input sequences is obtained via the inequality of Joag-Dev and Proschan. The uniform convergence approach is used to estimate the convergence rate of the sample error of learning machines that minimize empirical risk with NA dependent input sequences. Finally, we compare these bounds with previous results.

Bin Zou, Luoqing Li, Jie Xu
Using RS and SVM to Detect New Malicious Executable Codes

A hybrid algorithm based on attribute reduction from Rough Set (RS) theory and the classification principles of the Support Vector Machine (SVM) is presented for detecting new malicious executable codes. First, RS attribute reduction is applied as a preprocessor to delete redundant attributes and conflicting objects from the decision table while retaining the essential information losslessly. Then, classification modeling and forecasting tests are carried out with the SVM. This method reduces the dimensionality of the data and decreases the complexity of the process. Finally, the detection ability of this method is compared with that of others. Experimental results show that the presented method can effectively discriminate between normal and malicious executable codes.

Boyun Zhang, Jianping Yin, Jinbo Hao
Applying PSO in Finding Useful Features

In data mining and knowledge discovery, the curse of dimensionality is a limiting factor for numerous potentially powerful machine learning techniques, and rough set theory can be employed to reduce the dimensionality of datasets as a preprocessing step. For rough set based methods, finding reducts is an essential step, yet one of high complexity. In this paper, based on particle swarm optimization (PSO), an optimization algorithm inspired by the social behavior of flocks of birds searching for food, a novel method is proposed for finding useful features instead of reducts in rough set theory. Experiments on UCI datasets show that this method converges well overall and can retrieve useful subsets effectively while retaining attributes of high importance as far as possible.

Yongsheng Zhao, Xiaofeng Zhang, Shixiang Jia, Fuzeng Zhang

Logics and Reasoning

Generalized T-norm and Fractional “AND” Operation Model

In uncertainty reasoning with universal logic, the T-norm is the mathematical model of the "AND" operation. The T-norm and T-generator were defined on the interval [0,1] in previous work. In recent related work, the authors put forward fractional logic based on a continuous radix [a, b]. This paper studies the T-norm and T-generator on an arbitrary interval [a, b] and discusses two kinds of generalized T-generators: the "automorphic increasing T-generator" and the "infinite decreasing T-generator". The authors establish and prove a useful and important theorem, the "generating theorem of generalized T-norms". Using the integrated clusters of generalized T-norms and T-generators, the authors give a mathematical generating method for the "AND" operation model of fractional logic based on an arbitrary interval [a, b]. The operation model is already in use for uncertainty reasoning and flexible control.

Zhicheng Chen, Mingyi Mao, Huacan He, Weikang Yang
Improved Propositional Extension Rule

The extension rule is a new method for theorem proving; whether it behaves well in practice depends on its efficiency. Moreover, the efficiency of the propositional extension rule directly affects that of the first-order extension rule, so the efficiency of the propositional extension rule is very important. ER and IER are the two extension rule methods given by Lin. We have improved the ER method in earlier work. In order to increase the efficiency of IER, this paper improves it with several reduction rules, and then proves the soundness and completeness of the improved method. We also report some preliminary computational results.

Xia Wu, Jigui Sun, Shuai Lu, Ying Li, Wei Meng, Minghao Yin
Web Services-Based Digital Library as a CSCL Space Using Case-Based Reasoning

This study proposes a Web Services-based Digital Library (DL) using Case-based Reasoning as a space for collaborative learning. In the Digital Library environment, cases were designed on the basis of lists of loaned books and personal information. Using those data, a degree of preference was computed, and Case-based Reasoning was used to compare cases in the case base. The proposed system recommends suitable communities to an individual user based on his or her personal preferences. As a result, DLs can play the role of a computer-supported collaborative learning space, providing richer and more useful information to users. In addition, this study demonstrates that DLs can be effectively expanded by using Web Services techniques and Case-based Reasoning.

Soo-Jin Jun, Sun-Gwan Han, Hae-Young Kim
Using Description Logic to Determine Seniority Among RB-RBAC Authorization Rules

Rule-Based RBAC (RB-RBAC) provides a mechanism to dynamically assign users to roles based on authorization rules defined by security policy. In RB-RBAC, seniority levels of rules are also introduced to express domination relationships among rules. Hence, relations among attribute expressions may be quite complex, and security officers may perform incorrect or unintended assignments if they are unaware of such relations behind the authorization rules. We propose a formalization of RB-RBAC in description logic. A seniority-relation determination method is developed based on description logic reasoning services. This method can find seniority relations efficiently, even for rules without identical syntactic structures.

Qi Xie, Dayou Liu, Haibo Yu
The Rough Logic and Roughness of Logical Theories

Tuples in an information system are taken as terms in a logical system, attributes as function symbols, and a tuple taking a value at an attribute as an atomic formula. In this way, an information system is represented by a logical theory in a logical language. The roughness of an information system is represented by the roughness of the logical theory, and the roughness of logical theories is a generalization of that of information systems. A logical theory induces an indiscernibility relation on the Herbrand universe of the logical language, the set of all ground terms. It is natural to conjecture that there is some connection between the logical implication of logical theories and the refinement of the indiscernibility relations they induce. It is proved that no such connection of a simple form exists.

Cungen Cao, Yuefei Sui, Zaiyue Zhang

Multiagent Systems and Web Intelligence

Research on Multi-Agent Service Bundle Middleware for Smart Space

Ubiquitous computing, as the integration of sensors, smart devices, and intelligent technologies to form a "smart space" environment, relies on the development of both middleware and networking technologies. To realize such environments, it is important to reduce the cost of developing pervasive computing applications by encapsulating complex issues in middleware infrastructures. We propose a multi-agent-based middleware infrastructure suitable for the smart space, MASBM (Multi-Agent Service Bundle Middleware), which makes it easy to develop pervasive computing applications. We conclude with initial implementation results and lessons learned from MASBM.

Minwoo Son, Dongkyoo Shin, Dongil Shin
A Customized Architecture for Integrating Agent Oriented Methodologies

While multi-agent systems seem to provide a good basis for building complex systems, the variety of agent-oriented (AO) methodologies may become a problem for developers when it comes to selecting the methodology best suited to a given application domain. To address this, a development architecture is proposed to blend various AO methodologies, empowering developers to assemble a methodology tailored to a given project by putting appropriate models together. To verify its validity, we derive a new approach from the architecture in a research project for the construction of a C4I system on a naval warship.

Xiao Xue, Dan Dai, Yiren Zou
A New Method for Focused Crawler Cross Tunnel

Focused crawlers are programs designed to selectively retrieve Web pages relevant to a specific domain for use by domain-specific search engines. Tunneling is a heuristic-based method that addresses the resulting global optimization problem. In this paper we use a content-block algorithm to enhance a focused crawler's ability to traverse tunnels. The novel algorithm avoids both the overly coarse granularity of evaluating whole pages and the overly fine granularity of evaluating link context alone. A comprehensive experiment shows clearly that this approach outperforms the Best-First and anchor-text algorithms in both harvest ratio and efficiency.

Na Luo, Wanli Zuo, Fuyu Yuan, Changli Zhang
Migration of the Semantic Web Technologies into E-Learning Knowledge Management

The Semantic Web envisions a new Web architecture whose content carries formal semantics, which can enhance the navigation and discovery of content. As a result, the Semantic Web represents a promising technology for realizing e-Learning requirements. In this paper, we present our approach for migrating Semantic Web technologies into knowledge management in the e-Learning environment. Based on the semantic layer, our e-Learning framework provides dynamic knowledge management and representation, including tight integration with the related e-Learning standards.

Baolin Liu, Bo Hu
Opponent Learning for Multi-agent System Simulation

Multi-agent reinforcement learning is a challenging issue in artificial intelligence research. In this paper, a reinforcement learning model and algorithm for the multi-agent system simulation context are presented. We propose and validate opponent-modeling learning for the problem of finding good policies for agents situated in an adversarial artificial world. The distinguishing feature of the algorithm is that, in a multi-player adversarial environment, the immediate reward depends not only on the agent's own action choice but also on its opponent's tendencies. Experimental results show that the learning agent finds optimal policies in accordance with the reward functions provided.

Ji Wu, Chaoqun Ye, Shiyao Jin

Pattern Recognition

A Video Shot Boundary Detection Algorithm Based on Feature Tracking

Partitioning a video sequence into shots is the first and key step toward video-content analysis and content-based video browsing and retrieval. A novel video shot boundary detection algorithm based on feature tracking is presented. First, the proposed algorithm extracts a set of corner points as features from the first frame of a shot. Then, based on Kalman filtering, these features are tracked through subsequent frames with a window-matching method. From the characteristic pattern of pixel intensity changes between corresponding windows, a measure for shot boundary detection can be obtained that confirms the type of transition and the time interval of gradual transitions. The experimental results illustrate that the proposed algorithm is effective and robust, with low computational complexity.

Xinbo Gao, Jie Li, Yang Shi
Curvelet Transform for Image Authentication

In this paper, we propose a new image authentication algorithm using the curvelet transform. In our algorithm, we apply the ridgelet transform to each block subbanded from the image after a wavelet transform. Experimental results demonstrate that this algorithm localizes tampering well and is robust to JPEG compression.

Jianping Shi, Zhengjun Zhai
An Image Segmentation Algorithm for Densely Packed Rock Fragments of Uneven Illumination

Uneven illumination creates difficulty for image processing and segmentation in general. This paper shows that an algorithm combining image classification and valley-edge based fragment delineation is a highly efficient way of delineating densely packed rock fragments in images with uneven illumination. The results show that it is not much affected by fragment surface noise or uneven illumination, and that it is robust for densely packed rock fragments.

Weixing Wang
A New Chaos-Based Encryption Method for Color Image

Conventional encryption methods are not applicable to images because of the need to resist statistical attacks, differential attacks, and gray-code attacks. In this paper, confusion is improved by means of chaotic permutation with an ergodic matrix, and diffusion is implemented through a new chaotic dynamic system incorporating an S-box algebraic operation and an "XOR plus mod" operation, which greatly enhances the practical security of the system at little computational expense; a key scheme is also proposed. Experimental and theoretical results show that our scheme is efficient and very secure.

Xiping He, Qingsheng Zhu, Ping Gu
Support Vector Machines Based Image Interpolation Correction Scheme

A novel error correction scheme for image interpolation algorithms based on support vector machines (SVMs) is proposed. SVMs are trained on the interpolation error distribution of a down-sampled interpolated image to estimate the interpolation error of the source image. Interpolation correction is then applied to the interpolated source image using SVM regression to obtain a more accurate result image. Error correction results for linear, cubic, and warped-distance adaptive interpolation algorithms demonstrate the effectiveness of the scheme.

Liyong Ma, Jiachen Ma, Yi Shen
Pavement Distress Image Automatic Classification Based on DENSITY-Based Neural Network

This study proposes an integrated neural network-based crack imaging system, named the DENSITY-based neural network (DNN), to classify crack types in digital pavement images. The neural network was developed to classify various crack types based on subimages (crack tiles) rather than crack pixels in digital pavement images. The spatial neural network was trained using artificially generated data following the Federal Highway Administration (FHWA) guidelines. The optimal architecture of each neural network was determined from testing results over different numbers of hidden units and training epochs. To validate the system, computer-generated data as well as actual pictures taken of pavements were used. The final results indicate that the DNN produced the best results, with an accuracy of 99.50% on 1591 computer-generated data and 97.59% on 83 actual pavement pictures. The experimental results demonstrate that the DNN is effective in classifying crack types, which will be useful for pavement management.

Wangxin Xiao, Xinping Yan, Xue Zhang
Towards Fuzzy Ontology Handling Vagueness of Natural Languages

At the moment, ontology-based applications do not provide a solution for handling vague information. Recently, some attempts have been made to integrate fuzzy set theory into the ontology domain. This paper presents an approach to handling the nuances of natural language (i.e., adjectives, adverbs) in the fuzzy ontology context. On the one hand, we handle query processing to evaluate vague information; on the other hand, we manage the knowledge domain by extending ontology properties with quality concepts.

Stefania Bandini, Silvia Calegari, Paolo Radaelli
Evoked Potentials Estimation in Brain-Computer Interface Using Support Vector Machine

The single-trial estimation of Visual Evoked Potentials in a brain-computer interface was investigated. Communication carriers between brain and computer were induced by an "imitating-human-natural-reading" paradigm. With careful signal preprocessing and feature selection procedures, we explored the single-trial estimation of EEG using ν-support vector machines in six subjects and, comparing the results using P300 features from channels Fz and Pz, obtained satisfactory classification accuracies of 91.3%, 88.9%, 91.5%, 92.1%, 90.2%, and 90.1%, respectively. The results suggest that the experimental paradigm is feasible and that the speed of our mental speller can be boosted.

Jin-an Guan
Intra-pulse Modulation Recognition of Advanced Radar Emitter Signals Using Intelligent Recognition Method

A new method is proposed to solve the difficult problem of advanced radar emitter signal (RES) recognition. Different from the traditional five-parameter method, the method is composed of feature extraction, feature selection using rough set theory, and a combinatorial classifier. Support vector clustering, support vector classification, and the Mahalanobis distance are integrated to design an efficient combinatorial classifier. Simulation experiments on 155 radar emitter signals with 8 intra-pulse modulations prove it to be a valid and practical method.

Gexiang Zhang
Multi-objective Blind Image Fusion

Based on multi-objective optimization, a novel approach to blind image fusion (fusion without a reference image) is presented in this paper, which achieves optimal fusion indices by optimizing the fusion parameters. First, proper evaluation indices for blind image fusion are given; then the fusion model in the DWT domain is established; and finally, adaptive multi-objective particle swarm optimization (AMOPSO-II) is proposed and used to search for the fusion parameters. AMOPSO-II not only uses an adaptive mutation and an adaptive inertia weight to raise the search capacity, but also uses a new crowding operator to improve the distribution of nondominated solutions along the Pareto front. Results show that AMOPSO-II has better exploratory capabilities than AMOPSO-I and MOPSO, and that the approach to blind image fusion based on AMOPSO-II realizes optimal image fusion.

Yifeng Niu, Lincheng Shen, Yanlong Bu

System Engineering and Description

The Design of Biopathway’s Modelling and Simulation System Based on Petri Net

This paper proposes a new software design method for a biopathway modeling and simulation application. So that the software tool can be used by biologists easily and intuitively, we use Petri nets and stochastic Petri nets to model biopathways; combined with the corresponding algorithms, the tool can perform both deterministic and stochastic simulation. We also add string handling to the Petri net model, so that users can model biopathways such as transcription and translation more effectively. We describe in detail how to model and simulate biopathways with the tool, which should be quickly accepted by biologists and widely used.

Chunguang Ji, Xiancui Lv, Shiyong Li
Timed Hierarchical Object-Oriented Petri Net-Part I: Basic Concepts and Reachability Analysis

To extend object Petri nets (OPN) for modeling and analyzing complex time-critical systems, this paper proposes a high-level Petri net called the timed hierarchical object-oriented Petri net (TOPN). In TOPN, a duration is attached to each object, specifying the minimal and maximal amounts of time within which the behavior of the object can be completed once fired. The problem of state analysis of TOPN models is also addressed, which makes it possible to judge model consistency at a given moment in time. In particular, a new way of representing and handling objects with temporal knowledge is investigated. Finally, the proposed TOPN is used to model and analyze a real decision-making module in a cooperative multi-robot system to demonstrate its effectiveness.

Hua Xu, Peifa Jia
Approximate Semantic Query Based on Multi-agent Systems

Within multi-agent systems, it is almost impossible for multiple Web agents to completely share the same vocabulary, which makes multi-agent communication difficult. In this paper, we propose an approach for better multi-agent communication using approximation of semantic terminology across multiple ontologies. The method uses a description logic language to describe ontological information and performs approximate queries across multiple ontologies.

Yinglong Ma, Kehe Wu, Beihong Jin, Shaohua Liu

Real-Life Applications Based on Knowledge Technology

Swarm Intelligent Analysis of Independent Component and Its Application in Fault Detection and Diagnosis

An industrial process often has a large number of measured variables, which are usually driven by fewer essential variables. An improved independent component analysis based on particle swarm optimization (PSO-ICA) is employed to extract these essential variables. Process faults can be detected more efficiently by monitoring the independent components. On this basis, fault diagnosis is reduced to a string matching problem according to the pattern of alarm-limit violations of the independent components. The length of the longest common subsequence (LLCS) between two strings is used to evaluate the difficulty of distinguishing two faults. The proposed method is illustrated by application to the Tennessee Eastman challenge process.

Lei Xie, Jianming Zhang
Using VPRS to Mine the Significance of Risk Factors in IT Project Management

In this study, combining the concept of quality of classification (QoC) in Variable Precision Rough Set (VPRS) theory with the judgment matrix of the Analytic Hierarchy Process (AHP), we design a method to process the data in decision tables and obtain the significance of risk factors. Then, we explore the stable interval of the variable precision factor β with respect to the significance.

Gang Xie, Jinlong Zhang, K. K. Lai
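
A simplified sketch of the quality-of-classification computation in the variable precision setting, using a majority-inclusion threshold β; this condenses the VPRS definitions for illustration and is not the authors' exact procedure:

    from collections import Counter

    def vprs_quality(partition, decision, beta=0.8):
        # partition: equivalence classes (lists of object ids) induced by
        # the condition attributes; decision: object id -> decision class.
        # A class counts as classified if its majority decision reaches beta.
        total = sum(len(block) for block in partition)
        covered = 0
        for block in partition:
            counts = Counter(decision[obj] for obj in block)
            if max(counts.values()) / len(block) >= beta:
                covered += len(block)
        return covered / total

The significance of a risk factor can then be gauged by how much this quality drops when the factor is removed from the condition attributes.
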
Mining of MicroRNA Expression Data—A Rough Set Approach

In our research we used a microRNA expression-level data set describing eleven types of human cancers. Our methodology was based on data mining (rule induction) using rough set theory, with a novel approach based on rule generations and cumulative rule sets. The original testing data set described only four types of cancer; we further restricted our attention to two of them, breast and ovary. Using our combined rule set, all but one case of breast cancer and all cases of ovary cancer were correctly classified.

Jianwen Fang, Jerzy W. Grzymala-Busse
Classifying Email Using Variable Precision Rough Set Approach

Email has brought us great convenience in our daily work and life. However, unsolicited messages, or spam, flood our mailboxes, and viruses, worms, and denial-of-service attacks that cripple computer networks may hide in spam, wasting bandwidth, time, and money. To this end, this paper presents a novel scheme for classifying emails using the Variable Precision Rough Set approach. Compared with popular classification methods such as Naive Bayes, our anti-spam filter model is effective.

Wenqing Zhao, Yongli Zhu
Facial Expression Recognition Based on Rough Set Theory and SVM

Facial expression recognition is becoming more and more important in computer applications such as health care and children's education. Based on geometric and appearance features, a few works on facial expression recognition have used methods such as ANN and SVM. In this paper, considering geometric features only, a novel approach based on rough set theory and SVM is proposed. The experimental results show that this approach achieves a high recognition ratio while reducing the cost of calculation.

Peijun Chen, Guoyin Wang, Yong Yang, Jian Zhou
Gene Selection Using Rough Set Theory

A generic approach to cancer classification based on gene expression data is important for accurate cancer diagnosis. Instead of using all genes in the dataset, we select a small gene subset out of thousands of genes for classification. Rough set theory is a tool for reducing redundancy in information systems, so its application to gene selection is of interest. In this paper, a novel gene selection method called RMIMR is proposed, which searches for the subset with maximum relevance and maximum positive interaction of genes. Compared with classical methods based on statistics, information theory, and regression, our method leads to significantly improved classification in experiments on four gene expression datasets.

Dingfang Li, Wen Zhang
Attribute Reduction Based Expected Outputs Generation for Statistical Software Testing

Many test cases need to be executed in statistical software testing. A test case consists of a set of inputs and a list of expected outputs, and automatically generating the expected outputs for a large number of test cases is rather difficult. An attribute reduction based approach is proposed in this paper to generate them automatically. In this approach, the input and output variables of the software are expressed as conditional attributes and decision attributes, respectively. The relationship between input and output variables is then obtained by attribute reduction, and the expected outputs for a large number of test sets are generated automatically via this relationship. Finally, a case study and comparison results are presented, which show that the method is effective.

Mao Ye, Boqin Feng, Li Zhu, Yao Lin
FADS: A Fuzzy Anomaly Detection System

In this paper, we propose a novel anomaly detection framework which integrates soft computing techniques to eliminate sharp boundary between normal and anomalous behavior. The proposed method also improves data pre-processing step by identifying important features for intrusion detection. Furthermore, we develop a learning algorithm to find classifiers for imbalanced training data to avoid some assumptions made in most learning algorithms that are not necessarily sound. Preliminary experimental results indicate that our approach is very effective in anomaly detection.

Dan Li, Kefei Wang, Jitender S. Deogun
Gene Selection Using Gaussian Kernel Support Vector Machine Based Recursive Feature Elimination with Adaptive Kernel Width Strategy

Recursive feature elimination based on a non-linear kernel support vector machine (SVM-RFE) with parameter selection by a genetic algorithm is, to a degree, an effective algorithm for gene selection and cancer classification, but its computational complexity is too high for practical implementation. In this paper, we propose a new strategy that uses adaptive kernel parameters in the recursive feature elimination algorithm implemented with Gaussian kernel SVMs, as a better alternative to the aforementioned algorithm for pragmatic reasons. The proposed method performs well in selecting genes and achieves high classification accuracies with these genes on two cancer datasets.

Yong Mao, Xiaobo Zhou, Zheng Yin, Daoying Pi, Youxian Sun, Stephen T. C. Wong
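
A sketch of Gaussian-kernel RFE with an adaptive width, under assumptions chosen for illustration: the kernel parameter gamma is re-set each round from the median pairwise distance of the surviving features, and the feature whose removal least changes the margin term a^T K a is dropped. This is hypothetical binary-classification code, not the authors' implementation; scikit-learn is assumed available:

    import numpy as np
    from sklearn.svm import SVC

    def rbf(X, gamma):
        sq = (X ** 2).sum(axis=1)
        return np.exp(-gamma * np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))

    def svm_rfe_gaussian(X, y, n_keep):
        active = list(range(X.shape[1]))
        while len(active) > n_keep:
            Xa = X[:, active]
            d2 = ((Xa[:, None, :] - Xa[None, :, :]) ** 2).sum(-1)
            gamma = 1.0 / (np.median(d2) + 1e-12)       # adaptive kernel width
            clf = SVC(kernel='rbf', gamma=gamma).fit(Xa, y)
            sv, a = clf.support_, clf.dual_coef_.ravel()  # a_i = alpha_i * y_i
            base = a @ rbf(Xa[sv], gamma) @ a
            scores = [abs(base - a @ rbf(np.delete(Xa[sv], j, axis=1), gamma) @ a)
                      for j in range(len(active))]
            active.pop(int(np.argmin(scores)))          # drop least influential
        return active
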
Backmatter
Metadata
Title
Rough Sets and Knowledge Technology
edited by
Guo-Ying Wang
James F. Peters
Andrzej Skowron
Yiyu Yao
Copyright year
2006
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-36299-9
Print ISBN
978-3-540-36297-5
DOI
https://doi.org/10.1007/11795131
