2009 | Book

Rough Set Theory: A True Landmark in Data Analysis

Edited by: Ajith Abraham, Rafael Falcón, Rafael Bello

Publisher: Springer Berlin Heidelberg

Book series: Studies in Computational Intelligence

About this book

Over the years, rough set theory has earned a well-deserved reputation as a sound methodology for dealing with imperfect knowledge in a simple yet mathematically rigorous way. This edited volume continues to stress the benefits of applying rough sets in many real-life situations while still keeping an eye on topological aspects of the theory and strengthening its links with other soft computing paradigms. The volume comprises 11 chapters and is organized into three parts. Part 1 deals with theoretical contributions, while Parts 2 and 3 focus on several real-world data mining applications. Chapters authored by pioneers were selected on the basis of fundamental ideas and concepts rather than the thoroughness of the techniques deployed. Academics, scientists and engineers working in rough sets, computational intelligence, soft computing and data mining will find the comprehensive coverage of this book invaluable.

Table of contents

Frontmatter

Theoretical Contributions to Rough Set Theory

Frontmatter
Rough Sets on Fuzzy Approximation Spaces and Intuitionistic Fuzzy Approximation Spaces
Summary
In the past few years the original concept of rough sets, as introduced by Pawlak [26], has been extended in many different directions. Some of these extensions are obtained by relaxing the requirement that the basic relations be equivalence relations [14,15,31,32,34,35,37,38], that is, by dropping the requirement of transitivity or symmetry. One such approach is to replace the equivalence relations by fuzzy proximity relations. The notions of rough sets thus generated are called rough sets defined upon fuzzy approximation spaces [14,15]. A generalization of this is obtained by taking intuitionistic fuzzy proximity relations instead of equivalence relations, giving rough sets on intuitionistic fuzzy approximation spaces [37,38]. In this chapter we concentrate on the study of these two notions of rough sets. Our objective is to define these types of rough sets along with related concepts and to establish their properties, which parallel those of basic rough sets. Several real-life applications are considered in the sequel to illustrate the power and necessity of these generalized models of rough sets in the representation and study of imperfect knowledge.
B. K. Tripathy
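To make the idea of a rough set on a fuzzy approximation space more concrete, the following is a minimal Python sketch. It assumes one common construction, which may differ in detail from the chapter: the fuzzy proximity relation (reflexive and symmetric) is cut at a threshold alpha, the transitive closure of that cut is taken as an equivalence relation, and the usual lower and upper approximations are computed from its classes. The function names and the proximity values are hypothetical and chosen only for illustration.

```python
def alpha_classes(n, proximity, alpha):
    """Classes of the transitive closure of the alpha-cut of a fuzzy
    proximity relation on objects 0..n-1 (assumed construction)."""
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Objects whose proximity reaches alpha are linked; chains of links
    # merge into one class, which realizes the transitive closure.
    for i in range(n):
        for j in range(i + 1, n):
            if proximity[i][j] >= alpha:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), set()).add(i)
    return list(groups.values())


def approximations(classes, target):
    """Return the (lower, upper) approximations of target."""
    lower, upper = set(), set()
    for cls in classes:
        if cls <= target:   # class lies entirely inside the target
            lower |= cls
        if cls & target:    # class overlaps the target
            upper |= cls
    return lower, upper


# Hypothetical proximity values: symmetric, with 1.0 on the diagonal.
R = [
    [1.0, 0.8, 0.3, 0.1],
    [0.8, 1.0, 0.6, 0.2],
    [0.3, 0.6, 1.0, 0.4],
    [0.1, 0.2, 0.4, 1.0],
]
target = {0, 2}

for alpha in (0.7, 0.5):
    lower, upper = approximations(alpha_classes(4, R, alpha), target)
    print(alpha, sorted(lower), sorted(upper))
```

With these values, alpha = 0.7 yields the classes {0, 1}, {2}, {3}, so the lower approximation of {0, 2} is {2} and the upper approximation is {0, 1, 2}. Lowering alpha to 0.5 merges objects 0, 1 and 2 into one class, and the lower approximation becomes empty: a coarser granulation makes the same set rougher.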
Categorical Innovations for Rough Sets
Summary
Categories arise in mathematics and appear frequently in computer science where algebraic and logical notions have powerful representations using categorical constructions. In this chapter we lean towards the functorial view involving natural transformations and monads. Functors extendable to monads, further incorporating order structure related to the underlying functor, turn out to be very useful when presenting rough sets beyond relational structures in the usual sense. Relations can be generalized with rough set operators largely maintaining power and properties. In this chapter we set forward our required categorical tools and we show how rough sets and indeed a theory of rough monads can be developed. These rough monads reveal some canonic structures, and are further shown to be useful in real applications as well. Information within pharmacological treatment can be structured by rough set approaches. In particular, situations involving management of drug interactions and medical diagnosis can be described and formalized using rough monads.
P. Eklund, M. A. Galán, J. Karlsson
Granular Structures and Approximations in Rough Sets and Knowledge Spaces
Summary
Multilevel granular structures play a fundamental role in granular computing. In this chapter, we present a general framework of granular spaces. Within the framework, we examine the granular structures and approximations in rough set analysis and knowledge spaces. Although the two theories use different types of granules, they can be unified in the proposed framework.
Yiyu Yao, Duoqian Miao, Feifei Xu
On Approximation of Classifications, Rough Equalities and Rough Equivalences
Summary
In this chapter we focus mainly on some topological aspects of rough sets and approximations of classifications. The topological classification of rough sets deals with their types. We determine the types of intersections and unions of rough sets. New concepts of rough equivalence (top, bottom and total) are defined, which capture approximate equality of sets at a higher level than the rough equalities (top, bottom and total) introduced and studied by Novotny and Pawlak [23,24,25], and which are also more realistic. Properties are established when top and bottom rough equalities are interchanged. Parallel properties for rough equivalences are also established. We study approximation of classifications (introduced and studied by Busse [12]) and completely determine the different types of classifications of a universe. We derive properties of rules generated from information systems and make observations on the structure of such rules. The algebraic properties that hold for crisp sets and deal with equalities lose their meaning when crisp sets are replaced with rough sets. We analyze the validity of such properties with respect to rough equivalences.
B. K. Tripathy
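The difference between rough equality and rough equivalence can be illustrated with a short Python sketch. Rough equality follows the standard definition: two sets are bottom (top) rough equal when their lower (upper) approximations coincide. For rough equivalence the sketch assumes the following reading, which should be checked against the chapter: two sets are bottom rough equivalent when their lower approximations are both empty or both non-empty, and top rough equivalent when their upper approximations are both equal to the universe or both different from it. The function names and the toy partition are hypothetical.

```python
def approximations(partition, target):
    """Return the (lower, upper) approximations of target."""
    lower, upper = set(), set()
    for block in partition:
        if block <= target:
            lower |= block
        if block & target:
            upper |= block
    return lower, upper


def rough_equal(partition, x, y):
    """Return (bottom, top) rough equality of x and y."""
    lx, ux = approximations(partition, x)
    ly, uy = approximations(partition, y)
    return lx == ly, ux == uy


def rough_equivalent(partition, universe, x, y):
    """Return (bottom, top) rough equivalence under the assumed reading."""
    lx, ux = approximations(partition, x)
    ly, uy = approximations(partition, y)
    bottom = (not lx) == (not ly)                  # both empty or both non-empty
    top = (ux == universe) == (uy == universe)     # both the universe or both not
    return bottom, top


# Hypothetical universe split into three equivalence classes.
universe = {1, 2, 3, 4, 5, 6}
partition = [{1, 2}, {3, 4}, {5, 6}]
x, y = {1, 2, 3}, {1, 2, 5}

print(rough_equal(partition, x, y))
print(rough_equivalent(partition, universe, x, y))
```

Here both sets have the lower approximation {1, 2}, but their upper approximations are {1, 2, 3, 4} and {1, 2, 5, 6}. They are therefore bottom rough equal but not top rough equal, while under the assumed definitions they are both bottom and top rough equivalent. Total equality or equivalence holds when both components are true.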

Rough Set Data Mining Activities

Frontmatter
Rough Clustering with Partial Supervision
Summary
This study focuses on bringing two rough-set-based clustering algorithms into the framework of partially supervised clustering. A mechanism of partial supervision relying on either fuzzy membership grades or rough memberships and non-memberships of patterns to clusters is envisioned. Allowing such knowledge-based hints to play an active role in the discovery of the overall structure of the dataset has proved to be highly beneficial, this being corroborated by the empirical results. Other existing rough clustering techniques can successfully incorporate this type of auxiliary information with little computational effort.
Rafael Falcón, Gwanggil Jeon, Rafael Bello, Jechang Jeong
A Generic Scheme for Generating Prediction Rules Using Rough Sets
Abstract
This chapter presents a generic scheme for generating prediction rules based on the rough set approach for stock market prediction. To increase the efficiency of the prediction process, a rough set with Boolean reasoning discretization algorithm is used to discretize the data. A rough set reduction technique is applied to find all the reducts of the data, each of which contains a minimal subset of attributes associated with a class label for prediction. Finally, rough set dependency rules are generated directly from all generated reducts. A rough confusion matrix is used to evaluate the performance of the predicted reducts and classes. For comparison, the results obtained using the rough set approach were compared with those of artificial neural networks and decision trees. Empirical results illustrate that the rough set approach achieves a higher overall prediction accuracy, reaching over 97%, and generates fewer and more compact rules than the neural network and decision tree algorithms.
Hameed Al-Qaheri, Aboul Ella Hassanien, Ajith Abraham
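The reduction step described in this abstract can be sketched in Python as follows. This is an illustrative brute-force search, not the chapter's algorithm: it assumes already discretized data, treats a reduct as a minimal attribute subset whose positive region is as large as that of the full attribute set, and enumerates subsets exhaustively, which is only feasible for a handful of attributes. The attribute names and the four-row decision table are invented for the example.

```python
from itertools import combinations


def positive_region_size(rows, attrs, decision):
    """Count the rows whose indiscernibility class under attrs
    carries a single decision value."""
    blocks = {}
    for i, row in enumerate(rows):
        blocks.setdefault(tuple(row[a] for a in attrs), []).append(i)
    return sum(
        len(block)
        for block in blocks.values()
        if len({rows[i][decision] for i in block}) == 1
    )


def dependency(rows, attrs, decision):
    """Dependency degree: fraction of rows in the positive region."""
    return positive_region_size(rows, attrs, decision) / len(rows)


def reducts(rows, conditions, decision):
    """All minimal subsets of conditions that preserve the positive region."""
    full = positive_region_size(rows, conditions, decision)
    found = []
    # Subsets are visited by increasing size, so skipping supersets of
    # earlier hits keeps every reported subset minimal.
    for k in range(1, len(conditions) + 1):
        for subset in combinations(conditions, k):
            if any(set(r) <= set(subset) for r in found):
                continue
            if positive_region_size(rows, subset, decision) == full:
                found.append(subset)
    return found


# Hypothetical discretized records; "move" is the class label.
rows = [
    {"trend": "down", "volume": "low",  "momentum": "neg", "move": "fall"},
    {"trend": "down", "volume": "high", "momentum": "pos", "move": "rise"},
    {"trend": "up",   "volume": "low",  "momentum": "pos", "move": "rise"},
    {"trend": "up",   "volume": "high", "momentum": "neg", "move": "fall"},
]
conditions = ["trend", "volume", "momentum"]

print(reducts(rows, conditions, "move"))
```

On this toy table the search reports two reducts, ("momentum",) and ("trend", "volume"): either one discerns the class label as well as all three attributes together, whereas "trend" alone has dependency degree 0. Decision rules can then be read off each reduct by listing the attribute values of every indiscernibility class together with its label.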
Rough Web Caching
Summary
The demand for Internet content has risen dramatically in recent years. Servers have become more and more powerful, and the bandwidth of end-user connections and backbones has grown constantly during the last decade. Nevertheless, users often experience poor performance when they access web sites or download files. Such problems are often caused by performance problems that occur directly on the servers (e.g. poor performance of server-side applications or flash crowds) and by problems concerning the network infrastructure (e.g. long geographical distances, network overloads, etc.). Web caching and prefetching have been recognized as effective schemes to alleviate the service bottleneck, minimize user access latency and reduce network traffic. In this chapter, we model the uncertainty in Web caching using the granularity of rough sets (RS) and inductive learning. The proposed framework is illustrated using trace-based experiments on the Boston University Web trace data set.
Sarina Sulaiman, Siti Mariyam Shamsuddin, Ajith Abraham
Software Defect Classification: A Comparative Study of Rough-Neuro-fuzzy Hybrid Approaches with Linear and Non-linear SVMs
Summary
This chapter is an extension of our earlier work in combining and comparing rough hybrid approaches with neuro-fuzzy and partial decision trees in classifying software defect data. The extension includes a comparison of our earlier results with linear and non-linear support vector machines (SVMs) in classifying defects. We compare SVM classification results with partial decision trees, neuro-fuzzy decision trees (NFDT), the LEM2 algorithm based on rough sets, rough-neuro-fuzzy decision trees (R-NFDT), and fuzzy-rough classification trees (FRCT). The analyses of the results include statistical tests for classification accuracy. The experiments were aimed at not only comparing classification accuracy, but also collecting other useful software quality indicators such as number of rules, number of attributes (metrics) and the type of metrics (design vs. code level). The contribution of this chapter is a comprehensive comparative study of several computational intelligence methods in classifying software defect data. The different methods also point to the type of metrics data that ought to be collected and whether the rules generated by these methods can be easily interpreted.
Rajen Bhatt, Sheela Ramanna, James F. Peters

Rough Hybrid Models to Classification and Attribute Reduction

Frontmatter
Rough Sets and Evolutionary Computation to Solve the Feature Selection Problem
Summary
The feature selection problem has usually been addressed through heuristic approaches given its significant computational complexity. In this context, evolutionary techniques have drawn researchers' attention owing to their appealing optimization capabilities. In this chapter, promising results achieved by the authors in solving the feature selection problem through a joint effort between rough set theory and evolutionary computation techniques are reviewed. In particular, two new heuristic search algorithms are introduced: Dynamic Mesh Optimization, and another approach that splits the search process carried out by swarm intelligence methods.
Rafael Bello, Yudel Gómez, Yailé Caballero, Ann Nowe, Rafael Falcón
Nature Inspired Population-Based Heuristics for Rough Set Reduction
Summary
Finding reducts is one of the key problems in the growing range of applications of rough set theory and also one of the bottlenecks of the rough set methodology. Population-based reduction approaches are attractive for finding multiple reducts in decision systems. In this chapter, we introduce two nature-inspired population-based computational optimization techniques, Particle Swarm Optimization (PSO) and the Genetic Algorithm (GA), for rough set reduction. As a new heuristic algorithm, PSO is particularly attractive for this challenging problem. The approach discovers the best feature combinations efficiently by observing the change of the positive region as the particles proceed through the search space. We evaluated the performance of the two algorithms using some benchmark datasets, and the corresponding computational experiments are discussed. Empirical results indicate that both methods are ideal for all the considered problems and that the particle swarm optimization technique outperformed the genetic algorithm approach by obtaining a larger number of reducts for the datasets. We also illustrate a real-world application in fMRI data analysis, which is helpful for cognition research.
Hongbo Liu, Ajith Abraham, Yanheng Li
Developing a Knowledge-Based System Using Rough Set Theory and Genetic Algorithms for Substation Fault Diagnosis
Summary
Supervisory Control and Data Acquisition (SCADA) systems are fundamental tools for quick fault diagnosis and efficient restoration of power systems. When multiple faults, or malfunctions of protection devices occur in the system, the SCADA system issues many alarm signals rapidly and relays these to the control center. The original cause and location of the fault can be difficult to determine for operators under stress without assistance from a computer aided decision support system. In cases of power system disturbances, network operators in the control center must use their judgement and experience to determine the possible faulty elements as the first step in the restoration procedures. If a breaker or its associated relays fail to operate, the fault is removed by backup protection. In such cases, the outage area can be large and it is then difficult for the network operators to estimate the fault location. Multiple faults, events and actions may eventually take place with many breakers being tripped within a short time. In these circumstances, many alarms need to be analysed by the operators to ensure that the most appropriate actions are taken [1]. Therefore, it is essential to develop software tools to assist in these situations.
This chapter proposes a novel hybrid approach using Rough Set Theory and a Genetic Algorithm (RS-GA) to extract knowledge from a set of events captured by (microprocessor-based) protection, control and monitoring devices (referred to as Intelligent Electronic Devices (IEDs)). The approach involves formulating a set of rules that identify the most probable faulty section in a network. The idea of this work is to enhance the capability of substation informatics and to assist real-time decision support so that the network operators can diagnose the type and cause of the events in a time frame ranging from a few minutes to an hour. Building knowledge for a fault diagnostic system can be a lengthy and costly process. The quality of the knowledge base is sometimes hampered by extra and superfluous rules that lead to large knowledge-based systems and serious inconveniences to rule maintenance. The proposed technique can not only induce the decision rules efficiently but also reduce the size of the knowledge base without causing loss of useful information. Numerous case studies have been performed on a simulated distribution network [2] that includes relay models [3]. The network, modelled using a commercial power system simulator, PSCAD (Power Systems Computer Aided Design)/EMTDC (ElectroMagnetic Transients including DC), was used to investigate the effect of faults and switching actions on the protection and control equipment. The results have revealed the usefulness of the proposed technique for fault diagnosis and have also demonstrated that the extracted rules are capable of identifying and isolating the faulty section and hence improve the outage response time. These rules can be used by an expert system in supervisory automation and to support operators during emergency situations; for example, diagnosis of the type and cause of a fault event leads to network restoration and post-emergency repair.
Ching Lai Hor, Peter Crossley, Simon Watson, Dean Millar
Backmatter
Metadata
Title
Rough Set Theory: A True Landmark in Data Analysis
Edited by
Ajith Abraham
Rafael Falcón
Rafael Bello
Copyright year
2009
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-89921-1
Print ISBN
978-3-540-89920-4
DOI
https://doi.org/10.1007/978-3-540-89921-1