
2013 | Book

Rough Sets and Knowledge Technology

8th International Conference, RSKT 2013, Halifax, NS, Canada, October 11-14, 2013, Proceedings

Edited by: Pawan Lingras, Marcin Wolski, Chris Cornelis, Sushmita Mitra, Piotr Wasilewski

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science


About this Book

This book constitutes the thoroughly refereed conference proceedings of the 8th International Conference on Rough Sets and Knowledge Technology, RSKT 2013, held in Halifax, Canada, in October 2013 as one of the co-located conferences of the 2013 Joint Rough Set Symposium, JRS 2013. The 69 papers (including 44 regular and 25 short papers) included in the JRS proceedings (LNCS 8170 and LNCS 8171) were carefully reviewed and selected from 106 submissions. The papers in this volume cover topics such as history and future of rough sets; foundations and probabilistic rough sets; rules, reducts, ensembles; new trends in computing; three-way decision rough sets; and learning, predicting, modeling.

Table of Contents

Frontmatter

Tutorial

Using Domain Knowledge in Initial Stages of Knowledge Discovery in Databases
Tutorial Description

In this tutorial the topic of data preparation for Knowledge Discovery in Databases (KDD) is discussed at a rather general level, with just a few detailed descriptions of particular data processing steps. The general ideas are illustrated with application examples, most of which are taken from real-life KDD projects.

Marcin Szczuka

History and Future of Rough Sets

Non-deterministic Information in Rough Sets: A Survey and Perspective

We have been coping with issues connected with non-deterministic information in rough sets. Non-deterministic information is a kind of incomplete information: it specifies a set that contains the actual value, although we do not know which element the actual value is. If the specified set is equal to the domain of attribute values, this corresponds to a missing value. We need to identify the merits of each kind of information and apply them to the analysis of data sets. In this paper, we describe our view of non-deterministic information as well as incomplete information, some algorithms and software tools, and their perspective in rough sets.

Hiroshi Sakai, Mao Wu, Naoto Yamaguchi, Michinori Nakata
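
As a minimal illustration of the definition above (not taken from the paper's software tools), a non-deterministic value can be modeled in Python as the set of candidate values; a set equal to the whole attribute domain behaves like a missing value:

    # Hypothetical toy table: each attribute value is a set of candidates.
    domain = {"red", "green", "blue"}
    table = {
        "o1": {"color": {"red"}},           # deterministic: the value is known
        "o2": {"color": {"red", "green"}},  # non-deterministic: one of two
        "o3": {"color": set(domain)},       # equals the domain -> missing value
    }

    def is_missing(value_set, attribute_domain):
        return value_set == attribute_domain

    print(is_missing(table["o3"]["color"], domain))  # True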
Granular Computing and Sequential Three-Way Decisions

Real-world decision making typically involves the three options of acceptance, rejection and non-commitment. Three-way decisions can be motivated, interpreted and implemented based on the notion of information granularity. With coarse-grained granules, it may only be possible to make a definite decision of acceptance or rejection for some objects. A lack of detailed information may make a definite decision impossible for some other objects, and hence the third non-commitment option is used. Objects with a non-commitment decision may be further investigated by using fine-grained granules. In this way, multiple levels of granularity lead naturally to sequential three-way decisions.

Yiyu Yao
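
The multi-level process described above can be sketched in a few lines of Python; `evaluate` is a hypothetical function returning an estimated probability for an object at a given granularity level, and the (alpha, beta) thresholds are illustrative:

    def sequential_three_way(objects, levels, evaluate, alpha=0.8, beta=0.2):
        accepted, rejected, pending = set(), set(), set(objects)
        for level in levels:                    # coarse-grained -> fine-grained
            for x in list(pending):
                p = evaluate(x, level)
                if p >= alpha:                  # definite acceptance
                    accepted.add(x); pending.discard(x)
                elif p <= beta:                 # definite rejection
                    rejected.add(x); pending.discard(x)
            if not pending:                     # nothing left to refine
                break
        return accepted, rejected, pending      # pending = non-commitment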
A Scientometrics Study of Rough Sets in Three Decades

Rough set theory has been attracting researchers and practitioners for over three decades. The theory and its applications have experienced unprecedented prosperity, especially in the past ten years. It is essential to explore and review the progress made in the field of rough sets. Based mainly on the Web of Science database, we analyze the prolific authors, high-impact authors, influential groups, and the most impactful papers of the past three decades. In addition, we examine rough set development in the past five years. One of the goals of this article is to use scientometric approaches to study three decades of research in rough sets. We review the historic growth of rough sets and elaborate on the recent state of development in this field.

JingTao Yao, Yan Zhang
Generalizations of Approximations

In this paper we consider a generalization of the indiscernibility relation, i.e., a relation R that is not necessarily reflexive, symmetric, or transitive. There exist 36 basic definitions of lower and upper approximations based on such a relation R. Additionally, there are six probabilistic approximations, generalizations of 12 corresponding lower and upper approximations. How to convert the remaining 24 lower and upper approximations to 12 respective probabilistic approximations is an open problem.

Patrick G. Clark, Jerzy W. Grzymała-Busse, Wojciech Rząsa
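
For readers unfamiliar with the construction, one of the basic successor-set definitions can be sketched as follows (only one of the 36 variants studied in the paper; R need not be reflexive, symmetric, or transitive):

    def approximations(U, R, X):
        """U: collection of objects, R: set of related pairs (x, y), X: subset of U."""
        def successors(x):
            return {y for y in U if (x, y) in R}
        lower = {x for x in U if successors(x) <= X}   # R(x) is contained in X
        upper = {x for x in U if successors(x) & X}    # R(x) intersects X
        return lower, upper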
Expression and Processing of Uncertain Information

Uncertainty is a basic feature of information processing, and the expression and processing of uncertain information have attracted increasing attention. Many theories have been introduced to process uncertain information, such as probability theory, random sets, evidence theory, fuzzy set theory, rough set theory, and cloud model theory. Each depicts uncertain information from a different aspect. This paper mainly discusses their differences and relations in expressing and processing uncertain information. The future development trend is also discussed.

Guoyin Wang, Changlin Xu, Hong Yu
Early Development of Rough Sets - From a Personal Perspective

First I would like to thank Dr. Lingras and Dr. Yao for giving me the opportunity to talk to you this morning and get acquainted again with many friends whom I have not seen for quite a while.

I would like to share with you my own personal involvement in the early development of Rough Sets proposed by Professor Pawlak [1,2]. My talk this morning is definitely not meant to be a review of all the important work done in Rough Sets since then. Another thing I want to emphasize is that I am not an expert in this field at all, but it will become clear to you as the story unfolds that somehow my connection with Rough Sets has never been broken over these years.

S. K. M. Wong

Foundations and Probabilistic Rough Sets

Contraction to Matroidal Structure of Rough Sets

As an important technique for granular computing, rough sets deal with vagueness and granularity in information systems. Rough sets are often used for attribute reduction; however, the corresponding algorithms are usually greedy ones. Matroids generalize linear independence in vector spaces and provide well-established platforms for greedy algorithms. In this paper, we apply contraction to a matroidal structure of rough sets. Firstly, for an equivalence relation on a universe, a matroid is established through the lower approximation operator. Secondly, three characteristics of the dual of this matroid, which are useful for applying a new operation to the dual matroid, are investigated. Finally, the operation named contraction is applied to the dual matroid. We study the relationships between the contractions of the dual matroid to two subsets: the complement of a single-point set and the complement of the equivalence class of that point. Moreover, these relationships are extended to general cases. In short, these results offer an interesting view of the combination of rough sets and matroids.

Jingqian Wang, William Zhu
Optimal Approximations with Rough Sets

When arbitrary sets are approximated by more structured sets, it may not be possible to obtain an exact approximation that is equivalent to a given set. A proposal is presented for a ‘metric’ approach to Rough Sets. This includes a definition of the ‘optimal’ or best approximation with respect to a measure of similarity, and an algorithm to find it using the Jaccard Index. A definition of consistency also allows the algorithm to work for a larger class of similarity measures. Several consequences of these definitions are also presented.

Ryszard Janicki, Adam Lenarčič
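
As a sketch of the idea (a greedy illustration over assumed equivalence classes, not the authors' exact algorithm), an approximation can be built by starting from the lower approximation and adding boundary classes while the Jaccard index improves:

    def jaccard(a, b):
        return len(a & b) / len(a | b) if (a or b) else 1.0

    def optimal_approximation(X, classes):
        approx = set()
        for c in classes:
            if c <= X:
                approx |= c             # the lower approximation is always kept
        boundary = [c for c in classes if (c & X) and not (c <= X)]
        # Classes with the highest overlap ratio are the most promising.
        for c in sorted(boundary, key=lambda c: len(c & X) / len(c), reverse=True):
            if jaccard(approx | c, X) > jaccard(approx, X):
                approx |= c
        return approx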
Partial Approximation of Multisets and Its Applications in Membrane Computing

The partial nature of real-life problems requires working out partial approximation schemes. Partial approximation of sets is based on classical set theory. Its generalization to multisets gives a plausible opportunity to introduce an abstract concept of “being close enough to a membrane” in membrane computing. The paper presents important features of general (possibly partial) multiset approximation spaces and their lattice-theoretic properties, and shows how partial multiset approximation spaces can be applied to membrane computing.

Tamás Mihálydeák, Zoltán Ernő Csajbók
A Formal Concept Analysis Based Approach to Minimal Value Reduction

Reduction is a core issue in Rough Set Theory. Current reductions fall into three categories: tuple reduction, attribute reduction, and value reduction. From the reduced tables, decision rules can be derived. For the purposes of storage and better understanding, minimization of the rule set is desired, but it is NP-hard. To tackle this problem, a heuristic approach based on Formal Concept Analysis is proposed in this paper to approximate the minimal value reduct set. Experiments show that our approach is valid and achieves higher accuracy.

Mei-Zheng Li, Guoyin Wang, Jin Wang
Comparison of Two Models of Probabilistic Rough Sets

To generalize the classical rough set model, several proposals have been made by considering probabilistic information. Each of the proposed probabilistic models uses three regions for approximating a concept. Although the three regions are similar in form, they have different semantics and are therefore appropriate for different applications. In this paper, we present a comparative study of a decision-theoretic rough set model and a confirmation-theoretic rough set model. We argue that the former deals with drawing conclusions based on available evidence, while the latter concerns evaluating different pieces of evidence. By considering both models, we can obtain a more comprehensive understanding of probabilistic rough sets.

Bing Zhou, Yiyu Yao
Empirical Risk Minimization for Variable Precision Dominance-Based Rough Set Approach

In this paper, we characterize the Variable Precision Dominance-based Rough Set Approach (VP-DRSA) from the viewpoint of empirical risk minimization. VP-DRSA is an extension of the Dominance-based Rough Set Approach (DRSA) that admits some degree of misclassification error. From a definable set, we derive a classification function, which indicates the assignment of an object to a decision class. Then, we define an empirical risk associated with the classification function, given by the mean hinge loss. We prove that the classification function minimizing the empirical risk corresponds to the lower approximation in VP-DRSA.

Yoshifumi Kusunoki, Jerzy Błaszczyński, Masahiro Inuiguchi, Roman Słowiński
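
The empirical risk named in the abstract is the standard mean hinge loss; written out as a small sketch (with labels y_i in {-1, +1} and classification scores f(x_i)):

    import numpy as np

    def empirical_hinge_risk(y, f_x):
        """(1/n) * sum_i max(0, 1 - y_i * f(x_i))."""
        return np.mean(np.maximum(0.0, 1.0 - y * f_x))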
Formulating Game Strategies in Game-Theoretic Rough Sets

The determination of thresholds (α, β) has been considered a fundamental issue in probabilistic rough sets. The game-theoretic rough set (GTRS) model determines the required thresholds based on a formulated game between different properties related to rough set approximations and classification. The game strategies in the GTRS model are generally based on an initial threshold configuration that corresponds to the Pawlak model. We study different approaches for formulating strategies by considering different initial conditions. An example game is shown for each case. The selection of a particular approach for a given problem may be based on the quality of data and the computing resources at hand. The realization of these approaches in GTRS-based methods may bring new insights into the effective determination of probabilistic thresholds.

Nouman Azam, JingTao Yao
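
For context, the standard probabilistic three-way regions induced by a pair of thresholds look as follows (a sketch; the paper's contribution is how the game chooses (α, β), which is not reproduced here):

    def three_way_regions(objects, prob, alpha, beta):
        """prob(x): estimated conditional probability P(X | [x]), with beta < alpha."""
        pos = {x for x in objects if prob(x) >= alpha}        # acceptance
        neg = {x for x in objects if prob(x) <= beta}         # rejection
        bnd = {x for x in objects if beta < prob(x) < alpha}  # deferment
        return pos, neg, bnd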

Rules, Reducts, Ensembles

Sequential Optimization of Approximate Inhibitory Rules Relative to the Length, Coverage and Number of Misclassifications

This paper is devoted to the study of algorithms for sequential optimization of approximate inhibitory rules relative to length, coverage, and number of misclassifications. These algorithms are based on extensions of the dynamic programming approach. The results of experiments with decision tables from the UCI Machine Learning Repository are discussed.

Fawaz Alsolami, Igor Chikalov, Mikhail Moshkov
Robustness Measure of Decision Rules

Rough set approaches provide useful tools to find minimal decision rules. The obtained minimal decision rules are used to classify unseen objects. On the other hand, the condition parts of minimal decision rules are sometimes used to design new objects that will be classified into the target decision class. While we are interested in the goodness of the set of obtained minimal decision rules in the former case, we are interested in the goodness of an individual minimal decision rule in the latter. In this paper, we propose a robustness measure as a new type of evaluation index for decision rules. The measure evaluates to what extent a decision rule maintains its goodness of classification on partially matched data. Numerical experiments are conducted to examine the effectiveness of the robustness measure.

Motoyuki Ohki, Masahiro Inuiguchi
Exploring Margin for Dynamic Ensemble Selection

How to effectively combine the outputs of base classifiers is one of the key issues in ensemble learning. A new dynamic ensemble selection algorithm is proposed in this paper. To predict a sample, the base classifiers whose classification confidences on this sample are greater than or equal to a specified threshold value are selected. Since margin is an important factor in the generalization performance of voting classifiers, the threshold value is estimated via the minimization of a margin loss. We analyze the proposed algorithm in detail and compare it with other multiple-classifier fusion algorithms. The experimental results validate the effectiveness of our algorithm.

Leijun Li, Qinghua Hu, Xiangqian Wu, Daren Yu
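
The per-sample selection rule can be sketched as below, assuming scikit-learn-style classifiers with `predict_proba`; in the paper the threshold is estimated by minimizing a margin loss rather than fixed by hand:

    import numpy as np

    def dynamic_ensemble_predict(classifiers, x, threshold):
        votes = {}
        for clf in classifiers:
            proba = clf.predict_proba([x])[0]
            label, conf = int(np.argmax(proba)), float(np.max(proba))
            if conf >= threshold:        # keep only confident base classifiers
                votes[label] = votes.get(label, 0) + 1
        if not votes:                    # fall back to the full ensemble
            for clf in classifiers:
                label = int(np.argmax(clf.predict_proba([x])[0]))
                votes[label] = votes.get(label, 0) + 1
        return max(votes, key=votes.get)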
Evaluation of Incremental Change of Set-Based Indices

This paper proposes a new framework for the evaluation of set-based indices based on incremental sampling. Since these indices are defined by the relations between conditional attributes (R) and the decision attribute (D), incremental sampling gives four possible cases according to the increment of sets for R or D. Using this classification, the behavior of indices can be evaluated for the four cases. We applied this technique to several set-based indices. The results show that the evaluation framework gives a powerful tool for the evaluation of set-based indices. In particular, it is found that the behavior of indices can be determined by the initially given dataset.

Shusaku Tsumoto, Shoji Hirano
Recent Advances in Decision Bireducts: Complexity, Heuristics and Streams

We continue our research on decision bireducts. For a decision system $\mathbb{A}$ = (U, A ∪ {d}), a decision bireduct is a pair (B, X), where B ⊆ A is a subset of attributes discerning all pairs of objects in X ⊆ U with different values on the decision attribute d, and where B and X cannot be, respectively, reduced and extended. We report some new results related to the NP-hardness of extraction of optimal decision bireducts, heuristics aimed at searching for sub-optimal decision bireducts, and applications of decision bireducts to stream data mining.

Sebastian Stawicki, Dominik Ślęzak
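
The definition translates directly into a brute-force validity check (an illustrative sketch, not the paper's heuristics); here `table` is assumed to map each object to a dict of its attribute values, including the decision attribute `d`:

    from itertools import combinations

    def is_decision_bireduct(B, X, table, d):
        def discerns(attrs, objs):
            # Every pair in objs with different decisions differs on some attribute.
            return all(any(table[x][a] != table[y][a] for a in attrs)
                       for x, y in combinations(objs, 2)
                       if table[x][d] != table[y][d])
        if not discerns(B, X):
            return False
        if any(discerns(B - {a}, X) for a in B):   # B must be irreducible
            return False
        rest = set(table) - X                       # X must be non-extendable
        return not any(discerns(B, X | {x}) for x in rest)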
Studies on the Necessary Data Size for Rule Induction by STRIM

STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induce if-then rules from a decision table, which is considered a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments specifying rules in advance, and by comparison with conventional methods. However, there remains scope for future study. One aspect which needs examination is determination of the size of the dataset needed for inducing true rules by simulation experiments, since finding statistically significant rules is the core of the method. This paper examines the theoretically necessary size of the dataset that STRIM needs to induce true rules with probability w [%] in connection with the rule length, and confirms the validity of this study by a simulation experiment at rule length 2. The results provide useful guidelines for analyzing real-world datasets.

Yuichi Kato, Tetsuro Saeki, Shoutarou Mizuno
Applying Threshold SMOTE Algorithm with Attribute Bagging to Imbalanced Datasets

Synthetic minority over-sampling technique (SMOTE) is an effective over-sampling technique specifically designed for learning from imbalanced data sets. However, in the process of synthetic sample generation, SMOTE exhibits some blindness. This paper proposes a novel approach to the imbalanced-data problem, based on a combination of the Threshold SMOTE (TSMOTE) and Attribute Bagging (AB) algorithms. TSMOTE takes full advantage of majority samples to adjust the neighbor selection strategy of SMOTE in order to control the quality of new samples. Attribute Bagging, a well-known ensemble learning algorithm, is also used to improve the predictive power of the classifier. A comprehensive suite of experiments is conducted on 7 imbalanced data sets collected from the UCI machine learning repository. Experimental results show that TSMOTE-AB outperforms SMOTE and other previously known algorithms.

Jin Wang, Bo Yun, Pingli Huang, Yu-Ao Liu
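
For context, the classic SMOTE interpolation step that TSMOTE builds on is shown below (TSMOTE's threshold-based use of majority samples to filter neighbour choices is the paper's contribution and is not reproduced):

    import numpy as np

    def smote_sample(x, minority_neighbors, rng):
        """New synthetic point on the segment between x and a random
        minority-class nearest neighbour: x + gap * (nn - x), gap in [0, 1)."""
        nn = minority_neighbors[rng.integers(len(minority_neighbors))]
        return x + rng.random() * (nn - x)

    rng = np.random.default_rng(0)
    print(smote_sample(np.array([1.0, 2.0]),
                       [np.array([1.5, 2.5]), np.array([0.5, 1.0])], rng))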

New Trends in Computing

Parallel Reducts: A Hashing Approach

A hashing approach to parallel reducts is presented in this paper. With the help of this new approach, time-consuming comparison operations are reduced significantly; therefore, the matrix of attribute significance can be calculated more efficiently. Experiments show that our method has an advantage over PRMAS, the classical parallel reducts method.

Minghua Pei, Dayong Deng, Houkuan Huang
A Parallel Implementation of Computing Composite Rough Set Approximations on GPUs

In information systems, there may exist multiple different types of attributes, such as categorical attributes, numerical attributes, set-valued attributes, interval-valued attributes, missing attributes, etc. Such information systems are called composite information systems. To process such attributes with rough set theory, a composite rough set model and corresponding matrix methods were introduced in our previous research. Rough set approximations of a concept are the basis for rule acquisition and attribute reduction in rough set based methods. To accelerate the computation of rough set approximations, this paper first presents the boolean matrix representation of the lower and upper approximations in a composite information system, then designs a matrix-based parallel method and implements it on GPUs. Experiments on data sets from UCI and user-defined data sets show that the proposed method accelerates the computation process efficiently.

Junbo Zhang, Yun Zhu, Yi Pan, Tianrui Li
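
The boolean matrix view of the approximations can be sketched on the CPU with NumPy (the paper's point is running such matrix computations on GPUs, which is not shown here):

    import numpy as np

    def matrix_approximations(M, v):
        """M[i, j] == 1 iff objects i and j are related; v[i] == 1 iff object i is in X."""
        M, v = M.astype(int), v.astype(int)
        hits = M @ v              # |R(x) ∩ X| for each object x
        sizes = M.sum(axis=1)     # |R(x)| for each object x
        lower = hits == sizes     # R(x) ⊆ X
        upper = hits > 0          # R(x) ∩ X ≠ ∅
        return lower, upper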
GPU Implementation of MCE Approach to Finding Near Neighbourhoods

This paper presents a parallel version of the Maximal Clique Enumeration (MCE) approach for discovering tolerance classes. Finding such classes is a computationally complex problem, especially in the case of large data sets or in content-based image retrieval (CBIR) applications. The GPU implementation is an extension of earlier work by the authors on finding efficient methods for computing tolerance classes in images. The experimental results demonstrate that the GPU-based MCE algorithm is faster than the serial MCE implementation and can perform computations with higher values of the tolerance ε.

Tariq Alusaifeer, Sheela Ramanna, Christopher J. Henry, James Peters
FPGA in Rough Set Based Core and Reduct Computation

In this paper we propose combining the capabilities of an FPGA-based device and a PC for data processing using rough set methods. The presented architecture has been tested on random data. The obtained results confirm a significant acceleration of computation time when rough set operations are supported in hardware, in comparison to a software implementation.

Tomasz Grześ, Maciej Kopczyński, Jarosław Stepaniuk
Fast Approximate Attribute Reduction with MapReduce

Massive data processing is a challenging problem in the age of big data. Traditional attribute reduction algorithms are generally time-consuming when facing massive data. For fast processing, we introduce a parallel fast approximate attribute reduction algorithm with MapReduce. We divide the original data into many small blocks and use a reduction algorithm, based on attribute significance, on each block. We compute the dependency of each resulting reduct on testing data in order to select the best reduct. Data of different sizes are used in the experiments. The experimental results show that our proposed algorithm can efficiently process large-scale data on the Hadoop platform. In particular, on high-dimensional data, the algorithm runs significantly faster than other recent parallel reduction methods.

Ping Li, Jianyang Wu, Lin Shang
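
The block-wise scheme reads as a classic two-phase job; in plain Python it might look like the sketch below, where `reduct_of_block` and `dependency` are hypothetical helpers standing in for the attribute-significance reduction and the dependency measure (a real run would express the two phases as Hadoop MapReduce jobs):

    def map_phase(blocks, reduct_of_block):
        # One candidate reduct per data block.
        return [reduct_of_block(b) for b in blocks]

    def reduce_phase(candidates, dependency, test_data):
        # Keep the candidate reduct with the highest dependency on test data.
        return max(candidates, key=lambda r: dependency(r, test_data))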

Three-Way Decision Rough Sets

Three-Way Decision Based Overlapping Community Detection

The three-way decision based overlapping community detection algorithm (OCDBTWD) divides the belonging relationship between communities into three types: completely belonging, completely not belonging, and incompletely belonging, and uses the positive, negative, and boundary regions to describe these relationships, respectively. OCDBTWD defines a similarity between communities to quantify the conditional probability that two communities stand in a belonging relationship, and uses the increment values of extended modularity to reflect the inclusion ratio thresholds. It uses three-way decisions on the belonging relationship between communities to guide their merging. When the relationship between two communities is incompletely belonging, an overlapping vertex detection algorithm (OVDA) is used to detect overlapping vertices. OCDBTWD has been tested on both synthetic and real-world networks and compared with other algorithms; the experiments demonstrate its feasibility and efficiency.

Youli Liu, Lei Pan, Xiuyi Jia, Chongjun Wang, Junyuan Xie
Three-Way Decisions in Dynamic Decision-Theoretic Rough Sets

In previous decision-theoretic rough set (DTRS) models, the loss function values are constant. This paper extends the constant loss functions to a more realistic dynamic environment. Considering the change of loss functions in DTRS over time, an extension of DTRS, dynamic decision-theoretic rough sets (DDTRS), is proposed in this paper. An empirical study of climate policy making validates the reasonableness and effectiveness of the proposed model.

Dun Liu, Tianrui Li, Decui Liang
A Cluster Ensemble Framework Based on Three-Way Decisions

Cluster ensembles combine the outcomes of several clusterings into a single clustering that agrees as much as possible with the input clusterings. However, little attention has been paid to approaches that consolidate the outcomes of both soft and hard clustering systems into a single final partition. For this reason, this paper proposes a cluster ensemble framework based on three-way decisions, in which interval sets are used to represent each cluster by three regions according to the lower and upper bounds of the cluster. In addition, this paper devises a plurality-voting-based consensus function which can consolidate the outcomes of multiple clustering systems, whether they are soft or hard clustering systems. The proposed consensus function has been evaluated in terms of both the quality of the consensus partitions and running time.

Hong Yu, Qingfeng Zhou
Multistage Email Spam Filtering Based on Three-Way Decisions

A ternary, three-way decision strategy for email spam filtering divides incoming emails into three folders, namely, a mail folder consisting of emails that we accept as being legitimate, a spam folder consisting of emails that we reject as being legitimate, and a third folder consisting of emails that we can neither accept nor reject based on the available information. The introduction of the third folder enables us to reduce both acceptance and rejection errors. Many existing ternary approaches are essentially single-stage processes. In this paper, we propose a model of multistage three-way email spam filtering based on principles of granular computing and rough sets.

Jianlin Li, Xiaofei Deng, Yiyu Yao
Cost-Sensitive Three-Way Decision: A Sequential Strategy

The three-way decision model is an extension of the two-way decision model in which a boundary-region decision is regarded as a new feasible choice when a precise decision cannot be made immediately due to a lack of available information. In this paper, a cost-sensitive sequential three-way decision model is presented, which simulates a gradual decision process from coarse granules to precise granules. At the beginning of the sequential decision process, the decision results have a high decision cost and many instances are assigned to the boundary region due to the lack of information. As the decision steps increase, the decision cost decreases and more instances are decided precisely. Eventually the decision cost reaches a satisfactory value and the boundary region disappears. The paper presents both theoretical analysis and experimental validation of the proposed model.

Huaxiong Li, Xianzhong Zhou, Bing Huang, Dun Liu
Two-Phase Classification Based on Three-Way Decisions

A two-phase classification method is proposed based on three-way decisions. In the first phase, all objects are classified into three different regions by three-way decisions. A positive rule makes a decision of acceptance, a negative rule makes a decision of rejection, and a boundary rule makes a decision of abstaining. The positive region contains those objects that have been assigned a class label with a high level of confidence. The boundary and negative regions contain those objects that have not been assigned class labels. In the second phase, a simple ensemble learning approach is used to determine the class labels of objects in the boundary or negative regions. Experiments are performed to compare the proposed two-phase classification approach with a classical classification approach. The results show that our method can produce better classification accuracy than the classical model.

Weiwei Li, Zhiqiu Huang, Xiuyi Jia
A Three-Way Decisions Model Based on Constructive Covering Algorithm

The three-way decisions model divides the universe into three regions, i.e., the positive region (POS), boundary region (BND), and negative region (NEG), according to two thresholds. A challenge for the three-way decisions model is how to compute these thresholds, which generally rely on the experience of experts. In this paper, we propose a novel three-way decisions model based on the Constructive Covering Algorithm (CCA). The new model produces the three regions automatically according to the samples and does not need any given parameters. We give a method for constructing coverings from which the three regions are formed, and samples can be classified based on the three regions. The experimental results show that the proposed model has a great advantage in classification efficiency and provides a new method of forming the three regions automatically for the theory of three-way decisions.

Yanping Zhang, Hang Xing, Huijin Zou, Shu Zhao, Xiangyang Wang

Learning, Predicting, Modeling

A Hierarchical Statistical Framework for the Extraction of Semantically Related Words in Textual Documents

Nowadays a great number of documents in electronic format exist on the Internet, such as daily news and blog articles. Most of them are related, organized, and archived into categories according to their themes. In this paper, we propose a statistical technique to analyze collections of documents characterized by a hierarchical structure, in order to extract the information hidden in them. Our approach is based on an extension of the log-bilinear model. Experimental results on real data illustrate the merits and efficiency of the proposed statistical hierarchical model.

Weijia Su, Djemel Ziou, Nizar Bouguila
Anomaly Intrusion Detection Using Incremental Learning of an Infinite Mixture Model with Feature Selection

We propose an incremental nonparametric Bayesian approach for clustering. Our approach is based on a Dirichlet process mixture of generalized Dirichlet (GD) distributions. Unlike classic clustering approaches, our model does not require the number of clusters to be pre-defined. Moreover, an unsupervised feature selection scheme is integrated into the proposed nonparametric framework to improve clustering performance. By learning the proposed model using an incremental variational framework, the number of clusters as well as the feature weights can be computed automatically and simultaneously. The effectiveness and merits of the proposed approach are investigated on a challenging application, namely anomaly intrusion detection.

Wentao Fan, Nizar Bouguila, Hassen Sallay
Hybridizing Meta-heuristics Approaches for Solving University Course Timetabling Problems

In this paper we present a combination of two meta-heuristics, namely great deluge and tabu search, for solving the university course timetabling problem. This problem arises in the assignment of a set of courses to specific timeslots and rooms within a working week, subject to a variety of hard and soft constraints. Essentially, a set of hard constraints must be satisfied in order to obtain a feasible solution, while as many of the soft constraints as possible should be satisfied. The algorithm is tested on two benchmark collections: eleven enrolment-based datasets (representing one large, five medium, and five small problems) and the curriculum-based datasets used and developed for the International Timetabling Competition, ITC2007 (UD2 problems). A new strategy has been introduced to control the application of a set of neighbourhood structures using tabu search and great deluge. The results demonstrate that our approach is able to produce solutions that have lower penalties on all the small and medium problems in the eleven enrolment-based datasets, and can produce solutions with comparable results on the curriculum-based datasets (with lower penalties on several data instances) when compared against other techniques from the literature.

Khalid Shaker, Salwani Abdullah, Arwa Alqudsi, Hamid Jalab
Weight Learning for Document Tolerance Rough Set Model

Creating a document model for efficient keyword search is a long-studied problem in Information Retrieval. In this paper we explore the application of the Tolerance Rough Set Model for Documents (TRSM) to this problem. We provide an extension of TRSM with a weight learning procedure (TRSM-WL) and compare the performance of these two algorithms in keyword search. We further provide a generalization of TRSM-WL that imposes additional constraints on the underlying model structure, and compare it to a supervised variant of Explicit Semantic Analysis.

Wojciech Świeboda, Michał Meina, Hung Son Nguyen
A Divide-and-Conquer Method Based Ensemble Regression Model for Water Quality Prediction

This paper proposes a novel ensemble regression model to predict time series data of water quality. The proposed model consists of multiple regressors and a classifier. The model transforms the original time series data into subsequences by a sliding window and divides them into several parts according to the fitness of the regressors, so that each regressor has an advantage on a specific part. The classifier decides which part new data should belong to, so that the model divides the whole prediction problem into small parts and conquers each by computing on only one part. The ensemble regression model, a combination of a Support Vector Machine, an RBF Neural Network, and a Grey Model, is tested using 450 weeks of COD_Mn observations provided by the Ministry of Environmental Protection of the People's Republic of China between 2004 and 2012. The results show that the model can approximately convert the prediction problem into a classification problem and provides better accuracy than each of the single models it combines.

Xuan Zou, Guoyin Wang, Guanglei Gou, Hong Li
A Self-learning Audio Player That Uses a Rough Set and Neural Net Hybrid Approach

A self-learning audio player was built to learn a user's habits by analyzing the operations the user performs when listening to music. The self-learning component is intended to provide a better music experience for the user by generating a special playlist based on a prediction of the user's favorite songs. Rough set core characteristics are used throughout the learning process to capture the dynamics of changing user interactions with the audio player. The engine is evaluated on simulation data; the simulation process ensures the data contain specific predetermined patterns. Evaluation results show the predictive power and stability of the hybrid engine for learning a user's habits, and the increased intelligence achieved by combining rough sets and a neural network (NN) compared with using a NN by itself.

Hongming Zuo, Julia Johnson
Backmatter
Metadata
Title
Rough Sets and Knowledge Technology
Edited by
Pawan Lingras
Marcin Wolski
Chris Cornelis
Sushmita Mitra
Piotr Wasilewski
Copyright Year
2013
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-41299-8
Print ISBN
978-3-642-41298-1
DOI
https://doi.org/10.1007/978-3-642-41299-8