Knowledge-Based Systems

Volume 91, January 2016, Pages 71-83

Decision-theoretic rough set: A multicost strategy

https://doi.org/10.1016/j.knosys.2015.09.011

Highlights

  • We propose a DTRS based on a set of cost matrices.

  • Optimistic and pessimistic cases are two special models of our approach.

  • Reducts based on two criteria are computed by two different algorithms.

Abstract

By introducing misclassification and delayed-decision costs into the probabilistic approximations of the target, the decision-theoretic rough set model becomes cost-sensitive. However, the traditional decision-theoretic rough set is built on one and only one cost matrix; such a model does not take the multiplicity and variability of cost into consideration. To fill this gap, a multicost strategy is developed for the decision-theoretic rough set. Firstly, from the viewpoint of a voting fusion mechanism, a parameterized decision-theoretic rough set is proposed. Secondly, based on the new model, the smallest possible cost and the largest possible cost are calculated in decision systems. Finally, both the decision-monotonicity and cost criteria are introduced into attribute reduction. A heuristic algorithm is used to compute the decision-monotonicity reduct, while a genetic algorithm is used to compute the smallest and largest possible cost reducts. Experimental results on eight UCI data sets show that: 1. compared with the raw data, the decision-monotonicity reduct can generate larger lower approximations and more decision rules; 2. the smallest possible cost reduct is much better than the decision-monotonicity reduct for obtaining smaller costs and more decision rules. This study suggests new research trends concerning decision-theoretic rough set theory.

Introduction

Cost-sensitive learning is a valuable problem. It has been addressed by many researchers from a variety of fields, including decision making [22], machine learning [8], [9], [55], [41], pattern recognition [56] and so on. Note that in recent years, cost-sensitive learning has also attracted much attention from researchers in rough set theory [32]. In broad terms, rough set is one of the most important tools of Granular Computing (GrC) [33], [44], [46], and thus introducing cost sensitivity into rough set is bound to bring GrC a worthwhile topic.

Roughly speaking, driven by many practical applications, two main types of costs have been discussed in rough set: test cost and misclassification cost. On the one hand, each test (i.e., attribute, measurement, feature) may have an associated cost, which is regarded as the test cost. For example, in medical diagnosis, a blood test has a cost, which may be the time or money spent on testing blood. On the other hand, misclassification cost is the loss incurred when classifying data into a specific outcome. For example, it may be troublesome if a healthy subject is misclassified as a patient, but it could result in a more serious consequence if a patient is misclassified as a healthy subject.

In rough set theory, two crucial aspects should be covered in terms of costs. Firstly, since the basic model, i.e., the lower and upper approximations, is not sensitive to cost, how to introduce cost into the modeling is an interesting issue. This aspect provides us with a new perspective for characterizing uncertainty, mainly because if the lower and upper approximations are sensitive to cost, then inevitably, uncertainty [1], [7] in rough set is sensitive to cost. Secondly, it is well known that attribute reduction is one of the key topics in rough set, and finding reducts that meet particular cost requirements (e.g., finding the reduct with the smallest cost) is also a challenge. In recent years, unremitting efforts have led to great progress on the two aspects mentioned above.

  • As far as the modeling is concerned, both test cost and misclassification cost have been explored.

    For example, by considering the test cost of attributes, Yang et al. [43] introduced test cost into the construction of the multigranulation rough set [35], [37], [13]; by considering misclassification cost, Yao [47] proposed the concept of the Decision-Theoretic Rough Set (DTRS). Decision-theoretic rough set is actually a probabilistic rough set. Since determining the pair of thresholds used in probabilistic rough set is a substantial challenge, the pair of thresholds in decision-theoretic rough set is calculated from loss functions. It must be noticed that Yao’s loss-function matrix includes not only misclassification cost but also delayed-decision cost. Therefore, decision-theoretic rough set corresponds to a three-way decision procedure [45]. For more generalizations and applications of decision-theoretic rough set, please refer to Refs. [2], [4], [14], [15], [17], [18], [19], [20], [21], [23], [24], [26], [27], [38], [39], [42], [53].

  • As far as the attribute reduction is concerned, both test cost and misclassification cost have also been studied by some researchers.

    For example, Min et al. [28], [31] studied approaches to test cost based attribute reduction. Their goals are to find reducts with a smaller (obtained by a competition strategy) or the smallest (obtained by a backtracking approach) test cost. Following decision-theoretic rough set, Jia et al. [11], [12] formulated an optimization problem aiming to minimize the decision cost. With respect to different requirements, Yao and Zhao [48] studied different definitions of attribute reduction in decision-theoretic rough set.
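The loss-function-derived thresholds mentioned above follow Yao's standard DTRS formulas, in which a 3 × 2 matrix of losses for accepting, deferring and rejecting an object yields the pair (α, β). The following sketch uses illustrative loss values, not values from this paper:

```python
def dtrs_thresholds(loss):
    """Compute the (alpha, beta) thresholds of decision-theoretic rough set
    from a 3x2 loss-function matrix.

    loss is [(lPP, lPN), (lBP, lBN), (lNP, lNN)]: rows are the actions
    accept (P), defer (B), reject (N); columns are the states
    "object in X" and "object not in X".
    """
    (lPP, lPN), (lBP, lBN), (lNP, lNN) = loss
    # Standard DTRS threshold formulas (Yao):
    alpha = (lPN - lBN) / ((lPN - lBN) + (lBP - lPP))
    beta = (lBN - lNN) / ((lBN - lNN) + (lNP - lBP))
    return alpha, beta

# Illustrative losses: correct decisions cost 0, deferment is cheap,
# misclassification is expensive.
alpha, beta = dtrs_thresholds([(0, 6), (1, 2), (4, 0)])
print(alpha, beta)  # 0.8 0.4 -- alpha > beta gives a genuine three-way split
```

With α > β the probability axis splits into acceptance (Pr ≥ α), deferment (β < Pr < α) and rejection (Pr ≤ β) regions, which is the three-way decision procedure of [45].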

From the discussions above, we can see that decision-theoretic rough set takes cost into consideration (the model is sensitive to cost), and hence attribute reduction in decision-theoretic rough set is bound to be closely related to cost. In Yao’s classical decision-theoretic rough set, the loss function is a 3 × 2 cost matrix which includes both misclassification and delayed-decision costs. However, one and only one cost matrix may not be good enough for problem solving. Some practical examples illustrate the limitations of a single cost matrix.

  • 1.

    Take a medical accident for instance: it is reasonable to assume that different regions may apply different criteria for compensation, i.e., the costs of the same medical accident may differ across regions. A developed region may pay more than a developing region for economic reasons.

  • 2.

    There are old Chinese sayings: “Everyone has a steelyard in his heart” and “It is hard to please all”. That is, for a given decision, different individuals may have different views, which will inevitably generate different costs.

  • 3.

    Suppose that Mr. X wants to buy a house. If prices show an upward trend, then Mr. X faces different costs in different periods. In other words, a single cost is not enough to characterize the variability of cost.

Based on the above examples, it is noticeable that in Yao’s decision-theoretic rough set, one and only one cost matrix does not take the multiplicity and variability of cost into consideration. Therefore, multiple cost matrices are required. From this point of view, a voting fusion mechanism is used, and three models are constructed for the case of multiple cost matrices. They are referred to as the θ (parameterized), optimistic and pessimistic decision-theoretic rough sets in this paper. Note that the optimistic and pessimistic models are two limiting cases of the parameterized approach.
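One natural reading of the voting fusion mechanism is that each cost matrix casts a vote through its own threshold, and the θ parameter sets the fraction of votes required. The sketch below illustrates this idea for the acceptance decision only; the function name, the majority rule, and the threshold values are assumptions for illustration, not the paper's exact formulation:

```python
def theta_accept(pr, alphas, theta):
    """Parameterized (theta) acceptance under multiple cost matrices.

    pr     : conditional probability Pr(X | [x]) for an object's class.
    alphas : list of alpha thresholds, one per cost matrix.
    theta  : required fraction of matrices that must vote 'accept'.
    """
    votes = sum(1 for a in alphas if pr >= a)  # matrices voting 'accept'
    return votes / len(alphas) >= theta

alphas = [0.8, 0.7, 0.9]  # thresholds from three hypothetical cost matrices
# Optimistic limit: a single supporting matrix suffices.
print(theta_accept(0.75, alphas, 1 / 3))  # True
# Pessimistic limit: all matrices must agree.
print(theta_accept(0.75, alphas, 1.0))    # False
```

Setting θ to its minimum (one vote) or maximum (all votes) recovers the optimistic and pessimistic models as the two limits of the parameterized approach, consistent with the remark above.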

To facilitate our discussions, we present the basic knowledge about rough set and decision-theoretic rough set in Section 2. In Section 3, not only are three multicost based decision-theoretic rough sets proposed, but the computations of the smallest and largest possible costs are also discussed. In Section 4, decision-monotonicity criterion based attribute reduction and cost criterion based attribute reduction are presented. A heuristic algorithm is used to compute the decision-monotonicity reduct, while a genetic algorithm is used to compute the cost reducts. In Section 5, the theoretical results shown in this paper are tested on eight UCI data sets. The paper ends with conclusions and outlooks for further research in Section 6.

Section snippets

Rough set

An information system can be considered as a pair I = <U, AT>, in which U is a non-empty finite set of objects called the universe, and AT is a non-empty finite set of attributes. ∀a ∈ AT, V_a is the domain of attribute a; ∀x ∈ U, a(x) denotes the value that x holds on a (∀a ∈ AT). Given an information system I, ∀A ⊆ AT, an indiscernibility relation IND(A) may be defined as IND(A) = {(x, y) ∈ U² : ∀a ∈ A, a(x) = a(y)}.

Obviously, IND(A) is an equivalence relation. ∀X ⊆ U, one can construct the lower and upper
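The equivalence classes of IND(A) and the resulting lower and upper approximations of a target set X can be sketched as follows (a minimal illustration on a toy system, not the paper's code):

```python
from collections import defaultdict

def approximations(U, A, value, X):
    """Lower and upper approximations of X in an information system,
    restricted to attribute subset A; value(x, a) returns a(x)."""
    # Partition U into equivalence classes of IND(A): objects sharing
    # the same attribute-value tuple fall into the same block.
    blocks = defaultdict(set)
    for x in U:
        blocks[tuple(value(x, a) for a in A)].add(x)
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= X:   # block entirely inside X -> certainly in X
            lower |= block
        if block & X:    # block intersects X -> possibly in X
            upper |= block
    return lower, upper

# Toy system: objects 0..3 described by a single attribute 'color'.
vals = {0: 'r', 1: 'r', 2: 'g', 3: 'b'}
low, up = approximations(range(4), ['color'], lambda x, a: vals[x], {0, 2})
print(sorted(low), sorted(up))  # [2] [0, 1, 2]
```

Object 0 is indiscernible from object 1 (both 'r'), so it enters only the upper approximation, while object 2 forms a pure block and enters the lower approximation.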

Multicost based decision-theoretic rough sets

For the classical decision-theoretic rough set, one and only one 3 × 2 cost matrix is used. Yao assumed that such a cost matrix comes from an expert’s evaluation. However, as argued in Section 1, a single cost matrix is insufficient for making a better or more suitable decision. Therefore, multiple cost matrices and the corresponding decision-theoretic rough set approach will be explored in this section.

Decision-monotocity criterion based attribute reduction

The aim of attribute reduction is to delete redundant attributes in information or decision systems. For example, if we delete redundant attributes to obtain an approximation-preserving reduct, the certain decision rules derived from the lower approximation can be shortened. That is to say, we can generate simpler decision rules. This is why attribute reduction plays an important role in rough set theory. In classical rough set theory, since the indiscernibility relation, positive region,
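A heuristic reduct search of the kind mentioned in the introduction is commonly realized as greedy forward selection over a quality measure. The sketch below uses positive-region dependency, a standard rough-set heuristic; the function names and the toy data are illustrative assumptions, and this is not necessarily the paper's exact decision-monotonicity algorithm:

```python
from collections import defaultdict

def pos_region(U, A, value, d):
    """Objects whose IND(A)-equivalence class is pure w.r.t. decision d."""
    blocks = defaultdict(set)
    for x in U:
        blocks[tuple(value(x, a) for a in A)].add(x)
    return {x for b in blocks.values()
            if len({d(y) for y in b}) == 1 for x in b}

def greedy_reduct(U, AT, value, d):
    """Greedily add the attribute that most enlarges the positive region
    until it matches the positive region of the full attribute set."""
    target = pos_region(U, AT, value, d)
    red = []
    while pos_region(U, red, value, d) != target:
        best = max((a for a in AT if a not in red),
                   key=lambda a: len(pos_region(U, red + [a], value, d)))
        red.append(best)
    return red

# Toy decision system: the decision equals attribute 'a', so 'b' is redundant.
vals = {0: {'a': 0, 'b': 0}, 1: {'a': 0, 'b': 1},
        2: {'a': 1, 'b': 0}, 3: {'a': 1, 'b': 1}}
print(greedy_reduct(range(4), ['a', 'b'],
                    lambda x, at: vals[x][at], lambda x: vals[x]['a']))
# ['a']
```

Such greedy heuristics are fast but may return supersets of a minimal reduct, which is why the paper resorts to a genetic algorithm for the cost-oriented reducts.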

Experimental analyses

The objectives of the following experiments are: 1. comparing the costs derived by different approaches when facing multiple cost matrices; 2. comparing the effectiveness of the different reducts shown in Section 4. All the experiments in this section have been carried out on a personal computer with Windows 7, an Intel Core 2 Duo T5800 CPU (2.00 GHz) and 2.00 GB memory. The programming language is Matlab 2010a. The data sets used are outlined in Table 3, which were all downloaded from UCI Repository of

Conclusions

In this paper, we have developed a general framework for the study of decision-theoretic rough sets when facing multiple cost matrices. Based on a voting fusion scheme, a θ decision-theoretic rough set was proposed. Such a rough set includes two special cases: the optimistic and pessimistic decision-theoretic rough sets. Moreover, based on the model of the θ decision-theoretic rough set, we also presented two strategies to compute costs in decision systems. One is to generate a cost as small as possible,

Acknowledgments

This work is supported by the Philosophy and Social Science Foundation of Jiangsu Higher Education Institutions (No. 2015SJD769), the Natural Science Foundation of China (Nos. 61572242, 61272419, 61305058), the Key Program of the Natural Science Foundation of China (No. 61233011), the Key Laboratory of Oceanographic Big Data Mining & Application of Zhejiang Province (No. OBDMA201501), the Postdoctoral Science Foundation of China (No. 2014M550293), and the Qing Lan Project of Jiangsu Province of China.

References (56)

  • F. Min et al.

    Test-cost-sensitive attribute reduction

    Inf. Sci.

    (2011)
  • X.A. Ma et al.

    Decision region distribution preservation reduction in decision-theoretic rough set model

    Inf. Sci.

    (2014)
  • F. Min et al.

    Attribute reduction of data with error ranges and test costs

    Inf. Sci.

    (2012)
  • W. Pedrycz

    Allocation of information granularity in optimization and decision-making models: towards building the foundations of Granular Computing

    Eur. J. Oper. Res.

    (2014)
  • Y.H. Qian et al.

    Multigranulation decision-theoretic rough sets

    Int. J. Approx. Reason.

    (2014)
  • B.Z. Sun et al.

    Decision-theoretic rough fuzzy set model and application

    Inf. Sci.

    (2014)
  • H. Yu et al.

    An automatic method to determine the number of clusters using decision-theoretic rough set

    Int. J. Approx. Reason.

    (2014)
  • X.B. Yang et al.

    Test cost sensitive multigranulation rough set: model and minimal cost selection

    Inf. Sci.

    (2013)
  • Y.Y. Yao

    Three-way decisions with probabilistic rough sets

    Inf. Sci.

    (2010)
  • Y.Y. Yao et al.

    Attribute reduction in decision-theoretic rough set models

    Inf. Sci.

    (2008)
  • B. Zhou

    Multi-class decision-theoretic rough sets

    Int. J. Approx. Reason.

    (2014)
  • X.Y. Zhang et al.

    Reduction target structure-based hierarchical attribute reduction for two-category decision-theoretic rough sets

    Inf. Sci.

    (2014)
  • X.Y. Zhang et al.

    Region-based quantitative and hierarchical attribute reduction in the two-category decision theoretic rough set model

    Knowl.-Based Syst.

    (2014)
  • S.Y. Zhao et al.

    Nested structure in parameterized rough reduction

    Inf. Sci.

    (2013)
  • C.L. Blake, C.J. Merz, UCI repository of machine learning databases, 1998.
  • H.M. Chen et al.

    A decision-theoretic rough set approach for dynamic data mining

    IEEE Trans. Fuzzy Syst.

    (2015)
  • D.G. Chen et al.

    A novel algorithm for finding reducts with fuzzy rough sets

    IEEE Trans. Fuzzy Syst.

    (2012)
  • D.G. Chen et al.

    Sample pair selection for attribute reduction with rough set

    IEEE Trans. Knowl. Data Eng.

    (2012)