On modeling similarity and three-way decision under incomplete information in rough set theory

doi:10.1016/j.knosys.2019.105251

Knowledge-Based Systems

Volume 191, 5 March 2020, 105251

Abstract

Although incomplete information is a well-studied topic in rough set theory, there is still no general agreement on the semantics of the various types of incomplete information. This has led to some confusion and to many definitions of similarity or tolerance relations on a set of objects that lack a sound semantic justification. The main objective of this paper is to address semantic issues related to incomplete information. We present a four-step model of Pawlak rough set analysis in order to gain insights into how an indiscernibility relation (i.e., an equivalence relation) is defined and used under complete information. The results enable us to propose a conceptual framework for studying the similarity of objects under incomplete information. The framework is based on a classification of four types of incomplete information (i.e., “do-not-care value”, “partially-known value”, “class-specific value”, and “non-applicable value”) and two groups of methods (i.e., relation-based and granule-based methods) for modeling similarity. We examine existing studies on similarity and their relationships. In spite of their semantic differences, all four types of incomplete information can be uniformly represented in a set-valued table. We are therefore able to give them a common conceptual possible-world semantics. Finally, to demonstrate the value of the proposed framework, we examine three-way decisions under incomplete information.

Introduction

Rough set theory, proposed by Pawlak [1], is an effective tool for deriving decision rules from data. A fundamental notion of rough set based data analysis is an indiscernibility relation on a set of objects [2]. If two objects have the same values over a set of attributes, they are indiscernible with respect to these attributes [2]. Objects with the same description form an equivalence class and the family of all equivalence classes is a partition of the universe. By using equivalence classes as basic building blocks, one can construct approximations of a subset of objects in terms of three pair-wise disjoint positive, negative, and boundary regions [1]. Yao [3], [4], [5], [6] introduced a theory of three-way decision as thinking and processing in threes. Interpreting rules in rough set theory in terms of three regions is an example of three-way decision. Specifically, from the three regions, one can derive acceptance, rejection, and non-commitment rules for making three-way decisions.
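The construction described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy table, the attribute names, and the function names `equivalence_classes` and `three_regions` are assumptions made only for illustration. It groups objects with identical descriptions into equivalence classes and then sorts each class into the positive, negative, or boundary region of a target set.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group objects that take the same value on every attribute in `attributes`."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        description = tuple(row[a] for a in attributes)
        blocks[description].add(obj)
    return list(blocks.values())


def three_regions(classes, target):
    """Split the universe into positive, negative, and boundary regions of `target`."""
    pos, neg, bnd = set(), set(), set()
    for block in classes:
        if block <= target:              # class lies entirely inside the target
            pos |= block
        elif block.isdisjoint(target):   # class lies entirely outside the target
            neg |= block
        else:                            # class overlaps the target only partly
            bnd |= block
    return pos, neg, bnd


# Hypothetical complete information table (every value is known).
table = {
    "o1": {"colour": "red", "size": "small"},
    "o2": {"colour": "red", "size": "small"},
    "o3": {"colour": "blue", "size": "small"},
    "o4": {"colour": "blue", "size": "large"},
    "o5": {"colour": "blue", "size": "large"},
}
target = {"o1", "o2", "o4"}

classes = equivalence_classes(table, ["colour", "size"])
pos, neg, bnd = three_regions(classes, target)
```

In this toy table, objects in `pos` would support acceptance rules, objects in `neg` rejection rules, and objects in `bnd` non-commitment rules.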

An assumption of Pawlak rough set analysis is that an object takes exactly one value on each attribute and, furthermore, that we know this value. However, in many situations the available information about some objects may be incomplete and we may not know their actual values on some attributes. In addition, it may be necessary to consider two categories of values: “applicable value” and “non-applicable value”. For the category of applicable values, the actual values must exist, but we may not know them or may only know a range of possibilities. For the category of non-applicable values, some attributes are not applicable to certain objects and hence their values cannot be stated. A non-applicable value may be viewed as a special type of missing value. Under these circumstances of incomplete information, we may not know the exact descriptions of some objects and the notion of an equivalence relation is no longer appropriate. Many authors have proposed and investigated different types of non-equivalence relations to model similarity, including tolerance relations [7], similarity relations [8], conditional tolerance relations [9], and characteristic relations [10], [11], [12], [13], [14], [15], [16]. Indiscernibility is a special type of similarity. An indiscernibility relation is essential for deriving rules with complete information; a similarity relation plays the same essential role for deriving rules under incomplete information. Different models of similarity relations are based on different semantics of incomplete information. However, there does not exist a conceptual framework for studying incomplete information from a semantic point of view.

Kryszkiewicz [7] considers incomplete information as “do-not-care value” that may be replaced by any known values of an attribute. Stefanowski and Tsoukiàs [8] consider two types of incomplete information: “missing value” and “absent value”. The “missing value” semantics allows comparison operations on a missing value. The “absent value” semantics does not allow any comparison. Grzymala-Busse [11], [13] considers two types of incomplete information: “do-not-care value” and “lost value”. He further divides “do-not-care value” into three categories according to their comparison ranges: “do-not-care value”, “restricted do-not-care value”, and “attribute-concept value”. For a “do-not-care value”, it may be replaced by any known values of the attribute. For a “restricted do-not-care value”, it may only be replaced by any known values of the attribute excluding “lost values”. For an “attribute-concept value”, it may be replaced by any known values that are limited to the same concept. For a “lost value”, its original value existed but for a variety of reasons now it is not accessible.
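As a minimal sketch of the “do-not-care value” reading, the Python fragment below treats two objects as similar when, on every attribute, their values are equal or at least one of them is a do-not-care value. This follows the usual statement of a tolerance relation in the literature; the marker `"*"`, the toy table, and the function names are assumptions for illustration and are not taken from the paper.

```python
DONT_CARE = "*"  # assumed marker for a "do-not-care value"


def tolerant(x, y, attributes):
    """True if x and y agree on every attribute, letting '*' match any value."""
    return all(
        x[a] == y[a] or DONT_CARE in (x[a], y[a])
        for a in attributes
    )


def tolerance_classes(table, attributes):
    """Map each object to the set of objects it is tolerant with."""
    return {
        obj: {other for other, other_row in table.items()
              if tolerant(row, other_row, attributes)}
        for obj, row in table.items()
    }


# Hypothetical incomplete table: o2 has a do-not-care value on attribute "a".
table = {
    "o1": {"a": "1", "b": "x"},
    "o2": {"a": "*", "b": "x"},
    "o3": {"a": "2", "b": "x"},
}
classes = tolerance_classes(table, ["a", "b"])
```

The example shows why such a relation is not an equivalence relation: it is reflexive and symmetric, but `o1` is tolerant with `o2` and `o2` with `o3`, while `o1` and `o3` are not tolerant with each other, so transitivity fails.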

Based on the existing studies of different semantics of incomplete information, we summarize four types of semantics of incomplete information: “do-not-care value”, “partially-known value”, “class-specific value”, and “non-applicable value”. Lipski [17], [18] presents a possible-world semantics to discuss incomplete information in databases. We adopt the possible-world semantics to study different types of incomplete information tables based on the four types of semantics. An incomplete information table is characterized by a family of complete information tables. Each complete information table in the family corresponds to a candidate of the actual table in one possible world and only one of the complete information tables is the actual table if information is complete. We unify different definitions of similarity relations by transforming an incomplete information table into a set-valued table, which allows a common possible-world semantics.
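The possible-world reading can be sketched in Python as follows. The sketch assumes a set-valued table in which each cell holds the set of candidate values (a singleton when the value is known); how each of the four types of incomplete information is mapped to such sets is defined in the full paper and is not reproduced here. The table contents and the function name `completions` are hypothetical.

```python
from itertools import product


def completions(set_valued_table):
    """Yield every complete table obtained by picking one value from each cell."""
    cells = [
        (obj, attr, sorted(values))
        for obj, row in set_valued_table.items()
        for attr, values in row.items()
    ]
    for choice in product(*(values for _, _, values in cells)):
        world = {obj: {} for obj in set_valued_table}
        for (obj, attr, _), value in zip(cells, choice):
            world[obj][attr] = value
        yield world


# Hypothetical set-valued table: two cells admit more than one candidate value.
table = {
    "o1": {"a": {"1"}, "b": {"x", "y"}},
    "o2": {"a": {"1", "2"}, "b": {"x"}},
}
worlds = list(completions(table))

# o1 and o2 are possibly indiscernible if they coincide in at least one world,
# and certainly indiscernible if they coincide in every world.
possibly_same = any(w["o1"] == w["o2"] for w in worlds)
certainly_same = all(w["o1"] == w["o2"] for w in worlds)
```

Enumerating all completions grows exponentially with the number of uncertain cells, so this is a conceptual aid rather than a practical algorithm.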

To demonstrate the value of the proposed framework, we discuss three-way decisions in an incomplete information table based on rough set approximations defined by a similarity relation. This is essentially a generalization of three-way decision with Pawlak rough sets [3]. Three-way decisions are inspired by a common practice of human decision-making with three options, namely, acceptance, rejection, and non-commitment. There are many studies on the theory and practical applications of three-way decisions [3], [4], [5], [19], [20], [21], [22], [23]. Three-way decision under incomplete information extends the potential applications of standard rough sets and is worthy of further investigation.

This paper focuses on the conceptual modeling of similarity. Two important computational and practical issues are not covered, and these are limitations of the paper. The first issue is the design of efficient algorithms for constructing a similarity relation. The second issue is the selection of the most suitable type of similarity relation for a particular application. Different applications may require different types of similarity relations. How to choose a particular type of similarity relation and how to compute it efficiently in an application are interesting problems for further study.

The rest of this paper is organized as follows. Section 2 gives a four-step model of Pawlak rough set analysis under complete information. Section 3 presents four types of semantics of incomplete information for defining different classes of incomplete information tables. Section 4 studies the different definitions of similarity relations and similarity classes and discusses their relationships. Section 5 discusses three-way decisions in an incomplete information table.

Section snippets

Equivalence of objects under complete information

Pawlak rough set analysis (RSA) offers a unique approach to classification problems based on the notions of discernibility and indiscernibility of objects according to their descriptions. Fig. 1 presents a four-step framework for a conceptual understanding of RSA. It basically follows from the seminal book by Pawlak [2] with some slight modifications.

As shown in Fig. 1, the input of RSA is an information table that describes a set of objects by using a set of attributes.

Definition 2.1

An information table is

A framework for modeling similarity of objects under incomplete information

By reviewing the RSA in Section 2, we can find that the notion of indiscernibility is one of the central concepts. In many practical applications, the available information may be incomplete. We divide the incomplete information into two categories of “applicable values” and “non-applicable values”. Applicable values must exist, but we may not know the value or only know a range of possibilities. In some cases, an attribute may not be applicable to some objects. For example, an attribute

Two methods to similarity

Similarity plays an essential role in rule acquisition in an incomplete information table. A similarity relation is in one-to-one correspondence with a family of similarity classes. In this section, we discuss two methods in Fig. 3 to define similarity relations and similarity classes and study the relationships among different similarities.
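The one-to-one correspondence between a relation and its family of similarity classes can be made concrete with a small Python sketch. It assumes that the similarity class of an object x collects every y with (x, y) in the relation; the paper's own definitions may use a different direction or notation. All names and the toy relation are illustrative.

```python
def classes_from_relation(universe, relation):
    """Similarity class of x: all y such that (x, y) belongs to the relation."""
    return {x: {y for y in universe if (x, y) in relation} for x in universe}


def relation_from_classes(classes):
    """Rebuild the relation as the set of pairs (x, y) with y in the class of x."""
    return {(x, y) for x, cls in classes.items() for y in cls}


# Hypothetical reflexive, non-symmetric relation on three objects.
universe = {"o1", "o2", "o3"}
relation = {
    ("o1", "o1"), ("o2", "o2"), ("o3", "o3"),
    ("o1", "o2"), ("o2", "o3"),
}
classes = classes_from_relation(universe, relation)
round_trip = relation_from_classes(classes)
```

Going from the relation to the classes and back returns the original relation, which is the sense in which either one determines the other.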

Three-way decision based on similarity

In rough set theory under complete information, three-way approximations of a subset of objects, i.e., the positive, negative, and boundary regions, serve as a basis for three-way decisions. We formulate acceptance rules, rejection rules, and non-commitment rules, respectively, from the three regions [3], [4], [5], [19], [20], [21], [22], [23]. In this section, we extend the ideas of the three-way decision to situations with incomplete information.
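One common way to carry the three regions over to similarity classes is sketched below in Python. It is an assumption for illustration, not the paper's definition, which is not shown in this excerpt: an object is accepted when its whole similarity class lies inside the target set, rejected when the class is disjoint from the target set, and left undecided otherwise. The similarity classes and names are hypothetical.

```python
def three_way_regions(similarity_classes, target):
    """Classify each object by how its similarity class relates to `target`."""
    pos, neg, bnd = set(), set(), set()
    for obj, cls in similarity_classes.items():
        if cls <= target:              # every similar object is in the target
            pos.add(obj)
        elif cls.isdisjoint(target):   # no similar object is in the target
            neg.add(obj)
        else:                          # evidence is mixed
            bnd.add(obj)
    return pos, neg, bnd


# Hypothetical similarity classes; they overlap, unlike equivalence classes.
similarity_classes = {
    "o1": {"o1", "o2"},
    "o2": {"o1", "o2", "o3"},
    "o3": {"o2", "o3"},
    "o4": {"o4"},
}
target = {"o1", "o2"}

pos, neg, bnd = three_way_regions(similarity_classes, target)
decision = {
    **{o: "accept" for o in pos},
    **{o: "reject" for o in neg},
    **{o: "non-commitment" for o in bnd},
}
```

Note that `o2` belongs to the target set but falls in the boundary region, because its similarity class also contains `o3`; under incomplete information, membership alone does not guarantee acceptance.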

In an incomplete information table, the

Conclusions

This paper focuses on the different interpretations and definitions of similarity based on the different semantics of incomplete information. We summarize four types of semantics of incomplete information and present a general definition of an incomplete information table. We identify two methods to study similarity relations and similarity classes in an incomplete information table. By transforming an incomplete information table into a set-valued table, we uniformly study the relationships

Acknowledgments

The authors thank the reviewers for their constructive comments. This work was supported in part by the National Natural Science Foundation of China (Grant No. 61473239), the China Scholarship Council (Grant No. 201707000052), and a Discovery Grant from NSERC, Canada.

References (47)

Yao Y.Y., Three-way decisions with probabilistic rough sets, Inform. Sci. (2010)

Yao Y.Y., Three-way decision and granular computing, Internat. J. Approx. Reason. (2018)

Yao Y.Y., Three-way conflict analysis: reformulations and extensions of the Pawlak model, Knowl.-Based Syst. (2019)

Kryszkiewicz M., Rough set approach to incomplete information systems, Inform. Sci. (1998)

Leung Y. et al., Dependence-space-based attribute reductions in inconsistent decision information systems, Internat. J. Approx. Reason. (2008)

Leung Y. et al., Maximal consistent block technique for rule acquisition in incomplete information systems, Inform. Sci. (2003)

Grzymała-Busse J.W. et al., Generalized probabilistic approximations of incomplete data, Internat. J. Approx. Reason. (2014)

Luo C. et al., Updating three-way decisions in incomplete multi-scale information systems, Inform. Sci. (2019)

Lang G.M. et al., Three-way decision approaches to conflict analysis using decision-theoretic rough set theory, Inform. Sci. (2017)

Yang X. et al., A sequential three-way approach to multi-class decision, Internat. J. Approx. Reason. (2019)

Yang X. et al., A temporal-spatial composite sequential approach of three-way granular computing, Inform. Sci. (2019)

Yao Y.Y., The two sides of the theory of rough sets, Knowl.-Based Syst. (2015)

Hu M.J. et al., Structured approximations as a basis for three-way decisions in rough set theory, Knowl.-Based Syst. (2019)

Ma J.M. et al., Structured probabilistic rough set approximations, Internat. J. Approx. Reason. (2017)

Ma X.A. et al., Three-way decision perspectives on class-specific attribute reducts, Inform. Sci. (2018)

Shao M.W. et al., Rule acquisition and complexity reduction in formal decision contexts, Internat. J. Approx. Reason. (2014)

Li M.Z. et al., Approximate concept construction with three-way decisions and attribute reduction in incomplete contexts, Knowl.-Based Syst. (2016)

Guan L.H. et al., Generalized approximations defined by non-equivalence relations, Inform. Sci. (2012)

Słowiński R. et al., Rough classification in incomplete information systems, Math. Comput. Modelling (1989)

Guan Y.Y. et al., Set-valued information systems, Inform. Sci. (2006)

Qian Y.H. et al., Set-valued ordered information systems, Inform. Sci. (2009)

Dai J.H. et al., Fuzzy rough set model for set-valued data, Fuzzy Sets and Systems (2013)

Orłowska E. et al., Representation of nondeterministic information, Theoret. Comput. Sci. (1984)


^☆: No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.knosys.2019.105251.
