Introduction
Related work
Literature | Handling complete/incomplete datasets? | From algebra/information view? |
---|---|---|
Qian et al. [35] | Complete | Algebra view |
Sun et al. [37] | Complete | Both |
Hu et al. [39] | Complete | Algebra view |
Qian et al. [43] | Incomplete | Algebra view |
Sun et al. [45] | Incomplete | Both |
Zeng et al. [52] | Complete | Information view |
Feng et al. [53] | Complete | Information view |
Xu et al. [58] | Complete | Both |
Our work
- Based on the definitions of neighborhood multi-granulation rough sets (NMRS), the shortcomings of the related neighborhood functions are analyzed.
- Three kinds of uncertain indices (the decision index, the sharp decision index, and the blunt decision index) are proposed using the upper and lower approximations of NMRS. Three types of precision and roughness are then redefined on the basis of these indices. Combined with the concept of self-information, four kinds of neighborhood multi-granulation self-information measures are proposed and their properties are studied. Theoretical analysis shows that the fourth measure, named lenient neighborhood multi-granulation self-information (NMSI), is suitable for selecting optimal feature subsets.
- To study uncertainty measures for incomplete neighborhood decision systems from both the algebra and information views, the self-information measures are combined with information entropy to propose the neighborhood multi-granulation self-information-based pessimistic neighborhood multi-granulation tolerance joint entropy (PTSIJE). PTSIJE considers the upper and lower approximations of incomplete decision systems at the same time and measures the uncertainty of incomplete neighborhood decision systems from the algebra and information views simultaneously.
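The ingredients named in the list above (upper and lower approximations over several granulations, precision, roughness, and self-information) can be illustrated with a minimal sketch. It does not reproduce the decision indices, NMSI, or PTSIJE defined in this paper. It assumes a per-attribute neighborhood tolerance relation in which a missing value (`None`) matches anything, the textbook pessimistic multi-granulation approximations, the classical precision |lower|/|upper|, and the generic self-information -log p applied to that precision. The toy data, `delta`, and all function names are hypothetical.

```python
import math

def neighborhood(data, attrs, i, delta):
    """Samples whose known values on `attrs` lie within delta of sample i.

    Assumption: a missing value (None) is tolerant, i.e. it matches any value.
    """
    out = set()
    for j, row in enumerate(data):
        if all(
            data[i][a] is None or row[a] is None or abs(data[i][a] - row[a]) <= delta
            for a in attrs
        ):
            out.add(j)
    return out

def pessimistic_approximations(data, granulations, target, delta):
    """Pessimistic lower/upper approximations of `target` over several granulations.

    Lower: the neighborhood is inside `target` under every granulation.
    Upper: the neighborhood meets `target` under at least one granulation.
    """
    lower, upper = set(), set()
    for i in range(len(data)):
        hoods = [neighborhood(data, g, i, delta) for g in granulations]
        if all(h <= target for h in hoods):
            lower.add(i)
        if any(h & target for h in hoods):
            upper.add(i)
    return lower, upper

# Hypothetical incomplete data: five samples, two attributes, one missing value.
data = [
    (0.10, 0.20),
    (0.15, None),
    (0.80, 0.90),
    (0.85, 0.85),
    (0.50, 0.55),
]
granulations = [(0,), (1,)]   # each granulation is one attribute subset
target = {0, 1}               # indices of one decision class

lower, upper = pessimistic_approximations(data, granulations, target, delta=0.1)
precision = len(lower) / len(upper)        # classical rough-set precision
roughness = 1 - precision
self_information = -math.log(precision)    # generic -log p; needs a non-empty lower set

print(lower, upper, precision, roughness, self_information)
```

In this toy example the single missing value places sample 1 in the neighborhood of every other sample under the second granulation, so the upper approximation grows to the whole universe while the lower approximation keeps only sample 0. A measure built on the lower approximation alone would not register this effect, which is the kind of behavior the contributions above aim to capture.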
Previous knowledge
Self-information
Neighborhood multi-granulation rough sets
The deficiency of relative function and PTSIJE-based uncertainty measures
The deficiency of relative function
PTSIJE-based uncertainty measures
PTSIJE-based feature selection method in incomplete neighborhood decision systems
Feature selection based on PTSIJE
Feature selection algorithm
Experimental results and analysis
Experiment preparation
Effect of different neighborhood parameters
Datasets | Original | FSNTDJE | DMRA | FPRA | FRFS | IFGAS | PDJE-AR | PTSIJE-FS |
---|---|---|---|---|---|---|---|---|
Credit | 15 | 9.5 | 10.2 | 9.6 | 10.0 | 8.4 | 9.1 | 7.2 |
Heart | 13 | 10.9 | 11.8 | 11.4 | 10.7 | 8.4 | 9.6 | 9.4 |
Sonar | 60 | 19.7 | 35.3 | 17.1 | 19.5 | 13.0 | 18.4 | 17.7 |
Wdbc | 31 | 16.9 | 27.9 | 20.8 | 24.4 | 17.1 | 15.7 | 18.4 |
Wine | 13 | 8.8 | 12.1 | 10.9 | 11.4 | 6.2 | 5.7 | 5.5 |
Wpbc | 34 | 7.6 | 25.5 | 19.0 | 22.1 | 10.4 | 12.5 | 5.0 |
Mean | 27.67 | 12.23 | 20.47 | 14.8 | 16.35 | 10.58 | 11.83 | 10.53 |
Datasets | KNN | CART |
---|---|---|
Credit | {3 5 8 9 11 13 14 15} | {4 5 9 11 15} |
Heart | {1 3 4 5 8 10 11 12} | {1 2 3 4 6 7 9 11 12 13} |
Sonar | {3 10 11 16 27 37 46 47 60} | {3 11 12 13 14 16 18 24 27 32 35 38 45 46 48 49 52 55} |
Wdbc | {1 3 5 8 9 12 14 15 16 17 18 19 20 21 25 27 28 30} | {1 3 5 8 9 12 14 15 16 17 18 19 20 21 25 27 28 30} |
Wine | {1 7 10 11 13} | {1 2 7 9 11} |
Wpbc | {15 16 17 20 23 25 26} | {17 18} |
Classification results of the UCI datasets
Datasets | Original | FSNTDJE | DMRA | FPRA | FRFS | IFGAS | PDJE-AR | PTSIJE-FS |
---|---|---|---|---|---|---|---|---|
Credit | 0.7046 | 0.7046 | 0.7927 | 0.8023 | 0.7827 | 0.8130 | 0.8551 | 0.8652 |
Heart | 0.7652 | 0.7741 | 0.7697 | 0.7444 | 0.7545 | 0.7889 | 0.8037 | 0.8222 |
Sonar | 0.7262 | 0.6975 | 0.7214 | 0.7407 | 0.7357 | 0.7055 | 0.7240 | 0.8365 |
Wdbc | 0.9124 | 0.9466 | 0.9404 | 0.9337 | 0.9432 | 0.9311 | 0.9561 | 0.9561 |
Wine | 0.9195 | 0.9057 | 0.9108 | 0.9474 | 0.9330 | 0.9429 | 0.9494 | 0.9775 |
Wpbc | 0.7475 | 0.7374 | 0.7021 | 0.7471 | 0.7368 | 0.6840 | 0.7576 | 0.7626 |
Mean | 0.7959 | 0.7943 | 0.8062 | 0.8193 | 0.8143 | 0.8109 | 0.8410 | 0.8700 |
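The per-dataset ranks reported in the statistical analysis can be recovered from accuracy rows such as those in the table above. The sketch below assumes the usual convention: rank 1 goes to the highest accuracy, and tied methods share the average of the positions they occupy. The helper name `average_ranks` is hypothetical.

```python
def average_ranks(scores):
    """Rank scores in descending order (1 = best); ties share the mean position."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2   # mean of the 1-based positions in the block
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks

# Rows copied from the table above, columns FSNTDJE through PTSIJE-FS.
credit = [0.7046, 0.7927, 0.8023, 0.7827, 0.8130, 0.8551, 0.8652]
wdbc = [0.9466, 0.9404, 0.9337, 0.9432, 0.9311, 0.9561, 0.9561]

print(average_ranks(credit))  # [7.0, 5.0, 4.0, 6.0, 3.0, 2.0, 1.0]
print(average_ranks(wdbc))    # [3.0, 5.0, 6.0, 4.0, 7.0, 1.5, 1.5]
```

The Wdbc row shows the tie case: PDJE-AR and PTSIJE-FS both reach 0.9561 and therefore share rank 1.5.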
Datasets | Original | FSNTDJE | DMRA | FPRA | FRFS | IFGAS | PDJE-AR | PTSIJE-FS |
---|---|---|---|---|---|---|---|---|
Credit | 0.7766 | 0.7766 | 0.8724 | 0.8173 | 0.8245 | 0.8317 | 0.8420 | 0.8629 |
Heart | 0.7852 | 0.7653 | 0.7444 | 0.7407 | 0.7370 | 0.7481 | 0.8000 | 0.8222 |
Sonar | 0.6302 | 0.6707 | 0.7362 | 0.7224 | 0.7260 | 0.6871 | 0.6904 | 0.7596 |
Wdbc | 0.8595 | 0.9124 | 0.9208 | 0.9175 | 0.8990 | 0.9296 | 0.9402 | 0.9349 |
Wine | 0.8351 | 0.8027 | 0.8492 | 0.8592 | 0.8544 | 0.8727 | 0.9101 | 0.9157 |
Wpbc | 0.6970 | 0.6869 | 0.6241 | 0.6439 | 0.6818 | 0.6658 | 0.7071 | 0.7626 |
Mean | 0.7639 | 0.7691 | 0.7912 | 0.7835 | 0.7871 | 0.7892 | 0.8150 | 0.8430 |
Classification results of the gene expression datasets
Datasets | Original | FSNTDJE | MIBARK | DNEAR | EGGS | EGGS-FS | PDJE-AR | PTSIJE-FS |
---|---|---|---|---|---|---|---|---|
DLBCL | 5469 | 6.4 | 2.8 | 10 | 19.5 | 3.7 | 6 | 7.4 |
Leukemia | 7129 | 7.0 | 4.4 | 7.8 | 8.1 | 5.0 | 7.6 | 7.6 |
Lung | 12533 | 9.0 | 5.9 | 6.0 | 11.4 | 6.4 | 8 | 11.0 |
MLL | 12582 | 7.0 | – | – | – | – | 7.8 | 14.8 |
Prostate | 12600 | 8.0 | 4.4 | 5.1 | 7.7 | 14.0 | 4.0 | 5.5 |
Mean | 10062.6 | 7.48 | 4.38 | 7.23 | 11.68 | 7.28 | 6.68 | 9.26 |
DLBCL | {1071 3942 689 856 584 874 952 930 5184} | {3942 689 3724 4552} | {4767 52 584 823 1640 952} |
Leukemia | {1745 2020 2111 6005 461 4328 4377 6281 2354 1260} | {2288 4196 2288} | {461 4328 4377 6281 2354 1260} |
Lung | {2882 3202 7249 11957 12298 4815 1673 7625 6405 8564 2421} | {7200 111958 8564 2421} | {2882 3202 7249 11957 12298 4815 1673 7625 6405 8564 2421} |
MLL | {12418 8518 8428 3768 11229 4341 1316 9929 9845 3675 3882 9084 10856} | {12418 8518 3634 11297 3768 9586 9882 6278 11718 9005 8212 3316 9741 10457 8165 11325 11603} | {12418 8518 3634 11297 3768 9586 9882 6278 11718 9005 8212 3316 9741 10457 8165 11325 11603} |
Prostate | {6185 8058} | {6185 5486 8058} | {6185 8958 5979 3333 1322 5486 9230 8236 9357} |
Datasets | Original | FSNTDJE | MIBARK | DNEAR | EGGS | EGGS-FS | PDJE-AR | PTSIJE-FS |
---|---|---|---|---|---|---|---|---|
DLBCL | 0.896 | 0.805 | 0.765 | 0.698 | 0.854 | 0.87 | 0.935 | 0.9870 |
Leukemia | 0.734 | 0.833 | 0.828 | 0.533 | 0.629 | 0.801 | 0.875 | 0.9583 |
Lung | 0.931 | 0.892 | 0.958 | 0.82 | 0.859 | 0.979 | 0.941 | 0.9890 |
MLL | 0.6528 | 0.9167 | – | – | – | – | 0.8722 | 0.9722 |
Prostate | 0.782 | 0.787 | 0.512 | 0.611 | 0.639 | 0.849 | 0.868 | 0.8824 |
Mean | 0.7992 | 0.8467 | 0.7658 | 0.6655 | 0.7453 | 0.8748 | 0.8982 | 0.9578 |
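When reading the Mean row above, note that MLL has no reported value ("–") for MIBARK, DNEAR, EGGS, and EGGS-FS. The listed means for those methods agree with averaging over the four reported datasets only, whereas the other columns average over five. A small check of that arithmetic, with the missing entry encoded as `None` (an encoding chosen here for illustration), might look like this:

```python
def mean_available(values):
    """Mean over the reported entries only, plus how many entries were used."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present), len(present)

# Columns copied from the table above: DLBCL, Leukemia, Lung, MLL, Prostate.
mibark = [0.765, 0.828, 0.958, None, 0.512]     # MLL not reported
fsntdje = [0.805, 0.833, 0.892, 0.9167, 0.787]  # all five reported

for name, column in [("MIBARK", mibark), ("FSNTDJE", fsntdje)]:
    mean, used = mean_available(column)
    print(f"{name}: mean over {used} datasets = {mean:.4f}")
```

Because the four-dataset means exclude MLL, they are not directly comparable with the five-dataset means in the same row.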
Datasets | Original | FSNTDJE | MIBARK | DNEAR | EGGS | EGGS-FS | PDJE-AR | PTSIJE-FS |
---|---|---|---|---|---|---|---|---|
DLBCL | 0.809 | 0.831 | 0.778 | 0.718 | 0.826 | 0.801 | 0.877 | 0.9481 |
Leukemia | 0.814 | 0.917 | 0.834 | 0.671 | 0.733 | 0.813 | 0.806 | 0.9306 |
Lung | 0.926 | 0.916 | 0.964 | 0.819 | 0.966 | 0.955 | 0.951 | 0.9890 |
MLL | 0.7500 | 0.8889 | – | – | – | – | 0.9028 | 0.9444 |
Prostate | 0.640 | 0.772 | 0.566 | 0.570 | 0.591 | 0.863 | 0.912 | 0.9044 |
Mean | 0.7878 | 0.8650 | 0.7855 | 0.6945 | 0.7790 | 0.8580 | 0.8898 | 0.9433 |
Statistical analysis
Datasets | FSNTDJE | DMRA | FPRA | FRFS | IFGAS | PDJE-AR | PTSIJE-FS |
---|---|---|---|---|---|---|---|
Credit | 7 | 5 | 4 | 6 | 3 | 2 | 1 |
Heart | 4 | 5 | 7 | 6 | 3 | 2 | 1 |
Sonar | 7 | 5 | 2 | 3 | 6 | 4 | 1 |
Wdbc | 3 | 5 | 6 | 4 | 7 | 1.5 | 1.5 |
Wine | 7 | 6 | 3 | 5 | 4 | 2 | 1 |
Wpbc | 4 | 6 | 3 | 5 | 7 | 2 | 1 |
Mean | 5.33 | 5.33 | 4.17 | 4.83 | 5 | 2.25 | 1.08 |
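Rank tables of this kind are the usual input to a Friedman test. Assuming that is the test intended here (this excerpt does not name it), the sketch below computes the Friedman statistic and its Iman-Davenport correction from the ranks in the table above, with N = 6 datasets and k = 7 methods. It uses the textbook formulas without a tie correction, so the one tie in the Wdbc row is handled only through its averaged ranks. The function name is hypothetical.

```python
def friedman_from_ranks(rank_rows):
    """Friedman chi-square and Iman-Davenport F from a datasets-by-methods rank table.

    chi2 = 12N / (k(k+1)) * (sum_j R_j^2 - k(k+1)^2 / 4), with R_j the mean rank
    F    = (N-1) * chi2 / (N(k-1) - chi2)
    No tie correction is applied (simplifying assumption).
    """
    n = len(rank_rows)       # number of datasets
    k = len(rank_rows[0])    # number of methods
    mean_ranks = [sum(row[j] for row in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    f_stat = (n - 1) * chi2 / (n * (k - 1) - chi2)
    return mean_ranks, chi2, f_stat

# Ranks copied from the table above (Credit, Heart, Sonar, Wdbc, Wine, Wpbc).
ranks = [
    [7, 5, 4, 6, 3, 2, 1],
    [4, 5, 7, 6, 3, 2, 1],
    [7, 5, 2, 3, 6, 4, 1],
    [3, 5, 6, 4, 7, 1.5, 1.5],
    [7, 6, 3, 5, 4, 2, 1],
    [4, 6, 3, 5, 7, 2, 1],
]

mean_ranks, chi2, f_stat = friedman_from_ranks(ranks)
print([round(r, 2) for r in mean_ranks])  # [5.33, 5.33, 4.17, 4.83, 5.0, 2.25, 1.08]
print(round(chi2, 2), round(f_stat, 2))   # 21.66 7.55
```

The mean ranks reproduce the Mean row of the table. Under the null hypothesis that all methods perform equally, the Iman-Davenport statistic follows an F distribution with (k-1, (k-1)(N-1)) = (6, 30) degrees of freedom, so the value obtained here would be compared against the critical value at the chosen significance level.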
Datasets | FSNTDJE | DMRA | FPRA | FRFS | IFGAS | PDJE-AR | PTSIJE-FS |
---|---|---|---|---|---|---|---|
Credit | 7 | 1 | 6 | 5 | 4 | 3 | 2 |
Heart | 3 | 5 | 6 | 7 | 4 | 2 | 1 |
Sonar | 7 | 2 | 4 | 3 | 6 | 5 | 1 |
Wdbc | 6 | 4 | 5 | 7 | 3 | 1 | 2 |
Wine | 7 | 6 | 4 | 5 | 3 | 2 | 1 |
Wpbc | 3 | 7 | 6 | 4 | 5 | 2 | 1 |
Mean | 5.5 | 4.17 | 5.17 | 5.17 | 4.17 | 2.5 | 1.33 |
Datasets | FSNTDJE | MIBARK | DNEAR | EGGS | EGGS-FS | PDJE-AR | PTSIJE-FS |
---|---|---|---|---|---|---|---|
DLBCL | 5 | 6 | 7 | 4 | 3 | 2 | 1 |
Leukemia | 3 | 4 | 7 | 6 | 5 | 2 | 1 |
Lung | 5 | 3 | 7 | 6 | 2 | 4 | 1 |
MLL | 2 | – | – | – | – | 3 | 1 |
Prostate | 4 | 7 | 6 | 5 | 3 | 2 | 1 |
Mean | 3.8 | 5 | 6.75 | 5.25 | 3.25 | 2.6 | 1 |
Datasets | FSNTDJE | MIBARK | DNEAR | EGGS | EGGS-FS | PDJE-AR | PTSIJE-FS |
---|---|---|---|---|---|---|---|
DLBCL | 4 | 6 | 7 | 3 | 5 | 2 | 1 |
Leukemia | 2 | 3 | 7 | 6 | 4 | 5 | 1 |
Lung | 6 | 3 | 7 | 2 | 4 | 5 | 1 |
MLL | 3 | – | – | – | – | 2 | 1 |
Prostate | 4 | 7 | 6 | 5 | 3 | 1 | 2 |
Mean | 3.8 | 4.75 | 6.75 | 4 | 4 | 3 | 1.2 |