Published in: International Journal of Machine Learning and Cybernetics 2/2023

05-10-2022 | Original Article

TSFNFS: two-stage-fuzzy-neighborhood feature selection with binary whale optimization algorithm

Authors: Lin Sun, Xinya Wang, Weiping Ding, Jiucheng Xu, Huili Meng



Abstract

Finding the globally optimal feature subset is computationally expensive, and most feature selection methods based on swarm intelligence optimization handle high-dimensional data inefficiently. This study develops a two-stage feature selection model based on fuzzy neighborhood rough sets (FNRS) and the binary whale optimization algorithm (BWOA). First, to capture the fuzziness of samples in mixed data with symbolic and numerical features, a fuzzy neighborhood similarity is introduced to define the similarity matrix and the fuzzy membership degree, from which lower and upper approximations are constructed to form a new FNRS model. Fuzzy neighborhood-based uncertainty measures, including dependence degree, knowledge granularity, and entropy measures, are then studied. Combining the algebraic and information-theoretic viewpoints, a fuzzy knowledge granularity conditional entropy is proposed and used to form a preselected reduced feature set in the first stage. Second, a cosine curve is incorporated into a new control factor that slows the convergence of BWOA in early iterations, allowing fuller global exploration, and accelerates convergence in late iterations. A new fitness function that integrates the dependence degree with the fuzzy knowledge granularity conditional entropy is designed to select an optimal feature subset in this second stage. Two strategies are combined to keep BWOA from falling into local optima: a population partition strategy that divides the whale population using an adaptive neighborhood search radius, and a local interference strategy for the elite subgroup that adjusts the whale position update. Finally, a two-stage feature selection algorithm is designed in which the Fisher score algorithm first removes redundant features from high-dimensional datasets. Experiments on six UCI datasets and five gene expression datasets show that the algorithm is effective compared with related algorithms.
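
As a rough illustration of two ingredients named in the abstract, the Python sketch below computes classical Fisher scores for prefiltering features and compares a cosine-shaped control factor with the linearly decreasing factor of the standard whale optimization algorithm. The function names and the cosine form a(t) = a_max * cos(pi * t / (2 * T)) are assumptions chosen for illustration only; the abstract does not give the paper's exact formulas, so this should not be read as the authors' implementation.

```python
import math
from collections import defaultdict


def fisher_scores(rows, labels):
    """Classical Fisher score per feature: between-class scatter
    divided by within-class scatter. Higher means more discriminative."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for members in by_class.values():
            values = [row[j] for row in members]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        # Small constant guards against division by zero (assumption).
        scores.append(between / (within + 1e-12))
    return scores


def linear_control_factor(t, max_iter, a_max=2.0):
    """Standard WOA control factor: decreases linearly from a_max to 0."""
    return a_max * (1.0 - t / max_iter)


def cosine_control_factor(t, max_iter, a_max=2.0):
    """Hypothetical cosine-shaped factor (not the paper's exact formula):
    decays slowly in early iterations and faster in late iterations."""
    return a_max * math.cos(0.5 * math.pi * t / max_iter)


if __name__ == "__main__":
    # Toy data (made up): feature 0 separates the classes, feature 1 does not.
    rows = [[1.0, 5.0], [1.2, 1.0], [3.0, 5.2], [3.1, 0.9]]
    labels = [0, 0, 1, 1]
    print("Fisher scores:", fisher_scores(rows, labels))
    for t in (0, 25, 50, 75, 100):
        print(t, linear_control_factor(t, 100), cosine_control_factor(t, 100))
```

With this cosine form, the factor stays above the linear schedule throughout the run and drops most steeply near the final iteration, which is one simple way to obtain the slow-early, fast-late behaviour the abstract describes.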

Metadata
Title
TSFNFS: two-stage-fuzzy-neighborhood feature selection with binary whale optimization algorithm
Authors
Lin Sun
Xinya Wang
Weiping Ding
Jiucheng Xu
Huili Meng
Publication date
05-10-2022
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 2/2023
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-022-01653-0
