Skip to main content
Erschienen in: International Journal of Machine Learning and Cybernetics 5/2020

13.08.2019 | Original Article

Discernible neighborhood counting based incremental feature selection for heterogeneous data

verfasst von: Yanyan Yang, Shiji Song, Degang Chen, Xiao Zhang

Erschienen in: International Journal of Machine Learning and Cybernetics | Ausgabe 5/2020

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Incremental feature selection refreshes a subset of information-rich features from added-in samples without forgetting the previously learned knowledge. However, most existing algorithms for incremental feature selection have no explicit mechanisms to handle heterogeneous data with symbolic and real-valued features. Therefore, this paper presents an incremental feature selection method for heterogeneous data with the sequential arrival of samples in group. Discernible neighborhood counting that measures different types of features, is first introduced to establish a framework for feature selection from heterogeneous data. With the arrival of new samples, the discernible neighborhood counting of a feature subset is then updated to reveal the incremental feature selection scheme. This scheme determines the criterion for efficiently adding informative features and deleting redundant features. Based on the incremental scheme, our incremental feature selection algorithm is further formulated to select valuable features from heterogeneous data. Extensive experiments are finally conducted to demonstrate the effectiveness and the efficiency of the proposed incremental feature selection algorithm.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Weitere Produktempfehlungen anzeigen
Literatur
1.
Zurück zum Zitat Qian YH, Liang JY, Pedrycz W, Dang CY (2010) Positive approximation: an accelerator for attribute reduction in rough set theory. Artif Intell 174(9):597–618MathSciNetMATH Qian YH, Liang JY, Pedrycz W, Dang CY (2010) Positive approximation: an accelerator for attribute reduction in rough set theory. Artif Intell 174(9):597–618MathSciNetMATH
2.
Zurück zum Zitat Zhang X, Mei CL, Chen DG, Yang YY (2018) A fuzzy rough set-based feature selection method using representative instances. Knowl Based Syst 151:216–229 Zhang X, Mei CL, Chen DG, Yang YY (2018) A fuzzy rough set-based feature selection method using representative instances. Knowl Based Syst 151:216–229
3.
Zurück zum Zitat Bell DA, Wang H (2000) A formalism for relevance and its application in feature subset selection. Mach Learn 41(2):175–195MATH Bell DA, Wang H (2000) A formalism for relevance and its application in feature subset selection. Mach Learn 41(2):175–195MATH
4.
Zurück zum Zitat Zeng AP, Li TR, Liu D, Zhang JB, Chen HM (2015) A fuzzy rough set approach for incremental feature selection on hybrid information systems. Fuzzy Sets Syst 258:39–60MathSciNetMATH Zeng AP, Li TR, Liu D, Zhang JB, Chen HM (2015) A fuzzy rough set approach for incremental feature selection on hybrid information systems. Fuzzy Sets Syst 258:39–60MathSciNetMATH
5.
Zurück zum Zitat Zhang JB, Zhu Y, Pan Y, Li TR (2016) Efficient parallel boolean matrix based algorithms for computing composite rough set approximations. Inf Sci 329:287–302MATH Zhang JB, Zhu Y, Pan Y, Li TR (2016) Efficient parallel boolean matrix based algorithms for computing composite rough set approximations. Inf Sci 329:287–302MATH
6.
Zurück zum Zitat Zhang JB, Li TR, Chen HM (2014) Composite rough sets for dynamic data mining. Inf Sci 257:81–100MathSciNetMATH Zhang JB, Li TR, Chen HM (2014) Composite rough sets for dynamic data mining. Inf Sci 257:81–100MathSciNetMATH
7.
Zurück zum Zitat Tang WY, Mao KZ (2007) Feature selection algorithm for mixed data with both nominal and continuous features. Pattern Recogn Lett 28(5):563–571 Tang WY, Mao KZ (2007) Feature selection algorithm for mixed data with both nominal and continuous features. Pattern Recogn Lett 28(5):563–571
8.
Zurück zum Zitat Ching JY, Wong AKC, Chan KCC (1995) Class-dependent discretization for inductive learning from continuous and mixed-mode data. IEEE Trans Pattern Anal Mach Intell 17(7):641–651 Ching JY, Wong AKC, Chan KCC (1995) Class-dependent discretization for inductive learning from continuous and mixed-mode data. IEEE Trans Pattern Anal Mach Intell 17(7):641–651
9.
Zurück zum Zitat Chmielewski MR, Grzymala-Busse JW (1996) Global discretization of continuous attributes as preprocessing for machine learning. Int J Approx Reason 15(4):319–331MATH Chmielewski MR, Grzymala-Busse JW (1996) Global discretization of continuous attributes as preprocessing for machine learning. Int J Approx Reason 15(4):319–331MATH
10.
Zurück zum Zitat Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106 Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106
11.
Zurück zum Zitat Hu QH, Yu DR, Liu JF, Wu CX (2008) Neighborhood rough set based heterogeneous feature subset selection. Inf Sci 178(18):3577–3594MathSciNetMATH Hu QH, Yu DR, Liu JF, Wu CX (2008) Neighborhood rough set based heterogeneous feature subset selection. Inf Sci 178(18):3577–3594MathSciNetMATH
12.
Zurück zum Zitat Chen DG, Yang YY (2014) Attribute reduction for heterogeneous data based on the combination of classical and fuzzy rough set models. IEEE Trans Fuzzy Syst 22(5):1325–1334 Chen DG, Yang YY (2014) Attribute reduction for heterogeneous data based on the combination of classical and fuzzy rough set models. IEEE Trans Fuzzy Syst 22(5):1325–1334
13.
Zurück zum Zitat Zhang X, Mei CL, Chen DG, Li JH (2016) Feature selection in mixed data: a method using a novel fuzzy rough set-based information entropy. Pattern Recogn 56:1–15MATH Zhang X, Mei CL, Chen DG, Li JH (2016) Feature selection in mixed data: a method using a novel fuzzy rough set-based information entropy. Pattern Recogn 56:1–15MATH
14.
Zurück zum Zitat Wang CZ, Hu QH, Wang XZ, Chen DG, Qian YH, Dong Z (2018) Feature selection based on neighborhood discrimination index. IEEE Trans Neural Netw Learn Syst 29(7):2986–2999MathSciNet Wang CZ, Hu QH, Wang XZ, Chen DG, Qian YH, Dong Z (2018) Feature selection based on neighborhood discrimination index. IEEE Trans Neural Netw Learn Syst 29(7):2986–2999MathSciNet
15.
Zurück zum Zitat Wang CZ, He Q, Shao MW, Hu QH (2018) Feature selection based on maximal neighborhood discernibility. Int J Mach Learn Cybern 9(11):1929–1940 Wang CZ, He Q, Shao MW, Hu QH (2018) Feature selection based on maximal neighborhood discernibility. Int J Mach Learn Cybern 9(11):1929–1940
16.
Zurück zum Zitat Wu Y, Hoi SCH, Mei T, Yu NH (2017) Large-scale online feature selection for ultra-high dimensional sparse data. ACM Trans Knowl Discov Data 11(4):1–13 Wu Y, Hoi SCH, Mei T, Yu NH (2017) Large-scale online feature selection for ultra-high dimensional sparse data. ACM Trans Knowl Discov Data 11(4):1–13
17.
Zurück zum Zitat Luo C, Li TR, Chen HM, Fujita H, Zhang Y (2018) Incremental rough set approach for hierarchical multicriteria classification. Inf Sci 429:72–87MathSciNet Luo C, Li TR, Chen HM, Fujita H, Zhang Y (2018) Incremental rough set approach for hierarchical multicriteria classification. Inf Sci 429:72–87MathSciNet
18.
Zurück zum Zitat Das AK, Sengupta S, Bhattacharyya S (2018) A group incremental feature selection for classification using rough set theory based genetic algorithm. Appl Soft Comput 65:400–411 Das AK, Sengupta S, Bhattacharyya S (2018) A group incremental feature selection for classification using rough set theory based genetic algorithm. Appl Soft Comput 65:400–411
19.
Zurück zum Zitat Xie XJ, Qin XL (2018) A novel incremental attribute reduction approach for dynamic incomplete decision systems. Int J Approx Reason 93:443–462MathSciNetMATH Xie XJ, Qin XL (2018) A novel incremental attribute reduction approach for dynamic incomplete decision systems. Int J Approx Reason 93:443–462MathSciNetMATH
20.
Zurück zum Zitat Huang YY, Li TR, Luo C, Fujita H, Horng SJ (2017) Matrix-based dynamic updating rough fuzzy approximations for data mining. Knowl Based Syst 119(C):273–283 Huang YY, Li TR, Luo C, Fujita H, Horng SJ (2017) Matrix-based dynamic updating rough fuzzy approximations for data mining. Knowl Based Syst 119(C):273–283
21.
Zurück zum Zitat Hu CX, Liu SX, Liu GX (2017) Matrix-based approaches for dynamic updating approximations in multigranulation rough sets. Knowl Based Syst 122:51–63 Hu CX, Liu SX, Liu GX (2017) Matrix-based approaches for dynamic updating approximations in multigranulation rough sets. Knowl Based Syst 122:51–63
22.
Zurück zum Zitat Zhang YY, Li TR, Luo C, Zhang JB, Chen HM (2016) Incremental updating of rough approximations in interval-valued information systems under attribute generalization. Inf Sci 373:461–475MATH Zhang YY, Li TR, Luo C, Zhang JB, Chen HM (2016) Incremental updating of rough approximations in interval-valued information systems under attribute generalization. Inf Sci 373:461–475MATH
23.
Zurück zum Zitat Hu J, Li TR, Luo C, Fujita H, Li SY (2016) Incremental fuzzy probabilistic rough sets over two universes. Int J Approx Reason 81:28–48MathSciNetMATH Hu J, Li TR, Luo C, Fujita H, Li SY (2016) Incremental fuzzy probabilistic rough sets over two universes. Int J Approx Reason 81:28–48MathSciNetMATH
24.
Zurück zum Zitat Luo C, Li TR, Chen HM, Fujita H, Zhang Y (2016) Efficient updating of probabilistic approximations with incremental objects. Knowl Based Syst 109:71–83 Luo C, Li TR, Chen HM, Fujita H, Zhang Y (2016) Efficient updating of probabilistic approximations with incremental objects. Knowl Based Syst 109:71–83
25.
Zurück zum Zitat Chen HM, Li TR, Luo C, Horng SJ, Wang GY (2014) A rough set-based method for updating decision rules on attribute values’ coarsening and refining. IEEE Trans Knowl Data Eng 6(12):2886–2899 Chen HM, Li TR, Luo C, Horng SJ, Wang GY (2014) A rough set-based method for updating decision rules on attribute values’ coarsening and refining. IEEE Trans Knowl Data Eng 6(12):2886–2899
26.
Zurück zum Zitat Chen HM, Li TR, Luo C, Horng SJ, Wang GY (2015) A decision-theoretic rough set approach for dynamic data mining. IEEE Trans Fuzzy Syst 23(6):1958–1970 Chen HM, Li TR, Luo C, Horng SJ, Wang GY (2015) A decision-theoretic rough set approach for dynamic data mining. IEEE Trans Fuzzy Syst 23(6):1958–1970
27.
Zurück zum Zitat Orlowska ME, Orlowski MW (1992) Maintenance of knowledge in dynamic information systems. Springer, Dordrecht, pp 315–329 Orlowska ME, Orlowski MW (1992) Maintenance of knowledge in dynamic information systems. Springer, Dordrecht, pp 315–329
28.
Zurück zum Zitat Hu F, Wang GY, Huang H, Wu Y (2005) Incremental attribute reduction based on elementary sets. In: Slezak D, Wang G, Szczuka M, Duntsch I, Yao Y (eds) International conference on rough sets, fuzzy sets, data mining, and granular computing. Springer, Berlin, Heidelberg, pp 185–193 Hu F, Wang GY, Huang H, Wu Y (2005) Incremental attribute reduction based on elementary sets. In: Slezak D, Wang G, Szczuka M, Duntsch I, Yao Y (eds) International conference on rough sets, fuzzy sets, data mining, and granular computing. Springer, Berlin, Heidelberg, pp 185–193
29.
Zurück zum Zitat Hu F, Dai J, Wang GY (2007) Incremental algorithms for attribute reduction in decision table. Control Decis 22(3):268–272MATH Hu F, Dai J, Wang GY (2007) Incremental algorithms for attribute reduction in decision table. Control Decis 22(3):268–272MATH
30.
Zurück zum Zitat Yang M (2007) An incremental updating algorithm for attribute reduction based on improved discernibility matrix. Chin J Comput 30(5):815–822MathSciNet Yang M (2007) An incremental updating algorithm for attribute reduction based on improved discernibility matrix. Chin J Comput 30(5):815–822MathSciNet
31.
Zurück zum Zitat Feng SR, Zhang DZ (2012) Increment algorithm for attribute reduction based on improvement of discernibility matrix. J Shenzhen Univ Sci Eng 29:5MathSciNetMATH Feng SR, Zhang DZ (2012) Increment algorithm for attribute reduction based on improvement of discernibility matrix. J Shenzhen Univ Sci Eng 29:5MathSciNetMATH
32.
Zurück zum Zitat Shu WH, Shen H (2013) A rough-set based incremental approach for updating attribute reduction under dynamic incomplete decision systems. In: IEEE international conference on fuzzy systems. IEEE, Hyderabad, pp 1–7 Shu WH, Shen H (2013) A rough-set based incremental approach for updating attribute reduction under dynamic incomplete decision systems. In: IEEE international conference on fuzzy systems. IEEE, Hyderabad, pp 1–7
33.
Zurück zum Zitat Liang JY, Wang F, Dang CY, Qian YH (2013) A group incremental approach to feature selection applying rough set technique. IEEE Trans Knowl Data Eng 26(2):294–308 Liang JY, Wang F, Dang CY, Qian YH (2013) A group incremental approach to feature selection applying rough set technique. IEEE Trans Knowl Data Eng 26(2):294–308
34.
Zurück zum Zitat Chen DG, Yang YY, Dong Z (2016) An incremental algorithm for attribute reduction with variable precision rough sets. Appl Soft Comput 45:129–149 Chen DG, Yang YY, Dong Z (2016) An incremental algorithm for attribute reduction with variable precision rough sets. Appl Soft Comput 45:129–149
35.
Zurück zum Zitat Yang YY, Chen DG, Wang H, Tsang ECC, Zhang DL (2016) Fuzzy rough set based incremental attribute reduction from dynamic data with sample arriving. Fuzzy Sets Syst 312:66–86MathSciNetMATH Yang YY, Chen DG, Wang H, Tsang ECC, Zhang DL (2016) Fuzzy rough set based incremental attribute reduction from dynamic data with sample arriving. Fuzzy Sets Syst 312:66–86MathSciNetMATH
36.
Zurück zum Zitat Yang YY, Chen DG, Wang H, Wang XZ (2018) Incremental perspective for feature selection based on fuzzy rough sets. IEEE Trans Fuzzy Syst 26(3):1257–1273 Yang YY, Chen DG, Wang H, Wang XZ (2018) Incremental perspective for feature selection based on fuzzy rough sets. IEEE Trans Fuzzy Syst 26(3):1257–1273
37.
Zurück zum Zitat Yang YY, Chen DG, Wang H (2017) Active sample selection based incremental algorithm for attribute reduction with rough sets. IEEE Trans Fuzzy Syst 25(4):825–838 Yang YY, Chen DG, Wang H (2017) Active sample selection based incremental algorithm for attribute reduction with rough sets. IEEE Trans Fuzzy Syst 25(4):825–838
38.
Zurück zum Zitat Lang GM, Li QG, Cai MJ, Yang T (2015) Characteristic matrixes-based knowledge reduction in dynamic covering decision information systems. Knowl Based Syst 85(C):1–26 Lang GM, Li QG, Cai MJ, Yang T (2015) Characteristic matrixes-based knowledge reduction in dynamic covering decision information systems. Knowl Based Syst 85(C):1–26
39.
Zurück zum Zitat Jing YG, Li TR, Luo C, Horng SJ, Wang GY, Yu Z (2016) An incremental approach for attribute reduction based on knowledge granularity. Knowl Based Syst 104(C):24–38 Jing YG, Li TR, Luo C, Horng SJ, Wang GY, Yu Z (2016) An incremental approach for attribute reduction based on knowledge granularity. Knowl Based Syst 104(C):24–38
40.
Zurück zum Zitat Jing YG, Li TR, Fujita H, Yu Z, Wang B (2017) An incremental attribute reduction approach based on knowledge granularity with a multi-granulation view. Inf Sci 411:23–38MathSciNet Jing YG, Li TR, Fujita H, Yu Z, Wang B (2017) An incremental attribute reduction approach based on knowledge granularity with a multi-granulation view. Inf Sci 411:23–38MathSciNet
41.
Zurück zum Zitat Wang H (2006) Nearest neighbors by neighborhood counting. IEEE Trans Pattern Anal Mach Intell 28(6):942–953 Wang H (2006) Nearest neighbors by neighborhood counting. IEEE Trans Pattern Anal Mach Intell 28(6):942–953
42.
Zurück zum Zitat Wu WZ, Zhang WX (2002) Neighborhood operator systems and approximations. Inf Sci 144(1):201–217MathSciNetMATH Wu WZ, Zhang WX (2002) Neighborhood operator systems and approximations. Inf Sci 144(1):201–217MathSciNetMATH
43.
Zurück zum Zitat Zhu PF, Hu QH (2013) Adaptive neighborhood granularity selection and combination based on margin distribution optimization. Inf Sci 249:1–12MathSciNetMATH Zhu PF, Hu QH (2013) Adaptive neighborhood granularity selection and combination based on margin distribution optimization. Inf Sci 249:1–12MathSciNetMATH
44.
Zurück zum Zitat Wang CZ, Shi YP, Fan XD, Shao MW (2019) Attribute reduction based on k-nearest neighborhood rough sets. Int J Approx Reason 106:18–31MathSciNetMATH Wang CZ, Shi YP, Fan XD, Shao MW (2019) Attribute reduction based on k-nearest neighborhood rough sets. Int J Approx Reason 106:18–31MathSciNetMATH
45.
Zurück zum Zitat Wang CZ, Shao MW, He Q, Qian YH, Qi YL (2016) Feature subset selection based on fuzzy neighborhood rough sets. Knowl Based Syst 111:173–179 Wang CZ, Shao MW, He Q, Qian YH, Qi YL (2016) Feature subset selection based on fuzzy neighborhood rough sets. Knowl Based Syst 111:173–179
Metadaten
Titel
Discernible neighborhood counting based incremental feature selection for heterogeneous data
verfasst von
Yanyan Yang
Shiji Song
Degang Chen
Xiao Zhang
Publikationsdatum
13.08.2019
Verlag
Springer Berlin Heidelberg
Erschienen in
International Journal of Machine Learning and Cybernetics / Ausgabe 5/2020
Print ISSN: 1868-8071
Elektronische ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-019-00997-4

Weitere Artikel der Ausgabe 5/2020

International Journal of Machine Learning and Cybernetics 5/2020 Zur Ausgabe

Neuer Inhalt