Published in: International Journal of Machine Learning and Cybernetics 12/2020

07.05.2020 | Original Article

Fast feature selection for interval-valued data through kernel density estimation entropy

Authors: Jianhua Dai, Ye Liu, Jiaolong Chen, Xiaofeng Liu


Abstract

Kernel density estimation, a non-parametric method for estimating the probability density of random variables, has been used in feature selection. However, existing feature selection methods based on kernel density estimation seldom consider interval-valued data, even though such data arise widely in practice. In this paper, a feature selection method based on kernel density estimation for interval-valued data is proposed. Firstly, the kernel function in kernel density estimation is defined for interval-valued data. Secondly, an interval-valued kernel density estimation probability structure is constructed from the defined kernel function, including the kernel density estimation conditional probability, joint probability and posterior probability. Thirdly, kernel density estimation entropies for interval-valued data are derived from the constructed probability structure, including the information entropy, conditional entropy and joint entropy of kernel density estimation. Fourthly, a feature selection approach based on kernel density estimation entropy is proposed. Moreover, the proposed feature selection algorithm is improved to obtain a fast feature selection algorithm based on kernel density estimation entropy. Finally, comparative experiments are conducted from three perspectives, computing time, intuitive identifiability and classification performance, to show the feasibility and effectiveness of the proposed method.
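
To make the pipeline in the abstract concrete, the Python sketch below gives one hedged reading of it, not the paper's exact construction: it assumes a Gaussian kernel over a Hausdorff-style interval distance, a product kernel as the joint estimate over selected features, and greedy forward selection that minimizes a kernel density estimate of the conditional entropy of the decision classes. The helpers interval_distance, feature_kernel, conditional_entropy and greedy_select are hypothetical names introduced here for illustration.

```python
import numpy as np

def interval_distance(a, b):
    # Hausdorff-style distance between intervals [a_l, a_u] and [b_l, b_u]
    # (an assumed metric, not necessarily the kernel argument defined in the paper)
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def feature_kernel(X, h=0.5):
    # Pairwise Gaussian kernel values over one interval-valued feature; X has shape (n, 2)
    n = len(X)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = np.exp(-(interval_distance(X[i], X[j]) / h) ** 2)
    return K

def conditional_entropy(kernels, selected, y):
    # KDE estimate of H(D | selected features): the posterior of class d at sample i is the
    # class-restricted kernel mass divided by the total kernel mass (product kernel over features)
    n = len(y)
    K = np.ones((n, n))
    for f in selected:
        K = K * kernels[f]
    total = K.sum(axis=1)
    H = 0.0
    for d in np.unique(y):
        p = K[:, y == d].sum(axis=1) / total
        H -= np.sum(p * np.log2(p + 1e-12))
    return H / n

def greedy_select(features, y, n_select=2):
    # Forward selection: repeatedly add the candidate feature that most reduces H(D | B)
    kernels = [feature_kernel(f) for f in features]
    selected, remaining = [], list(range(len(features)))
    while remaining and len(selected) < n_select:
        best = min(remaining, key=lambda j: conditional_entropy(kernels, selected + [j], y))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage: three interval-valued features (each an (n, 2) array of [lower, upper]) and labels
rng = np.random.default_rng(0)
feats = [np.sort(rng.random((6, 2)), axis=1) for _ in range(3)]
labels = np.array([0, 0, 0, 1, 1, 1])
print(greedy_select(feats, labels, n_select=2))
```

The fast variant mentioned in the abstract would additionally avoid recomputing the product kernel from scratch at each greedy step, for example by caching and updating the accumulated kernel matrix; that refinement is omitted from this sketch.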

Metadata
Title
Fast feature selection for interval-valued data through kernel density estimation entropy
Authors
Jianhua Dai
Ye Liu
Jiaolong Chen
Xiaofeng Liu
Publication date
07.05.2020
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 12/2020
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-020-01131-5
