Skip to main content
Erschienen in: International Journal of Machine Learning and Cybernetics 1/2016

01.02.2016 | Original Article

Feature and instance reduction for PNN classifiers based on fuzzy rough sets

verfasst von: Eric C. C. Tsang, Qinghua Hu, Degang Chen

Erschienen in: International Journal of Machine Learning and Cybernetics | Ausgabe 1/2016

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Instance reduction for K-nearest-neighbor classification rules (KNN) has attracted much attention these years, and most of the existing approaches lose the semantics of probability of original data. In this work, we propose a new reduced KNN rule, called FAIR-KNN, to perform feature and instance reduction based on fuzzy rough set theory. First, we use fuzzy rough sets to evaluate candidate features and select the most informative ones. The algorithm of feature selection returns the selected features and the membership values of samples to the lower approximations of their classes. These values reflect the distances of the samples to classification boundary and are used to compute probabilities of samples to be subsampled. Then we introduce a weighted Parzen window technique to estimate the probability from the weighted subsampled data. Thus we can not only reduce features and samples in original data, but also do not lose the semantics of probability. Finally, the memberships of samples to lower and upper approximations of decisions are interpreted as certainty and possibility degrees of samples belonging to the corresponding decisions, respectively. So the weighted averages with probability of the memberships of samples to lower and upper approximations are outputted as the certainty and possibility degrees of unseen samples belonging to some decisions, which enrich the semantics of KNN. Numerical experiments on artificial and real-world data validate the effectiveness of the proposed technique.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Weitere Produktempfehlungen anzeigen
Literatur
1.
Zurück zum Zitat Cover TM, Hart PE (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27CrossRefMATH Cover TM, Hart PE (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27CrossRefMATH
2.
Zurück zum Zitat Hart PE (1968) Condensed nearest neighbor rule. IEEE Trans Inf Theory 14(3):515CrossRef Hart PE (1968) Condensed nearest neighbor rule. IEEE Trans Inf Theory 14(3):515CrossRef
3.
Zurück zum Zitat Lai JZC, Liaw YC, Liu J (2007) Fast k-nearest-neighbor search based on projection and triangular inequality. Pattern Recogn 40(2):351–359CrossRefMATH Lai JZC, Liaw YC, Liu J (2007) Fast k-nearest-neighbor search based on projection and triangular inequality. Pattern Recogn 40(2):351–359CrossRefMATH
4.
Zurück zum Zitat Angiulli F (2007) Fast nearest neighbor condensation for large data sets classification. IEEE Trans Knowl Data Eng 19(11):1450–1464CrossRef Angiulli F (2007) Fast nearest neighbor condensation for large data sets classification. IEEE Trans Knowl Data Eng 19(11):1450–1464CrossRef
5.
Zurück zum Zitat Tao YF, Yiu ML, Mamoulis N (2006) Reverse nearest neighbor search in metric spaces. IEEE Trans Knowl Data Eng 18(9):1239–1252CrossRef Tao YF, Yiu ML, Mamoulis N (2006) Reverse nearest neighbor search in metric spaces. IEEE Trans Knowl Data Eng 18(9):1239–1252CrossRef
6.
Zurück zum Zitat Keller JM, Gray MR, Givens JA (1985) A fuzzy k-nearest neighbor algorithm. IEEE Trans Syst Man Cybern 15(4):580–585CrossRef Keller JM, Gray MR, Givens JA (1985) A fuzzy k-nearest neighbor algorithm. IEEE Trans Syst Man Cybern 15(4):580–585CrossRef
7.
Zurück zum Zitat Yager RR (2002) Using fuzzy methods to model nearest neighbor rules. IEEE Trans Syst Man Cybern Part B Cybern 32(4):512–525MathSciNetCrossRef Yager RR (2002) Using fuzzy methods to model nearest neighbor rules. IEEE Trans Syst Man Cybern Part B Cybern 32(4):512–525MathSciNetCrossRef
8.
Zurück zum Zitat Wettschereck D, Aha DW, Mohri T (1997) A review and empirical evaluation of feature weighting methods for a class of lazy learning algorithms. Artif Intell Rev 11(1–5):273–314CrossRef Wettschereck D, Aha DW, Mohri T (1997) A review and empirical evaluation of feature weighting methods for a class of lazy learning algorithms. Artif Intell Rev 11(1–5):273–314CrossRef
9.
Zurück zum Zitat Domeniconi C, Gunopulos D, Peng J (2005) Large margin nearest neighbor classifiers. IEEE Trans Neural Netw 16(4):899–909CrossRef Domeniconi C, Gunopulos D, Peng J (2005) Large margin nearest neighbor classifiers. IEEE Trans Neural Netw 16(4):899–909CrossRef
10.
Zurück zum Zitat Paredes R, Vidal E (2006) Learning weighted metrics to minimize nearest-neighbor classification error. IEEE Trans Pattern Anal Mach Intell 28(7):1100–1110CrossRef Paredes R, Vidal E (2006) Learning weighted metrics to minimize nearest-neighbor classification error. IEEE Trans Pattern Anal Mach Intell 28(7):1100–1110CrossRef
11.
Zurück zum Zitat Wang JG, Neskovic P, Cooper LN (2007) Improving nearest neighbor rule with a simple adaptive distance measure. Pattern Recogn Lett 28(2):207–213CrossRef Wang JG, Neskovic P, Cooper LN (2007) Improving nearest neighbor rule with a simple adaptive distance measure. Pattern Recogn Lett 28(2):207–213CrossRef
12.
Zurück zum Zitat Ghosh AK (2007) On nearest neighbor classification using adaptive choice of k. J Comput Gr Stat 16(2):482–502CrossRefMathSciNet Ghosh AK (2007) On nearest neighbor classification using adaptive choice of k. J Comput Gr Stat 16(2):482–502CrossRefMathSciNet
13.
Zurück zum Zitat Ferri FJ, Albert JV, Vidal E (1999) Considerations about sample-size sensitivity of a family of edited nearest-neighbor rules. IEEE Trans Syst Man Cybern Part B Cybern 29(5):667–672CrossRef Ferri FJ, Albert JV, Vidal E (1999) Considerations about sample-size sensitivity of a family of edited nearest-neighbor rules. IEEE Trans Syst Man Cybern Part B Cybern 29(5):667–672CrossRef
14.
Zurück zum Zitat Wilson DR, Martinez TR (2000) Reduction techniques for instance-based learning algorithms. Mach Learn 38(3):257–286CrossRefMATH Wilson DR, Martinez TR (2000) Reduction techniques for instance-based learning algorithms. Mach Learn 38(3):257–286CrossRefMATH
15.
Zurück zum Zitat Lindenbaum M, Markovitch S, Rusakov D (2004) Selective sampling for nearest neighbor classifiers. Mach Learn 54(2):125–152CrossRefMATH Lindenbaum M, Markovitch S, Rusakov D (2004) Selective sampling for nearest neighbor classifiers. Mach Learn 54(2):125–152CrossRefMATH
16.
Zurück zum Zitat Zouhal LM, Denoeux T (1998) An evidence-theoretic k-NN rule with parameter optimization. IEEE Trans Syst Man Cybern Part C Appl Rev 28(2):263–271CrossRef Zouhal LM, Denoeux T (1998) An evidence-theoretic k-NN rule with parameter optimization. IEEE Trans Syst Man Cybern Part C Appl Rev 28(2):263–271CrossRef
17.
Zurück zum Zitat Ghosh AK, Chaudhuri P, Murthy CA (2005) On visualization and aggregation of nearest neighbor classifiers. IEEE Trans Pattern Anal Mach Intell 27(10):1592–1602CrossRef Ghosh AK, Chaudhuri P, Murthy CA (2005) On visualization and aggregation of nearest neighbor classifiers. IEEE Trans Pattern Anal Mach Intell 27(10):1592–1602CrossRef
18.
Zurück zum Zitat Wettschereck D, Dietterich TG (1995) An experimental comparison of the nearest-neighbor and nearest-hyperrectangle algorithms. Mach Learn 19(1):5–27 Wettschereck D, Dietterich TG (1995) An experimental comparison of the nearest-neighbor and nearest-hyperrectangle algorithms. Mach Learn 19(1):5–27
19.
Zurück zum Zitat Wang H (2006) Nearest neighbors by neighborhood counting. IEEE Trans Pattern Anal Mach Intell 28(6):942–953CrossRef Wang H (2006) Nearest neighbors by neighborhood counting. IEEE Trans Pattern Anal Mach Intell 28(6):942–953CrossRef
20.
Zurück zum Zitat Hu QH, Yu DR, Xie ZX (2008) Neighborhood classifiers. Expert Syst Appl 34:866–876CrossRef Hu QH, Yu DR, Xie ZX (2008) Neighborhood classifiers. Expert Syst Appl 34:866–876CrossRef
21.
Zurück zum Zitat Homes CC, Adams NM (2002) A probabilistic nearest neighbour method for statistical pattern recognition. J Royal Stat Soc B 64:295–306CrossRefMathSciNetMATH Homes CC, Adams NM (2002) A probabilistic nearest neighbour method for statistical pattern recognition. J Royal Stat Soc B 64:295–306CrossRefMathSciNetMATH
22.
Zurück zum Zitat Jiang QY, Zhang WS (1993) An improved method for finding nearest neighbors. Pattern Recogn Lett 14(7):531–535CrossRef Jiang QY, Zhang WS (1993) An improved method for finding nearest neighbors. Pattern Recogn Lett 14(7):531–535CrossRef
23.
Zurück zum Zitat McNames J (2001) A fast nearest-neighbor algorithm based on a principal axis search tree. IEEE Trans Pattern Anal Mach Intell 23(9):964–976CrossRef McNames J (2001) A fast nearest-neighbor algorithm based on a principal axis search tree. IEEE Trans Pattern Anal Mach Intell 23(9):964–976CrossRef
24.
Zurück zum Zitat Cha GH, Zhu XM, Petkovic D et al (2002) An efficient indexing method for nearest neighbor searches in high-dimensional image databases. IEEE Trans Multimed 4(1):76–87CrossRef Cha GH, Zhu XM, Petkovic D et al (2002) An efficient indexing method for nearest neighbor searches in high-dimensional image databases. IEEE Trans Multimed 4(1):76–87CrossRef
26.
Zurück zum Zitat Narayan BL, Murthy CA, Pal SK (2006) Maxdiff kd-trees for data condensation. Pattern Recogn Lett 27(3):187–200CrossRef Narayan BL, Murthy CA, Pal SK (2006) Maxdiff kd-trees for data condensation. Pattern Recogn Lett 27(3):187–200CrossRef
27.
Zurück zum Zitat Fu AW, Chan PM, Cheung YL et al (2000) Dynamic vp-tree indexing for n-nearest neighbor search given pair-wise distances. VLDB J 9(2):154–173CrossRef Fu AW, Chan PM, Cheung YL et al (2000) Dynamic vp-tree indexing for n-nearest neighbor search given pair-wise distances. VLDB J 9(2):154–173CrossRef
28.
Zurück zum Zitat Ritter GL, Woodruff HB, Lowry SR et al (1975) Algorithm for a selective nearest neighbor decision rule. IEEE Trans Inf Theory 21(6):665–669CrossRefMATH Ritter GL, Woodruff HB, Lowry SR et al (1975) Algorithm for a selective nearest neighbor decision rule. IEEE Trans Inf Theory 21(6):665–669CrossRefMATH
29.
Zurück zum Zitat Gates GW (1972) Reduced nearest neighbor rule. IEEE Trans Inf Theory 18(3):431CrossRef Gates GW (1972) Reduced nearest neighbor rule. IEEE Trans Inf Theory 18(3):431CrossRef
30.
Zurück zum Zitat Manocha S, Girolami MA (2007) An empirical analysis of the probabilistic K-nearest neighbour classifier. Pattern Recogn Lett 28:1818–1824CrossRef Manocha S, Girolami MA (2007) An empirical analysis of the probabilistic K-nearest neighbour classifier. Pattern Recogn Lett 28:1818–1824CrossRef
31.
Zurück zum Zitat Dubois D, Prade H (1990) Rough fuzzy-sets and fuzzy rough sets. Int J Gen Syst 17(2–3):191–209CrossRefMATH Dubois D, Prade H (1990) Rough fuzzy-sets and fuzzy rough sets. Int J Gen Syst 17(2–3):191–209CrossRefMATH
33.
Zurück zum Zitat Mitra S, Banka H, Pedrycz W (2006) Rough-fuzzy collaborative clustering. IEEE Trans Syst Man Cybern Part B Cybern 36(4):795–805CrossRef Mitra S, Banka H, Pedrycz W (2006) Rough-fuzzy collaborative clustering. IEEE Trans Syst Man Cybern Part B Cybern 36(4):795–805CrossRef
34.
Zurück zum Zitat Jensen R, Shen Q (2004) Semantics-preserving dimensionality reduction: rough and fuzzy-rough-based approaches. IEEE Trans Knowl Data Eng 16(12):1457–1471CrossRef Jensen R, Shen Q (2004) Semantics-preserving dimensionality reduction: rough and fuzzy-rough-based approaches. IEEE Trans Knowl Data Eng 16(12):1457–1471CrossRef
35.
Zurück zum Zitat Bhatt RB, Gopal M (2005) On fuzzy-rough sets approach to feature selection. Pattern Recogn Lett 26(7):965–975CrossRef Bhatt RB, Gopal M (2005) On fuzzy-rough sets approach to feature selection. Pattern Recogn Lett 26(7):965–975CrossRef
36.
Zurück zum Zitat Li Y, Shiu SCK, Pal SK (2006) Combining feature reduction and case selection in building CBR classifiers. IEEE Trans Knowl Data Eng 18(3):415–429CrossRef Li Y, Shiu SCK, Pal SK (2006) Combining feature reduction and case selection in building CBR classifiers. IEEE Trans Knowl Data Eng 18(3):415–429CrossRef
37.
Zurück zum Zitat Hu QH, Xie ZX, Yu DR (2007) Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation. Pattern Recogn 40(12):3509–3521CrossRefMATH Hu QH, Xie ZX, Yu DR (2007) Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation. Pattern Recogn 40(12):3509–3521CrossRefMATH
38.
Zurück zum Zitat Jensen R, Shen Q (2007) Fuzzy-rough sets assisted attribute selection. IEEE Trans Fuzzy Syst 15(1):73–89CrossRef Jensen R, Shen Q (2007) Fuzzy-rough sets assisted attribute selection. IEEE Trans Fuzzy Syst 15(1):73–89CrossRef
39.
Zurück zum Zitat Shen Q, Chouchoulas A (1999) Combining rough sets and data-driven fuzzy learning for generation of classification rules. Pattern Recogn 32(12):2073–2076CrossRef Shen Q, Chouchoulas A (1999) Combining rough sets and data-driven fuzzy learning for generation of classification rules. Pattern Recogn 32(12):2073–2076CrossRef
40.
Zurück zum Zitat Hong TP, Wang TT, Wang SL et al (2000) Learning a coverage set of maximally general fuzzy rules by rough sets. Expert Syst Appl 19(2):97–103CrossRef Hong TP, Wang TT, Wang SL et al (2000) Learning a coverage set of maximally general fuzzy rules by rough sets. Expert Syst Appl 19(2):97–103CrossRef
41.
Zurück zum Zitat Shen Q, Chouchoulas A (2002) A rough-fuzzy approach for generating classification rules. Pattern Recogn 35(11):2425–2438CrossRefMATH Shen Q, Chouchoulas A (2002) A rough-fuzzy approach for generating classification rules. Pattern Recogn 35(11):2425–2438CrossRefMATH
42.
Zurück zum Zitat Pal SK, Mitra P (2004) Case generation using rough sets with fuzzy representation. IEEE Trans Knowl Data Eng 16(3):292–300CrossRef Pal SK, Mitra P (2004) Case generation using rough sets with fuzzy representation. IEEE Trans Knowl Data Eng 16(3):292–300CrossRef
44.
Zurück zum Zitat Zadeh LA (1996) Fuzzy logic = computing with words. IEEE Trans Fuzzy Syst 4:103–111CrossRef Zadeh LA (1996) Fuzzy logic = computing with words. IEEE Trans Fuzzy Syst 4:103–111CrossRef
46.
47.
Zurück zum Zitat Yeung DS, Chen D-G, Tsang ECC, Lee JWT, Wang X-Z (2005) On the generalization of fuzzy rough sets. IEEE Trans Fuzzy Syst 13(3):343–361CrossRef Yeung DS, Chen D-G, Tsang ECC, Lee JWT, Wang X-Z (2005) On the generalization of fuzzy rough sets. IEEE Trans Fuzzy Syst 13(3):343–361CrossRef
48.
Zurück zum Zitat Moser B (2006) On representing and generating kernels by fuzzy equivalence relations. J Mach Learn Res 7:2603–2620MathSciNetMATH Moser B (2006) On representing and generating kernels by fuzzy equivalence relations. J Mach Learn Res 7:2603–2620MathSciNetMATH
49.
Zurück zum Zitat Babich GA, Camps OI (1996) Weighted Parzen windows for pattern classification. IEEE Trans Pattern Anal Mach Intell 18:567–570CrossRef Babich GA, Camps OI (1996) Weighted Parzen windows for pattern classification. IEEE Trans Pattern Anal Mach Intell 18:567–570CrossRef
50.
Zurück zum Zitat Hu QH, Zhang L, Chen DG, Pedrycz W, Yu DR (2010) Gaussian kernel based fuzzy rough sets: model, uncertainty and applications. Int J Approx Reason 51(4):453–471CrossRefMATH Hu QH, Zhang L, Chen DG, Pedrycz W, Yu DR (2010) Gaussian kernel based fuzzy rough sets: model, uncertainty and applications. Int J Approx Reason 51(4):453–471CrossRefMATH
51.
Zurück zum Zitat Derrac J, Cornelis C, García S, Herrera F (2012) Enhancing evolutionary instance selection algorithms by means of fuzzy rough set based feature selection. Inf Sci 186(1):73–92CrossRef Derrac J, Cornelis C, García S, Herrera F (2012) Enhancing evolutionary instance selection algorithms by means of fuzzy rough set based feature selection. Inf Sci 186(1):73–92CrossRef
52.
Zurück zum Zitat Jensen R, Cornelis C (2010) Fuzzy-rough instance selection. Fuzzy systems (FUZZ), 2010 IEEE International Conference, pp 1–7 Jensen R, Cornelis C (2010) Fuzzy-rough instance selection. Fuzzy systems (FUZZ), 2010 IEEE International Conference, pp 1–7
53.
Zurück zum Zitat Kang X-M, Liu X-P, Zhai J-H, Zhai M-Y (2011) Instances selection for NN with fuzzy rough technique. In: 2011 International Conference on machine learning and cybernetics, vol 3, pp 1097, 1100 Kang X-M, Liu X-P, Zhai J-H, Zhai M-Y (2011) Instances selection for NN with fuzzy rough technique. In: 2011 International Conference on machine learning and cybernetics, vol 3, pp 1097, 1100
54.
Zurück zum Zitat Verbiest N, Cornelis C, Herrera F (2013) fRPS: a fuzzy rough prototype selection method. Pattern Recogn 46(10):2770–2782CrossRefMATH Verbiest N, Cornelis C, Herrera F (2013) fRPS: a fuzzy rough prototype selection method. Pattern Recogn 46(10):2770–2782CrossRefMATH
55.
Zurück zum Zitat Hu QH, Yu DR, Pedrycz W, Chen DG (2011) Kernelized fuzzy rough sets and their applications. IEEE Trans Knowl Data Eng 23(11):1649–1667CrossRef Hu QH, Yu DR, Pedrycz W, Chen DG (2011) Kernelized fuzzy rough sets and their applications. IEEE Trans Knowl Data Eng 23(11):1649–1667CrossRef
56.
Zurück zum Zitat Tomašev N, Radovanović M, Mladenić D, Ivanović M (2012) Hubness-based fuzzy measures for high-dimensional k-nearest neighbor classification. Int J Mach Learn Cybernet. doi:10.1007/s13042-012-0137-1 Tomašev N, Radovanović M, Mladenić D, Ivanović M (2012) Hubness-based fuzzy measures for high-dimensional k-nearest neighbor classification. Int J Mach Learn Cybernet. doi:10.​1007/​s13042-012-0137-1
58.
59.
Metadaten
Titel
Feature and instance reduction for PNN classifiers based on fuzzy rough sets
verfasst von
Eric C. C. Tsang
Qinghua Hu
Degang Chen
Publikationsdatum
01.02.2016
Verlag
Springer Berlin Heidelberg
Erschienen in
International Journal of Machine Learning and Cybernetics / Ausgabe 1/2016
Print ISSN: 1868-8071
Elektronische ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-014-0232-6

Weitere Artikel der Ausgabe 1/2016

International Journal of Machine Learning and Cybernetics 1/2016 Zur Ausgabe

Neuer Inhalt