Skip to main content
Erschienen in: Soft Computing 23/2017

09.07.2016 | Methodologies and Application

A PSO algorithm for multi-objective cost-sensitive attribute reduction on numeric data with error ranges

verfasst von: Yu Fang, Zhong-Hui Liu, Fan Min

Erschienen in: Soft Computing | Ausgabe 23/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Multi-objective cost-sensitive attribute reduction is an attractive problem in supervised machine learning. Most research has focused on single-objective minimal test cost reduction or dealt with symbolic data. In this paper, we propose a particle swarm optimization algorithm for the attribute reduction problem on numeric data with multiple costs and error ranges and use three metrics with which to evaluate the performance of the algorithm. The proposed algorithm benefits from a fitness function based on the positive region, the selected n types of the test cost, a set of constant weight values \(w_{i}^k\), and a designated non-positive exponent \(\lambda \). We design a learning strategy by setting dominance principles, which ensures the preservation of Pareto-optimal solutions and the rejection of redundant solutions. With different parameter settings, our PSO algorithm searches for a sub-optimal reduct set. Finally, we test our algorithm on seven UCI (University of California, Irvine) datasets. Comparisons with alternative approaches including the \(\lambda \)-weighted method and exhaustive calculation method of reduction are analyzed. Experimental results indicate that our heuristic algorithm outperforms existing algorithms.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Berry MJ, Linoff G (1997) Data mining techniques: for marketing, sales and customer support. Wiley, Hoboken Berry MJ, Linoff G (1997) Data mining techniques: for marketing, sales and customer support. Wiley, Hoboken
Zurück zum Zitat Bishop CM, Nasrabadi NM (2006) Pattern recognition and machine learning, vol 1. Springer, New YorkMATH Bishop CM, Nasrabadi NM (2006) Pattern recognition and machine learning, vol 1. Springer, New YorkMATH
Zurück zum Zitat Bradford J, Kunz C, Kohavi R (1998) Pruning decision trees with misclassification costs. In: Proceedings of the 10th European conference on machine learning, Berlin, pp 131–136 Bradford J, Kunz C, Kohavi R (1998) Pruning decision trees with misclassification costs. In: Proceedings of the 10th European conference on machine learning, Berlin, pp 131–136
Zurück zum Zitat Chen YM, Miao DQ, Wang RZ (2010) A rough set approach to feature selection based on ant colony optimization. Pattern Recogn Lett 31:226–233CrossRef Chen YM, Miao DQ, Wang RZ (2010) A rough set approach to feature selection based on ant colony optimization. Pattern Recogn Lett 31:226–233CrossRef
Zurück zum Zitat Drummond C, Holte R (2000) Exploiting the cost (in)sensitivity of decision tree splitting criteria. In: The 17th international conference on machine learning. Morgan Kaufmann, pp 239–246 Drummond C, Holte R (2000) Exploiting the cost (in)sensitivity of decision tree splitting criteria. In: The 17th international conference on machine learning. Morgan Kaufmann, pp 239–246
Zurück zum Zitat Du Y, Hu Q, Zhu P, Ma P (2011) Rule learning for classification based on neighborhood covering reduction. Inf Sci 181(24):5457–5467 Du Y, Hu Q, Zhu P, Ma P (2011) Rule learning for classification based on neighborhood covering reduction. Inf Sci 181(24):5457–5467
Zurück zum Zitat Fan W, Stolfo S, Zhang J, Chan P (1999) Adacost: misclassification cost-sensitive boosting, In: The 16th international conference on machine learning, Bled Slovenia, pp 97–105 Fan W, Stolfo S, Zhang J, Chan P (1999) Adacost: misclassification cost-sensitive boosting, In: The 16th international conference on machine learning, Bled Slovenia, pp 97–105
Zurück zum Zitat Fumera G, Roli F (2005) A theoretical and experimental analysis of linear combiners for multiple classifier systems. IEEE Trans Pattern Anal Mach Intell 27:942–956CrossRef Fumera G, Roli F (2005) A theoretical and experimental analysis of linear combiners for multiple classifier systems. IEEE Trans Pattern Anal Mach Intell 27:942–956CrossRef
Zurück zum Zitat Greiner R, Grove A, Roth D (2002) Learning cost-sensitive active classifiers. Artif Intell J 139(2):137–174CrossRefMathSciNet Greiner R, Grove A, Roth D (2002) Learning cost-sensitive active classifiers. Artif Intell J 139(2):137–174CrossRefMathSciNet
Zurück zum Zitat Han J, Kamber M (2001) Data mining: concepts and techniques. China Machine Press, BeijingMATH Han J, Kamber M (2001) Data mining: concepts and techniques. China Machine Press, BeijingMATH
Zurück zum Zitat Hu Q, Yu D, Liu J, Wu C (2008) Neighborhood rough set based heterogeneous feature subset selection. Inf Sci 178(18):3577–3594CrossRefMATHMathSciNet Hu Q, Yu D, Liu J, Wu C (2008) Neighborhood rough set based heterogeneous feature subset selection. Inf Sci 178(18):3577–3594CrossRefMATHMathSciNet
Zurück zum Zitat John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the 11th conference on uncertainty in artificial intelligence. Morgan Kaufmann, San Mateo, pp 338–345 John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the 11th conference on uncertainty in artificial intelligence. Morgan Kaufmann, San Mateo, pp 338–345
Zurück zum Zitat Knoll U, Nakhaeizadeh G, Tausend B (1994) Cost-sensitive pruning of decision trees. In: Proceedings of the 8th European conference on machine learning. Catania, pp 383–386 Knoll U, Nakhaeizadeh G, Tausend B (1994) Cost-sensitive pruning of decision trees. In: Proceedings of the 8th European conference on machine learning. Catania, pp 383–386
Zurück zum Zitat Kukar M, Kononenko I (1998) Cost-sensitive learning with neural networks. In: Proceedings of the 13th European conference on artificial intelligence, pp 445–449 Kukar M, Kononenko I (1998) Cost-sensitive learning with neural networks. In: Proceedings of the 13th European conference on artificial intelligence, pp 445–449
Zurück zum Zitat Ledley RS, Lusted LB (1959) Reasoning foundations of medical diagnosis. Science 130(3366):9–21CrossRefMATH Ledley RS, Lusted LB (1959) Reasoning foundations of medical diagnosis. Science 130(3366):9–21CrossRefMATH
Zurück zum Zitat Li H-X, Zhou X-Z (2011) Risk decision making based on decision-theoretic rough set: a three-way view decision model. Int J Comput Intell Syst 4(1):1–11CrossRefMathSciNet Li H-X, Zhou X-Z (2011) Risk decision making based on decision-theoretic rough set: a three-way view decision model. Int J Comput Intell Syst 4(1):1–11CrossRefMathSciNet
Zurück zum Zitat Li L, Chen H, Zhu W (2014a) Attribute reduction in time-cost-sensitive decision systems through backtracking. J Info Comput Sci 11(2):597–606. doi:10.12733/jics20102790 Li L, Chen H, Zhu W (2014a) Attribute reduction in time-cost-sensitive decision systems through backtracking. J Info Comput Sci 11(2):597–606. doi:10.​12733/​jics20102790
Zurück zum Zitat Li J, Zhao H, Zhu W (2014b) Fast randomized algorithm with restart strategy for minimal test cost feature selection. Int J Mach Learn Cybernet 5(3):234–556 Li J, Zhao H, Zhu W (2014b) Fast randomized algorithm with restart strategy for minimal test cost feature selection. Int J Mach Learn Cybernet 5(3):234–556
Zurück zum Zitat Liu J, Liao S, Min F, Zhu W (2012a) An improved genetic algorithm to minimal test cost reduction. In: Lin TY, Hu X, Wu Z, Chen ALP, Broder AZ, Ho H, Wang S (eds) IEEE international conference on granular computing (GrC), Washington, DC, pp 304–309 Liu J, Liao S, Min F, Zhu W (2012a) An improved genetic algorithm to minimal test cost reduction. In: Lin TY, Hu X, Wu Z, Chen ALP, Broder AZ, Ho H, Wang S (eds) IEEE international conference on granular computing (GrC), Washington, DC, pp 304–309
Zurück zum Zitat Liu J, Min F, Liao S, Zhu W (2012b) Minimal test cost feature selection with positive region constraint. In: Rough sets and current trends in computing, pp 259–266 Liu J, Min F, Liao S, Zhu W (2012b) Minimal test cost feature selection with positive region constraint. In: Rough sets and current trends in computing, pp 259–266
Zurück zum Zitat Liu J, Liao S, Min F, Zhu W (2013) Test cost constraint attribute reduction through a genetic approach. J Inf Comput Sci 10(3):839–849 Liu J, Liao S, Min F, Zhu W (2013) Test cost constraint attribute reduction through a genetic approach. J Inf Comput Sci 10(3):839–849
Zurück zum Zitat Liu D, Li T-R, Liang D-C (2014) Incorporating logistic regression to decision-theoretic rough sets for classifications. Int J Approximate Reasoning 55(1):197–210CrossRefMATHMathSciNet Liu D, Li T-R, Liang D-C (2014) Incorporating logistic regression to decision-theoretic rough sets for classifications. Int J Approximate Reasoning 55(1):197–210CrossRefMATHMathSciNet
Zurück zum Zitat Min F, Zhu W (2011) Minimal cost attribute reduction through backtracking. In: Tai-hoon K, Hojjat A, Alfredo C, Tughrul A, Yanchun Z, Jianhua M, Kyo-il C, Siti M, Xiaofeng S (eds) FGIT-database theory and application/bio-science and bio-technology, Springer, Berlin, Heidelberg, pp 100–107 Min F, Zhu W (2011) Minimal cost attribute reduction through backtracking. In: Tai-hoon K, Hojjat A, Alfredo C, Tughrul A, Yanchun Z, Jianhua M, Kyo-il C, Siti M, Xiaofeng S (eds) FGIT-database theory and application/bio-science and bio-technology, Springer, Berlin, Heidelberg, pp 100–107
Zurück zum Zitat Min F, He H, Qiao Y, Zhu W (2011) Test-cost-sensitive attribute reduction. Inf Sci 181:4928–4942CrossRef Min F, He H, Qiao Y, Zhu W (2011) Test-cost-sensitive attribute reduction. Inf Sci 181:4928–4942CrossRef
Zurück zum Zitat Nunez M (1991) The use of background knowledge in decision tree induction. Mach Learn 6(3):231–250 Nunez M (1991) The use of background knowledge in decision tree induction. Mach Learn 6(3):231–250
Zurück zum Zitat Pan G, Min F, Zhu W (2011) A genetic algorithm to the minimal test cost reduct problem. In: IEEE international conference on granular computing (GrC), pp 539–544 Pan G, Min F, Zhu W (2011) A genetic algorithm to the minimal test cost reduct problem. In: IEEE international conference on granular computing (GrC), pp 539–544
Zurück zum Zitat Pawlak Z (2002) Rough set theory and its applications. J Telecommun Inf Technol 3:7–10 Pawlak Z (2002) Rough set theory and its applications. J Telecommun Inf Technol 3:7–10
Zurück zum Zitat Quinlan J (1993) C4.5: programs for machine learning. Morgan Kaufmann, Burlington Quinlan J (1993) C4.5: programs for machine learning. Morgan Kaufmann, Burlington
Zurück zum Zitat Salvatore G, Benedetto M, Roman S (2001) Rough sets theory for multicriteria decision analysis. Eur J Oper Res 129(1):1–47CrossRefMATH Salvatore G, Benedetto M, Roman S (2001) Rough sets theory for multicriteria decision analysis. Eur J Oper Res 129(1):1–47CrossRefMATH
Zurück zum Zitat Sun L, Xu J, Tian Y (2012) Feature selection using rough entropy-based uncertainty measures in incomplete decision systems. Knowl Based Syst 36:206–216CrossRef Sun L, Xu J, Tian Y (2012) Feature selection using rough entropy-based uncertainty measures in incomplete decision systems. Knowl Based Syst 36:206–216CrossRef
Zurück zum Zitat Turney P (1995) Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm. J Artif Intell Res 2:369–409 Turney P (1995) Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm. J Artif Intell Res 2:369–409
Zurück zum Zitat Turney P (2000) Types of cost in inductive concept learning. In: Workshop on cost-sensitive learning at the 17th international conference on machine learning, Stanford University, California Turney P (2000) Types of cost in inductive concept learning. In: Workshop on cost-sensitive learning at the 17th international conference on machine learning, Stanford University, California
Zurück zum Zitat Wang G, Du H, Yang D (2002) Reduction of decision table based on condition information entropy. Chin J Comput 25(7):759–766 Wang G, Du H, Yang D (2002) Reduction of decision table based on condition information entropy. Chin J Comput 25(7):759–766
Zurück zum Zitat Xu B, Chen H, Zhu W (2013) Multi-objective cost-sensitive attribute reduction. In: Proceedings of the 2013 joint IFSA world congress and NAFIPS annual meeting, Canada, pp 1377–1381 Xu B, Chen H, Zhu W (2013) Multi-objective cost-sensitive attribute reduction. In: Proceedings of the 2013 joint IFSA world congress and NAFIPS annual meeting, Canada, pp 1377–1381
Zurück zum Zitat Xu Z, Zhao H, Min F, Zhu W (2013) Ant colony optimization with three stages for independent test cost attribute reduction. Math Probl Eng. doi:10.1155/2013/510167 Xu Z, Zhao H, Min F, Zhu W (2013) Ant colony optimization with three stages for independent test cost attribute reduction. Math Probl Eng. doi:10.​1155/​2013/​510167
Zurück zum Zitat Xu B, Min F, Zhu W, Chen H (2014) A genetic algorithm to multi-objective cost-sensitive attribute reduction. J Comput Inf Syst 10(7):3011–3022 Xu B, Min F, Zhu W, Chen H (2014) A genetic algorithm to multi-objective cost-sensitive attribute reduction. J Comput Inf Syst 10(7):3011–3022
Zurück zum Zitat Xu J, Chen L, Min F (2014) Minimal test cost reduction through a randomized heuristic algorithm. J Inf Comput Sci 11(13):4555–4565CrossRef Xu J, Chen L, Min F (2014) Minimal test cost reduction through a randomized heuristic algorithm. J Inf Comput Sci 11(13):4555–4565CrossRef
Zurück zum Zitat Yang Q, Wu X (2006) 10 Challenging problems in data mining research. Int J Inf Technol Decis Mak 5(04):597–604CrossRef Yang Q, Wu X (2006) 10 Challenging problems in data mining research. Int J Inf Technol Decis Mak 5(04):597–604CrossRef
Zurück zum Zitat Yang X-B, Qi Y-S, Song X-N, Yang J-Y (2013) Test cost sensitive multigranulation rough set: model and minimal cost selection. Inf Sci 250:184–199CrossRefMATHMathSciNet Yang X-B, Qi Y-S, Song X-N, Yang J-Y (2013) Test cost sensitive multigranulation rough set: model and minimal cost selection. Inf Sci 250:184–199CrossRefMATHMathSciNet
Zurück zum Zitat Yao YY (2004) A partition model of granular computing. Lect Notes Comput Sci 3100:232–253CrossRefMATH Yao YY (2004) A partition model of granular computing. Lect Notes Comput Sci 3100:232–253CrossRefMATH
Zurück zum Zitat Yao YY, Wong S (1992) A decision theoretic framework for approximating concepts. Int J Man Mach Stud 37(6):793–809CrossRef Yao YY, Wong S (1992) A decision theoretic framework for approximating concepts. Int J Man Mach Stud 37(6):793–809CrossRef
Zurück zum Zitat Yao YY, Wong S, Lingras P (1990) A decision-theoretic rough set model. In: The 5th international symposium on methodologies for intelligent systems, pp 17–24 Yao YY, Wong S, Lingras P (1990) A decision-theoretic rough set model. In: The 5th international symposium on methodologies for intelligent systems, pp 17–24
Zurück zum Zitat Yao YY, Zhao Y, Wang J (2006) On reduct construction algorithms. In: Gavrilova ML, Tan CJK, Wang Y, Yao Y, Wang G (eds) Proceedings of rough set and knowledge technology, vol 4062, Springer, Berlin, Heidelberg, pp 297–304 Yao YY, Zhao Y, Wang J (2006) On reduct construction algorithms. In: Gavrilova ML, Tan CJK, Wang Y, Yao Y, Wang G (eds) Proceedings of rough set and knowledge technology, vol 4062, Springer, Berlin, Heidelberg, pp 297–304
Zurück zum Zitat Zhao H, Min F, Zhu W (2013) A backtracking approach to minimal cost feature selection of numerical data. J Inf Comput Sci 10(13):4105–4115CrossRef Zhao H, Min F, Zhu W (2013) A backtracking approach to minimal cost feature selection of numerical data. J Inf Comput Sci 10(13):4105–4115CrossRef
Zurück zum Zitat Zhao H, Min F, Zhu W (2013) Cost-sensitive feature selection of numeric data with measurement errors. J Appl Math 2013:1–13MATH Zhao H, Min F, Zhu W (2013) Cost-sensitive feature selection of numeric data with measurement errors. J Appl Math 2013:1–13MATH
Zurück zum Zitat Zubek V, Dietterich T (2002) Pruning improves heuristic search for cost-sensitive learning. In: Proceedings of the 19th international conference on machine learning. Sydney, pp 27–34 Zubek V, Dietterich T (2002) Pruning improves heuristic search for cost-sensitive learning. In: Proceedings of the 19th international conference on machine learning. Sydney, pp 27–34
Metadaten
Titel
A PSO algorithm for multi-objective cost-sensitive attribute reduction on numeric data with error ranges
verfasst von
Yu Fang
Zhong-Hui Liu
Fan Min
Publikationsdatum
09.07.2016
Verlag
Springer Berlin Heidelberg
Erschienen in
Soft Computing / Ausgabe 23/2017
Print ISSN: 1432-7643
Elektronische ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-016-2260-5

Weitere Artikel der Ausgabe 23/2017

Soft Computing 23/2017 Zur Ausgabe