
01.06.2014 | Regular Paper

Accuracy–diversity based pruning of classifier ensembles

Authors: Vasudha Bhatnagar, Manju Bhardwaj, Shivam Sharma, Sufyan Haroon

Published in: Progress in Artificial Intelligence | Issue 2-3/2014


Abstract

Classification ensemble methods have recently drawn serious attention due to their ability to appreciably improve prediction performance. Since smaller ensembles are preferred for reasons of storage and efficiency, ensemble pruning is an important step in the construction of classifier ensembles. In this paper, we propose a heuristic method to obtain an optimal ensemble from a given pool of classifiers. The proposed accuracy–diversity based pruning algorithm takes into account the accuracy of individual classifiers as well as the pairwise diversity amongst them. The algorithm performs a systematic bottom-up search and conditionally grows sub-ensembles by adding diverse pairs of classifiers to the candidates with relatively higher accuracies. The ultimate aim is to deliver the smallest ensemble with the highest achievable accuracy in the pool. A performance study on UCI datasets demonstrates that the proposed algorithm rarely misses the optimal ensemble, establishing confidence in the quality of the heuristics it employs.
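
The abstract describes the pruning strategy only at a high level, so the minimal Python sketch below illustrates that general idea rather than the authors' exact ADP algorithm: the pairwise-disagreement diversity measure, the majority-vote combiner, the seeding from the most diverse pair, and the rule of keeping an added classifier only if validation accuracy does not drop are all assumptions made for this example, and names such as `adp_like_prune` are hypothetical.

```python
import numpy as np
from itertools import combinations

def pairwise_disagreement(pred_a, pred_b):
    """Fraction of validation instances on which two classifiers disagree."""
    return np.mean(pred_a != pred_b)

def ensemble_accuracy(preds, y):
    """Majority-vote accuracy; `preds` has shape (n_classifiers, n_samples)."""
    # Majority vote per instance (ties broken toward the smaller class label).
    votes = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)
    return np.mean(votes == y)

def adp_like_prune(preds, y, max_size=None):
    """
    Greedy, bottom-up accuracy-diversity pruning (illustrative sketch only).
    Seeds the sub-ensemble with the most diverse pair (ties broken by combined
    accuracy) and grows it as long as majority-vote accuracy does not drop.
    """
    n = preds.shape[0]
    acc = np.array([np.mean(preds[i] == y) for i in range(n)])
    # Rank all classifier pairs: primarily by diversity, secondarily by accuracy.
    pairs = sorted(
        combinations(range(n), 2),
        key=lambda p: (-pairwise_disagreement(preds[p[0]], preds[p[1]]),
                       -(acc[p[0]] + acc[p[1]])),
    )
    selected = list(pairs[0])
    best_acc = ensemble_accuracy(preds[selected], y)
    remaining = [i for i in range(n) if i not in selected]
    # Conditionally add classifiers in decreasing order of individual accuracy.
    for i in sorted(remaining, key=lambda i: -acc[i]):
        if max_size is not None and len(selected) >= max_size:
            break
        cand_acc = ensemble_accuracy(preds[selected + [i]], y)
        if cand_acc >= best_acc:   # keep the addition only if it does not hurt
            selected.append(i)
            best_acc = cand_acc
    return selected, best_acc

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=200)          # binary validation labels
    # Simulated pool of 10 classifiers with roughly 70-85% individual accuracy.
    noise = rng.random((10, 200)) < rng.uniform(0.15, 0.30, size=(10, 1))
    preds = np.where(noise, 1 - y, y)
    subset, acc = adp_like_prune(preds, y)
    print("selected classifiers:", subset, "ensemble accuracy: %.3f" % acc)
```

Running the script on the simulated pool prints the indices of the retained classifiers and the majority-vote accuracy of the pruned sub-ensemble.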


Footnotes
1
Please note that this does not reduce the complexity of the problem.
 
2
Recall that the choice of base classifier is incidental, since ADP is independent of the method of ensemble generation.
 
Metadata
Title
Accuracy–diversity based pruning of classifier ensembles
Authors
Vasudha Bhatnagar
Manju Bhardwaj
Shivam Sharma
Sufyan Haroon
Publication date
01.06.2014
Publisher
Springer Berlin Heidelberg
Published in
Progress in Artificial Intelligence / Issue 2-3/2014
Print ISSN: 2192-6352
Electronic ISSN: 2192-6360
DOI
https://doi.org/10.1007/s13748-014-0042-9
