2015 | OriginalPaper | Book chapter

Consensus Decision Making in Random Forests

Authors: Raja Khurram Shahzad, Mehwish Fatima, Niklas Lavesson, Martin Boldt

Published in: Machine Learning, Optimization, and Big Data

Publisher: Springer International Publishing

Abstract

The applications of Random Forests, an ensemble learner, have been investigated in different domains, including malware classification. Random Forests uses the majority rule to decide its outcome; however, a decision made by majority rule faces several challenges, for example, it may not be representative of, or supported by, all trees in the forest. To address such problems and increase the accuracy of decisions, a consensus decision making (CDM) approach is suggested, and the decision mechanism of Random Forests is replaced with CDM. The updated Random Forests algorithm is evaluated mainly on malware data sets, and the results are compared with those of the unmodified Random Forests. The empirical results suggest that the proposed Random Forests, i.e., with CDM, performs better than the original Random Forests.
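
The limitation of the majority rule mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' CDM procedure, which this page does not describe; it only contrasts a plain majority vote with an averaged-probability aggregation, used here as a stand-in for any rule that takes every tree's opinion into account. The function names, the five hypothetical trees, and their probabilities are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label predicted by the largest number of trees."""
    return Counter(votes).most_common(1)[0][0]


def support(votes, label):
    """Fraction of trees that voted for the given label."""
    return votes.count(label) / len(votes)


def mean_probability_vote(tree_probs):
    """Average the per-class probabilities of all trees and pick the top class.

    Illustrative alternative to majority voting; NOT the paper's CDM.
    """
    classes = tree_probs[0].keys()
    avg = {c: sum(p[c] for p in tree_probs) / len(tree_probs) for c in classes}
    return max(avg, key=avg.get), avg


# Hypothetical forest of five trees: three lean weakly towards "malware",
# two are strongly convinced the file is "benign".
tree_probs = [
    {"malware": 0.55, "benign": 0.45},
    {"malware": 0.55, "benign": 0.45},
    {"malware": 0.55, "benign": 0.45},
    {"malware": 0.05, "benign": 0.95},
    {"malware": 0.05, "benign": 0.95},
]

# Each tree's hard vote is its most probable class.
votes = [max(p, key=p.get) for p in tree_probs]

majority_label = majority_vote(votes)
averaged_label, averaged_probs = mean_probability_vote(tree_probs)

print("majority vote:", majority_label, "supported by", support(votes, majority_label))
print("averaged vote:", averaged_label, averaged_probs)
```

In this constructed case the majority label is backed by only three of the five trees, and only weakly, while an aggregation that weighs every tree's confidence reaches the opposite decision. Whether the paper's CDM behaves this way on real data is not something this sketch can show.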

Metadata
Title
Consensus Decision Making in Random Forests
Authors
Raja Khurram Shahzad
Mehwish Fatima
Niklas Lavesson
Martin Boldt
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-27926-8_31