
2019 | OriginalPaper | Chapter

Investigating the Effective Use of Machine Learning Algorithms in Network Intruder Detection Systems

Authors: Intisar S. Al-Mandhari, L. Guan, E. A. Edirisinghe

Published in: Advances in Information and Communication Networks

Publisher: Springer International Publishing

Abstract

Research into the use of machine learning techniques for network intrusion detection, particularly research based on the popular public dataset KDD cup 99, has become commonplace during the past decade. The recent popularity of cloud-based computing and the realization of the associated risks are the main reasons for this research thrust. This research demonstrates that machine learning algorithms can be used effectively to enhance the performance of existing intrusion detection systems, despite the high misclassification rates reported in the literature. The paper reports on an empirical investigation into the underlying causes of the poor performance of some well-known machine learning classifiers, especially when learning from minor classes (rare attacks). The main factor is that the KDD cup 99 dataset, which is used in most existing research, is imbalanced owing to the nature of the intrusion detection domain: some attacks are rare and others are very frequent. As a result, there is a significant imbalance among the classes in the dataset. Depending on the number of classes in the dataset, the imbalance issue can be treated as a binary problem or a multi-class problem. Most researchers conduct binary classification because multi-class classification is more complex. In this paper, we treat the problem as a multi-class classification task. The paper investigates the use of different machine learning algorithms to overcome the common misclassification problems faced by researchers who have used the imbalanced KDD cup 99 dataset in their investigations. Recommendations are made as to which classifier is best for the classification of imbalanced data.
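
The effect of class imbalance described in the abstract can be illustrated with a minimal sketch. It is not the authors' method or data: the label counts below are made up for illustration (they are not real KDD cup 99 counts), and the helper names `per_class_recall` and `accuracy` are hypothetical. The sketch shows how a multi-class classifier that never predicts a rare attack class can still report high overall accuracy, which is why per-class measures matter for minor classes.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Toy, made-up label distribution (assumption, NOT real KDD cup 99 counts):
# one very frequent class, one moderate class, one rare attack class.
y_true = ["dos"] * 900 + ["normal"] * 90 + ["u2r"] * 10

# Hypothetical classifier output that never predicts the rare class.
y_pred = ["dos"] * 900 + ["normal"] * 90 + ["normal"] * 10

overall = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)
# Macro-averaged recall weights every class equally, regardless of its size.
macro_recall = sum(recalls.values()) / len(recalls)

print(f"overall accuracy: {overall:.3f}")
for label, value in sorted(recalls.items()):
    print(f"recall[{label}]: {value:.3f}")
print(f"macro recall: {macro_recall:.3f}")
```

In this toy setting the overall accuracy stays high while the recall of the rare class is zero, so a macro-averaged or per-class measure exposes the misclassification that accuracy alone hides.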

Metadata
Title
Investigating the Effective Use of Machine Learning Algorithms in Network Intruder Detection Systems
Authors
Intisar S. Al-Mandhari
L. Guan
E. A. Edirisinghe
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-03405-4_10