nach oben

International Journal of Machine Learning and Cybernetics

Erschienen in:

05.01.2023 | Original Article

Traffic data extraction and labeling for machine learning based attack detection in IoT networks

verfasst von: Hayelom Gebrye, Yong Wang, Fagen Li

Erschienen in: International Journal of Machine Learning and Cybernetics | Ausgabe 7/2023

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

The fast expansion of the Internet of Things (IoT) networks raises the possibility of further network threats. In today’s world, network traffic analysis has become an increasingly critical and useful tool for monitoring network traffic in general and analyzing attack patterns in particular. A few years ago, distributed denial-of-service attacks on IoT networks were considered the most pressing problem that needed to be addressed. The absence of high-quality datasets is one of the main obstacles to applying DDOS detection systems based on machine learning. Researchers have developed numerous methods to extract and analyze information from recorded files. From a literature review, it is clear that most of these tools share similar drawbacks. In this study, we proposed an intelligent raw network data extractor and labeler tool by incorporating the limitations of the tools that are available to transform PCAP to CSV. To generate and process a high-quality DDOS attack dataset suitable for machine learning models, we employed several data preprocessing operations on the selected network intrusion dataset. To confirm the validity and acceptability of the dataset, we tested different models. Among the models tested, the random forest was the most accurate in detecting the DDOS attack.

Vorheriger Artikel ConvHiA: convolutional network with hierarchical attention for knowledge graph multi-hop reasoning

Nächster Artikel Improving boosting methods with a stable loss function handling outliers

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

ATZelectronics worldwide

ATZlectronics worldwide is up-to-speed on new trends and developments in automotive electronics on a scientific level with a high depth of information.

Order your 30-days-trial for free and without any commitment.

Jetzt informieren

ATZelektronik

Die Fachzeitschrift ATZelektronik bietet für Entwickler und Entscheider in der Automobil- und Zulieferindustrie qualitativ hochwertige und fundierte Informationen aus dem gesamten Spektrum der Pkw- und Nutzfahrzeug-Elektronik.

Lassen Sie sich jetzt unverbindlich 2 kostenlose Ausgabe zusenden.

Jetzt informieren

Roopak M, Tian GY, Chambers J (2019) Deep learning models for cyber security in iot networks. In: 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), pp. 0452–0457. IEEE

Iglesias F, Zseby T (2015) Analysis of network traffic features for anomaly detection. Mach Learn 101(1):59–84MathSciNetCrossRef

Frank Jr CV (2019) Mirai bot scanner summation prototype

Orebaugh A, Ramirez G, Beale J (2006) Wireshark & Ethereal Network Protocol Analyzer Toolkit, Elsevier

Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825–2830MathSciNetMATH

Wei T, Li X, Stojanovic V (2021) Input-to-state stability of impulsive reaction-diffusion neural networks with infinite distributed delays. Nonlinear Dyn 103(2):1733–1755CrossRef

Tao H, Li J, Chen Y, Stojanovic V, Yang H (2020) Robust point-to-point iterative learning control with trial-varying initial conditions. IET Control Theory Appl 14(19):3344–3350MathSciNetCrossRef

Xu Z, Li X, Stojanovic V (2021) Exponential stability of nonlinear state-dependent delayed impulsive systems with applications. Nonlinear Anal Hybrid Syst 42:101088MathSciNetCrossRefMATH

Gregg B (2004) Chaosreader. http://www.brendangregg.com/chaosreader.html. Accessed 23 Oct 2021

10.

Soderberg W (2010) Extracting Files from a Capture aka Intercepting Files. https://wh1sk3yj4ck.wordpress.com/2010/08/12/extracting-files-from-a-capturefile-aka-intercepting-files/. Accessed 23 Oct 2021

11.

B P, N H (2005) Tcpxtract Home Page. http://tcpxtract.sourceforge.net/. Accessed 23 Oct 2021

12.

Davidoff S, Ham J (2012) Network forensics: tracking hackers through cyberspace vol. 2014. Prentice hall Upper Saddle River

13.

Deck S, Khiabani H (2015) Extracting files from network packet captures. SANS Institute-InfoSec Reading Room

14.

Joshi M, Hadi TH (2015) A review of network traffic analysis and prediction techniques. arXiv preprint arXiv:1507.05722

15.

Alothman B (2019) Raw network traffic data preprocessing and preparation for automatic analysis. In: 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security), pp. 1–5. IEEE

16.

Draper-Gil G, Lashkari AH, Mamun MSI, Ghorbani AA (2016) Characterization of encrypted and vpn traffic using time-related. In: Proceedings of the 2nd International Conference on Information Systems Security and Privacy (ICISSP), pp. 407–414

17.

Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the kdd cup 99 data set. In: 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–6. Ieee

18.

Kayacık HG, Zincir-Heywood AN, Heywood MI Selecting features for intrusion detection: a feature relevance analysis on kdd 99 benchmark

19.

Flood R A data-driven toolset using containers to generate datasets for network intrusion detection

20.

Mukkavilli SK, Shetty S, Hong L et al (2016) Generation of labelled datasets to quantify the impact of security threats to cloud data centers. J Inf Secur 7(03):172

21.

Alzahrani S, Hong L (2018) Generation of ddos attack dataset for effective ids development and evaluation. J Inf Secur 9(4):225–241

22.

Hotelling H (1933) Analysis of a complex of statistical variables into principal components. J Educ Psychol 24(6):417CrossRefMATH

23.

Izenman A (2013) Linear discriminant analysis in modern multivariate statistical techniques: 237–280. Springer, New York

24.

Legendre P, De Cáceres M (2013) Beta diversity as the variance of community data: dissimilarity coefficients and partitioning. Ecol Lett 16(8):951–963CrossRef

25.

Wenskovitch J, Crandell I, Ramakrishnan N, House L, North C (2017) Towards a systematic combination of dimension reduction and clustering in visual analytics. IEEE Trans Visual Comput Graphics 24(1):131–141CrossRef

26.

Fujiwara T, Kwon O-H, Ma K-L (2019) Supporting analysis of dimensionality reduction results with contrastive learning. IEEE Trans Visual Comput Graphics 26(1):45–55CrossRef

27.

Kang H, Ahn DH, Lee GM, Yoo JD, Park KH, Kim HK (2019) IoT Network Intrusion Dataset. https://doi.org/10.21227/q70p-q449

28.

G., H.: Libpcap File Format (2015). https://wiki.wireshark.org/Development/LibpcapFileFormat. Accessed 25 Oct 2021

29.

Holland T (2004) Understanding ips and ids: Using ips and ids together for defense in depth. SANS Institute

30.

Heidemann J, Mirkovic J, Hardaker W, Kallitsis M (2021) Collecting, labeling, and using networking data: the intersection of ai and networking

31.

Fukuda K, Heidemann J, Qadeer A (2017) Detecting malicious activity with dns backscatter over time. IEEE/ACM Trans Netw 25(5):3203–3218CrossRef

32.

Sorzano COS, Vargas J, Montano AP (2014) A survey of dimensionality reduction techniques. arXiv preprint arXiv:1403.2877

33.

Csubák D, Szücs K, Vörös P, Kiss A (2016) Big data testbed for network attack detection. Acta Polytechnica Hungarica 13(2):47–57

34.

Cunningham RK, Lippmann RP, Fried DJ, Garfinkel SL, Graf I, Kendall KR, Webster SE, Wyschogrod D, Zissman MA (1999) Evaluating intrusion detection systems without attacking your friends: the 1998 Darpa intrusion detection evaluation. Technical report, Massachusetts Inst Of Tech Lexington Lincoln Lab

35.

Haines JW, Rossey LM, Lippmann RP, Cunningham RK (2001) Extending the darpa off-line intrusion detection evaluations. In: Proceedings DARPA Information Survivability Conference and Exposition II. DISCEX’01, vol. 1, pp. 35–45. IEEE

36.

Lee W, Stolfo SJ (2000) A framework for constructing features and models for intrusion detection systems. ACM Transact Inform Syst Secur (TiSSEC) 3(4):227–261CrossRef

37.

Sperotto A, Sadre R, Vliet Fv, Pras A (2009) A labeled data set for flow-based intrusion detection. In: International Workshop on IP Operations and Management, pp. 39–50. Springer

38.

Sangster B, O’Connor T, Cook T, Fanelli R, Dean E, Morrell C, Conti GJ (2009) Toward instrumenting network warfare competitions to generate labeled datasets. In: CSET

39.

Shiravi A, Shiravi H, Tavallaee M, Ghorbani AA (2012) Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput Secur 31(3):357–374CrossRef

40.

Moustafa N, Slay J (2015) Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In: 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6. IEEE

41.

Kolias C, Kambourakis G, Stavrou A, Gritzalis S (2015) Intrusion detection in 802.11 networks: empirical evaluation of threats and a public dataset. IEEE Communications Surveys & Tutorials 18(1), 184–208

42.

Sharafaldin I, Lashkari AH, Ghorbani AA (2018) Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp 1:108–116

43.

Lashkari AH, Draper-Gil G, Mamun MSI, Ghorbani AA, et al. (2017) Characterization of tor traffic using time based features. In: ICISSp, pp. 253–262

44.

Sharafaldin I, Lashkari AH, Hakak S, Ghorbani AA (2019) Developing realistic distributed denial of service (ddos) attack dataset and taxonomy. In: 2019 International Carnahan Conference on Security Technology (ICCST), pp. 1–8. IEEE

45.

Koroniotis N, Moustafa N, Sitnikova E, Turnbull B (2019) Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset. Futur Gener Comput Syst 100:779–796CrossRef

46.

Meidan Y, Bohadana M, Mathov Y, Mirsky Y, Shabtai A, Breitenbacher D, Elovici Y (2018) N-baiot–network-based detection of iot botnet attacks using deep autoencoders. IEEE Pervasive Comput 17(3):12–22CrossRef

47.

Ullah I, Mahmoud QH (2020) A scheme for generating a dataset for anomalous activity detection in iot networks. In: Canadian Conference on Artificial Intelligence, pp. 508–520. Springer

48.

Hayelom G (2022) Mirai Based DDOS Dataset. https://doi.org/10.17632/h38nhgcpgk.1

49.

Guo C, Berkhahn F (2016) Entity embeddings of categorical variables. arXiv preprint arXiv:1604.06737

50.

Haykin S, Network N (2004) A comprehensive foundation. Neural Netw 2(2004):41

51.

Xue F, Qu A (2017) Variable selection for highly correlated predictors. arXiv preprint arXiv:1709.04840

52.

Kotsiantis SB, Zaharakis I, Pintelas P et al (2007) Supervised machine learning: a review of classification techniques. Emerging artificial intelligence applications in computer engineering 160(1):3–24

53.

Friedman J, Hastie T, Tibshirani R, et al. (2001) The elements of statistical learning vol. 1. Springer series in statistics New York

54.

Hussain J, Lalmuanawma S (2016) Feature analysis, evaluation and comparisons of classification algorithms based on noisy intrusion dataset. Proc Comput Sci 92:188–198CrossRef

55.

Fenanir S, Semchedine F, Baadache A (2019) A machine learning-based lightweight intrusion detection system for the internet of things. Rev. d’Intelligence Artif. 33(3):203–211

56.

Abrar I, Ayub Z, Masoodi F, Bamhdi AM (2020) A machine learning approach for intrusion detection system on nsl-kdd dataset. In: 2020 International Conference on Smart Electronics and Communication (ICOSEC), pp. 919–924. IEEE

57.

Ashraf S, Ahmed T (2020) Sagacious intrusion detection strategy in sensor network. In: 2020 International Conference on UK-China Emerging Technologies (UCET), pp. 1–4. IEEE

58.

Belavagi MC, Muniyal B (2016) Performance evaluation of supervised machine learning algorithms for intrusion detection. Proc Comput Sci 89:117–123CrossRef

59.

Ullah S, Ahmad J, Khan MA, Alkhammash EH, Hadjouni M, Ghadi YY, Saeed F, Pitropakis N (2022) A new intrusion detection system for the internet of things via deep convolutional neural network and feature engineering. Sensors 22(10):3607CrossRef

Titel: Traffic data extraction and labeling for machine learning based attack detection in IoT networks
verfasst von: Hayelom Gebrye
Yong Wang
Fagen Li
Publikationsdatum: 05.01.2023
Verlag: Springer Berlin Heidelberg
Erschienen in: International Journal of Machine Learning and Cybernetics / Ausgabe 7/2023
Print ISSN: 1868-8071
Elektronische ISSN: 1868-808X
DOI: https://doi.org/10.1007/s13042-022-01765-7

Neuer Inhalt

Bildnachweise

VDI-Icon, Profil Icon, inhalt2, Springer Professional Modul/© Springer Fachmedien Wiesbaden GmbH, Internationaler Motorenkongress/© [M] ATZlive | Chisnikov / Fotolia.com, Search Icon, Banner Hanser, Customer Experience/© © oatawa / Getty Images / iStock, Erdgasmotor 1.5 TGI evo von Volkswagen/© Volkswagen AG, Thorsten Mücke/© Alexandra Bachran, Zeitschrift Wissensmanagement Cover, PatentFit-Logo/© Springer Fachmedien Wiesbaden GmbH, 2023_Antrieb/© supervisuell, ATZ-Webinar: Prototypenfreie Entwicklung durch Offline- und Driver-in-the-Loop-HiL-Tests /© (c) VI-grade, chassis.tech plus 2023/© [M] ATZlive / TÜV SÜD PRODUCT SERVICE GMBH

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

ATZelectronics worldwide

ATZelektronik

Weitere Artikel der Ausgabe 7/2023

Multi-receptive field spatiotemporal network for action recognition

DaCFN: divide-and-conquer fusion network for RGB-T object detection

Online rule fusion model based on formal concept analysis

Collaborative optimization of spatial-spectrum parallel convolutional network (CO-PCN) for hyperspectral image classification

Iterative convolutional enhancing self-attention Hawkes process with time relative position encoding

Image dehazing using multi-scale recursive networks

Neuer Inhalt

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.