
An FCM-based hybrid method for DDoS attack detection in resource-constrained devices

  • Open Access
  • 27-10-2025

Abstract

This article introduces a novel hybrid method for detecting Distributed Denial of Service (DDoS) attacks in resource-constrained devices, such as IoT devices. The method combines fuzzy cognitive maps (FCMs) with machine learning feature selection algorithms to create a lightweight, transparent, and reliable intrusion detection system. The article begins by discussing the challenges of securing IoT devices and the limitations of traditional machine learning methods. It then introduces the concept of fuzzy cognitive maps and their application in intrusion detection systems. The authors present a hybrid approach that uses feature selection algorithms to identify high-impact features and their weights, which are then used to train and test an FCM-based model. The article also discusses the automatic computation of the threshold value for packet classification based on the Area Under the Curve (AUC) score. The experimental results demonstrate the efficacy of the hybrid approach in detecting a wide variety of DDoS attacks, with a focus on selected ML-FCM hybrid pairings. The article concludes by discussing the potential for further improvement in accuracy and speed of classification, as well as the robustness of the model against adversarial attacks and new types of network traffic.
Sarvesh Kulkarni and Prathibha Keshavamurthy contributed equally to this work.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Securing hardware and software against unauthorized access is a formidable challenge for system and network administrators. Cyber attacks and data breaches targeting vulnerable devices are increasing in frequency and have significant operational and economic impacts. Advancements in machine learning (ML) and deep learning (DL) methods have therefore encouraged security researchers to explore these methods as viable solutions for intrusion detection.
The security of devices that comprise the Internet of Things (IoT) is even more challenging due to their resource limitations. IoT devices, such as smart home appliances, wearable medical devices, vehicular management systems, radio frequency identification (RF-ID) enabled devices, programmable logic controllers (PLC), sensors, probes, and cameras, are equipped with only the hardware and software necessary to collect and transfer data, and can be controlled and monitored remotely. Industrial Internet of Things (IIoT) devices are part of the Critical Network Infrastructure (CNI) that plays a pivotal role in the functioning of systems in the manufacturing, power, transportation, and health sectors. They contain minimalist hardware, software, and applications, and cannot be overburdened by the processing requirements of complex security tools. It is essential that these complex interconnected systems are secure and devoid of threats. A lightweight method to detect network layer attacks on such low-resource devices is therefore very desirable.
The Cyber Kill Chain® framework developed by Lockheed Martin™ [1] (adopted from the “phases of the intrusion kill chain” of the US military) describes an end-to-end systematic process to find and target adversaries. The seven steps of the kill chain that constitute a comprehensive cyber attack are as follows.
1. Reconnaissance: attackers gather information such as open ports and vulnerabilities of the target device.
2. Weaponization: attackers develop a cyber weapon to exploit the victim, for example, malware, based on the information they have collected.
3. Delivery: attackers attempt to deliver the malware through methods like phishing emails and social engineering.
4. Exploitation: attackers attempt to gain access to the victim device through malware and also attempt a lateral move to other systems in the network.
5. Installation: attackers install the malware and gain access to the victim device through some backdoor.
6. Command and control: attackers establish a control channel and remotely monitor the activities of their weapon.
7. Action: attackers fulfill their objective, either to exfiltrate data or to use the victim as a bot for further attacks.
Security engineers must be ready with all the necessary tools to set up a strong defense to stop an adversary at every step of the kill chain process. Well-resourced and trained malicious actors have made modern-day defense tools appear inadequate. IoT devices are targets of Denial of Service (DoS), man-in-the-middle, phishing, malware, and ransomware attacks, among many others. As a precursor to any attack, adversaries use scanning tools to gather information about the victim device. Therefore, defense against reconnaissance is of utmost importance as a preventive measure against subsequent attacks. A malicious actor can weaponize a simple tool, such as a port scan, into an attack like a SYN flood or a UDP flood. The victim is so overwhelmed by SYN or UDP requests that it cannot serve legitimate users, causing a DoS attack. The malicious actor can also carry out a distributed denial-of-service (DDoS) attack by initiating a deluge of port scan requests from multiple sources. A “prevention is better than cure” approach to defending critical systems at the network layer is therefore necessary to prevent reconnaissance and DoS attacks.
Fifth generation (5G) cellular systems are enablers of low-latency, large-scale, high-density edge IoT deployments [2] that bring workload processing closer to data sources. Edge deployments also improve scalability by reducing end-to-end latency and the load on the core network, both essential in application areas like industrial automation and connected vehicles [3]. Further scalability improvements are possible through “network slicing,” which cleanly separates different application classes into QoS-optimized virtual networks [4]. Thus, large-scale, dense IoT deployments are possible through hardware and containerization support in 5G networks, and other distributed deployment strategies [5]. However, such large-scale deployments of resource-constrained IoT devices greatly increase the attack surface and require new security models and mechanisms.
Resource limitations, device heterogeneity, scalability, access limitations, and maintaining data privacy in IoT devices present special challenges. Furthermore, most legacy security schemes are designed to work with IP traffic. In 5G networks, when Non-IP Data Delivery (NIDD) is employed, the absence of standard IP and related protocol headers renders such schemes ineffective. Thus, new approaches to traffic inspection and analysis are needed.
Although ML and DL algorithms are capable of detecting a broad range of attacks, it is important to explore alternative solutions that can work with these aforementioned limitations. We therefore propose a new hybrid model for intrusion detection that: (1) is computationally light, transparent, and reliable; (2) works well in devices with limited resources; (3) combines various statistical or ML-based feature selection algorithms with fuzzy cognitive maps; (4) is an alternative to traditional ML models, with comparable detection accuracy while overcoming some of the limitations of ML solutions such as computational cost, black-box approach, and randomness in predictions; and (5) makes no assumptions about the type of traffic. We demonstrate that our hybrid algorithm works well to detect twelve different types of DDoS attacks.
This paper is organized as follows: Section 2 introduces the background and related work. Section 3 explains the concept of fuzzy cognitive maps and their application in intrusion detection. Section 4 describes our work on the hybrid solution for detecting cyber attacks. Section 5 explains the results and analyses of our experiments. Finally, Section 6 concludes the document and suggests directions for future work.

2 Background and related work

2.1 Background

Firewalls, antivirus software, and intrusion detection systems have improved over the years in becoming increasingly accurate in detecting a wide range of cyber attacks. The traditional Information Technology (IT) infrastructure is constantly monitored and updated by system and network administrators. However, the Operational Technology (OT) infrastructure, which the IoT devices are part of, is more likely to lag behind in standard security practices of upgrading and patching their software. Therefore, the OT infrastructure is particularly vulnerable to sophisticated, modern-day attacks.
Many recent instances of high-profile cyber attacks demonstrate the security challenges for devices in highly sensitive environments or with modest processing capabilities. In particular, the Stuxnet attack [6], also dubbed the world’s first cyberweapon, sabotaged Iran’s nuclear program by targeting the computer systems that controlled the centrifuges and gas valves. An attack on Target® Corporation’s HVAC systems in 2014 [7] illustrated the risks of remotely managed Heating, Ventilation and Air Conditioning (HVAC) systems. Hackers stole the login credentials of the company that managed the HVAC systems to penetrate Target’s payroll systems. At that time, Qualys™, the vulnerability management software company, noted that there were about 55,000 HVAC systems without adequate security. Security researchers were also able to hack and reprogram consumer-grade WiFi cameras to create a permanent backdoor. The backdoor was then used by malicious actors to get command and control over other devices in the network and steal data. Carpentier et al. [8] demonstrate an attack against a highly secure network using a Bluetooth™ smart-bulb. They execute a simple JavaScript within a web-based Bluetooth Application Programming Interface (API) using a web browser as an attack vector. Their attack runs in stealth mode and outsmarts all the security controls within the network to gain access to the smart-bulb.

2.2 Reconnaissance attacks

Reconnaissance is an important first step for attackers in the discovery and selection of target systems. The discovered vulnerabilities of the target can be exploited in many ways. A denial-of-service attack is one such exploit wherein the attacker sends a large number of scan requests to overwhelm the bandwidth and processing resources of the target. Protection of systems against such attacks is challenging, as the service ports must be kept open to respond to client requests. For instance, since a web server must listen on TCP ports 80 and 443, requests to these ports cannot be blocked. Apart from the next-generation security apparatus, enterprises can utilize DDoS protection service providers that filter and clean traffic. However, a large number of IoT devices are not protected by traditional security infrastructure. Worse yet, many devices are left at their default factory settings and are thus highly susceptible. Therefore, equipping such devices with a low-overhead host-based intrusion detection system is advisable.
The MITRE ATT&CK® framework [9], which is similar to the cyber kill chain described in Section 1, is a knowledge base of adversarial tactics, techniques, and procedures. This framework provides a broad and comprehensive list of techniques used by adversaries and is based on real-world examples. Based on this framework, cyber-security professionals develop a structured approach to threat modeling and defense methodologies. This framework identifies reconnaissance as an important tactic and lists the ten techniques that adversaries utilize for reconnaissance. The first of the reconnaissance techniques is “active scanning.” A wide range of proprietary and open-source scanning tools is available to either gather information passively about a target host or to weaponize scanning as an active tool for a denial-of-service (DoS) attack. Some examples of easily accessible open-source tools that may be used for mounting attacks are: Kali Linux®, Metasploit, Wireshark™, TCPdump, nmap, Burp Suite™, and Nessus®. Modern computer systems are capable of scanning a wide swath of the Internet in minutes and gathering significant information about the systems running across the globe. In addition, there are online search engines such as Shodan® and Censys® that scan the Internet and provide additional Internet intelligence for a fee. As a consequence, Internet-wide port scanning activity is ever-increasing.
A. Cui and S. J. Stolfo [10] present a quantitative analysis of insecure embedded network devices by conducting an Internet-wide scan (excluding some networks) in 2010. Their study reveals that about 540,000 embedded devices were publicly accessible and vulnerable to attack. Insight into Internet-wide scanning and vulnerability identification is provided by O’Hare et al. [11]. They propose a scanning tool called Scout that combines data from Censys and the National Vulnerability Database (NVD) of the National Institute of Standards and Technology (NIST) to passively identify potential vulnerabilities. Port scanning is performed by both institution-sanctioned penetration testers as well as nefarious adversaries, but their motives differ. The former aim to discover and patch the vulnerabilities, while the latter seek to discover and exploit these vulnerabilities.
Fig. 1
Taxonomy of DDoS attacks [12]

2.3 Distributed denial-of-service attacks

Information on open ports can be utilized to launch a denial-of-service (DoS) attack. In a DoS attack, the target machine is inundated with a large number of spurious requests so as to render it unable to respond to legitimate user requests. Servers accessible over the Internet are naturally vulnerable to external attacks since the service ports have to be publicly visible. An IoT device, such as a web camera running a web server for access, can easily be overwhelmed by such a flood of requests, as it has limited processing, memory, and storage capacity. A far more common variant occurs when an adversary mounts a massive scan operation from multiple bots, resulting in a distributed denial-of-service (DDoS) attack. A DDoS attack is difficult to detect and block as it has seemingly multi-point origins. Common service ports such as HTTP (80), DNS (53), and LDAP (389) are easy targets of DDoS attacks.
Sharafaldin et al. [12] provide a taxonomy of DDoS attacks on the basis of the type of attack (reflection or exploitation), the protocols targeted, and the types of packets that are utilized. The taxonomy is succinctly captured in Fig. 1 reproduced here from their paper. It is apparent from the figure that DDoS attacks are possible at the network, transport, as well as application layers of the TCP/IP reference model.
Exploitation-based DDoS attacks, such as SYN flood or UDP flood, are usually volumetric in nature. In a volumetric attack, the adversary sends SYN or UDP packets to the target through numerous intermediate bots so as to overwhelm the target with requests. A SYN flood [13], sometimes known as a “half-open” attack, is a network-tier attack that bombards a server with TCP connection requests without responding to the corresponding server-sent acknowledgments. The large numbers of these half-open TCP connections consume the server’s resources, thereby crowding out legitimate traffic. After multiple attempts, the victimized server sends an RST packet and terminates the session. However, by then, it may be too late to avoid service disruption. A UDP flood attack [14] is similar to a SYN flood attack, but performed with UDP packets. A large number of UDP packets are sent to random UDP ports of a target server in order to consume its resources disproportionately. The server, if not listening on that port, keeps responding with a “destination unreachable” Internet Control Message Protocol (ICMP) message.
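The half-open handshake pattern described above lends itself to a simple detector sketch. The snippet below is an illustrative heuristic only, not the FCM method proposed in this paper; the flag labels, window size, and 0.8 ratio threshold are assumptions chosen for the example.

```python
# Illustrative SYN-flood heuristic: flag a traffic window in which a
# large share of SYNs are never completed by a client ACK.
from collections import Counter

def syn_flood_suspected(tcp_flags, ratio_threshold=0.8, min_packets=100):
    """tcp_flags: list of flag strings ('SYN', 'SYN-ACK', 'ACK', ...)
    observed in one window. A completed handshake consumes one SYN and
    one ACK; a large surplus of unanswered SYNs suggests a flood."""
    counts = Counter(tcp_flags)
    total = len(tcp_flags)
    if total < min_packets:          # too little traffic to judge
        return False
    half_open = max(counts["SYN"] - counts["ACK"], 0)
    return half_open / total >= ratio_threshold

# Benign window: every SYN is matched by an ACK.
benign = ["SYN", "SYN-ACK", "ACK"] * 40
# Attack window: a deluge of SYNs with almost no follow-up ACKs.
attack = ["SYN"] * 180 + ["SYN-ACK", "ACK"] * 5
```

A real detector would of course track per-connection state rather than raw flag counts; the point here is only that the half-open imbalance is directly measurable.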
In reflection-based attacks, the attacker hides behind multiple reflector servers to attack the target machine. This type of attack is effective using protocols such as DNS, LDAP, and NTP, as the servers constantly advertise their presence and are expected to be available and listening on the associated ports. Thus, an attacker can send a large number of DNS requests to public DNS servers with the source IP address set to the victim machine. The DNS servers respond to all these DNS queries, but the responses go to the victim machine and not the attacking host, thus overwhelming the victim machine.
Table 1
Comparison of related work with their limitations

| Reference | Dataset | Algorithm | F1-Score | Limitations |
|---|---|---|---|---|
| [12] | CICDDoS2019 | ID3 | 69.0 | Limited algorithms tested on the new dataset created |
| [15] | CICIDS2017 | Random Forest | 97.0 | Limited attack types in the new dataset created |
| [16] | ISOT | Multi-Layer Perceptron | 98.89 | High false positive and false negative rates |
| [17] | UNSW-NB15 | PCA-XGBoost | 99.99 | Only one feature reduction method tested |
| [18] | WUSTL-IIoT-2021 | Transformer | 94.31 | Interpretability; high computation cost |
| [19] | CICDDoS2019 | Random Forest | 99.98 | Cannot detect new attacks |
| [20] | Bot-IoT | CNN | 99.92 | High computation cost |
| [21] | SYN flooding / Low Rate / Mirai | Custom feature selection with Logistic Regression | 99.79 / 99.7 / 89.49 | Complex, multiple subprocesses; high computation cost |
| [22] | CICDDoS2019 / DARPA1998 / ISCXIDS2012 | IQRPACF: Agglomerative Clustering (AC) / AC / K-means | 78.5 / 88.3 / 46.4 | Other algorithms perform better than the proposed algorithm |
| [23] | BOT-IoT | Random Forest (with Looking-back) | 99.81 | High computation cost |
| [24] | Edge-IIoTset / WUSTL-IIoT-2021 / IoTID20 | SSA-MLP (Salp Swarm Algorithm-Multi-Layer Perceptron) | 88.24 / 93.61 / 97.69 | High computation cost; requires parameter tuning |
| [26] | Nine different IoT datasets | Emperor Penguin Colony with K-nearest neighbor | 98.0 (Accuracy) | Only the KNN classifier tested; high computation cost |
| [25] | UNSW-NB15 | Effective Seeker Optimization (ESO) with Denoising Auto-Encoder (DAE) | 83.08 | High computation cost; computation time not specified |
| [27] | UNSW-NB15 | Pearson Correlation with Decision Tree (DT) | 97.8 (Accuracy) | High false alarm rate; only the DT method tested |
| [28] | CICDDoS2019 | Time Enhanced Transformer for Security (TETS) (integration of ContiFormer and pigeon hole optimization) | 96.0 (Average) | Only a few DDoS attacks tested; computation time not specified |

2.4 Intrusion detection with machine learning algorithms

Much research effort is directed towards using machine learning (ML) methodologies in Intrusion Detection Systems (IDS). Some publicly available datasets provide a wide variety of attack vectors, the most current being the CICIDS2017 [15] and the CICDDoS2019 [12] datasets, published by the Canadian Institute for Cybersecurity. These two datasets differ in some of the attacks that they capture. In addition to providing the datasets, the authors also demonstrate successful detection by first identifying the best features and then analyzing the predictions of selected ML algorithms. The Random Forest algorithm performs the best in their experiments. Wang et al. [16] propose a Multi-Layer Perceptron (MLP) based attack detection method that incorporates sequential feature selection and a feedback mechanism to dynamically reconstruct the detector based on detection errors. They experiment with the five best features and train and test on three different datasets. Saheed et al. [17] conduct a performance analysis of an IDS solution based on supervised learning algorithms using the UNSW-NB15 dataset. They reduce features using the dimensionality-reducing Principal Component Analysis (PCA) algorithm and then use other classification algorithms to train and test; XGBoost performs the best. Casajús-Setién et al. [18] propose an anomaly-based Network IDS (NIDS) for the WUSTL-IIoT-2021 dataset using the transformer model, a sequence transduction neural network that reasons from a set of observed training sequences to a new set of test sequences. Becerra et al. [19] present a performance analysis of six ML algorithms using the CICDDoS2019 dataset. The Pearson correlation coefficient is used to reduce the number of features to 22 and to demonstrate that the Random Forest algorithm has the best accuracy.
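As a rough illustration of the correlation-based feature selection used in work like [19], the sketch below ranks features by the absolute Pearson correlation of each feature with the binary attack label and keeps the top-k; the feature names and toy values are hypothetical, not taken from any of the cited datasets.

```python
# Correlation-based feature ranking: keep the k features whose values
# correlate most strongly (in absolute value) with the attack label.
import math

def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def top_k_features(samples, labels, k):
    """samples: dict feature_name -> list of values; labels: 0/1 list."""
    scores = {f: abs(pearson(v, labels)) for f, v in samples.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

labels = [0, 0, 0, 1, 1, 1]                    # 1 = attack packet
samples = {
    "pkt_rate":   [10, 12, 11, 90, 95, 88],    # strongly label-correlated
    "pkt_size":   [500, 480, 510, 60, 65, 70], # strongly anti-correlated
    "ttl_jitter": [3, 1, 2, 2, 1, 3],          # essentially noise
}
```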
Jemal et al. [20] present an effective Convolutional Neural Network model for detecting DoS and DDoS attacks, tested on the BoT-IoT dataset. Zhou et al. [21] propose a DDoS attack flow classification system based on a feature selection method that computes the best threshold value for each feature. Gniewkowski et al. [22] propose a novel detection method based on features related to the temporal dependencies in the client’s behavior. Their so-called “IQRPACF” method not only identifies large values that stand out from the rest (Inter Quartile Range - IQR), but also utilizes a Partial AutoCorrelation Function (PACF) to find frequent and repetitive changes in the data. Mihoub et al. [23] propose a new method for the detection and mitigation of DDoS attacks, based on the looking-back approach for machine learning. This method is able to detect the attack type precisely; the inputs for this model are the top ten input features of the selected ML or DL algorithms, along with the attack types seen until that time instant. The Random Forest algorithm, paired with looking-back steps, is the best performer with 99.81% accuracy. Alzubi et al. [24] present an Artificial Neural Network (ANN) based IDS combined with the Salp Swarm Algorithm (SSA) for optimization. In fog and edge computing, the Effective Seeker Optimization combined with Machine Learning-Enabled Intrusion Detection System (ESOML-IDS) feature selection (FS) method in [25] chooses an optimal subset of features to identify intrusions. Alweshah et al. [26] propose a wrapper feature selection model using the emperor penguin colony optimization algorithm and K-nearest neighbor classifier for intrusion detection in IoT devices.
Prasad et al. [27] propose a custom feature selection method based on the Pearson Correlation Coefficient (PCC) combined with Decision Tree (DT) classification for intrusion detection using the UNSW-NB15 dataset. A Transformer architecture-based model is proposed by Fan et al. [28] to detect DDoS attacks using the CICDDoS2019 dataset. They use the pigeon hole optimization algorithm for feature selection and the Time Enhanced Transformer for Security (TETS) model for training and predictions. They test DNS, LDAP, NTP, and Portmap DDoS attacks against Convolutional Neural Network, Recurrent Neural Network, Long Short Term Memory+Transformer, and Convolutional Neural Network+Transformer models, along with their own model. A detailed survey of intrusion detection solutions for IoT devices is presented by Khraisat et al. in [29]. They classify IDS for IoT based on placement, detection, and validation strategies. The detection methods utilize machine learning algorithms such as supervised learning, unsupervised learning, reinforcement learning, ensemble methods, and deep learning models. They also discuss statistical methods like the hidden Markov model and detection techniques based on fuzzy logic.
Table 1 captures the salient points and the limitations of ML and DL algorithms used for intrusion detection. They generally suffer from one or more of the following limitations: (1) high computational complexity for training and/or testing with large datasets for effective predictions, (2) non-determinism in algorithms such as Random Forest and Extra Tree Classifier, due to the randomness built into their logic, (3) focus on just a few specific attack types, (4) limited baseline models for comparison, and (5) lack of causal understanding within these models, leading to mistaken inferences (false positives and false negatives) [32]. More recent studies, such as [24–26] and [28], also exhibit one or more of these limitations. In terms of the interpretability of these trained models, eXplainable Artificial Intelligence (XAI) methods like SHapley Additive exPlanations (SHAP) [30] and Local Interpretable Model-Agnostic Explanations (LIME) [31] do help interpret ML predictions. However, their processes are complex and add to the already high cost of computation. Furthermore, they are executed on the model after its training phase, and thus add an extra step. Beyond these limitations, the heterogeneity of resource-constrained devices presents a further challenge in the implementation of trained models. These considerations are strong motivators for our work. Our proposed hybrid algorithm has low computational complexity, making it suitable for resource-constrained devices, is scalable, deterministic, and is effective for detecting a large number of DDoS attacks. Furthermore, our model is transparent about the calculations and impact of the various input features on the final classification. Our work, presented in this paper, thus deviates from the normal approach of using ML algorithms as black-box detectors.
In our previous work [33], we ran experiments to determine the causal relationship between traffic characteristics of incoming network packets that would lead to their classification into benign or attack packets. We used statistical analysis with a focus on key features such as packet count, packet size, and CPU utilization to detect reconnaissance attacks. That approach relied on the experimental hand-selection of a few such features, as well as the manual selection of detection thresholds to determine whether the traffic is malicious or benign. In [34], we introduced a greatly improved method that considers a vastly larger set of influencing features and employs a novel hybrid technique for the detection of SYN and UDP flood attacks. It is both effective and computationally efficient compared to pure ML algorithms. This manuscript is an extended version of our paper presented at CSNet 2024 [34]. Here, we extend our hybrid approach to detect all types of DDoS attacks, not just the SYN flood and UDP flood attacks. Furthermore, we automate the computation of the threshold value that serves as the decision boundary between attack and benign packets. We use ML algorithms simply to compute the weights of the input features against the output feature. We then use FCM to train and test the models and demonstrate that this method is relatively fast and reliable for a few specific feature selection algorithms for most types of DDoS attacks. Our model is meant to be executed directly on edge devices.
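Automatic threshold selection from ROC analysis can be illustrated in a few lines. The abstract states that the threshold is derived from the AUC score; the sketch below uses the related, widely used recipe of picking the candidate threshold that maximizes Youden's J = TPR − FPR, which is an assumption for illustration, not necessarily the exact criterion used in this paper.

```python
# Pick a classification threshold from (score, label) pairs by scanning
# candidate thresholds and maximizing Youden's J = TPR - FPR.
def best_threshold(scores, labels):
    """scores: model outputs in [0, 1]; labels: 1 = attack, 0 = benign."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = 0.5, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        j = tp / pos - fp / neg          # TPR - FPR at threshold t
        if j > best_j:
            best_t, best_j = t, j
    return best_t

# Toy scores: attacks (label 1) score high, benign packets score low,
# so the best decision boundary sits at the lowest attack score.
scores = [0.1, 0.2, 0.3, 0.4, 0.7, 0.8, 0.9]
labels = [0,   0,   0,   1,   1,   1,   1  ]
```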

3 Fuzzy cognitive maps (FCMs)

3.1 FCM concepts, edges, and weights

Fuzzy cognitive maps (FCMs) were originally proposed by Kosko [35] as an evolution of R. Axelrod’s work on Cognitive Maps [36]. An FCM is a fuzzy digraph that represents hazy causal relationships (uncertainty) between causal objects in the graph called “concepts” or “features.” The features are represented by the vertices in the digraph, and the directed edges connecting features are called “weights.” An FCM is constructed using these features and weights. Figure 2 shows a map of concepts interconnected by edges with weights. Concept \(C_4\) is the output feature; concepts \(C_1\) and \(C_3\) have a positive impact on \(C_4\), with weights \(W_{14}\) and \(W_{34}\) respectively, while concepts \(C_2\) and \(C_5\) have a negative influence on \(C_4\), with weights \(W_{42}\) and \(W_{45}\). The impacts are reflected in the weights between any two concepts. Some input concepts also have an influence on other input features, thus creating a complex web of inter-dependencies. In a traditional FCM, initial values are assigned to both features (vertices) and weights (edges) of the digraph by experts in the corresponding field. Features are the influencing factors that impact the system’s outcome. The extent of the impact is determined by the values of features and weights and by the iterative algorithm that updates these values on each pass. Thus, an FCM may be used as an analysis tool for inferring causal relationships between influencing factors (features) and outcomes of interest.
Fig. 2
A simple fuzzy cognitive map
FCMs may be analyzed using one of two approaches: causal or dynamic [37]. The causal approach represents our certainty about one concept’s influence over another. So, the value on each edge is either 0 or 1. However, the dynamic approach represents the magnitude of influence of one concept over another. Thus, the value on an edge can be any real value, usually normalized between \(-1\) and \(+1\). FCM concepts can be identified as input and output features. Single-layered FCMs measure only the influence of multiple input features on a single output feature. Multi-layered FCMs measure the influence of each feature on all other features.
For instance, in Fig. 2, if the edge weight \(W_{13}\) is positive, then concept \(C_1\) has a positive (reinforcing) impact on concept \(C_3\). Thus, if \(C_1\) increases, so does \(C_3\). On the other hand, if \(W_{13}\) is negative, then \(C_1\) has a negative impact on \(C_3\), thereby attenuating \(C_3\)’s value. \(C_3\) in turn, can either reinforce, attenuate or have no effect on the output feature \(C_2\) depending on whether the edge weight \(W_{32}\) is positive, negative or zero. To make matters more interesting, note that not only does \(C_1\) indirectly influence \(C_2\) via \(C_3\) in the manner described above, but it also directly influences \(C_2\) via the edge weight \(W_{12}\). In this manner, in a multi-layered FCM, complex interdependence, i.e., correlations among concepts (features) can be modeled.
FCMs, when first implemented, were used to compute the mathematical value of weights based on the linguistic value of weights assigned by experts [38]. Many experts can identify the features that they consider to be important and assign weights to the outgoing directed edges from those features. Each expert’s input creates a distinct FCM. FCMs from various experts can be algorithmically combined to compute a single resultant FCM. FCMs are used mainly for time series forecasting to compute the future state of a system or to find values of features for a stabilized state of the system. FCMs are also used for classification problems. An FCM is somewhat similar to a neural network since both have input and output features with link weights. However, a neural network has a hidden layer between its input and output layers, whereas an FCM is fully transparent.
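The combination of expert maps mentioned above can be sketched as follows. Simple arithmetic averaging of the weight matrices is assumed here for illustration; the literature also describes credibility-weighted variants.

```python
# Combine several experts' FCMs into a single resultant FCM by
# averaging their weight matrices entry-wise (assumed scheme).
def combine_fcms(weight_matrices):
    """Each matrix is NxN with entries in [-1, 1]; entry [j][i] is the
    weight of the directed edge from concept C_j to concept C_i."""
    k = len(weight_matrices)
    n = len(weight_matrices[0])
    return [[sum(m[j][i] for m in weight_matrices) / k for i in range(n)]
            for j in range(n)]

# Two hypothetical experts disagree on the strength and presence of edges.
expert_a = [[0.0, 0.8], [-0.4, 0.0]]
expert_b = [[0.0, 0.4], [ 0.0, 0.0]]
combined = combine_fcms([expert_a, expert_b])
```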

3.2 FCM as analysis tool

For the example FCM shown in Fig. 2, the features are labeled as \(C_j (j=1,2,..., N)\), and the weight values \(w_{ji}\) are as shown on the directed edges from \(C_j\) to \(C_i\). In our dynamic FCM, \(C_j =[0, 1]\), and \(w_{ji} =[-1, 1]\). In Fig. 2, the cumulative impact of input features \(C_1\), \(C_3\), \(C_4\) on output features \(C_2\), \(C_5\) (in red), is computed iteratively by the following rule [35, 39].
$$\begin{aligned} A_i^{t+1} = f \Bigg (\sum \limits _{{j=1,j \ne i}}^N A_j^t \times w_{ji} \Bigg ) \end{aligned}$$
(1)
where \(A_i^{t+1}\) is feature \(C_i\)’s value at time \(t+1\), \(A_i^{t}\) is \(C_i\)’s value at time t, \(w_{ji}\) is the weight (strength) of the fuzzy influence of \(C_j\) over \(C_i\), and f is the squashing function that transforms the computed value of \(C_i\) into our required interval [0, 1].
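A minimal sketch of one pass of the iterative rule in Eq. 1 follows, assuming the sigmoid as the squashing function f (the choice favored in [41], discussed below); the three-concept map and its weights are illustrative, not taken from Fig. 2.

```python
# One iteration of Eq. 1: A_i(t+1) = f( sum_{j != i} A_j(t) * w_ji ),
# with the sigmoid squashing values back into [0, 1].
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fcm_step(A, W):
    """A[j] is concept C_j's value in [0, 1]; W[j][i] is the weight
    w_ji of C_j's fuzzy influence on C_i (the j == i term is skipped)."""
    n = len(A)
    return [sigmoid(sum(A[j] * W[j][i] for j in range(n) if j != i))
            for i in range(n)]

# Three concepts: C_0 reinforces C_2 (+0.7), C_1 attenuates C_2 (-0.5).
W = [[0.0, 0.0,  0.7],
     [0.0, 0.0, -0.5],
     [0.0, 0.0,  0.0]]
A = [0.9, 0.2, 0.5]
A_next = fcm_step(A, W)   # A_next[2] = sigmoid(0.9*0.7 + 0.2*(-0.5))
```

Iterating `fcm_step` until the vector stops changing yields the stabilized state referred to in the text.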
FCM feature and weight value computations, activation functions, and learning methods may vary so as to provide better predictions for future feature values. For instance, Stylios et al. [40] propose modifications to Eq. 1 that include adding \(A_j^t\) to the summation for time series predictions. Bueno et al. [41] provide a comparison of the four types of activation functions used in FCM—the sigmoid function, the hyperbolic tangent function, the step function, and the threshold linear function. The sigmoid function offers significantly greater advantages over the other functions as it allows decision support by defining middle intensities and not the extreme (0 or 1) cases. Papageorgiou et al. [42, 43] explore active Hebbian and nonlinear Hebbian-type learning algorithms to train FCMs. Detailed surveys on FCM are provided by Jetter et al. [44], Papageorgiou et al. [45], and Felix et al. [46] in which they discuss FCM concepts, advancement in FCM methodologies, and FCM applications. Tools and software for utilizing FCMs are also discussed.
FCMs are used mainly for time series forecasting, to compute the future state of a system or to find the feature values of a stabilized state of the system. In addition, FCMs are used for classification problems in various domains. Nápoles et al. [47] present a detailed review of the use of FCMs for classification problems. Salmeron et al. [48] use particle swarm optimization for FCM learning to classify a patient's rheumatoid arthritis profile. In [49], Froelich proposes a new algorithm for generating discrimination thresholds. Nápoles et al. [50] combine rough set theory and FCM for pattern classification by grouping input neurons into three-way decision rules: positive, negative, and boundary. In [51], Mls et al. propose a new algorithm for optimizing the connection (weights) matrix that addresses the incompleteness and uncertainty of experts' opinions. The dynamics of the state vectors of several candidate FCMs are simultaneously presented to the human expert on the Graphical User Interface (GUI) of the related application, which helps the expert visualize and evaluate the quality of the whole population at once. Without the need for an analytically expressed fitness function, the expert uses the interface to prioritize the best FCM candidates in every population. Wozniak et al. [52] propose a new method, the Real-Coded Genetic Algorithm (RCGA), based on Genetic Algorithms (GAs), that generates a virtual population. This methodology does not require human intervention and provides high-quality models generated from historical data.

3.3 FCMs in intrusion detection systems

In the cyber-security domain, the contextual information in FCMs has been used for decision support in Intrusion Detection Systems (IDS). Nápoles et al. [53] propose a supervised learning algorithm using Rough Set Theory and FCM. Concepts are grouped as positive features (weight \(+1\)), negative features (weight \(-1\)), and boundary features that include all the inconsistent features (weight 0.5), based on a threshold value. Aparicio-Navarro et al. [54–57] use FCMs along with the Dempster-Shafer theory (for data fusion) and basic probability assignment values to include contextual information at different stages of the intrusion detection process. They claim that their approach provides effective solutions by reducing false predictions.
In those papers, the FCM construction process is not codified, and the grouping of features and assignment of weights require user intervention, and perhaps human judgment. Our work presented here differs in the following way. We have a clearly defined process to construct and utilize FCMs. We first use a feature selection algorithm to extract the weights for all input features, then utilize this information to construct the weighted FCM graph; this step provides a clear representation of the importance of each feature, as explained in Section 4.2. The feature importance scores are then provided to the FCM equation (Eq. 1) to train and test the model. An automated threshold computation process provides the final output classification. The result is a clear, simple, lightweight, and transparent approach to intrusion detection in resource-constrained devices.

4 Dataset and methodology

4.1 Dataset and pre-processing

We use the CICDDoS2019 dataset [12] for our study. The dataset contains data on several types of DDoS and reflective DoS attacks. It includes raw packet captures (pcap) and comma-separated value (csv) files for each type of attack, along with some benign traffic. The dataset is generated over a test-bed that consists of an attack network and a victim network. The victim network consists of all commonly used equipment. A benign profile agent generates realistic background traffic, and the attacks are executed using third-party tools. The raw pcap files are passed through an open-source tool called the “CICFlowMeter” [58] that generates Biflows from pcap files and extracts about 88 features from the packet captures.
In our study, we focus our computations and analysis on both the volumetric and reflective DDoS attack data available, namely SYN flood, UDP flood, DNS, LDAP, MSSQL, NTP, NetBIOS, SNMP, SSDP, Portmap, TFTP, and UDPLag. We do not consider WebDDoS attack packets, as the sample is very small. CICDDoS2019 captures packet data on two different days: January 12th, designated as the training day, and March 11th, designated as the testing day. In this study, we refer to the January 12th csv files as <attack>-Jan and the March 11th csv files as <attack>-Mar. The authors of this dataset sort the flows by attack type; however, some csv files have packets from more than one attack mixed in. In order to train and test each attack separately, we separate and group the records for each type of attack. We note that the benign packets are few in number and cause a significant imbalance in the dataset. To address this imbalance, we combine the benign packets from all the January records and from all the March records separately and include them in all the attack record files. Our resulting dataset contains one csv file per day for each attack (training, testing) with the same number of benign packets per day.
We first cleanse the datasets by removing features that (1) are duplicates, (2) have all zero values, or (3) do not contribute to decision making or are strongly correlated with other features. We thereby reduce the datasets to 27 features (down from 88). For instance, we remove the “Forward Packet Length Max” and “Forward Packet Length Min” features, as they have the same type of influence on the output feature as “Total Length of Fwd Packets” and are thus marked as redundant. We drop features like “PSH Flag Count” and “Fwd Avg Packets Bulk” that have all zero values, and features like “Source Port,” “Flow ID,” and “Timestamp” that have no impact on the output, i.e., the final classification. We do not use any data balancing techniques in our study; we demonstrate later in Section 5 that our hybrid model performs well with the preprocessed data despite the imbalance.
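As an illustration, the cleansing pass described above can be sketched in pandas as follows. The helper name `cleanse` and the exact column names are illustrative (they mirror the dataset's conventions); the real pass also drops the correlated features identified above.

```python
import pandas as pd

def cleanse(df: pd.DataFrame) -> pd.DataFrame:
    """Sketch of the dataset cleansing pass (illustrative, not the exact code)."""
    # (1) drop exact duplicate rows
    df = df.drop_duplicates()
    # (2) drop features whose values are all zero
    zero_cols = [c for c in df.columns
                 if pd.api.types.is_numeric_dtype(df[c]) and (df[c] == 0).all()]
    df = df.drop(columns=zero_cols)
    # (3) drop identifier-like features with no bearing on classification
    for col in ["Source Port", "Flow ID", "Timestamp"]:
        if col in df.columns:
            df = df.drop(columns=[col])
    return df

# Tiny illustrative frame: one duplicate row, one all-zero column, one ID column
sample = pd.DataFrame({
    "Flow ID": ["a", "b", "b"],
    "PSH Flag Count": [0, 0, 0],
    "Flow Duration": [1, 2, 2],
})
cleaned = cleanse(sample)
```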
Table 2
Summary of datasets

| Training set | Packets included | Testing set | Packets included |
|---|---|---|---|
| dns-Jan-train | DNS - 2,489,508; BENIGN - 44,868 | dns-Jan-test | DNS - 622,377; BENIGN - 11,217 |
| ldap-Jan | LDAP - 1,841,373; BENIGN - 56,085 | ldap-Mar | LDAP - 369,554; BENIGN - 12,191 |
| mssql-Jan | MSSQL - 3,490,019; BENIGN - 56,085 | mssql-Mar | MSSQL - 777,819; BENIGN - 9,050 |
| netbios-Jan | NetBIOS - 3,733,424; BENIGN - 56,085 | netbios-Mar | NetBIOS - 730,207; BENIGN - 12,357 |
| ntp-Jan-train | NTP - 699,669; BENIGN - 44,868 | ntp-Jan-test | NTP - 174,918; BENIGN - 11,217 |
| portmap-Mar-train | Portmap - 140,401; BENIGN - 44,982 | portmap-Mar-test | Portmap - 35,101; BENIGN - 11,245 |
| snmp-Jan-train | SNMP - 3,429,477; BENIGN - 44,868 | snmp-Jan-test | SNMP - 857,370; BENIGN - 11,217 |
| ssdp-Jan-train | SSDP - 1,861,882; BENIGN - 44,868 | ssdp-Jan-test | SSDP - 465,471; BENIGN - 11,217 |
| syn-Jan | SYN - 1,379,983; BENIGN - 56,085 | syn-Mar | SYN - 273,488; BENIGN - 3,416 |
| tftp-Jan-train | TFTP - 3,528,242; BENIGN - 13,548 | tftp-Jan-test | TFTP - 882,061; BENIGN - 3,387 |
| udp-Jan | UDP - 2,864,587; BENIGN - 56,085 | udp-Mar | UDP - 558,853; BENIGN - 8,840 |
| udplag-Jan | UDPLag - 313,945; BENIGN - 56,085 | udplag-Mar | UDPLag - 1,873; BENIGN - 56,227 |
The target column “Label” contains the two classification categories for packets: attack (SYN, UDP, LDAP, DNS, etc.) and benign (normal), on the corresponding datasets. These two categories are assigned numerical values, 0 for attack and 1 for benign, for computational purposes. The features have widely differing ranges of values. For example, the “Flag” features have binary values, either 0 or 1, but features like “Flow Duration” or “Flow Inter Arrival Time” have values that can be as low as zero or as high as millions, or even \(\infty \). To make fair comparisons between features, we normalize the data so that all feature values fall in the interval [0, 1], using the normalization formula in Eq. 2.
$$\begin{aligned} FN_i^j = \frac{FA_i^j - F_{min}^j}{F_{max}^j - F_{min}^j} \end{aligned}$$
(2)
where
\(FN_i^j\) is the jth feature's normalized value at the ith flow,
\(FA_i^j\) is the jth feature's actual value at the ith flow,
\(F_{max}^j\) is the jth feature's maximum value,
\(F_{min}^j\) is the jth feature's minimum value.
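A minimal sketch of the column-wise normalization in Eq. 2, with an added guard for constant columns (mapped to 0, a choice we make only for this sketch):

```python
import numpy as np

def min_max_normalize(F):
    """Eq. 2: scale each feature (column) of F into [0, 1].

    Constant columns (F_max == F_min) are mapped to 0 to avoid
    division by zero; this guard is our own addition for the sketch.
    """
    F = np.asarray(F, dtype=float)
    f_min, f_max = F.min(axis=0), F.max(axis=0)
    span = np.where(f_max > f_min, f_max - f_min, 1.0)
    FN = (F - f_min) / span
    FN[:, f_max == f_min] = 0.0
    return FN

# Three flows, three features; the third feature is constant
FN = min_max_normalize([[0.0, 10.0, 7.0],
                        [5.0, 20.0, 7.0],
                        [10.0, 30.0, 7.0]])
```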
Some attack packets are available only on either the training day or the testing day. For such attacks, we split the dataset into training (80%) and testing datasets (20%). For attacks that have records in both January and March files, we use the January files in full for training and reduce the March files to about 20% of the training records so as to conform to the standard practice of 80–20 ratio for training and testing. Table 2 shows the summary of the datasets used for training and testing, and the packet counts for each category in the dataset after cleansing operations.
Table 3
Top ten features and scores for LDAP dataset

| SKB-C: Feature | Score | SKB-C2: Feature | Score |
|---|---|---|---|
| Average Packet Size | 35,038,060 | URG Flag Count | 773,122.18 |
| Inbound | 8,191,439 | CWE Flag Count | 359,700.89 |
| Down Up Ratio | 1,758,414 | ACK Flag Count | 316,828.22 |
| URG Flag Count | 1,333,185 | RST Flag Count | 240,486.05 |
| CWE Flag Count | 447,083.9 | Fwd PSH Flags | 240,486.05 |
| ACK Flag Count | 382,749.2 | Init Win bytes forward | 211,437.4 |
| Init Win bytes forward | 363,110.3 | Flow Duration | 139,002.25 |
| Flow Bytes s | 645,236 | Init Win bytes backward | 101,439.52 |
| Fwd PSH Flags | 276,618.9 | Down Up Ratio | 75,834.72 |
| RST Flag Count | 276,618.9 | Flow Bytes s | 43,347.51 |

| AdaB: Feature | Score | CatB: Feature | Score |
|---|---|---|---|
| Average Packet Size | 0.16 | Average Packet Size | 33.11 |
| Init Win bytes forward | 0.12 | Fwd Header Length | 28.56 |
| Total Length of Fwd Packets | 0.10 | min seg size forward | 16.77 |
| Init Win bytes backward | 0.10 | Bwd Packets s | 5.6 |
| Total Length of Bwd Packets | 0.08 | Inbound | 2.64 |
| min seg size forward | 0.08 | Init Win bytes forward | 2.37 |
| Fwd Header Length | 0.06 | Flow Bytes s | 2.36 |
| Flow Duration | 0.04 | URG Flag Count | 1.89 |
| Flow IAT Mean | 0.04 | Flow Duration | 1.71 |
| Bwd Header Length | 0.04 | Fwd Packets s | 1.28 |

| ETC: Feature | Score | PCA: Feature | Score |
|---|---|---|---|
| Inbound | 0.37 | Flow Duration | 0.57 |
| Average Packet Size | 0.25 | Total Fwd Packets | 0.26 |
| URG Flag Count | 0.14 | Total Backward Packets | 0.09 |
| Down Up Ratio | 0.04 | Total Length of Fwd Packets | 0.03 |
| CWE Flag Count | 0.04 | Total Length of Bwd Packets | 0.02 |
| Flow Bytes s | 0.03 | Flow Bytes s | 0.01 |
| ACK Flag Count | 0.03 | Flow Packets s | 0.0 |
| Flow Packets s | 0.03 | Flow IAT Mean | 0.0 |
| Init Win bytes forward | 0.02 | Fwd PSH Flags | 0.0 |
| RST Flag Count | 0.01 | Fwd Header Length | 0.0 |

| RF: Feature | Score | XGB: Feature | Score |
|---|---|---|---|
| Flow Bytes s | 0.19 | Average Packet Size | 0.96 |
| Average Packet Size | 0.17 | CWE Flag Count | 0.01 |
| Total Length of Fwd Packets | 0.17 | Init Win bytes forward | 0.01 |
| Inbound | 0.09 | Flow Bytes s | 0.0 |
| Total Backward Packets | 0.08 | Total Backward Packets | 0.0 |
| Fwd Packets s | 0.07 | URG Flag Count | 0.0 |
| Bwd Header Length | 0.06 | Flow Duration | 0.0 |
| Flow Packets s | 0.05 | ACK Flag Count | 0.0 |
| Init Win bytes forward | 0.04 | min seg size forward | 0.0 |
| Flow Duration | 0.02 | Init Win bytes backward | 0.0 |
Fig. 3
Features and weights for AdaB classifier for DNS

4.2 Features and weights

After the data is cleansed and normalized, we use the following ML feature selection algorithms to measure the influence of the 26 input features on the output feature. Thus, the feature selection algorithms are not used for actual feature selection, but to compute the weights between the input features and the output feature for our FCM models; we evaluate the impact of the input features on the output feature using a single-layer FCM model. The ML algorithm descriptions are referenced from Scikit-Learn [59], an open-source ML library for Python, and from the Machine Learning Mastery website [60]. In our initial quest for a simple solution, we first chose a few well-known statistical feature selection methods, such as SelectKBest-classification (SKB-C) and SelectKBest-ChiSquared (SKB-C2), and found the results to be promising. We then expanded the search to ML-based feature selection methods, selecting a few tree-based and boosting ML algorithms, such as Random Forest (RF) and eXtreme Gradient Boosting (XGB), that are known to perform well in supervised learning classification problems. We also ran the Gradient Boosting Classifier (GBC) and Mutual Information (MI) algorithms and presented those results in our earlier work [34]. However, since those algorithms were two orders of magnitude slower than the best-performing algorithms, with slightly inferior accuracy, we have dropped them from our analysis here.
1.
SelectKBest: selects the “K” best features from univariate statistical tests. It provides a few options for the score function. We chose this classification (SKB-C) and chi-squared (SKB-C2) score functions to obtain the scores for the input features.
 
2.
Extra Tree Classifier (ETC): an estimator that fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and to control over-fitting.
 
3.
Principal Component Analysis (PCA): a dimensionality reduction technique that transforms a large number of correlated features into a new small set of uncorrelated key features.
 
4.
Adaptive Boosting Classifier (AdaB): an open-source ensemble learning technique that combines several weak classifiers based on decision trees to form a single strong classifier [61].
 
5.
Category Boosting Classifier (CatB): an open-source gradient boosting classifier on decision trees with higher performance and accuracy than the Gradient Boosting Classifier [62].
 
6.
Extreme Gradient Boosting (XGB): an optimized, distributed, highly efficient gradient boosting library that builds trees in parallel for speed and accuracy [63].
 
7.
RandomForest (RF): an ensemble of multiple decision tree models that fits a number of decision tree classifiers on various sub-samples of the dataset.
 
In addition, we tested the Recursive Feature Elimination (RFE) and the RFE Decision Tree Classifier (RFE-DTC) algorithms. These algorithms assign the rank 1 to the top ten features, and rank the remaining features as 2, 3, and so on. However, they do not output the selection scores (weights). For this reason, we were unable to utilize them in our hybrid approach.
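With Scikit-Learn, the two SelectKBest score functions can be queried for per-feature scores roughly as follows. The synthetic data is a stand-in for the preprocessed flow records; note that we read the `scores_` attribute rather than keeping only the top-k columns, since the scores become FCM edge weights.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2, f_classif

# Synthetic stand-in for the normalized flow features and labels
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# SKB-C: ANOVA F-test scores for every feature
skb_c = SelectKBest(score_func=f_classif, k="all").fit(X, y)
scores_c = skb_c.scores_

# SKB-C2: chi-squared requires non-negative inputs, so rescale X to [0, 1]
# first (as in Eq. 2) before scoring
X01 = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
skb_c2 = SelectKBest(score_func=chi2, k="all").fit(X01, y)
scores_c2 = skb_c2.scores_
```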
Fig. 4
FCM model for SKB-C algorithm for LDAP attack dataset
The score for each input feature represents the measure of that feature's influence on the output feature. The raw feature scores reported by these algorithms for the LDAP attack dataset are shown in Table 3. SKB-C, SKB-C2, and CatB report large-magnitude scores that we convert to the range [0, 1]. For each algorithm, the scores for the features are unique with very few exceptions, indicating that our data preprocessing steps effectively remove redundant features. The features and weights for the DNS attack FCM obtained from the AdaB feature selection algorithm are plotted in Fig. 3. However, none of these algorithms indicate whether the score correlations are positive or negative. Only in this specific step (explained more fully in method 2, Section 4.3.2) do we use our domain knowledge to add the appropriate \(+\) or − signs to the weights. We thus ensure that the scores are appropriately transformed into FCM edge values (i.e., weights \(w_{ji}\) in Eq. 1) falling within our desired range of \([-1, 1]\). The initial feature values \(A_j^t\) are extracted from the dataset; each is the value of a feature for a record in the dataset. For example, the feature “Total Length of Fwd Packets” may have a value of 60 bytes for a certain packet, and “Flow Bytes/s” may have a value of 292 bytes. All the feature values in the dataset are first normalized using Eq. 2 so that they are comparable on the same scale. These values are then used along with the weights to compute the value of “Label,” the target column, using the FCM formula in Eq. 1. We remind the reader that the feature selection process is used mainly to obtain the weights between the input features and the output feature to fit our FCM model. In Section 4.3, we use our pre-processed dataset with 26 input features as the starting point for our experiments with FCM models.
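The conversion of raw scores into signed FCM edge weights can be sketched as follows. Dividing by the maximum score is one plausible way to map large-magnitude scores into [0, 1]; the sign vector encodes the domain-knowledge step, and the example scores are the top SKB-C values from Table 3 with illustrative signs.

```python
import numpy as np

def scores_to_weights(scores, signs):
    """Rescale raw feature-selection scores into [0, 1], then apply
    domain-knowledge signs so the FCM weights w_ji lie in [-1, 1].

    signs : sequence of +1 / -1 values supplied by the analyst.
    """
    scores = np.asarray(scores, dtype=float)
    w = scores / scores.max()          # large-magnitude scores -> [0, 1]
    return w * np.asarray(signs)       # signed weights in [-1, 1]

# Top three SKB-C scores for LDAP (Table 3), with illustrative signs
w = scores_to_weights([35_038_060, 8_191_439, 1_758_414], [-1, +1, +1])
```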
Next, we create an FCM model for each feature selection algorithm, with its ten selected inputs and one output (label), using the tool Mental Modeler [64]. Figure 4 shows one example FCM model, associated with the SKB-C feature selection algorithm for an LDAP attack. The transformed scores from the feature selection algorithms are utilized as the weights in the FCM model. The blue arrows indicate positive influence, and the orange arrows indicate negative influence; the thickness of a line indicates the intensity of the influence of the input feature on the output feature. A feature with a positive influence implies that an increase in its value increases the chance of that packet being a benign (BN) packet. A feature with a negative influence implies that an increase in its value increases the chance of that packet being an attack packet. The final classification of a packet is based on the cumulative effect of all input features on the output feature.
Consider, for instance, the two input features “Average Packet Size” (negative influence) and “Down Up Ratio” (positive influence) in Fig. 4, the FCM model for the SKB-C algorithm on the LDAP attack dataset. A large packet size increases the possibility that the packet is classified as an LDAP attack packet on account of its negative influence on the output feature. Conversely, a small “Down Up Ratio” contributes little positive weight to the output and thus also increases the possibility that the packet is classified as an attack packet. This behavior is consistent with real-world situations: large packets are likely to be LDAP attack packets, as they are responses from an LDAP server in an LDAP amplification or reflective DDoS attack, and a small “Down Up Ratio” reflects the congestion in the network caused by a DDoS attack.
Fig. 5
Confusion matrix

4.3 Methodology

We define a predictor’s performance in classifying a packet as “benign” or “attack” by measuring its accuracy, precision, recall, F1 score, and confusion matrix.
TP (True Positive): an attack packet that has been correctly identified.
TN (True Negative): a benign packet that has been correctly identified.
FP (False Positive): a benign packet that has been incorrectly identified as “attack”.
FN (False Negative): an attack packet that has been incorrectly identified as “benign.”
We also define the following terms for our performance metrics (see Eqs. 3–6), consistent with the definitions in Pedregosa et al. [59]. Accuracy measures how close the predicted values are to the actual values. Precision is the fraction of true positive predictions among all positive predictions made by the model; it tells us how many of the predictions of a given class were correct out of all predictions of that class. Recall, also known as the true positive rate or sensitivity, is the ratio of true positives to the total number of elements that actually belong to the positive class; it tells us how many members of a given class we were able to predict correctly. The F measure, also known as the F1 score, is the harmonic mean of precision and recall.
$$\begin{aligned} Accuracy&= \frac{TP + TN}{TP+TN+FP+FN}\end{aligned}$$
(3)
$$\begin{aligned} Precision&= \frac{TP}{TP+FP}\end{aligned}$$
(4)
$$\begin{aligned} Recall&= \frac{TP}{TP+FN}\end{aligned}$$
(5)
$$\begin{aligned} F1~score&= 2\times \frac{Precision \times Recall}{Precision + Recall} \end{aligned}$$
(6)
The “confusion matrix” represents the four possible outcomes of a binary classification in a \(2\times 2\) matrix: the true positives, the true negatives, the false positives, and the false negatives. Figure 5 shows the confusion matrix with the associated labels and formulas. The rows represent the true labels for the positive and negative classes, and the columns represent the predicted labels. In our case, the positive class is the “attack” label and the negative class is the “benign” label. The figure also shows the formulas for precision, recall, and accuracy as they pertain to the confusion matrix. The computation of the confusion matrix is explained with an example in Section 5.
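Eqs. 3–6 follow directly from the four counts. As a check, the counts below are read off the SKB-C confusion matrix in Table 4 (rows are true labels, columns predicted, per Fig. 5) and reproduce its reported accuracy and F1 score.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Eqs. 3-6 from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# SKB-C row of Table 4 (SYN test set): attack is the positive class
acc, prec, rec, f1 = classification_metrics(tp=1365193, tn=55874,
                                            fp=353, fn=2457)
# acc -> 99.80%, f1 -> 99.90%, matching Table 4
```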
Even after partially rectifying the imbalance in the CICDDoS2019 dataset by adding benign records from all csv files to all the attack csv files, as described in Section 4.1, the data remains imbalanced. With such imbalanced data, it is best to evaluate the performance of the FCM predictions by examining not just the accuracy, but also the precision, recall, F1 score, and confusion matrix in totality. Normally in ML methods, a single dataset is split into “train” and “test” sets, and the model's performance and accuracy are measured. However, we train the model using one dataset (for example, syn-Jan) and test using another dataset (for example, syn-Mar) to demonstrate that the learning process from one dataset is reliable and fast for detection on another dataset. We experiment with and compare the results of two different methods for the detection of all types of attacks included in the dataset: (1) the traditional pure ML-only approach described in Section 4.3.1 and (2) our hybrid ML-FCM approach described in Section 4.3.2.
Fig. 6
Flowchart for pure ML method with no FCM involvement
Fig. 7
Flowchart for hybrid method: ML feature select, FCM classify

4.3.1 ML algorithms only, with No FCM involvement

In this method, we follow the standard process for testing any ML model. We train using the training datasets <attack>-Jan and test using the corresponding test datasets <attack>-Mar; thus, we deviate slightly from the usual train-test split of a single dataset. Tree-based and boosting algorithms perform well for supervised learning classification problems. Therefore, we choose a few ML classifiers that can perform both feature selection and classification: Adaptive Boosting, Category Boosting, Extra Tree Classifier, Neural Networks - Multi-Layer Perceptron (NN-MLP), Random Forest, and eXtreme Gradient Boosting. We include the NN-MLP algorithm to demonstrate the difference between FCM and NN-MLP computations, even though the two are similar in having weights between features: NN-MLP has hidden layers between the input and output features, which an FCM does not possess, and NN-MLP utilizes back-propagation to correct incorrect predictions and thus improve its accuracy. We train this model with 15 hidden layers and 100 maximum iterations. We tested the Gradient Boosting algorithm as well, but dropped it from our analysis because its computations are slower, even though its accuracy is comparable. We also tested another popular classifier, Naive Bayes, but its classification results were unsatisfactory, so it is excluded from this study. Figure 6 shows the experimentation steps for this pure ML method.
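A sketch of this train-on-January, test-on-March workflow with one of the classifiers (Random Forest); the synthetic matrices stand in for the normalized <attack>-Jan and <attack>-Mar feature data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-ins for the January (train) and March (test) flow records
X_jan, y_jan = make_classification(n_samples=1000, n_features=26, random_state=1)
X_mar, y_mar = make_classification(n_samples=250, n_features=26, random_state=2)

# Fit on the training-day data, evaluate on the testing-day data
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_jan, y_jan)
acc = accuracy_score(y_mar, clf.predict(X_mar))
```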

4.3.2 Hybrid: ML feature selection with FCM classification

In our hybrid method, we evaluate eight ML algorithms for feature selection, including the two SelectKBest score functions. The scores for all the input features are computed using the training dataset. These scores are used as weights in Eq. 1 to compute the “Label” values. The computed values lie in the range \([-1, 1]\), following the process described in Section 4.2. We then classify, i.e., label, the packets as attack (0) or benign (1) based on a threshold value. The threshold value is computed automatically during the training phase based on the Area Under the Curve (AUC) value. AUC, the area under the ROC (Receiver-Operating Characteristic) curve, is one of the best evaluation metrics for binary classification problems; the ROC curve is a visual representation of the model's performance across all threshold values [66]. Our program computes AUC values for all possible “Label” value predictions. Since the AUC value is 1 for a perfect model, we select the threshold that yields the maximum AUC value as our classification threshold; this is the value that provides the maximum accuracy and F1 score on the training set. The maximum AUC value differs for each feature-selection-algorithm and FCM pairing. This automatic AUC-based threshold computation contributes approximately 25–90% of the total computation time for model training, depending on the algorithm used. For each feature selection algorithm, we compute the aggregate counts for TP, TN, FP, and FN, and from them the accuracy and F1 score with Eqs. 3–6.
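The automatic threshold step can be sketched with Scikit-Learn's ROC utilities. Here we pick the cut point maximizing TPR − FPR (Youden's J statistic), one concrete way to realize the best operating point on the ROC curve; the scores and labels are toy values.

```python
import numpy as np
from sklearn.metrics import roc_curve

def auto_threshold(y_true, label_scores):
    """Pick a classification threshold from the ROC curve.

    Sketch of the automatic threshold step: select the cut point that
    maximizes TPR - FPR (Youden's J) on the training data.
    """
    fpr, tpr, thresholds = roc_curve(y_true, label_scores)
    return thresholds[np.argmax(tpr - fpr)]

# Toy FCM outputs: benign (1) flows score high, attack (0) flows score low
y = np.array([0, 0, 0, 1, 1, 1])
s = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
thr = auto_threshold(y, s)
pred = (s >= thr).astype(int)
```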
In the testing phase, we run our FCM classifier to compute the value of the “Label” (output) column on the testing dataset. Next, we use the computed thresholds to classify the packets in the dataset. We then compute our performance metrics on this dataset to determine the efficacy of our hybrid approach. Figure 7 illustrates the sequence of steps performed for this method.
Table 4
Metrics for SYN dataset with 26, 10, and 5 input features
Prediction by hybrid approach - ML and FCMs
Training dataset: syn-Jan; Test dataset: syn-Mar (SYN - 1,367,650, BENIGN - 56,227)

| Features | Algorithm | Accuracy % | F1 Score % (attack / benign) | Confusion Matrix | Time (s) |
|---|---|---|---|---|---|
| 26 | SKB-C | 99.80 | 99.90 / 97.55 | \(\left[ \begin{array}{ll}1365193 & 2457 \\ 353 & 55874\end{array}\right] \) | 7.52 |
| 10 | SKB-C | 99.80 | 99.90 / 97.55 | \(\left[ \begin{array}{ll}1365193 & 2457 \\ 353 & 55874\end{array}\right] \) | 8.03 |
| 5 | SKB-C | 99.80 | 99.90 / 97.55 | \(\left[ \begin{array}{ll}1365193 & 2457 \\ 353 & 55874\end{array}\right] \) | 4.07 |
| 26 | SKB-C2 | 99.80 | 99.90 / 97.53 | \(\left[ \begin{array}{ll}1365173 & 2477 \\ 353 & 55874\end{array}\right] \) | 12.01 |
| 10 | SKB-C2 | 99.80 | 99.90 / 97.53 | \(\left[ \begin{array}{ll}1365173 & 2477 \\ 353 & 55874\end{array}\right] \) | 12.35 |
| 5 | SKB-C2 | 98.03 | 99.04 / 69.63 | \(\left[ \begin{array}{ll}1366857 & 793 \\ 25775 & 30452\end{array}\right] \) | 7.75 |
After pre-processing, our dataset contains 26 input features. Here, we would like to mention that we compared the performance of our FCM models incorporating the top 5 and top 10 input features against the equivalent FCM models incorporating all 26 input features. The difference in detection accuracy between FCMs using a reduced feature set and FCMs using the full set of 26 features is marginal. At the other extreme, when just the top 5 input features are used in statistical methods such as SKB-C and SKB-C2, the computation time shrinks greatly, but the evaluation metrics either remain comparable or worsen. Table 4 shows the full set of metrics for the SYN dataset for 26, 10, and 5 features.
Fig. 8
Hybrid method metrics for all types of DDoS attacks

4.3.3 Hybrid: computational and space complexity

In this section, we discuss the time and space complexity of the two main feature selection algorithms (SKB-C and SKB-C2) that are paired with our FCM classifier.
A. Asymptotic running time complexity, training phase
The asymptotic running time (computational) complexity of our feature selection process depends on the complexity of scoring each candidate feature based on a univariate statistical test, followed by the selection of the k best features from the computed feature scores. If n is the sample size, k is the number of selected best features, and the full set of features is f, then we compute the computation complexity as follows.
SKB-C utilizes the Analysis of Variance (ANOVA) F-test [65] to select the top features. The assignment of a score to each feature requires O(n) operations. Since there are f features, the overall computational complexity of score assignment is O(n.f). SKB-C2, the \(\chi ^2\) test, has a similar computational complexity of O(n.f). Next, both SKB-C and SKB-C2 require sorting the f features by score, stopping after the k top results, i.e., a partial sort of the feature set. This operation typically requires O(f.log k) time. Therefore, the overall time complexity of the SKB-C or SKB-C2 feature selection algorithm, which includes the scoring and partial-sorting operations, is O(n.f) \(+\) O(f.log k) = O(n.f), since \(n \gg \log k\).
The complexity of the FCM classifier is driven by the FCM computational rule in Eq. 1 and the computation of the Area Under the Receiver-Operating Characteristic Curve (AUC-ROC) metric [66]. The dimensions of matrix A are \(n \times k\), while the vector of weights w is of size k. We thus have to account for k multiplications and k additions per row of A, over n such rows, resulting in a time complexity of O(n.k) for the FCM computational rule. The time complexity of computing the AUC-ROC metric is driven by the rank-sorting algorithm, which runs in O(n.log n) time. Therefore, the overall time complexity of the FCM classifier is O(n.k) + O(n.log n) = O(n.log n), since typically \(\log n \gg k\).
Finally, the total complexity of our hybrid algorithm (the SKB-C or SKB-C2 feature selector, sequentially paired with the FCM classifier) is simply the sum of their respective complexities, i.e., O(n.f) + O(n.log n) = O(n.log n), since typically \(\log n \gg f\). Therefore, the overall time complexity of the hybrid scheme is driven by the time complexity of the AUC-ROC computation. If a feature selection algorithm other than SKB-C or SKB-C2 has a time complexity greater than O(n.log n), then that algorithm will have a detrimental effect on the overall time complexity of the hybrid scheme.
B. Asymptotic running time complexity, testing phase
In the testing phase, neither the feature selection algorithm nor the AUC-ROC algorithm is utilized. Since the FCM classifier executes by itself, the time complexity of the attack detector is O(n.k) for the whole set, or O(k) for a single traffic flow. Thus, the detection mechanism is computationally extremely efficient.
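The per-flow detection cost in the testing phase is one dot product over the k stored weights plus a threshold comparison, i.e., O(k) work per flow. A sketch (function name, values, and threshold are illustrative):

```python
import numpy as np

def classify_flow(x, w, threshold):
    """O(k) per-flow detection: FCM output (sigmoid of dot product)
    compared against the stored threshold.

    Returns 1 (benign) if the output clears the threshold, else 0 (attack).
    """
    score = 1.0 / (1.0 + np.exp(-np.dot(x, w)))   # sigmoid squashing
    return int(score >= threshold)

# k = 3 selected features: normalized values x, stored weights w
label = classify_flow(x=np.array([0.9, 0.1, 0.4]),
                      w=np.array([0.5, -0.3, 0.2]),
                      threshold=0.6)
```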
C. Space complexity
During the training phase, the size of the dataset, O(n.f), and not the space required to store the feature scores, O(f), is the dominating factor. During the testing phase, the space needs are driven by the space needed to record the flow (a few bytes), the set of feature scores, O(f), and the set of weights, also O(f). Thus, space-wise too, the detection mechanism is extremely efficient.

4.3.4 Hybrid: system scalability

The hybrid method that we propose can be deployed as part of a host-based IDS on devices with limited resources, such as IoT devices, in a variety of networks, including 5G networks. The trained model is saved on the devices and consists of the feature set, the corresponding feature weights, and the thresholds for detecting each kind of attack. The training is presumed to be conducted offline; there is a communication overhead for pushing the parameters of the trained model from the backend to each of the devices, similar to the deployment of software upgrades to hosts. However, this push occurs only when a model update is needed, which is infrequent. Therefore, the model-push step is highly scalable irrespective of the number of deployed devices. In the testing (operational) phase, the devices do not interact with one another and therefore incur no inter-device communication overhead.
The overall network throughput of these devices depends on the underlying communication technology, such as Ethernet, Wireless, and Near Field Communication (NFC). If the host-based IDS is deployed inline, the performance impact of the trained model on the device monitoring the incoming traffic is minimal on account of the low computational complexity (processing overhead) described earlier in this section. Thus, the effect on the overall throughput will be minimal. However, since DDoS attacks occur at the network layer, the security component, i.e., the host-based IDS on the device, is free to respond swiftly either to alert the administrator and/or to block the incoming traffic.

5 Experimental results and analyses

We used the cleansed CICDDoS2019 dataset to test both the traditional ML-only approach described in Section 4.3.1 and our hybrid ML-FCM approach described in Section 4.3.2. We used the Scikit-Learn [59] Python libraries for the ML feature selection algorithms and ML classifiers, and wrote our own Python code for the FCM and threshold computations. The results are described below.
The evaluation metrics for our proposed hybrid method for all attack types are shown in Fig. 8. Four sets of histogram plots are shown for each attack type: accuracy (red), precision (green), recall (blue), and computation time (cyan). Note that the dataset for each attack type also includes benign packets, as shown in Table 2. Each vertical bar represents the ML algorithm that was used for the computation of weights, training, and testing the models, complementing the FCM algorithm. Since both “benign” and “attack” classifications are equally important, we examine the macro averages of the precision and recall scores. The recorded time is the computation time for both training and testing. Here, the training and the threshold computation form the bulk of the recorded time, whereas the prediction (classification) is extremely fast. We also note that the time is a function of the size of the dataset.
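The macro-averaged scores discussed above can be computed with the same Scikit-Learn library used in our experiments; the toy labels below are illustrative:

```python
from sklearn.metrics import precision_score, recall_score

# Toy ground truth and predictions: 1 = attack, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

# average="macro" takes the unweighted mean over the two classes, so the
# "benign" and "attack" classes count equally regardless of class sizes.
p = precision_score(y_true, y_pred, average="macro")
r = recall_score(y_true, y_pred, average="macro")
print(p, r)  # → 0.75 0.75
```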
Figures 9, 10, and 11 show the confusion matrix for each attack type for the hybrid model, in percentage values. This metric provides a clear picture of the false positives and false negatives for each attack type; darker shades of blue indicate higher proportions of correct predictions. For example, for DNS attack detection with our SKB-C-FCM hybrid algorithm, the false positives and false negatives are very low, while the true positives and true negatives are very high. In contrast, for SSDP attack detection with the PCA-FCM hybrid algorithm, the true positives are clearly too low, and the false positives are too high.
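As a hedged sketch of how percentage-valued confusion matrices like those in Figs. 9, 10, and 11 can be produced (the toy labels are our own), Scikit-Learn's confusion_matrix supports row-wise normalization:

```python
from sklearn.metrics import confusion_matrix

# Toy labels: 0 = benign, 1 = attack.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]

# normalize="true" expresses each row as a fraction of the actual class,
# i.e., per-class rates; multiplying by 100 yields percentages.
cm = confusion_matrix(y_true, y_pred, normalize="true") * 100
print(cm)  # row 0 holds [TN%, FP%], row 1 holds [FN%, TP%]
```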
Fig. 9 Hybrid method Confusion Matrix - 1
Fig. 10 Hybrid method Confusion Matrix - 2
Fig. 11 Hybrid method Confusion Matrix - 3
Table 5 summarizes the best performers among all tested attacks for both the traditional ML-only method and our hybrid method. Here, we show the accuracy and F1-score of the best-performing hybrid model algorithms alongside those of the best-performing ML algorithm. Since the F1-score is the harmonic mean of precision and recall, we consider it the best evaluation metric for our analysis. For the ML-only method, the XGB algorithm has the best overall scores among the six ML algorithms that we tested; further details on our ML candidate algorithm tests can be found in [34]. Thus, we select XGB from the six ML algorithms for comparison with our hybrid method.
Table 5
Best metrics for both methods

Attack    Method    Algorithm   Accuracy (%)   F1-score (%)
DNS       ML-only   XGB         100.0          99.97
DNS       Hybrid    SKB-C       99.92          98.87
DNS       Hybrid    SKB-C2      99.93          99.05
LDAP      ML-only   XGB         99.92          99.33
LDAP      Hybrid    SKB-C       99.93          99.41
LDAP      Hybrid    SKB-C2      99.97          99.77
MSSQL     ML-only   XGB         99.98          99.76
MSSQL     Hybrid    SKB-C       99.98          99.62
MSSQL     Hybrid    SKB-C2      99.90          98.51
NetBIOS   ML-only   XGB         99.99          99.86
NetBIOS   Hybrid    SKB-C       99.97          99.54
NetBIOS   Hybrid    SKB-C2      99.89          98.44
NTP       ML-only   XGB         100.0          99.98
NTP       Hybrid    SKB-C       99.52          97.95
NTP       Hybrid    SKB-C2      99.52          97.95
NTP       Hybrid    XGB         99.62          98.38
Portmap   ML-only   XGB         99.99          99.99
Portmap   Hybrid    SKB-C       99.38          99.16
Portmap   Hybrid    SKB-C2      99.55          99.39
SNMP      ML-only   XGB         100.0          99.98
SNMP      Hybrid    SKB-C       99.94          98.91
SNMP      Hybrid    SKB-C2      99.95          99.13
SSDP      ML-only   XGB         100.0          99.99
SSDP      Hybrid    SKB-C       99.95          99.41
SSDP      Hybrid    SKB-C2      99.94          99.38
SYN       ML-only   XGB         99.93          98.68
SYN       Hybrid    SKB-C       99.81          96.40
SYN       Hybrid    SKB-C2      99.55          89.57
SYN       Hybrid    XGB         99.88          97.57
TFTP      ML-only   XGB         100.0          99.93
TFTP      Hybrid    SKB-C       99.95          96.95
TFTP      Hybrid    XGB         99.95          96.97
UDP       ML-only   XGB         99.94          99.02
UDP       Hybrid    SKB-C       99.96          99.37
UDP       Hybrid    SKB-C2      99.96          99.37
UDPLag    ML-only   XGB         99.23          94.41
UDPLag    Hybrid    SKB-C       97.82          82.25
UDPLag    Hybrid    SKB-C2      98.36          89.10
UDPLag    Hybrid    XGB         98.07          83.59
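Since the F1-score drives the comparison in Table 5, the selection of the best performer can be sketched as follows (the scores are illustrative, not values from the table):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; it penalizes an imbalance
    between the two more strongly than an arithmetic mean would."""
    return 2 * precision * recall / (precision + recall)

# Illustrative per-algorithm scores (not taken from Table 5).
candidates = {"XGB": f1(0.999, 0.998), "SKB-C-FCM": f1(0.994, 0.982)}
best = max(candidates, key=candidates.get)
print(best)  # → XGB
```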
In our hybrid method, both SKB-C-FCM and SKB-C2-FCM perform well for DNS, LDAP, MSSQL, NetBIOS, Portmap, SNMP, SSDP, TFTP, and UDP attacks, while XGB performs well for NTP, SYN, and UDPLag attacks. Although the hybrid methods take somewhat longer than the ML-only method, they are simple and suited for continuous learning, not just offline training of models. The attributes of the saved model, such as the feature weights and the threshold value, can be adjusted as new data becomes available. This is distinctly advantageous for intrusion detection when new traffic patterns occur and the trained models no longer accurately reflect network behavior. Here, we note that the weights between the input features and the output feature are computed using the aforementioned ML feature selection algorithms, while the FCM algorithm is used to train and test the model. Although a few other ML-FCM pairings (besides SKB-C-FCM and SKB-C2-FCM) fare well in the hybrid approach, their results are not consistently good. We find that SKB-C-FCM and SKB-C2-FCM consistently demonstrate the best predictions. They are simple, transparent, statistical methods with a low memory footprint and can be easily incorporated into a host-based IDS solution for IoT devices. A summary of the contributions of this study follows.
1. The hybrid model solution tackles the highly relevant and timely problem of intrusion detection in resource-constrained devices.
2. The hybrid approach offers a lightweight, transparent, and reliable IDS solution that combines fuzzy cognitive maps with feature selection algorithms.
3. This method is a compelling alternative to traditional ML models, overcoming some of their limitations, such as computation cost, black-box behavior, and randomness in predictions.
4. The bias introduced by human expert input in traditional FCMs is alleviated by utilizing ML-based feature selection algorithms to compute the weights.
5. The threshold value that decides the classification of a packet is computed automatically based on the maximum AUC value.
6. The predictions are entirely based on empirical data and do not depend on human input at any stage.
7. Statistical feature selection algorithms combined with FCM perform better than traditional ML algorithms; the predictions are transparent, fast, and reliable.

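The automatic, AUC-driven threshold computation summarized above can be sketched with a common ROC-based criterion; maximizing TPR − FPR (Youden's J statistic) is our illustrative stand-in, not necessarily the paper's exact rule:

```python
from sklearn.metrics import roc_curve

# Illustrative FCM activation scores with their true labels (1 = attack).
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.5, 0.7, 0.8, 0.9]

fpr, tpr, thresholds = roc_curve(y_true, scores)
# Youden's J: pick the threshold where TPR - FPR is largest, i.e., the
# operating point on the ROC curve farthest above the diagonal.
j = tpr - fpr
best_threshold = thresholds[j.argmax()]
print(best_threshold)  # → 0.7
```

Because this runs only during training, its cost never burdens the deployed device.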
6 Conclusion and future work

We propose a novel method for intrusion detection that uses the interdependency of traffic features to classify a packet as benign or malicious. Our work showcases a hybrid algorithm for detecting a wide variety of DDoS attacks in which we (1) identify a set of high-impact features and their weights using statistical algorithms, (2) train and test an FCM-based model that has low memory overhead and is simple and transparent, and (3) incorporate automatic threshold value computation based on the AUC score for packet classification. Contextual information from the traffic is used to create a single-stage FCM with features and weights, and then to train and test the model. The final classification of a packet as “benign” or “attack” is made by means of completely transparent computational steps in the FCM model.
Conventional FCM models require considerable “expert knowledge” in model construction. However, our FCM model extracts all numerical values for the FCM graph’s features and edge weights algorithmically. Expert judgment is required only to determine whether an extracted feature has a positive or a negative influence on the output feature (label), which is typically done just once. Even such a judgment is based on empirical evidence without requiring guesswork. The threshold value for classification is automatically computed based on the best AUC score of the trained model.
Our experimental results show that for all types of tested DDoS attacks, our hybrid approach is extremely reliable for selected ML-FCM hybrid pairings compared with a pure-ML approach. The simplicity of implementation is the result of pairing simple statistical feature selection methods such as SKB-C and SKB-C2 with a single-stage FCM model. During the feature selection phase of our hybrid method, the features and weights are transparently available for security analysts to review and analyze. In fact, this very information is utilized in the construction of our FCM model as mentioned in Section 2.4. The classification threshold (decision boundary) is also clearly visible in the trained model. Thus, the FCM graph and all related decision-making data are open and transparent for the purpose of analysis, if required.
In addition, feature selection and threshold value computation are performed only during the training phase. Therefore, run-time detection, which requires just the FCM portion (but not the AUC-based threshold computation), is computationally efficient, has a very low memory footprint, and is transparent, making it suitable for resource-constrained hosts such as smart home appliances, wearable medical devices, vehicular management systems, radio frequency identification (RFID)-enabled devices, programmable logic controllers (PLCs), sensors, probes, and cameras. Furthermore, our hybrid model is completely deterministic and thus consistent in its results. Finally, owing to the transparency and modularity of our hybrid approach, there exists the potential for further improvement in the accuracy and speed of classification. Such a hybrid approach is therefore worth serious consideration for use in IoT devices.
Some caveats to note are as follows. (1) Although our hybrid method performs well for individual DDoS attacks as demonstrated in this paper, its efficacy in detecting other kinds of attacks is as yet untested. (2) FCM classifiers are sensitive to the choice of the feature selection algorithm; as the results in Section 5 demonstrate, they work best with SKB-C and SKB-C2. (3) Since our model is a supervised learning model, labeled data is needed for training purposes. Thus, if a new kind of traffic pattern or attack were to emerge, the model would have to be retrained.
Future work entails the application of our hybrid approach to a multi-class dataset that includes diverse attack types, reflecting a real-world scenario, and the incorporation of multi-stage FCM models for even higher detection accuracy, albeit at some sacrifice of execution speed. In addition, a detailed study of our model’s robustness against adversarial attacks and against new, unseen types of network traffic is another topic for future work.

Declarations

Competing interests

The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Title
An FCM-based hybrid method for DDoS attack detection in resource-constrained devices
Authors
Prathibha Keshavamurthy
Sarvesh Kulkarni
Publication date
27-10-2025
Publisher
Springer International Publishing
Published in
Annals of Telecommunications / Issue 11-12/2025
Print ISSN: 0003-4347
Electronic ISSN: 1958-9395
DOI
https://doi.org/10.1007/s12243-025-01130-z
References

1. Hutchins E, Cloppert M, Amin R (2010) Intelligence-driven computer network defense informed by analysis of adversary campaigns and intrusion kill chains. https://lockheedmartin.com/content/dam/lockheed-martin/rms/documents/cyber/LM-White-Paper-Intel-Driven-Defense.pdf. Last accessed on Aug 29 2025
2. 3GPP TS 22.261 (2023) Service requirements for the 5G system, v18.1.0
3. ETSI MEC ISG (2019) Multi-access edge computing (MEC); framework and reference architecture. ETSI GS MEC 003 V2.1.1, Jan 2019
4. Alliance NGMN (2016) Description of network slicing concept. NGMN 5G:P1
5. Poularakis K, Iosifidis G, Tassiulas L (2018) SDN-enabled tactical ad hoc networks: extending programmable control to the edge. IEEE Commun Mag 56(7):132–138
6. Zetter K (2014) Countdown to zero day: Stuxnet and the launch of the world’s first digital weapon. The Crown Publishing Group
8. Carpentier E, Thomasset C, Briffaut J (2019) Bridging the gap: data exfiltration in highly secured environments using Bluetooth IoTs. 2019 IEEE 37th International Conference on Computer Design (ICCD), pp 297–300. https://doi.org/10.1109/ICCD46524.2019.00044
9. MITRE ATT&CK framework. https://attack.mitre.org/. Last accessed on Aug 29 2025
10. Cui A, Stolfo SJ (2010) A quantitative analysis of the insecurity of embedded network devices: results of a wide-area scan. Proceedings of the 26th Annual Computer Security Applications Conference (ACSAC), pp 97–106
11. O’Hare J, Macfarlane R, Lo O (2019) Identifying vulnerabilities using internet-wide scanning data. IEEE 12th International Conference on Global Security, Safety and Sustainability (ICGS3). https://doi.org/10.1109/ICGS3.2019.8688018
12. Sharafaldin I, Lashkari AH, Hakak S, Ghorbani A (2019) Developing realistic distributed denial of service (DDoS) attack dataset and taxonomy. 2019 International Carnahan Conference on Security Technology (ICCST), pp 1–8. https://ieeexplore.ieee.org/abstract/document/8888419
13. SYN flood attack. https://www.f5.com/glossary/syn-flood-attack. Last accessed on Aug 29 2025
15. Sharafaldin I, Lashkari AH, Ghorbani A et al (2018) Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, Vol 1, pp 108–116
16. Wang M, Lu Y, Qin J (2020) A dynamic MLP-based DDoS attack detection method using feature selection and feedback. Comput Secur, Vol 88
17. Saheed YK, Abiodun AI, Misra S, Holone MK, Colomo-Palacios R (2022) A machine learning-based intrusion detection for detecting internet of things network attacks. Alex Eng J 61:9395–9409
18. Casajús-Setién J, Bielza C, Larrañaga P (2023) Anomaly-based intrusion detection in IIoT networks using transformer models. 2023 IEEE International Conference on Cyber Security and Resilience (CSR), pp 72–77
19. Becerra-Suarez FL, Fernández-Roman I, Forero MG (2024) Improvement of distributed denial of service attack detection through machine learning and data processing. MDPI Mathematics, Vol 12
20. Jemal I, Cheikhrouhou O, Haddar MA (2023) IoT DOS and DDOS attacks detection using an effective convolutional neural network. 2023 IEEE International Conference on Cyberworlds (CW), pp 373–379
21. Zhou L, Zhu Y, Zong T, Xiang Y (2022) A feature selection-based method for DDoS attack flow classification. Future Gener Comput Syst 132:67–79
22. Gniewkowski M, Maciejewski H, Surmacz T (2022) Anomaly detection techniques for different DDoS attack types. International Conference on Dependability of Complex Systems, Springer, pp 63–78
23. Mihoub A, Fredj OB, Cheikhrouhou O, Derhab A, Krichen M (2022) Denial of service attack detection and mitigation for internet of things using looking-back-enabled machine learning techniques. Comput Electr Eng, Vol 98
24. Alzubi OA, Alzubi JA, Qiqieh I, Al-Zoubi A (2025) An IoT intrusion detection approach based on Salp swarm and artificial neural network. Int J Netw Manag, Vol 35
25. Alzubi OA, Alzubi JA, Alazab M, Alrabea A, Awajan A, Qiqieh I (2022) Optimized machine learning-based intrusion detection system for fog and edge computing environment. MDPI Electronics, Vol 11, pp 3007–3022
26. Alweshah M, Hammouri A, Alkhalaileh S, Alzubi O (2023) Intrusion detection for the internet of things (IoT) based on the emperor penguin colony optimization algorithm. J Ambient Intell Humanized Comput 14:6349–6366
27. Prasad M, Gupta R, Tripathi S (2022) A multi-level correlation-based feature selection for intrusion detection. Arab J Sci Eng, Vol 47
28. Fan Y, Ma H, Zhang Y, Li S, Guo X, Wang B (2025) A DDoS attack detection method based on improved transformer and temporal feature enhancement. J Supercomput 81:947–968
29. Khraisat A, Alazab A (2021) A critical review of intrusion detection systems in the internet of things: techniques, deployment strategy, validation strategy, attacks, public datasets and challenges. Cybersecurity, Vol 4, pp 1–27. https://doi.org/10.1186/s42400-021-00077-7
30. Lundberg SM, Lee S-I (2017) A unified approach to interpreting model predictions. Adv Neural Inf Process Syst, Vol 30
31. Ribeiro MT, Singh S, Guestrin C (2016) “Why should I trust you?”: Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 1135–1144
32. Barbierato E, Gatti A (2024) The challenges of machine learning: a critical review. MDPI Electronics, Vol 13
33. Keshavamurthy P, Kulkarni S (2023) Early detection of reconnaissance attacks on IoT devices by analyzing performance and traffic characteristics. 2023 IEEE International Conference on Cyber Security and Resilience (CSR), pp 187–193, Venice, Italy. https://doi.org/10.1109/CSR57506.2023.10224986
34. Keshavamurthy P, Kulkarni S (2024) A hybrid machine learning – fuzzy cognitive map approach for fast, reliable DDoS attack detection. 8th Cyber Security in Networking Conference (CSNET 2024), Dec 4–6, Paris, France
35. Kosko B (1986) Fuzzy cognitive maps. Int J Man-Mach Stud 24(1):65–75
36. Axelrod R (1976) The analysis of cognitive maps. In: Axelrod R (ed) Structure of Decision. Princeton University Press, Princeton, pp 55–74
37. Barbrook-Johnson P, Penn AS (2022) Fuzzy cognitive mapping. In: Systems Mapping. Palgrave Macmillan, Cham
38. Mkhitaryan S, Giabbanelli P, Wozniak MK, Nápoles G, De Vries N, Crutzen R (2022) FCMpy: a Python module for constructing and analyzing fuzzy cognitive maps. PeerJ Comput Sci 8:e1078
39. Kosko B (1992) Neural networks and fuzzy systems: a dynamical systems approach to machine intelligence. Prentice-Hall, Inc
40. Stylios CD, Groumpos PP (2004) Modeling complex systems using fuzzy cognitive maps. IEEE Trans Syst Man Cybern A Syst Humans 34:155–162
41. Bueno S, Salmeron JL (2009) Benchmarking main activation functions in fuzzy cognitive maps. Expert Syst Appl 36:5221–5229
42. Papageorgiou E, Stylios CD, Groumpos PP (2004) Active Hebbian learning algorithm to train fuzzy cognitive maps. Int J Approx Reason 37:217–249
43. Papageorgiou EI, Groumpos PP (2005) A weight adaptation method for fuzzy cognitive map learning. Soft Comput 9:846–857
44. Jetter AJ, Kok K (2014) Fuzzy cognitive maps for futures studies – a methodological assessment of concepts and methods. Futures 61:45–57
45. Papageorgiou EI, Salmeron JL (2012) A review of fuzzy cognitive maps research during the last decade. IEEE Trans Fuzzy Syst 21:66–79
46. Felix G, Nápoles G, Falcon R, Froelich W, Vanhoof K, Bello R (2019) A review on methods and software for fuzzy cognitive maps. Artif Intell Rev 52:1707–1737
47. Nápoles G, Leon Espinosa M, Grau I, Vanhoof K, Bello R (2018) Fuzzy cognitive maps based models for pattern classification: advances and challenges. In: Soft Computing Based Optimization and Decision Models, pp 83–98
48. Salmeron JL, Rahimi SA, Navali AM, Sadeghpour A (2017) Medical diagnosis of rheumatoid arthritis using data driven PSO-FCM with scarce datasets. Neurocomputing 232:104–112
49. Froelich W (2017) Towards improving the efficiency of the fuzzy cognitive map classifier. Neurocomputing 232:83–93
50. Nápoles G, Mosquera C, Falcon R, Grau I, Bello R, Vanhoof K (2018) Fuzzy-rough cognitive networks. Neural Netw 97:19–27
51. Mls K, Cimler R, Vaščák J, Puheim M (2017) Interactive evolutionary optimization of fuzzy cognitive maps. Neurocomputing 232:58–68
52. Wozniak MK, Mkhitaryan S, Giabbanelli PJ (2022) Automatic generation of individual fuzzy cognitive maps from longitudinal data. International Conference on Computational Science, pp 312–325
53. Nápoles G, Grau I, Falcon R, Bello R, Vanhoof K (2016) A granular intrusion detection system using rough cognitive networks. In: Recent Advances in Computational Intelligence in Defense and Security, pp 169–191
54. Aparicio-Navarro FJ, Kyriakopoulos KG, Ghafir I, Lambotharan S, Chambers JA (2018) Multi-stage attack detection using contextual information. MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM), pp 1–9
55. Aparicio-Navarro FJ, Kyriakopoulos KG, Parish D, David J, Chambers J (2016) Adding contextual information to intrusion detection systems using fuzzy cognitive maps. 2016 IEEE International Multi-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA), pp 180–186
56. Aparicio-Navarro FJ, Kyriakopoulos KG, Gong Y, Parish DJ, Chambers JA (2017) Using pattern-of-life as contextual information for anomaly-based intrusion detection systems. IEEE Access 5:22177–22193
57. Aparicio-Navarro FJ, Chadza TA, Kyriakopoulos KG, Ghafir I, Lambotharan S, AsSadhan B (2018) Addressing multi-stage attacks using expert knowledge and contextual information. 2019 22nd Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN), pp 188–194
59. Pedregosa F et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
60. Machine learning mastery. https://machinelearningmastery.com/. Last accessed on Aug 29 2025
61. Adaptive boosting classifier. https://www.sciencedirect.com/topics/engineering/adaboost. Last accessed on Aug 29 2025
62. Category boosting classifier. https://catboost.ai/. Last accessed on Aug 29 2025
63. XGBoost classifier. https://xgboost.ai/about. Last accessed on Aug 29 2025
64. Mental modeler. https://www.mentalmodeler.com/. Last accessed on Aug 29 2025