Intelligent intrusion detection based on federated learning aided long short-term memory

doi:10.1016/j.phycom.2020.101157

Physical Communication

Volume 42, October 2020, 101157

https://doi.org/10.1016/j.phycom.2020.101157

Abstract

Deep learning based intelligent intrusion detection (IID) methods have received considerable attention for computer security protection in cybersecurity. All these learning models are trained at either a single user server or a centralized server. On the one hand, it is almost impossible to train a powerful deep learning model with the data of a single user. On the other hand, collecting datasets from all user servers exposes the central server to intrusion risks and violates user privacy. To solve these problems, this paper proposes an effective IID method based on a federated learning (FL) aided long short-term memory (FL-LSTM) framework. First, the initial LSTM global model is deployed at all user servers. Second, each user trains its own local model and then uploads its model parameters to the central server. Finally, the central server aggregates the model parameters to form a new global model and distributes it to the user servers. These steps are repeated as communication rounds until the training of the intrusion detection model is complete. Simulation results show that the proposed method achieves higher accuracy and better consistency than conventional methods.

Introduction

With the widespread use of network applications and the continuous development of network attack technology, all sectors of society have paid close attention to cyberspace security technology [1], [2], [3]. Intrusion detection is a problem that urgently needs to be solved in the field of cyberspace security. In recent years, the detection of abnormal user behaviour has become an important branch of intrusion detection. Because each user has different work tasks and personal habits, user command input is sequential and diverse [4]. Shell commands are stored in bash_history in the system main folder; if an intrusion occurs, the intruder’s input commands will differ from those of the normal user. Hence, it is necessary to design a detection system that audits the shell commands entered by users to detect and prevent malicious actions such as directory traversal attacks, reading and deleting files in bulk, and uninstalling software in bulk.

In recent years, deep learning has been considered one of the most effective tools for solving various problems in cyberspace security technology [5], [6], [7], owing to its powerful feature extraction capability. However, because a user’s shell command input involves operational privacy, many users cannot share their personal datasets for model training. Recent studies show a positive correlation between the performance of machine learning models and the amount of training data: a larger training dataset usually means higher model performance [8]. Most existing intrusion detection models are built on traditional machine learning algorithms, and it is difficult to use users’ local datasets for training without compromising user privacy. This paper addresses these problems by establishing a federated learning (FL) model. FL coordinates multiple sub-servers through a central server and combines user datasets to establish a common model from which all participants benefit. The original data of each user is stored locally and is never exchanged or transmitted, so user data privacy is not put at risk.
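The training loop summarized in the abstract (deploy a global model, train locally, upload parameters, aggregate, redistribute) can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the central server aggregates by a sample-count-weighted average of the parameters (a FedAvg-style rule), and it substitutes a toy one-feature logistic regression for the LSTM so that the example stays self-contained. The names `local_train`, `aggregate`, `federated_round`, and `predict`, as well as the toy data, are hypothetical.

```python
import math

def local_train(weights, data, lr=0.5, epochs=20):
    """Train a toy logistic-regression model on one user's private data.

    `weights` is [w, b]; `data` is a list of (x, label) pairs that is only
    read here, mirroring data that stays on the user server.
    """
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += (p - y)
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return [w, b]

def aggregate(client_weights, client_sizes):
    """Sample-count-weighted average of client parameters (assumed rule)."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

def federated_round(global_weights, client_datasets):
    """One communication round: distribute, train locally, aggregate."""
    updates = [local_train(list(global_weights), d) for d in client_datasets]
    sizes = [len(d) for d in client_datasets]
    return aggregate(updates, sizes)

def predict(weights, x):
    """Return 1 when the linear score is positive, else 0."""
    return 1 if weights[0] * x + weights[1] > 0 else 0

# Toy data: label 1 when x > 0 (a stand-in for "malicious" vs "normal").
clients = [
    [(-2.0, 0), (-1.0, 0), (1.5, 1)],
    [(-0.5, 0), (0.5, 1), (2.0, 1), (3.0, 1)],
]
global_weights = [0.0, 0.0]
for _ in range(10):
    global_weights = federated_round(global_weights, clients)
```

Only the parameter lists cross the boundary between `local_train` and `aggregate`; the `(x, label)` pairs are never passed to the aggregation step, which is the property that the FL setting relies on.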

Due to the complexity of user input and the contextual relevance of shell commands, this paper proposes a federated learning aided long short-term memory (FL-LSTM) framework for intelligent intrusion detection (IID) [9], [10], [11]. The model focuses on the detection of high-risk malicious behaviours, such as directory traversal attacks, reading and deleting files in bulk, and uninstalling software in bulk. The dataset is adapted from the open source SEA dataset: attack scenarios are set up by adding attack commands, and the dataset is relabelled accordingly. Finally, independent validation datasets are used for model performance testing. Simulation results show that the proposed method can comprehensively learn the features of the sub-end user server datasets while ensuring user privacy, and that it has high classification accuracy and strong practicality.

Section snippets

Related work

Because many researchers are concerned with detecting abnormal behaviour from user shell commands, the issue has become a research hotspot in recent years. At the same time, because of the excellent classification performance of machine learning [12], [13], [14], [15], [16], [17], [18], researchers have applied machine learning approaches such as Bayesian models, support vector machines, and genetic algorithms to intrusion detection. Generally, intrusion

Dataset preprocessing

The preprocessing of the dataset is mainly completed by a Tokenizer, which vectorizes text or converts text into corresponding sequences. After a shell command block is input into the network model, the word segmenter first counts the words in the text to generate a dictionary. The input shell command block is then converted into a vector representation based on the lexicographic order. The input length is insufficient to fill the length and meet the length
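The preprocessing described above can be illustrated with a small, dependency-free sketch. It is not the Tokenizer used in the paper: the frequency-based index assignment with alphabetical tie-breaking, the reserved index 0 for padding and unseen words, the right-padding, and the names `build_vocab`, `to_sequence`, and `pad` are all assumptions made for illustration.

```python
from collections import Counter

def build_vocab(command_blocks):
    """Count words across all blocks and assign each an integer index.

    Index 0 is reserved for padding and unseen words (an assumption).
    More frequent words get smaller indices; ties break alphabetically.
    """
    counts = Counter(word for block in command_blocks for word in block.split())
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: i + 1 for i, word in enumerate(ordered)}

def to_sequence(block, vocab):
    """Convert one shell command block into a list of word indices."""
    return [vocab.get(word, 0) for word in block.split()]

def pad(sequence, length):
    """Truncate or right-pad with zeros so every input has the same length."""
    return (sequence + [0] * length)[:length]

# Two hypothetical command blocks, each a space-separated run of commands.
blocks = ["ls cd ls cat", "cd rm rm ls"]
vocab = build_vocab(blocks)
encoded = [pad(to_sequence(b, vocab), 6) for b in blocks]
```

With a fixed target length, every block becomes a vector of the same size, which is what a sequence model such as an LSTM expects as input.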

Experimental results

In the experiment, LSTM-based and CNN-based intrusion detection models are trained first. Then, the FL-LSTM model is built. Finally, the performance of the models is compared in terms of prediction accuracy, recall, precision, $F_{1}$ value and other aspects. The basic information of the four sub-end datasets and the validation dataset is shown in Table 2. In this section, we perform the following tests.

• Use the full dataset to train the intrusion detection model through the LSTM framework
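The metrics used for the comparison (accuracy, precision, recall, and $F_{1}$) all follow from confusion-matrix counts. The sketch below uses their standard definitions for a binary task; treating label 1 as the intrusion class, the function name `binary_metrics`, and the example labels are assumptions for illustration and are not results from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for labels in {0, 1}.

    Label 1 is treated as the positive (intrusion) class, an assumption.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels: 3 true positives, 3 true negatives,
# 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters in intrusion detection because normal samples usually outnumber attacks, so accuracy alone can hide missed intrusions.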

Conclusion

In this paper, we have proposed an effective FL-LSTM based IID method that achieves excellent detection accuracy while protecting users’ privacy. Simulation results showed that the proposed FL-LSTM method works well because the LSTM framework can provide richer semantic information in feature vectors combined with context. Centralized learning achieves the best performance and serves as the upper limit of federated learning performance, but according to the simulation results, it can be seen that the

CRediT authorship contribution statement

Ruijie Zhao: Software, Methodology, Writing - original draft, Writing - review & editing. Yue Yin: Visualization, Validation. Yong Shi: Investigation, Writing - review & editing. Zhi Xue: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ruijie Zhao is currently pursuing the master’s degree with the School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, China. His research interests include deep learning, wireless network security, and intrusion detection systems.

References (37)

Nguyen M. et al., Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning, Comput. Secur. (2018)
Wu Y. et al., Remaining useful life estimation of engineered systems using vanilla LSTM neural networks, Neurocomputing (2018)
Kim H. et al., Empirical evaluation of SVM-based masquerade detection using UNIX commands, Comput. Secur. (2005)
Kato N. et al., Ten challenges in advancing machine learning technologies towards 6G, IEEE Wireless Commun. Mag. (2020)
Tang F. et al., Future intelligent and secure vehicular network towards 6G: Machine-learning approaches, Proc. IEEE (2020)
Gui G. et al., 6G: Opening new horizons for integration of comfort, security and intelligence, IEEE Wireless Commun. Mag. (2020)
Sharma A. et al., Detecting masquerades using a combination of Naive Bayes and weighted RBF approach, J. Comput. Virol. (2007)
Li J. et al., RTVD: A real-time volumetric detection scheme for DDoS in the internet of things, IEEE Access (2020)
Li Q. et al., Understanding the usage of industrial control system devices on the internet, IEEE Internet Things J. (2018)
C. Sun, A. Shrivastava, S. Singh, Revisiting unreasonable effectiveness of data in deep learning era, in: ICCV, Venice,...
Sun J. et al., Behavioral modeling and linearization of wideband RF power amplifiers using BiLSTM networks for 5G wireless systems, IEEE Trans. Veh. Technol. (2019)
Li Z. et al., Context embedding based on Bi-LSTM in semi-supervised biomedical word sense disambiguation, IEEE Access (2019)
Kato N. et al., The deep learning vision for heterogeneous network traffic control: Proposal, challenges, and future perspective, IEEE Wirel. Commun. Mag. (2016)
Wang Y. et al., Transfer learning for semi-supervised automatic modulation classification in ZF-MIMO systems, IEEE J. Emerg. Sel. Top. Circuits Syst. (2020)
Gao S. et al., Deep learning based channel estimation for massive MIMO with mixed-resolution ADCs, IEEE Commun. Lett. (2019)
Gui G. et al., Flight delay prediction based on aviation big data and machine learning, IEEE Trans. Veh. Technol. (2020)
Wang Y. et al., LightAMC: Lightweight automatic modulation classification using deep learning and compressive sensing, IEEE Trans. Veh. Technol. (2020)
Liang H. et al., A novel adaptive resource allocation model based on SMDP and reinforcement learning algorithm in vehicular cloud system, IEEE Trans. Veh. Technol. (2019)

Cited by (72)

The cybersecurity mesh: A comprehensive survey of involved artificial intelligence methods, cryptographic protocols and challenges for future research
2024, Neurocomputing
In today’s world, it is vital to have strong cybersecurity measures in place. To combat the ever-evolving threats, adopting advanced models like cybersecurity mesh is necessary to enhance our protection. Cybersecurity mesh is an architecture scalable, flexible, composable, robust and resilient and allows the interoperability and coordination between intelligent systems to provide security services. Designing a cybersecurity mesh faces three major challenges: scalability, distributed or federated systems, and technology integration. For the design, it is necessary to apply security tools that support scalability because millions and millions of data are stored, processed, and analysed. Federated systems are needed to improve interoperability in a decentralized cybersecurity mesh. However, it can be tough to integrate different security tools and communication protocols. Cryptographic algorithms and AI models like federated learning, swarming intelligence and blockchain technologies are useful for security services. It is essential to study the integration of existing methods to determine the best technology for the job. We conduct a comprehensive analysis of intelligent systems, including federated learning, blockchain technology, and swarming intelligence, with a particular focus on how they have been and can be used to enhance cybersecurity. We examine the latest trends in these technologies, explore their connections, and weigh the pros and cons of each approach. To conduct this review, we utilized the Web of Science and Scopus databases and followed the PRISMA guidelines.
F-NIDS — A Network Intrusion Detection System based on federated learning
2023, Computer Networks
The rise of IoT networks has presented fresh challenges in terms of scalability and security for distributed Network Intrusion Detection Systems (NIDS) due to privacy concerns. While some progress has been made in addressing these challenges, there are still unanswered questions regarding how to achieve a balance between performance and robustness to ensure privacy in a distributed manner. Additionally, there is a need to develop a reliable and scalable architecture for distributed NIDS that can be effectively deployed in various IoT scenarios. These questions about robustness relied mainly on choosing privacy-secured and distributed Machine Learning techniques. In this work, we propose the F-NIDS, an intrusion detector that utilizes federated artificial intelligence and asynchronous communication techniques between system entities to provide horizontal scalability, along with differential privacy techniques to address data confidentiality concerns. The architecture of F-NIDS is designed to be adaptable for usage in IoT networks, suited to be used in cloud or fog-based environments. Results from our experiments have shown that the confidential detection model employed in F-NIDS – considering multi-class accuracy, binary accuracy, precision, and recall metrics – was capable of predicting and determining the nature of attacks when they occur. In order to determine optimal parameters that strike a balance between data privacy and classification performance, three strategies were employed, each evaluated for its corresponding robustness performance. Firstly, models were trained with varying Gaussian noise values, and subjected to membership inference black box rule-based attacks. Secondly, regular membership inference black box attacks were performed, utilizing different stolen samples with varying sizes to determine the maximum amount of data that could be securely stored on the detection agents for training tasks. Lastly, the robustness of the trained models was evaluated against a model inversion attack, and the results were compared through graphical comparisons. Based on these evaluations, Gaussian noise level and sample size values of 21 were obtained for each detection agent in the system, with sample sizes ranging from 10K to 25K.
Artificial intelligence for cybersecurity: Literature review and future research directions
2023, Information Fusion
Artificial intelligence (AI) is a powerful technology that helps cybersecurity teams automate repetitive tasks, accelerate threat detection and response, and improve the accuracy of their actions to strengthen the security posture against various security issues and cyberattacks. This article presents a systematic literature review and a detailed analysis of AI use cases for cybersecurity provisioning. The review resulted in 2395 studies, of which 236 were identified as primary. This article classifies the identified AI use cases based on a NIST cybersecurity framework using a thematic analysis approach. This classification framework will provide readers with a comprehensive overview of the potential of AI to improve cybersecurity in different contexts. The review also identifies future research opportunities in emerging cybersecurity application areas, advanced AI methods, data representation, and the development of new infrastructures for the successful adoption of AI-based cybersecurity in today's era of digital transformation and polycrisis.
GöwFed: A novel federated network intrusion detection system
2023, Journal of Network and Computer Applications
Network intrusion detection systems are evolving into intelligent systems that perform data analysis while searching for anomalies in their environment. Indeed, the development of deep learning techniques paved the way to build more complex and effective threat detection models. However, training those models may be computationally infeasible in most Edge or IoT devices. Current approaches rely on powerful centralized servers that receive data from all their parties — violating basic privacy constraints and substantially affecting response times and operational costs due to the huge communication overheads. To mitigate these issues, Federated Learning emerged as a promising approach, where different agents collaboratively train a shared model, without exposing training data to others or requiring a compute-intensive centralized infrastructure. This work presents GöwFed, a novel network threat detection system that combines the usage of Gower Dissimilarity matrices and Federated averaging. Different approaches of GöwFed have been developed based on state-of the-art knowledge: (1) a vanilla version — achieving a median point of [0.888, 0.960] in the PR space and a median accuracy of 0.930; and (2) a version instrumented with an attention mechanism — achieving comparable results when 0.8 of the best performing nodes contribute to the model. Furthermore, each variant has been tested using simulation oriented tools provided by TensorFlow Federated framework. In the same way, a centralized analogous development of the Federated systems is carried out to explore their differences in terms of scalability and performance — the median point of the experiments is [0.987, 0.987]) and the median accuracy is 0.989. Overall, GöwFed intends to be the first stepping stone towards the combined usage of Federated Learning and Gower Dissimilarity matrices to detect network threats in industrial-level networks.
Review on application progress of federated learning model and security hazard protection
2023, Digital Communications and Networks
Citation Excerpt :
In terms of federated learning combined with neural network to establish a model for intrusion detection. R. Zhao and his team established a model based on a Long and Short-Term Memory artificial neural network (LSTM) combined with a CNN and a federated learning framework [17], using the models for applications in intrusion detection while further comparing the models built by CNN combined with a federated learning framework for LSTM. With further experiments using the same dataset, the FL-LSTM model achieves an ultra-high detection accuracy of 99.21% after several rounds of training.
Federated learning is a new type of distributed learning framework that allows multiple participants to share training results without revealing their data privacy. As data privacy becomes more important, it becomes difficult to collect data from multiple data owners to make machine learning predictions due to the lack of data security. Data is forced to be stored independently between companies, creating “data silos”. With the goal of safeguarding data privacy and security, the federated learning framework greatly expands the amount of training data, effectively improving the shortcomings of traditional machine learning and deep learning, and bringing AI algorithms closer to our reality. In the context of the current international data security issues, federated learning is developing rapidly and has gradually moved from the theoretical to the applied level. The paper first introduces the federated learning framework, analyzes its advantages, reviews the results of federated learning applications in industries such as communication and healthcare, then analyzes the pitfalls of federated learning and discusses the security issues that should be considered in applications, and finally looks into the future of federated learning and the application layer.
FEDDBN-IDS: federated deep belief network-based wireless network intrusion detection system
2024, Eurasip Journal on Information Security


Yue Yin received the B.S. degree in communication engineering from Nanjing University of Posts and Telecommunications, Nanjing, China, in 2018. She has been pursuing her Ph.D. degree in communication engineering at Nanjing University of Posts and Telecommunications, Nanjing, China, since 2018. Her research interests include deep learning, non-orthogonal multiple access (NOMA) and advanced wireless techniques.

Yong Shi is currently a Lecturer with the School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, China. His research interests include cyber threat intelligence and intrusion detection systems.

Zhi Xue is currently a Professor with the School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, China. His research interests include wireless network security, cloud security, cryptography, and cyber threat intelligence.

^☆: This work was supported by the Foundation Item: Cyber Security from the National Key Research and Development Program of Shanghai Jiao Tong University under Grant 2017YFB0803203.
