Introduction
Demand for data processing has grown in many fields, including geography, engineering, business, finance, and healthcare, and cloud computing has become a widely accepted way to meet it. High-performance computing services are delivered over the Internet, and substantial scientific applications run on them. Cloud computing offers three service models: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). An IaaS cloud gives customers access to large-scale computing hardware and software resources in the form of services. In a SaaS cloud, users can only run an existing application over the Internet, whereas in a PaaS cloud customers use the provided platform to build their own applications [1].
Cloud deployments follow private, public, community, and hybrid models. The community cloud model suits businesses with similar needs. When the management system is faulty, the performance of submitted applications and workflows degrades. In cloud computing settings, a workflow is a popular way to represent high-volume data-processing systems [2]. A workflow is depicted as a directed acyclic graph (DAG): graph nodes represent the computing jobs, while graph edges represent the dependencies among them. The scientific requirements of the application determine the size of the DAG. If the scientific application is simple, the workflow is small; otherwise, the workflow is large [3].
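As a minimal sketch of the DAG representation described above, the following Python snippet models a small workflow and derives one valid execution order. The job names and dependencies are hypothetical and chosen only for illustration; they do not come from any cited workflow.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each key is a job (graph node) and each value is
# the set of jobs it depends on (incoming graph edges).
workflow = {
    "ingest": set(),
    "clean": {"ingest"},
    "features": {"clean"},
    "stats": {"clean"},
    "report": {"features", "stats"},
}


def execution_order(dag):
    """Return one order in which every job runs after its dependencies."""
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    return list(TopologicalSorter(dag).static_order())


order = execution_order(workflow)
print(order)
```

A larger scientific application would simply add more nodes and edges to the same structure; the scheduling logic does not change.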
Virtual machines (VMs) are essential components of cloud computing. They let cloud service providers make the most of their physical resources, and clients can save money on computing resources by using them [4]. Like traditional web servers, VMs are vulnerable to various security risks. Some attacks, such as brute-force SSH attacks, are reasonably straightforward to stop because failed attempts leave evidence in the authorization logs [5]. Others leave minimal traces in the system logs and are thus more challenging to identify. A covert example is the co-resident attack (also known as a co-residence, co-residency, or co-location attack). Virtualization techniques logically isolate VMs that run on the same server (i.e., co-resident VMs), so applications running on different VMs should not interfere with one another. In real cloud systems, however, this isolation is imperfect. For example, the time a cache read operation takes depends heavily on the data already stored in the cache. Malicious users can therefore construct side channels between their VM and a target VM on the same server and use them to collect sensitive information from the victim. This is what is termed a co-resident attack.
Classic defense systems have several flaws [6]. First, the attacker has time on their side: they can devote extensive effort to modeling the cloud system and the protections in place, and then plan their attacks meticulously. Second, the practical implementation of these safeguards is often far from optimal, which gives attackers more opportunities to exploit the system. According to one article, more than 9 out of 10 of the vulnerabilities exploited up to 2020 will have been known to security and IT professionals for at least a year [7]. This is partly due to the time and difficulty of patching vulnerabilities regularly; customers may also be reluctant to modify the current system configuration because they fear a decrease in Quality of Service (QoS). Third, zero-day attacks become possible because of the information the attacker amasses during the attack cycle. Internet users are becoming increasingly concerned about their online safety, and Intrusion Detection Systems (IDSs) face new threats such as Multi-Stage Attacks (MSAs). Overcoming these difficulties requires a more innovative and intelligent detection strategy, together with new sources of information [8].
MSAs differ from traditional one-off network attacks in that they are launched in phases over time to preserve long-term access to the target system. The individual steps in each stage of an MSA may not all be malicious, but each plays a crucial role in the success of the attack, and the attacker can complete the MSA only if the stages are carried out in sequence. Because of the long time intervals between attack phases, most existing IDSs have difficulty detecting MSAs [9]. Two types of IDS are in use today: misuse detection and anomaly detection. Misuse detection relies on known attacks and detects them with a high success rate, but it cannot detect new variants of established attacks. Anomaly detection avoids this constraint by recognizing deviations of current behavior from normal behavior. Anomaly detection based on machine learning, which can cope with massive data volumes, is becoming increasingly popular in intrusion detection. Multi-stage attacks nevertheless remain hard to detect because of the two issues listed below.
1) In all existing model-updating approaches, the retraining window over the dataset must be set manually, so its duration is fixed, whereas the length of each stage in a multi-stage attack varies. A mismatch between the attack duration and the retraining window significantly degrades detection performance. One of the most challenging tasks is therefore to distinguish between the various stages of an attack.
2) The scanning stage, prospective stage, data theft stage, and data transfer stage are all examples of stages in a multi-stage attack. In existing research, however, these stages are recognized individually, which makes it impossible for an intrusion prevention system (IPS) to identify and respond conveniently to the various stages of an attack. How to connect the various stages is therefore an additional issue [10].
Meanwhile, existing machine-learning-based anomaly detection research reports false alarm and false negative rates of more than 10%, which is inadequate for multi-stage attacks. To overcome these issues, we propose a neural-network-based approach to detecting multi-stage attacks. Our most important contributions are as follows.
1) A long short-term memory (LSTM) network is built with two layers: a time-series feature layer and a stage feature layer.
2) The stage feature layer stores and processes historical data to detect the distinct stages, of varying duration, in a multi-stage attack. The time-series feature layer then determines whether the current data falls within an attack period.
3) A multi-stage cyber attack dataset is used in the comparison experiments. Across a variety of datasets, our method achieves an accuracy of at least 91% and a false negative rate of no more than 6.75%. Compared with existing systems, the false positive and false negative rates are reduced by at least 65.83% and 65.26%, respectively.
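For readers who want to see how such figures are typically derived, the sketch below computes accuracy, false positive rate, and false negative rate from a confusion matrix, and the relative reduction of a rate against a baseline. It assumes the standard definitions of these metrics; the counts and rates used are hypothetical and are not the results reported in this paper.

```python
def detection_rates(tp, fp, tn, fn):
    """Standard detection metrics from confusion-matrix counts.

    tp: attacks flagged as attacks      fn: attacks missed
    tn: benign traffic passed as benign fp: benign traffic flagged as attack
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


def relative_reduction(baseline, improved):
    """Fraction by which a rate drops relative to a baseline rate."""
    return (baseline - improved) / baseline


# Hypothetical counts, for illustration only.
rates = detection_rates(tp=90, fp=5, tn=95, fn=10)
print(rates)

# Hypothetical rates: a drop from 20% to 5% is a 75% relative reduction.
print(relative_reduction(0.20, 0.05))
```

Note that a relative reduction of, say, 65% in the false negative rate is a statement about the ratio of two rates, not about an absolute difference in percentage points.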
The rest of the paper is organized as follows. The introduction is followed by related work. The current state of the art and the research methods are explained next, followed by an overview of the model and several simulation examples. Results and discussion come at the end of the paper, followed by conclusions and recommendations for further research.
Related work
This section reviews existing cloud security research. Research [5] argued that work toward automated detection and identification of multi-step cyber attack scenarios would benefit significantly from a methodology and language for modeling such scenarios. The concept of attack patterns was introduced to facilitate the reuse of generic modules in the attack modeling process. CAML was used in a prototype implementation of a scenario recognition engine that consumed first-level security alerts in real time and produced reports identifying the multi-step attack scenarios found in the alert stream.
Research [6] described advanced capabilities for mission-centric cyber situational awareness, based on defense in depth, provided by the Cauldron tool. Cauldron automatically mapped all paths of vulnerability through networks by correlating, aggregating, normalizing, and fusing data from various sources. It provided a sophisticated visualization of attack paths and automatically generated mitigation recommendations. Flexible modeling supported multi-step analysis of firewall rules and host-to-host vulnerability, with attack vectors both inside the network and from outside. The authors also described alert correlation based on Cauldron attack graphs and the analysis of mission impact from attacks. Research [7] used a Hidden Markov Model (HMM) to analyze and predict attacker behavior based on what was learned from observed alerts and intrusions. Data mining was used to process alerts and generate input for the HMM, from which the expected probability distribution was derived. The system could stream real-time Snort alerts and predict intrusions based on the learned rules. It could automatically discover patterns in multi-stage attacks and classify attackers by their behavior, which made it possible to predict attacker behavior effectively and assess the risk level of different groups of attackers.
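The idea of tracking attacker behavior with an HMM can be sketched with the standard forward (filtering) recursion. The snippet below is only an illustration under stated assumptions: the attack stages, alert types, and all probabilities are hypothetical values invented for this example, and it is not the system described in [7].

```python
# Hypothetical hidden attack stages and observable alert types.
STATES = ["scan", "exploit", "exfiltrate"]
ALERTS = ["port_probe", "shell_spawn", "bulk_upload"]

# Assumed model parameters (each row sums to 1).
START = [0.80, 0.15, 0.05]        # P(first stage)
TRANS = [                         # P(next stage | current stage)
    [0.60, 0.35, 0.05],
    [0.10, 0.60, 0.30],
    [0.05, 0.15, 0.80],
]
EMIT = [                          # P(alert | stage)
    [0.80, 0.15, 0.05],
    [0.20, 0.70, 0.10],
    [0.05, 0.15, 0.80],
]


def step(belief):
    """Propagate a stage distribution one time step through TRANS."""
    n = len(STATES)
    return [sum(belief[i] * TRANS[i][j] for i in range(n)) for j in range(n)]


def filter_stages(alerts):
    """Normalized forward algorithm: P(current stage | alerts so far)."""
    belief = None
    for alert in alerts:
        o = ALERTS.index(alert)
        prior = START if belief is None else step(belief)
        weighted = [prior[j] * EMIT[j][o] for j in range(len(STATES))]
        total = sum(weighted)
        belief = [w / total for w in weighted]
    return belief


belief = filter_stages(["port_probe", "port_probe", "shell_spawn"])
current = STATES[belief.index(max(belief))]
next_stage_probs = step(belief)   # one-step-ahead stage prediction
print(current, next_stage_probs)
```

In practice the transition and emission probabilities would be estimated from labeled or mined alert data rather than set by hand.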
Research [8] extended an existing multi-step signature language to support attack detection on normalized logs collected from different applications and devices. The extended language also supported the integration of external threat intelligence and allowed signatures to reference current threat indicators. With this approach, the authors could create generic signatures that stay up to date. Using the language, they detected various login brute-force attempts on multiple applications with a single generic signature. Research [9] described an approach to minimizing cybersecurity risks called Cyber Security Game (CSG), which can be seen as a model-based framework for security design. CSG was a method, with supporting software, that quantitatively identified mission-outcome-centered cybersecurity risks. It used a game-theoretic solution, with a game formulation that identified defense strategies minimizing the maximum cyber risk (MiniMax), drawing on the defense methods defined in the defender model. The paper described the approach and the models that CSG uses.
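The MiniMax idea mentioned above, choosing the defense whose worst-case risk is smallest, can be illustrated in a few lines. This is a generic sketch of the minimax decision rule, not the CSG software; the defense names, attack names, and risk scores are hypothetical.

```python
# Hypothetical risk matrix: RISK[defense][attack] is the expected mission
# loss if the defender picks `defense` and the attacker picks `attack`.
RISK = {
    "patch_servers": {"phishing": 7, "sql_injection": 2, "ddos": 6},
    "train_staff": {"phishing": 2, "sql_injection": 8, "ddos": 6},
    "add_waf_and_filtering": {"phishing": 5, "sql_injection": 3, "ddos": 4},
}


def minimax_defense(risk):
    """Pick the defense that minimizes the attacker's best (maximum) payoff."""
    # Worst case for each defense: the attacker chooses the most damaging attack.
    worst_case = {d: max(losses.values()) for d, losses in risk.items()}
    best = min(worst_case, key=worst_case.get)
    return best, worst_case[best]


choice = minimax_defense(RISK)
print(choice)
```

With these illustrative numbers, no single defense is best against every attack, which is exactly the situation in which a worst-case criterion is useful.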
Research [10] focused on using IDS alerts that correspond to abnormal traffic to correlate the attacks detected by the IDS, reconstruct multi-step attack scenarios, and discover attack chains. Because the information provided by an IDS contains many false positives, accurately reconstructing the attack scenario and extracting the most critical attack chain is challenging. The authors therefore proposed a method to reconstruct multi-step attack scenarios in the network based on the fusion of multiple kinds of information: attack time, risk assessment, and attack node information. The experimental results showed that the proposed method could reconstruct multi-step attack scenarios and trace them back to the original host, which can help administrators deploy security measures more effectively to ensure the overall security of the network. Research [11] presented Kitsune, a plug-and-play NIDS that learned to detect attacks on the local network without supervision and in an efficient online manner. Kitsune's core algorithm (KitNET) used an ensemble of neural networks called autoencoders to collectively differentiate between normal and abnormal traffic patterns. KitNET was supported by a feature extraction framework that efficiently tracks the patterns of every network channel. The evaluations showed that Kitsune detected various attacks with a performance comparable to offline anomaly detectors, even on a Raspberry Pi, which indicates that Kitsune can be a practical and economical NIDS.
Research [12] proposed a comprehensive framework for examining complex, novel threats and the main countermeasures against them. Zero-day attacks, which have not been publicly disclosed, and multi-step attacks, which are built from several discrete steps, some harmful and others benign, illustrate the problem well [13]. Artificial intelligence (AI) techniques were developed to track these attacks. The statistical approaches included rule-based and anomaly-detection-based schemes, and incorporating behavioral anomaly detection and event sequence tracking into AI was a natural progression. Both intrusion detection, which is frequently performed online, and security investigation, which is conducted offline, make use of artificial intelligence.
Research [14] presented a novel IDS that exploits contextual information in the form of Pattern-of-Life (PoL) and information related to expert judgment on the network's behavior. The IDS focused on detecting an MSA in real time, without a prior training process. The main goal of the MSA considered was to create a Point of Entry (PoE) to a target machine, which could be used as part of an APT attack. The results verified that the use of contextual information improves the efficiency of the IDS, raising the detection rate of MSAs by 58%. Research [15] presented an approach that collected and correlated cross-domain cyber threat information to detect multi-stage cyber attacks in energy information systems. Providing a sound basis for assessing and understanding the situation of smart grids under coordinated cyber attacks requires a systematic and coherent approach to identifying cyber incidents. The authors investigated the applicability and performance of the presented correlation approach and discussed the results to highlight challenges in domain-specific detection mechanisms.
Research [16] proposed an approach to attack mining and detection that performs alert correlation, false-positive elimination, attack mining, and attack prediction. To speed up the search of the filtered alert sequence data when mining attack patterns, the PrefixSpan algorithm was updated in its storage strategy. The updated PrefixSpan increased processing efficiency and achieved better results than the original in experiments. Using Bayesian theory, the transition probability for each sequence pattern string was calculated, and an alert transition probability table was constructed to draw the attack graph. Finally, a long short-term memory network and word-vector methods were used to perform online prediction. The results of numerical experiments showed that the proposed method has strong practical value for attack detection and prediction.
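To make the notion of an alert transition probability table concrete, the sketch below estimates one from observed alert sequences by simple frequency counting. This is a simplified stand-in under stated assumptions, not the Bayesian calculation or the PrefixSpan mining used in [16]; the alert names and sequences are hypothetical.

```python
from collections import Counter, defaultdict


def transition_table(sequences):
    """Estimate P(next alert | current alert) from alert sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count every adjacent pair (current alert, next alert).
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    table = {}
    for cur, nxts in counts.items():
        total = sum(nxts.values())
        table[cur] = {n: c / total for n, c in nxts.items()}
    return table


def most_likely_next(table, alert):
    """Return the most probable follow-up alert, or None if unseen."""
    nxts = table.get(alert)
    if not nxts:
        return None
    return max(nxts, key=nxts.get)


# Hypothetical alert sequences, one per observed attack session.
sessions = [
    ["scan", "login_fail", "login_ok", "data_read"],
    ["scan", "login_fail", "login_fail", "login_ok"],
    ["scan", "vuln_probe", "login_ok", "data_read"],
]
table = transition_table(sessions)
print(table["scan"])
print(most_likely_next(table, "scan"))
```

Each nonzero entry of the table corresponds to a directed edge in an attack graph, weighted by its estimated transition probability.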
Research [17] proposed MAAC, a multi-step attack alert correlation system that reduced repeated alerts and combined multi-step attack paths based on alert semantics and attack stages. Advanced cyber attacks involve multiple stages to achieve their ultimate goal. Conventional intrusion detection systems, such as endpoint security management tools, firewalls, and other monitoring devices, generate numerous alerts during an attack. These alerts include attack clues as well as many false positives unrelated to attacks. Evaluation on real-world datasets showed that MAAC could reduce the number of alerts by 90% and find attack paths among a large number of alerts.
Research [18] studied event-triggered multi-step model predictive control for discrete-time nonlinear systems over communication networks affected by packet dropouts and cyber attacks. First, an event-triggered mechanism that decides whether the sampled signal should be sent over the unreliable network was designed to save communication resources. Second, two Bernoulli processes were introduced to represent the randomly occurring packet dropouts in the unreliable network and the randomly occurring deception attacks on the actuator side from adversaries. In addition, results on recursive feasibility and closed-loop stability of the networked system were obtained, which explicitly consider the external disturbance and the input constraint. Finally, simulation experiments on a mass-spring-damper system were carried out to illustrate the rationality and effectiveness of the control strategy. Table 1 summarizes the existing work reviewed above.
Table 1
Summary of existing work
No. | | Technique | Accuracy |
1 | | Contextual information | 58% |
2 | | Cyber Security Game (CSG) | 70% |
3 | | Multi-step attack alert correlation system | 90% |
4 | | Systematic and coherent approach | 97% |
Table 1 lists the accuracy of different techniques for predicting cyber attacks. The highest accuracy in the table is 97%. This motivates us to propose a new machine learning model that predicts multi-stage cyber attacks in cloud environments more accurately, making cloud applications more secure in real time.
Conclusion
This study develops a neural network for predicting multi-stage cyber attacks. It puts complex attacks into perspective by showing how they can be detected and investigated, two essential functions in the security domain. We outline a complete framework for studying complex attacks, their related analytical methodologies, and their primary uses in security. This framework makes it easier to categorize new, complex threats and the countermeasures that go with them, such as artificial intelligence. On the given dataset, our model for multi-stage cyber attack prediction outperforms the other models discussed in terms of accuracy.