Control variable classification, modeling and anomaly detection in Modbus/TCP SCADA systems
Introduction
Supervisory control and data acquisition (SCADA) systems are widely used for monitoring and controlling physical processes in electric utilities, water supply systems, nuclear reactors and other critical infrastructure assets. A SCADA system typically consists of a human–machine interface (HMI) and multiple programmable logic controllers (PLCs). The programmable logic controllers, which are connected to sensors and actuators, have internal memory and run the logic needed to control the attached equipment. A programmable logic controller stores the information it receives from the equipment and sends the data to the HMI, which presents it to system operators. Additionally, a programmable logic controller receives control data from the HMI and sends modification commands to the appropriate actuators.
Modbus is one of the most common SCADA system protocols. Originally designed for serial-line communications, the Modbus protocol was created under the assumption that all SCADA functions are secure and operating as intended. As a result, many Modbus systems have few or no defense mechanisms against deliberate attacks. The protocol incorporates neither authentication nor authorization, and it does not verify data integrity. Moreover, all data are transferred as plaintext, without any encryption.
Over the years, economic trends have driven Modbus technology from small serial-line communications systems to large-scale networks based on TCP/IP. The resulting Modbus protocol, which is encapsulated by TCP/IP, is known as Modbus/TCP. The TCP/IP connectivity exposes formerly closed Modbus systems to remote network attacks, possibly even attacks over the Internet.
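To make the plaintext exposure concrete, the following minimal sketch decodes a Modbus/TCP "Read Holding Registers" response (function code 0x03) using only the standard frame layout: a 7-byte MBAP header followed by the function code, a byte count and 16-bit big-endian register values. The function name `parse_read_holding_response` and the sample frame are illustrative assumptions and do not come from the paper.

```python
import struct


def parse_read_holding_response(frame: bytes) -> dict:
    """Decode a Modbus/TCP Read Holding Registers (0x03) response frame."""
    if len(frame) < 9:
        raise ValueError("frame too short for MBAP header and PDU")
    # MBAP header (big-endian): transaction id, protocol id, length, unit id.
    transaction_id, protocol_id, length, unit_id = struct.unpack(">HHHB", frame[:7])
    if protocol_id != 0:
        raise ValueError("protocol id must be 0 for Modbus")
    function_code = frame[7]
    if function_code != 0x03:
        raise ValueError("only function code 0x03 is handled in this sketch")
    byte_count = frame[8]
    payload = frame[9:9 + byte_count]
    if len(payload) != byte_count or byte_count % 2 != 0:
        raise ValueError("truncated or malformed register payload")
    # Each register is an unsigned 16-bit big-endian integer.
    registers = list(struct.unpack(f">{byte_count // 2}H", payload))
    return {
        "transaction_id": transaction_id,
        "unit_id": unit_id,
        "function_code": function_code,
        "registers": registers,
    }


# Hypothetical captured frame carrying two register values (10 and 258).
sample = bytes.fromhex("000100000007010304000a0102")
decoded = parse_read_holding_response(sample)
print(decoded["registers"])
```

No key or credential is needed at any step: anyone who can observe or inject such frames on the network can read or forge the register values.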
A key aspect of SCADA system security is the validity of the data values sent from programmable logic controllers to the HMI. An attacker who is able to manipulate the data values can mount attacks with severe consequences. For example, as Stuxnet did [6], [19], an attacker could hide the fact that the equipment is operating abnormally by sending fake sensor readings to the HMI. Alternatively, by sending abnormal values, an attacker may trick an automatic control system (or a human operator) into taking dangerous “corrective” actions. Therefore, tracking and modeling data values to ensure that they contain no abnormalities is a reliability and safety concern as well as an important security goal.
This paper describes a novel domain-aware anomaly detection system that detects irregular changes to SCADA control register values. Most prior research efforts in SCADA anomaly detection have focused on the command patterns and the referenced memory locations. They have not attempted to model the actual data values, which require a deep semantic understanding of the industrial control process.
The approach presented in this paper falls in between pure machine-learning-based anomaly detection and anomaly detection based on a complete model of an industrial process (see, e.g., [23]). In the proposed approach, the learning system is provided with minimal semantic knowledge such as the register classes and periodicities. This knowledge enables the system to cross a portion of Sommer and Paxson's semantic gap [29] by going beyond detecting general abnormal activity and specifically indicating the register type and memory range in which the anomaly occurred. Despite its minimal semantic awareness, the system is able to classify, model and flag anomalous behavior for several classes of registers.
A key contribution of this research is that interesting classes of registers with distinctive behavior patterns can be discovered in Modbus traffic, even without understanding the complete semantics of the underlying process. Inspection of Modbus traffic enabled the identification of three key register classes: (i) sensor registers; (ii) counter registers; and (iii) constant registers. An automated classifier was then designed to identify these classes by leveraging the dispersion index from the field of natural sciences. Parameterized behavior models were subsequently developed for each register class. During its learning phase, the system instantiated a model for each register; deviations from the models were detected during the enforcement phase. Experiments were conducted using 131 h of traffic from a production SCADA system with seven programmable logic controllers and 449 individual registers. The results are promising. The classifier exhibited a true positive classification rate of 93%. For the correctly-classified registers, the enforcement phase achieved a low false alarm rate of 0.86%.
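For readers unfamiliar with the dispersion index, it is commonly defined as the variance-to-mean ratio of a sample. The sketch below computes that ratio for a series of register values. It illustrates the statistic only and is not the paper's classifier; the function name `dispersion_index`, the use of the population variance and the sample series are all assumptions made for this example.

```python
from statistics import mean, pvariance


def dispersion_index(values):
    """Return the variance-to-mean ratio of a non-empty series of values."""
    mu = mean(values)
    var = pvariance(values)  # population variance; an assumption of this sketch
    if mu == 0:
        # The ratio is undefined for a zero mean; report 0 for a flat series.
        return 0.0 if var == 0 else float("inf")
    return var / mu


# Hypothetical register traces, invented for illustration.
flat_trace = [5, 5, 5, 5]
varying_trace = [2, 4, 4, 4, 5, 5, 7, 9]

flat_index = dispersion_index(flat_trace)        # no variation at all
varying_index = dispersion_index(varying_trace)  # variance 4, mean 5
print(flat_index, varying_index)
```

A register that never changes yields an index of zero, while a register whose values spread around their mean yields a positive index. How the paper turns this statistic into class decisions is not shown in this excerpt.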
Section snippets
Related work
A survey of techniques related to learning and detection of anomalies in critical control systems can be found in [1]. Byres et al. [5] have specified several attack trees for SCADA systems based on the Modbus/TCP protocol. They found that compromising a slave (programmable logic controller) or the master (HMI) can produce the most severe impact on a SCADA system. For instance, an attacker who gains access to the SCADA system could pretend to be the HMI and change the data values in
Preliminaries
This section briefly describes the Modbus protocol, the SCADA system used in the research and the manual classification of programmable logic controller registers.
Automated classification of registers
After identifying the three main classes of registers (sensor, counter and constant) by inspection, an automated process for classifying registers was developed. Sensor registers are the most “interesting” registers because they carry information about the physical state of the industrial process. Therefore, the first step was to automatically identify these registers. As seen in Fig. 1, sensor registers have long-term temporal behavior – in this case, a daily cycle; but, over short
Modeling normal behavior
As seen in Section 4, the classification algorithm is extremely effective at classifying sensor, counter and constant registers. The next step is to model the normal behavior of the data values of each register so that anomalies can be identified. Two phases are involved: (i) learning phase, during which a behavior model is created and instantiated for each register; and (ii) enforcement phase, during which checks are performed to verify that registers are acting according to the learned models.
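The two-phase structure can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's models: it covers only a hypothetical model for constant registers, which accepts exactly the values seen during learning, and the names `ConstantRegisterModel`, `learn_models` and `enforce` are invented for this example.

```python
class ConstantRegisterModel:
    """Hypothetical model: a constant register keeps the values seen in training."""

    def __init__(self):
        self.allowed = set()

    def learn(self, value):
        # Learning phase: record every value observed for this register.
        self.allowed.add(value)

    def is_anomalous(self, value):
        # Enforcement phase: any value never seen in training is flagged.
        return value not in self.allowed


def learn_models(training):
    """Build one model per register from a dict of register -> list of values."""
    models = {}
    for register, values in training.items():
        model = ConstantRegisterModel()
        for value in values:
            model.learn(value)
        models[register] = model
    return models


def enforce(models, observations):
    """Return the (register, value) pairs that deviate from the learned models."""
    return [
        (register, value)
        for register, value in observations
        if register in models and models[register].is_anomalous(value)
    ]


# Hypothetical register address and values, invented for illustration.
models = learn_models({40001: [7, 7, 7]})
alerts = enforce(models, [(40001, 7), (40001, 9)])
print(alerts)
```

Keeping one model instance per register means an alert names the specific register that deviated, which matches the goal of indicating where an anomaly occurred.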
System performance
Dataset 1 could not be used during the enforcement phase because it was only recorded over 20 h. However, Dataset 2 was recorded over a five-day period (Tuesday to Saturday). Because Fridays and Saturdays are weekends in Israel, the values seen during these days are significantly different from any other day of the week (see Fig. 17). In particular, observe the low peak on Friday and the lack of a peak on Saturday. Therefore, the experiments were limited to the first 72 h of Dataset 2. The first
Conclusions
The domain-aware anomaly detection system presented in this paper is designed to detect irregular deviations in SCADA control register values. Despite minimal knowledge of the controlled physical process, three semantically-meaningful classes of registers could be distinguished in the Modbus traffic of a production SCADA system. The anomaly detection system automatically assigns registers into one of the three classes, learns and models their behavior, and raises alerts when register values
Acknowledgments
This research was supported by the Ministry of Science and Technology, Israel.
References (34)
- et al., Accurate modeling of Modbus/TCP for intrusion detection in SCADA systems, International Journal of Critical Infrastructure Protection (2013)
- C. Alcaraz, L. Cazorla, G. Fernandez, Context-awareness using anomaly-based detectors for smart grid domains,...
- et al., A test of goodness of fit, Journal of the American Statistical Association (1954)
- T. Batu, L. Fortnow, R. Rubinfeld, W. Smith, P. White, Testing that distributions are close, Proceedings of the...
- L. Briesemeister, S. Cheung, U. Lindqvist and A. Valdes, Detection, correlation and visualization of attacks against...
- E. Byres, M. Franz and D. Miller, The use of attack trees in assessing vulnerabilities in SCADA systems, Proceedings of...
- Stuxnet, the real start of cyber warfare?, IEEE Network (2010)
- S. Cheung, B. Dutertre, M. Fong, U. Lindqvist, K. Skinner, A. Valdes, Using model-based intrusion detection for SCADA...
- et al., The Theory of Stochastic Processes (1977)
- J. Diaz, Using Snort for Intrusion Detection in Modbus TCP/IP Communications, InfoSec Reading Room, SANS Institute,...
- Exponential smoothing: The state of the art, Journal of Forecasting
- On indices of dispersion, The Annals of Mathematical Statistics