1 Introduction
2 Research challenges
3 The structure and functionality of the robust software modeling tool (RSMT)
3.1 Overview of RSMT
- during a training epoch, where RSMT uses the traces generated by test suites to learn a model of correct program execution, and
- during a subsequent validation epoch, where RSMT classifies traces extracted from a live application using the previously learned models to determine whether each trace is indicative of normal or abnormal behavior.
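The two epochs above can be sketched with a toy model. Here, traces are assumed to be sequences of method identifiers, and a set of observed n-grams stands in for RSMT's learned models (an illustrative simplification, not the tool's actual representation):

```python
# Sketch: train on known-good traces from test suites, then classify live traces.
# The n-gram set is a stand-in for RSMT's learned model (assumption).

def ngrams(trace, n=2):
    """All length-n windows of a trace, as a set."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}

def train(traces, n=2):
    """Training epoch: collect every n-gram seen in test-suite traces."""
    model = set()
    for trace in traces:
        model |= ngrams(trace, n)
    return model

def validate(model, trace, n=2):
    """Validation epoch: a trace is abnormal if it contains unseen n-grams."""
    return "normal" if ngrams(trace, n) <= model else "abnormal"

model = train([["login", "query", "render"], ["login", "logout"]])
print(validate(model, ["login", "query", "render"]))  # normal
print(validate(model, ["login", "drop_table"]))       # abnormal
```

A trace that only exercises call sequences seen during testing is accepted; any novel transition makes it abnormal, which mirrors the train-then-validate split described above.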
3.2 The RSMT agent
- The online model construction/validation subtree performs model construction and verification in the current thread of execution, i.e., on the critical path.
- The offline model construction/validation subtree converts events into a form that can be stored asynchronously with a (possibly remote) instance of Elasticsearch [23], an open-source search and analytics engine that provides a distributed real-time document store.
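The offline path amounts to serializing each event into a self-describing JSON document before handing it off for asynchronous indexing. A minimal sketch of such a document follows; the field names (`application`, `component`, `events`, `timestamp`) are illustrative assumptions, not RSMT's actual schema, and the actual transmission would go through an Elasticsearch client's index API:

```python
import json
from datetime import datetime, timezone

# Hypothetical shape of a per-trace document an agent might store.
# In the real system this JSON would be sent asynchronously to a
# (possibly remote) Elasticsearch instance.
def trace_event_doc(app_id, component, events):
    return {
        "application": app_id,
        "component": component,
        "events": events,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

doc = trace_event_doc("webstore", "checkout-jvm", ["begin", "charge", "commit"])
print(json.dumps(doc, indent=2))
```

Because the document is plain JSON, it can be buffered locally and flushed in batches, keeping serialization off the application's critical path.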
3.3 The RSMT agent server
- A trace API that RSMT agents use to transmit execution traces. This API allows an agent to (1) register a recently launched JVM as a component in a previously defined architecture and (2) push execution trace(s).
- An application management API for defining and maintaining applications by (1) defining/deleting/modifying an application, (2) retrieving the list of applications, and (3) transitioning components in an application from one state to another. The component state determines how traces received from monitoring agents are handled: in the IDLE state, traces are discarded; in the TRAIN state, traces are conveyed to a machine learning backend that applies them incrementally to build a model of expected behavior; and in the VALIDATE state, traces are compared against existing models and classified as normal or abnormal.
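The per-state trace handling described above is essentially a three-way dispatch. A minimal sketch, assuming a model that is simply a set of previously accepted traces (the state names come from the text; everything else is illustrative):

```python
from enum import Enum

class State(Enum):
    IDLE = "IDLE"
    TRAIN = "TRAIN"
    VALIDATE = "VALIDATE"

def handle_trace(state, trace, model):
    """Dispatch an incoming trace according to the component's state."""
    if state is State.IDLE:
        return "discarded"                 # IDLE: traces are dropped
    if state is State.TRAIN:
        model.add(tuple(trace))            # TRAIN: incrementally extend the model
        return "trained"
    # VALIDATE: compare against the existing model
    return "normal" if tuple(trace) in model else "abnormal"

model = set()
handle_trace(State.TRAIN, ["login", "query"], model)
print(handle_trace(State.VALIDATE, ["login", "query"], model))  # normal
print(handle_trace(State.IDLE, ["anything"], model))            # discarded
```

Transitioning a component between states via the management API then simply changes which branch subsequent traces take.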
- A classification API that monitors the health of applications. This API can query the status of application components over a sliding window of time; the window's width determines how far back in time traces are retrieved during the health check, and the retrieved traces roll up into a summary of all classified traces for an application's operation. This API can also retrieve a JSON representation of the current health of an application.
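The sliding-window rollup can be sketched as a filter-then-summarize step over timestamped classification results. The summary fields below are assumptions for illustration, not the API's actual JSON schema:

```python
from datetime import datetime, timedelta

def health_summary(classified, window, now):
    """Roll up traces classified within [now - window, now].

    `classified` is a list of (timestamp, label) pairs, where label is
    "normal" or "abnormal". Field names are illustrative only.
    """
    recent = [label for ts, label in classified if now - ts <= window]
    abnormal = recent.count("abnormal")
    return {
        "window_seconds": window.total_seconds(),
        "total": len(recent),
        "abnormal": abnormal,
        "healthy": abnormal == 0,
    }

now = datetime(2018, 1, 1, 12, 0, 0)
classified = [
    (now - timedelta(minutes=1), "normal"),
    (now - timedelta(minutes=2), "abnormal"),
    (now - timedelta(hours=3), "abnormal"),   # outside a 10-minute window
]
print(health_summary(classified, timedelta(minutes=10), now))
```

Widening the window trades responsiveness for stability: old classifications age out of a narrow window quickly, while a wide window smooths over transient anomalies.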
4 Unsupervised web attack detection with end-to-end deep learning
4.1 Traces collection with unit tests
4.2 Anomaly detection with deep learning
- Supervised learning approaches (such as Naive Bayes [25] and SVM [26]) work by calibrating a classifier with a training dataset whose samples are labeled as either normal traffic or attack traffic. The classifier then classifies incoming traffic as either normal data or an attack request. Two general problems arise when applying supervised approaches to detect web attacks: (1) classifiers cannot handle new types of attacks that are not included in the training dataset and (2) it is hard to obtain a large amount of labeled training data, as described in Challenge 3 (hard to obtain labeled training data) in Section 2.
- Unsupervised learning approaches (such as Principal Component Analysis (PCA) [27] and autoencoders [19]) do not require labeled training datasets. Instead, they rely on the assumption that data can be embedded into a lower-dimensional subspace in which normal instances and anomalies appear significantly different, which motivates applying dimension reduction techniques (such as PCA or autoencoders) to anomaly detection. Both try to learn a function h(X) ≈ X that reconstructs the input from its lower-dimensional embedding; inputs with high reconstruction error do not fit the learned subspace and are flagged as anomalies.
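The reconstruction-error idea can be demonstrated concretely with PCA via SVD on synthetic data. This is a generic sketch of the technique, not RSMT's implementation; the subspace dimension, noise level, and threshold below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "normal" data lying near a 1-D subspace of 3-D space.
normal = rng.normal(size=(200, 1)) @ np.array([[1.0, 2.0, 0.5]])
normal += 0.01 * rng.normal(size=normal.shape)

# Fit PCA on normal data: keep the top principal direction.
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
basis = vt[:1]                        # h(X) projects onto this subspace

def reconstruction_error(x):
    """Distance between x and its projection onto the learned subspace."""
    centered = x - mean
    recon = centered @ basis.T @ basis
    return float(np.linalg.norm(centered - recon))

threshold = 0.1                       # illustrative cutoff
on_subspace = np.array([2.0, 4.0, 1.0])    # 2 * [1, 2, 0.5]: should fit
off_subspace = np.array([2.0, -4.0, 5.0])  # far from the subspace
print(reconstruction_error(on_subspace))   # small: classified normal
print(reconstruction_error(off_subspace))  # large: classified anomalous
```

An autoencoder plays the same role with a nonlinear h: it is trained to reconstruct normal traffic only, so attack inputs reconstruct poorly and exceed the error threshold.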
4.3 End-to-end deep learning with stacked denoising autoencoders
5 Analysis of experimental results
5.1 Testbed
- Type 1: Tautology-based. Statements like OR '1' = '1' and OR '1' < '2' are appended to the query to make the preceding condition always true, e.g., SELECT * FROM user WHERE username = 'user1' OR '1' = '1'.
- Type 2: Comment-based. A comment character is used to ignore the succeeding statements, e.g., SELECT * FROM user WHERE username = 'user1' #' AND password = '123'.
- Type 3: Semicolon-based. A semicolon is used to append an additional statement, e.g., SELECT * FROM user WHERE username = 'user1'; DROP TABLE users; AND password = '123'.
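For intuition, the three injection styles above can be distinguished with toy string patterns. These regexes are purely illustrative of the attack shapes used in the testbed; RSMT itself detects attacks from execution traces, not by matching query strings:

```python
import re

# Toy per-type checks for the three injection styles (illustrative only).
PATTERNS = {
    "tautology": re.compile(r"OR\s+'[^']*'\s*(=|<|>)\s*'[^']*'", re.IGNORECASE),
    "comment":   re.compile(r"#|--"),
    "semicolon": re.compile(r";\s*\w+"),
}

def classify(query):
    """Return the names of every attack pattern found in the query."""
    return [name for name, pat in PATTERNS.items() if pat.search(query)]

print(classify("SELECT * FROM user WHERE username = 'user1' OR '1' = '1'"))
print(classify("SELECT * FROM user WHERE username = 'user1' #' AND password = '123'"))
print(classify("SELECT * FROM user WHERE username = 'user1'; DROP TABLE users;"))
```

Signature checks like these break down on obfuscated payloads, which is precisely the motivation for the trace-based, learning-driven detection the paper evaluates.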
5.2 Evaluation metrics
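The tables in this section report precision, recall, and F-score, which follow the standard definitions in terms of true positives (TP), false positives (FP), and false negatives (FN). A quick sanity check, using hypothetical counts chosen so the ratios match the first Naive Bayes row reported below:

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions: P = TP/(TP+FP), R = TP/(TP+FN), F1 = 2PR/(P+R)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts (80 attacks caught, 5 false alarms, 20 missed).
p, r, f = precision_recall_f1(tp=80, fp=5, fn=20)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.941 0.8 0.865
```

The F-score, as the harmonic mean of precision and recall, penalizes a detector that scores well on only one of the two.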
5.3 Overhead observations
5.4 Supervised attack detection with manually extracted features
5.4.1 Experiment benchmarks
- Naive Bayes, which makes classification decisions by calculating the probabilities/costs of each decision, and which is widely used in cyber-attack detection [43].
- Random forest, an ensemble learning method for classification that trains decision trees on sub-samples of the dataset and then improves classification accuracy via averaging. A key parameter for a random forest is the number of attributes to consider at each split point, which Weka selects automatically.
- Support vector machine (SVM), an efficient supervised learning model that draws an optimal hyperplane in the feature space, dividing the categories as widely as possible. RSMT uses Weka's Sequential Minimal Optimization algorithm to train the SVM.
- Aggregate_vote, which returns ATTACK if a majority of the classifiers detect an attack and NOT_ATTACK otherwise.
- Aggregate_any, which returns ATTACK if any classifier detects an attack and NOT_ATTACK otherwise.
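The two aggregation rules are simple functions of the individual classifiers' outputs. A minimal sketch (the label strings follow the text; the function names are the benchmark names):

```python
def aggregate_vote(predictions):
    """ATTACK iff a strict majority of classifiers predict ATTACK."""
    attacks = sum(p == "ATTACK" for p in predictions)
    return "ATTACK" if attacks > len(predictions) / 2 else "NOT_ATTACK"

def aggregate_any(predictions):
    """ATTACK iff at least one classifier predicts ATTACK."""
    return "ATTACK" if "ATTACK" in predictions else "NOT_ATTACK"

# e.g., predictions from Naive Bayes, random forest, and SVM:
votes = ["ATTACK", "NOT_ATTACK", "ATTACK"]
print(aggregate_vote(votes))                               # ATTACK
print(aggregate_any(["NOT_ATTACK", "NOT_ATTACK", "ATTACK"]))  # ATTACK
```

As the results below show, the two rules trade off differently: Aggregate_any maximizes recall at the cost of precision, while Aggregate_vote suppresses single-classifier false alarms.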
5.4.2 Experiment results
| | Precision | Recall | F-score |
|---|---|---|---|
| Naive Bayes | 0.941 | 0.800 | 0.865 |
| Random forest | 1.000 | 0.800 | 0.889 |
| SVM | 0.933 | 0.800 | 0.889 |
| AGGREGATE_VOTE | 1.000 | 0.800 | 0.889 |
| AGGREGATE_ANY | 0.941 | 0.800 | 0.865 |

| | Precision | Recall | F-score |
|---|---|---|---|
| Naive Bayes | 0.721 | 1.000 | 0.838 |
| Random forest | 0.721 | 1.000 | 0.838 |
| SVM | 0.728 | 1.000 | 0.843 |
| AGGREGATE_VOTE | 0.724 | 1.000 | 0.840 |
| AGGREGATE_ANY | 0.710 | 1.000 | 0.831 |
5.5 Unsupervised attack detection with deep learning
5.5.1 Experiment benchmarks
5.5.2 Experiment results
| | Precision | Recall | F-score |
|---|---|---|---|
| Naive | 0.722 | 0.985 | 0.831 |
| PCA | 0.827 | 0.926 | 0.874 |
| One-class SVM | 0.809 | 0.909 | 0.858 |
| Autoencoder | 0.898 | 0.942 | 0.914 |

| | Precision | Recall | F-score |
|---|---|---|---|
| Naive | 0.421 | 1.000 | 0.596 |
| PCA | 0.737 | 0.856 | 0.796 |
| One-class SVM | 0.669 | 0.740 | 0.702 |
| Autoencoder | 0.906 | 0.928 | 0.918 |

| | Training Time | Classification Time |
|---|---|---|
| Naive | 51 s | 0.05 s |
| PCA | 2 min 12 s | 0.2 s |
| One-class SVM | 2 min 6 s | 0.2 s |
| Autoencoder | 8 min 24 s | 0.4 s |