2017 | Book

Data Analytics and Decision Support for Cybersecurity

Trends, Methodologies and Applications

Edited by: Iván Palomares Carrascosa, Harsha Kumara Kalutarage, Yan Huang

Publisher: Springer International Publishing

Book series: Data Analytics


About this book

The book illustrates the inter-relationship between several data management, analytics and decision support techniques and methods commonly adopted in Cybersecurity-oriented frameworks. The recent advent of Big Data paradigms and the use of data science methods have resulted in a higher demand for effective data-driven models that support decision-making at a strategic level. This motivates the need for defining novel data analytics and decision support approaches in a myriad of real-life scenarios and problems, with Cybersecurity-related domains being no exception.

This contributed volume comprises nine chapters, written by leading international researchers, covering a compilation of recent advances in Cybersecurity-related applications of data analytics and decision support approaches. In addition to theoretical studies and overviews of existing relevant literature, this book comprises a selection of application-oriented research contributions. The investigations undertaken across these chapters focus on diverse and critical Cybersecurity problems, such as Intrusion Detection, Insider Threats, Collusion Detection, Run-Time Malware Detection, E-Learning, Online Examinations, Cybersecurity noisy data removal, Secure Smart Power Systems, Security Visualization and Monitoring.

Researchers and professionals alike will find the chapters an essential read for further research on the topic.

Table of contents

Frontmatter

Regular Chapters

Frontmatter
A Toolset for Intrusion and Insider Threat Detection
Abstract
Company data are a valuable asset and must be protected against unauthorized access and manipulation. In this contribution, we report on our ongoing work that aims to support IT security experts in identifying novel or obfuscated attacks in company networks, irrespective of their origin inside or outside the company network. A new toolset for anomaly-based network intrusion detection is proposed. This toolset uses flow-based data which can be easily retrieved by central network components. We study the challenges of analysing flow-based data streams using data mining algorithms and build an appropriate approach step by step. In contrast to previous work, we collect flow-based data for each host over a certain time window, include the knowledge of domain experts and analyse the data from three different views. We argue that incorporating expert knowledge and previous flows allows us to create more meaningful attributes for subsequent analysis methods. This way, we try to detect novel attacks while simultaneously limiting the number of false positives.
Markus Ring, Sarah Wunderlich, Dominik Grüdl, Dieter Landes, Andreas Hotho
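As a rough illustration of the idea of collecting flow-based data for each host over a time window, the following minimal sketch groups flow records by source host and window and derives a few simple attributes. The `Flow` fields, the 60-second window and the chosen attributes are assumptions made for this example only; they are not taken from the chapter's toolset.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    """A simplified flow record (hypothetical fields, for illustration only)."""
    start: float    # start time in seconds
    src: str        # source host
    dst: str        # destination host
    dst_port: int   # destination port
    n_bytes: int    # bytes transferred


def aggregate_per_host(flows, window=60.0):
    """Group flows by (source host, window index) and compute simple attributes."""
    buckets = defaultdict(list)
    for f in flows:
        buckets[(f.src, int(f.start // window))].append(f)

    features = {}
    for key, group in buckets.items():
        features[key] = {
            "flows": len(group),
            "bytes": sum(f.n_bytes for f in group),
            "distinct_dsts": len({f.dst for f in group}),
            "distinct_ports": len({f.dst_port for f in group}),
        }
    return features


flows = [
    Flow(0.0, "10.0.0.5", "10.0.0.9", 22, 1200),
    Flow(12.5, "10.0.0.5", "10.0.0.9", 23, 300),
    Flow(30.0, "10.0.0.5", "10.0.0.10", 80, 500),
    Flow(75.0, "10.0.0.5", "10.0.0.9", 22, 100),
]
features = aggregate_per_host(flows)
for key, attrs in sorted(features.items()):
    print(key, attrs)
```

Attributes such as the number of distinct destination ports per window are the kind of value a domain expert might suggest, since a sudden increase can hint at scanning behaviour.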
Human-Machine Decision Support Systems for Insider Threat Detection
Abstract
Insider threats are recognised to be quite possibly the most damaging attacks that an organisation could experience. Those on the inside, who have privileged access and knowledge, are already in a position of great responsibility for contributing towards the security and operations of the organisation. Should an individual choose to exploit this privilege, perhaps due to disgruntlement or external coercion from a competitor, then the potential impact to the organisation can be extremely damaging. There are many proposals for using machine learning and anomaly detection techniques as a means of automated decision-making about which insiders are acting in a suspicious or malicious manner, as a form of large scale data analytics. However, it is well recognised that this poses many challenges, for example, how do we capture an accurate representation of normality to assess insiders against, within a dynamic and ever-changing organisation? More recently, there has been interest in how visual analytics can be incorporated with machine-based approaches, to alleviate the data analytics challenges of anomaly detection and to support human reasoning through visual interactive interfaces. Furthermore, by combining visual analytics and active machine learning, there is potential capability for the analysts to impart their domain expert knowledge back to the system, so as to iteratively improve the machine-based decisions based on the human analyst preferences. With this combined human-machine approach to decision-making about potential threats, the system can begin to more accurately capture human rationale for the decision process, and reduce the false positives that are flagged by the system. In this work, I reflect on the challenges of insider threat detection, and look to how human-machine decision support systems can offer solutions towards this.
Philip A. Legg
Detecting Malicious Collusion Between Mobile Software Applications: The Android™ Case
Abstract
Malware has been a major problem in desktop computing for decades. With the recent trend towards mobile computing, malware is moving rapidly to smartphone platforms. “Total mobile malware has grown 151% over the past year”, according to McAfee®’s quarterly threat report of September 2016. By design, Android™ is “open” to download apps from different sources. Its security depends on restricting apps by combining digital signatures, sandboxing, and permissions. Unfortunately, these restrictions can be bypassed, without the user noticing, by colluding apps whose combined permissions allow them to carry out attacks. In this chapter we report on recent and ongoing research results from our ACID project which suggest a number of reliable means to detect collusion, tackling the aforementioned problems. We present our conceptual work on the topic of collusion and discuss a number of automated tools arising from it.
Irina Măriuca Asăvoae, Jorge Blasco, Thomas M. Chen, Harsha Kumara Kalutarage, Igor Muttik, Hoang Nga Nguyen, Markus Roggenbach, Siraj Ahmed Shaikh
Dynamic Analysis of Malware Using Run-Time Opcodes
Abstract
The continuing fight against intentionally malicious software has, to date, favoured the proliferators of malware. Signature detection methods are increasingly impotent against rapidly evolving obfuscation techniques. Research has recently focussed on the low-level opcode analysis of disassembled executable programs, both statically and dynamically. While able to detect malware, static analysis often still cannot unravel obfuscated code; dynamic approaches allow investigators to reveal the run-time code. Old and inadequately sampled datasets have limited the extrapolation potential of much of the body of research. This work presents a dynamic opcode analysis approach to malware detection, applying machine learning techniques to the largest dataset of its kind, both in terms of breadth (610–100k features) and depth (48k samples). N-gram analysis of opcode sequences for n = 1..3 was applied as a means of enhancing the feature set. Feature selection was then investigated to tackle the feature explosion, which resulted in more than 100,000 features in some cases. As the earliest detection of malware is the most favourable, run-length, i.e. the number of recorded opcodes in a trace, was examined to find the optimal capture size. This research found that dynamic opcode analysis can distinguish malware from benignware with a 99.01% accuracy rate, using a sequence of only 32k opcodes and 50 features. This demonstrates that a dynamic opcode analysis approach can compare with static analysis in terms of speed. Furthermore, it has a very real potential application to the unending fight against malware, which is, by definition, continuously on the back foot.
Domhnall Carlin, Philip O’Kane, Sakir Sezer
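To make the n-gram step concrete, the sketch below counts opcode n-grams for n = 1..3 over a short run-time trace. It is a minimal illustration and not the authors' pipeline: the function name `opcode_ngrams` and the toy trace are assumptions, and the feature selection and classification stages are not shown.

```python
from collections import Counter


def opcode_ngrams(trace, max_n=3):
    """Count every n-gram of opcodes in the trace for n = 1..max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(trace) - n + 1):
            counts[tuple(trace[i:i + n])] += 1
    return counts


# Toy trace of recorded opcodes (hypothetical data).
trace = ["push", "mov", "call", "mov", "call"]
counts = opcode_ngrams(trace)

for ngram, count in counts.most_common(3):
    print(ngram, count)
```

Each distinct n-gram becomes one feature, which is why the feature count grows quickly with n and why a feature selection step is needed before classification.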
Big Data Analytics for Intrusion Detection System: Statistical Decision-Making Using Finite Dirichlet Mixture Models
Abstract
An intrusion detection system has become a vital mechanism to detect a wide variety of malicious activities in the cyber domain. However, such a system still faces an important limitation when it comes to detecting zero-day attacks, namely its relatively high false alarm rate. It is thus necessary to no longer consider the tasks of monitoring and analysing network data in isolation, but instead optimise their integration with decision-making methods for identifying anomalous events. This chapter presents a scalable framework for building an effective and lightweight anomaly detection system. This framework includes three modules: capturing and logging, pre-processing, and a new statistical decision engine, called the Dirichlet mixture model based anomaly detection technique. The first module sniffs and collects network data while the second module analyses and filters these data to improve the performance of the decision engine. Finally, the decision engine is designed based on the Dirichlet mixture model, with a lower-upper interquartile range as the decision boundary. The performance of this framework is evaluated on two well-known datasets, the NSL-KDD and UNSW-NB15. The empirical results showed that the statistical analysis of network data helps in choosing the best model which correctly fits the network data. Additionally, the Dirichlet mixture model based anomaly detection technique provides a higher detection rate and lower false alarm rate than three other compelling techniques. These techniques were built based on correlation and distance measures that cannot detect modern attacks which mimic normal activities, whereas the proposed technique was established using the Dirichlet mixture model and precise boundaries of interquartile range for finding small differences between legitimate and attack vectors, efficiently identifying these attacks.
Nour Moustafa, Gideon Creech, Jill Slay
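The interquartile-range part of such a decision engine can be sketched as follows. This example does not fit a Dirichlet mixture model: the `normal_scores` values stand in for per-record scores that a fitted model might produce on legitimate traffic, and the 1.5 multiplier (Tukey's rule) is an assumption rather than a parameter reported in the chapter.

```python
import statistics


def iqr_bounds(scores, k=1.5):
    """Return (lower, upper) bounds derived from the interquartile range."""
    q1, _, q3 = statistics.quantiles(scores, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_anomalous(score, lower, upper):
    """Flag a score that falls outside the lower-upper baseline."""
    return score < lower or score > upper


# Hypothetical scores of legitimate records under a fitted model.
normal_scores = [0.10, 0.12, 0.11, 0.13, 0.12, 0.14, 0.11, 0.12]
lower, upper = iqr_bounds(normal_scores)

print(is_anomalous(0.12, lower, upper))  # score typical of legitimate traffic
print(is_anomalous(0.90, lower, upper))  # score far from the baseline
```

The bounds are learned from legitimate data only, so any record whose score leaves the interval is treated as a candidate attack, including attack types that were not seen during training.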
Security of Online Examinations
Abstract
Online-examination modeling has been advancing at a slow but steady pace. Such an endeavor is embedded in many of today’s fast-paced educational institutions. The online examination (i.e. e-Examination) model demonstrated in this chapter proposes two major schemes that utilize the most up-to-date features of information and communication technology (ICT). We have integrated authentication methods into this model in the form of simulated and controlled, and thus measurable, enhancements. The new model complies with international examination standards and has been shown to be as immune to threats as, if not more immune than, its predecessor models, including classroom-based examination sessions. Therefore, it can be selected as a new model of examination to cut down on the cost of exam administration and proctoring.
e-Examination systems are vulnerable to cyberattacks, leading to denial-of-service and/or unauthorized access to sensitive information. In order to prevent such attacks and impersonation threats, we have employed smart techniques of continuous authentication. We propose two schemes: the Interactive and Secure E-Examination Unit (ISEEU), which is based on video monitoring, and the Smart Approach for Bimodal Biometrics Authentication in Home-exams (SABBAH), which implements bimodal biometrics and video-matching algorithms. The model is also scalable and upgradable, keeping it open to smarter integration of the state of the art in the field of continuous authentication. For validation purposes, we have conducted a comprehensive risk analysis, and results show that our proposed model achieved higher scores than the previous ones.
Yousef W. Sabbah
Attribute Noise, Classification Technique, and Classification Accuracy
Abstract
Binary data classification is an integral part of cyber-security, as most of the response variables follow a binary nature. The accuracy of data classification depends on various aspects. Though the data classification technique has a major impact on classification accuracy, the nature of the data also matters a lot. One of the main concerns that can hinder the classification accuracy is the presence of noise. Therefore, both choosing the appropriate data classification technique and the identification of noise in the data are equally important. The aim of this study is twofold. First, we aim to study the influence of noise on accurate data classification. Second, we strive to improve the classification accuracy by handling the noise. To this end, we compare several classification techniques and propose a novel noise removal algorithm. Our study is based on data collected about online credit-card transactions. According to the empirical outcomes, we find that the noise hinders the classification accuracy significantly. In addition, the results indicate that the accuracy of data classification depends on the quality of the data and the classification technique used. Out of the selected classification techniques, Random Forest performs better than its counterparts. Furthermore, experimental evidence suggests that the classification accuracy of noised data can be improved by the appropriate selection of the sizes of training and testing data samples. Our proposed simple noise-removal algorithm shows higher performance and the percentage of noise removal significantly depends on the selected bin size.
R. Indika P. Wickramasinghe

Invited Chapters

Frontmatter
Learning from Loads: An Intelligent System for Decision Support in Identifying Nodal Load Disturbances of Cyber-Attacks in Smart Power Systems Using Gaussian Processes and Fuzzy Inference
Abstract
The future of electric power is associated with the use of information technologies. The smart grid of the future will utilize communications and big data to regulate power flow, shape demand with a plethora of pieces of information and ensure reliability at all times. However, the extensive use of information technologies in the power system may also form a Trojan horse for cyberattacks. Smart power systems where information is utilized to predict load demand at the nodal level are of interest in this work. Control of power grid nodes may constitute an important tool in cyberattackers’ hands to bring chaos to the electric power system. An intelligent system is proposed for analyzing loads at the nodal level in order to detect whether a cyberattack has occurred in the node. The proposed system integrates computational intelligence with kernel modeled Gaussian processes and fuzzy logic. The overall goal of the intelligent system is to provide a degree of possibility as to whether the load demand is legitimate or has been manipulated in a way that is a threat to the safety of the node and that of the grid in general. The proposed system is tested with real-world data.
Miltiadis Alamaniotis, Lefteri H. Tsoukalas
Visualization and Data Provenance Trends in Decision Support for Cybersecurity
Abstract
The vast amount of data collected daily from logging mechanisms on web and mobile applications lacks effective analytic approaches to provide insights for cybersecurity. The analytical time currently taken to identify zero-day attacks and respond with a patch or detection mechanism is unmeasurable. This is a current challenge and struggle for cybersecurity researchers. User- and data provenance-centric approaches are the growing trend in aiding defensive and offensive decisions on cyber-attacks. In this chapter we introduce (1) our Security Visualization Standard (SCeeL-VisT); (2) the Security Visualization Effectiveness Measurement (SvEm) Theory; (3) the concept of Data Provenance as a Security Visualization Service (DPaaSVS); and (4) highlight growing trends of using data provenance methodologies and security visualization methods to aid data analytics and decision support for cyber security. Security visualization showing provenance from a spectrum of data samples on an attack helps researchers to reconstruct the attack from source to destination. This helps identify possible attack patterns and behaviors, which results in the creation of effective detection mechanisms for cyber-attacks.
Jeffery Garae, Ryan K. L. Ko
Metadata
Title
Data Analytics and Decision Support for Cybersecurity
Edited by
Iván Palomares Carrascosa
Harsha Kumara Kalutarage
Yan Huang
Copyright year
2017
Electronic ISBN
978-3-319-59439-2
Print ISBN
978-3-319-59438-5
DOI
https://doi.org/10.1007/978-3-319-59439-2