Skip to main content

2018 | Buch

Cyber Security Cryptography and Machine Learning

Second International Symposium, CSCML 2018, Beer Sheva, Israel, June 21–22, 2018, Proceedings

insite
SUCHEN

Über dieses Buch

This book constitutes the refereed proceedings of the Second International Symposium on Cyber Security Cryptography and Machine Learning, CSCML 2018, held in Beer-Sheva, Israel, in June 2018.

The 16 full and 6 short papers presented in this volume were carefully reviewed and selected from 44 submissions. They deal with the theory, design, analysis, implementation, or application of cyber security, cryptography and machine learning systems and networks, and conceptually innovative topics in the scope.

Inhaltsverzeichnis

Frontmatter
Optical Cryptography for Cyber Secured and Stealthy Fiber-Optic Communication Transmission
Invited Paper
Abstract
We propose a method for stealthy, covert, fiber-optic communication. In this method, the power of the transmitted signal is spread and lowered below the noise level both in time as well as in frequency domains which makes the signal “invisible”. The method is also efficient in jamming avoidance.
Tomer Yeminy, Eyal Wohlgemuth, Dan Sadot, Zeev Zalevsky
Efficient Construction of the Kite Generator Revisited
Abstract
The kite generator, first introduced by Andreeva et al. [1], is a strongly connected directed graph that allows creating a message of almost any desired length, connecting two chaining values covered by the kite generator. The kite generator can be used in second pre-image attacks against (dithered) Merkle-Damgård hash functions.
In this work we discuss the complexity of constructing the kite generator. We show that the analysis of the construction of the kite generator first described by Andreeva et al. is somewhat inaccurate and discuss its actual complexity. We follow with presenting a new method for a more efficient construction of the kite generator, cutting the running time of the preprocessing by half (compared with the original claims of Andreeva et al. or by a linear factor compared to corrected analysis). Finally, we adapt the new method to the dithered Merkle-Damgård structure.
Orr Dunkelman, Ariel Weizman
Using Noisy Binary Search for Differentially Private Anomaly Detection
Abstract
In this paper, we study differential privacy in noisy search. This problem is connected to noisy group testing: the goal is to find a defective or anomalous item within a group using only aggregate group queries, not individual queries. Differentially private noisy group testing has the potential to be used for anomaly detection in a way that provides differential privacy to the non-anomalous individuals while still helping to allow the anomalous individuals to be located. To do this, we introduce the notion of anomaly-restricted differential privacy. We then show that noisy group testing can be used to satisfy anomaly-restricted differential privacy while still narrowing down the location of the anomalous samples, and evaluate our approach experimentally.
Daniel M. Bittner, Anand D. Sarwate, Rebecca N. Wright
Distributed Web Mining of Ethereum
Abstract
We consider the problem of mining crytocurrencies by harnessing the inherent distribution capabilities of the World Wide Web. More specifically, we propose, analyze, and implement WebEth, a browser-based distributed miner of the Ethereum cryptocurrency. WebEth handles Proof-of-Work (PoW) calculations through individualized code that runs on the client browsers, and thereafter collates them at a web server to complete the mining operation. WebEth is based on a lazy evaluation technique designed to function within the expected limitations of the clients, including bounds on memory, computation and communication bandwidth to the server. We provide proofs-of-concept of WebEth based on JavaScript and WebAssembly implementations, with the latter reaching hash rates up to roughly 40 kiloHashes per second, which is only 30% slower than the corresponding native C++-based implementation. Finally, we explore several applications of WebEth, including monetization of web content, rate limitation to server access, and private Ethereum networks. Though several distributed web-based cryptominers have appeared in the wild (for other currencies), either in malware or in commercial trials, we believe that WebEth is the first open-source cryptominer of this type.
Trishita Tiwari, David Starobinski, Ari Trachtenberg
An Information-Flow Control Model for Online Social Networks Based on User-Attribute Credibility and Connection-Strength Factors
Abstract
During the last couple of years there have been many researches on Online Social Networks (OSN). The common manner of representing an OSN is by a user-based graph, where the vertices are different OSN users, and the edges are different interactions between these users, such as friendships, information-sharing instances, and other connection types. The question of whether a certain user is willing to share its information to other users, known and less known, is a question that occupies several researches in aspects of information security, sharing habits and information-flow models for OSN. While many approaches take into consideration the OSN graph edges as sharing-probability factors, here we present a novel approach, that also combines the vertices as well-defined attributed entities, that contain several properties, in which we seek a certain level of credibility based on the user’s attributes, such as number of total friends, age of user account, etc. The edges in our model represent the connection-strength of two users, by taking into consideration the attributes that represent their connection, such as number of mutual friend, friendship duration, etc. and the model also recognizes resemblance factors, meaning the number of similar user attributes. This approach optimizes the evaluation of users’ information-sharing willingness by deriving it from these attributes, thus creating an accurate flow-control graph that prevents information leakage from users to unwanted entities, such as adversaries or spammers. The novelty of the model is mainly its choice of integrated factors for user credibility and connection credibility, making it very useful for different OSN flow-control decisions and security permissions.
Ehud Gudes, Nadav Voloch
Detecting and Coloring Anomalies in Real Cellular Network Using Principle Component Analysis
Abstract
Anomaly detection in a communication network is a powerful tool for predicting faults, detecting network sabotage attempts and learning user profiles for marketing purposes and quality of services improvements. In this article, we convert the unsupervised data mining learning problem into a supervised classification problem. We will propose three methods for creating an associative anomaly within a given commercial traffic data database and demonstrate how, using the Principle Component Analysis (PCA) algorithm, we can detect the network anomaly behavior and classify between a regular data stream and a data stream that deviates from a routine, at the IP network layer level. Although the PCA method was used in the past for the task of anomaly detection, there are very few examples where such tasks were performed on real traffic data that was collected and shared by a commercial company.
The article presents three interesting innovations: The first one is the use of an up-to-date database produced by the users of an international communications company. The dataset for the data mining algorithm retrieved from a data center which monitors and collects low-level network transportation log streams from all over the world. The second innovation is the ability to enable the labeling of several types of anomalies, from untagged datasets, by organizing and prearranging the database. The third innovation is the abilities, not only to detect the anomaly but also, to coloring the anomaly type. I.e., identification, classification and labeling some forms of the abnormality.
Yoram Segal, Dan Vilenchik, Ofer Hadar
Self-stabilizing Byzantine Tolerant Replicated State Machine Based on Failure Detectors
Abstract
Byzantine Fault Tolerant (BFT) replication leverages highly available cloud services and can facilitate the implementation of distributed ledgers, e.g., the blockchain. Systems providing BFT State Machine Replication (SMR) work under severe system assumptions, for example, that less than a third of replicas may suffer a Byzantine failure. Infrequent arbitrary violations of such design assumptions, may lead the system to an unintended state, and render it unavailable thereafter, requiring human intervention. Self-stabilization is a highly desirable system property that can complement Byzantine fault tolerant systems, and allow them to both tolerate Byzantine-failures and automatically recovery from any unintended state that assumption violations may lead to.
This paper contributes the first self-stabilizing State Machine Replication service that is based on failure detectors. We suggest an implementable self-stabilizing failure detector to monitor both responsiveness and the replication progress. We thus encapsulate weaker synchronization guarantees than the previous self-stabilizing BFT SMR solution. We follow the seminal paper by Castro and Liskov of Practical Byzantine Fault Tolerance and focus on the self-stabilizing perspective. This work can aid towards building distributed blockchain system infrastructure enhanced with the self-stabilization design criteria.
Shlomi Dolev, Chryssis Georgiou, Ioannis Marcoullis, Elad M. Schiller
Brief Announcement: Providing End-to-End Secure Communication in Low-Power Wide Area Networks
Abstract
Recent technologies for low-rate, long-range transmission in unlicensed sub-GHz frequency bands enables the realization of Long-range Wide Area Network. Despite the rapid uptake of LPWANs, security concerns arising from the open architecture and usage of the unlicensed band are also growing. While the current LPWAN deployments include basic techniques to deal with end-to-end encryption there are specific security issues that arise due to the overall architecture and protocol layer design. In this paper, a new scheme to establish end-to-end secure communication in long-range IoT deployments is introduced. The advantages over the existing approaches and architectural design are presented in the context of typical smart cities application scenarios.
Ioannis Chatzigiannakis, Vasiliki Liagkou, Paul G. Spirakis
Privacy via Maintaining Small Similitude Data for Big Data Statistical Representation
Abstract
Despite its attractiveness, Big Data oftentimes is hard, slow and expensive to handle due to its size. Moreover, as the amount of collected data grows, individual privacy raises more and more concerns: “what do they know about me?” Different algorithms were suggested to enable privacy-preserving data release with the current de-facto standard differential privacy. However, the processing time of keeping the data private is inhibiting and currently not practical for every day use. Combined with the continuously growing data collection, the solution is not seen on a horizon.
In this research, we suggest replacing the Big Data with a much smaller similitude model. The model “resembles” the data with respect to a class of query. The user defines the maximum acceptable error and privacy requirements ahead of the query execution. Those requirements define the minimal size of the similitude model. The suggested method is demonstrated by using a wavelet transform and then by pruning the tree according to both the data reduction and the privacy requirements. We propose methods of combining the noise required for privacy preservation with noise of similitude model, that allow us to decrease the amount of added noise thus, improving the utilization of the method.
Philip Derbeko, Shlomi Dolev, Ehud Gudes
Highway State Gating for Recurrent Highway Networks: Improving Information Flow Through Time
Abstract
Recurrent Neural Networks (RNNs) play a major role in the field of sequential learning, and have outperformed traditional algorithms on many benchmarks. Training deep RNNs still remains a challenge, and most of the state-of-the-art models are structured with a transition depth of 2–4 layers. Recurrent Highway Networks (RHNs) were introduced in order to tackle this issue. These have achieved state-of-the-art performance on a few benchmarks using a depth of 10 layers. However, the performance of this architecture suffers from a bottleneck, and ceases to improve when an attempt is made to add more layers. In this work, we analyze the causes for this, and postulate that the main source is the way that the information flows through time. We introduce a novel and simple variation for the RHN cell, called Highway State Gating (HSG), which allows adding more layers, while continuing to improve performance. By using a gating mechanism for the state, we allow the net to “choose” whether to pass information directly through time, or to gate it. This mechanism also allows the gradient to back-propagate directly through time and, therefore, results in a slightly faster convergence. We use the Penn Treebank (PTB) dataset as a platform for empirical proof of concept. Empirical results show that the improvement due to Highway State Gating is for all depths, and as the depth increases, the improvement also increases.
Ron Shoham, Haim Permuter
Secured Data Gathering Protocol for IoT Networks
Abstract
Data collection in Wireless Sensor Networks (WSN) and specifically in the Internet of Things (IoT) networks draws significant attention both by the industrial and academic communities. Numerous Medium Access Control (MAC) protocols for WSN have been suggested over the years, designed to cope with a variety of setups and objectives. However, most IoT devices are only required to exchange very little information (typically one out of several predetermined messages), and do so only sporadically. Furthermore, only a small subset (which is not necessarily known a priori) intends to transmit at any given time. Accordingly, a tailored protocol is much more suited than the existing general purpose WSN protocols. In many IoT applications securing the data transmitted and the identity of the transmitting devices is critical. However, security in such IoT networks is highly challenging since the devices are typically very simple, with highly constrained capabilities, e.g., limited memory and computational power or no sophisticated algorithmic capabilities, which make the utilization of complex cryptographic primitives unfeasible. Furthermore, note that in many such applications, securing the information transmitted is not sufficient, since knowing the transmitters identity conveys a lot of information (e.g., the identity of a hazard detector conveys the information that a threat was detected).
In this paper, we design and analyze an efficient secure data collection protocol based on information theoretic principles, in which an eavesdropper observing only partial information sent on the channel cannot gain significant information on the transmitted messages or even on the identity of the devices that sent these messages. In the suggested protocol, the sink collects messages from up to K sensors simultaneously, out of a large population of sensors, without knowing in advance which sensors will transmit, and without requiring any synchronization, coordination or management overhead. In other words, neither the sink nor the other sensors need to know who are the actively transmitting sensors, and this data is decoded directly from the channel output. We provide a simple secure codebook construction with very efficient and simple encoding and decoding procedures.
Alejandro Cohen, Asaf Cohen, Omer Gurewitz
Towards Building Active Defense Systems for Software Applications
Abstract
Over the last few years, cyber attacks have become increasingly sophisticated. PDF malware – a continuously effective method of attack due to the difficulty of classifying malicious files – is a popular target of study within the field of machine learning for cybersecurity. The obstacles to using machine learning are many: attack patterns change over time as attackers change their behavior (sometimes automatically), and application security systems are deployed in a highly resource-constrained environments, meaning that an accurate but time-consuming machine learning cannot be deployed.
Motivated by these challenges, we propose an active defender system to adapt to evasive PDF malware in a resource-constrained environment. We observe this system to improve the \(f_1\) score from 0.17535 to 0.4562 over five stages of receiving unlabeled PDF files. Furthermore, average classification time per file is low across all 5 stages, and is reduced from an average of 1.16908 s per file to 1.09649 s per file. Beyond classifying malware, we provide a general active defender framework that can be used to deploy decision systems for a variety of applications operating under resource-constrained environments with adversaries.
Zara Perumal, Kalyan Veeramachaneni
Secure Non-interactive User Re-enrollment in Biometrics-Based Identification and Authentication Systems
Abstract
Recent years have witnessed an increase in demand for biometrics based identification, authentication and access control (BIA) systems, which offer convenience, ease of use, and (in some cases) improved security. In contrast to other methods, such as passwords or pins, BIA systems face new unique challenges; chiefly among them is ensuring long-term confidentiality of biometric data stored in backends, as such data has to be secured for the lifetime of an individual. Cryptographic approaches such as Fuzzy Extractors (FE) and Fuzzy Vaults (FV) have been developed to address this challenge. FE/FV do not require storing any biometric data in backends, and instead generate and store helper data that enables BIA when a new biometric reading is supplied. Security of FE/FV ensures that an adversary obtaining such helper data cannot (efficiently) learn the biometric. Relying on such cryptographic approaches raises the following question: what happens when helper data is lost or destroyed (e.g., due to a failure, or malicious activity), or when new helper data has to be generated (e.g., in response to a breach or to update the system)? Requiring a large number of users to physically re-enroll is impractical, and the literature falls short of addressing this problem. In this paper we develop SNUSE, a secure computation based approach for non-interactive re-enrollment of a large number of users in BIA systems. We prototype SNUSE to illustrate its feasibility, and evaluate its performance and accuracy on two biometric modalities, fingerprints and iris scans. Our results show that thousands of users can be securely re-enrolled in seconds without affecting the accuracy of the system.
Ivan De Oliveira Nunes, Karim Eldefrawy, Tancrède Lepoint
Brief Announcement: Image Authentication Using Hyperspectral Layers
Abstract
Access control systems using face recognition, are widely implemented. This technic lacks the ability to bypass it. To avoid it, an authentication process is required. In this paper we propose a new security image-signature, which authenticates the given image. The proposed signature is generated from the corresponding hyperspectral image layers. The process extracts unique patterns from the hyperspectral layers, these are collected to build a unique biometric signature for the related person. Experiments show the potential of enhancing image authentication using the proposed signature.
Guy Leshem, Menachem Domb
Brief Announcement: Graph-Based and Probabilistic Discrete Models Used in Detection of Malicious Attacks
Abstract
Design of program secure systems is connected with choice of mathematical models of the systems. A widely-used approach to malware detection (or classification as “benign-malicious”) is based on the system calls traces similarity measurement. Presently both the set-theoretical metrics (for example, Jaccard similarity, the Edit (Levenshtein) distance (ED) [1]) between the traces of system calls and the Markov chain based models of attack effect are used. Jaccard similarity is used when the traces are considered as a non-ordering set. The Edit Distance, namely, the minimal number of edit operations (delete, insert and substitute of a single symbol) required to convert one sequence to the other, is used as it reflects the traces ordering and semantics. However, the time and space complexity of the edit distance between two strings requires quadratic (in symbol numbers) complexity [1]. The traces can also be represented as a system calls graphs [2], the nodes of which are the system calls (or the items of the q-grams [1]). That is, we can consider the traces description by the ordered string as a partial case of the graph representation, for which it is possible to use the same similarity metrics with the same computational complexity.
This work demonstrates a framework for combining both graph-based and probabilistic models enabling both the analysis of the system robustness to malicious attacks and malicious codes recognition and detection.
Sergey Frenkel, Victor Zakharov
Intercepting a Stealthy Network
Abstract
We investigate an understudied threat: networks of stealthy routers (S-Routers), communicating across a restricted area. S-Routers use short-range, low-energy communication, detectable only by nearby devices.
We examine algorithms to intercept S-Routers, using one or more mobile devices, called Interceptors. We focus on Destination-Search scenarios, in which the goal of the Interceptors is to find a (single) destination S-Router, by detecting transmissions along one or more paths from a given (single) source S-Router. We evaluate the algorithms analytically and experimentally (simulations), including against a parametric, optimized S-Routers algorithm.
Our main result is an Interceptors algorithm which bounds the expected time until the destination is found to \(O\left( \frac{N}{\hat{B}} \log ^2(N)\right) \), where \(N\) is the number of S-Routers and \(\hat{B}\) is the average rate of transmission.
Mai Ben Adar Bessos, Amir Herzberg
Privacy in e-Shopping Transactions: Exploring and Addressing the Trade-Offs
Abstract
The huge growth of e-shopping has brought convenience to customers, increased revenue to merchants and financial entities and evolved to possess a rich set of functionalities and requirements (e.g., regulatory ones). However, enhancing customer privacy remains to be a challenging problem; while it is easy to create a simple system with privacy, this typically causes loss of functions.
In this work, we look into current e-shopping infrastructures and aim at enhancing customer privacy while retaining important features and requiring the system to maintain the topology and transaction flow of established e-shopping systems that are currently operational. Thus, we apply what we call the “utility, privacy, and then utility again” paradigm: we start from the state of the art of e-shopping (utility); then we add privacy enhancing mechanisms, reducing its functionality in order to tighten privacy to the fullest (privacy); and finally, we incorporate tools which add back lost features, carefully relaxing privacy this time (utility again).
We also implemented and tested our design, verifying its reasonable added costs.
Jesus Diaz, Seung Geol Choi, David Arroyo, Angelos D. Keromytis, Francisco B. Rodriguez, Moti Yung
Detection in the Dark – Exploiting XSS Vulnerability in C&C Panels to Detect Malwares
Abstract
Numerous defense techniques exist for preventing and detecting malware on end stations and servers (endpoints). Although these techniques are widely deployed on enterprise networks, many types of malware manage to stay under the radar, executing their malicious actions time and again. Therefore, a more creative and effective solution is necessary, especially as classic threat detection techniques do not utilize all stages of the attack kill chain in their attempt to detect malicious behavior on endpoints.
In this paper, we propose a novel approach for detecting malware. Our approach uses offensive and defensive techniques for detecting active malware attacks by exploiting the vulnerabilities of their command and control panels and manipulating significant values in the operating systems of endpoints – in order to attack these panels and utilize trusted communications between them and the infected machine.
Shay Nachum, Assaf Schuster, Opher Etzion
A Planning Approach to Monitoring Computer Programs’ Behavior
Abstract
We describe a novel approach to monitoring high level behaviors using concepts from AI planning. Our goal is to understand what a program is doing based on its system call trace. This ability is particularly important for detecting malware. We approach this problem by building an abstract model of the operating system using the STRIPS planning language, casting system calls as planning operators. Given a system call trace, we simulate the corresponding operators on our model and by observing the properties of the state reached, we learn about the nature of the original program and its behavior. Thus, unlike most statistical detection methods that focus on syntactic features, our approach is semantic in nature. Therefore, it is more robust against obfuscation techniques used by malware that change the outward appearance of the trace but not its effect. We demonstrate the efficacy of our approach by evaluating it on actual system call traces.
Alexandre Cukier, Ronen I. Brafman, Yotam Perkal, David Tolpin
One-Round Secure Multiparty Computation of Arithmetic Streams and Functions
(Extended Abstract)
Abstract
Efficient secure multiparty computation (SMPC) schemes over secret shares are presented. We consider scenarios in which the secrets are elements of a finite field, \(\mathbb {F}_{p}\), and are held and shared by a single participant, the user. Evaluation of any function \(f:\mathbb {F}_{p}^n\rightarrow \mathbb {F}_{p}\) is implemented in one round of communication by representing f as a multivariate polynomial. Our schemes are based on partitioning secrets to sums or products of random elements of the field. Secrets are shared using either (multiplicative) shares whose product is the secret or (additive) shares that sum up to the secret. Sequences of additions of secrets are implemented locally by addition of local shares, requiring no communication among participants, and so does sequences of multiplications of secrets. The shift to handle a sequence of additions from the execution of multiplications or vice versa is efficiently handled as well with no need to decrypt the secrets in the course of the computation. On each shift from multiplications to additions or vice versa, the current set of participants is eliminated, and a new set of participants becomes active. Assuming no coalitions among the active participants and the previously eliminated participants are possible, our schemes are information-theoretically secure with a threshold of all active participants. Our schemes can also be used to support SMPC of boolean circuits.
Dor Bitan, Shlomi Dolev
Brief Announcement: Gradual Learning of Deep Recurrent Neural Network
Abstract
Deep Recurrent Neural Networks (RNNs) achieve state-of-the-art results in many sequence-to-sequence modeling tasks. However, deep RNNs are difficult to train and tend to suffer from overfitting. Motivated by the Data Processing Inequality (DPI) we formulate the multi-layered network as a Markov chain, introducing a training method that comprises training the network gradually and using layer-wise gradient clipping. In total, we have found that applying our methods combined with previously introduced regularization and optimization methods resulted in improvement to the state-of-the-art architectures operating in language modeling tasks.
Ziv Aharoni, Gal Rattner, Haim Permuter
Brief Announcement: Adversarial Evasion of an Adaptive Version of Western Electric Rules
Abstract
Western-Electric are one of the earliest, and widely used, anomaly detection rules. In this paper we describe an adaptive scenario using these rules and show how a malicious player can optimally fabricate data to deceive the algorithm to enlarge the standard deviation of the data while avoiding being detected.
Oded Margalit
Brief Announcement: Deriving Context for Touch Events
Abstract
To quantify the amount of high-level context information which can be derived by observing only a user’s touchscreen interactions, we performed a user study, in which we recorded 160 touch interaction sessions from users running different applications, and then applied both classical machine learning methods and deep learning methods to the results. Our results show that it is possible to derive higher-level user context information based on touch events alone, validating the efficacy of touch injection attacks.
Moran Azran, Niv Ben Shabat, Tal Shkolnik, Yossi Oren
Backmatter
Metadaten
Titel
Cyber Security Cryptography and Machine Learning
herausgegeben von
Itai Dinur
Dr. Shlomi Dolev
Sachin Lodha
Copyright-Jahr
2018
Electronic ISBN
978-3-319-94147-9
Print ISBN
978-3-319-94146-2
DOI
https://doi.org/10.1007/978-3-319-94147-9

Premium Partner