
About this book

Advances in artificial intelligence, sensor computing, robotics, and mobile systems are making autonomous systems a reality. At the same time, the influence of edge computing is leading to more distributed architectures incorporating more autonomous elements. The flow of information is critical in such environments, but the real-time, distributed nature of the system components complicates the data protection mechanisms. Policy-based management has proven useful in reducing the complexity of management in domains like networking, security, and storage; many of those benefits are expected to carry over to the task of managing big data and autonomous systems.

This book aims to provide an overview of recent work and to identify challenges related to the design of policy-based approaches for managing big data and autonomous systems. An important new direction explored in the book is to make the major elements of the system self-describing and self-managing. This would lead to architectures where policy mechanisms are tightly coupled with the system elements. In such integrated architectures, we need new models for information assurance, traceability of information, and better provenance on information flows. In addition, safety is critical when dealing with devices that have actuation capabilities and can therefore make changes to physical spaces. With an emphasis on policy-based mechanisms for governance of data security and privacy, and for safety assurance, the papers in this volume follow three broad themes: foundational principles and use-cases for the autonomous generation of policies; safe autonomy; policies and autonomy in federated environments.



Systems, Use-Cases and Foundational Principles Underlying Generative Policies


AGENP: An ASGrammar-based GENerative Policy Framework

Generative policies have been proposed as a mechanism to learn the constraints and preferences of a system (especially complex systems such as the ones found in coalitions) in a given context so that the system can adapt to unexpected changes seamlessly, thus achieving the system goals with minimal human intervention. Generative policies can help a coalition system to be more effective when working in a distributed, continuously transforming environment with a diverse set of members, resources, and tasks. Learning mechanisms based on logic programming, e.g., Inductive Logic Programming (ILP), have several properties that make them suitable and attractive for the creation and adaptation of generative policies, such as the ability to learn a general model from a small number of examples and to incorporate existing background knowledge. ILP has recently been extended with the introduction of systems for Inductive Learning of Answer Set Programs (ILASP), which are capable of supporting automated acquisition of complex knowledge such as constraints, preferences and rule-based models. Motivated by the capabilities of ILASP, we present AGENP, an Answer Set Grammar-based Generative Policy Framework for Autonomous Managed Systems (AMS) that aims to support the creation and evolution of generative policies by leveraging ILASP. We describe the framework components, i.e., inputs, data structures, mechanisms to support the refinement and instantiation of policies, identification of policy violations, monitoring of policies, and policy adaptation according to changes in the AMS and its context. Additionally, we present the main workflow for the global and local refinement of policies and their adaptation based on Answer Set Programming (ASP) for policy representation and reasoning using ILASP. We then discuss an application of the AGENP framework and present preliminary results.
Seraphin Calo, Irene Manotas, Geeth de Mel, Daniel Cunnington, Mark Law, Dinesh Verma, Alessandra Russo, Elisa Bertino

Value of Information: Quantification and Application to Coalition Machine Learning

The creation of good machine learning models relies on the availability of good training data. In coalition settings, this training data may be obtained from many different coalition partners. However, due to the difference in the trust level of the coalition partners, the value of the information provided by the coalition partners could be questionable. In this paper, we examine the concept of Value of Information, provide a quantitative measure for it, and show how this can be used to determine the policies for information fusion in the training of machine learning models.
Gavin Pearson, Dinesh Verma, Geeth de Mel

Self-Generating Policies for Machine Learning in Coalition Environments

In any machine learning problem, acquiring good training data is the main challenge that needs to be overcome to build a good model. When applying machine learning approaches in the context of coalition operations, one may only be able to get data for training machine learning models from coalition partners. However, not all coalition partners may be equally trusted, so the task of deciding when, and when not, to accept training data for coalition operations remains complex. Policies can provide a mechanism for making these decisions, but determining the right policies may be difficult given the variability of the environment. Motivated by this observation, in this paper, we propose an architecture that can generate policies required for building a machine learning model in a coalition environment without a significant amount of human input.
Dinesh Verma, Seraphin Calo, Shonda Witherspoon, Irene Manotas, Elisa Bertino, Amani M. Abu Jabal, Greg Cirincione, Ananthram Swami, Gavin Pearson, Geeth de Mel

Approaches and Techniques for Safe Autonomy


Can N-Version Decision-Making Prevent the Rebirth of HAL 9000 in Military Camo? Using a “Golden Rule” Threshold to Prevent AI Mission Individuation

The promise of AIs that can target, shoot at, and eliminate enemies in the blink of an eye brings about the possibility that such AIs can turn rogue and create an adversarial “Skynet.” The main danger is not that AIs might turn against us because they hate us, but because they think they want to be like us: individuals. The solution might be to treat them like individuals. This should include the right and obligation to do unto others as any AI would want other AIs or humans to do unto them. Technically, this involves an N-version decision making process that takes into account not how good or efficient the decision of an AI is, but how likely the AI is to show algorithmic “respect” to other AIs or human rules and operators. In this paper, we discuss a possible methodology for deploying AI decision making that uses multiple AI actors to check on each other to prevent “mission individuation,” i.e., the AIs wanting to complete the mission even if the human operators are sacrificed. The solution envisages mechanisms that require the AIs to “do unto others as others would do unto them” in making final decisions. This should encourage AIs to accept critique and censoring in certain situations and, most importantly, it should lead to decisions that protect both human operators and the final goal of the mission.
Sorin Adam Matei, Elisa Bertino

Simulating User Activity for Assessing Effect of Sampling on DB Activity Monitoring Anomaly Detection

Monitoring database activity is useful for identifying and preventing data breaches. Such database activity monitoring (DAM) systems use anomaly detection algorithms to alert security officers to possible infractions. However, the sheer number of transactions makes it impossible to track each transaction. Instead, solutions use manually crafted policies to decide which transactions to monitor and log. Creating a smart data-driven policy for monitoring transactions requires moving beyond manual policies. In this paper, we describe a novel simulation method for user activity. We introduce events of change in the user transaction profile and assess the impact of sampling on the anomaly detection algorithm. We found that looking for anomalies in a fixed subset of the data using a static policy misses most of these events since low-risk users are ignored. A Bayesian sampling policy identified 67% of the anomalies while sampling only 10% of the data, compared to a baseline of using all of the data.
Hagit Grushka-Cohen, Ofer Biller, Oded Sofer, Lior Rokach, Bracha Shapira

FADa-CPS—Faults and Attacks Discrimination in Cyber Physical Systems

Running autonomous cyber physical systems (CPSs) in a safe way entails several complex activities that include monitoring the system for ongoing attacks or faults. Root cause analysis is a technique used to identify the initial cause of a cascading sequence of faults affecting a complex system. In this paper we introduce FADa-CPS, an architecture for root cause analysis in CPSs whose goal is identifying and localizing faults caused either by natural events or by attacks. The architecture is designed to be flexible so that it can adapt to evolving monitored systems.
Pierpaolo Bo, Alessandro Granato, Marco Ernesto Mancuso, Claudio Ciccotelli, Leonardo Querzoni

Techniques and Systems for Anomaly Detection in Database Systems

Techniques for detecting anomalies in accesses to database systems have been widely investigated. Existing techniques operate in two main phases. The first phase is a training phase during which profiles of the database subjects are created based on historical data representing users’ past actions. In the second phase, new actions are checked against these profiles to detect deviations from the expected normal behavior. Such deviations are considered indicators of possible attacks and may thus require further analysis. The existing techniques have considered different categories of features to describe users’ actions and followed different methodologies and algorithms to build access profiles and track users’ behaviors. In this chapter, we review the prominent techniques and systems for anomaly detection in database systems. We discuss the attacks they help detect as well as their limitations and possible extensions. We also give directions for potential future research.
Asmaa Sallam, Elisa Bertino
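
The two-phase scheme described in this abstract (build profiles from history, then check new actions against them) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the feature choice (command and table), the function names `train_profiles` and `is_anomalous`, and the `min_share` threshold are hypothetical and are not taken from the chapter.

```python
from collections import Counter, defaultdict

def train_profiles(history):
    """Training phase: count how often each user issued each (command, table) pair."""
    profiles = defaultdict(Counter)
    for user, command, table in history:
        profiles[user][(command, table)] += 1
    return profiles

def is_anomalous(profiles, user, command, table, min_share=0.05):
    """Detection phase: flag an action that is rare or unseen in the user's profile.

    min_share is a hypothetical threshold: the minimum fraction of the user's
    past actions that the (command, table) pair must account for.
    """
    profile = profiles.get(user)
    if not profile:
        return True  # no history for this user, so nothing counts as normal
    share = profile[(command, table)] / sum(profile.values())
    return share < min_share

# Hypothetical historical log: (user, SQL command, table).
history = [("alice", "SELECT", "orders")] * 9 + [("alice", "SELECT", "customers")]
profiles = train_profiles(history)

print(is_anomalous(profiles, "alice", "SELECT", "orders"))
print(is_anomalous(profiles, "alice", "DELETE", "orders"))
```

A frequency profile like this is only one of many possible feature sets; the chapter surveys several others.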

Policies and Autonomy in Federated and Distributed Environments


Towards Enabling Trusted Artificial Intelligence via Blockchain

Machine Learning and Artificial Intelligence models are created, trained and used by different entities. The entity that curates data used for the model is frequently different from the entity that trains the model, which is different yet again from the end user of the trained model. The end user needs to trust the received AI model, and this requires having the provenance information about how the model was trained, and the data the model was trained on. This chapter describes how blockchain can be used to track the provenance of training models, leading to better trusted Artificial Intelligence.
Kanthi Sarpatwar, Roman Vaculin, Hong Min, Gong Su, Terry Heath, Giridhar Ganapavarapu, Donna Dillenberger
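
The core idea of tamper-evident provenance can be sketched with a simple hash chain, in which each record commits to the one before it. This is an assumption-laden illustration, not the chapter's system: it runs in a single process with no consensus or distribution, and the record fields (`actor`, `action`, `artifact`) and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record

def record_hash(body):
    """Hash a record body deterministically (keys sorted before serialization)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def append_record(chain, actor, action, artifact_digest):
    """Append a provenance record that links to the hash of the previous one."""
    prev = chain[-1]["hash"] if chain else GENESIS
    body = {"actor": actor, "action": action,
            "artifact": artifact_digest, "prev": prev}
    chain.append({**body, "hash": record_hash(body)})

def verify_chain(chain):
    """Return True only if every record is intact and correctly linked."""
    prev = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev"] != prev or record_hash(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

# Hypothetical pipeline: a curator publishes data, a trainer produces a model.
chain = []
data_digest = hashlib.sha256(b"training data v1").hexdigest()
model_digest = hashlib.sha256(b"model weights v1").hexdigest()
append_record(chain, "curator", "publish-dataset", data_digest)
append_record(chain, "trainer", "train-model", model_digest)
print(verify_chain(chain))
```

An end user who holds the chain can recompute the hashes to check that the recorded history of the model has not been altered after the fact.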

Secure Model Fusion for Distributed Learning Using Partial Homomorphic Encryption

Distributed learning has emerged as a useful tool for analyzing data stored in multiple geographic locations, especially when the distributed data sets are large and hard to move around, or the data owner is reluctant to put data into the Cloud due to privacy concerns. In distributed learning, only the locally computed models are uploaded to the fusion server, which however may still cause privacy issues since the fusion server could implement various inference attacks from its observations. To address this problem, we propose a secure distributed learning system that aims to utilize the additive property of partial homomorphic encryption to prevent direct exposure of the computed models to the fusion server. Furthermore, we propose two optimization mechanisms for applying partial homomorphic encryption to model parameters in order to improve the overall efficiency. Through experimental analysis, we demonstrate the effectiveness of our proposed mechanisms in practical distributed learning systems. Furthermore, we analyze the relationship between the computational time in the training process and several important system parameters, which can serve as a useful guide for selecting proper parameters for balancing the trade-off among model accuracy, model security and system overhead.
Changchang Liu, Supriyo Chakraborty, Dinesh Verma
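
The additive property mentioned in this abstract can be illustrated with a toy Paillier cryptosystem, where multiplying two ciphertexts yields an encryption of the sum of their plaintexts. This is a sketch under stated assumptions, not the authors' system: the primes are tiny and insecure, the fixed-point `SCALE`, the key distribution (clients hold the private key, the fusion server sees only ciphertexts) and all function names are hypothetical, and the paper's two optimization mechanisms are not shown.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier key generation from two small primes (NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; needs Python 3.8+
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt integer m with generator g = n + 1 and a fresh random r."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """Decrypt and map the result back to a signed integer."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    m = ((u - 1) // n * mu) % n
    return m - n if m > n // 2 else m

def add_encrypted(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)

SCALE = 1000  # hypothetical fixed-point scale for real-valued weights
n, priv = keygen(293, 433)

# Each client encrypts its locally computed model parameters.
clients = [[0.25, -1.5], [0.75, 0.5], [0.5, -0.5]]
encrypted = [[encrypt(n, round(w * SCALE)) for w in ws] for ws in clients]

# The fusion server adds ciphertexts without seeing any plaintext model.
fused = encrypted[0]
for update in encrypted[1:]:
    fused = [add_encrypted(n, a, b) for a, b in zip(fused, update)]

# A key holder decrypts only the aggregate and averages it.
average = [decrypt(n, priv, c) / SCALE / len(clients) for c in fused]
print(average)
```

Because only the sum is ever decrypted, the fusion server in this sketch never observes an individual client's parameters. Sums must stay below `n // 2` in absolute value for the signed decoding to be correct.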

Policy-Based Identification of IoT Devices’ Vendor and Type by DNS Traffic Analysis

The explosive growth of IoT devices and the weak security protection in some types of devices make them an attractive target for attackers. IoT devices can become a vulnerable weak link for penetrating a secure IT infrastructure. The risks are exacerbated by the Bring-Your-Own-Device trend that allows employees to connect their own personal devices to an enterprise network. Currently, network administrators lack adequate tools to discover and manage IoT devices in their environments. A good tool to address this requirement can be created by adapting and applying natural language interpretation algorithms to network traffic. In this paper, we show that applying algorithms like Term Frequency - Inverse Document Frequency (TF-IDF) to the domain name resolution process, a required first step in every Internet-based communication, can be highly effective in identifying IoT devices, their manufacturers and their type. Treating the domain names being resolved as words and the set of domain names queried by a device as a document, and then comparing these synthetic documents from a reference data set with real traffic, yields a very effective approach for IoT discovery. Evaluation of our approach on a traffic data set shows that the approach can identify 84% of the instances, with an accuracy of 91% for the IoT devices’ vendor, and 100% of the instances with an accuracy of 94% for the IoT devices’ type. We believe that this is the first attempt to apply natural language processing algorithms to traffic analysis, and the promising results could open new avenues for securing and understanding computer networks through natural language processing algorithms. These and other techniques require policies to determine how the large volume of data will be handled efficiently. By assisting in detecting potentially malicious devices, this paper contributes to the topic of safe autonomy.
Franck Le, Jorge Ortiz, Dinesh Verma, Dilip Kandlur
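
The document analogy in this abstract (domain names as words, a device's set of queried names as a document) can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: the smoothed IDF variant, cosine similarity as the comparison, the function names, and the `.example` domains in the reference set are all hypothetical.

```python
import math
from collections import Counter

def fit_reference(documents):
    """Build TF-IDF vectors for labelled reference devices.

    documents maps a device label to the list of domain names it resolved.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for names in documents.values():
        doc_freq.update(set(names))
    # Smoothed IDF so that names seen in every document keep a nonzero weight.
    idf = {name: math.log((1 + n_docs) / (1 + df)) + 1.0
           for name, df in doc_freq.items()}
    vectors = {label: vectorize(names, idf) for label, names in documents.items()}
    return vectors, idf

def vectorize(names, idf):
    """Turn a list of queried domain names into a sparse TF-IDF vector."""
    counts = Counter(name for name in names if name in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: (count / total) * idf[name] for name, count in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(name, 0.0) for name, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def identify(names, vectors, idf):
    """Return the best-matching reference label and its similarity score."""
    query = vectorize(names, idf)
    scores = {label: cosine(query, vec) for label, vec in vectors.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical reference data set: label -> observed DNS queries.
reference = {
    "vendor-a camera": ["api.vendor-a.example", "video.vendor-a.example",
                        "ntp.example.org"],
    "vendor-b thermostat": ["cloud.vendor-b.example", "ntp.example.org"],
}
vectors, idf = fit_reference(reference)

observed = ["video.vendor-a.example", "ntp.example.org",
            "api.vendor-a.example", "api.vendor-a.example"]
label, score = identify(observed, vectors, idf)
print(label, round(score, 3))
```

In practice a caller would probably apply a minimum score before accepting a match, so that devices resembling nothing in the reference set are left unidentified rather than mislabelled.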

Redundancy as a Measure of Fault-Tolerance for the Internet of Things: A Review

In this paper we review and analyze redundancy-based fault-tolerant techniques for the IoT as a paradigm to support two of the main goals of computer security: availability and integrity. We organize the presentation in terms of the three main tasks performed by the nodes of an IoT network: sensing, routing, and control. We first discuss how the implementation of fault-tolerance in the three areas is essential for the correct operation of an entire system. We provide an overview of the different approaches that have been used to address failures in sensing and routing. Control devices typically implement state machines that take decisions based on the measurements of sensors and may also ask actuators to execute actions. Traditionally, state-machine replication for fault-tolerance is realized through consensus protocols. Most protocols were developed in the 1980s and 1990s. We review the properties of such protocols in detail and discuss their limitations for the IoT. Since 2008, consensus algorithms have taken a new direction with the introduction of the concept of blockchain. Standard blockchain-based protocols cannot be applied without modifications to support fault-tolerance in the IoT. We review some recent results in this new class of algorithms, and show how they can provide the flexibility required to support fault-tolerance in control devices, and thus overcome some of the limitations of the traditional consensus protocols.
Antonino Rullo, Edoardo Serra, Jorge Lobo

