
2023 | Book

Machine Learning under Malware Attack


About this book

Machine learning has become key to supporting decision-making processes across a wide array of applications, ranging from autonomous vehicles to malware detection. However, while highly accurate, these algorithms have been shown to exhibit vulnerabilities through which they can be deceived into returning attacker-preferred predictions. Carefully crafted adversarial objects may therefore undermine the trust placed in machine learning systems, compromising the reliability of their predictions irrespective of the field in which they are deployed. The goal of this book is to improve the understanding of adversarial attacks, particularly in the malware context, to leverage that knowledge to explore defenses against adaptive adversaries, and to study systemic weaknesses in order to improve the resilience of machine learning models.

Table of Contents

Frontmatter

The Beginnings of Adversarial ML

Frontmatter
Chapter 1. Introduction
Abstract
Generally, it is difficult to imagine the world today without the influence of Machine Learning (ML), from improving medical diagnosis [RLJ20] to autonomous vehicles [Jan+20]. Although ML models have been ubiquitously deployed to make life easier, not all of these algorithms have been vetted thoroughly enough to ensure their safety, an aspect that is often neglected when designing solutions.
Raphael Labaca-Castro
Chapter 2. Background
Abstract
In this section, we address some important concepts that form the basis for our research and are required for the understanding of this work. We start by defining the origin of malicious applications to understand their impact. Next, we present the PE format, which is the type of binary that we use as an input object during our experimental evaluation. We then explore AML and review the literature, starting with early implementations in the security domain, ranging from spam filtering to the malware classification problem.
Raphael Labaca-Castro

Framework for Adversarial Malware Evaluation

Frontmatter
Chapter 3. FAME
Abstract
On the basis of the literature evaluated in Chapter 2 and the limitations outlined in Table 2.1, we introduce our Framework for Adversarial Malware Evaluation (FAME) [LR22], which is shown under the same name in Fig. 1.1. We define the notation and threat model for our research by describing the adversary's knowledge, objectives, and capabilities. Since the requirements vary depending on the attack settings, they will be presented individually in the subsequent modules of Part III.
Raphael Labaca-Castro

Problem-Space Attacks

Frontmatter
Chapter 4. Stochastic Method
Abstract
As introduced in Part II, FAME is built in a modular fashion to allow increased compatibility with further extensions.
Raphael Labaca-Castro
Chapter 5. Genetic Programming
Abstract
In Chapter 4, we explored how adversarial examples in the context of PE malware can be generated automatically, without human intervention, using random sequences of transformations. However, as outlined in § 4.3, the integrity-verification rates are inversely proportional to the length of the attack vector, and the level of convergence does not increase over time. Moreover, although cloud-based aggregators provide substantial information about prediction labels, they are expensive to implement.
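The random-sequence generation discussed in this abstract can be sketched as follows. This is a minimal illustration only: the transformation names and the toy classifier are hypothetical placeholders, not the book's actual implementation.

```python
import random

# Hypothetical functionality-preserving PE transformations (illustrative names).
TRANSFORMATIONS = ["append_overlay", "add_section", "rename_section", "pad_imports"]

def classify(applied):
    """Toy stand-in for a malware classifier over applied transformations.

    Returns True (flagged malicious) unless a particular perturbation is present.
    """
    return "add_section" not in applied

def random_attack(applied, max_length=5, trials=100):
    """Sample random transformation sequences until the classifier is evaded."""
    for _ in range(trials):
        length = random.randint(1, max_length)
        sequence = [random.choice(TRANSFORMATIONS) for _ in range(length)]
        if not classify(applied + sequence):
            return sequence  # evasive sequence found
    return None  # no evasion within the trial budget
```

Because the sequences are sampled blindly, longer attack vectors raise the risk that a transformation breaks the binary, which is the integrity-verification limitation the abstract points to.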
Raphael Labaca-Castro
Chapter 6. Reinforcement Learning
Abstract
After exploring evolutionary approaches, such as genetic algorithms (i.e., AIMED in Chapter 5), we observed that strong improvements are achieved compared to regular stochastic methods (i.e., ARMED in Chapter 4).
Raphael Labaca-Castro
Chapter 7. Universal Attacks
Abstract
In Chapters 4–6, we explored input-specific attacks, in which a committed adversary needs to calculate the best sequence of perturbations for every object.
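A universal attack, by contrast, searches for a single sequence of perturbations that evades the classifier across many inputs at once. A brute-force sketch of that idea follows; the transformation names and the toy evasion check are hypothetical, not the book's method.

```python
from itertools import product

# Hypothetical transformation vocabulary (illustrative names).
TRANSFORMATIONS = ["append_overlay", "add_section", "rename_section"]

def evades(binary, sequence):
    # Toy stand-in: a binary is evasive once the sequence covers its "weakness".
    return binary["weakness"] in sequence

def universal_sequence(binaries, max_length=3, target_rate=0.8):
    """Return the shortest sequence evading at least target_rate of binaries."""
    for length in range(1, max_length + 1):
        for seq in product(TRANSFORMATIONS, repeat=length):
            rate = sum(evades(b, seq) for b in binaries) / len(binaries)
            if rate >= target_rate:
                return list(seq), rate
    return None, 0.0
```

The appeal for the adversary is amortization: the search cost is paid once, after which the same sequence is applied to every new object instead of being recomputed per input.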
Raphael Labaca-Castro

Feature-Space Attacks

Frontmatter
Chapter 8. Gradient Optimization
Abstract
Since other domains, such as computer vision, rely heavily on white-box settings using, for example, gradient attacks, we also investigated these for malware classification.
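A gradient attack of the kind this abstract refers to can be sketched in an FGSM-like form over a feature-space representation. The logistic classifier, weights, and epsilon below are illustrative assumptions, and note that perturbed feature vectors need not map back to a valid, functional PE binary.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def malicious_score(x, w, b):
    """Score of a linear-logistic classifier f(x) = sigmoid(w.x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm_evasion(x, w, b, eps=0.5):
    """Step each feature against the sign of the score's gradient.

    For a logistic model the gradient w.r.t. x is proportional to w,
    so only the sign of each weight is needed.
    """
    return [xi - eps * (1 if wi > 0 else -1 if wi < 0 else 0)
            for xi, wi in zip(x, w)]
```

A single signed step of size eps is guaranteed to lower the malicious score of this toy model; real feature-space attacks must additionally respect feature constraints so the result remains realizable.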
Raphael Labaca-Castro
Chapter 9. Generative Adversarial Nets
Abstract
Goodfellow et al. [Goo+14] introduced the concept of GANs, which consist of two neural networks trained by competing with each other to optimize opposite goals in a zero-sum game framework. Although one network profits from the other's loss, both networks improve their predictions through this adversarial interplay. The generator learns the statistics of the training set to produce new synthetic data, whereas the discriminator assumes the role of an evaluator that determines whether the input received from the former is real or synthetic.
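The zero-sum game between the two networks is captured by the minimax objective of Goodfellow et al. [Goo+14], where G is the generator, D the discriminator, p_data the data distribution, and p_z the noise prior:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator maximizes V by distinguishing real samples from synthetic ones, while the generator minimizes V by producing samples the discriminator misclassifies as real.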
Raphael Labaca-Castro

Benchmark & Defenses

Frontmatter
Chapter 10. Comparison of Strategies
Abstract
Having properly introduced each module of FAME, we now proceed to evaluate the problem-space attack suite with a holistic approach in order to better understand the advantages and disadvantages of each strategy. As depicted in Fig. 1.1, our goal is to create an initial benchmark to evaluate adversarial examples in the context of malware using PE files. Notably, since gradient-based approaches (i.e., GRIPE in Chapter 8) and GAN attacks (i.e., GAINED in Chapter 9) generate feature-space adversarial examples, we did not include them in the main suite of realizable attacks implemented in FAME.
Raphael Labaca-Castro
Chapter 11. Towards Robustness
Abstract
Having shown that ML models are susceptible to adversarial examples, we now explore potential strategies to harden malware classifiers, as displayed under Defenses in Fig. 1.1. In general, the success metric evaluated is based on minimizing the FNR, that is, the fraction of adversarial examples that effectively bypass the classifier. By measuring the baseline evasion rate and comparing it to the hardened model, we can establish the success of the defense mechanism implemented.
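The metric described here can be made concrete with a short sketch; the labels and predictions below are hypothetical, with 1 denoting malicious and 0 benign.

```python
def false_negative_rate(labels, predictions):
    """Fraction of malicious samples (label 1) classified as benign (0)."""
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    positives = sum(1 for y in labels if y == 1)
    return fn / positives if positives else 0.0

def defense_gain(labels, baseline_preds, hardened_preds):
    """Reduction in FNR after hardening; positive means the defense helped."""
    return (false_negative_rate(labels, baseline_preds)
            - false_negative_rate(labels, hardened_preds))
```

Comparing the FNR of the baseline and the hardened model on the same set of adversarial examples yields the single number by which a defense is judged here.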
Raphael Labaca-Castro

Closing Remarks

Frontmatter
Chapter 12. Conclusions & Outlook
Abstract
In this chapter, we conclude our work by providing a brief summary of our research and the main contributions presented. Next, we reflect on the lessons learned by connecting the dots and provide potential directions for future work.
Raphael Labaca-Castro
Backmatter
Metadata
Title
Machine Learning under Malware Attack
Author
Raphael Labaca-Castro
Copyright Year
2023
Electronic ISBN
978-3-658-40442-0
Print ISBN
978-3-658-40441-3
DOI
https://doi.org/10.1007/978-3-658-40442-0
