
2023 | Book

Machine Learning under Malware Attack


About this book

Machine learning has become key to supporting decision-making processes across a wide array of applications, ranging from autonomous vehicles to malware detection. However, while highly accurate, these algorithms have been shown to exhibit vulnerabilities through which they can be deceived into returning attacker-preferred predictions. Carefully crafted adversarial objects may therefore undermine the trust placed in machine learning systems, compromising the reliability of their predictions irrespective of the field in which they are deployed. The goal of this book is to improve the understanding of adversarial attacks, particularly in the malware context, to leverage that knowledge to explore defenses against adaptive adversaries, and to study systemic weaknesses in order to improve the resilience of machine learning models.

Table of Contents

Frontmatter

The Beginnings of Adversarial ML

Frontmatter
Chapter 1. Introduction
Abstract
Generally, it is difficult to imagine the world today without the influence of Machine Learning (ML), from improving medical diagnosis [RLJ20] to autonomous vehicles [Jan+20]. Although ML models have been ubiquitously deployed to make life easier, not all of these algorithms have been vetted thoroughly enough to ensure their safety, an aspect that is often neglected when designing solutions.
Raphael Labaca-Castro
Chapter 2. Background
Abstract
In this section, we address some important concepts that form the basis for our research and are required for the understanding of this work. We start by defining the origin of malicious applications to understand their impact. Next, we present the PE format, which is the type of binary that we use as an input object during our experimental evaluation. We then explore AML and review the literature, starting with early implementations in the security domain, ranging from spam filtering to the malware classification problem.
Raphael Labaca-Castro

Framework for Adversarial Malware Evaluation

Frontmatter
Chapter 3. FAME
Abstract
On the basis of the literature evaluated in Chapter 2 and the limitations outlined in Table 2.1, we introduce our Framework for Adversarial Malware Evaluation (FAME) [LR22], which is shown under the same name in Fig. 1.1. We define the notation and threat model for our research by describing the adversary's knowledge, objectives, and capabilities. Since the requirements vary depending on the attack settings, they will be presented individually in the subsequent modules of Part III.
Raphael Labaca-Castro

Problem-Space Attacks

Frontmatter
Chapter 4. Stochastic Method
Abstract
As introduced in Part II, FAME is built in a modular fashion to allow increased compatibility with further extensions.
Raphael Labaca-Castro
Chapter 5. Genetic Programming
Abstract
In Chapter 4, we explored how adversarial examples in the context of PE malware can be generated automatically, without human intervention, using random sequences of transformations. However, as outlined in § 4.3, the integrity-verification rates are inversely proportional to the length of the attack vector, and the level of convergence does not increase over time. Moreover, although cloud-based aggregators provide substantial information about prediction labels, they are expensive to implement.
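The random-sequence generation discussed in this abstract can be sketched as follows. This is a minimal illustration only: the transformation names and the toy classifier are hypothetical placeholders, not the book's actual implementation.

```python
import random

# Hypothetical functionality-preserving PE transformations (illustrative names).
TRANSFORMATIONS = ["append_overlay", "add_section", "rename_section", "pad_imports"]

def classify(applied):
    """Toy stand-in for a malware classifier over applied transformations.

    Returns True (flagged malicious) unless a particular perturbation is present.
    """
    return "add_section" not in applied

def random_attack(applied, max_length=5, trials=100):
    """Sample random transformation sequences until the classifier is evaded."""
    for _ in range(trials):
        length = random.randint(1, max_length)
        sequence = [random.choice(TRANSFORMATIONS) for _ in range(length)]
        if not classify(applied + sequence):
            return sequence  # evasive sequence found
    return None  # no evasion within the trial budget
```

Because the sequences are sampled blindly, longer attack vectors raise the risk that a transformation breaks the binary, which is the integrity-verification limitation the abstract points to.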
Raphael Labaca-Castro
Chapter 6. Reinforcement Learning
Abstract
After exploring evolutionary approaches, such as genetic algorithms (i.e., AIMED in Chapter 5), we observed that strong improvements are achieved compared to regular stochastic methods (i.e., ARMED in Chapter 4).
Raphael Labaca-Castro
Chapter 7. Universal Attacks
Abstract
In Chapters 4–6, we explored input-specific attacks, in which a committed adversary needs to calculate the best sequence of perturbations for every object.
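A universal attack, by contrast, searches for a single sequence of perturbations that evades the classifier across many inputs at once. A brute-force sketch of that idea follows; the transformation names and the toy evasion check are hypothetical, not the book's method.

```python
from itertools import product

# Hypothetical transformation vocabulary (illustrative names).
TRANSFORMATIONS = ["append_overlay", "add_section", "rename_section"]

def evades(binary, sequence):
    # Toy stand-in: a binary is evasive once the sequence covers its "weakness".
    return binary["weakness"] in sequence

def universal_sequence(binaries, max_length=3, target_rate=0.8):
    """Return the shortest sequence evading at least target_rate of binaries."""
    for length in range(1, max_length + 1):
        for seq in product(TRANSFORMATIONS, repeat=length):
            rate = sum(evades(b, seq) for b in binaries) / len(binaries)
            if rate >= target_rate:
                return list(seq), rate
    return None, 0.0
```

The appeal for the adversary is amortization: the search cost is paid once, after which the same sequence is applied to every new object instead of being recomputed per input.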
Raphael Labaca-Castro

Feature-Space Attacks

Frontmatter
Chapter 8. Gradient Optimization
Abstract
Since other domains, such as computer vision, rely heavily on white-box settings using, for example, gradient attacks, we also investigated these for malware classification.
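A gradient attack of the kind this abstract refers to can be sketched in an FGSM-like form over a feature-space representation. The logistic classifier, weights, and epsilon below are illustrative assumptions, and note that perturbed feature vectors need not map back to a valid, functional PE binary.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def malicious_score(x, w, b):
    """Score of a linear-logistic classifier f(x) = sigmoid(w.x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm_evasion(x, w, b, eps=0.5):
    """Step each feature against the sign of the score's gradient.

    For a logistic model the gradient w.r.t. x is proportional to w,
    so only the sign of each weight is needed.
    """
    return [xi - eps * (1 if wi > 0 else -1 if wi < 0 else 0)
            for xi, wi in zip(x, w)]
```

A single signed step of size eps is guaranteed to lower the malicious score of this toy model; real feature-space attacks must additionally respect feature constraints so the result remains realizable.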
Raphael Labaca-Castro
Chapter 9. Generative Adversarial Nets
Abstract
Goodfellow et al. [Goo+14] introduced the concept of GANs, which consist of two neural networks trained by competing with each other to optimize opposite goals in a zero-sum game framework. Although one network profits from the other's loss, both networks improve their predictions through this adversarial interplay. The generator learns the statistics of the training set to produce new synthetic data, whereas the discriminator assumes the role of an evaluator that determines whether the input received from the former is real or synthetic.
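The zero-sum game between the two networks is captured by the minimax objective of Goodfellow et al. [Goo+14], where G is the generator, D the discriminator, p_data the data distribution, and p_z the noise prior:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator maximizes V by distinguishing real samples from synthetic ones, while the generator minimizes V by producing samples the discriminator misclassifies as real.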
Raphael Labaca-Castro

Benchmark & Defenses

Frontmatter
Chapter 10. Comparison of Strategies
Abstract
Having properly introduced each module of FAME, we now proceed to evaluate the problem-space attack suite with a holistic approach in order to better understand the advantages and disadvantages of each strategy. As depicted in Fig. 1.1, our goal is to create an initial benchmark to evaluate adversarial examples in the context of malware using PE files. Notably, since gradient-based approaches (i.e., GRIPE in Chapter 8) and GAN attacks (i.e., GAINED in Chapter 9) generate feature-space adversarial examples, we did not include them in the main suite of realizable attacks implemented in FAME.
Raphael Labaca-Castro
Chapter 11. Towards Robustness
Abstract
Having shown that ML models are susceptible to adversarial examples, we now explore potential strategies to harden malware classifiers, as displayed under Defenses in Fig. 1.1. In general, the success metric evaluated is based on minimizing the FNR, that is, the fraction of adversarial examples that effectively bypass the classifier. By measuring the baseline evasion rate and comparing it to the hardened model, we can establish the success of the defense mechanism implemented.
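The metric described here can be made concrete with a short sketch; the labels and predictions below are hypothetical, with 1 denoting malicious and 0 benign.

```python
def false_negative_rate(labels, predictions):
    """Fraction of malicious samples (label 1) classified as benign (0)."""
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    positives = sum(1 for y in labels if y == 1)
    return fn / positives if positives else 0.0

def defense_gain(labels, baseline_preds, hardened_preds):
    """Reduction in FNR after hardening; positive means the defense helped."""
    return (false_negative_rate(labels, baseline_preds)
            - false_negative_rate(labels, hardened_preds))
```

Comparing the FNR of the baseline and the hardened model on the same set of adversarial examples yields the single number by which a defense is judged here.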
Raphael Labaca-Castro

Closing Remarks

Frontmatter
Chapter 12. Conclusions & Outlook
Abstract
In this chapter, we conclude our work by providing a brief summary of our research and the main contributions presented. Next, we reflect on the lessons learned by connecting the dots and provide potential directions for future work.
Raphael Labaca-Castro
Backmatter
Metadata
Title
Machine Learning under Malware Attack
Author
Raphael Labaca-Castro
Copyright Year
2023
Electronic ISBN
978-3-658-40442-0
Print ISBN
978-3-658-40441-3
DOI
https://doi.org/10.1007/978-3-658-40442-0
