
2016 | Book

Detection of Intrusions and Malware, and Vulnerability Assessment

13th International Conference, DIMVA 2016, San Sebastián, Spain, July 7-8, 2016, Proceedings


About this book

This book constitutes the refereed proceedings of the 13th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, DIMVA 2016, held in San Sebastián, Spain, in July 2016.

The 19 revised full papers and 2 extended abstracts presented were carefully reviewed and selected from 66 submissions. They present the state of the art in intrusion detection, malware analysis, and vulnerability assessment, dealing with novel ideas, techniques, and applications in important areas of computer security including vulnerability detection, attack prevention, web security, malware detection and classification, authentication, data leakage prevention, and countering evasive techniques such as obfuscation.

Table of Contents

Frontmatter

Attacks

Frontmatter
Subverting Operating System Properties Through Evolutionary DKOM Attacks
Abstract
Modern rootkits have shifted their focus to the exploitation of dynamic memory structures, which allows them to tamper with the behavior of the system without modifying or injecting any additional code.
In this paper we discuss a new class of Direct Kernel Object Manipulation (DKOM) attacks that we call Evolutionary DKOM (E-DKOM). The goal of this attack is to alter the way some data structures “evolve” over time. As a case study, we designed and implemented an instance of an Evolutionary DKOM attack that targets the OS scheduler for both userspace programs and kernel threads. Moreover, we discuss the implementation of a hypervisor-based data protection system that mimics the behavior of an OS component (in our case the scheduling system) and detects any unauthorized modification. We finally discuss the challenges related to the design of a general detection system for this class of attacks.
Mariano Graziano, Lorenzo Flore, Andrea Lanzi, Davide Balzarotti
DeepFuzz: Triggering Vulnerabilities Deeply Hidden in Binaries
(Extended Abstract)
Abstract
We introduce a new method for triggering vulnerabilities in deep layers of binary executables and facilitating their exploitation. In our approach we combine dynamic symbolic execution with fuzzing techniques. To maximize both the execution path depth and the degree of freedom in input parameters for exploitation, we define a novel method to assign probabilities to program paths. Based on this probability distribution we apply new path exploration strategies. This facilitates payload generation and therefore vulnerability exploitation.
Konstantin Böttinger, Claudia Eckert

Defenses

Frontmatter
AutoRand: Automatic Keyword Randomization to Prevent Injection Attacks
Abstract
AutoRand automatically transforms Java applications to use SQL keyword randomization to defend against SQL injection vulnerabilities. AutoRand is completely automatic. Unlike previous approaches it requires no manual modifications to existing code and does not require source (it works directly on Java bytecode). It can thus easily be applied to the large numbers of existing potentially insecure applications without developer assistance. Our key technical innovation is augmented strings. Augmented strings allow extra information (such as random keys) to be embedded within a string. AutoRand transforms string operations so that the extra information is transparent to the program, but is always propagated with each string operation. AutoRand checks each keyword at SQL statements for the random key. Experimental results on large, production Java applications and malicious inputs provided by an independent evaluation team hired by an agency of the United States government showed that AutoRand successfully blocked all SQL injection attacks and preserved transparent execution for benign inputs, all with low overhead.
Jeff Perkins, Jordan Eikenberry, Alessandro Coglio, Daniel Willenson, Stelios Sidiroglou-Douskos, Martin Rinard
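The keyword randomization idea in the AutoRand abstract above can be illustrated with a minimal Python sketch. This is not AutoRand itself, which rewrites Java bytecode and uses augmented strings. The keyword list, the `KEY` value, and the function names `randomize` and `check_and_strip` are all hypothetical. The sketch tags keywords in trusted SQL fragments with a random key, then rejects any query in which a keyword arrives without the key.

```python
import re
import secrets

# Illustrative subset of SQL keywords (assumption, not AutoRand's list).
KEYWORDS = frozenset({"SELECT", "FROM", "WHERE", "OR", "AND", "UNION", "DROP"})

# Hypothetical per-run random key appended to trusted keywords.
KEY = secrets.token_hex(4)


def randomize(trusted_sql: str) -> str:
    """Append KEY to every keyword in a trusted (program-supplied) fragment."""
    def tag(match: re.Match) -> str:
        token = match.group(0)
        return token + KEY if token.upper() in KEYWORDS else token
    return re.sub(r"\w+", tag, trusted_sql)


def check_and_strip(query: str) -> str:
    """Reject keywords that lack KEY; strip KEY from the ones that carry it."""
    def verify(match: re.Match) -> str:
        token = match.group(0)
        if token.upper() in KEYWORDS:
            # A bare keyword can only have come from untrusted input.
            raise ValueError(f"untagged SQL keyword: {token!r}")
        if token.endswith(KEY) and token[:-len(KEY)].upper() in KEYWORDS:
            return token[:-len(KEY)]
        return token
    return re.sub(r"\w+", verify, query)


if __name__ == "__main__":
    prefix = randomize("SELECT name FROM users WHERE id = ")
    print(check_and_strip(prefix + "42"))  # benign input passes the check
    try:
        check_and_strip(prefix + "42 OR 1=1")
    except ValueError as err:
        print("blocked:", err)
```

The sketch treats every word token alike, so a benign string literal that happens to contain a keyword would also be rejected. A real system would need a proper SQL tokenizer.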
AVRAND: A Software-Based Defense Against Code Reuse Attacks for AVR Embedded Devices
Abstract
Code reuse attacks are advanced exploitation techniques that constitute a serious threat for modern systems. They profit from a control flow hijacking vulnerability to maliciously execute one or more pieces of code from the targeted application. ASLR and Control Flow Integrity are two mechanisms commonly used to deter automated attacks based on code reuse. Unfortunately, none of these solutions are suitable for modified Harvard architectures such as AVR microcontrollers. In this work, we present a code reuse attack against embedded AVR devices that shows how an adversary can execute arbitrary code reused from the firmware and other external libraries. We then propose a software-based defense based on fine-grained random permutations of the code memory. Our solution is installed in the bootloader section of the embedded device and thus executes during every device reset. We also propose a self-obfuscation technique to hinder code-reuse attacks against the bootloader.
Sergio Pastrana, Juan Tapiador, Guillermo Suarez-Tangil, Pedro Peris-López
Towards Vulnerability Discovery Using Staged Program Analysis
Abstract
Eliminating vulnerabilities from low-level code is vital for securing software. Static analysis is a promising approach for discovering vulnerabilities since it can provide developers early feedback on the code they write. However, it presents multiple challenges, not the least of which is understanding what makes a bug exploitable and conveying this information to the developer. In this paper, we present the design and implementation of a practical vulnerability assessment framework, called Mélange. Mélange performs data and control flow analysis to diagnose potential security bugs, and outputs well-formatted bug reports that help developers understand and fix security bugs. Based on the intuition that real-world vulnerabilities manifest themselves across multiple parts of a program, Mélange performs both local and global analyses in stages. To scale up to large programs, global analysis is demand-driven. Our prototype detects multiple vulnerability classes in C and C++ code including type confusion and garbage memory reads. We have evaluated Mélange extensively. Our case studies show that Mélange scales up to large codebases such as Chromium, is easy-to-use, and most importantly, capable of discovering vulnerabilities in real-world code. Our findings indicate that static analysis is a viable reinforcement to the software testing tool set.
Bhargava Shastry, Fabian Yamaguchi, Konrad Rieck, Jean-Pierre Seifert

Malware Detection

Frontmatter
Comprehensive Analysis and Detection of Flash-Based Malware
Abstract
Adobe Flash is a popular platform for providing dynamic and multimedia content on web pages. Despite being declared dead for years, Flash is still deployed on millions of devices. Unfortunately, the Adobe Flash Player increasingly suffers from vulnerabilities, and attacks using Flash-based malware regularly put users at risk of being remotely attacked. As a remedy, we present Gordon, a method for the comprehensive analysis and detection of Flash-based malware. By analyzing Flash animations at different levels during the interpreter’s loading and execution process, our method is able to spot attacks against the Flash Player as well as malicious functionality embedded in ActionScript code. To achieve this goal, Gordon combines a structural analysis of the container format with guided execution of the contained code, a novel analysis strategy that manipulates the control flow to maximize the coverage of indicative code regions. In an empirical evaluation with 26,600 Flash samples collected over 12 consecutive weeks, Gordon significantly outperforms related approaches when applied to samples shortly after their first occurrence in the wild, demonstrating its ability to provide timely protection for end users.
Christian Wressnegger, Fabian Yamaguchi, Daniel Arp, Konrad Rieck
Reviewer Integration and Performance Measurement for Malware Detection
Abstract
We present and evaluate a large-scale malware detection system integrating machine learning with expert reviewers, treating reviewers as a limited labeling resource. We demonstrate that even in small numbers, reviewers can vastly improve the system’s ability to keep pace with evolving threats. We conduct our evaluation on a sample of VirusTotal submissions spanning 2.5 years and containing 1.1 million binaries with 778 GB of raw feature data. Without reviewer assistance, we achieve 72 % detection at a 0.5 % false positive rate, performing comparably to the best vendors on VirusTotal. Given a budget of 80 accurate reviews daily, we improve detection to 89 % and are able to detect 42 % of malicious binaries undetected upon initial submission to VirusTotal. Additionally, we identify a previously unnoticed temporal inconsistency in the labeling of training datasets. We compare the impact of training labels obtained at the same time training data is first seen with training labels obtained months later. We find that using training labels obtained well after samples appear, and thus unavailable in practice for current training data, inflates measured detection by almost 20 percentage points. We release our cluster-based implementation, as well as a list of all hashes in our evaluation and 3 % of our entire dataset.
Brad Miller, Alex Kantchelian, Michael Carl Tschantz, Sadia Afroz, Rekha Bachwani, Riyaz Faizullabhoy, Ling Huang, Vaishaal Shankar, Tony Wu, George Yiu, Anthony D. Joseph, J. D. Tygar
On the Lack of Consensus in Anti-Virus Decisions: Metrics and Insights on Building Ground Truths of Android Malware
Abstract
There is generally a lack of consensus in Antivirus (AV) engines’ decisions on a given sample. This challenges the building of authoritative ground-truth datasets. Instead, researchers and practitioners may rely on unvalidated approaches to build their ground truth, e.g., by considering decisions from a selected set of Antivirus vendors or by setting up a threshold number of positive detections before classifying a sample. Both approaches are biased as they implicitly either decide on ranking AV products, or they consider that all AV decisions have equal weights. In this paper, we extensively investigate the lack of agreement among AV engines. To that end, we propose a set of metrics that quantitatively describe the different dimensions of this lack of consensus. We show how our metrics can bring important insights by using the detection results of 66 AV products on 2 million Android apps as a case study. Our analysis focuses not only on AV binary decisions but also on the notoriously hard problem of labels that AVs associate with suspicious files, and makes it possible to highlight biases hidden in the collection of a malware ground truth, a foundation stone of any malware detection approach.
Médéric Hurier, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein, Yves Le Traon
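The two ground-truth heuristics named in the abstract above, a detection-count threshold and a selected set of vendors, can be sketched in a few lines of Python to show how the same sample ends up with different labels. The engine names, the sample verdicts, and both function names are hypothetical. The paper's own metrics are not reproduced here.

```python
from typing import Dict, Iterable


def label_by_threshold(detections: Dict[str, bool], threshold: int) -> bool:
    """Label a sample malicious if at least `threshold` engines flag it."""
    return sum(detections.values()) >= threshold


def label_by_trusted_set(detections: Dict[str, bool],
                         trusted: Iterable[str]) -> bool:
    """Label a sample malicious if any engine in the trusted set flags it."""
    return any(detections.get(engine, False) for engine in trusted)


# Hypothetical verdicts from five engines for a single sample.
sample = {"av_a": True, "av_b": False, "av_c": True,
          "av_d": False, "av_e": False}

if __name__ == "__main__":
    for t in (1, 2, 3):
        print(f"threshold={t}: malicious={label_by_threshold(sample, t)}")
    for trusted in ({"av_a"}, {"av_b", "av_d"}):
        print(f"trusted={sorted(trusted)}: "
              f"malicious={label_by_trusted_set(sample, trusted)}")
```

With these made-up verdicts, the sample is malicious at thresholds 1 and 2 but benign at threshold 3, and the trusted-set label depends entirely on which vendors were chosen.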

Evasion

Frontmatter
Probfuscation: An Obfuscation Approach Using Probabilistic Control Flows
Abstract
Sensitive parts of a program, such as proprietary algorithms or licensing information, are often protected with the help of code obfuscation techniques. Many obfuscation schemes transform the control flow of the protected program. Typically, the control flow of obfuscated programs is deterministic, i.e., recorded execution traces do not differ for multiple executions using the same input values. An adversary can take advantage of this behavior and create multiple traces to perform analyses on the target program in order to deobfuscate it.
In this paper, we introduce an obfuscation approach which yields probabilistic control flow within a given method. That is, for the same input values, multiple execution traces differ, whilst preserving semantics. This effectively renders analyses relying on multiple traces impractical. We have implemented a prototype and applied it to several different programs. Our experimental results show that our approach can be used to ensure divergent traces for the same input values and that it can significantly improve the resilience against dynamic analysis.
Andre Pawlowski, Moritz Contag, Thorsten Holz
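As a toy illustration of the probabilistic control flow described in the Probfuscation abstract above, the following Python sketch picks at random between two semantically equivalent branches. The result is the same on every run, but the recorded trace is not. The function name, the branch labels, and the use of a seeded `random.Random` are assumptions made for the example. The paper's prototype and its transformations are not shown.

```python
import random
from typing import List


def obfuscated_abs(x: int, rng: random.Random, trace: List[str]) -> int:
    """Compute |x| through one of two equivalent, randomly chosen branches."""
    if rng.random() < 0.5:
        trace.append("branch_A")
        return x if x >= 0 else -x
    trace.append("branch_B")
    return max(x, -x)


if __name__ == "__main__":
    rng = random.Random()  # unseeded: traces differ between executions
    trace: List[str] = []
    results = [obfuscated_abs(-7, rng, trace) for _ in range(10)]
    print(results)  # every entry is 7
    print(trace)    # mix of branch_A and branch_B, varying per run
```

An analyst who compares traces recorded for the same input sees diverging paths even though the output never changes.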
RAMBO: Run-Time Packer Analysis with Multiple Branch Observation
Abstract
Run-time packing is a technique employed by malware authors in order to conceal (e.g., encrypt) malicious code and recover it at run-time. In particular, some run-time packers only decrypt individual regions of code on demand, re-encrypting them again when they are not running. This technique is known as shifting decode frames and it can greatly complicate malware analysis. The first solution that comes to mind to analyze these samples is to apply multi-path exploration to trigger the unpacking of all the code regions. Unfortunately, multi-path exploration is known to have several limitations, such as its limited scalability for the analysis of real-world binaries. In this paper, we propose a set of domain-specific optimizations and heuristics to guide multi-path exploration and improve its efficiency and reliability for unpacking binaries protected with shifting decode frames.
Xabier Ugarte-Pedrero, Davide Balzarotti, Igor Santos, Pablo G. Bringas
Detecting Hardware-Assisted Virtualization
Abstract
Virtualization has become an indispensable technique for scaling up the analysis of malicious code, such as for malware analysis or shellcode detection systems. Frameworks like Ether, ShellOS and an ever-increasing number of commercially-operated malware sandboxes rely on hardware-assisted virtualization. A core technology is Intel’s VT-x, which, compared to software-emulated virtualization, is believed to be stealthier, especially against evasive attackers that aim to detect virtualized systems to hide the malicious behavior of their code.
We propose and evaluate low-level timing-based mechanisms to detect hardware-virtualized systems. We build upon the observation that an adversary can invoke hypervisors and trigger context switches that are noticeable both in timing and in their side effects on caching. We have locally trained and then tested our detection methodology on a wide variety of systems, including 240 PlanetLab nodes, showing a high detection accuracy. As a real-world evaluation, we detected the virtualization technology of more than 30 malware sandboxes. Finally, we demonstrate how an adversary may even use these detections to evade multi-path exploration systems that aim to explore the full behavior of a program. Our results show that VT-x is not sufficiently stealthy for reliable analysis of malicious code.
Michael Brengel, Michael Backes, Christian Rossow

Web Security

Frontmatter
Financial Lower Bounds of Online Advertising Abuse
A Four Year Case Study of the TDSS/TDL4 Botnet
Abstract
Online advertising is a complex online business, which has become the target of abuse. Recent charges filed by the United States Department of Justice against the operators of the DNSChanger botnet stated that the botnet operators stole approximately US$14 million [11, 18] over two years. Using monetization tactics similar to DNSChanger, several large botnets (i.e., ZeroAccess and TDSS/TDL4) abuse the ad ecosystem at scale. In order to understand the depth of the financial abuse problem, we need methods that will enable us to passively study large botnets and estimate the lower bounds of their financial abuse. In this paper we present a system, \(A^{2}S\), which is able to analyze one of the most complex, sophisticated, and long-lived botnets: TDSS/TDL4. Using passive datasets from a large Internet Service Provider in North America, we conservatively estimate lower bounds on the financial abuse TDSS/TDL4 inflicted on the advertising ecosystem since 2010. Over its lifetime, less than 15 % of the botnet’s victims caused at least US$346 million in damages to advertisers due to impression fraud. TDSS/TDL4 abuse translates to an average US$340 thousand loss per day to advertisers, which is three times the ZeroAccess botnet [27] and more than ten times the DNSChanger botnet [2] estimates of fraud.
Yizheng Chen, Panagiotis Kintis, Manos Antonakakis, Yacin Nadji, David Dagon, Wenke Lee, Michael Farrell
Google Dorks: Analysis, Creation, and New Defenses
Abstract
With the advent of Web 2.0, many users started to maintain personal web pages to show information about themselves, their businesses, or to run simple e-commerce applications. This transition has been facilitated by a large number of frameworks and applications that can be easily installed and customized. Unfortunately, attackers have taken advantage of the widespread use of these technologies – for example by crafting special search engine queries to fingerprint an application framework and automatically locate possible targets. This approach, usually called Google Dorking, is at the core of many automated exploitation bots.
In this paper we tackle this problem in three steps. We first perform a large-scale study of existing dorks, to understand their typology and the information attackers use to identify their target applications. We then propose a defense technique to render URL-based dorks ineffective. Finally we study the effectiveness of building dorks by using only combinations of generic words, and we propose a simple but effective way to protect web applications against this type of fingerprinting.
Flavio Toffalini, Maurizio Abbà, Damiano Carra, Davide Balzarotti

Data Leaks

Frontmatter
Flush+Flush: A Fast and Stealthy Cache Attack
Abstract
Research on cache attacks has shown that CPU caches leak significant information. Proposed detection mechanisms assume that all cache attacks cause more cache hits and cache misses than benign applications and use hardware performance counters for detection.
In this article, we show that this assumption does not hold by developing a novel attack technique: the Flush+Flush attack. The Flush+Flush attack only relies on the execution time of the flush instruction, which depends on whether data is cached or not. Flush+Flush does not make any memory accesses, contrary to any other cache attack. Thus, it causes no cache misses at all and the number of cache hits is reduced to a minimum due to the constant cache flushes. Therefore, Flush+Flush attacks are stealthy, i.e., the spy process cannot be detected based on cache hits and misses, or state-of-the-art detection mechanisms. The Flush+Flush attack runs at a higher frequency and thus is faster than any existing cache attack. With 496 KB/s in a cross-core covert channel it is 6.7 times faster than any previously published cache covert channel.
Daniel Gruss, Clémentine Maurice, Klaus Wagner, Stefan Mangard
Rowhammer.js: A Remote Software-Induced Fault Attack in JavaScript
Abstract
A fundamental assumption in software security is that a memory location can only be modified by processes that may write to this memory location. However, a recent study has shown that parasitic effects in DRAM can change the content of a memory cell without accessing it, but by accessing other memory locations at a high frequency. This so-called Rowhammer bug occurs in most of today’s memory modules and has fatal consequences for the security of all affected systems, e.g., privilege escalation attacks.
All studies and attacks related to Rowhammer so far rely on the availability of a cache flush instruction in order to cause accesses to DRAM modules at a sufficiently high frequency. We overcome this limitation by defeating complex cache replacement policies. We show that caches can be forced into fast cache eviction to trigger the Rowhammer bug with only regular memory accesses. This makes it possible to trigger the Rowhammer bug in highly restricted and even scripting environments.
We demonstrate a fully automated attack that requires nothing but a website with JavaScript to trigger faults on remote hardware. Thereby we can gain unrestricted access to systems of website visitors. We show that the attack works on off-the-shelf systems. Existing countermeasures fail to protect against this new Rowhammer attack.
Daniel Gruss, Clémentine Maurice, Stefan Mangard
Detile: Fine-Grained Information Leak Detection in Script Engines
Abstract
Memory disclosure attacks play an important role in the exploitation of memory corruption vulnerabilities. By analyzing recent research, we observe that bypasses of defensive solutions that enforce control-flow integrity or attempt to detect return-oriented programming require memory disclosure attacks as a fundamental first step. However, research lags behind in detecting such information leaks.
In this paper, we tackle this problem and present a system for fine-grained, automated detection of memory disclosure attacks against scripting engines. The basic insight is as follows: scripting languages, such as JavaScript in web browsers, are strictly sandboxed. They must not provide any insights about the memory layout in their contexts. In fact, any such information potentially represents an ongoing memory disclosure attack. Hence, to detect information leaks, our system creates a clone of the scripting engine process with a re-randomized memory layout. The clone is instrumented to be synchronized with the original process. An inconsistency between the script contexts of the two processes appears when a memory disclosure has been conducted to leak information about the memory layout. Based on this detection approach, we have designed and implemented Detile (detection of information leaks), a prototype for the JavaScript engine in Microsoft’s Internet Explorer 10/11 on Windows 8.0/8.1. An empirical evaluation shows that our tool can successfully detect memory disclosure attacks even against this proprietary software.
Robert Gawlik, Philipp Koppe, Benjamin Kollenda, Andre Pawlowski, Behrad Garmany, Thorsten Holz
Understanding the Privacy Implications of ECS
(Extended Abstract)
Abstract
The edns-client-subnet (ECS) is a new extension for the Domain Name System (DNS) that delivers a “faster Internet” with the help of client-specific DNS answers. Under ECS, recursive DNS servers (recursives) provide client network address information to upstream authorities, permitting topologically localized answers for content delivery networks (CDNs). This optimization, however, comes with a privacy penalty that has not yet been studied. Our analysis concludes that ECS makes DNS communications less private: the potential for mass surveillance is greater, and stealthy, highly targeted DNS poisoning attacks become possible.
Despite being an experimental extension, ECS is already deployed, and users are expected to “opt out” on their own. Yet, there are no available client-side tools to do so. We describe a configuration of an experimental recursive tool to reduce the privacy leak from ECS queries in order to immediately allow users to protect their privacy. We recommend the protocol change from “opt out” to “opt in”, given the experimental nature of the extension and its privacy implications.
Panagiotis Kintis, Yacin Nadji, David Dagon, Michael Farrell, Manos Antonakakis

Authentication

Frontmatter
Analysing the Security of Google’s Implementation of OpenID Connect
Abstract
Many millions of users routinely use Google to log in to relying party (RP) websites supporting Google’s OpenID Connect service. OpenID Connect builds an identity layer on top of the OAuth 2.0 protocol, which has itself been widely adopted to support identity management. OpenID Connect allows an RP to obtain authentication assurances regarding an end user. A number of authors have analysed OAuth 2.0 security, but whether OpenID Connect is secure in practice remains an open question. We report on a large-scale practical study of Google’s implementation of OpenID Connect, involving forensic examination of 103 RP websites supporting it. Our study reveals widespread serious vulnerabilities of a number of types, many allowing an attacker to log in to an RP website as a victim user. These issues appear to be caused by a combination of Google’s design of its OpenID Connect service and RP developers making design decisions sacrificing security for ease of implementation. We give practical recommendations for both RPs and OPs to help improve the security of real world OpenID Connect systems.
Wanpeng Li, Chris J. Mitchell
Leveraging Sensor Fingerprinting for Mobile Device Authentication
Abstract
Device fingerprinting is a technique for identification and recognition of clients and widely used in practice for Web tracking and fraud prevention. While common systems depend on software attributes, sensor-based fingerprinting relies on hardware imperfections and thus opens up new possibilities for device authentication. Recent work focusses on accelerometers as easily accessible sensors of modern mobile devices. However, it has remained unclear if device recognition via sensor-based fingerprinting is feasible under real-world conditions.
In this paper, we analyze the effectiveness of a specialized feature set for sensor-based device fingerprinting and compare the results to feature-less fingerprinting techniques based on raw measurements. Furthermore, we evaluate other sensor types—like gravity and magnetic field sensors—as well as combinations of different sensors concerning their suitability for the purpose of device authentication. We demonstrate that combinations of different sensors yield precise device fingerprints when evaluating the approach on a real-world data set consisting of empirical measurement results obtained from almost 5,000 devices.
Thomas Hupperich, Henry Hosseini, Thorsten Holz

Malware Classification

Frontmatter
MtNet: A Multi-Task Neural Network for Dynamic Malware Classification
Abstract
In this paper, we propose a new multi-task, deep learning architecture for the binary (i.e. malware versus benign) malware classification task. All models are trained with data extracted from dynamic analysis of malicious and benign files. For the first time, we see improvements using multiple layers in a deep neural network architecture for malware classification. The system is trained on 4.5 million files and tested on a holdout test set of 2 million files, which is the largest study to date. To achieve a binary classification error rate of 0.358 %, the objective functions for the binary classification task and malware family classification task are combined in the multi-task architecture. In addition, we propose a standard (i.e. non multi-task) malware family classification architecture which achieves a malware family classification error rate of 2.94 %.
Wenyi Huang, Jack W. Stokes
Adaptive Semantics-Aware Malware Classification
Abstract
Automatic malware classification is an essential improvement over the widely-deployed detection procedures using manual signatures or heuristics. Although there exists an abundance of methods for collecting static and behavioral malware data, there is a lack of adequate tools for analysis based on these collected features. Machine learning is a statistical solution to the automatic classification of malware variants based on heterogeneous information gathered by investigating malware code and behavioral traces. However, the recent increase in variety of malware instances requires further development of effective and scalable automation for malware classification and analysis processes.
In this paper, we investigate topic modeling approaches as semantics-aware solutions to the classification of malware based on logs from dynamic malware analysis. We combine results of static and dynamic analysis to increase the reliability of inferred class labels. We utilize a semi-supervised learning architecture to make use of unlabeled data in classification. Using a nonparametric machine learning approach to topic modeling we design and implement a scalable solution while maintaining advantages of semantics-aware analysis. The outcomes of our experiments reveal that our approach brings a new and improved solution to the recurring problems in malware classification and analysis.
Bojan Kolosnjaji, Apostolis Zarras, Tamas Lengyel, George Webster, Claudia Eckert
Backmatter
Metadata
Title
Detection of Intrusions and Malware, and Vulnerability Assessment
Editors
Juan Caballero
Urko Zurutuza
Ricardo J. Rodríguez
Copyright Year
2016
Electronic ISBN
978-3-319-40667-1
Print ISBN
978-3-319-40666-4
DOI
https://doi.org/10.1007/978-3-319-40667-1
