About this book

This book constitutes the proceedings of the Third International Workshop on Traffic Monitoring and Analysis, TMA 2011, held in Vienna, Austria, on April 27, 2011 - co-located with EW 2011, the 17th European Wireless Conference. The workshop is an initiative from the COST Action IC0703 "Data Traffic Monitoring and Analysis: Theory, Techniques, Tools and Applications for the Future Networks". The 10 revised full papers and 6 poster papers presented together with 4 short papers were carefully reviewed and selected from 29 submissions. The papers are organized in topical sections on traffic analysis, applications and privacy, traffic classification, and a poster session.



Traffic Analysis (1)

On Profiling Residential Customers

Some recent large-scale studies on residential networks (ADSL and FTTH) have provided important insights concerning the set of applications used in such networks. For instance, it is now apparent that Web-based traffic is dominating again at the expense of P2P traffic in many countries due to the surge of HTTP streaming and possibly social networks. In this paper we confront the analysis of the overall (high-level) traffic characteristics of the residential network with the study of the users' traffic profiles. We propose approaches to tackle those issues and illustrate them with traces from an ADSL platform. Our main findings are that even if P2P still dominates the first heavy hitters, the democratization of Web and Streaming traffic is the main cause of the come-back of HTTP. Moreover, the mixture of applications study highlights that these two classes (P2P vs. Web + Streaming) are almost never used simultaneously by our residential customers.
Marcin Pietrzyk, Louis Plissonneau, Guillaume Urvoy-Keller, Taoufik En-Najjary

Sub-Space Clustering and Evidence Accumulation for Unsupervised Network Anomaly Detection

Network anomaly detection has been a hot research topic for many years. Most detection systems proposed so far employ a supervised strategy to accomplish the task, using either signature-based detection methods or supervised-learning techniques. However, both approaches present major limitations: the former fails to detect unknown anomalies, the latter requires training and labeled traffic, which is difficult and expensive to produce. Such limitations impose a serious bottleneck to the development of novel and applicable methods in the near future network scenario, characterized by emerging applications and new variants of network attacks. This work introduces and evaluates an unsupervised approach to detect and characterize network anomalies, without relying on signatures, statistical training, or labeled traffic. Unsupervised detection is accomplished by means of robust data-clustering techniques, combining Sub-Space Clustering and multiple Evidence Accumulation algorithms to blindly identify anomalous traffic flows. Unsupervised characterization is achieved by exploring inter-flows structure from multiple outlooks, building filtering rules to describe a detected anomaly. Detection and characterization performance of the unsupervised approach is extensively evaluated with real traffic from two different data-sets: the public MAWI traffic repository, and the METROSEC project data-set. Obtained results show the viability of unsupervised network anomaly detection and characterization, an ambitious goal so far unmet.
Johan Mazel, Pedro Casas, Philippe Owezarski

An Analysis of Longitudinal TCP Passive Measurements (Short Paper)

This paper reports on the longitudinal dynamics of TCP flows at an international backbone link over the period from 2001 to 2010. The dataset was a collection of traffic traces called MAWI data, consisting of daily 15-minute pcap traffic traces measured at a trans-Pacific link between Japan and the US. The environment of the measurement link has changed in several aspects (i.e., congestion, link upgrade, application). The main findings of the paper are as follows. (1) A comparison of the AS-level delays between 2001 and 2010 shows that the mean delay decreased in 55% of ASes, but the median value increased. Moreover, largely inefficient paths disappeared. (2) The deployment of TCP SACK increased from 10% to 90% over the course of 10 years. On the other hand, the window scale and timestamp options remained under-deployed (less than 50%).
Kensuke Fukuda

Traffic Analysis (2)

Understanding the Impact of the Access Technology: The Case of Web Search Services

In this paper, we address the problem of comparing the performance perceived by end users when they use different technologies to access the Internet. We focus on three key technologies: Cellular, ADSL and FTTH. Users primarily interact with the network through the networking applications they use. We tackle the comparison task by focusing on Web search services, which are arguably a key service for end users. We first demonstrate that RTT and packet loss alone are not enough to fully understand the observed differences or similarities of performance between the different access technologies. We then present an approach based on a fine-grained profiling of the data time of transfers that sheds light on the interplay between service, access and usage, for the client and server side. We use a clustering approach to identify groups of connections experiencing similar performance over the different access technologies. This technique allows us to attribute performance differences perceived by the client separately to the specific characteristics of the access technology, the behavior of the server, and the behavior of the client.
Aymen Hafsaoui, Guillaume Urvoy-Keller, Denis Collange, Matti Siekkinen, Taoufik En-Najjary

A Hadoop-Based Packet Trace Processing Tool

Internet traffic measurement and analysis has become a significantly challenging job because large packet trace files captured on fast links cannot be easily handled on a single server with limited computing and memory resources. Hadoop is a popular open-source cloud computing platform that provides a software programming framework called MapReduce and a distributed filesystem, HDFS, which are useful for analyzing a large data set. Therefore, in this paper, we present a Hadoop-based packet processing tool that provides scalability for a large data set by harnessing MapReduce and HDFS. To tackle large packet trace files in Hadoop efficiently, we devised a new binary input format, called PcapInputFormat, hiding the complexity of processing binary-formatted packet data and parsing each packet record. We also designed efficient traffic analysis MapReduce job models consisting of map and reduce functions. To evaluate our tool, we compared its computation time with a well-known packet-processing tool, CoralReef, and showed that our approach is more affordable for processing a large set of packet data.
Yeonhee Lee, Wonchul Kang, Youngseok Lee
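The map/reduce job model mentioned in this abstract can be illustrated with a minimal single-process sketch. It does not use Hadoop or the paper's PcapInputFormat: the record layout (src_ip, dst_ip, length_bytes) and the names map_packet, reduce_bytes and run_job are assumptions made purely for illustration, and the in-memory grouping stands in for the shuffle phase that a real MapReduce framework performs across machines.

```python
from collections import defaultdict
from typing import Iterable, Iterator

# Hypothetical, already-parsed packet record: (src_ip, dst_ip, length_bytes).
PacketRecord = tuple[str, str, int]


def map_packet(record: PacketRecord) -> Iterator[tuple[str, int]]:
    """Map step: emit (destination IP, packet length) for one packet."""
    _src_ip, dst_ip, length = record
    yield dst_ip, length


def reduce_bytes(key: str, values: Iterable[int]) -> tuple[str, int]:
    """Reduce step: total byte count for one destination IP."""
    return key, sum(values)


def run_job(records: Iterable[PacketRecord]) -> dict[str, int]:
    """Run map, group by key (a stand-in for the shuffle), then reduce."""
    groups: dict[str, list[int]] = defaultdict(list)
    for record in records:
        for key, value in map_packet(record):
            groups[key].append(value)
    return dict(reduce_bytes(key, values) for key, values in groups.items())
```

Because the map step looks at one packet at a time and the reduce step at one key at a time, both can in principle be spread over many workers, which is the property the abstract relies on for scalability.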

Monitoring of Tunneled IPv6 Traffic Using Packet Decapsulation and IPFIX (Short Paper)

IPv6 is being deployed, but many Internet Service Providers have not implemented support for it yet. Most end users have IPv6-ready computers, but their network does not support a native IPv6 connection, so they are forced to use transition mechanisms to transport IPv6 packets through an IPv4 network. We do not know what kind of traffic is inside these tunnels, which services are used, or whether the traffic bypasses security policy. This paper proposes an approach for monitoring IPv6 tunnels even on high-speed networks. The proposed approach allows traffic to be monitored on 10 Gbps links, because it supports hardware-accelerated packet distribution on multi-core processors. A system based on the proposed approach is deployed at the CESNET2 network, which is the largest academic network in the Czech Republic. This paper also presents several statistics about tunneled traffic on the CESNET2 backbone links.
Martin Elich, Matěj Grégr, Pavel Čeleda
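As a rough illustration of packet decapsulation, the sketch below handles only one transition mechanism: IPv6 carried directly in IPv4, which is marked by IPv4 protocol number 41. The function name decapsulate_6in4 is hypothetical, other tunnel types (for example UDP-based ones) are not covered, and nothing here reflects the paper's hardware-accelerated implementation or its IPFIX export.

```python
from typing import Optional

IPPROTO_IPV6 = 41  # IANA protocol number for IPv6 encapsulated in IPv4
IPV4_MIN_HEADER = 20
IPV6_HEADER = 40


def decapsulate_6in4(ipv4_packet: bytes) -> Optional[bytes]:
    """Return the inner IPv6 packet if this is an IPv4 packet carrying IPv6, else None."""
    if len(ipv4_packet) < IPV4_MIN_HEADER:
        return None
    version = ipv4_packet[0] >> 4
    header_len = (ipv4_packet[0] & 0x0F) * 4  # IHL field counts 32-bit words
    if version != 4 or header_len < IPV4_MIN_HEADER or len(ipv4_packet) < header_len:
        return None
    if ipv4_packet[9] != IPPROTO_IPV6:  # byte 9 of the IPv4 header is the protocol field
        return None
    inner = ipv4_packet[header_len:]
    # Sanity check: a full IPv6 fixed header with version nibble 6.
    if len(inner) < IPV6_HEADER or inner[0] >> 4 != 6:
        return None
    return inner
```

Once the inner packet is extracted, it can be fed to the same flow-accounting logic used for native IPv6 traffic, which is the general idea behind monitoring what travels inside the tunnels.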

Applications and Privacy

Identifying Skype Traffic in a Large-Scale Flow Data Repository

We present a novel method for identifying Skype clients and supernodes on a network using only flow data, based upon the detection of certain Skype control traffic. Flow-level identification allows long-term retrospective studies of Skype traffic as well as studies of Skype traffic on much larger scale networks than existing packet-based approaches. We use this method to identify Skype hosts and connection events to the network in a historical flow data set containing 182 full days of data over the six years from 2004 to 2009, in order to explore the evolution of the Skype network in general and a large observed portion thereof in particular. This represents, to the best of our knowledge, the first long-term retrospective analysis of the behavior of the Skype network based solely on flow data, and the first successful application of a Skype detection algorithm to flow data collected from a production network.
Brian Trammell, Elisa Boschi, Gregorio Procissi, Christian Callegari, Peter Dorfinger, Dominik Schatzmann

On the Stability of Skype Super Nodes

The heart of Skype services, one of the most ubiquitous P2P networks, is based on a set of super nodes (SNs). Choosing stable SNs is an important task, since it improves the overall performance and quality of the P2P network [1, 2]. In this paper we shed light on the life cycle of SNs using extensive data sets on Skype super nodes, which were gathered over a period of 3 months. We then suggest how to choose a more stable set of SNs.
The dynamics of nodes are inherent to the use of a computer, which is unplugged for some time or is mobile. Hence it is natural to predict that a super node would have multiple sessions correlated with the time the computer is up. Surprisingly, we show that 40% of the super nodes have only one session, with a median residual lifetime of 1.75 days. These nodes also have a significantly shorter lifespan than super nodes that have multiple sessions, which have a median residual lifetime of 67.5 days. We propose and give evidence that nodes with one session are nodes with dynamic IP addresses, and hence they have ended their life cycle due to a change of IP address. We show that the nodes with multiple sessions are mostly nodes with static IPs, and that choosing super nodes with static IPs would increase the availability and stability of the P2P network significantly.
Anat Bremler-Barr, Ran Goldschmidt

Reduce to the Max: A Simple Approach for Massive-Scale Privacy-Preserving Collaborative Network Measurements (Short Paper)

Privacy-preserving techniques for distributed computation have been proposed recently as a promising framework in collaborative inter-domain network monitoring. Several different approaches exist to solve this class of problems, e.g., Homomorphic Encryption (HE) and Secure Multiparty Computation (SMC) based on Shamir’s Secret Sharing algorithm (SSS). Such techniques are complete from a computation-theoretic perspective: given a set of private inputs, it is possible to perform arbitrary computation tasks without revealing any of the intermediate results. In this paper we advocate the use of “elementary” (as opposed to “complete”) Secure Multiparty Computation (E-SMC) procedures for traffic monitoring. E-SMC procedures support only simple computations with private input and public output, i.e., they cannot handle secret input or secret (intermediate) output. The proposed simplification brings a dramatic reduction in complexity and enables massive-scale implementation with acceptable delay and overhead. Notwithstanding their simplicity, we claim that a simple additive E-SMC scheme is sufficient to perform many computation tasks of practical relevance to collaborative network monitoring, such as anonymous publishing and set operations.
Fabio Ricciato, Martin Burkhart
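To make the idea of an additive scheme with private input and public output concrete, here is a minimal sketch of textbook additive secret sharing used to compute a sum. It is not the protocol from the paper: the modulus, the function names split_into_shares and secure_sum, and the single-process simulation of all parties are assumptions for illustration only.

```python
import secrets

# Hypothetical modulus; inputs and their sum are assumed to be smaller than this.
MODULUS = 2**61 - 1


def split_into_shares(value: int, n_parties: int) -> list[int]:
    """Split a private value into n random shares that add up to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def secure_sum(private_inputs: list[int]) -> int:
    """Simulate all parties: share inputs, publish partial sums, combine them."""
    n = len(private_inputs)
    # shares[i][j] is the share that party i sends to party j.
    shares = [split_into_shares(value, n) for value in private_inputs]
    # Each party j adds up the shares it received and publishes only that total.
    partial_sums = [sum(shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # The public output is the sum of the published partial sums.
    return sum(partial_sums) % MODULUS
```

Each individual share is uniformly random, so a party receiving one learns nothing about the sender's input by itself, while the published partial sums still add up to the true total. For example, secure_sum([10, 20, 30]) yields the aggregate without any party disclosing its own count.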

An Analysis of Anonymizer Technology Usage

Anonymity techniques provide legitimate usage such as privacy and freedom of speech, but are also used by cyber criminals to hide themselves. In this paper, we provide usage and geo-location analysis of major anonymization systems, i.e., anonymous proxy servers, remailers, JAP, I2P and Tor. Among these systems, remailers and JAP seem to have minimal usage. We then provide a detailed analysis of Tor system by analyzing traffic through two relays. Our results indicate certain countries utilize Tor network more than others. We also analyze anonymity systems from service perspective by inspecting sources of spam e-mail and peer-to-peer clients in recent data sets. We found that proxy servers are used more than other anonymity techniques in both. We believe this is due to proxies providing basic anonymity with minimal delay compared to other systems that incur higher delays.
Bingdong Li, Esra Erdin, Mehmet Hadi Güneş, George Bebis, Todd Shipley

Traffic Classification

Early Classification of Network Traffic through Multi-classification

In this work we present and evaluate different automated combination techniques for traffic classification. We consider six intelligent combination algorithms applied to both traditional and more recent traffic classification techniques using either packet content or statistical properties of flows. Preliminary results show that, when selecting complementary classifiers, some combination algorithms allow a further improvement – in terms of classification accuracy – over already well-performing stand-alone classification techniques. Moreover, our experiments show that the positive impact of combination is particularly significant when there are early-classification constraints, that is, when the classification of a flow must be obtained in its early stage (e.g. first 1 – 4 packets) in order to perform network operations online.
Alberto Dainotti, Antonio Pescapé, Carlo Sansone

Software Architecture for a Lightweight Payload Signature-Based Traffic Classification System

Traffic classification is a preliminary and essential step for achieving stable network service provision and efficient network resource management. While a number of classification methods have been introduced in the literature, the payload signature-based classification method shows the highest performance in terms of accuracy, completeness, and practicality. However, the payload signature-based method has a significant drawback in high-speed network environments; the processing speed is much slower than that of other classification methods such as the header-based and statistical methods. In this paper, we describe various design options to improve the processing speed of traffic classification in designing a payload signature-based classification system, and we describe choices we made for designing our traffic classification system. Also, the feasibility of our design choices was proved via experimental evaluation on our campus traffic trace.
Jun-Sang Park, Sung-Ho Yoon, Myung-Sup Kim

Mining Unclassified Traffic Using Automatic Clustering Techniques

In this paper we present a fully unsupervised algorithm to identify classes of traffic inside an aggregate. The algorithm leverages on the K-means clustering algorithm, augmented with a mechanism to automatically determine the number of traffic clusters. The signatures used for clustering are statistical representations of the application layer protocols.
The proposed technique is extensively tested on UDP traffic traces collected from operational networks. Performance tests show that it can cluster the traffic into a few tens of pure clusters, achieving an accuracy above 95%. Results are promising and suggest that the proposed approach might effectively be used for automatic traffic monitoring, e.g., to identify the birth of new applications and protocols, or the presence of anomalous or unexpected traffic.
Alessandro Finamore, Marco Mellia, Michela Meo

Entropy Estimation for Real-Time Encrypted Traffic Identification (Short Paper)

This paper describes a novel approach to classify network traffic into encrypted and unencrypted traffic. The classifier is able to operate in real-time as only the first packet of each flow is processed. The main metric used for classification is an estimation of the entropy of the first packet payload. The approach is evaluated based on encrypted ground truth traces and on real network traces. Encrypted traffic such as Skype, or encrypted eDonkey traffic are detected as encrypted with probability higher than 94%. Unencrypted protocols such as SMTP, HTTP, POP3 or FTP are detected as unencrypted with probability higher than 99.9%. The presented approach, named real-time encrypted traffic detector (RT-ETD), is well suited to operate as pre-filter for advanced classification approaches to enable their applicability on increased bandwidth.
Peter Dorfinger, Georg Panholzer, Wolfgang John
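The core metric can be sketched as the empirical Shannon entropy of the bytes in a flow's first packet payload. The abstract speaks of an entropy estimation, and the estimator used in RT-ETD is not given here, so the code below uses the plain empirical entropy as a stand-in; the normalization by the maximum achievable entropy and the 0.9 threshold are illustrative assumptions, not values from the paper.

```python
import math
from collections import Counter


def payload_entropy(payload: bytes) -> float:
    """Empirical Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_encrypted(payload: bytes, threshold: float = 0.9) -> bool:
    """Flag a payload whose entropy is close to the maximum possible for its length.

    The threshold is a hypothetical value chosen for illustration.
    """
    if len(payload) < 2:
        return False
    # A payload of N bytes can reach at most log2(min(N, 256)) bits per byte.
    max_entropy = math.log2(min(len(payload), 256))
    return payload_entropy(payload) / max_entropy >= threshold
```

Normalizing by the length-dependent maximum matters because a short first packet cannot reach 8 bits per byte even when its content is perfectly random.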

Poster Session

Limits in the Bandwidth Exploitation in Passive Optical Networks Due to the Operating Systems (Poster)

In this work we apply an accredited standard technique for end-user bandwidth evaluation in a wired access scenario and show that computer operating systems limit the bandwidth that users of optical access networks can exploit.
Paolo Bolletta, Valerio Petricca, Sergio Pompei, Luca Rea, Alessandro Valenti, Francesco Matera

Toward a Scalable and Collaborative Network Monitoring Overlay (Poster)

This paper presents ongoing work toward the definition of a new network monitoring model which resorts to a cooperative interaction among measurement entities to monitor the quality of network services. Exploring (i) the definition of representative measurement points to form a network monitoring overlay; (ii) the removal of measurement redundancy through composition of metrics; and (iii) a simple active measurement methodology, the proposed model aims to contribute to a scalable, robust and reliable end-to-end monitoring. Besides the model proposal, a JAVA prototype was implemented to test the conceptual model and its design goals.
Vasco Castro, Paulo Carvalho, Solange Rito Lima

Hardware-Based “on-the-fly” Per-flow Scan Detector Pre-filter (Poster)

Pre-filtering monitoring tasks, directly running over traffic probes, may accomplish a significant degree of data reduction by isolating a relatively small number of flows (likely to be of interest for the monitoring application) from the rest of the traffic. As these filtering mechanisms are conveniently run as close as possible to the data gathering devices (traffic probes), and must scale to multi-gigabit speed, the feasibility of their implementation in hardware is a key requirement. In this paper, we document a hardware FPGA implementation of a recently proposed network scan pre-filter. It leverages processing stages based on Bloom filters and Counting Bloom Filters, and it is devised to detect, through on-the-fly per-packet analysis, the flows which potentially exhibit a network/port scanning behaviour. The framework has been implemented in a modular manner. It suitably combines two different general-purpose modules (a rate meter and a variation detector) likely to be reused as building blocks for other monitoring tasks. In the following presentation, we further discuss some lessons learned and general implementation guidelines which emerge when the goal is to efficiently implement run-time updated (i.e., dynamic) Bloom-filter-based data structures in hardware.
Salvatore Pontarelli, Simone Teofili, Giuseppe Bianchi
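For readers unfamiliar with the data structure, the following is a minimal software sketch of a Counting Bloom Filter, the run-time updatable structure this abstract refers to. It is not the FPGA design from the paper: the class name, the default sizes, and the use of salted BLAKE2b as the hash family are assumptions for illustration.

```python
import hashlib
from typing import Iterator


class CountingBloomFilter:
    """Bloom filter with counters instead of bits, so items can also be removed."""

    def __init__(self, size: int = 1024, num_hashes: int = 3) -> None:
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _indexes(self, item: bytes) -> Iterator[int]:
        # Derive num_hashes indexes by salting one hash function (an assumed choice).
        for i in range(self.num_hashes):
            salt = i.to_bytes(8, "little")
            digest = hashlib.blake2b(item, digest_size=8, salt=salt).digest()
            yield int.from_bytes(digest, "little") % self.size

    def add(self, item: bytes) -> None:
        for index in self._indexes(item):
            self.counters[index] += 1

    def remove(self, item: bytes) -> None:
        # Only call this for items that were previously added.
        for index in self._indexes(item):
            if self.counters[index] > 0:
                self.counters[index] -= 1

    def count(self, item: bytes) -> int:
        """Upper bound on how many times the item was added (never an underestimate)."""
        return min(self.counters[index] for index in self._indexes(item))

    def __contains__(self, item: bytes) -> bool:
        return self.count(item) > 0
```

Membership tests can return false positives but not false negatives, as long as remove is only called for items that were actually inserted. The counters are what make deletion, and therefore dynamic updating, possible.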

Anti-evasion Technique for Packet Based Pre-filtering for Network Intrusion Detection Systems (Poster)

This work proposes a method to extend packet pre-filtering for Network Intrusion Detection Systems (NIDS). The aim of the method is to avoid the false negatives that occur when malicious content is sent split across several packets. We propose a method that is able to identify even fragmented malicious content, avoiding false negatives while limiting the false positive rate.
Salvatore Pontarelli, Simone Teofili

COMINDIS – COllaborative Monitoring with MINimum DISclosure (Poster)

In the modern Internet, network anomalies are manifold and range from Distributed Denial of Service (DDoS) attacks over unsolicited communication (e.g. Spam), to large-scale information harvesting. Network operators react by deploying carefully selected monitoring equipment, tuned to protect their individual core assets. Consequently, there exist a multitude of different views on the activities of a particular host at one moment in time, depending on the locally observed activity patterns, the configurations of the monitoring equipment, and the policies and legislations which influence the amount of traffic information that can be analyzed.
Jacopo Cesareo, Andreas Berger, Alessandro D’Alconzo

Dynamic Monitoring of Dark IP Address Space (Poster)

A number of security-related research topics are based on the monitoring of dark IP address space. Unfortunately, there is a large administrative overhead associated with the dynamic assignment of a specific subnet for monitoring purposes, such as the deployment of a honeypot farm or a distributed intrusion detection system. In this paper, we propose a system that enables the dynamic allocation of an unadvertised IP address subnet for use by a monitoring sensor. The system dynamically selects network subnets that have been allocated to the organization but are not being advertised, advertises them, and subsequently forwards all received traffic destined to the selected subnet to a monitoring sensor.
Iasonas Polakis, Georgios Kontaxis, Sotiris Ioannidis, Evangelos P. Markatos

