
About this book

This book constitutes the proceedings of the 4th International Workshop on Traffic Monitoring and Analysis, TMA 2012, held in Vienna, Austria, in March 2012. The 10 full papers and 8 short papers presented in this volume were thoroughly refereed and selected from 31 submissions. The contributions are organized in topical sections on traffic analysis and characterization: new results and improved measurement techniques; measurement for QoS, security and service level agreements; and tools for network measurement and experimentation.

Table of Contents


Traffic Analysis and Characterization: New Results and Improved Measurement Techniques I

Assessing the Real-World Dynamics of DNS

The DNS infrastructure is a key component of the Internet and is thus used by a multitude of services, both legitimate and malicious. Recently, several works demonstrated that malicious DNS activity usually exhibits observable dynamics that may be exploited for detection and mitigation. Clearly, reliable differentiation requires that legitimate activity not show these dynamics. In this paper, we show that this is often not the case, and propose a set of DNS stability metrics that help to efficiently categorize the DNS activity of a diverse set of Internet sites.
Andreas Berger, Eduard Natale
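The stability metrics themselves are not defined in the abstract; as a minimal sketch of the general idea (not the authors' actual metrics), one could score a domain by how much its set of resolved IPs overlaps between successive observation windows, with fast-changing malicious domains scoring low:

```python
def jaccard(a, b):
    """Overlap of two sets in [0, 1]; 1 means identical."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

def stability(snapshots):
    """Mean Jaccard overlap between consecutive resolution snapshots.

    snapshots: one set of resolved IPs per observation window.
    """
    scores = [jaccard(a, b) for a, b in zip(snapshots, snapshots[1:])]
    return sum(scores) / len(scores)

# Hypothetical data: a stable site vs. a fast-flux-like domain
stable = [{"192.0.2.1", "192.0.2.2"}] * 4        # same IPs every window
flux = [{"203.0.113.%d" % i} for i in range(4)]  # new IP every window
```

Here `stability(stable)` is 1.0 and `stability(flux)` is 0.0; real legitimate sites, as the paper argues, often fall uncomfortably in between.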

Uncovering the Big Players of the Web

In this paper we observe how large Internet organizations deliver web content to end users today. Using one-week-long data sets collected at three vantage points aggregating more than 30,000 Internet customers, we characterize the offered services, precisely quantifying and comparing the performance of the different players. Results show that today 65% of web traffic is handled by the top 10 organizations. We observe that, while all of them serve the same type of content, they have adopted different server architectures in terms of load balancing schemes and the number and location of servers: some organizations operate thousands of servers, with the closest only a few milliseconds away from the end user, while others manage a few data centers. Despite this, the bulk transfer rate offered to end users is typically good, but impairments can arise when content is not readily available at the server and has to be retrieved from the CDN back-end.
Vinicius Gehlen, Alessandro Finamore, Marco Mellia, Maurizio M. Munafò

Internet Access Traffic Measurement and Analysis

Fast-changing application types and behaviors require repeated measurements of access networks. In this paper, we present the results of a 14-day measurement in an access network connecting 600 users to the Internet. Our application classification reveals a trend back to HTTP traffic, underlines the immense usage of Flash videos, and unveils a participant in a botnet. In addition, we present flow and user statistics whose resulting traffic models can be used for simulation and emulation of access networks.
Steffen Gebert, Rastin Pries, Daniel Schlosser, Klaus Heck

From ISP Address Announcement Patterns to Routing Scalability

The Internet routing table has been growing rapidly. In a future Internet with a much larger public address space (e.g., IPv6), the potential expansion can be very significant, and the cost of a large routing table will affect many ISPs. Before devising methods to ensure routing scalability, it is necessary to understand which factors lead to the expansion of routing tables and to what extent each contributes. In addition to well-known factors such as multi-homing, traffic engineering and non-contiguous address allocations, the tendency towards convenient address management also increases the routing table size. In this paper, we take a measurement-based approach to examine quantitatively how various factors, especially the current address aggregation status of different types of ISPs, contribute to the growth of the global routing table. We show how these patterns affect routing scalability, and discuss the implications for address management and routing in the future Internet.
Liang Chen, Xingang Shi, Dah Ming Chiu
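The effect of address aggregation on table size can be illustrated with Python's standard `ipaddress` module, which collapses contiguous announcements into supernets (the prefixes below are hypothetical, not from the paper's dataset):

```python
import ipaddress

# Hypothetical announcements from one ISP: the first three prefixes are
# contiguous and together align to a single /22 aggregate.
announced = [ipaddress.ip_network(p) for p in
             ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/23", "192.0.2.0/24"]]

# collapse_addresses merges adjacent/overlapping networks into supernets,
# mimicking what good address aggregation does to the routing table.
aggregated = list(ipaddress.collapse_addresses(announced))
# Four table entries collapse to two: 10.0.0.0/22 and 192.0.2.0/24.
```

De-aggregated announcements like these, multiplied across many ISPs, are one of the sources of table growth the paper quantifies.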

I2P’s Usage Characterization

We present the first monitoring study aiming to characterize the usage of the I2P network, a low-latency anonymous network based on garlic routing. We design a distributed monitoring architecture for the I2P network and show, through three one-week measurement experiments, the system's ability to identify a significant fraction of the running applications, among them web servers and file-sharing clients.
Juan Pablo Timpanaro, Isabelle Chrisment, Olivier Festor

Traffic Analysis and Characterization: New Results and Improved Measurement Techniques II

Experimental Assessment of BitTorrent Completion Time in Heterogeneous TCP/uTP Swarms

BitTorrent, one of the most widely used P2P applications for file sharing, recently moved away from TCP by introducing an application-level congestion control protocol named uTP. The aim of this new protocol is to efficiently use the available link capacity while minimizing interference with the rest of the user's traffic (e.g., Web, VoIP and gaming) sharing the same access bottleneck.
In this paper we perform an experimental study of the impact of uTP on torrent completion time, the metric that best captures the user experience. We run BitTorrent applications in a flash-crowd scenario over a dedicated cluster platform, under both homogeneous and heterogeneous swarm populations. Experiments show that all-uTP swarms have shorter torrent download times than all-TCP swarms. Interestingly, we also observe that even shorter completion times can be achieved with mixtures of TCP and uTP traffic, as in the default BitTorrent settings.
Claudio Testa, Dario Rossi, Ashwin Rao, Arnaud Legout

Steps towards the Extraction of Vehicular Mobility Patterns from 3G Signaling Data

The signaling traffic of a cellular network is rich in information related to the movement of devices across cell boundaries. Thus, passive monitoring of anonymized signaling traffic enables the observation of devices’ mobility patterns. This approach is intrinsically more powerful and accurate than previous studies based exclusively on Call Data Records, as significantly more devices can be included in the investigation, but it is also more challenging to implement due to a number of artifacts implicitly present in the network signaling. In this study we tackle the problem of estimating vehicular trajectories from 3G signaling traffic, with particular focus on crucial elements of the data processing chain. The work is based on a sample set of anonymous traces from a large operational 3G network, including both the circuit-switched and packet-switched domains. We first investigate algorithms and procedures for preprocessing the raw dataset to make it suitable for mobility studies. Second, we present a preliminary analysis and characterization of the mobility signaling traffic. Finally, we present an algorithm for exploiting the refined data for road traffic monitoring, i.e., route detection. The work shows the potential of leveraging the 3G cellular network as a complementary “sensor” to existing solutions for road traffic monitoring.
Pierdomenico Fiadino, Danilo Valerio, Fabio Ricciato, Karin Anna Hummel

Identifying Skype Nodes in the Network Exploiting Mutual Contacts

In this paper we present an algorithm that progressively discovers the nodes of the Skype P2P overlay network, most notably the super nodes in the network core. Starting from a single known Skype node, we can easily identify other Skype nodes through the analysis of widely available and standardized IPFIX (NetFlow) data. Instead of relying on content characteristics or packet properties of the flows themselves, we monitor the connections of known Skype nodes and then progressively discover other nodes through the analysis of their mutual contacts.
Jan Jusko, Martin Rehak
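The mutual-contacts idea can be sketched roughly as follows (the threshold, flow representation and iteration scheme are hypothetical simplifications, not the authors' actual algorithm): a host is flagged when it shares enough peers, extracted from flow records, with an already-known Skype node:

```python
from collections import defaultdict

def discover(flows, seed, min_mutual=2):
    """Flag hosts sharing >= min_mutual contacts with an already-known node."""
    contacts = defaultdict(set)
    for src, dst in flows:               # flows: (source, destination) pairs
        contacts[src].add(dst)
        contacts[dst].add(src)
    known = {seed}
    changed = True
    while changed:                       # propagate until no new node appears
        changed = False
        for host, peers in contacts.items():
            if host in known:
                continue
            if any(len(peers & contacts[k]) >= min_mutual for k in known):
                known.add(host)
                changed = True
    return known

# Hypothetical flow records: B shares two contacts with seed A, C only one.
flows = [("A", "c1"), ("A", "c2"), ("A", "c3"),
         ("B", "c1"), ("B", "c2"), ("C", "c1")]
```

With these flows, `discover(flows, "A")` flags B but not C; the real system applies this kind of propagation to IPFIX data at scale.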

Padding and Fragmentation for Masking Packet Length Statistics

We aim at understanding whether, and at what cost, the traffic features exploited by statistical flow classification tools can be obfuscated. We address packet length masking and define perfect masking as an optimization problem aiming at minimizing overhead. We give an explicit, efficient algorithm to compute the optimal masking sequence, and provide numerical results based on measured traffic traces. We find that fragmentation requires about the same overhead as padding.
Alfonso Iacovazzi, Andrea Baiocchi

Real-Time Traffic Classification Based on Cosine Similarity Using Sub-application Vectors

Internet traffic classification plays a critical role in network monitoring, quality of service, intrusion detection, network security and trend analysis. The conventional port-based method is ineffective due to dynamic port usage and masquerading techniques, while the payload-based method suffers from heavy computational load and encryption. For these reasons, machine-learning-based statistical approaches have become the new trend in the network measurement community. In this short paper, we propose a new statistical approach based on DBSCAN clustering and weighted cosine similarity. Our experimental results show that the proposed approach achieves very high accuracy.
Cihangir Beşiktaş, Hacı Ali Mantar
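The cosine-similarity step can be illustrated as follows (the feature vectors and class centroids are invented for the example; the paper's sub-application vectors, weighting and DBSCAN stage are not reproduced here):

```python
import math

def cosine(u, v):
    """Cosine of the angle between two feature vectors (1 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def classify(flow_vec, centroids):
    """Assign a flow to the class whose centroid vector is most similar."""
    return max(centroids, key=lambda app: cosine(flow_vec, centroids[app]))

# Hypothetical per-class centroids over three flow features
centroids = {"web": [1.0, 0.1, 0.0], "p2p": [0.1, 1.0, 0.9]}
```

A new flow's feature vector is then compared against every centroid, and the nearest one (in angle, not distance) gives the label.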

Measurement for QoS, Security and Service Level Agreements

Towards Efficient Flow Sampling Technique for Anomaly Detection

With the increasing amount of network traffic, sampling techniques have become widely employed, allowing monitoring and analysis of high-speed network links. Despite all their benefits, sampling methods negatively influence the accuracy of anomaly detection techniques and other subsequent processing. In this paper, we present an adaptive, feature-aware sampling technique that reduces the loss of information caused by the sampling process, thus minimizing the decrease in anomaly detection efficiency.
To verify the optimality of the proposed technique, we build a model of an ideal sampling algorithm and define general metrics that allow us to compute the distortion of traffic feature distributions for various types of sampling algorithms. We compare our technique with random flow sampling and reveal their impact on several anomaly detection methods using real network traffic data. The presented ideas can be applied on high-speed network links to refine the input data by suppressing highly redundant information.
Karel Bartos, Martin Rehak
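One common way to quantify such distortion (not necessarily the paper's metric) is the KL divergence between a traffic feature distribution before and after sampling; the feature counts below are invented:

```python
import math

def distortion(orig, sampled, eps=1e-9):
    """KL divergence between a feature distribution (e.g. flows per
    destination port) before and after sampling; 0 means no distortion."""
    p_tot = sum(orig.values())
    q_tot = sum(sampled.values())
    kl = 0.0
    for key, count in orig.items():
        p = count / p_tot
        q = sampled.get(key, 0) / q_tot
        kl += p * math.log(p / (q if q > 0 else eps))
    return kl

# Hypothetical per-port flow counts before and after a biased sampler
orig = {"80": 50, "443": 30, "22": 20}
biased = {"80": 90, "443": 5, "22": 5}
```

A sampler preserving the original proportions scores 0, while one that over-represents port 80 scores strictly higher, making sampling schemes directly comparable.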

Detecting and Profiling TCP Connections Experiencing Abnormal Performance

We study functionally correct TCP connections – normal set-up, data transfer and tear-down – that experience lower than normal performance in terms of delay and throughput. Several factors, including packet loss or application behavior, may lead to such abnormal performance. We present a methodology to detect TCP connections with abnormal performance from packet traces recorded at a single vantage point. Our technique decomposes a TCP transfer into periods where: (i) TCP is recovering from losses, (ii) the client or the server is thinking or preparing data, respectively, or (iii) the data is sent but at an abnormally low rate. We apply this methodology to several traces containing traffic from FTTH, ADSL, and Cellular access networks. We discover that regardless of the access technology, packet loss dramatically degrades performance, as TCP is rarely able to rely on Fast Retransmit to recover from losses. However, we also find that the TCP timeout mechanism has been optimized in Cellular networks compared to ADSL/FTTH technologies. Concerning loss-free periods, our technique exposes various kinds of abnormal performance, some benign with no impact on the user, e.g., in p2p or instant messaging applications, and some more critical, e.g., in HTTPS sessions.
Aymen Hafsaoui, Guillaume Urvoy-Keller, Matti Siekkinen, Denis Collange

Geographical Internet PoP Level Maps

We introduce DIMES’s geographical Internet PoP-level connectivity maps, generated automatically at world scale using a structural approach. We provide preliminary results of the algorithm and discuss the properties of the generated maps as well as their global spread.
Yuval Shavitt, Noa Zilberman

Distributed Troubleshooting of Web Sessions Using Clustering

Web browsing is a very common way of using the Internet, among others to read news, shop online, or watch user-generated content on sites such as YouTube or Dailymotion. Traditional evaluations of web surfing focus on objectively measured Quality of Service (QoS) metrics such as loss rate or round-trip times. In this paper, we propose to use K-means clustering to share knowledge about the performance of the same web page as experienced by different clients. This technique allows us to discover and explain performance differences among users and to identify the root causes of poor performance.
Heng Cui, Ernst Biersack
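A toy version of the clustering step might group sessions by (page load time, round-trip time) pairs; the session values below are invented, and the paper's actual feature set is richer:

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20, seed=0):
    """Tiny K-means: returns (centroids, per-point cluster labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point
        labels = [min(range(k), key=lambda c: dist2(p, centroids[c]))
                  for p in points]
        # Update step: recompute each centroid as the mean of its members
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return centroids, labels

# Hypothetical sessions: (page load time in s, round-trip time in ms)
sessions = [(0.2, 30.0), (0.25, 32.0), (5.0, 200.0), (4.8, 210.0)]
centroids, labels = kmeans(sessions, 2)
```

Sessions from different clients that land in the same cluster experienced similar performance, so an outlier cluster points at clients (or paths) with a shared problem.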

Tools for Network Measurement and Experimentation

Using Metadata to Improve Experiment Reliability in Shared Environments

Experimental network research is challenging because experiment outcomes can be influenced by undesired effects from other activities in the network. In shared experiment networks, control over resources is often limited and QoS guarantees might not be available. When network conditions vary during a series of experiments, unwanted artifacts can be introduced into the results, reducing the reliability of the experiments. We propose a novel, systematic methodology in which network conditions are monitored during the experiments and information about the network is collected. This information, known as metadata, is analyzed statistically to identify periods during the experiments when network conditions have been similar; data points collected during these periods are valid for comparison. Our hypothesis is that this methodology can make experiments more reliable. We present a proof-of-concept implementation of our method, deployed in the FEDERICA and PlanetLab networks.
Pehr Söderman, Markus Hidell, Peter Sjödin

tsdb: A Compressed Database for Time Series

Large-scale network monitoring systems require efficient storage and consolidation of measurement data. Relational databases and popular tools such as the Round-Robin Database show their limitations when handling a large number of time series, because data access time greatly increases with the cardinality of the data and the number of measurements. As a result, monitoring systems are forced to store very few metrics at low frequency in order to keep data access within acceptable time bounds.
This paper describes a novel compressed time series database named tsdb, whose goal is to allow large time series to be stored and consolidated in real time with limited disk space usage. Our validation demonstrates the advantage of tsdb over traditional approaches and shows that tsdb is suitable for handling a large number of time series.
Luca Deri, Simone Mainardi, Francesco Fusco
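tsdb's actual storage format is not described in the abstract; a classic trick for compressing regular time series, delta encoding followed by a general-purpose compressor, can be sketched as follows (function names and format are illustrative only):

```python
import struct
import zlib

def compress_series(values):
    """Delta-encode integer samples, then deflate the highly regular deltas."""
    deltas = [values[0]] + [b - a for a, b in zip(values, values[1:])]
    return zlib.compress(struct.pack("<%di" % len(deltas), *deltas))

def decompress_series(blob, n):
    """Invert compress_series: inflate, then integrate the deltas back."""
    deltas = struct.unpack("<%di" % n, zlib.decompress(blob))
    series, acc = [], 0
    for d in deltas:
        acc += d
        series.append(acc)
    return series

series = [1000 + i for i in range(500)]   # a slowly growing counter
blob = compress_series(series)
```

Because monitoring counters change slowly, the delta stream is mostly small repeated values and compresses far below the raw 4 bytes per sample.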

Flexible High Performance Traffic Generation on Commodity Multi–core Platforms

Generating high-volume and accurate test traffic is crucial for assessing the performance of network devices in a reliable way and under different stress conditions. However, traffic generation still relies mostly on special-purpose hardware. Available software generators can reproduce rich and involved traffic patterns, but do not meet the performance requirements needed to effectively stress the device under test; hardware devices, on the other hand, usually provide limited flexibility with respect to the traffic patterns they can generate. The aim of this work is to design a traffic generator that both achieves good performance and provides a flexible framework for supporting arbitrary traffic models. The key factor that enables our system to meet both requirements is parallelism, which is increasingly provided by modern commodity hardware: our generator, which includes both kernel and user space components, can efficiently scale with multiple cores and multi-queue commodity network cards. By leveraging such a design, our generator is able to produce close-to-line-rate traffic on a 10 Gbps link, while accommodating multiple traffic models and providing good accuracy.
Nicola Bonelli, Andrea Di Pietro, Stefano Giordano, Gregorio Procissi

Improving Network Measurement Efficiency through Multiadaptive Sampling

Sampling techniques play a key role in achieving efficient network measurements by reducing the amount of traffic processed while trying to maintain the accuracy of network statistical behavior estimation.
Despite the evolution of current techniques regarding the correctness of network parameter estimation, the overhead associated with the volume of data involved in the sampling process is still considerable. In this context, this paper proposes a new multiadaptive traffic sampling technique based on linear prediction, which significantly reduces the traffic under analysis while keeping the samples representative of network behavior.
A proof-of-concept evaluation of this technique on real traffic traces representing distinct traffic profiles demonstrates the effectiveness of the proposal, which outperforms classic techniques both in accuracy and in the volume of data processed.
João Marco C. Silva, Solange Rito Lima
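As a rough sketch of linear-prediction-driven adaptive sampling (the step sizes and tolerance are hypothetical, not the paper's parameters): widen the sampling interval while a linear extrapolation of the last two samples predicts the signal well, and fall back to dense sampling when it does not:

```python
def adaptive_sample(signal, base=1, max_step=8, tol=5.0):
    """Sample `signal` at adaptive intervals driven by linear prediction."""
    samples = [(0, signal[0]), (base, signal[base])]
    step = base
    i = base
    while i + step < len(signal):
        i += step
        (t0, v0), (t1, v1) = samples[-2], samples[-1]
        # Linear extrapolation from the last two samples
        predicted = v1 + (v1 - v0) / (t1 - t0) * (i - t1)
        samples.append((i, signal[i]))
        if abs(signal[i] - predicted) <= tol:
            step = min(step * 2, max_step)  # predictable: sample less often
        else:
            step = base                     # changing: sample densely again
    return samples

flat = [10.0] * 100          # a perfectly predictable traffic rate
samples = adaptive_sample(flat)
```

On the flat signal the interval quickly grows to `max_step`, so far fewer than 100 samples are taken while every taken sample is exact; a bursty signal forces the interval back down, which is the essence of the multiadaptive idea.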

