ABSTRACT
Profiling Internet backbone traffic is becoming increasingly hard because users and applications evade detection through traffic obfuscation and encryption. The key question addressed here is: Is it possible to profile traffic at the backbone without relying on packet- and flow-level information, which can be obfuscated? We propose a novel approach, called Profiling-By-Association (PBA), that uses only the IP-to-IP communication graph and information about some applications used by a few IP hosts (a.k.a. seeds). The key insight is that IP hosts tend to communicate more frequently with hosts involved in the same application, forming communities (or clusters). Profiling a few members of a cluster can "give away" the whole community. Following this approach, we develop several algorithms to profile Internet traffic and evaluate them on real traces from four large backbone networks. We show that PBA's accuracy is around 90% on average with knowledge of only 1% of the hosts in a given data set, and that its runtime is on the order of minutes (≈ 5).
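The seed-and-community idea can be illustrated with a minimal sketch. This is not one of the paper's algorithms, which the abstract does not specify; it is a simple neighbor-majority label propagation used as a stand-in. The function name `profile_by_association`, the `max_rounds` parameter, the IP addresses, and the application labels are all hypothetical.

```python
from collections import Counter, defaultdict


def profile_by_association(edges, seeds, max_rounds=10):
    """Propagate seed labels over an undirected IP-to-IP graph.

    Assumption: a stand-in for PBA, not the authors' method. Each
    unlabeled host adopts the most common label among its already
    labeled neighbors, round by round.
    """
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    labels = dict(seeds)  # copy so the caller's seeds stay untouched
    for _ in range(max_rounds):
        updates = {}
        for host in neighbors:
            if host in labels:
                continue
            votes = Counter(
                labels[n] for n in neighbors[host] if n in labels
            )
            if votes:
                # Majority vote; ties are broken alphabetically so the
                # result is deterministic.
                updates[host] = min(votes, key=lambda app: (-votes[app], app))
        if not updates:
            break
        # Apply all updates at once (synchronous rounds).
        labels.update(updates)
    return labels


# Hypothetical toy graph: two tightly knit groups joined by one edge.
edges = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.2", "10.0.0.3"),
    ("10.0.0.1", "10.0.0.3"),
    ("10.0.1.1", "10.0.1.2"),
    ("10.0.1.2", "10.0.1.3"),
    ("10.0.0.3", "10.0.1.1"),
]
# Only two of the six hosts are profiled in advance (the seeds).
seeds = {"10.0.0.1": "p2p", "10.0.1.2": "web"}

profiles = profile_by_association(edges, seeds)
for host in sorted(profiles):
    print(host, profiles[host])
```

In this toy graph the two seeds are enough to label the other four hosts, which mirrors the claim that profiling a few members can give away a whole community. A real deployment would presumably replace the voting step with a proper community-detection algorithm.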
Index Terms
- Profiling-By-Association: a resilient traffic profiling solution for the internet backbone