ABSTRACT
Identifying the "heavy hitter" flows or flows with large traffic volumes in the data plane is important for several applications e.g., flow-size aware routing, DoS detection, and traffic engineering. However, measurement in the data plane is constrained by the need for line-rate processing (at 10-100Gb/s) and limited memory in switching hardware. We propose HashPipe, a heavy hitter detection algorithm using emerging programmable data planes. HashPipe implements a pipeline of hash tables which retain counters for heavy flows while evicting lighter flows over time. We prototype HashPipe in P4 and evaluate it with packet traces from an ISP backbone link and a data center. On the ISP trace (which contains over 400,000 flows), we find that HashPipe identifies 95% of the 300 heaviest flows with less than 80KB of memory.
- M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat. Hedera: Dynamic flow scheduling for data center networks. In USENIX NSDI, 2010.Google ScholarDigital Library
- Arista Networks. Arista 7050x switch architecture, 2014. https://www.corporatearmor.com/documents/Arista_7050X_Switch_Architecture_Datasheet.pdf.Google Scholar
- Barefoot Networks. Barefoot Tofino. https://www.barefootnetworks.com/technology/.Google Scholar
- T. Benson, A. Akella, and D. A. Maltz. Network traffic characteristics of data centers in the wild. In ACM IMC, 2010. Google ScholarDigital Library
- T. Benson, A. Anand, A. Akella, and M. Zhang. MicroTE: Fine grained traffic engineering for data centers. In ACM CoNEXT, 2011.Google ScholarDigital Library
- P. Bosshart, D. Daly, G. Gibb, M. Izzard, N. McKeown, J. Rexford, C. Schlesinger, D. Talayco, A. Vahdat, G. Varghese, and D. Walker. P4: Programming protocol-independent packet processors. ACM SIGCOMM Computer Communication Review, 44(3):87--95, 2014. Google ScholarDigital Library
- P. Bosshart, G. Gibb, H.-S. Kim, G. Varghese, N. McKeown, M. Izzard, F. Mujica, and M. Horowitz. Forwarding metamorphosis: Fast programmable match-action processing in hardware for SDN. In ACM SIGCOMM, 2013.Google ScholarDigital Library
- CAIDA. The CAIDA UCSD Anonymized Internet Traces 2016 - March. http://www.caida.org/data/passive/passive_2016_dataset.xml.Google Scholar
- Cavium. Cavium and XPliant Introduce a Fully Programmable Switch Silicon Family Scaling to 3.2 Terabits per Second. http://cavium.com/newsevents-Cavium-and-XPliant-Introduce-a-Fully-Programmable-Switch-Silicon-Family.html.Google Scholar
- M. Charikar, K. Chen, and M. Farach-Colton. Finding frequent items in data streams. In Springer ICALP, 2002. Google ScholarCross Ref
- S. Choi. Implementing Heavy-Hitter Detection, 2016. https://github.com/p4lang/tutorials/tree/master/SIGCOMM_2016/heavy_hitter.Google Scholar
- Cisco Networks. Netflow. http://www.cisco.com/c/en/us/products/ios-nx-os-software/ios-netflow/index.html.Google Scholar
- G. Cormode and M. Hadjieleftheriou. Finding frequent items in data streams. In VLDB Endowment, 2008. Google ScholarDigital Library
- G. Cormode and S. Muthukrishnan. An improved data stream summary: The count-min sketch and its applications. Journal of Algorithms, 55(1):58--75, 2005. Google ScholarDigital Library
- G. Cormode and S. Muthukrishnan. What's hot and what's not: Tracking most frequent items dynamically. ACM Trans. Database Syst., 30(1), 2005. Google ScholarDigital Library
- A. R. Curtis, J. C. Mogul, J. Tourrilhes, P. Yalagandula, P. Sharma, and S. Banerjee. DevoFlow: Scaling flow management for high-performance networks. In ACM SIGCOMM, 2011.Google ScholarDigital Library
- C. Estan and G. Varghese. New directions in traffic measurement and accounting. ACM Trans. Computer Systems, 21(3), 2003. Google ScholarDigital Library
- A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, and F. True. Deriving traffic demands for operational IP networks: Methodology and experience. In ACM SIGCOMM, 2000.Google ScholarDigital Library
- V. Jeyakumar, M. Alizadeh, Y. Geng, C. Kim, and D. Mazières. Millions of little minions: Using packets for low latency network programming and visibility. In ACM SIGCOMM, 2014.Google ScholarDigital Library
- John Wawrzynek and Krste Asanovic and John Lazzaro and Yunsup Lee. Memory (lecture), 2010. https://inst.eecs.berkeley.edu/~cs250/fa10/lectures/lec08.pdf.Google Scholar
- L. Jose, M. Yu, and J. Rexford. Online measurement of large traffic aggregates on commodity switches. In USENIX Hot-ICE, 2011.Google ScholarDigital Library
- M. Kumar and K. Prasad. Auto-learning of MAC addresses and lexicographic lookup of hardware database. US Patent App. 10/747,332.Google Scholar
- A. Lakhina, M. Crovella, and C. Diot. Characterization of network-wide anomalies in traffic flows. In ACM IMC, 2004. Google ScholarDigital Library
- Y. Li, R. Miao, C. Kim, and M. Yu. FlowRadar: A better NetFlow for data centers. In USENIX NSDI, 2016.Google ScholarDigital Library
- Z. Liu, A. Manousis, G. Vorsanger, V. Sekar, and V. Braverman. One sketch to rule them all: Rethinking network flow monitoring with UnivMon. In ACM SIGCOMM, 2016.Google ScholarDigital Library
- Y. Lu, A. Montanari, B. Prabhakar, S. Dharmapurikar, and A. Kabbani. Counter braids: A novel counter architecture for per-flow measurement. In ACM SIGMETRICS, 2008. Google ScholarDigital Library
- G. S. Manku and R. Motwani. Approximate frequency counts over data streams. In VLDB Endowment, 2002. Google ScholarCross Ref
- A. Metwally, D. Agrawal, and A. El Abbadi. Efficient computation of frequent and top-k elements in data streams. In International Conference on Database Theory. Springer, 2005.Google Scholar
- J. Misra and D. Gries. Finding repeated elements. Science of computer programming, 2(2):143--152, 1982. Google ScholarCross Ref
- NANOG mailing list. SFlow vs NetFlow/IPFIX. https://mailman.nanog.org/pipermail/nanog/2016-February/thread.html#84418.Google Scholar
- R. Ozdag. Intel Ethernet Switch FM6000 Series - Software Defined Networking. http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ethernet-switch-fm6000-sdn-paper.pdf.Google Scholar
- P4 Language Consortium. P4 Switch Behavioral Model. https://github.com/p4lang/behavioral-model.Google Scholar
- R. Pagh and F. F. Rodler. Cuckoo hashing. J. Algorithms, 51(2):122--144, May 2004. Google ScholarDigital Library
- P. Phaal. SFlow sampling rates. http://blog.sflow.com/2009/06/sampling-rates.html.Google Scholar
- J. Rasley, B. Stephens, C. Dixon, E. Rozner, W. Felter, K. Agarwal, J. Carter, and R. Fonseca. Planck: Millisecond-scale monitoring and control for commodity networks. In ACM SIGCOMM, 2014.Google ScholarDigital Library
- O. Rottenstreich and J. Tapolcai. Optimal rule caching and lossy compression for longest prefix matching. IEEE/ACM Trans. Netw., 2017.Google ScholarDigital Library
- R. Schweller, A. Gupta, E. Parsons, and Y. Chen. Reversible sketches for efficient and accurate change detection over network data streams. In ACM IMC, 2004. Google ScholarDigital Library
- SFlow. http://sflow.org/.Google Scholar
- Simon Horman. Packet recirculation, 2013. https://Iwn.net/Articles/546476/.Google Scholar
- A. Sivaraman, A. Cheung, M. Budiu, C. Kim, M. Alizadeh, H. Balakrishnan, G. Varghese, N. McKeown, and S. Licking. Packet transactions: High-level programming for line-rate switches. In ACM SIGCOMM, 2016.Google ScholarDigital Library
- A. Sivaraman, S. Subramanian, M. Alizadeh, S. Chole, S.-T. Chuang, A. Agrawal, H. Balakrishnan, T. Edsall, S. Katti, and N. McKeown. Programmable packet scheduling at line rate. In ACM SIGCOMM, 2016. Google ScholarDigital Library
- The P4 Language Consortium. The P4 Language Specification, Version 1.1.0 - Release Candidate, January 2016. http://p4.org/wp-content/uploads/2016/03/p4_v1.1.pdf.Google Scholar
- B. Vöcking. How asymmetry helps load balancing. Journal of the ACM (JACM), 50(4):568--589, 2003. Google ScholarDigital Library
- N. Weste and D. Harris. CMOS VLSI Design: A Circuits and Systems Perspective. Addison-Wesley, 2010.Google ScholarDigital Library
- M. Yu, L. Jose, and R. Miao. Software defined traffic measurement with OpenSketch. In USENIX NSDI, 2013.Google ScholarDigital Library
- L. Yuan, C.-N. Chuah, and P. Mohapatra. ProgME: Towards programmable network measurement. In ACM SIGCOMM, 2007.Google ScholarDigital Library
- Heavy-Hitter Detection Entirely in the Data Plane
Recommendations
Network-Wide Heavy Hitter Detection with Commodity Switches
SOSR '18: Proceedings of the Symposium on SDN ResearchMany network monitoring tasks identify subsets of traffic that stand out, e.g., top-k flows for a particular statistic. A Protocol Independent Switch Architecture (PISA) switch can identify these "heavy hitter" flows directly in the data plane, by ...
Heavy Hitter Detection on Multi-Pipeline Switches
ANCS '21: Proceedings of the Symposium on Architectures for Networking and Communications SystemsRecently, several applications have been designed and implemented to run entirely in the dataplane. However, most if not all the applications assume that network traffic traverses the same pipe, from ingress to egress inside the switch. While this seems ...
Revisiting heavy-hitters: don't count packets, compute flow inter-packet metrics in the data plane
SIGCOMM '20: Proceedings of the SIGCOMM '20 Poster and Demo SessionsDetecting Heavy Hitter (HH) flows, i.e., flows exceeding a pre-determined threshold in a time window, is a fundamental task as it enables network management and security applications like DoS attack detection/prevention, flow-size aware routing, and ...
Comments