Abstract
Debugging faults in complex networks often requires capturing and analyzing traffic at the packet level. Datacenter networks (DCNs) make this task uniquely challenging because of their scale, traffic volume, and diversity of faults. To troubleshoot faults in a timely manner, DCN administrators must a) identify affected packets within a large volume of traffic; b) track them across multiple network components; c) analyze traffic traces for fault patterns; and d) test or confirm potential causes. To our knowledge, no tool today achieves both the specificity and scale required for this task.
We present Everflow, a packet-level network telemetry system for large DCNs. Everflow traces specific packets by implementing a powerful packet filter on top of the "match and mirror" functionality of commodity switches. It shuffles captured packets to multiple analysis servers using load balancers built on switch ASICs, and it sends "guided probes" to test or confirm potential faults. We present experiments that demonstrate Everflow's scalability, and we share experiences of troubleshooting network faults gathered from running it for over 6 months in Microsoft's DCNs.
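The "match and mirror" filter and the shuffling of captured packets to analysis servers can be illustrated with a minimal software sketch. This is not Everflow's implementation, which runs on switch ASICs; the `Packet` type, the specific match rules (TCP SYN/FIN/RST and a hypothetical debug bit), the CRC32 hash, and all function names are assumptions made for illustration only.

```python
import zlib
from dataclasses import dataclass

# TCP flag bits (standard values from the TCP header).
FIN, SYN, RST = 0x01, 0x02, 0x04
TCP_PROTO = 6


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet summary; a real switch matches on raw header fields."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int
    tcp_flags: int = 0
    debug_bit: bool = False  # assumed marker for packets selected for tracing


def should_mirror(pkt: Packet) -> bool:
    """Assumed match rules: mirror marked packets and TCP control packets."""
    if pkt.debug_bit:
        return True
    return pkt.proto == TCP_PROTO and bool(pkt.tcp_flags & (SYN | FIN | RST))


def pick_analyzer(pkt: Packet, analyzers: list[str]) -> str:
    """Hash the 5-tuple so all mirrored packets of one flow reach one analyzer."""
    key = f"{pkt.src_ip}|{pkt.dst_ip}|{pkt.src_port}|{pkt.dst_port}|{pkt.proto}"
    return analyzers[zlib.crc32(key.encode()) % len(analyzers)]


def mirror(packets: list[Packet], analyzers: list[str]) -> dict[str, list[Packet]]:
    """Apply the filter, then spread the matching packets across analyzers."""
    out: dict[str, list[Packet]] = {a: [] for a in analyzers}
    for pkt in packets:
        if should_mirror(pkt):
            out[pick_analyzer(pkt, analyzers)].append(pkt)
    return out
```

Hashing on the flow's 5-tuple keeps each flow's trace on one server, so per-flow analysis needs no cross-server coordination. In this sketch the two directions of a connection may hash to different analyzers.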
Packet-Level Telemetry in Large Datacenter Networks
SIGCOMM '15: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication