DOI: 10.1145/1592568.1592599
research-article
Every microsecond counts: tracking fine-grain latencies with a lossy difference aggregator

Published: 16 August 2009

ABSTRACT

Many network applications have stringent end-to-end latency requirements, including VoIP and interactive video conferencing, automated trading, and high-performance computing, where even microsecond variations may be intolerable. The resulting fine-grain measurement demands cannot be met effectively by existing technologies, such as SNMP, NetFlow, or active probing. We propose instrumenting routers with a hash-based primitive that we call a Lossy Difference Aggregator (LDA) to measure latencies down to tens of microseconds and losses as infrequent as one in a million.

Such measurement can be viewed abstractly as what we refer to as a coordinated streaming problem, which is fundamentally harder than standard streaming problems due to the need to coordinate values between nodes. We describe a compact data structure that efficiently computes the average and standard deviation of latency and loss rate in a coordinated streaming environment. Our theoretical results translate to an efficient hardware implementation at 40 Gbps using less than 1% of a typical 65-nm 400-MHz networking ASIC. When compared to Poisson-spaced active probing with similar overheads, our LDA mechanism delivers orders of magnitude smaller relative error; active probing requires 50-60 times as much bandwidth to deliver similar levels of accuracy.
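
The abstract does not spell out how the hash-based primitive works, so the following is only a minimal sketch of the general idea, not the authors' design. It assumes a single bank of buckets, each holding a timestamp sum and a packet count, kept identically at a sender and a receiver; a bucket is treated as usable only when both sides counted the same number of packets. The names `LDA`, `record`, and `estimate`, the use of SHA-256 as the hash, and integer microsecond timestamps are all assumptions made for illustration. Standard deviation, sampling, and multiple banks are left out.

```python
import hashlib


class LDA:
    """Simplified single-bank aggregator (illustrative, hypothetical names)."""

    def __init__(self, num_buckets):
        self.ts_sum = [0] * num_buckets  # sum of timestamps per bucket (microseconds)
        self.count = [0] * num_buckets   # number of packets hashed to each bucket

    def _bucket(self, packet_id):
        # Both endpoints must hash a packet to the same bucket, so the hash
        # depends only on the packet identifier (an assumption of this sketch).
        digest = hashlib.sha256(str(packet_id).encode()).digest()
        return int.from_bytes(digest[:4], "big") % len(self.count)

    def record(self, packet_id, timestamp_us):
        b = self._bucket(packet_id)
        self.ts_sum[b] += timestamp_us
        self.count[b] += 1


def estimate(sender, receiver):
    """Return (average delay in microseconds or None, loss rate)."""
    delay_sum = 0
    usable_packets = 0
    for b in range(len(sender.count)):
        # A bucket whose counts differ lost at least one packet; its
        # timestamp sums are not comparable, so it is skipped.
        if sender.count[b] == receiver.count[b] and sender.count[b] > 0:
            delay_sum += receiver.ts_sum[b] - sender.ts_sum[b]
            usable_packets += sender.count[b]
    sent = sum(sender.count)
    received = sum(receiver.count)
    avg_delay = delay_sum / usable_packets if usable_packets else None
    loss_rate = (sent - received) / sent if sent else 0.0
    return avg_delay, loss_rate
```

In this sketch, each endpoint calls `record` for every packet it sees, and `estimate` is run once both aggregators have been collected. A single lost packet invalidates only the bucket it hashed to, so the remaining buckets still yield a delay average, while the loss rate comes from the difference in total counts.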

    • Published in

      SIGCOMM '09: Proceedings of the ACM SIGCOMM 2009 conference on Data communication
      August 2009
      340 pages
      ISBN: 9781605585949
      DOI: 10.1145/1592568

      ACM SIGCOMM Computer Communication Review, Volume 39, Issue 4
      SIGCOMM '09
      October 2009
      325 pages
      ISSN: 0146-4833
      DOI: 10.1145/1594977

      Copyright © 2009 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 554 of 3,547 submissions, 16%