skip to main content
article

New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice

Published:01 August 2003Publication History
Skip Abstract Section

Abstract

Accurate network traffic measurement is required for accounting, bandwidth provisioning and detecting DoS attacks. These applications see the traffic as a collection of flows they need to measure. As link speeds and the number of flows increase, keeping a counter for each flow is too expensive (using SRAM) or slow (using DRAM). The current state-of-the-art methods (Cisco's sampled NetFlow), which count periodically sampled packets are slow, inaccurate and resource-intensive. Previous work showed that at different granularities a small number of "heavy hitters" accounts for a large share of traffic. Our paper introduces a paradigm shift by concentrating the measurement process on large flows only---those above some threshold such as 0.1% of the link capacity.We propose two novel and scalable algorithms for identifying the large flows: sample and hold and multistage filters, which take a constant number of memory references per packet and use a small amount of memory. If M is the available memory, we show analytically that the errors of our new algorithms are proportional to 1/M; by contrast, the error of an algorithm based on classical sampling is proportional to 1/√M, thus providing much less accuracy for the same amount of memory. We also describe optimizations such as early removal and conservative update that further improve the accuracy of our algorithms, as measured on real traffic traces, by an order of magnitude. Our schemes allow a new form of accounting called threshold accounting in which only flows above a threshold are charged by usage while the rest are charged a fixed fee. Threshold accounting generalizes usage-based and duration based pricing.

References

  1. Altman, J. and Chu, K. 2001. A proposal for a flexible service plan that is attractive to users and internet service providers. In IEEE Proceedings of the INFOCOM.Google ScholarGoogle Scholar
  2. Bloom, B. 1970. Space/time trade-offs in hash coding with allowable errors. Comm. ACM. 13, 422--426. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Brownlee, N., Mills, C., and Ruth, G. 1999. Traffic flow measurement: Architecture. RFC 2722. Google ScholarGoogle Scholar
  4. Burrows, M., Erlingson, U., Leung, S.-T., Vandevoorde, M., Waldspurger, C. A., and Weihl, K. W. W. 2000. Efficient and flexible value sampling. In ASPLOS. Google ScholarGoogle Scholar
  5. Cohen, S. and Matias, Y. 2003. Spectral bloom filters. In SIGMOD/PODS. Google ScholarGoogle Scholar
  6. Duffield, N., Lund, C., and Thorup, M. 2001. Charging from sampled network usage. In SIGCOMM Internet Measurement Workshop. Google ScholarGoogle Scholar
  7. Duffield, N. G. and Grossglauser, M. 2000. Trajectory sampling for direct traffic observation. In Proceedings of the ACM SIGCOMM. 271--282. Google ScholarGoogle Scholar
  8. Estan, C. and Varghese, G. 2002. New directions in traffic measurement and accounting. Tech. Rep. 0699, UCSD CSE Department. Feb.Google ScholarGoogle Scholar
  9. Fang, M., Shivakumar, N., Garcia-Molina, H., Motwani, R., and Ullman, J. D. 1998. Computing iceberg queries efficiently. In International Conference on Very Large Data Bases. 307--317. Google ScholarGoogle Scholar
  10. Fang, W. and Peterson, L. 1999. Inter-as traffic patterns and their implications. In Proceedings of IEEE GLOBECOM.Google ScholarGoogle Scholar
  11. Feldmann, A., Greenberg, A., Lund, C., Reingold, N., Rexford, J., and True, F. 2000. Deriving traffic demands for operational ip networks: Methodology and experience. In Proceedings of the ACM SIGCOMM. 257--270. Google ScholarGoogle Scholar
  12. Feng, W.-C., Kandlur, D. D., Saha, D., and Shin, K. G. 2001. Stochastic fair blue: A queue management algorithm for enforcing fairness. In IEEE Proceedings of the INFOCOM.Google ScholarGoogle Scholar
  13. Gibbons, P. B. and Matias, Y. 1998. New sampling-based summary statistics for improving approximate query answers. In Proceedings of the ACM SIGMOD. 331--342. Google ScholarGoogle Scholar
  14. Huber, J. 2001. Design of an oc-192 flow monitoring chip. Class Project.Google ScholarGoogle Scholar
  15. Karp, R. M., Papadimitriou, C. H., and Shenker, S. 2003. A simple algorithm for finding frequent elements in streams and bags. ACM Trans. Data. Syst. Google ScholarGoogle Scholar
  16. Mackie-Masson, J. and Varian, H. 1995. Public Access to the Internet. MIT Press, Chapter Pricing the Internet. Google ScholarGoogle Scholar
  17. Mahajan, R., Bellovin, S. M., Floyd, S., Ioannidis, J., Paxson, V., and Shenker, S. 2001. Controlling high bandwidth aggregates in the network. http://www.aciri.org/pushback/.Google ScholarGoogle Scholar
  18. Moore, D. 2001. Caida analysis of code-red. http://www.caida.org/analysis/security/code-red/.Google ScholarGoogle Scholar
  19. Narayanasamy, S., Sherwood, T., Sair, S., Calder, B., and Varghese, G. 2003. Catching accurate profiles in hardware. In HPCA. Google ScholarGoogle Scholar
  20. Pan, R., Breslau, L., Prabhakar, B., and Shenker, S. 2001. Approximate fairness through differential dropping. Tech. Rep., ACIRI.Google ScholarGoogle Scholar
  21. Patterson, D. A. and Hennessy, J. L. 1998. Computer Organization and Design, second ed. Morgan Kaufmann, 619. Google ScholarGoogle Scholar
  22. Sastry, S., Bodik, R., and Smith, J. E. 2001. Rapid profiling via stratified sampling. In 28th. International Symposium on Computer Architecture. 278--289. Google ScholarGoogle Scholar
  23. Shaikh, A., Rexford, J., and Shin, K. G. 1999. Load-sensitive routing of long-lived ip flows. In Proceedings of the ACM SIGCOMM. Google ScholarGoogle Scholar
  24. Shenker, S., Clark, D., Estrin, D., and Herzog, S. 1996. Pricing in computer networks: Reshaping the research agenda. In ACM Comput. Comm. Rev. 26, 19--43. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Smitha, Kim, I., and Reddy, A. L. N. 2001. Identifying long term high rate flows at a router. In Proceedings of High Performance Computing. Google ScholarGoogle Scholar
  26. Thomson, K., Miller, G. J., and Wilder, R. 1997. Wide-area traffic patterns and characteristics. In IEEE Network. Google ScholarGoogle Scholar
  27. Tong, D. and Reddy, A. L. N. 1999. Qos enhancement with partial state. In International Workshop on QOS.Google ScholarGoogle Scholar

Index Terms

  1. New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice

        Recommendations

        Reviews

        Ferdi W. J. Put

        A phenomenon recently observed in traffic measurement on the Internet is that a small fraction of large flows accounts for a large share of the traffic. Because these so-called elephants deteriorate the performance experienced by the many short flows (mice), a volume charge could be imposed on those large flows. As a consequence, the measurement process can be concentrated on large flows only, such that it is not necessary to keep a counter for each flow (which would be too expensive or too slow). The biggest problem is to identify large flows: how does the measurement device know which flows to track__?__ For this purpose, the authors present two algorithms with similar performance: sample and hold (easier to implement); and multistage filters (more accurate). Because both algorithms access memory for each packet, they must use fast static random access memory (SRAM) to keep up with line speeds. The basic idea of sample and hold is to sample each packet with a certain probability. If the corresponding flow has no entry in memory, a new entry is created. After that, this entry is updated for every subsequent packet belonging to that flow. The sampling probability is determined in function of the threshold above, in which flows are considered large. Multistage filters operate according to the following principle. A stage has a table indexed by a hash function. When a packet comes in, a hash is computed, and the size of the packet is added to the corresponding counter. If the counter exceeds a threshold, we consider this flow large. Because many flows can be mapped to the same counter and can be falsely identified as large flows, multiple stages with separate counter tables and independent hash functions are used. The authors present several improvements of these basic algorithms to increase accuracy and to reduce memory requirements. Consequently, they give a very extensive theoretical and experimental analysis, and a comparison with Cisco's Sampled NetFlow. Sampled NetFlow does not process every packet, but samples every x th packet, and can afford to use large amounts of inexpensive but slow dynamic random access memory (DRAM). The two new algorithms outperform NetFlow in the sense that they detect new large flows fast, and provide accurate results (exact values and provable lower bounds). Sampled NetFlow gives better results only if the threshold is set very low (medium flows are also considered large). Another advantage of Sampled NetFlow is that it allows an a posteriori analysis of patterns in sampled data, while the new algorithms require an a priori definition of flows and patterns we are interested in. Online Computing Reviews Service

        Access critical reviews of Computing literature here

        Become a reviewer for Computing Reviews.

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image ACM Transactions on Computer Systems
          ACM Transactions on Computer Systems  Volume 21, Issue 3
          August 2003
          106 pages
          ISSN:0734-2071
          EISSN:1557-7333
          DOI:10.1145/859716
          Issue’s Table of Contents

          Copyright © 2003 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 1 August 2003
          Published in tocs Volume 21, Issue 3

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • article

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader