article

New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice

Authors:
Cristian Estan

University of California, San Diego, La Jolla, CA

University of California, San Diego, La Jolla, CA
View Profile

,
George Varghese

University of California, San Diego, La Jolla, CA

University of California, San Diego, La Jolla, CA
View Profile

Authors Info & Claims

ACM Transactions on Computer Systems Volume 21 Issue 3pp 270–313https://doi.org/10.1145/859716.859719

Published:01 August 2003Publication History

ACM Transactions on Computer Systems

Abstract

Accurate network traffic measurement is required for accounting, bandwidth provisioning and detecting DoS attacks. These applications see the traffic as a collection of flows they need to measure. As link speeds and the number of flows increase, keeping a counter for each flow is too expensive (using SRAM) or slow (using DRAM). The current state-of-the-art methods (Cisco's sampled NetFlow), which count periodically sampled packets are slow, inaccurate and resource-intensive. Previous work showed that at different granularities a small number of "heavy hitters" accounts for a large share of traffic. Our paper introduces a paradigm shift by concentrating the measurement process on large flows only---those above some threshold such as 0.1% of the link capacity.We propose two novel and scalable algorithms for identifying the large flows: sample and hold and multistage filters, which take a constant number of memory references per packet and use a small amount of memory. If M is the available memory, we show analytically that the errors of our new algorithms are proportional to 1/M; by contrast, the error of an algorithm based on classical sampling is proportional to 1/√M, thus providing much less accuracy for the same amount of memory. We also describe optimizations such as early removal and conservative update that further improve the accuracy of our algorithms, as measured on real traffic traces, by an order of magnitude. Our schemes allow a new form of accounting called threshold accounting in which only flows above a threshold are charged by usage while the rest are charged a fixed fee. Threshold accounting generalizes usage-based and duration based pricing.

References

Altman, J. and Chu, K. 2001. A proposal for a flexible service plan that is attractive to users and internet service providers. In IEEE Proceedings of the INFOCOM.Google Scholar
Bloom, B. 1970. Space/time trade-offs in hash coding with allowable errors. Comm. ACM. 13, 422--426. Google ScholarDigital Library
Brownlee, N., Mills, C., and Ruth, G. 1999. Traffic flow measurement: Architecture. RFC 2722. Google Scholar
Burrows, M., Erlingson, U., Leung, S.-T., Vandevoorde, M., Waldspurger, C. A., and Weihl, K. W. W. 2000. Efficient and flexible value sampling. In ASPLOS. Google Scholar
Cohen, S. and Matias, Y. 2003. Spectral bloom filters. In SIGMOD/PODS. Google Scholar
Duffield, N., Lund, C., and Thorup, M. 2001. Charging from sampled network usage. In SIGCOMM Internet Measurement Workshop. Google Scholar
Duffield, N. G. and Grossglauser, M. 2000. Trajectory sampling for direct traffic observation. In Proceedings of the ACM SIGCOMM. 271--282. Google Scholar
Estan, C. and Varghese, G. 2002. New directions in traffic measurement and accounting. Tech. Rep. 0699, UCSD CSE Department. Feb.Google Scholar
Fang, M., Shivakumar, N., Garcia-Molina, H., Motwani, R., and Ullman, J. D. 1998. Computing iceberg queries efficiently. In International Conference on Very Large Data Bases. 307--317. Google Scholar
Fang, W. and Peterson, L. 1999. Inter-as traffic patterns and their implications. In Proceedings of IEEE GLOBECOM.Google Scholar
Feldmann, A., Greenberg, A., Lund, C., Reingold, N., Rexford, J., and True, F. 2000. Deriving traffic demands for operational ip networks: Methodology and experience. In Proceedings of the ACM SIGCOMM. 257--270. Google Scholar
Feng, W.-C., Kandlur, D. D., Saha, D., and Shin, K. G. 2001. Stochastic fair blue: A queue management algorithm for enforcing fairness. In IEEE Proceedings of the INFOCOM.Google Scholar
Gibbons, P. B. and Matias, Y. 1998. New sampling-based summary statistics for improving approximate query answers. In Proceedings of the ACM SIGMOD. 331--342. Google Scholar
Huber, J. 2001. Design of an oc-192 flow monitoring chip. Class Project.Google Scholar
Karp, R. M., Papadimitriou, C. H., and Shenker, S. 2003. A simple algorithm for finding frequent elements in streams and bags. ACM Trans. Data. Syst. Google Scholar
Mackie-Masson, J. and Varian, H. 1995. Public Access to the Internet. MIT Press, Chapter Pricing the Internet. Google Scholar
Mahajan, R., Bellovin, S. M., Floyd, S., Ioannidis, J., Paxson, V., and Shenker, S. 2001. Controlling high bandwidth aggregates in the network. http://www.aciri.org/pushback/.Google Scholar
Moore, D. 2001. Caida analysis of code-red. http://www.caida.org/analysis/security/code-red/.Google Scholar
Narayanasamy, S., Sherwood, T., Sair, S., Calder, B., and Varghese, G. 2003. Catching accurate profiles in hardware. In HPCA. Google Scholar
Pan, R., Breslau, L., Prabhakar, B., and Shenker, S. 2001. Approximate fairness through differential dropping. Tech. Rep., ACIRI.Google Scholar
Patterson, D. A. and Hennessy, J. L. 1998. Computer Organization and Design, second ed. Morgan Kaufmann, 619. Google Scholar
Sastry, S., Bodik, R., and Smith, J. E. 2001. Rapid profiling via stratified sampling. In 28th. International Symposium on Computer Architecture. 278--289. Google Scholar
Shaikh, A., Rexford, J., and Shin, K. G. 1999. Load-sensitive routing of long-lived ip flows. In Proceedings of the ACM SIGCOMM. Google Scholar
Shenker, S., Clark, D., Estrin, D., and Herzog, S. 1996. Pricing in computer networks: Reshaping the research agenda. In ACM Comput. Comm. Rev. 26, 19--43. Google ScholarDigital Library
Smitha, Kim, I., and Reddy, A. L. N. 2001. Identifying long term high rate flows at a router. In Proceedings of High Performance Computing. Google Scholar
Thomson, K., Miller, G. J., and Wilder, R. 1997. Wide-area traffic patterns and characteristics. In IEEE Network. Google Scholar
Tong, D. and Reddy, A. L. N. 1999. Qos enhancement with partial state. In International Workshop on QOS.Google Scholar

Index Terms

New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice
1. General and reference
  1. Cross-computing tools and techniques
    1. Measurement
    2. Metrics
2. Networks
  1. Network services
    1. Network monitoring

Recommendations

New directions in traffic measurement and accounting
SIGCOMM '02: Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications

Accurate network traffic measurement is required for accounting, bandwidth provisioning and detecting DoS attacks. These applications see the traffic as a collection of flows they need to measure. As link speeds and the number of flows increase, keeping ...
Read More
New directions in traffic measurement and accounting
Proceedings of the 2002 SIGCOMM conference

Accurate network traffic measurement is required for accounting, bandwidth provisioning and detecting DoS attacks. These applications see the traffic as a collection of flows they need to measure. As link speeds and the number of flows increase, keeping ...
Read More
New directions in traffic measurement and accounting
IMW '01: Proceedings of the 1st ACM SIGCOMM Workshop on Internet measurement

Accurate network traffic measurement is required for accounting, bandwidth provisioning, and detecting DOS attacks. However, keeping a counter to measure the traffic sent by each of a million concurrent flows is too expensive (using SRAM) or slow (using ...
Read More

Reviews

Reviewer: Ferdi W. J. Put

A phenomenon recently observed in traffic measurement on the Internet is that a small fraction of large flows accounts for a large share of the traffic. Because these so-called elephants deteriorate the performance experienced by the many short flows (mice), a volume charge could be imposed on those large flows. As a consequence, the measurement process can be concentrated on large flows only, such that it is not necessary to keep a counter for each flow (which would be too expensive or too slow). The biggest problem is to identify large flows: how does the measurement device know which flows to track__?__ For this purpose, the authors present two algorithms with similar performance: sample and hold (easier to implement); and multistage filters (more accurate). Because both algorithms access memory for each packet, they must use fast static random access memory (SRAM) to keep up with line speeds. The basic idea of sample and hold is to sample each packet with a certain probability. If the corresponding flow has no entry in memory, a new entry is created. After that, this entry is updated for every subsequent packet belonging to that flow. The sampling probability is determined in function of the threshold above, in which flows are considered large. Multistage filters operate according to the following principle. A stage has a table indexed by a hash function. When a packet comes in, a hash is computed, and the size of the packet is added to the corresponding counter. If the counter exceeds a threshold, we consider this flow large. Because many flows can be mapped to the same counter and can be falsely identified as large flows, multiple stages with separate counter tables and independent hash functions are used. The authors present several improvements of these basic algorithms to increase accuracy and to reduce memory requirements. Consequently, they give a very extensive theoretical and experimental analysis, and a comparison with Cisco's Sampled NetFlow. Sampled NetFlow does not process every packet, but samples every x th packet, and can afford to use large amounts of inexpensive but slow dynamic random access memory (DRAM). The two new algorithms outperform NetFlow in the sense that they detect new large flows fast, and provide accurate results (exact values and provable lower bounds). Sampled NetFlow gives better results only if the threshold is set very low (medium flows are also considered large). Another advantage of Sampled NetFlow is that it allows an a posteriori analysis of patterns in sampled data, while the new algorithms require an a priori definition of flows and patterns we are interested in. Online Computing Reviews Service

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Transactions on Computer Systems Volume 21, Issue 3
August 2003
106 pages
ISSN:0734-2071
EISSN:1557-7333
DOI:10.1145/859716
Issue’s Table of Contents

Copyright © 2003 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 August 2003
Published in tocs Volume 21, Issue 3

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Network traffic measurement
identifying large flows
on-line algorithms
scalability
usage based accounting
Qualifiers
- article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 439
  Total Citations
  View Citations
- 2,702
  Total Downloads
- Downloads (Last 12 months)95
- Downloads (Last 6 weeks)20
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice

ACM Transactions on Computer Systems

Abstract

References

Cited By

Index Terms

Recommendations

New directions in traffic measurement and accounting

New directions in traffic measurement and accounting

New directions in traffic measurement and accounting

Reviews

Access critical reviews of Computing literature here

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice

ACM Transactions on Computer Systems

Abstract

References

Cited By

Index Terms

Recommendations

New directions in traffic measurement and accounting

New directions in traffic measurement and accounting

New directions in traffic measurement and accounting

Reviews

Access critical reviews of Computing literature here

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media