ABSTRACT
Unlike benchmarks that focus on performance or reliability evaluations, a benchmark for computer security must necessarily include sensitive code and data. Because these artifacts could damage systems or reveal personally identifiable information about the users affected by cyber attacks, publicly disseminating such a benchmark raises several scientific, ethical, and legal challenges. We propose the Worldwide Intelligence Network Environment (WINE), a security-benchmarking approach based on rigorous experimental methods. WINE includes representative field data, collected worldwide from 240,000 sensors, for new empirical studies, and it will enable the validation of research on all phases in the lifecycle of security threats. We tackle the key challenges of security benchmarking by designing a platform for repeatable experimentation on the WINE data sets and by collecting the metadata required for understanding the results. In this paper, we review the unique characteristics of the WINE data, discuss why rigorous benchmarking will provide fresh insights into the security arms race, and propose a research agenda for this area.
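To illustrate what repeatable experimentation on a shared security data set might involve, the sketch below records the metadata an independent rerun would need: the data-set snapshot, the exact analysis query, and the execution environment. This is a minimal, hypothetical example under our own assumptions; the names (ExperimentRecord, run_experiment) and the snapshot/query conventions are illustrative and are not part of the actual WINE platform.

```python
# Hypothetical sketch of experiment provenance for a shared security
# data set. All identifiers here are illustrative assumptions.
import hashlib
import json
import platform
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class ExperimentRecord:
    dataset: str       # which data set was analyzed (e.g., a telemetry feed)
    snapshot_id: str   # immutable snapshot, so reruns see identical data
    query: str         # the exact analysis query that was executed
    started_at: str    # wall-clock start time, for provenance
    environment: str   # platform details that may affect the computation


def run_experiment(dataset: str, snapshot_id: str, query: str) -> ExperimentRecord:
    """Execute `query` against a fixed data-set snapshot and log enough
    metadata that an independent researcher can repeat the run."""
    record = ExperimentRecord(
        dataset=dataset,
        snapshot_id=snapshot_id,
        query=query,
        started_at=datetime.now(timezone.utc).isoformat(),
        environment=platform.platform(),
    )
    # ... execute the query against the snapshot here ...
    # Persist the record alongside the results; hashing it yields a
    # compact identifier for this run that can be cited with the results.
    blob = json.dumps(asdict(record), sort_keys=True).encode()
    print("experiment id:", hashlib.sha256(blob).hexdigest()[:12])
    return record


if __name__ == "__main__":
    run_experiment(
        dataset="binary-reputation",
        snapshot_id="2011-03-01",
        query="SELECT COUNT(*) FROM submissions WHERE first_seen >= '2011-01-01'",
    )
```

Pinning each query to an immutable snapshot is what makes a rerun meaningful: two researchers who share the record should obtain identical inputs, even as the live data set continues to grow.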