DOI: 10.1145/2746194.2746198
Research article, HotSoS Conference Proceedings

Challenges with applying vulnerability prediction models

Published: 21 April 2015

ABSTRACT

Vulnerability prediction models (VPMs) are believed to hold promise for providing software engineers guidance on where to prioritize precious verification resources to search for vulnerabilities. However, while Microsoft product teams have adopted defect prediction models, they have not adopted VPMs. The goal of this research is to measure whether vulnerability prediction models built using standard recommendations perform well enough to provide actionable results for engineering resource allocation. We define 'actionable' in terms of the inspection effort required to evaluate model results. We replicated a VPM for two releases of the Windows Operating System, varying model granularity and statistical learners. We reproduced binary-level prediction precision (~0.75) and recall (~0.2). However, binaries often exceed 1 million lines of code, too large to inspect practically, and engineers expressed a preference for source-file-level predictions. Our source-file-level models yield precision below 0.5 and recall below 0.2. We suggest that VPMs must be refined to achieve actionable performance, possibly through security-specific metrics.
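
The evaluation the abstract summarizes follows a common recipe: compute code metrics for each unit (binary or source file), train a statistical learner to flag units as likely vulnerable, and score the predictions by precision and recall against known vulnerabilities, keeping in mind how much code an engineer would then have to inspect. The sketch below illustrates only those mechanics and is not the paper's pipeline; the metric names, the synthetic data, and the scikit-learn logistic-regression learner are placeholder assumptions.

    # Illustrative sketch, not the study's pipeline: a file-level vulnerability
    # prediction model on hypothetical metrics, scored by precision and recall.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import precision_score, recall_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Hypothetical per-file metrics: lines of code, churn, cyclomatic complexity.
    X = rng.lognormal(mean=3.0, sigma=1.0, size=(5000, 3))
    # Vulnerable files are rare (a few percent) and loosely tied to churn here,
    # mimicking the class imbalance that makes high recall hard to reach.
    risk = np.where(X[:, 1] > np.percentile(X[:, 1], 90), 0.12, 0.02)
    y = rng.random(5000) < risk

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)

    model = LogisticRegression(class_weight="balanced", max_iter=1000)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)

    # Precision: of the files flagged, how many are truly vulnerable?
    # Recall: of the truly vulnerable files, how many were flagged?
    print("precision:", precision_score(y_test, pred, zero_division=0))
    print("recall:   ", recall_score(y_test, pred))
    # Actionability in the abstract's sense: how much code would engineers
    # have to read in order to act on the flags?
    print("LOC to inspect:", int(X_test[pred, 0].sum()))

With source-file-level precision below 0.5, as the abstract reports, more than half of the flagged files would be false positives during such an inspection, which is the sense in which the authors find the models not yet actionable.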


Published in

HotSoS '15: Proceedings of the 2015 Symposium and Bootcamp on the Science of Security
April 2015, 170 pages
ISBN: 9781450333764
DOI: 10.1145/2746194
General Chair: David Nicol

Copyright © 2015 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance rates: HotSoS '15 accepted 13 of 22 submissions (59%); the overall acceptance rate is 34 of 60 submissions (57%).
