ABSTRACT
Static analyses are generally parametrized by an abstraction which is chosen from a family of abstractions. We are interested in flexible families of abstractions with many parameters, as these families can allow one to increase precision in ways tailored to the client without sacrificing scalability. For example, we consider k-limited points-to analyses where each call site and allocation site in a program can have a different k value. We then ask a natural question in this paper: What is the minimal (coarsest) abstraction in a given family which is able to prove a set of queries? In addressing this question, we make the following two contributions: (i) We introduce two machine learning algorithms for efficiently finding a minimal abstraction; and (ii) for a static race detector backed by a k-limited points-to analysis, we show empirically that minimal abstractions are actually quite coarse: It suffices to provide context/object sensitivity to a very small fraction (0.4-2.3%) of the sites to yield equally precise results as providing context/object sensitivity uniformly to all sites.
Supplemental Material
- D. Angluin. Queries and concept learning. Machine Learning, 2(4):319--342, 1988. Google ScholarCross Ref
- T. Ball and S. Rajamani. The SLAM project: debugging system software via static analysis. In Proceedings of ACM Symp. on Principles of Programming Languages (POPL), pages 1--3, 2002. Google ScholarDigital Library
- T. Ball, R. Majumdar, T. Millstein, and S. Rajamani. Automatic predicate abstraction of C programs. In Proceedings of ACM Conf. on Programming Language Design and Imple-mentation (PLDI), pages 203--213, 2001. Google ScholarDigital Library
- D. Donoho. Compressed sensing. IEEE Trans. on Information Theory, 52(4):1289--1306, 2006. Google ScholarDigital Library
- S. Graf and H. Saidi. Construction of abstract state graphs with PVS. pages 72--83, 1997. Google ScholarDigital Library
- S. Gulwani. Program Analysis using Random Interpretation. PhD thesis, University of California, Berkeley, 2005. Google ScholarDigital Library
- S. Guyer and C. Lin. Client-driven pointer analysis. In Proceedings of Intl. Static Analysis Symposium, pages 214--236, 2003. Google ScholarDigital Library
- D. Hamlet. Random testing. In Encyclopedia of Software Engineering, pages 970--978, 1994.Google Scholar
- N. Heintze and O. Tardieu. Demand-driven pointer analysis. In Proceedings of ACM Conf. on Programming Language Design and Implementation (PLDI), pages 24--34, 2001. Google ScholarDigital Library
- O. Lhoták and L. Hendren. Context-sensitive points-to analysis: is it worth it? In Proceedings of Intl. Conf. on Compiler Construction, pages 47--64, 2006. Google ScholarDigital Library
- O. Lhoták and L. Hendren. Evaluating the benefits of context-sensitive points-to analysis using a BDD-based implemen-tation. ACM Transactions on Software Engineering and Methodology, 18(1):1--53, 2008. Google ScholarDigital Library
- A. Milanova, A. Rountev, and B. Ryder. Parameterized object sensitivity for points-to and side-effect analyses for Java. In Proceedings of ACM Intl. Symp. on Software Testing and Analysis, pages 1--11, 2002. Google ScholarDigital Library
- A. Milanova, A. Rountev, and B. Ryder. Parameterized object sensitivity for points-to analysis for Java. ACM Transactions on Software Engineering and Methodology, 14(1):1--41, 2005. Google ScholarDigital Library
- M. Naik, A. Aiken, and J. Whaley. Effective static race detection for Java. In Proceedings of ACM Conf. on Programming Language Design and Implementation (PLDI), pages 308--319. Google ScholarDigital Library
- J. Plevyak and A. Chien. Precise concrete type inference for object-oriented languages. In Proceedings of ACM Conf. on Object-Oriented Programming, Systems, Languages, and Applications, pages 324--340. Google ScholarDigital Library
- T. W. Reps. Demand interprocedural program analysis using logic databases. In Workshop on Programming with Logic Databases, pages 163--196, 1993.Google Scholar
- T. W. Reps. Solving demand versions of interprocedural analysis problems. In Proceedings of Intl. Conf. on Compiler Construction, pages 389--403, 1994. Google ScholarDigital Library
- H. Robbins and S. Monro. A stochastic approximation method. Annals of Mathematical Statistics, 22(3):400--407, 1951.Google ScholarCross Ref
- M. Sagiv, T. W. Reps, and R. Wilhelm. Parametric shape analysis via 3-valued logic. ACM Transactions on Programming Languages and Systems, 24(3):217--298, 2002. Google ScholarDigital Library
- O. Shivers. Control-flow analysis in Scheme. In Proceedings of ACM Conf. on Programming Language Design and Imple-mentation (PLDI), pages 164--174, 1988. Google ScholarDigital Library
- M. Sridharan and R. Bodík. Refinement-based context-sensitive points-to analysis for Java. In Proceedings of ACM Conf. on Programming Language Design and Implementation, pages 387--400, 2006. Google ScholarDigital Library
- M. Sridharan, D. Gopan, L. Shan, and R. Bodík. Demand-driven points-to analysis for Java. In Proceedings of ACM Conf. on Object-Oriented Programming, Systems, Languages, and Applications, pages 59--76, 2005. Google ScholarDigital Library
- L. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134--1142, 1984. Google ScholarDigital Library
- M. J. Wainwright. Sharp thresholds for noisy and high-dimensional recovery of sparsity using '1-constrained quadratic programming (lasso). IEEE Transactions on Information Theory, 55:2183--2202, 2009. Google ScholarDigital Library
- J. Whaley. Context-Sensitive Pointer Analysis using Binary Decision Diagrams. PhD thesis, Stanford University, 2007. Google ScholarDigital Library
- J. Whaley and M. Lam. Cloning-based context-sensitive pointer alias analysis using binary decision diagrams. In Proceedings of ACM Conf. on Programming Language Design and Implementation (PLDI), pages 131--144, 2004. Google ScholarDigital Library
- X. Zheng and R. Rugina. Demand-driven alias analysis for C. In Proceedings of ACM Symp. on Principles of Programming Languages (POPL), pages 197--208, 1998. Google ScholarDigital Library
Index Terms
- Learning minimal abstractions
Recommendations
Learning minimal abstractions
POPL '11Static analyses are generally parametrized by an abstraction which is chosen from a family of abstractions. We are interested in flexible families of abstractions with many parameters, as these families can allow one to increase precision in ways ...
Machine-learning-guided selectively unsound static analysis
ICSE '17: Proceedings of the 39th International Conference on Software EngineeringWe present a machine-learning-based technique for selectively applying unsoundness in static analysis. Existing bug-finding static analyzers are unsound in order to be precise and scalable in practice. However, they are uniformly unsound and hence at ...
Machine-Learning-Guided Typestate Analysis for Static Use-After-Free Detection
ACSAC '17: Proceedings of the 33rd Annual Computer Security Applications ConferenceTypestate analysis relies on pointer analysis for detecting temporal memory safety errors, such as use-after-free (UAF). For large programs, scalable pointer analysis is usually imprecise in analyzing their hard "corner cases", such as infeasible paths, ...
Comments