ABSTRACT
Context-sensitive pointer analysis algorithms with full "heap cloning" are powerful but are widely considered to be too expensive to include in production compilers. This paper shows, for the first time, that a context-sensitive, field-sensitive algorithm with full heap cloning (by acyclic call paths) can indeed be both scalable and extremely fast in practice. Overall, the algorithm is able to analyze programs in the range of 100K-200K lines of C code in 1-3 seconds, takes less than 5% of the time it takes for GCC to compile the code (which includes no whole-program analysis), and scales well across five orders of magnitude of code size. It is also able to analyze the Linux kernel (about 355K lines of code) in 3.1 seconds. The paper describes the major algorithmic and engineering design choices that are required to achieve these results, including (a) using flow-insensitive and unification-based analysis, which are essential to avoid exponential behavior in practice; (b) sacrificing context-sensitivity within strongly connected components of the call graph; and (c) carefully eliminating several kinds of O(N^2) behaviors (largely without affecting precision). The techniques used for (b) and (c) eliminated several major bottlenecks to scalability, and both are generalizable to other context-sensitive algorithms. We show that the engineering choices collectively reduce analysis time by factors of up to 10x-15x in our larger programs, and have found that the savings grow strongly with program size. Finally, we briefly summarize results demonstrating the precision of the analysis.
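To make design choice (a) more concrete, the following is a minimal sketch of a flow-insensitive, unification-based points-to analysis in the style of Steensgaard's algorithm, built on union-find. It is not the algorithm of this paper, which is additionally field-sensitive and context-sensitive with heap cloning. All names here (`Node`, `find`, `unify`, `pointee_of`, `address_of`, `copy`, `may_alias`) are hypothetical and chosen for illustration only.

```python
class Node:
    """An equivalence class of memory objects with at most one points-to edge."""

    def __init__(self, name):
        self.name = name
        self.parent = self    # union-find parent; a root points to itself
        self.pointee = None   # the single class this class may point to


def find(n):
    """Return the representative of n's class (with path halving)."""
    while n.parent is not n:
        n.parent = n.parent.parent
        n = n.parent
    return n


def unify(a, b):
    """Merge the classes of a and b, then merge their pointees as well."""
    a, b = find(a), find(b)
    if a is b:
        return a
    b.parent = a                      # link first so cyclic graphs terminate
    pa, pb = a.pointee, b.pointee
    if pa is None:
        a.pointee = pb
    elif pb is not None:
        unify(pa, pb)                 # one pointee per class: merge both targets
    return a


def pointee_of(n):
    """Return the class n points to, creating a placeholder if none exists."""
    n = find(n)
    if n.pointee is None:
        n.pointee = Node("tmp")
    return find(n.pointee)


def address_of(p, x):
    """Model the statement p = &x."""
    unify(pointee_of(p), x)


def copy(p, q):
    """Model the statement p = q (unification treats it symmetrically)."""
    unify(pointee_of(p), pointee_of(q))


def may_alias(p, q):
    """Two pointers may alias if their targets fall in the same class."""
    return pointee_of(p) is pointee_of(q)


# Illustrative program: p = &x; q = &y; r = p; r = q; s = &z
p, q, r, s, x, y, z = (Node(name) for name in "pqrsxyz")
address_of(p, x)
address_of(q, y)
copy(r, p)
copy(r, q)
address_of(s, z)

p_q_alias = may_alias(p, q)   # True: r forces the targets of p and q to merge
p_s_alias = may_alias(p, s)   # False: z is never merged with x or y
```

Because every class has at most one outgoing edge, each statement is processed once with near-constant-time union-find operations. The price is precision: after `r = p; r = q`, `x` and `y` become indistinguishable, which an inclusion-based analysis would avoid.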
Making context-sensitive points-to analysis with heap cloning practical for the real world
Proceedings of the 2007 PLDI conference