DOI: 10.1145/1250734.1250766

Making context-sensitive points-to analysis with heap cloning practical for the real world

Published: 10 June 2007

ABSTRACT

Context-sensitive pointer analysis algorithms with full "heap cloning" are powerful but are widely considered to be too expensive to include in production compilers. This paper shows, for the first time, that a context-sensitive, field-sensitive algorithm with full heap cloning (by acyclic call paths) can indeed be both scalable and extremely fast in practice. Overall, the algorithm is able to analyze programs in the range of 100K-200K lines of C code in 1-3 seconds, takes less than 5% of the time it takes for GCC to compile the code (which includes no whole-program analysis), and scales well across five orders of magnitude of code size. It is also able to analyze the Linux kernel (about 355K lines of code) in 3.1 seconds. The paper describes the major algorithmic and engineering design choices that are required to achieve these results, including (a) using flow-insensitive and unification-based analysis, which are essential to avoid exponential behavior in practice; (b) sacrificing context-sensitivity within strongly connected components of the call graph; and (c) carefully eliminating several kinds of O(N^2) behaviors (largely without affecting precision). The techniques used for (b) and (c) eliminated several major bottlenecks to scalability, and both are generalizable to other context-sensitive algorithms. We show that the engineering choices collectively reduce analysis time by factors of up to 10x-15x in our larger programs, and have found that the savings grow strongly with program size. Finally, we briefly summarize results demonstrating the precision of the analysis.
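To make design choice (a) concrete, the following is a minimal sketch of what a flow-insensitive, unification-based points-to analysis does, in the general style of Steensgaard's approach. It is an illustration only and not the algorithm described in the paper: it has no field sensitivity, no context sensitivity, and no heap cloning. The class and method names (`UnionFind`, `UnificationAnalysis`, `address_of`, `copy`, `load`, `store`, `may_alias`) are hypothetical and chosen for this example.

```python
class UnionFind:
    """Disjoint sets with path compression (hypothetical helper)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root


class UnificationAnalysis:
    """Each equivalence class of locations has at most one pointee class.

    Statements are processed in any order (flow-insensitive); an assignment
    merges the pointee classes of both sides instead of copying sets.
    """

    def __init__(self):
        self.uf = UnionFind()
        self.pointee = {}  # class representative -> some node in its pointee class
        self._fresh = 0

    def _target(self, x):
        """Representative of the class that x's class points to."""
        r = self.uf.find(x)
        if r not in self.pointee:
            self._fresh += 1
            self.pointee[r] = f"$t{self._fresh}"  # placeholder pointee
        return self.uf.find(self.pointee[r])

    def _unify(self, a, b):
        a, b = self.uf.find(a), self.uf.find(b)
        if a == b:
            return
        ta = self.pointee.pop(a, None)
        tb = self.pointee.pop(b, None)
        self.uf.parent[a] = b  # merge the two classes
        if ta is not None and tb is not None:
            self.pointee[b] = ta
            self._unify(ta, tb)  # merged classes must share one pointee
        elif ta is not None or tb is not None:
            self.pointee[b] = ta if ta is not None else tb

    def address_of(self, p, x):  # p = &x
        self._unify(self._target(p), x)

    def copy(self, p, q):  # p = q
        self._unify(self._target(p), self._target(q))

    def load(self, p, q):  # p = *q
        self._unify(self._target(p), self._target(self._target(q)))

    def store(self, p, q):  # *p = q
        self._unify(self._target(self._target(p)), self._target(q))

    def may_alias(self, p, q):
        return self._target(p) == self._target(q)
```

In this sketch, after `address_of("p", "x")` and `address_of("q", "y")` the two pointers are reported as non-aliasing, while a subsequent `copy("p", "q")` merges the classes of `x` and `y`, so `p` and `q` may alias. Merging rather than propagating sets is what keeps each statement close to constant time, at the cost of some precision.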


Published in

PLDI '07: Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2007, 508 pages
ISBN: 9781595936332
DOI: 10.1145/1250734

ACM SIGPLAN Notices, Volume 42, Issue 6
Proceedings of the 2007 PLDI conference
June 2007, 491 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1273442

Copyright © 2007 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher: Association for Computing Machinery, New York, NY, United States


Overall Acceptance Rate: 406 of 2,067 submissions, 20%
