Abstract
In this article, we present a general method for achieving global static analyzers that are precise and sound, yet also scalable. Our method, on top of the abstract interpretation framework, is a general sparse analysis technique that supports relational as well as nonrelational semantics properties for various programming languages. Analysis designers first use the abstract interpretation framework to have a global and correct static analyzer whose scalability is unattended. Upon this underlying sound static analyzer, analysis designers add our generalized sparse analysis techniques to improve its scalability while preserving the precision of the underlying analysis. Our method prescribes what to prove to guarantee that the resulting sparse version should preserve the precision of the underlying analyzer.
We formally present our framework and show that existing sparse analyses are all restricted instances of our framework. In addition, we show more semantically elaborate design examples of sparse nonrelational and relational static analyses. We then present their implementation results that scale to globally analyze up to one million lines of C programs. We also show a set of implementation techniques that turn out to be critical to economically support the sparse analysis process.
Supplemental Material
Available for Download
Supplemental movie, appendix, image and software files for, Lazy Scheduling: A Runtime Adaptive Scheduler for Declarative Parallelism
- Xavier Allamigeon, Wenceslas Godard, and Charles Hymans. 2006. Static analysis of string manipulations in critical embedded C programs. In Proceedings of the International Symposium on Static Analysis. 35--51. Google ScholarDigital Library
- Gogul Balakrishnan and Thomas Reps. 2004. Analyzing memory accesses in ×86 binary executables. In Proceedings of the International Conference on Compiler Construction. 5--23.Google ScholarCross Ref
- Bruno Blanchet, Patrick Cousot, Radhia Cousot, Jerome Feret, Laurent Mauborgne, Antoine Miné, David Monniaux, and Xavier Rival. 2003. A static analyzer for large safety-critical software. In Proceedings of the ACM SIGPLAN-SIGACT Conference on Programming Language Design and Implementation. 196--207. Google ScholarDigital Library
- Francois Bourdoncle. 1993. Efficient chaotic iteration strategies with widenings. In Proceedings of the International Conference on Formal Methods in Programming and Their Applications. 128--141.Google ScholarCross Ref
- David R. Chase, Mark Wegman, and F. Kenneth Zadeck. 1990. Analysis of pointers and structures. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation. 296--310. Google ScholarDigital Library
- Jong-Deok Choi, Ron Cytron, and Jeanne Ferrante. 1991. Automatic construction of sparse data flow evaluation graphs. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 55--66. Google ScholarDigital Library
- Jong-Deok Choi, Vivek Sarkar, and Edith Schonberg. 1996. Incremental computation of static single assignment form. In Proceedings of the 6th International Conference on Compiler Construction (CC'96). Springer-Verlag, London, UK, 223--237. http://dl.acm.org/citation.cfm?id=647473.727451. Google ScholarDigital Library
- Patrick Cousot and Radhia Cousot. 1977. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 238--252. Google ScholarDigital Library
- Patrick Cousot and Radhia Cousot. 1979. Systematic design of program analysis frameworks. In Conference Record of the 6th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York, NY, 269--282. Google ScholarDigital Library
- Patrick Cousot and Radhia Cousot. 1992. Abstract interpretation frameworks. Journal of Logic and Computation 2, 4, 511--547.Google ScholarCross Ref
- Patrick Cousot, Radhia Cousot, Jerome Feret, Laurent Mauborgne, Antoine Miné, and Xavier Rival. 2009. Why does Astre scale up? Formal Methods in System Design 35, 3, 229--264. DOI:http://dx.doi.org/10.1007/s10703-009-0089-6 Google ScholarDigital Library
- Patrick Cousot and Nicolas Halbwachs. 1978. Automatic discovery of linear restraints among variables of a program. In Conference Record of the 5th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York, NY, 84--97. Google ScholarDigital Library
- Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems 13, 4, 451--490. Google ScholarDigital Library
- Ron K. Cytron and Jeanne Ferrante. 1995. Efficiently computing &phis;-nodes on-the-fly. ACM Transactions on Programming Languages and Systems 17, 3, 487--506. Google ScholarDigital Library
- Dhananjay M. Dhamdhere, Barry K. Rosen, and F. Kenneth Zadeck. 1992. How to analyze large programs efficiently and informatively. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'92). ACM, New York, NY, 212--223. Google ScholarDigital Library
- Isil Dillig, Thomas Dillig, and Alex Aiken. 2008. Sound, complete and scalable path-sensitive analysis. In Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'08). ACM, New York, NY, 270--280. DOI:http://dx.doi.org/10.1145/1375581.1375615 Google ScholarDigital Library
- Isil Dillig, Thomas Dillig, and Alex Aiken. 2011. Precise reasoning for programs using containers. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'11). ACM, New York, NY, 187--200. DOI:http://dx.doi.org/10.1145/1926385.1926407 Google ScholarDigital Library
- Azadeh Farzan and Zachary Kincaid. 2012. Verification of parameterized concurrent programs by modular reasoning about data and control. In Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'12). ACM, New York, NY, 297--308. DOI:http://dx.doi.org/10.1145/2103656.2103693 Google ScholarDigital Library
- Azadeh Farzan, Zachary Kincaid, and Andreas Podelski. 2013. Inductive data flow graphs. In Proceedings of the 40th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'13). ACM, New York, NY, 129--142. DOI:http://dx.doi.org/10.1145/2429069.2429086 Google ScholarDigital Library
- Ben Hardekopf and Calvin Lin. 2007. The ant and the grasshopper: Fast and accurate pointer analysis for millions of lines of code. In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'07). ACM, New York, NY, 290--299. Google ScholarDigital Library
- Ben Hardekopf and Calvin Lin. 2009. Semi-sparse flow-sensitive pointer analysis. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 226--238. Google ScholarDigital Library
- Ben Hardekopf and Calvin Lin. 2011. Flow-sensitive pointer analysis for millions of lines of code. In Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization. 289--298. Google ScholarDigital Library
- Michael Hind and Anthony Pioli. 1998. Assessing the effects of flow-sensitivity on pointer alias analyses. In Proceedings of the International Symposium on Static Analysis. 57--81. Google ScholarDigital Library
- Bertrand Jeannet and Antoine Miné. 2009. APRON: A library of numerical abstract domains for static analysis. In Proceedings of the 21st International Conference on Computer Aided Verification (CAV'09). 661--667. Google ScholarDigital Library
- Yongin Jhee, Minsik Jin, Yungbum Jung, Deokhwan Kim, Soonho Kong, Heejong Lee, Hakjoo Oh, Daejun Park, and Kwangkeun Yi. 2008. Abstract interpretation + impure catalysts: Our sparrow experience. Presentation at the Workshop of the 30 Years of Abstract Interpretation. Available at ropas. snu.ac.kr/~kwang/paper/30yai-08.pdf.Google Scholar
- Richard Johnson and Keshav Pingali. 1993. Dependence-based program analysis. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation. 78--89. Google ScholarDigital Library
- Yungbum Jung, Jaehwang Kim, Jaeho Shin, and Kwangkeun Yi. 2005. Taming false alarms from a domain-unaware C analyzer by a Bayesian statistical post analysis. In Proceedings of the International Symposium on Static Analysis. 203--217. Google ScholarDigital Library
- John B. Kam and Jeffrey D. Ullman. 1977. Monotone data flow analysis frameworks. Acta Informatica 7, 3, 305--317. Google ScholarDigital Library
- Chris Lattner, Andrew Lenharth, and Vikram Adve. 2007. Making context-sensitive points-to analysis with heap cloning practical for the real world. In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'07). Google ScholarDigital Library
- Woosuk Lee, Wonchan Lee, and Kwangkeun Yi. 2012. Sound non-statistical clustering of static analysis alarms. In Proceedings of the 13th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI'12). Lecture Notes in Computer Science, Vol. 7148. Springer, 299--314. Google ScholarDigital Library
- Lian Li, Cristina Cifuentes, and Nathan Keynes. 2011. Boosting the performance of flow-sensitive points-to analysis using value flow. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering (ESEC/FSE'11). ACM, New York, NY, 343-- 353. Google ScholarDigital Library
- Jorn Lind-Nielson. n.d. BuDDy, a Binary Decision Diagram Package. Retrieved July 27, 2014, from http://vlsicad.eecs.umich.edu/BK/Slots/cache/www.itu.dk/research/buddy/.Google Scholar
- MathWorks. n.d. Polyspace Embedded Software Verification. Retrieved July 27, 2014, from http://www.mathworks.com/products/polyspace/index.html.Google Scholar
- Laurent Mauborgne and Xavier Rival. 2005. Trace partitioning in abstract interpretation based static analyzers. In Proceedings of the European Symposium on Programming. Lecture Notes in Computer Science, Vol. 3444. Springer, 5--20. Google ScholarDigital Library
- Matthew Might and Olin Shivers. 2006. Improving flow analyses via ΓCFA: Abstract garbage collection and counting. In Proceedings of the ACM SIGPLAN-SIGACT International Conference on Functional Programming. 13--25. Google ScholarDigital Library
- Ana Milanova, Atanas Rountev, and Barbara G. Ryder. 2004. Precise and efficient call graph construction for C programs with function pointers. Journal of Automated Software Engineering 11, 1, 7--26. Google ScholarDigital Library
- Antoine Miné. 2006a. Field-sensitive value analysis of embedded C programs with union types and pointer arithmetics. In Proceedings of the ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'06). ACM, New York, NY, 54--63. Google ScholarDigital Library
- Antoine Miné. 2006b. The octagon abstract domain. Higher-Order and Symbolic Computation 19, 1, 31-- 100. Google ScholarDigital Library
- Hakjoo Oh. 2009. Large spurious cycle in global static analyses and its algorithmic mitigation. In Proceedings of the Asian Symposium on Programming Languages and Systems. Lecture Notes in Computer Science, Vol. 5904. Springer, 14--29. Google ScholarDigital Library
- Hakjoo Oh, Lucas Brutschy, and Kwangkeun Yi. 2011. Access analysis-based tight localization of abstract memories. In Proceedings of the 12th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI'11). Lecture Notes in Computer Science, Vol. 6538. Springer, 356--370. Google ScholarDigital Library
- Hakjoo Oh, Kihong Heo, Wonchan Lee, Woosuk Lee, and Kwangkeun Yi. 2012. Design and implementation of sparse global analyses for C-like languages. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation. Google ScholarDigital Library
- Hakjoo Oh and Kwangkeun Yi. 2010. An algorithmic mitigation of large spurious interprocedural cycles in static analysis. Software: Practice and Experience 40, 8, 585--603. DOI:http://dx.doi.org/10.1002/spe.969 Google ScholarCross Ref
- Hakjoo Oh and Kwangkeun Yi. 2011. Access-based localization with bypassing. In Proceedings of the 9th Asian Symposium on Programming Languages and Systems (APLAS'11). Lecture Notes in Computer Science, Vol. 7078. Springer, 50--65. Google ScholarDigital Library
- Ganesan Ramalingam. 2002. On sparse evaluation representations. Theoretical Computer Science 277, 1--2, 119--147. Google ScholarDigital Library
- John H. Reif and Harry R. Lewis. 1977. Symbolic evaluation and the global value graph. In Proceedings of the 4th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 104--118. Google ScholarDigital Library
- Noam Rinetzky, Jörg Bauer, Thomas Reps, Mooly Sagiv, and Reinhard Wilhelm. 2005. A semantics for procedure local heaps and its abstractions. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 296--309. Google ScholarDigital Library
- Teck Bok Tok, Samuel Z. Guyer, and Calvin Lin. 2006. Efficient flow-sensitive interprocedural data-flow analysis in the presence of pointers. In Proceedings of the International Conference on Compiler Construction. 17--31. Google ScholarDigital Library
- Arnaud Venet and Guillaume Brat. 2004. Precise and efficient static array bound checking for large embedded C programs. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation (PLDI'04). ACM, New York, NY, 231--242. Google ScholarDigital Library
- Mark N. Wegman and F. Kenneth Zadeck. 1991. Constant propagation with conditional branches. ACM Transactions on Programming Languages and Systems 13, 2, 181--210. Google ScholarDigital Library
- Hongseok Yang, Oukseh Lee, Josh Berdine, Cristiano Calcagno, Byron Cook, Dino Distefano, and Peter O'Hearn. 2008. Scalable shape analysis for systems code. In Proceedings of the International Conference on Computer Aided Verification. 385--398. Google ScholarDigital Library
- Hongtao Yu, Jingling Xue, Wei Huo, Xiaobing Feng, and Zhaoqing Zhang. 2010. Level by level: Making flow- and context-sensitive pointer analysis scalable for millions of lines of code. In Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'10). ACM, New York, NY, 218--229. Google ScholarDigital Library
- Misha Zitser, D. E. Shaw Group, and Tim Leek. 2004. Testing static analysis tools using exploitable buffer overflows from open source code. In Proceedings of the 12th ACM SIGSOFT International Symposium on Foundations of Software Engineering (SIGSOFT'04/FSE-12). ACM, New York NY, 97--106. Google ScholarDigital Library
Index Terms
- Global Sparse Analysis Framework
Recommendations
Selective X-Sensitive Analysis Guided by Impact Pre-Analysis
We present a method for selectively applying context-sensitivity during interprocedural program analysis. Our method applies context-sensitivity only when and where doing so is likely to improve the precision that matters for resolving given queries. The ...
Design and implementation of sparse global analyses for C-like languages
PLDI '12: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and ImplementationIn this article we present a general method for achieving global static analyzers that are precise, sound, yet also scalable. Our method generalizes the sparse analysis techniques on top of the abstract interpretation framework to support relational as ...
Design and implementation of sparse global analyses for C-like languages
PLDI '12In this article we present a general method for achieving global static analyzers that are precise, sound, yet also scalable. Our method generalizes the sparse analysis techniques on top of the abstract interpretation framework to support relational as ...
Comments