research-article

Open Access

Global Sparse Analysis Framework

Authors:
Hakjoo Oh

Seoul National University, Seoul, Korea

Seoul National University, Seoul, Korea
View Profile

,
Kihong Heo

Seoul National University, Seoul, Korea

Seoul National University, Seoul, Korea
View Profile

,
Wonchan Lee

Seoul National University, Seoul, Korea

Seoul National University, Seoul, Korea
View Profile

,
Woosuk Lee

Seoul National University, Seoul, Korea

Seoul National University, Seoul, Korea
View Profile

,
Daejun Park

Seoul National University, Seoul, Korea

Seoul National University, Seoul, Korea
View Profile

,
Jeehoon Kang

Seoul National University, Seoul, Korea

Seoul National University, Seoul, Korea
View Profile

,
Kwangkeun Yi

Seoul National University, Seoul, Korea

Seoul National University, Seoul, Korea
View Profile

ACM Transactions on Programming Languages and Systems Volume 36 Issue 3Article No.: 8pp 1–44https://doi.org/10.1145/2590811

Published:25 September 2014Publication History

ACM Transactions on Programming Languages and Systems

Abstract

In this article, we present a general method for achieving global static analyzers that are precise and sound, yet also scalable. Our method, on top of the abstract interpretation framework, is a general sparse analysis technique that supports relational as well as nonrelational semantics properties for various programming languages. Analysis designers first use the abstract interpretation framework to have a global and correct static analyzer whose scalability is unattended. Upon this underlying sound static analyzer, analysis designers add our generalized sparse analysis techniques to improve its scalability while preserving the precision of the underlying analysis. Our method prescribes what to prove to guarantee that the resulting sparse version should preserve the precision of the underlying analyzer.

We formally present our framework and show that existing sparse analyses are all restricted instances of our framework. In addition, we show more semantically elaborate design examples of sparse nonrelational and relational static analyses. We then present their implementation results that scale to globally analyze up to one million lines of C programs. We also show a set of implementation techniques that turn out to be critical to economically support the sparse analysis process.

Supplemental Material

Available for Download

zip

tzannes.zip (450.1 KB)

Supplemental movie, appendix, image and software files for, Lazy Scheduling: A Runtime Adaptive Scheduler for Declarative Parallelism

References

Xavier Allamigeon, Wenceslas Godard, and Charles Hymans. 2006. Static analysis of string manipulations in critical embedded C programs. In Proceedings of the International Symposium on Static Analysis. 35--51. Google ScholarDigital Library
Gogul Balakrishnan and Thomas Reps. 2004. Analyzing memory accesses in ×86 binary executables. In Proceedings of the International Conference on Compiler Construction. 5--23.Google ScholarCross Ref
Bruno Blanchet, Patrick Cousot, Radhia Cousot, Jerome Feret, Laurent Mauborgne, Antoine Miné, David Monniaux, and Xavier Rival. 2003. A static analyzer for large safety-critical software. In Proceedings of the ACM SIGPLAN-SIGACT Conference on Programming Language Design and Implementation. 196--207. Google ScholarDigital Library
Francois Bourdoncle. 1993. Efficient chaotic iteration strategies with widenings. In Proceedings of the International Conference on Formal Methods in Programming and Their Applications. 128--141.Google ScholarCross Ref
David R. Chase, Mark Wegman, and F. Kenneth Zadeck. 1990. Analysis of pointers and structures. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation. 296--310. Google ScholarDigital Library
Jong-Deok Choi, Ron Cytron, and Jeanne Ferrante. 1991. Automatic construction of sparse data flow evaluation graphs. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 55--66. Google ScholarDigital Library
Jong-Deok Choi, Vivek Sarkar, and Edith Schonberg. 1996. Incremental computation of static single assignment form. In Proceedings of the 6th International Conference on Compiler Construction (CC'96). Springer-Verlag, London, UK, 223--237. http://dl.acm.org/citation.cfm&quest;id=647473.727451. Google ScholarDigital Library
Patrick Cousot and Radhia Cousot. 1977. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 238--252. Google ScholarDigital Library
Patrick Cousot and Radhia Cousot. 1979. Systematic design of program analysis frameworks. In Conference Record of the 6th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York, NY, 269--282. Google ScholarDigital Library
Patrick Cousot and Radhia Cousot. 1992. Abstract interpretation frameworks. Journal of Logic and Computation 2, 4, 511--547.Google ScholarCross Ref
Patrick Cousot, Radhia Cousot, Jerome Feret, Laurent Mauborgne, Antoine Miné, and Xavier Rival. 2009. Why does Astre scale up&quest; Formal Methods in System Design 35, 3, 229--264. DOI:http://dx.doi.org/10.1007/s10703-009-0089-6 Google ScholarDigital Library
Patrick Cousot and Nicolas Halbwachs. 1978. Automatic discovery of linear restraints among variables of a program. In Conference Record of the 5th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York, NY, 84--97. Google ScholarDigital Library
Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems 13, 4, 451--490. Google ScholarDigital Library
Ron K. Cytron and Jeanne Ferrante. 1995. Efficiently computing &phis;-nodes on-the-fly. ACM Transactions on Programming Languages and Systems 17, 3, 487--506. Google ScholarDigital Library
Dhananjay M. Dhamdhere, Barry K. Rosen, and F. Kenneth Zadeck. 1992. How to analyze large programs efficiently and informatively. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'92). ACM, New York, NY, 212--223. Google ScholarDigital Library
Isil Dillig, Thomas Dillig, and Alex Aiken. 2008. Sound, complete and scalable path-sensitive analysis. In Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'08). ACM, New York, NY, 270--280. DOI:http://dx.doi.org/10.1145/1375581.1375615 Google ScholarDigital Library
Isil Dillig, Thomas Dillig, and Alex Aiken. 2011. Precise reasoning for programs using containers. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'11). ACM, New York, NY, 187--200. DOI:http://dx.doi.org/10.1145/1926385.1926407 Google ScholarDigital Library
Azadeh Farzan and Zachary Kincaid. 2012. Verification of parameterized concurrent programs by modular reasoning about data and control. In Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'12). ACM, New York, NY, 297--308. DOI:http://dx.doi.org/10.1145/2103656.2103693 Google ScholarDigital Library
Azadeh Farzan, Zachary Kincaid, and Andreas Podelski. 2013. Inductive data flow graphs. In Proceedings of the 40th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'13). ACM, New York, NY, 129--142. DOI:http://dx.doi.org/10.1145/2429069.2429086 Google ScholarDigital Library
Ben Hardekopf and Calvin Lin. 2007. The ant and the grasshopper: Fast and accurate pointer analysis for millions of lines of code. In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'07). ACM, New York, NY, 290--299. Google ScholarDigital Library
Ben Hardekopf and Calvin Lin. 2009. Semi-sparse flow-sensitive pointer analysis. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 226--238. Google ScholarDigital Library
Ben Hardekopf and Calvin Lin. 2011. Flow-sensitive pointer analysis for millions of lines of code. In Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization. 289--298. Google ScholarDigital Library
Michael Hind and Anthony Pioli. 1998. Assessing the effects of flow-sensitivity on pointer alias analyses. In Proceedings of the International Symposium on Static Analysis. 57--81. Google ScholarDigital Library
Bertrand Jeannet and Antoine Miné. 2009. APRON: A library of numerical abstract domains for static analysis. In Proceedings of the 21st International Conference on Computer Aided Verification (CAV'09). 661--667. Google ScholarDigital Library
Yongin Jhee, Minsik Jin, Yungbum Jung, Deokhwan Kim, Soonho Kong, Heejong Lee, Hakjoo Oh, Daejun Park, and Kwangkeun Yi. 2008. Abstract interpretation + impure catalysts: Our sparrow experience. Presentation at the Workshop of the 30 Years of Abstract Interpretation. Available at ropas. snu.ac.kr/~kwang/paper/30yai-08.pdf.Google Scholar
Richard Johnson and Keshav Pingali. 1993. Dependence-based program analysis. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation. 78--89. Google ScholarDigital Library
Yungbum Jung, Jaehwang Kim, Jaeho Shin, and Kwangkeun Yi. 2005. Taming false alarms from a domain-unaware C analyzer by a Bayesian statistical post analysis. In Proceedings of the International Symposium on Static Analysis. 203--217. Google ScholarDigital Library
John B. Kam and Jeffrey D. Ullman. 1977. Monotone data flow analysis frameworks. Acta Informatica 7, 3, 305--317. Google ScholarDigital Library
Chris Lattner, Andrew Lenharth, and Vikram Adve. 2007. Making context-sensitive points-to analysis with heap cloning practical for the real world. In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'07). Google ScholarDigital Library
Woosuk Lee, Wonchan Lee, and Kwangkeun Yi. 2012. Sound non-statistical clustering of static analysis alarms. In Proceedings of the 13th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI'12). Lecture Notes in Computer Science, Vol. 7148. Springer, 299--314. Google ScholarDigital Library
Lian Li, Cristina Cifuentes, and Nathan Keynes. 2011. Boosting the performance of flow-sensitive points-to analysis using value flow. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering (ESEC/FSE'11). ACM, New York, NY, 343-- 353. Google ScholarDigital Library
Jorn Lind-Nielson. n.d. BuDDy, a Binary Decision Diagram Package. Retrieved July 27, 2014, from http://vlsicad.eecs.umich.edu/BK/Slots/cache/www.itu.dk/research/buddy/.Google Scholar
MathWorks. n.d. Polyspace Embedded Software Verification. Retrieved July 27, 2014, from http://www.mathworks.com/products/polyspace/index.html.Google Scholar
Laurent Mauborgne and Xavier Rival. 2005. Trace partitioning in abstract interpretation based static analyzers. In Proceedings of the European Symposium on Programming. Lecture Notes in Computer Science, Vol. 3444. Springer, 5--20. Google ScholarDigital Library
Matthew Might and Olin Shivers. 2006. Improving flow analyses via ΓCFA: Abstract garbage collection and counting. In Proceedings of the ACM SIGPLAN-SIGACT International Conference on Functional Programming. 13--25. Google ScholarDigital Library
Ana Milanova, Atanas Rountev, and Barbara G. Ryder. 2004. Precise and efficient call graph construction for C programs with function pointers. Journal of Automated Software Engineering 11, 1, 7--26. Google ScholarDigital Library
Antoine Miné. 2006a. Field-sensitive value analysis of embedded C programs with union types and pointer arithmetics. In Proceedings of the ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'06). ACM, New York, NY, 54--63. Google ScholarDigital Library
Antoine Miné. 2006b. The octagon abstract domain. Higher-Order and Symbolic Computation 19, 1, 31-- 100. Google ScholarDigital Library
Hakjoo Oh. 2009. Large spurious cycle in global static analyses and its algorithmic mitigation. In Proceedings of the Asian Symposium on Programming Languages and Systems. Lecture Notes in Computer Science, Vol. 5904. Springer, 14--29. Google ScholarDigital Library
Hakjoo Oh, Lucas Brutschy, and Kwangkeun Yi. 2011. Access analysis-based tight localization of abstract memories. In Proceedings of the 12th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI'11). Lecture Notes in Computer Science, Vol. 6538. Springer, 356--370. Google ScholarDigital Library
Hakjoo Oh, Kihong Heo, Wonchan Lee, Woosuk Lee, and Kwangkeun Yi. 2012. Design and implementation of sparse global analyses for C-like languages. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation. Google ScholarDigital Library
Hakjoo Oh and Kwangkeun Yi. 2010. An algorithmic mitigation of large spurious interprocedural cycles in static analysis. Software: Practice and Experience 40, 8, 585--603. DOI:http://dx.doi.org/10.1002/spe.969 Google ScholarCross Ref
Hakjoo Oh and Kwangkeun Yi. 2011. Access-based localization with bypassing. In Proceedings of the 9th Asian Symposium on Programming Languages and Systems (APLAS'11). Lecture Notes in Computer Science, Vol. 7078. Springer, 50--65. Google ScholarDigital Library
Ganesan Ramalingam. 2002. On sparse evaluation representations. Theoretical Computer Science 277, 1--2, 119--147. Google ScholarDigital Library
John H. Reif and Harry R. Lewis. 1977. Symbolic evaluation and the global value graph. In Proceedings of the 4th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 104--118. Google ScholarDigital Library
Noam Rinetzky, Jörg Bauer, Thomas Reps, Mooly Sagiv, and Reinhard Wilhelm. 2005. A semantics for procedure local heaps and its abstractions. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 296--309. Google ScholarDigital Library
Teck Bok Tok, Samuel Z. Guyer, and Calvin Lin. 2006. Efficient flow-sensitive interprocedural data-flow analysis in the presence of pointers. In Proceedings of the International Conference on Compiler Construction. 17--31. Google ScholarDigital Library
Arnaud Venet and Guillaume Brat. 2004. Precise and efficient static array bound checking for large embedded C programs. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation (PLDI'04). ACM, New York, NY, 231--242. Google ScholarDigital Library
Mark N. Wegman and F. Kenneth Zadeck. 1991. Constant propagation with conditional branches. ACM Transactions on Programming Languages and Systems 13, 2, 181--210. Google ScholarDigital Library
Hongseok Yang, Oukseh Lee, Josh Berdine, Cristiano Calcagno, Byron Cook, Dino Distefano, and Peter O'Hearn. 2008. Scalable shape analysis for systems code. In Proceedings of the International Conference on Computer Aided Verification. 385--398. Google ScholarDigital Library
Hongtao Yu, Jingling Xue, Wei Huo, Xiaobing Feng, and Zhaoqing Zhang. 2010. Level by level: Making flow- and context-sensitive pointer analysis scalable for millions of lines of code. In Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'10). ACM, New York, NY, 218--229. Google ScholarDigital Library
Misha Zitser, D. E. Shaw Group, and Tim Leek. 2004. Testing static analysis tools using exploitable buffer overflows from open source code. In Proceedings of the 12th ACM SIGSOFT International Symposium on Foundations of Software Engineering (SIGSOFT'04/FSE-12). ACM, New York NY, 97--106. Google ScholarDigital Library

Index Terms

Global Sparse Analysis Framework
1. Theory of computation
  1. Semantics and reasoning
    1. Program semantics

Recommendations

Selective X-Sensitive Analysis Guided by Impact Pre-Analysis

We present a method for selectively applying context-sensitivity during interprocedural program analysis. Our method applies context-sensitivity only when and where doing so is likely to improve the precision that matters for resolving given queries. The ...
Read More
Design and implementation of sparse global analyses for C-like languages
PLDI '12: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation

In this article we present a general method for achieving global static analyzers that are precise, sound, yet also scalable. Our method generalizes the sparse analysis techniques on top of the abstract interpretation framework to support relational as ...
Read More
Design and implementation of sparse global analyses for C-like languages
PLDI '12

In this article we present a general method for achieving global static analyzers that are precise, sound, yet also scalable. Our method generalizes the sparse analysis techniques on top of the abstract interpretation framework to support relational as ...
Read More

Reviews

Reviewer: Jacques Carette

High-profile bugs have increased the urgency of designing good program analysis frameworks that can detect such bugs before the programs are released into the wild. Luckily, we possess a great framework for the systematic design of analyses: abstract interpretation (AI). Unfortunately, a straightforward implementation does not scale to realistic programs. A simple but powerful observation is at play in the framework presented here: even very large programs tend to modify a very small number of memory locations in any given block of code. So if we can take advantage of this "sparsity," our analyses ought to scale much better. Of course, to take this observation, turn it first into sound theory, and then implement it requires quite a lot of careful work. This is exactly the kind of work that this paper describes. The reader should be warned that the results are extremely technical. There are helpful examples throughout to illustrate the most important points, two sections on how to design various kinds of analyses on top of this framework, a helpful section on implementation, and a lucid section on the results of experimental evaluation of their implementation on a wide variety of programs. The authors cleverly relegated all proofs to appendices. Nevertheless, anyone not fully versed in the lore of abstract interpretation would find this paper extremely challenging. But for those who do know AI, this paper is well worth reading. I was disappointed not to find a link to the authors' software anywhere; it is past time that all papers evaluating an implementation provide the source code electronically for independent evaluation. It is also slightly disappointing that the proofs are not automated, but perhaps mainstream CS has not quite reached that stage yet. Lest the reader of this review misunderstand, let me be quite clear: this is a superb paper. Well written, well structured, and smoothly bridging theory and practice, it could serve as a model for writing solid CS papers. Online Computing Reviews Service

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Programming Languages and Systems Volume 36, Issue 3
September 2014
124 pages
ISSN:0164-0925
EISSN:1558-4593
DOI:10.1145/2669618
Editor:
Jens Palsberg
University of California, Los Angeles, USA
Issue’s Table of Contents
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 25 September 2014
- Accepted: 1 January 2014
- Revised: 1 October 2013
- Received: 1 March 2013
Published in toplas Volume 36, Issue 3

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Static analysis
abstract interpretation
sparse analysis
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 16
  Total Citations
  View Citations
- 439
  Total Downloads
- Downloads (Last 12 months)55
- Downloads (Last 6 weeks)3
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Global Sparse Analysis Framework

ACM Transactions on Programming Languages and Systems

Abstract

Supplemental Material

Available for Download

References

Cited By

Index Terms

Recommendations

Selective X-Sensitive Analysis Guided by Impact Pre-Analysis

Design and implementation of sparse global analyses for C-like languages

Design and implementation of sparse global analyses for C-like languages

Reviews

Access critical reviews of Computing literature here

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Global Sparse Analysis Framework

ACM Transactions on Programming Languages and Systems

Abstract

Supplemental Material

Available for Download

References

Cited By

Index Terms

Recommendations

Selective X-Sensitive Analysis Guided by Impact Pre-Analysis

Design and implementation of sparse global analyses for C-like languages

Design and implementation of sparse global analyses for C-like languages

Reviews

Access critical reviews of Computing literature here

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media