Abstract
This paper describes an inter-procedural technique for computing symbolic bounds on the number of statements a procedure executes in terms of its scalar inputs and user-defined quantitative functions of input data-structures. Such computational complexity bounds, even for simple programs, are usually disjunctive and non-linear, and involve numerical properties of heaps. We address the challenges of generating these bounds using two novel ideas.
We introduce a proof methodology based on multiple counter instrumentation (each counter can be initialized and incremented at potentially multiple program locations) that allows a given linear invariant generation tool to compute linear bounds individually on these counter variables. The bounds on these counters are then composed together to generate total bounds that are non-linear and disjunctive. We also give an algorithm for automating this proof methodology. Our algorithm generates complexity bounds that are usually precise not only in terms of the computational complexity, but also in terms of the constant factors.
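The counter-instrumentation idea can be illustrated with a small dynamic sketch. This is not the paper's static analysis: the functions `run_two_phase`, `run_nested`, and the two `*_bound` helpers are hypothetical names, and the composition rules used (sum for independent counters, product when one counter is re-initialized wherever the other is incremented) are simplifying assumptions chosen to show how linear per-counter bounds can combine into disjunctive (`max`) and non-linear totals.

```python
def run_two_phase(n, m):
    """Single loop instrumented with two hypothetical counters c1 and c2."""
    c1 = c2 = 0              # both counters initialized once at procedure entry
    x = y = 0
    steps = 0
    while x < n:
        if y < m:
            y += 1
            c1 += 1          # c1 counts iterations that take the first branch
        else:
            x += 1
            c2 += 1          # c2 counts iterations that take the second branch
        steps += 1
    return steps, c1, c2


def two_phase_bound(n, m):
    # Assumed linear bounds: c1 <= max(0, m) and c2 <= max(0, n).
    # Neither counter is reset inside the other's region, so the bounds add.
    return max(0, n) + max(0, m)


def run_nested(n, m):
    """Nested loops: c2 is re-initialized each time c1 is incremented."""
    c1 = 0
    steps = 0
    i = 0
    while i < n:
        c1 += 1
        c2 = 0               # reset of the inner counter
        j = 0
        while j < m:
            c2 += 1
            j += 1
            steps += 1
        assert c2 <= max(0, m)   # linear bound on the inner counter
        i += 1
        steps += 1
    return steps, c1


def nested_bound(n, m):
    # Assumed composition: each increment of c1 allows up to max(0, m)
    # increments of c2, giving a non-linear total.
    return max(0, n) * (1 + max(0, m))
```

Running both procedures over a small grid of inputs and comparing the observed step counts against `two_phase_bound` and `nested_bound` gives a quick sanity check that each counter respects its own linear bound and that the composed totals are upper bounds.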
Next, we introduce the notion of user-defined quantitative functions that can be associated with abstract data-structures, e.g., length of a list, height of a tree, etc. We show how to compute bounds in terms of these quantitative functions using a linear invariant generation tool that supports uninterpreted functions. We apply this methodology to commonly used data-structures (namely lists, lists of lists, trees, and bit-vectors) using examples from Microsoft product code. We observe that a few quantitative functions for each data-structure are usually sufficient to generate symbolic complexity bounds for a variety of loops that iterate over these data-structures, and that these quantitative functions are straightforward to define.
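As a rough illustration of a bound stated in terms of quantitative functions, the sketch below defines two hypothetical functions on a singly linked list, `length(head)` and `position(head, e)`, and checks at run time that a traversal starting at node `e` takes exactly `length(head) - position(head, e)` steps. The names and the `Node` class are assumptions made for this example, and the check is dynamic, whereas the paper derives such bounds statically by treating the quantitative functions as uninterpreted.

```python
class Node:
    """Minimal singly linked list node (hypothetical, for illustration only)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def build(values):
    """Build a linked list from a Python sequence and return its head."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def length(head):
    """Quantitative function: number of nodes reachable from head."""
    n = 0
    while head is not None:
        n += 1
        head = head.next
    return n


def position(head, e):
    """Quantitative function: number of next-steps from head to e.

    Assumes e is None or a node reachable from head.
    """
    p = 0
    while head is not e:
        head = head.next
        p += 1
    return p


def traverse_from(e):
    """Loop under analysis: walk to the end of the list, counting iterations."""
    steps = 0
    while e is not None:
        e = e.next
        steps += 1
    return steps
```

For a five-element list and a starting node two links past the head, the traversal runs three iterations, which matches `length(head) - position(head, e)`.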
The combination of these techniques enables generation of precise computational complexity bounds for real-world examples (drawn from Microsoft product code and C++ STL code), for some of which it is non-trivial even to prove termination. Such automatically generated bounds are very useful for early detection of egregious performance problems in large modular codebases that are constantly being changed by multiple developers, who make heavy use of code written by others without a good understanding of its implementation complexity.
SPEED: precise and efficient static estimation of program computational complexity
POPL '09: Proceedings of the 36th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages