Abstract
Runtime bloat degrades significantly the performance and scalability of software systems. An important source of bloat is the inefficient use of containers. It is expensive to create inefficiently-used containers and to invoke their associated methods, as this may ultimately execute large volumes of code, with call stacks dozens deep, and allocate many temporary objects.
This paper presents practical static and dynamic tools that can find inappropriate use of containers in Java programs. At the core of these tools is a base static analysis that identifies, for each container, the objects that are added to this container and the key statements (i.e., heap loads and stores) that achieve the semantics of common container operations such as ADD and GET. The static tool finds problematic uses of containers by considering the nesting relationships among the loops where these semantics-achieving statements are located, while the dynamic tool can instrument these statements and find inefficiencies by profiling their execution frequencies.
The high precision of the base analysis is achieved by taking advantage of a context-free language (CFL)-reachability formulation of points-to analysis and by accounting for container-specific properties. It is demand-driven and client-driven, facilitating refinement specific to each queried container object and increasing scalability. The tools built with the help of this analysis can be used both to avoid the creation of container-related performance problems early during development, and to help with diagnosis when problems are observed during tuning. Our experimental results show that the static tool has a low false positive rate and produces more relevant information than its dynamic counterpart. Further case studies suggest that significant optimization opportunities can be found by focusing on statically-identified containers for which high allocation frequency is observed at run time.
- J. Aldrich, V. Kostadinov, and C. Chambers. Alias annotations for program understanding. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 311--330, 2002. Google ScholarDigital Library
- S. M. Blackburn, R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B.Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The DaCapo benchmarks: Java benchmarking development and analysis. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 169--190, 2006. Google ScholarDigital Library
- M. D. Bond and K. S. McKinley. Bell: Bit-encoding online memory leak detection. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 61--72, 2006. Google ScholarDigital Library
- M. D. Bond and K. S. McKinley. Leak pruning. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 277--288, 2009. Google ScholarDigital Library
- C. Boyapati, B. Liskov, and L. Shrira. Ownership types for object encapsulation. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 213--223, 2003. Google ScholarDigital Library
- C. Calcagno, D. Distefano, P. OHearn, and H. Yang. Compositional shape analysis by means of bi-abduction. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 289--300, 2009. Google ScholarDigital Library
- B. Chang and X. Rival. Relational inductive shape analysis. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 247--260, 2008. Google ScholarDigital Library
- D. Clarke and S. Drossopoulou. Ownership, encapsulation and the disjointness of type and effect. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 292--310, 2002. Google ScholarDigital Library
- B. Dufour, B. G. Ryder, and G. Sevitsky. A scalable technique for characterizing the usage of temporaries in framework-intensive Java applications. In ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), pages 59--70, 2008. Google ScholarDigital Library
- C. Grothoff, J. Palsberg, and J. Vitek. Encapsulating objects with confined types. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 241--255, 2001. Google ScholarDigital Library
- S. Gulwani, S. Jain, and E. Koskinen. Control-flow refinement and progress invariants for bound analysis. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 375--385, 2009. Google ScholarDigital Library
- S. Gulwani, K. Mehra, and T. Chilimbi. SPEED: Precise and efficient static estimation of program computational complexity. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 127--139, 2009. Google ScholarDigital Library
- D. L. Heine and M. S. Lam. A practical flow-sensitive and contextsensitive C and C++ memory leak detector. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 168--181, 2003. Google ScholarDigital Library
- D. L. Heine and M. S. Lam. Static detection of leaks in polymorphic containers. In International Conference on Software Engineering (ICSE), pages 252--261, 2006. Google ScholarDigital Library
- M. Jump and K. S. McKinley. Cork: Dynamic memory leak detection for garbage-collected languages. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 31--38, 2007. Google ScholarDigital Library
- J. Kodumal and A. Aiken. The set constraint/CFL reachability connection in practice. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 207--218, 2004. Google ScholarDigital Library
- T. Lev-Ami, N. Immerman, T. Reps, M. Sagiv, and S. Srivastava. Simulating reachability using first-order logic with applications to verification of linked data structures. In International Conference on Automated Deduction (CADE), pages 99--115, 2005. Google ScholarDigital Library
- Y. Liu and A. Milanova. Ownership and immutability inference for UML-based object access control. In International Conference on Software Engineering (ICSE), pages 323--332, 2007. Google ScholarDigital Library
- Y. Liu and A. Milanova. Static analysis for inference of explicit information flow. In ACMSIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE), pages 50--56, 2008. Google ScholarDigital Library
- S. McPeak and G. Necula. Data structure specifications via local equality axioms. In International Conference on Computer Aided Verification (CAV), pages 476--490, 2005. Google ScholarDigital Library
- D. Melski and T. Reps. Interconvertibility of a class of set constraints and context-free-language reachability. Theoretical Computer Science, 248:29--98, 2000. Google ScholarDigital Library
- N. Mitchell, E. Schonberg, and G. Sevitsky. Four trends leading to Java runtime bloat. IEEE Software, 27(1):56--63, 2010. Google ScholarDigital Library
- N. Mitchell and G. Sevitsky. The causes of bloat, the limits of health. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 245--260, 2007. Google ScholarDigital Library
- N. Mitchell, G. Sevitsky, and H. Srinivasan. Modeling runtime behavior in framework-based applications. In European Conference on Object-Oriented Programming (ECOOP), pages 429--451, 2006. Google ScholarDigital Library
- M. Naik and A. Aiken. Conditional must not aliasing for static race detection. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 327--338, 2007. Google ScholarDigital Library
- G. Novark, E. D. Berger, and B. G. Zorn. Efficiently and precisely locating memory leaks and bloat. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 397--407, 2009. Google ScholarDigital Library
- J. Rehof and M. Fáhndrich. Type-based flow analysis: From polymorphic subtyping to CFL-reachability. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 54--66, 2001. Google ScholarDigital Library
- T. Reps. Solving demand versions of interprocedural analysis problems. In International Conference on Compiler Construction (CC), pages 389--403, 1994. Google ScholarDigital Library
- T. Reps. Shape analysis as a generalized path problem. In ACM SIGPLAN Symposium on Partial Evaluation and Semantics-based Program Manipulation (PEPM), pages 1--11, 1995. Google ScholarDigital Library
- T. Reps. Program analysis via graph reachability. Information and Software Technology, 40(11-12):701--726, 1998.Google ScholarCross Ref
- T. Reps, S. Horwitz, and M. Sagiv. Precise interprocedural dataflow analysis via graph reachability. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 49--61, 1995. Google ScholarDigital Library
- T. Reps, S. Horwitz, M. Sagiv, and G. Rosay. Speeding up slicing. In ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), pages 11--20, 1994. Google ScholarDigital Library
- A. Rountev, A. Milanova, and B. G. Ryder. Fragment class analysis for testing of polymorphism in Java software. IEEE Transactions on Software Engineering, 30(6):372--387, 2004. Google ScholarDigital Library
- M. Sagiv, T. Reps, and R. Wilhelm. Parametric shape analysis via 3-valued logic. ACM Transactions on Programming Languages and Systems, 24(3):217--298, 1999. Google ScholarDigital Library
- O. Shacham, M. Vechev, and E. Yahav. Chameleon: Adaptive selection of collections. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 408--418, 2009. Google ScholarDigital Library
- A. Shankar, M. Arnold, and R. Bodik. JOLT: Lightweight dynamic analysis and removal of object churn. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 127--142, 2008. Google ScholarDigital Library
- M. Sridharan and R. Bodik. Refinement-based context-sensitive points-to analysis for Java. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 387--400, 2006. Google ScholarDigital Library
- M. Sridharan, S. J. Fink, and R. Bodik. Thin slicing. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 112--122, 2007. Google ScholarDigital Library
- M. Sridharan, D. Gopan, L. Shan, and R. Bodik. Demand-driven points-to analysis for Java. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 59--76, 2005. Google ScholarDigital Library
- Z. Su and D. Wagner. A class of polynomially solvable range constraints for interval analysis without widenings. Theoretical Computer Science, 345(1):122--138, 2005. Google ScholarDigital Library
- R. Vallée-Rai, E. Gagnon, L. Hendren, P. Lam, P. Pominville, and V. Sundaresan. Optimizing Java bytecode using the Soot framework: Is it feasible? In International Conference on Compiler Construction (CC), pages 18--34, 2000. Google ScholarDigital Library
- G. Xu, M. Arnold, N. Mitchell, A. Rountev, and G. Sevitsky. Go with the flow: Profiling copies to find runtime bloat. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 419--430, 2009. Google ScholarDigital Library
- G. Xu, N. Mitchell, M. Arnold, A. Rountev, E. Schonberg, and G. Sevitsky. Finding low-utility data structures. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2010. Google ScholarDigital Library
- G. Xu and A. Rountev. Precise memory leak detection for Java software using container profiling. In International Conference on Software Engineering (ICSE), pages 151--160, 2008. Google ScholarDigital Library
- G. Xu, A. Rountev, and M. Sridharan. Scaling CFL-reachability-based points-to analysis using context-sensitive must-not-alias analysis. In European Conference on Object-Oriented Programming (ECOOP), pages 98--122, 2009. Google ScholarDigital Library
- X. Zheng and R. Rugina. Demand-driven alias analysis for C. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 197--208, 2008. Google ScholarDigital Library
Index Terms
- Detecting inefficiently-used containers to avoid bloat
Recommendations
Detecting inefficiently-used containers to avoid bloat
PLDI '10: Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and ImplementationRuntime bloat degrades significantly the performance and scalability of software systems. An important source of bloat is the inefficient use of containers. It is expensive to create inefficiently-used containers and to invoke their associated methods, ...
On-demand dynamic summary-based points-to analysis
CGO '12: Proceedings of the Tenth International Symposium on Code Generation and OptimizationStatic analyses can be typically accelerated by reducing redundancies. Modern demand-driven points-to or alias analysis techniques rest on the foundation of Context-Free Language (CFL) reachability. These techniques achieve high precision efficiently ...
Fast and precise points-to analysis with incremental CFL-reachability summarisation: preliminary experience
ASE '12: Proceedings of the 27th IEEE/ACM International Conference on Automated Software EngineeringWe describe our preliminary experience in the design and implementation of a points-to analysis for Java, called EMU, that enables developers to perform pointer-related queries in programs undergoing constant changes in IDEs. EMU achieves fast response ...
Comments