skip to main content
research-article

Detecting inefficiently-used containers to avoid bloat

Published:05 June 2010Publication History
Skip Abstract Section

Abstract

Runtime bloat degrades significantly the performance and scalability of software systems. An important source of bloat is the inefficient use of containers. It is expensive to create inefficiently-used containers and to invoke their associated methods, as this may ultimately execute large volumes of code, with call stacks dozens deep, and allocate many temporary objects.

This paper presents practical static and dynamic tools that can find inappropriate use of containers in Java programs. At the core of these tools is a base static analysis that identifies, for each container, the objects that are added to this container and the key statements (i.e., heap loads and stores) that achieve the semantics of common container operations such as ADD and GET. The static tool finds problematic uses of containers by considering the nesting relationships among the loops where these semantics-achieving statements are located, while the dynamic tool can instrument these statements and find inefficiencies by profiling their execution frequencies.

The high precision of the base analysis is achieved by taking advantage of a context-free language (CFL)-reachability formulation of points-to analysis and by accounting for container-specific properties. It is demand-driven and client-driven, facilitating refinement specific to each queried container object and increasing scalability. The tools built with the help of this analysis can be used both to avoid the creation of container-related performance problems early during development, and to help with diagnosis when problems are observed during tuning. Our experimental results show that the static tool has a low false positive rate and produces more relevant information than its dynamic counterpart. Further case studies suggest that significant optimization opportunities can be found by focusing on statically-identified containers for which high allocation frequency is observed at run time.

References

  1. J. Aldrich, V. Kostadinov, and C. Chambers. Alias annotations for program understanding. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 311--330, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. S. M. Blackburn, R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B.Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The DaCapo benchmarks: Java benchmarking development and analysis. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 169--190, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. M. D. Bond and K. S. McKinley. Bell: Bit-encoding online memory leak detection. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 61--72, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. M. D. Bond and K. S. McKinley. Leak pruning. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 277--288, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. C. Boyapati, B. Liskov, and L. Shrira. Ownership types for object encapsulation. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 213--223, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. C. Calcagno, D. Distefano, P. OHearn, and H. Yang. Compositional shape analysis by means of bi-abduction. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 289--300, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. B. Chang and X. Rival. Relational inductive shape analysis. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 247--260, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. D. Clarke and S. Drossopoulou. Ownership, encapsulation and the disjointness of type and effect. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 292--310, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. B. Dufour, B. G. Ryder, and G. Sevitsky. A scalable technique for characterizing the usage of temporaries in framework-intensive Java applications. In ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), pages 59--70, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. C. Grothoff, J. Palsberg, and J. Vitek. Encapsulating objects with confined types. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 241--255, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Gulwani, S. Jain, and E. Koskinen. Control-flow refinement and progress invariants for bound analysis. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 375--385, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. S. Gulwani, K. Mehra, and T. Chilimbi. SPEED: Precise and efficient static estimation of program computational complexity. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 127--139, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. D. L. Heine and M. S. Lam. A practical flow-sensitive and contextsensitive C and C++ memory leak detector. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 168--181, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. D. L. Heine and M. S. Lam. Static detection of leaks in polymorphic containers. In International Conference on Software Engineering (ICSE), pages 252--261, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. M. Jump and K. S. McKinley. Cork: Dynamic memory leak detection for garbage-collected languages. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 31--38, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. Kodumal and A. Aiken. The set constraint/CFL reachability connection in practice. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 207--218, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. T. Lev-Ami, N. Immerman, T. Reps, M. Sagiv, and S. Srivastava. Simulating reachability using first-order logic with applications to verification of linked data structures. In International Conference on Automated Deduction (CADE), pages 99--115, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Y. Liu and A. Milanova. Ownership and immutability inference for UML-based object access control. In International Conference on Software Engineering (ICSE), pages 323--332, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Y. Liu and A. Milanova. Static analysis for inference of explicit information flow. In ACMSIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE), pages 50--56, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. S. McPeak and G. Necula. Data structure specifications via local equality axioms. In International Conference on Computer Aided Verification (CAV), pages 476--490, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. D. Melski and T. Reps. Interconvertibility of a class of set constraints and context-free-language reachability. Theoretical Computer Science, 248:29--98, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. N. Mitchell, E. Schonberg, and G. Sevitsky. Four trends leading to Java runtime bloat. IEEE Software, 27(1):56--63, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. N. Mitchell and G. Sevitsky. The causes of bloat, the limits of health. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 245--260, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. N. Mitchell, G. Sevitsky, and H. Srinivasan. Modeling runtime behavior in framework-based applications. In European Conference on Object-Oriented Programming (ECOOP), pages 429--451, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. M. Naik and A. Aiken. Conditional must not aliasing for static race detection. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 327--338, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. G. Novark, E. D. Berger, and B. G. Zorn. Efficiently and precisely locating memory leaks and bloat. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 397--407, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. J. Rehof and M. Fáhndrich. Type-based flow analysis: From polymorphic subtyping to CFL-reachability. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 54--66, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. T. Reps. Solving demand versions of interprocedural analysis problems. In International Conference on Compiler Construction (CC), pages 389--403, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. T. Reps. Shape analysis as a generalized path problem. In ACM SIGPLAN Symposium on Partial Evaluation and Semantics-based Program Manipulation (PEPM), pages 1--11, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. T. Reps. Program analysis via graph reachability. Information and Software Technology, 40(11-12):701--726, 1998.Google ScholarGoogle ScholarCross RefCross Ref
  31. T. Reps, S. Horwitz, and M. Sagiv. Precise interprocedural dataflow analysis via graph reachability. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 49--61, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. T. Reps, S. Horwitz, M. Sagiv, and G. Rosay. Speeding up slicing. In ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), pages 11--20, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. A. Rountev, A. Milanova, and B. G. Ryder. Fragment class analysis for testing of polymorphism in Java software. IEEE Transactions on Software Engineering, 30(6):372--387, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. M. Sagiv, T. Reps, and R. Wilhelm. Parametric shape analysis via 3-valued logic. ACM Transactions on Programming Languages and Systems, 24(3):217--298, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. O. Shacham, M. Vechev, and E. Yahav. Chameleon: Adaptive selection of collections. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 408--418, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. A. Shankar, M. Arnold, and R. Bodik. JOLT: Lightweight dynamic analysis and removal of object churn. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 127--142, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. M. Sridharan and R. Bodik. Refinement-based context-sensitive points-to analysis for Java. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 387--400, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. M. Sridharan, S. J. Fink, and R. Bodik. Thin slicing. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 112--122, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. M. Sridharan, D. Gopan, L. Shan, and R. Bodik. Demand-driven points-to analysis for Java. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 59--76, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Z. Su and D. Wagner. A class of polynomially solvable range constraints for interval analysis without widenings. Theoretical Computer Science, 345(1):122--138, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. R. Vallée-Rai, E. Gagnon, L. Hendren, P. Lam, P. Pominville, and V. Sundaresan. Optimizing Java bytecode using the Soot framework: Is it feasible? In International Conference on Compiler Construction (CC), pages 18--34, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. G. Xu, M. Arnold, N. Mitchell, A. Rountev, and G. Sevitsky. Go with the flow: Profiling copies to find runtime bloat. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 419--430, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. G. Xu, N. Mitchell, M. Arnold, A. Rountev, E. Schonberg, and G. Sevitsky. Finding low-utility data structures. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. G. Xu and A. Rountev. Precise memory leak detection for Java software using container profiling. In International Conference on Software Engineering (ICSE), pages 151--160, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. G. Xu, A. Rountev, and M. Sridharan. Scaling CFL-reachability-based points-to analysis using context-sensitive must-not-alias analysis. In European Conference on Object-Oriented Programming (ECOOP), pages 98--122, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. X. Zheng and R. Rugina. Demand-driven alias analysis for C. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 197--208, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Detecting inefficiently-used containers to avoid bloat

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          • Published in

            cover image ACM SIGPLAN Notices
            ACM SIGPLAN Notices  Volume 45, Issue 6
            PLDI '10
            June 2010
            496 pages
            ISSN:0362-1340
            EISSN:1558-1160
            DOI:10.1145/1809028
            Issue’s Table of Contents
            • cover image ACM Conferences
              PLDI '10: Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation
              June 2010
              514 pages
              ISBN:9781450300193
              DOI:10.1145/1806596

            Copyright © 2010 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 5 June 2010

            Check for updates

            Qualifiers

            • research-article

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader