ABSTRACT
Performance issues in large applications arise only in particular scenarios under heavy load conditions. It is therefore difficult to catch them during testing and they easily escape into production. This necessitates the design of a common and efficient instrumentation strategy that profiles the flow of objects during an execution. Designing such a strategy which enables profile generation precisely with low overhead is non-trivial due to the number of objects created, accessed and paths traversed by them in an execution. In this paper, we design and implement an efficient instrumentation technique that efficiently generates object flow profiles for Java programs, without requiring any modifications to the underlying virtual machine. We achieve this by applying Ball-Larus numbering on a specialized hybrid flow graph (hfg). The hfg path profiles that are collected during runtime are post-processed offline to derive the object flow profiles. We implemented the profiler and validated its efficacy by applying it on Java programs. The results demonstrate the scalability of our approach, which handles 0.2M to 0.55B object accesses with an average runtime overhead of 8x. We also demonstrate the effectiveness of the generated profiles by implementing a client analysis that consumes the profiles to detect performance bugs. The analysis detects 38 performance bugs which when refactored result in significant performance gains (up to 30%) in running times.
- O. Agesen and A. Garthwaite. Efficient object sampling via weak references. In Proceedings of the 2Nd International Symposium on Memory Management, ISMM ’00, pages 121–126, New York, NY, USA, 2000. Google ScholarDigital Library
- T. Ball and J. R. Larus. Efficient path profiling. In Proceedings of the 29th Annual ACM/IEEE International Symposium on Microarchitecture, MICRO 29, pages 46–57, Washington, DC, USA, 1996. Google ScholarDigital Library
- T. Bao, Y. Zheng, Z. Lin, X. Zhang, and D. Xu. Strict control dependence and its effect on dynamic information flow analyses. In Proceedings of the 19th International Symposium on Software Testing and Analysis, ISSTA ’10, pages 13–24, New York, NY, USA, 2010. Google ScholarDigital Library
- S. Bhattacharya, K. Gopinath, and M. G. Nanda. Combining concern input with program analysis for bloat detection. In Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA ’13, pages 745–764, New York, NY, USA, 2013. Google ScholarDigital Library
- S. Bhattacharya, M. G. Nanda, K. Gopinath, and M. Gupta. Reuse, recycle to de-bloat software. In Proceedings of the 25th European Conference on Object-oriented Programming, ECOOP’11, pages 408–432, Berlin, Heidelberg, 2011. Google ScholarDigital Library
- S. M. Blackburn, S. Singhai, M. Hertz, K. S. McKinely, and J. E. B. Moss. Pretenuring for java. In Proceedings of the 16th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications, OOPSLA ’01, pages 342–352, New York, NY, USA, 2001. Google ScholarDigital Library
- D. C. D’Elia and C. Demetrescu. Ball-larus path profiling across multiple loop iterations. In Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA ’13, pages 373–390, New York, NY, USA, 2013. Google ScholarDigital Library
- L. Della Toffola, M. Pradel, and T. R. Gross. Performance problems you can fix: A dynamic analysis of memoization opportunities. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2015, pages 607–622, New York, NY, USA, 2015. Google ScholarDigital Library
- J. Ferrante, K. J. Ottenstein, and J. D. Warren. The program dependence graph and its use in optimization. ACM Trans. Program. Lang. Syst., 9(3):319–349, July 1987. Google ScholarDigital Library
- L. Gong, M. Pradel, and K. Sen. Jitprof: Pinpointing jit-unfriendly javascript code. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2015, pages 357–368, New York, NY, USA, 2015. Google ScholarDigital Library
- L. Gong, M. Pradel, M. Sridharan, and K. Sen. DLint: Dynamically checking bad coding practices in javascript. In Proceedings of the 2015 International Symposium on Software Testing and Analysis, ISSTA 2015, pages 94–105, New York, NY, USA, 2015. Google ScholarDigital Library
- S. Z. Guyer, K. S. McKinley, and D. Frampton. Free-me: A static analysis for automatic individual object reclamation. In Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’06, pages 364–375, New York, NY, USA, 2006. Google ScholarDigital Library
- S. Han, Y. Dang, S. Ge, D. Zhang, and T. Xie. Performance debugging in the large via mining millions of stack traces. In Proceedings of the 34th International Conference on Software Engineering, ICSE ’12, pages 145–155, Piscataway, NJ, USA, 2012. Google ScholarDigital Library
- M. Hertz, S. M. Blackburn, J. E. B. Moss, K. S. McKinley, and D. Stefanovi´c. Generating object lifetime traces with merlin. ACM Trans. Program. Lang. Syst., 28(3):476–516, May 2006. Google ScholarDigital Library
- W. huang, W. Srisa-an, and J. M. Chang. Dynamic pretenuring schemes for generational garbage collection. In Proceedings of the 2004 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS ’04, pages 133–140, Washington, DC, USA, 2004. Google ScholarDigital Library
- S. H. Jensen, M. Sridharan, K. Sen, and S. Chandra. Meminsight: Platform-independent memory debugging for javascript. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2015, pages 345–356, New York, NY, USA, 2015. Google ScholarDigital Library
- M. Jovic, A. Adamoli, and M. Hauswirth. Catch me if you can: Performance bug detection in the wild. In Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA ’11, pages 155–170, New York, NY, USA, 2011. Google ScholarDigital Library
- M. Jump, S. M. Blackburn, and K. S. McKinley. Dynamic object sampling for pretenuring. In Proceedings of the 4th International Symposium on Memory Management, ISMM ’04, pages 152–162, New York, NY, USA, 2004. Google ScholarDigital Library
- M. Jump and K. S. McKinley. Detecting memory leaks in managed languages with cork. Softw. Pract. Exper., 40(1):1–22, Jan. 2010. Google ScholarDigital Library
- J. R. Larus. Whole program paths. In Proceedings of the ACM SIGPLAN 1999 Conference on Programming Language Design and Implementation, PLDI ’99, pages 259–269, New York, NY, USA. Google ScholarDigital Library
- H. Lieberman and C. Hewitt. A real-time garbage collector based on the lifetimes of objects. Commun. ACM, 26(6):419–429, June 1983. Google ScholarDigital Library
- A. Nistor, L. Song, D. Marinov, and S. Lu. Toddler: Detecting performance problems via similar memory-access patterns. In Proceedings of the 2013 International Conference on Software Engineering, ICSE ’13, pages 562–571, Piscataway, NJ, USA, 2013. Google ScholarDigital Library
- R. Odaira and T. Nakatani. Continuous object access profiling and optimizations to overcome the memory wall and bloat. In Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XVII, pages 147–158, New York, NY, USA, 2012. Google ScholarDigital Library
- O. Olivo, I. Dillig, and C. Lin. Static detection of asymptotic performance bugs in collection traversals. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2015, pages 369–378, New York, NY, USA, 2015. Google ScholarDigital Library
- M. Pradel and T. R. Gross. Automatic generation of object usage specifications from large method traces. In Proceedings of the 2009 IEEE/ACM International Conference on Automated Software Engineering, ASE ’09, pages 371–382, Washington, DC, USA, 2009. Google ScholarDigital Library
- M. Pradel, M. Huggler, and T. R. Gross. Performance regression testing of concurrent classes. In Proceedings of the 2014 International Symposium on Software Testing and Analysis, ISSTA 2014, pages 13–25, New York, NY, USA, 2014. Google ScholarDigital Library
- M. Selakovic and M. Pradel. Automatically fixing real-world javascript performance bugs. In Proceedings of the 37th International Conference on Software Engineering - Volume 2, ICSE ’15, pages 811–812, Piscataway, NJ, USA, 2015. Google ScholarDigital Library
- R. Shaham, E. K. Kolodner, and M. Sagiv. Heap profiling for space-efficient java. In Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation, PLDI ’01, pages 104–113, New York, NY, USA, 2001. Google ScholarDigital Library
- A. Shankar, M. Arnold, and R. Bodik. Jolt: Lightweight dynamic analysis and removal of object churn. In Proceedings of the 23rd ACM SIGPLAN Conference on Object-oriented Programming Systems Languages and Applications, OOPSLA ’08, pages 127–142, New York, NY, USA, 2008. Google ScholarDigital Library
- W. N. Sumner and X. Zhang. Memory indexing: Canonicalizing addresses across executions. In Proceedings of the Eighteenth ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE ’10, pages 217–226, New York, NY, USA, 2010. Google ScholarDigital Library
- W. N. Sumner, Y. Zheng, D. Weeratunge, and X. Zhang. Precise calling context encoding. In Proceedings of the 32Nd ACM/IEEE International Conference on Software Engineering - Volume 1, ICSE ’10, pages 525–534, New York, NY, USA, 2010. Google ScholarDigital Library
- S. Tallam, X. Zhang, and R. Gupta. Extending path profiling across loop backedges and procedure boundaries. In Code Generation and Optimization, 2004. CGO 2004. International Symposium on, pages 251–262, March 2004. Google ScholarDigital Library
- R. Vallée-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V. Sundaresan. Soot - a java bytecode optimization framework. In Proceedings of the 1999 Conference of the Centre for Advanced Studies on Collaborative Research, CASCON ’99, pages 13–, 1999. Google ScholarDigital Library
- M. Weiser. Program slicing. In Proceedings of the 5th International Conference on Software Engineering, ICSE ’81, pages 439–449, Piscataway, NJ, USA, 1981. Google ScholarDigital Library
- R. Wu, X. Xiao, S.-C. Cheung, H. Zhang, and C. Zhang. Casper: An efficient approach to call trace collection. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2016, pages 678–690, New York, NY, USA, 2016. Google ScholarDigital Library
- B. Xin and X. Zhang. Efficient online detection of dynamic control dependence. In Proceedings of the 2007 International Symposium on Software Testing and Analysis, ISSTA ’07, pages 185–195, New York, NY, USA, 2007. Google ScholarDigital Library
- G. Xu. Resurrector: A tunable object lifetime profiling technique for optimizing real-world programs. In Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA ’13, pages 111–130, New York, NY, USA, 2013. Google ScholarDigital Library
- G. Xu, M. Arnold, N. Mitchell, A. Rountev, and G. Sevitsky. Go with the flow: Profiling copies to find runtime bloat. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’09, pages 419–430, New York, NY, USA, 2009. Google ScholarDigital Library
- G. Xu, M. D. Bond, F. Qin, and A. Rountev. Leakchaser: Helping programmers narrow down causes of memory leaks. In Proceedings of the 32Nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’11, pages 270–282, New York, NY, USA, 2011. Google ScholarDigital Library
- D. Yan, G. Xu, and A. Rountev. Uncovering performance problems in java applications with reference propagation profiling. In Proceedings of the 34th International Conference on Software Engineering, ICSE ’12, pages 134–144, Piscataway, NJ, USA, 2012. Google ScholarDigital Library
- X. Zhang and R. Gupta. Whole execution traces. In Microarchitecture, 2004. MICRO-37 2004. 37th International Symposium on, pages 105–116, 2004. Google ScholarDigital Library
- X. Zhang, A. Navabi, and S. Jagannathan. Alchemist: A transparent dependence distance profiling infrastructure. In Proceedings of the 7th Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO ’09, pages 47–58, Washington, DC, USA, 2009. Google ScholarDigital Library
Index Terms
- Efficient flow profiling for detecting performance bugs
Recommendations
Low overhead program monitoring and profiling
Program instrumentation, inserted either before or during execution, is rapidly becoming a necessary component of many systems. Instrumentation is commonly used to collect information for many diverse analysis applications, such as detecting program ...
Incremental call-path profiling: Research Articles
European–American Working Group on Automatic Performance Analysis (APART)Profiling is a key technique for achieving high performance. Call-path profiling is a refinement of this technique that classifies a function's behavior based on the path taken to reach the function. This information is particularly useful when ...
Continuous object access profiling and optimizations to overcome the memory wall and bloat
ASPLOS '12Future microprocessors will have more serious memory wall problems since they will include more cores and threads in each chip. Similarly, future applications will have more serious memory bloat problems since they are more often written using object-...
Comments