ABSTRACT
All high-performance production JVMs employ an adaptive strategy for program execution. Methods are first executed unoptimized and then an online profiling mechanism is used to find a subset of methods that should be optimized during the same execution. This paper empirically evaluates the design space of several profilers for initiating dynamic compilation and shows that existing online profiling schemes suffer from several limitations. They provide an insufficient number of samples, are untimely, and have limited accuracy at determining the frequently executed methods. We describe and comprehensively evaluate HPM-sampling, a simple but effective profiling scheme for finding optimization candidates using hardware performance monitors (HPMs) that addresses the aforementioned limitations. We show that HPM-sampling is more accurate; has low overhead; and improves performance by 5.7% on average and up to 18.3% when compared to the default system in Jikes RVM, without changing the compiler.
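The selection step described above, turning a stream of profile samples into a set of optimization candidates, can be illustrated with a minimal sketch. This is not the paper's algorithm or the Jikes RVM policy: the function name `select_hot_methods`, the `min_share` threshold, and the sample data are all hypothetical, and a real system would collect the samples from a timer or from hardware performance monitor interrupts rather than from a list.

```python
from collections import Counter


def select_hot_methods(samples, min_share=0.10):
    """Return methods whose share of all samples is at least min_share.

    `samples` is a sequence of method names, one per profile sample.
    `min_share` is a hypothetical threshold chosen for illustration only.
    """
    counts = Counter(samples)
    total = sum(counts.values())
    if total == 0:
        return []
    hot = [m for m, c in counts.items() if c / total >= min_share]
    # Most frequently sampled first; ties broken by name for determinism.
    return sorted(hot, key=lambda m: (-counts[m], m))


# Hypothetical sample stream: 10 samples spread over three methods.
samples = ["parse"] * 6 + ["hash"] * 3 + ["log"] * 1
print(select_hot_methods(samples, min_share=0.25))
```

With more samples per unit of time, the shares computed here become more stable estimates of where execution time is spent, which is why the sampling rate and timeliness of the profiler matter for candidate selection.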
Using HPM-sampling to drive dynamic compilation. Proceedings of the 2007 OOPSLA conference.