ABSTRACT
Accurately estimating the performance of a program on a given system matters to many people: computer architects, compiler writers, and developers all need insight into a machine's performance. Performance estimation techniques in use range from profile-based approaches to full machine simulation. This paper discusses a profile-based performance estimation technique with a lightweight instrumentation phase whose running time is on the order of the number of dynamic instructions, followed by an analysis phase whose running time is roughly on the order of the number of static instructions. This technique accurately predicts the performance of the core pipeline of a detailed out-of-order issue processor model while scheduling far fewer instructions than full simulation does. The predicted execution time differs from the time obtained from full simulation by only a few percent.
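The two-phase structure described above can be illustrated with a toy sketch. This is not the paper's implementation: the block names, the latencies, and the `block_cost` function (which simply sums latencies instead of scheduling instructions on an out-of-order pipeline model) are all hypothetical stand-ins. The sketch only shows why the instrumentation work grows with the dynamic trace, while the analysis touches each static block once.

```python
from collections import Counter

# Hypothetical static program: block name -> per-instruction latencies (cycles).
STATIC_BLOCKS = {
    "entry": [1, 1],
    "loop": [1, 3, 1],
    "exit": [1],
}

def profile(trace):
    """Instrumentation phase: one counter update per executed block,
    so the work grows with the dynamic length of the run."""
    return Counter(trace)

def block_cost(latencies):
    """Assumed stand-in for scheduling a block once on a pipeline model.
    Here it is simply the sum of the instruction latencies."""
    return sum(latencies)

def estimate_cycles(static_blocks, counts):
    """Analysis phase: each static block is costed once, then weighted
    by its execution count from the profile."""
    return sum(block_cost(lat) * counts[name]
               for name, lat in static_blocks.items())

# Hypothetical dynamic trace: the loop body executes 1000 times.
trace = ["entry"] + ["loop"] * 1000 + ["exit"]
counts = profile(trace)
estimate = estimate_cycles(STATIC_BLOCKS, counts)
print(estimate)
```

In this sketch the loop body is costed once even though it executes 1000 times, which is the source of the savings over simulating every dynamic instruction.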
Efficient performance prediction for modern microprocessors
Special issue on proceedings of ACM SIGMETRICS 2000