DOI: 10.1145/511334.511349
Article

Full-system timing-first simulation

Published: 01 June 2002

ABSTRACT

Computer system designers often evaluate future design alternatives with detailed simulators that strive for functional fidelity (to execute relevant workloads) and performance fidelity (to rank design alternatives). Trends toward multi-threaded architectures, more complex micro-architectures, and richer workloads make authoring detailed simulators increasingly difficult. To manage simulator complexity, this paper advocates decoupled simulator organizations that separate functional and performance concerns. Furthermore, we define an approach, called timing-first simulation, that uses an augmented timing simulator to execute instructions important to performance in conjunction with a functional simulator to ensure correctness. This design simplifies software development, leverages existing simulators, and can model micro-architecture timing in detail.

We describe the timing-first organization and our experiences implementing TFsim, a full-system multiprocessor performance simulator. TFsim models a pipelined, out-of-order micro-architecture in detail, was developed in less than one person-year, and performs competitively with previously published simulators. TFsim's timing simulator implements dynamically common instructions (99.99% of them), while avoiding the vast and exacting implementation efforts necessary to run unmodified commercial operating systems and workloads. Virtutech Simics, a full-system functional simulator, checks and corrects the timing simulator's execution, contributing 18-36% to the overall run-time. TFsim's mostly correct functional implementation introduces a worst-case performance error of 4.8% for our commercial workloads. Some additional simulator performance is gained by verifying functional correctness less often, at the cost of some additional performance error.
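The check-and-correct loop described in the abstract can be illustrated with a toy model. The sketch below is not TFsim or Simics code. It assumes a three-opcode register machine, a timing model that omits one "rare" opcode, a cost of one cycle per instruction, and a fixed flush penalty when a deviation is detected. All names (`ArchState`, `run_timing_first`, `check_interval`, `flush_penalty`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ArchState:
    """Architected state shared by both simulators (hypothetical toy machine)."""
    pc: int = 0
    regs: list = field(default_factory=lambda: [0, 0, 0, 0])


# Each instruction is (opcode, dest, src1, src2) over register indices.
FULL_ISA = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,  # stands in for a rare, hard-to-model instruction
}
# Assumption: the timing simulator implements only the common opcodes.
TIMING_ISA = {op: FULL_ISA[op] for op in ("add", "sub")}


def step(state, program, isa):
    """Execute one instruction; opcodes missing from `isa` only advance the PC."""
    op, dst, s1, s2 = program[state.pc]
    if op in isa:
        state.regs[dst] = isa[op](state.regs[s1], state.regs[s2])
    state.pc += 1


def run_timing_first(program, init_regs, check_interval=1, flush_penalty=10):
    """Run the timing model first, then let the functional model check it."""
    timing = ArchState(regs=list(init_regs))
    functional = ArchState(regs=list(init_regs))
    cycles = 0
    deviations = 0
    while timing.pc < len(program):
        n = min(check_interval, len(program) - timing.pc)
        for _ in range(n):
            step(timing, program, TIMING_ISA)  # timing model leads
            cycles += 1                        # assumed cost: one cycle each
        for _ in range(n):
            step(functional, program, FULL_ISA)  # functional model follows
        if timing != functional:
            # Deviation: reload the timing model from the functional state
            # and charge an assumed pipeline-flush penalty.
            timing = ArchState(pc=functional.pc, regs=list(functional.regs))
            cycles += flush_penalty
            deviations += 1
    return timing.regs, cycles, deviations
```

For example, `run_timing_first([("add", 2, 0, 1), ("mul", 3, 2, 2), ("sub", 0, 3, 1)], [1, 2, 0, 0])` detects one deviation at the unimplemented `mul`, reloads the timing state, and finishes with the functionally correct registers. Raising `check_interval` makes the comparison less frequent, which mirrors the trade-off the abstract mentions between simulator speed and performance error.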

        Published in

        SIGMETRICS '02: Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
        June 2002
        299 pages
        ISBN: 1581135319
        DOI: 10.1145/511334

        ACM SIGMETRICS Performance Evaluation Review, Volume 30, Issue 1
        Measurement and modeling of computer systems
        June 2002
        286 pages
        ISSN: 0163-5999
        DOI: 10.1145/511399

        Copyright © 2002 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        SIGMETRICS '02 Paper Acceptance Rate: 23 of 170 submissions, 14%
        Overall Acceptance Rate: 459 of 2,691 submissions, 17%
