research-article

On the simulation of large-scale architectures using multiple application abstraction levels

Published: 26 January 2012

Abstract

Simulation is a key tool for computer architecture research. Cycle-accurate simulators in particular are essential for microarchitecture exploration and detailed design decisions, but they are slow: they are neither suitable nor intended for simulating large-scale architectures. Moreover, microarchitecture design decisions are irrelevant, or even misleading, in early processor design stages and high-level explorations. For such studies, one can therefore raise the abstraction level of the simulated architecture, and with it the abstraction level of the application, which does not necessarily have to be represented as an instruction stream.

In this paper we define different application abstraction levels and show how they are employed in TaskSim, a multi-core architecture simulator, to provide several architecture modeling abstractions and to simulate large-scale architectures with hundreds of cores. We compare the simulation speed of these abstraction levels to that of existing simulation tools, and we evaluate their utility and accuracy. Our simulations show that a very high-level abstraction, which may be even faster than native execution, is useful for scalability studies of parallel applications, and that by simulating only explicit memory transfers we achieve accurate simulations of architectures using non-coherent scratchpad memories, with just a 25x slowdown compared to native execution. Furthermore, we revisit trace memory simulation techniques, which are more abstract than instruction-by-instruction simulation and provide an 18x simulation speedup.
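
To make the idea of a very high-level application abstraction concrete, the following is a minimal sketch in which an application is reduced to a list of task durations and replayed on a configurable number of simulated cores. It is an illustration only and not TaskSim's implementation: the function name `simulate_makespan`, the example trace, and the assumptions that tasks are independent and are dispatched in trace order to the earliest-free core are all hypothetical.

```python
import heapq


def simulate_makespan(task_durations, num_cores):
    """Replay a trace of independent tasks on num_cores simulated cores.

    Hypothetical model: each task is only a duration (no instructions,
    no memory accesses) and is dispatched, in trace order, to whichever
    core becomes free first.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    # Min-heap holding the time at which each simulated core becomes free.
    core_free_at = [0.0] * num_cores
    heapq.heapify(core_free_at)
    for duration in task_durations:
        start = heapq.heappop(core_free_at)  # earliest-free core
        heapq.heappush(core_free_at, start + duration)
    # The simulated execution time is when the last core finishes.
    return max(core_free_at)


# Assumed example trace of task durations, in arbitrary time units.
trace = [4.0, 3.0, 2.0, 2.0, 1.0]
for cores in (1, 2, 4):
    print(cores, simulate_makespan(trace, cores))
```

Because the model does no per-instruction work, its cost grows with the number of tasks rather than the number of instructions, which is the kind of trade-off that makes such an abstraction attractive for scalability studies. It says nothing about memory behavior, which the lower abstraction levels described in the abstract would need to model.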

        • Published in

          ACM Transactions on Architecture and Code Optimization, Volume 8, Issue 4
          Special Issue on High-Performance Embedded Architectures and Compilers
          January 2012
          765 pages
          ISSN:1544-3566
          EISSN:1544-3973
          DOI:10.1145/2086696
          Copyright © 2012 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 26 January 2012
          • Accepted: 1 November 2011
          • Revised: 1 October 2011
          • Received: 1 July 2011

          Qualifiers

          • research-article
          • Research
          • Refereed
