DOI: 10.1145/2458523.2458534
research-article

Accelerating simulation of agent-based models on heterogeneous architectures

Published: 16 March 2013

ABSTRACT

GPGPU programming models and compiler techniques are now widely used to optimize data-parallel programs on commodity GPUs. However, mapping GPGPU applications written for discrete GPUs onto emerging integrated heterogeneous architectures, such as the AMD Fusion APU and Intel Sandy Bridge and Ivy Bridge, which place the CPU and the GPU on the same die, has not been well studied.

Classic time-step simulation applications, represented here by agent-based models, have an intrinsically parallel structure that fits GPGPU architectures well. However, when these applications are mapped directly onto integrated GPUs, performance may degrade because integrated GPUs have fewer compute units and lower clock speeds.
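The parallel structure of a time-step simulation can be sketched with a toy single-lane traffic model. This is a minimal illustration and not the paper's model: the `Car` type, the `step` and `simulate` functions, and the acceleration and gap rule are all hypothetical. The point it shows is that each agent's next state is computed only from the previous step's states, so every iteration of the inner loop is independent and could be assigned to its own GPU work-item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Car:
    position: float  # metres along a single-lane road
    speed: float     # metres per second


def step(cars, dt=1.0, max_speed=30.0, min_gap=5.0):
    """Advance every car by one time step.

    `cars` must be sorted by position, with the lead car last.
    Each new state reads only the previous step's states, so the loop
    body has no dependence between iterations.
    """
    next_cars = []
    for i, car in enumerate(cars):
        if i + 1 < len(cars):
            gap = cars[i + 1].position - car.position
        else:
            gap = float("inf")  # the lead car has a free road ahead
        # Hypothetical rule: accelerate by 1 m/s per second up to max_speed,
        # but never travel further than the current gap (minus min_gap) allows.
        safe_speed = max(0.0, (gap - min_gap) / dt)
        speed = min(car.speed + 1.0 * dt, max_speed, safe_speed)
        next_cars.append(Car(car.position + speed * dt, speed))
    return next_cars


def simulate(cars, steps, dt=1.0):
    """Run `steps` time steps, replacing the whole state after each one."""
    for _ in range(steps):
        cars = step(cars, dt)
    return cars
```

Writing the new states into a separate list, rather than updating in place, is what keeps the iterations independent; a GPU version would typically use two buffers that swap roles after each step.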

This paper proposes an optimization to the GPGPU implementation of an agent-based model and illustrates it with a traffic simulation example. The optimization adapts the algorithm by moving part of the workload to the CPU, leveraging the integrated architecture and its on-chip memory bus, which is faster than the PCIe bus that connects a discrete GPU to the host. Experiments on a discrete AMD Radeon GPU and an AMD Fusion APU demonstrate that the optimization achieves a 1.08x to 2.71x speedup on the integrated architecture over the discrete platform.
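Why a faster CPU-GPU link makes it attractive to move work to the CPU can be sketched with a back-of-the-envelope cost model. This is an assumption made for illustration, not the paper's analysis: the function `step_time`, its parameters, and the idea that the CPU and GPU shares run concurrently and the CPU share is then exchanged over the link are all hypothetical, and the numbers in the usage lines are placeholders rather than measurements.

```python
def step_time(n_items, cpu_fraction, cpu_rate, gpu_rate,
              bytes_per_item, link_bandwidth):
    """Estimated seconds per time step under a simple overlap model.

    cpu_fraction of the work items is processed on the CPU and the rest
    on the GPU, concurrently; the CPU's share is then exchanged over the
    CPU-GPU link. Rates are items per second, bandwidth is bytes per second.
    """
    cpu_items = n_items * cpu_fraction
    gpu_items = n_items - cpu_items
    compute = max(cpu_items / cpu_rate, gpu_items / gpu_rate)
    transfer = cpu_items * bytes_per_item / link_bandwidth
    return compute + transfer


# Placeholder values only: the same split evaluated with a slow and a fast link.
slow_link = step_time(1000, 0.5, 100.0, 500.0, 8, 4000.0)
fast_link = step_time(1000, 0.5, 100.0, 500.0, 8, 40000.0)
```

Under this model the transfer term shrinks in proportion to the link bandwidth, so a split that costs too much over a slow bus can pay off over a fast on-chip one. Where the best split lies depends on the real compute rates and data volumes, which this sketch does not attempt to estimate.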


    • Published in

      GPGPU-6: Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units
      March 2013
      156 pages
      ISBN:9781450320177
      DOI:10.1145/2458523

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      GPGPU-6 paper acceptance rate: 15 of 37 submissions, 41%. Overall acceptance rate: 57 of 129 submissions, 44%.
