ABSTRACT
As General-Purpose Graphics Processing Units (GPGPUs)become more powerful, they are being used increasingly often in high-performance computing applications. State space exploration, as employed in model-checking and other verification techniques, is a large, complex problem that has successfully been ported to a variety of parallel architectures. Use of the GPU for this purpose, however, has only recently begun to be studied. We show how the 2012 multicore CPU-parallel state-space exploration algorithm of the SPIN model checker can be re-engineered to take advantage of the unique parallel-processing capabilities of the GPGPU architecture, and demonstrate how to overcome the non-trivial design obstacles presented by this task. Our preliminary results demonstrate significant performance improvements over the traditional sequential model checker for state spaces of appreciable size (>10 million unique states).
- D. A. Alcantara, A. Sharf, F. Abbasinejad, S. Sengupta, M. Mitzenmacher, J. D. Owens, and N. Amenta. Real-time parallel hashing on the GPU. In ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia ’09, pages 154:1–154:9, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- D. A. F. Alcantara. Efficient hash tables on the GPU, 2011. Copyright - Copyright ProQuest, UMI Dissertations Publishing 2011; Last updated - 2014-01-23; First page - n/a; M3: Ph.D. Google ScholarDigital Library
- J. Barnat, P. Bauch, L. Brim, and M. ˘ Ce˘ ska. Designing fast LTL model checking algorithms for many-core GPUs. Journal of Parallel and Distributed Computing, 72(9):1083–1097, 2012. Accelerators for High-Performance Computing. Google ScholarDigital Library
- J. Barnat, L. Brim, and P. Ro˘ ckai. Scalable multi-core LTL model-checking. In D. Bo˘ sna˘ cki and S. Edelkamp, editors, Proc. of SPIN 2007: the 14th international SPIN conference on Model checking software, volume 4595 of Lecture Notes in Computer Science, pages 187–203. Springer Berlin Heidelberg, 2007. Google ScholarDigital Library
- J. Barnat, L. Brim, and J. St˘ r´ıbrná. Distributed LTL model-checking in SPIN. In M. Dwyer, editor, Proc. of SPIN 2001: the 8th International SPIN Workshop on Model Checking of Software, volume 2057 of Lecture Notes in Computer Science, pages 200–216. Springer Berlin Heidelberg, 2001. Google ScholarDigital Library
- E. Bartocci, E. M. Cherry, J. Glimm, R. Grosu, S. A. Smolka, and F. H. Fenton. Toward real-time simulation of cardiac dynamics. In Proceedings of the 9th International Conference on Computational Methods in Systems Biology, CMSB ’11, pages 103–112, New York, NY, USA, 2011. ACM. Google ScholarDigital Library
- E. Bartocci, R. Singh, F. B. von Stein, A. Amedome, A. J. Caceres, J. Castillo, E. Closser, G. Deards, A. Goltsev, R. S. Ines, C. Isbilir, J. K. Marc, D. Moore, D. Pardi, S. Sadhu, S. Sanchez, P. Sharma, A. Singh, J. Rogers, A. Wolinetz, T. Grosso-Applewhite, K. Zhao, A. B. Filipski, R. F. Gilmour, R. Grosu, J. Glimm, S. A. Smolka, E. M. Cherry, E. M. Clarke, N. Griffeth, and F. H. Fenton. Teaching cardiac electrophysiology modeling to undergraduate students: laboratory exercises and GPU programming for the study of arrhythmias and spiral wave dynamics. Adv Physiol Educ, 35(4):427–437, 2011.Google ScholarCross Ref
- Y. Deng, B. D. Wang, and S. Mu. Taming irregular EDA applications on GPUs. In Proceedings of the ICCAD ’09: the International Conference on Computer-Aided Design, ICCAD ’09, pages 539–546, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- S. Edelkamp and D. Sulewski. Efficient explicit-state model checking on general purpose graphics processors. In J. Pol and M. Weber, editors, Proc. of SPIN’10: the 17th International SPIN Conference on Model Checking Software, volume 6349 of Lecture Notes in Computer Science, pages 106–123. Springer Berlin Heidelberg, 2010. Google ScholarDigital Library
- S. Evangelista, A. Laarman, L. Petrucci, and J. van de Pol. Improved multi-core nested depth-first search. In S. Chakraborty and M. Mukund, editors, Automated Technology for Verification and Analysis, Lecture Notes in Computer Science, pages 269–283. Springer Berlin Heidelberg, 2012. Google ScholarDigital Library
- P. Harish and P. Narayanan. Accelerating large graph algorithms on the GPU using CUDA. In S. Aluru, M. Parashar, R. Badrinath, and V. Prasanna, editors, Proc. of HiPC’07: the 14th international conference on High performance computing, volume 4873 of Lecture Notes in Computer Science, pages 197–208. Springer Berlin Heidelberg, 2007. Google ScholarDigital Library
- G. Holzmann. Parallelizing the SPIN model checker. In A. Donaldson and D. Parker, editors, Proc. of SPIN 2012: the 19th International Workshop on SPIN Model Checking Software, volume 7385 of Lecture Notes in Computer Science, pages 155–171. Springer Berlin Heidelberg, 2012. Google ScholarDigital Library
- G. Holzmann and D. Bo˘ sna˘ cki. The design of a multicore extension of the SPIN model checker. IEEE Transactions on Software Engineering, 33(10):659–674, Oct 2007. Google ScholarDigital Library
- S. Hong, S. K. Kim, T. Oguntebi, and K. Olukotun. Accelerating CUDA graph algorithms at maximum warp. In Proc. of PPoPP ’11: the 16th ACM Symposium on Principles and Practice of Parallel Programming, PPoPP ’11, pages 267–276, New York, NY, USA, 2011. ACM. Google ScholarDigital Library
- A. Laarman, J. van de Pol, and M. Weber. Multi-core LTSmin: Marrying modularity and scalability. In M. Bobaru, K. Havelund, G. Holzmann, and R. Joshi, editors, NASA Formal Methods, volume 6617 of Lecture Notes in Computer Science, pages 506–511. Springer Berlin Heidelberg, 2011. Google ScholarDigital Library
- L. Luo, M. Wong, and W. Hwu. An effective GPU implementation of breadth-first search. In Proc. of DAC ’10: the 47th Design Automation Conference, DAC ’10, pages 52–55, New York, NY, USA, 2010. ACM. Google ScholarDigital Library
- M. M. Michael. High performance dynamic lock-free hash tables and list-based sets. In Proc. of SPAA ’02: the Fourteenth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’02, pages 73–82, New York, NY, USA, 2002. ACM. Google ScholarDigital Library
- A. Murthy, E. Bartocci, F. Fenton, J. Glimm, R. Gray, E. Cherry, S. Smolka, and R. Grosu. Curvature analysis of cardiac excitation wavefronts. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 10(2):323–336, March 2013. Google ScholarDigital Library
- O. Shalev and N. Shavit. Split-ordered lists: Lock-free extensible hash tables. Journal of the ACM, 53(3):379–405, may 2006. Google ScholarDigital Library
- C.-H. Shann, T.-L. Huang, and C. Chen. A practical nonblocking queue algorithm using compare-and-swap. In Proc. of ICPADS ’00: the 7th International Conference on Parallel and Distributed Systems, pages 470–475, 2000. Google ScholarDigital Library
- K. Verstoep, H. Bal, J. Barnat, and L. Brim. Efficient large-scale model checking. In Parallel Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on, pages 1–12, May 2009. Google ScholarDigital Library
- A. Wijs and D. Bo˘ sna˘ cki. GPUexplore: Many-core on-the-fly state space exploration using GPUs. In E. Ábráham and K. Havelund, editors, Proc. of TACAS 2014: the 20th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, volume 8413 of Lecture Notes in Computer Science, pages 233–247. Springer Berlin Heidelberg, 2014.Google Scholar
- S. Xiao and W. chun Feng. Inter-block GPU communication via fast barrier synchronization. In Proc. of IPDPS 2010: the IEEE International Symposium on Parallel Distributed Processing, pages 1–12, April 2010.Google Scholar
Index Terms
- Towards a GPGPU-parallel SPIN model checker
Recommendations
A performance study of general-purpose applications on graphics processors using CUDA
Graphics processors (GPUs) provide a vast number of simple, data-parallel, deeply multithreaded cores and high memory bandwidths. GPU architectures are becoming increasingly programmable, offering the potential for dramatic speedups for a variety of ...
A unified optimizing compiler framework for different GPGPU architectures
This article presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performance GPGPU programs: effective utilization of GPU memory hierarchy and ...
Performance analysis of accelerated image registration using GPGPU
GPGPU-2: Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing UnitsThis paper presents a performance analysis of an accelerated 2-D rigid image registration implementation that employs the Compute Unified Device Architecture (CUDA) programming environment to take advantage of the parallel processing capabilities of ...
Comments