skip to main content
research-article

GKLEE: concolic verification and test generation for GPUs

Published:25 February 2012Publication History
Skip Abstract Section

Abstract

Programs written for GPUs often contain correctness errors such as races, deadlocks, or may compute the wrong result. Existing debugging tools often miss these errors because of their limited input-space and execution-space exploration. Existing tools based on conservative static analysis or conservative modeling of SIMD concurrency generate false alarms resulting in wasted bug-hunting. They also often do not target performance bugs (non-coalesced memory accesses, memory bank conflicts, and divergent warps). We provide a new framework called GKLEE that can analyze C++ GPU programs, locating the aforesaid correctness and performance bugs. For these programs, GKLEE can also automatically generate tests that provide high coverage. These tests serve as concrete witnesses for every reported bug. They can also be used for downstream debugging, for example to test the kernel on the actual hardware. We describe the architecture of GKLEE, its symbolic virtual machine model, and describe previously unknown bugs and performance issues that it detected on commercial SDK kernels. We describe GKLEE's test-case reduction heuristics, and the resulting scalability improvement for a given coverage target.

References

  1. G. Li and G. Gopalakrishnan, "Scalable SMT-based verification of GPU kernel functions," in SIGSOFT FSE, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. M. Zheng, V. T. Ravi, F. Qin, and G. Agrawal, "GRace: A low-overhead mechanism for detecting data races in GPU programs," in PPoPP, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. M. Boyer, K. Skadron, and W. Weimer, "Automated dynamic analysis of CUDA programs," in Third Workshop on Software Tools for MultiCore Systems,2008.Google ScholarGoogle Scholar
  4. C. Cadar, D. Dunbar, and D. R. Engler, "KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs," in OSDI, 8th USENIX Symposium, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. "SMT-COMP. http://www.smtcomp.org/2011."Google ScholarGoogle Scholar
  6. P. Godefroid, N. Klarlund, and K. Sen, "DART: Directed automated random testing," in PLDI, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. "KLEE open projects," http://klee.llvm.org/OpenProjects.html.Google ScholarGoogle Scholar
  8. K. Sen, D. Marinov, and G. Agha, "CUTE: a concolic unit testing engine for C," in 10th ESEC/FSE, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. "CUDA zone. www.nvidia.com/object/cuda_home.html."Google ScholarGoogle Scholar
  10. OpenCL. http://www.khronos.org/opencl.Google ScholarGoogle Scholar
  11. A. Kamil and K. A. Yelick, "Concurrency Analysis for Parallel Programs with Textually Aligned Barriers," in LCPC, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. "The LLVM compiler infrastructure. http://www.llvm.org/."Google ScholarGoogle Scholar
  13. "GKLEE Technical Report. http://www.cs.utah.edu/fv/GKLEE."Google ScholarGoogle Scholar
  14. "Cuda programming guide version 4.0. http://developer.download.nvidia.com/compute/cuda/4_0/toolkit/docs/CUDA_C_Programming_Guide.pdf."Google ScholarGoogle Scholar
  15. J. Sevcik, "Safe Optimisations for Shared-Memory Concurrent Programs," in PLDI, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. S. V. Adve, M. D. Hill, B. P. Miller, and R. H. Netzer, "Detecting data races on weak memory systems," in ISCA, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. D. Shasa and M. Snir, "Efficient and correct execution of parallel programs that share memory," ACM TOPLAS, vol. 10, no. 2, pp. 282--312, 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. A. Aiken and D. Gay, "Barrier inference," in POPL, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. NVIDIA, "CUDA-GDB," Jan. 2009, an extension to the GDB debugger for debugging CUDA kernels in the hardware.Google ScholarGoogle Scholar
  20. Nvidia, "Parallel Nsight," Jul. 2010.Google ScholarGoogle Scholar
  21. Rogue Wave, "Totalview for CUDA," Jan. 2010.Google ScholarGoogle Scholar
  22. J. M. Cobleigh, L. A. Clarke, and L. J. Osterweil, "Flavers: A finite state verification technique for software systems," IBM Systems Journal,vol. 41, no. 1, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. S. K. Lahiri, S. Qadeer, and Z. Rakamaric, "Static and precise detection of concurrency errors in systems code using SMT solvers," in 21st Computer Aided Verification (CAV), 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. B. Coutinho, D. Sampaio, F. M. Quintao Pereira, and W. Meira Jr., "Divergence analysis and optimizations," in PACT, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. J. Lv, G. Li, A. Humphrey, and G. Gopalakrishnan, "Performance degradation analysis of GPU kernels," in EC2 Workshop, 2011.Google ScholarGoogle Scholar
  26. P. Collingbourne, C. Cadar, and P. H. J. Kelly, "Symbolic crosschecking of floating-point and SIMD code," in EuroSys, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. P. Collingbourne, C. Cadar, and P. Kelly, "Symbolic testing of OpenCL code," in Haifa Verification Conference (HVC), 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. G. F. Diamos, A. R. Kerr, S. Yalamanchili, and N. Clark, "Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems," in PACT, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. GKLEE: concolic verification and test generation for GPUs

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image ACM SIGPLAN Notices
          ACM SIGPLAN Notices  Volume 47, Issue 8
          PPOPP '12
          August 2012
          334 pages
          ISSN:0362-1340
          EISSN:1558-1160
          DOI:10.1145/2370036
          Issue’s Table of Contents
          • cover image ACM Conferences
            PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
            February 2012
            352 pages
            ISBN:9781450311601
            DOI:10.1145/2145816

          Copyright © 2012 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 25 February 2012

          Check for updates

          Qualifiers

          • research-article

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader