skip to main content
research-article
Free Access

Scalable Parallel Programming with CUDA: Is CUDA the parallel programming model that application developers have been waiting for?

Published:01 March 2008Publication History
Skip Abstract Section

Abstract

The advent of multicore CPUs and manycore GPUs means that mainstream processor chips are now parallel systems. Furthermore, their parallelism continues to scale with Moore’s law. The challenge is to develop mainstream application software that transparently scales its parallelism to leverage the increasing number of processor cores, much as 3D graphics applications transparently scale their parallelism to manycore GPUs with widely varying numbers of cores.

References

  1. NVIDIA. 2007. CUDA Technology; http://www.nvidia.com/CUDA.Google ScholarGoogle Scholar
  2. NVIDIA. 2007. CUDA Programming Guide 1.1; http://developer.download.nvidia.com/compute/cuda/1_1/NVIDIA_CUDA_Programming_Guide_1.1.pdf.Google ScholarGoogle Scholar
  3. Stratton, J.A., Stone, S. S., Hwu, W. W. 2008. M-CUDA: An efficient implementation of CUDA kernels on multicores. IMPACT Technical Report 08-01, University of Illinois at Urbana-Champaign, (February).Google ScholarGoogle Scholar
  4. See reference 3.Google ScholarGoogle Scholar
  5. Buck, I., Foley, T., Horn, D., Sugerman, J., Fatahalian, K., Houston, M., Hanrahan, P. Brook for GPUs: Stream computing on graphics hardware. 2004. Proceedings of SIGGRAPH (August): 777-786; http://doi.acm.org/10.1145/1186562.1015800. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Stone, S.S., Yi, H., Hwu, W.W., Haldar, J.P., Sutton, B.P., Liang, Z.-P. 2007. How GPUs can improve the quality of magnetic resonance imaging. The First Workshop on General-Purpose Processing on Graphics Processing Units (October).Google ScholarGoogle Scholar
  7. Stone, J.E., Phillips, J.C., Freddolino, P.L., Hardy, D.J., Trabuco, L.G., Schulten, K. 2007. Accelerating molecular modeling applications with graphics processors. Journal of Computational Chemistry 28(16): 2618--2640; http://dx.doi.org/10.1002/jcc.20829.Google ScholarGoogle ScholarCross RefCross Ref
  8. Nyland, L., Harris, M., Prins, J. 2007. Fast n-body simulation with CUDA. In GPU Gems 3. H. Nguyen, ed. Addison-Wesley.Google ScholarGoogle Scholar
  9. Golub, G.H., and Van Loan, C.F. 1996. Matrix Computations, 3rd edition. Johns Hopkins University Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Buatois, L., Caumon, G., Lévy, B. 2007. Concurrent number cruncher: An efficient sparse linear solver on the GPU. Proceedings of the High-Performance Computation Conference (HPCC), Springer LNCS. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Sengupta, S., Harris, M., Zhang, Y., Owens, J.D. 2007. Scan primitives for GPU computing. In Proceedings of Graphics Hardware (August): 97--106. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. See Reference 3.Google ScholarGoogle Scholar

Index Terms

  1. Scalable Parallel Programming with CUDA: Is CUDA the parallel programming model that application developers have been waiting for?

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in

            Full Access

            • Published in

              cover image Queue
              Queue  Volume 6, Issue 2
              GPU Computing
              March/April 2008
              63 pages
              ISSN:1542-7730
              EISSN:1542-7749
              DOI:10.1145/1365490
              Issue’s Table of Contents

              Copyright © 2008 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 1 March 2008

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • research-article
              • Popular
              • Refereed

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader

            HTML Format

            View this article in HTML Format .

            View HTML Format