Abstract
The advent of multicore CPUs and manycore GPUs means that mainstream processor chips are now parallel systems. Furthermore, their parallelism continues to scale with Moore’s law. The challenge is to develop mainstream application software that transparently scales its parallelism to leverage the increasing number of processor cores, much as 3D graphics applications transparently scale their parallelism to manycore GPUs with widely varying numbers of cores.
Index Terms
- Scalable Parallel Programming with CUDA: Is CUDA the parallel programming model that application developers have been waiting for?