ABSTRACT
General-purpose computing on graphics processing units (GPUs) has recently emerged as an exciting trend in high-performance computing, making it appealing to study the potential of GPUs for Electronic Design Automation (EDA) applications. However, EDA generally involves irregular data structures, such as sparse matrices and graphs, which pose significant challenges for efficient GPU implementation. In this paper, we propose high-performance GPU implementations of two important irregular EDA computing patterns: Sparse Matrix-Vector Product (SMVP) and graph traversal. On a wide range of EDA problem instances, our SMVP implementations outperform all published work and achieve a speedup of one order of magnitude over the CPU baseline. On this basis, both timing analysis and linear system solution can be considerably accelerated. We also introduce an SMVP-based formulation of Breadth-First Search and observe considerable speedup in GPU implementations. Our results suggest that the power of GPU computing can be successfully unleashed by designing GPU-friendly algorithms and/or re-organizing the computing structures of existing algorithms.
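The SMVP-based BFS formulation mentioned above can be illustrated with a serial sketch: each BFS level is computed as one sparse matrix-vector product of the graph's adjacency matrix (stored in CSR form) with the current frontier vector, over a Boolean semiring. This is a hypothetical illustration, not the paper's GPU code; the function name and CSR layout below are assumptions for the example.

```python
def bfs_levels(row_ptr, col_idx, n, source):
    """Return the BFS level of every vertex (-1 if unreachable).

    The graph is given as a CSR adjacency structure: the neighbors of
    vertex u are col_idx[row_ptr[u]:row_ptr[u + 1]].
    """
    level = [-1] * n
    level[source] = 0
    frontier = [False] * n  # Boolean frontier vector x
    frontier[source] = True
    depth = 0
    while any(frontier):
        depth += 1
        next_frontier = [False] * n
        # Boolean SMVP: next_frontier = A^T * frontier,
        # masked so that already-visited vertices are skipped.
        for u in range(n):
            if frontier[u]:
                for e in range(row_ptr[u], row_ptr[u + 1]):
                    v = col_idx[e]
                    if level[v] == -1:
                        level[v] = depth
                        next_frontier[v] = True
        frontier = next_frontier
    return level

# CSR adjacency of a small directed graph: 0->1, 0->2, 1->3, 2->3
row_ptr = [0, 2, 3, 4, 4]
col_idx = [1, 2, 3, 3]
print(bfs_levels(row_ptr, col_idx, 4, 0))  # [0, 1, 1, 2]
```

On a GPU, the inner two loops map naturally to a parallel SMVP kernel, which is what makes this formulation attractive compared to queue-based BFS.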