Abstract
Over the past decade, system architectures have trended clearly toward increased parallelism and heterogeneity, often yielding speedups of 10x to 100x. Despite numerous compiler and high-level synthesis studies, use of such systems has largely been limited to device experts because of the significantly increased application design complexity. To reduce this complexity, we introduce elastic computing: a framework that separates functionality from implementation details by letting designers write specialized functions, called elastic functions, from which an optimization framework can explore thousands of possible implementations, even ones using different algorithms. Elastic functions allow designers to execute the same application code efficiently on potentially any architecture and for different runtime parameters such as input size, battery life, etc. In this paper, we present an initial elastic computing framework that transparently optimizes application code onto diverse systems, achieving significant speedups ranging from 1.3x to 46x on a hyper-threaded Xeon system with an FPGA accelerator, a 16-CPU Opteron system, and a quad-core Xeon system.
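The abstract's core idea, one elastic function backed by several interchangeable algorithmic implementations, with the framework selecting among them per input size, can be sketched as follows. This is a minimal illustration only, not the paper's actual API: all class, method, and label names here are hypothetical, and a simple runtime timing pass stands in for the framework's offline implementation assessment.

```python
import random
import time

class ElasticFunction:
    """Hypothetical sketch: one interface, many implementations.
    The framework in the paper profiles implementations per
    architecture and input size; here we approximate that step
    by timing each implementation on sample inputs."""

    def __init__(self, name):
        self.name = name
        self.impls = {}   # label -> callable
        self.choice = {}  # input size -> label of fastest implementation

    def implementation(self, label):
        # Decorator that registers one implementation under a label.
        def register(fn):
            self.impls[label] = fn
            return fn
        return register

    def calibrate(self, sample_inputs):
        # Time every implementation on each sample and remember the
        # fastest per input size (stand-in for offline assessment).
        for sample in sample_inputs:
            best, best_t = None, float("inf")
            for label, fn in self.impls.items():
                start = time.perf_counter()
                fn(list(sample))  # copy so timing runs don't interfere
                elapsed = time.perf_counter() - start
                if elapsed < best_t:
                    best, best_t = label, elapsed
            self.choice[len(sample)] = best

    def __call__(self, data):
        # Dispatch to the profiled winner; fall back to any
        # registered implementation for unprofiled sizes.
        label = self.choice.get(len(data)) or next(iter(self.impls))
        return self.impls[label](data)

sort = ElasticFunction("sort")

@sort.implementation("insertion")
def insertion_sort(xs):
    # Fast for tiny inputs, quadratic for large ones.
    for i in range(1, len(xs)):
        j = i
        while j > 0 and xs[j - 1] > xs[j]:
            xs[j - 1], xs[j] = xs[j], xs[j - 1]
            j -= 1
    return xs

@sort.implementation("builtin")
def builtin_sort(xs):
    xs.sort()
    return xs

# Profile both implementations at a small and a large input size.
sort.calibrate([[random.random() for _ in range(n)] for n in (16, 1024)])
print(sort([3, 1, 2]))  # → [1, 2, 3]
```

The same pattern generalizes to the heterogeneous case the paper targets: an implementation label could just as well select an FPGA or multi-threaded variant, with calibration run once per target system.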
Elastic computing: a framework for transparent, portable, and adaptive multi-core heterogeneous computing. Published in LCTES '10: Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems.