research-article

Elastic computing: a framework for transparent, portable, and adaptive multi-core heterogeneous computing

Published: 13 April 2010

Abstract

Over the past decade, system architectures have started on a clear trend towards increased parallelism and heterogeneity, often resulting in speedups of 10x to 100x. Despite numerous compiler and high-level synthesis studies, usage of such systems has largely been limited to device experts, due to significantly increased application design complexity. To reduce application design complexity, we introduce elastic computing - a framework that separates functionality from implementation details by enabling designers to use specialized functions, called elastic functions, which enable an optimization framework to explore thousands of possible implementations, even ones using different algorithms. Elastic functions allow designers to execute the same application code efficiently on potentially any architecture and for different runtime parameters such as input size, battery life, etc. In this paper, we present an initial elastic computing framework that transparently optimizes application code onto diverse systems, achieving significant speedups ranging from 1.3x to 46x on a hyper-threaded Xeon system with an FPGA accelerator, a 16-CPU Opteron system, and a quad-core Xeon system.
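The idea of an elastic function can be illustrated with a minimal sketch. It assumes the simplest possible setup: one function name backed by several interchangeable implementations, a profiling pass that records which implementation is fastest at a few input sizes, and a runtime lookup that dispatches on the actual input size. The class `ElasticFunction`, its methods `profile` and `choose`, and the two sorting routines are hypothetical names chosen for illustration. They are not the paper's API, and the sketch covers only input size on a single CPU, not multi-core or FPGA mapping.

```python
import bisect
import time
from typing import Callable, Dict, List, Sequence, Tuple


def insertion_sort(data: Sequence[int]) -> List[int]:
    """Binary-insertion sort: usually competitive only for small inputs."""
    out: List[int] = []
    for x in data:
        bisect.insort(out, x)
    return out


def merge_sort(data: Sequence[int]) -> List[int]:
    """Top-down merge sort: O(n log n), better suited to large inputs."""
    if len(data) <= 1:
        return list(data)
    mid = len(data) // 2
    left, right = merge_sort(data[:mid]), merge_sort(data[mid:])
    merged: List[int] = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


class ElasticFunction:
    """Hypothetical wrapper: one interface, many candidate implementations."""

    def __init__(self, implementations: Dict[str, Callable]) -> None:
        self.implementations = dict(implementations)
        # (profiled input size, name of fastest implementation at that size)
        self.table: List[Tuple[int, str]] = []

    def profile(self, sizes: Sequence[int], make_input: Callable[[int], list]) -> None:
        """Time every implementation at each size and record the winner."""
        self.table = []
        for n in sorted(sizes):
            sample = make_input(n)
            timings: Dict[str, float] = {}
            for name, fn in self.implementations.items():
                start = time.perf_counter()
                fn(sample)
                timings[name] = time.perf_counter() - start
            self.table.append((n, min(timings, key=timings.get)))

    def choose(self, n: int) -> str:
        """Return the winner recorded at the profiled size nearest to n."""
        if not self.table:
            raise RuntimeError("call profile() before using the function")
        _, name = min(self.table, key=lambda entry: abs(entry[0] - n))
        return name

    def __call__(self, data: Sequence[int]) -> List[int]:
        return self.implementations[self.choose(len(data))](data)


# Application code calls elastic_sort and never names an implementation.
elastic_sort = ElasticFunction({"insertion": insertion_sort, "merge": merge_sort})
elastic_sort.profile([8, 256], lambda n: list(range(n, 0, -1)))
```

Calling `elastic_sort([3, 1, 2])` returns the sorted list whichever implementation the profile selected, so the calling code stays the same when the profile is regenerated on a different machine. A fuller framework would presumably also weigh the available devices and parameters other than input size.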

      • Published in

        ACM SIGPLAN Notices, Volume 45, Issue 4
        LCTES '10
        April 2010
        170 pages
        ISSN: 0362-1340
        EISSN: 1558-1160
        DOI: 10.1145/1755951

        LCTES '10: Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems
        April 2010
        184 pages
        ISBN: 9781605589534
        DOI: 10.1145/1755888

        Copyright © 2010 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States
