research-article

Shadowfax: scaling in heterogeneous cluster systems via GPGPU assemblies

Authors:
Alexander M. Merritt, Georgia Institute of Technology, Atlanta, GA, USA
Vishakha Gupta, Georgia Institute of Technology, Atlanta, GA, USA
Abhishek Verma, Georgia Institute of Technology, Atlanta, GA, USA
Ada Gavrilovska, Georgia Institute of Technology, Atlanta, GA, USA
Karsten Schwan, Georgia Institute of Technology, Atlanta, GA, USA

VTDC '11: Proceedings of the 5th international workshop on Virtualization technologies in distributed computing
June 2011, Pages 3–10
https://doi.org/10.1145/1996121.1996124

Published: 08 June 2011

ABSTRACT

Systems with specialized processors, such as those used for accelerating computations (like NVIDIA's graphics processors or IBM's Cell), have proven their utility in terms of higher performance and lower power consumption. They have also been shown to outperform general-purpose processors for graphics-intensive or high-performance applications, and for enterprise applications such as modern financial codes or web hosts that require scalable image processing. These facts are driving tremendous growth in accelerator-based platforms in the high-performance domain, with systems like Keeneland and supercomputers like Tianhe-1 and RoadRunner, and even in data center systems like Amazon's EC2.

The physical hardware in these systems, once purchased and assembled, is not reconfigurable and is expensive to modify or upgrade. This can eventually limit applications' performance and scalability unless they are rewritten to match specific versions of hardware and compositions of components, both for single nodes and for clusters of machines. To address this problem and to support increased flexibility in usage models for CUDA-based GPGPU applications, our research proposes GPGPU assemblies, where each assembly combines a desired number of CPUs and CUDA-supported GPGPUs to form a 'virtual execution platform' for an application. System-level software then creates and manages assemblies, including mapping them seamlessly to the actual cluster- and node-level hardware resources present in the system. Experimental evaluations of the initial implementation of GPGPU assemblies demonstrate their feasibility and the advantages derived from their use.
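To make the idea of an assembly concrete, here is a minimal sketch of how a requested number of CPUs and GPGPUs might be mapped onto physical devices spread across cluster nodes. It is an illustration only and not the Shadowfax implementation: the names `PhysicalGPU`, `Assembly`, and `build_assembly`, the node names, and the local-first placement policy are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PhysicalGPU:
    node: str    # host name of the machine holding the device (hypothetical)
    device: int  # device index on that node


@dataclass
class Assembly:
    """A 'virtual execution platform': some CPUs plus a list of virtual GPUs."""
    cpus: int
    # vgpus[i] is the physical device backing virtual GPU i
    vgpus: list = field(default_factory=list)

    def is_remote(self, vgpu_id, local_node):
        # A virtual GPU is remote when its backing device sits on another node.
        return self.vgpus[vgpu_id].node != local_node


def build_assembly(cpus, gpus_wanted, local_node, free_gpus):
    """Pick gpus_wanted devices from free_gpus, local ones first (assumed policy)."""
    if len(free_gpus) < gpus_wanted:
        raise ValueError("not enough free GPUs in the cluster")
    # sorted() is stable, so devices keep their relative order within
    # the local group and within the remote group.
    ordered = sorted(free_gpus, key=lambda g: g.node != local_node)
    return Assembly(cpus=cpus, vgpus=ordered[:gpus_wanted])


free = [PhysicalGPU("node1", 0), PhysicalGPU("node0", 0), PhysicalGPU("node1", 1)]
asm = build_assembly(cpus=4, gpus_wanted=2, local_node="node0", free_gpus=free)
print(asm.vgpus)
```

In this sketch the application would see two virtual GPUs numbered 0 and 1, while the table inside the assembly records that the second one lives on a different node. How calls to a remote device are actually forwarded is outside the scope of the example.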

Index Terms

Shadowfax: scaling in heterogeneous cluster systems via GPGPU assemblies
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Heterogeneous (hybrid) systems

Published in
VTDC '11: Proceedings of the 5th international workshop on Virtualization technologies in distributed computing
June 2011
44 pages
ISBN: 9781450307017
DOI: 10.1145/1996121
General Chair: Adrien Lèbre, Mines de Nantes, France
Program Chair: Kartik Gopalan, Binghamton University, USA
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
CUDA
GPGPU virtualization
application scalability
heterogeneous clusters
remote procedure call
Acceptance Rates
Overall Acceptance Rate: 5 of 10 submissions, 50%