
GPUswap: Enabling Oversubscription of GPU Memory through Transparent Swapping

Published: 14 March 2015
DOI: 10.1145/2731186.2731192

ABSTRACT

Over the last few years, GPUs have been finding their way into cloud computing platforms, allowing users to benefit from the performance of GPUs at low cost. However, a large portion of the cloud's cost advantage traditionally stems from oversubscription: cloud providers rent out more resources to their customers than are actually available, expecting that the customers will not actually use all of the promised resources. For GPU memory, such oversubscription is difficult because current GPUs lack support for demand paging. Recent approaches therefore resort to software scheduling of GPU kernels to ensure that data is present on the GPU whenever a kernel references it; however, such scheduling has been shown to induce significant runtime overhead even when sufficient GPU memory is available.

In this paper, we present GPUswap, a novel approach to enabling oversubscription of GPU memory that does not rely on software scheduling of GPU kernels. GPUswap uses the GPU's ability to access system RAM directly to extend the GPU's own memory. To that end, GPUswap transparently relocates data from the GPU to system RAM in response to memory pressure. GPUswap ensures that all data is permanently accessible to the GPU and thus allows applications to submit commands to the GPU directly at any time, without the need for software scheduling. Experiments with our prototype implementation show that GPU applications can still execute even with only 20 MB of GPU memory available. In addition, while software scheduling suffers from permanent overhead even with sufficient GPU memory available, our approach executes GPU applications with native performance.
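The following is a minimal sketch, in plain C, of the swap-out step the abstract describes: under memory pressure, a victim buffer is copied from GPU memory to system RAM, and the GPU page table is remapped so the buffer remains accessible at the same GPU virtual address. All identifiers are hypothetical placeholders invented for illustration; the real GPUswap implementation lives in the kernel driver and uses DMA transfers and actual page-table updates rather than malloc() and memcpy().

/*
 * A minimal, hypothetical sketch (not the actual GPUswap code) of the
 * swap-out step described above, simulated in plain C. "VRAM" and
 * "pinned system RAM" are both modeled with malloc(); in a real driver
 * the copy would be a DMA transfer and gpu_pagetable_remap() would
 * rewrite GPU page-table entries. All names are invented placeholders.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct gpu_buffer {
    unsigned long gpu_va;     /* GPU virtual address seen by kernels */
    size_t        size;
    void         *backing;    /* current physical backing            */
    bool          in_vram;
};

/* Stand-in for rewriting the GPU page table: after this call, the same
 * GPU virtual address resolves to the new backing store. A real driver
 * must also synchronize with in-flight GPU work before remapping. */
static void gpu_pagetable_remap(struct gpu_buffer *buf, void *new_backing)
{
    printf("remap GPU VA 0x%lx -> %p\n", buf->gpu_va, new_backing);
    buf->backing = new_backing;
}

/* Relocate one buffer from (simulated) VRAM to (simulated) pinned
 * system RAM; the buffer stays accessible at the same GPU address. */
static void swap_out(struct gpu_buffer *victim)
{
    void *sysram = malloc(victim->size);           /* pinned RAM stand-in */
    if (!sysram)
        return;
    memcpy(sysram, victim->backing, victim->size); /* DMA stand-in        */
    void *old_vram = victim->backing;
    gpu_pagetable_remap(victim, sysram);
    free(old_vram);                                /* VRAM is now free    */
    victim->in_vram = false;
}

int main(void)
{
    struct gpu_buffer buf = { 0x100000, 4096, malloc(4096), true };
    memset(buf.backing, 0xab, buf.size);

    swap_out(&buf);   /* triggered by memory pressure in the driver */

    printf("buffer now in %s, first byte 0x%02x\n",
           buf.in_vram ? "VRAM" : "system RAM",
           ((unsigned char *)buf.backing)[0]);
    free(buf.backing);
    return 0;
}

Because the GPU can address system RAM directly, kernels keep running against the same virtual addresses after relocation, which is why no software scheduling of kernel launches is required.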



Published in

VEE '15: Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, March 2015, 238 pages. ISBN: 978-1-4503-3450-1. DOI: 10.1145/2731186.

Also published in ACM SIGPLAN Notices, Volume 50, Issue 7 (VEE '15), July 2015, 221 pages. ISSN: 0362-1340. EISSN: 1558-1160. DOI: 10.1145/2817817. Editor: Andy Gill.

    Copyright © 2015 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


    Acceptance Rates

VEE '15 paper acceptance rate: 16 of 50 submissions (32%). Overall acceptance rate: 80 of 235 submissions (34%).
