DOI: 10.1145/2258996.2259002
research-article

GPUs as an opportunity for offloading garbage collection

Published: 15 June 2012

ABSTRACT

GPUs have become part of most commodity systems. Nonetheless, they are often underutilized when not executing graphics-intensive or special-purpose numerical computations, which are rare in consumer workloads. Emerging architectures, such as integrated CPU/GPU combinations, may create an opportunity to utilize these otherwise unused cycles for offloading traditional systems tasks. Garbage collection appears to be a particularly promising candidate for offloading, due to the popularity of managed languages on consumer devices.

We investigate the challenges of offloading garbage collection to a GPU by examining the performance trade-offs of the mark phase of a mark & sweep garbage collector. We present a theoretical analysis and an algorithm that demonstrates the feasibility of this approach. We also discuss a number of algorithmic design trade-offs required to leverage the strengths and capabilities of the GPU hardware. Our algorithm has been integrated into the Jikes RVM, and we present promising performance results.
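For readers unfamiliar with the mark phase discussed above, the following is a minimal sketch of a frontier-based (level-synchronous) mark over an object graph. It is an illustration only and not the algorithm from the paper: the heap representation (a dictionary from object id to referenced ids) and the names `mark` and `sweep` are assumptions made for this example. The sketch runs sequentially; the inner loop over the frontier is the part that a data-parallel device could in principle process with one work-item per entry.

```python
def mark(heap, roots):
    """Mark every object reachable from the roots.

    heap  : dict mapping an object id to the list of object ids it references
            (hypothetical representation chosen for this sketch)
    roots : iterable of object ids in the root set
    Returns the set of marked (live) object ids.
    """
    marked = set(roots)
    frontier = list(marked)
    while frontier:
        next_frontier = []
        # Entries of the frontier are independent of each other, so this loop
        # is the candidate for parallel execution. A parallel version would
        # need an atomic test-and-set on the mark, or would have to tolerate
        # duplicate entries in the next frontier.
        for obj in frontier:
            for child in heap[obj]:
                if child not in marked:
                    marked.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return marked


def sweep(heap, marked):
    """Return the ids of unmarked objects, i.e. the garbage."""
    return {obj for obj in heap if obj not in marked}


if __name__ == "__main__":
    # Objects 4 and 5 reference each other but are unreachable from root 0.
    heap = {0: [1, 2], 1: [3], 2: [3], 3: [], 4: [5], 5: [4]}
    live = mark(heap, [0])
    print(sorted(live))
    print(sorted(sweep(heap, live)))
```

The number of iterations of the outer loop equals the depth of the object graph from the roots, which is one reason deep, narrow structures such as long linked lists offer little parallelism to this style of marking.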


      • Published in

        ISMM '12: Proceedings of the 2012 international symposium on Memory Management
        June 2012
        152 pages
        ISBN: 9781450313506
        DOI: 10.1145/2258996
        • ACM SIGPLAN Notices, Volume 47, Issue 11
          ISMM '12
          November 2012
          136 pages
          ISSN: 0362-1340
          EISSN: 1558-1160
          DOI: 10.1145/2426642

        Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        Overall Acceptance Rate: 72 of 156 submissions, 46%
