skip to main content
research-article

NumaGiC: a Garbage Collector for Big Data on Big NUMA Machines

Published:14 March 2015Publication History
Skip Abstract Section

Abstract

On contemporary cache-coherent Non-Uniform Memory Access (ccNUMA) architectures, applications with a large memory footprint suffer from the cost of the garbage collector (GC), because, as the GC scans the reference graph, it makes many remote memory accesses, saturating the interconnect between memory nodes. We address this problem with NumaGiC, a GC with a mostly-distributed design. In order to maximise memory access locality during collection, a GC thread avoids accessing a different memory node, instead notifying a remote GC thread with a message; nonetheless, NumaGiC avoids the drawbacks of a pure distributed design, which tends to decrease parallelism. We compare NumaGiC with Parallel Scavenge and NAPS on two different ccNUMA architectures running on the Hotspot Java Virtual Machine of OpenJDK 7. On Spark and Neo4j, two industry-strength analytics applications, with heap sizes ranging from 160GB to 350GB, and on SPECjbb2013 and SPECjbb2005, ourgc improves overall performance by up to 45% over NAPS (up to 94% over Parallel Scavenge), and increases the performance of the collector itself by up to 3.6x over NAPS (up to 5.4x over Parallel Scavenge).

References

  1. T. A. Anderson. Optimizations in a private nursery-based garbage collector. In ISMM '10, pages 21--30. ACM, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. A. W. Appel. Simple generational garbage collection and fast allocation. SP&E, 19(2):171--183, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. A. Baumann, P. Barham, P.-E. Dagand, T. Harris, R. Isaacs, S. Peter, T. Roscoe, A. Schupbach, and A. Singhania. The multikernel: a new OS architecture for scalable multicore systems. In SOSP '09, pages 29--44. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. S. M. Blackburn, R. Garner, C. Hoffmann, A. M. Khang, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanovic, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The DaCapo benchmarks: Java benchmarking development and analysis. In OOPSLA '06, pages 169--190. ACM, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. K. M. Chandy and L. Lamport. Distributed snapshots: Determining global states of distributed systems. TOCS, 3(1):63--75, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. M. Dashti, A. Fedorova, J. Funston, F. Gaud, R. Lachaize, B. Lepers, V. Quema, and M. Roth. Traffic management: A holistic approach to memory placement on NUMA systems. In ASPLOS '13, pages 381--394. ACM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. D. Doligez and X. Leroy. A concurrent, generational garbage collector for a multithreaded implementation of ml. In POPL '93, pages 113--123. ACM, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Friendster. SNAP: network datasets: Friendster social network. http://snap.stanford.edu/data/com-Friendster.html, 2014.Google ScholarGoogle Scholar
  9. L. Gidra, G. Thomas, J. Sopena, and M. Shapiro. Assessing the scalability of garbage collectors on many cores. In SOSP Workshop on Programming Languages and Operating Systems, PLOS '11, pages 1--5. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. L. Gidra, G. Thomas, J. Sopena, and M. Shapiro. A study of the scalability of stop-the-world garbage collectors on multicores. In ASPLOS '13, pages 229--240. ACM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. H2. H2 database engine. http://www.h2database.com/, 2014.Google ScholarGoogle Scholar
  12. M. Herlihy and N. Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. R. Jones, A. Hosking, and E. Moss. The garbage collection handbook: the art of automatic memory management. Chapman & Hall/CRC, 1st edition, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. H. Lieberman and C. Hewitt. A real-time garbage collector based on the lifetimes of objects. CACM, 26(6):419--429, 1983. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. LinuxMemPolicy. What is Linux memory policy? http://www.kernel.org/doc/Documentation/vm/numa_memory_policy.txt, 2014.Google ScholarGoogle Scholar
  16. S. Marlow and S. Peyton Jones. Multicore garbage collection with local heaps. In ISMM '11, pages 21--32. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Neo4j. Neo4j -- the world's leading graph database. http://www.neo4j.org, 2014.Google ScholarGoogle Scholar
  18. E. B. Nightingale, O. Hodson, R. McIlroy, C. Hawblitzel, and G. Hunt. Helios: heterogeneous multiprocessing with satellite kernels. In SOSP '09, pages 221--234. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. T. Ogasawara. NUMA-aware memory manager with dominant-thread-based copying GC. In OOPSLA '09, pages 377--390. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. K. Sivaramakrishnan, L. Ziarek, and S. Jagannathan. Eliminating read barriers through procrastination and cleanliness. In ISMM '12, pages 49--60. ACM, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. P. Sobalvarro. A lifetime-based garbage collector for LISP systems on general-purpose computers. Technical report, Cam- bridge, MA, USA, 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. X. Song, H. Chen, R. Chen, Y. Wang, and B. Zang. A case for scaling applications to many-core with OS clustering. In EuroSys '11, pages 61--76. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Spark. Apache Spark-- lightning-fast cluster computing. http://spark.apache.org, 2014.Google ScholarGoogle Scholar
  24. SPECjbb2005. SPECjbb2005 home page. http://www.spec.org/jbb2005/, 2014.Google ScholarGoogle Scholar
  25. SPECjbb2013. SPECjbb2013 home page. http://www.spec.org/jbb2013/, 2014.Google ScholarGoogle Scholar
  26. I. Stanton and G. Kliot. Streaming graph partitioning for large distributed graphs. In KDD '12, pages 1222--1230. ACM, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. B. Steensgaard. Thread-specific heaps for multi-threaded programs. In ISMM '00, pages 18--24. ACM, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. M. M. Tikir and J. K. Hollingsworth. NUMA-aware Java heaps for server applications. In IPDPS '05, pages 108--117. IEEE Computer Society, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. D. Ungar. Generation scavenging: A non-disruptive high performance storage reclamation algorithm. In SDE '84, pages 157--167. ACM, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. P. R. Wilson and T. G. Moher. A "card-marking" scheme for controlling intergenerational references in generation-based garbage collection on stock hardware. SIGPLAN Notice, 24 (5):87--92, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. J. Zhou and B. Demsky. Memory management for many-core processors with software configurable locality policies. In ISMM '12, pages 3--14. ACM, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. NumaGiC: a Garbage Collector for Big Data on Big NUMA Machines

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image ACM SIGARCH Computer Architecture News
    ACM SIGARCH Computer Architecture News  Volume 43, Issue 1
    ASPLOS'15
    March 2015
    676 pages
    ISSN:0163-5964
    DOI:10.1145/2786763
    Issue’s Table of Contents
    • cover image ACM Conferences
      ASPLOS '15: Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems
      March 2015
      720 pages
      ISBN:9781450328357
      DOI:10.1145/2694344

    Copyright © 2015 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 14 March 2015

    Check for updates

    Qualifiers

    • research-article

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader