ABSTRACT
Reference counting (RC) is one of the two fundamental approaches to garbage collection. It has the desirable characteristics of low memory overhead and short pause times, which are key in today's interactive mobile platforms. However, RC has a higher execution time overhead than its counterpart, tracing garbage collection. The reason is that RC implementations maintain per-object counters, which must be continually updated. In particular, the execution time overhead is high in environments where low memory overhead is critical and, therefore, non-deferred RC is used. This is because the counter updates need to be performed atomically.
To address this problem, this paper proposes a novel algorithm called Biased Reference Counting (BRC), which significantly improves the performance of non-deferred RC. BRC is based on the observation that most objects are only accessed by a single thread, which allows most RC operations to be performed non-atomically. BRC leverages this by biasing each object towards a specific thread, and keeping two counters for each object --- one updated by the owner thread and another updated by the other threads. This allows the owner thread to perform RC operations non-atomically, while the other threads update the second counter atomically.
We implement BRC in the Swift programming language runtime, and evaluate it with client and server programs. We find that BRC makes each RC operation more than twice faster in the common case. As a result, BRC reduces the average execution time of client programs by 22.5%, and boosts the average throughput of server programs by 7.3%.
- An Artificial Garbage Collection Benchmark. http://www.hboehm.info/gc/gc_bench.htmlGoogle Scholar
- CryptoSwift. https://github.com/krzyzanowskim/CryptoSwiftGoogle Scholar
- Kitura: A Swift Web framework and HTTP Server. http://www.kitura.io/Google Scholar
- Perfect: Server-side Swift. http://perfect.org/Google Scholar
- Server-side swift benchmarks. https://github.com/rymcol/Server-Side-Swift-Benchmarks-Summer-2017Google Scholar
- Swift Benchmark Suite. https://github.com/apple/swift/tree/master/benchmarkGoogle Scholar
- Swift Has Reached 1.0. https://developer.apple.com/swift/blog/?id=14Google Scholar
- Swift Programming Language. https://swift.org/.Google Scholar
- Swift Version of Ray Tracing. https://github.com/rnapier/raytraceGoogle Scholar
- SwiftyJSON. https://github.com/SwiftyJSON/SwiftyJSONGoogle Scholar
- T. H. Axford. 1990. Reference Counting of Cyclic Graphs for Functional Programs. Comput. J. 33, 5 (1990), 466--472. Google ScholarDigital Library
- David F. Bacon, Clement R. Attanasio, Han B. Lee, V. T. Rajan, and Stephen Smith. 2001. Java Without the Coffee Breaks: A Nonintrusive Multiprocessor Garbage Collector. In Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation (PLDI '01). 92--103. Google ScholarDigital Library
- David F. Bacon and V. T. Rajan. 2001. Concurrent Cycle Collection in Reference Counted Systems. In Proceedings of the 15th European Conference on Object-Oriented Programming (ECOOP '01). 207--235. Google ScholarDigital Library
- Jeffrey M. Barth. 1977. Shifting Garbage Collection Overhead to Compile Time. Commun. ACM 20, 7 (1977), 513--518. Google ScholarDigital Library
- Stephen M. Blackburn and Kathryn S. McKinley. 2003. Ulterior Reference Counting: Fast Garbage Collection Without a Long Wait. In Proceedings of the 18th Annual ACM SIGPLAN Conference on Object-oriented Programing, Systems, Languages, and Applications (OOPSLA '03). 344--358. Google ScholarDigital Library
- David R. Brownbridge. 1985. Cyclic reference counting for combinator machines. In Conference on Functional Programming and Computer Architecture. 273--288. Google ScholarDigital Library
- George E. Collins. 1960. A Method for Overlapping and Erasure of Lists. Commun. ACM 3, 12 (Dec. 1960), 655--657. Google ScholarDigital Library
- L. Peter Deutsch and Daniel G. Bobrow. 1976. An Efficient, Incremental, Automatic Garbage Collector. Commun. ACM 19, 9 (1976), 522--526. Google ScholarDigital Library
- David Dice, Mark Moir, and William Scherer III. 2003. Quickly Reacquirable Locks. Technical Report. Sun Microsystem Laboratories.Google Scholar
- José A. Joao, Onur Mutlu, and Yale N. Patt. 2009. Flexible Reference-counting-based Hardware Acceleration for Garbage Collection. In Proceedings of the 36th Annual International Symposium on Computer Architecture (ISCA '09). 418--428. Google ScholarDigital Library
- Pramod G. Joisha. 2006. Compiler Optimizations for Nondeferred Reference-Counting Garbage Collection. In Proceedings of the 5th International Symposium on Memory Management (ISMM '06). Google ScholarDigital Library
- Pramod G. Joisha. 2007. Overlooking Roots: A Framework for Making Non-deferred Reference-counting Garbage Collection Fast. In Proceedings of the 6th International Symposium on Memory Management (ISMM '07). 141--158. Google ScholarDigital Library
- Pramod G. Joisha. 2008. A Principled Approach to Nondeferred Reference-counting Garbage Collection. In Proceedings of the 4th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE '08). 131--140. Google ScholarDigital Library
- Richard E. Jones and Rafael D. Lins. 1993. Cyclic Weighted Reference Counting Without Delay. In Proceedings of the 5th International PARLE Conference on Parallel Architectures and Languages Europe (PARLE '93). 712--715. Google ScholarDigital Library
- Kiyokuni Kawachiya, Akira Koseki, and Tamiya Onodera. 2002. Lock Reservation: Java Locks Can Mostly Do Without Atomic Operations. In Proc. of the 17th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA '02). Google ScholarDigital Library
- Yossi Levanoni and Erez Petrank. 2001. An On-the-fly Reference Counting Garbage Collector for Java. In Proceedings of the 16th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA'01). 367--380. Google ScholarDigital Library
- A. D. Martínez, R. Wachenchauzer, and R. D. Lins. 1990. Cyclic Reference Counting with Local Mark-scan. Inf. Process. Lett. 34, 1 (1990), 31--35. Google ScholarDigital Library
- John McCarthy. 1960. Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I. Commun. ACM 3, 4 (1960), 184--195. Google ScholarDigital Library
- Tamiya Onodera, Kikyokuni Kawachiya, and Akira Koseki. 2004. Lock Reservation for Java Reconsidered. In Proceedings of the 18th European Conference on Object-Oriented Programming (ECOOP '04).Google ScholarCross Ref
- Young Gil Park and Benjamin Goldberg. 1991. Reference Escape Analysis: Optimizing Reference Counting Based on the Lifetime of References. In Proceedings of the 1991 ACM SIGPLAN Symposium on Partial Evaluation and Semantics-based Program Manipulation (PEPM '91). 178--189. Google ScholarDigital Library
- Harel Paz, David F. Bacon, Elliot K. Kolodner, Erez Petrank, and V. T. Rajan. 2007. An Efficient On-the-fly Cycle Collection. ACM Trans. Program. Lang. Syst. 29, 4 (2007). Google ScholarDigital Library
- Harel Paz and Erez Petrank. 2007. Using Prefetching to Improve Reference-counting Garbage Collectors. In Proceedings of the 16th International Conference on Compiler Construction (CC'07). 48--63. Google ScholarDigital Library
- Harel Paz, Erez Petrank, David F. Bacon, Elliot K. Kolodner, and V. T. Rajan. 2005. An Efficient On-the-fly Cycle Collection. In Proceedings of the 14th International Conference on Compiler Construction (CC'05). 156--171. Google ScholarDigital Library
- Filip Pizlo, Daniel Frampton, and Antony L. Hosking. 2011. Fine-grained Adaptive Biased Locking. In Proceedings of the 9th International Conference on Principles and Practice of Programming in Java (PPPJ '11). 171--181. Google ScholarDigital Library
- Ian Rogers and Balaji Iyengar. 2011. Reducing Biased Lock Revocation by Learning. In Proceedings of the 6th Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems. 65--73. Google ScholarDigital Library
- Kenneth Russell and David Detlefs. 2006. Eliminating Synchronization-related Atomic Operations with Biased Locking and Bulk Rebiasing. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-oriented Programming Systems, Languages, and Applications (OOPSLA '06). 263--272. Google ScholarDigital Library
- Rifat Shahriyar, Stephen M. Blackburn, and Daniel Frampton. 2012. Down for the Count? Getting Reference Counting Back in the Ring. In Proceedings of the 2012 International Symposium on Memory Management (ISMM '12). 73--84. Google ScholarDigital Library
- Rifat Shahriyar, Stephen M. Blackburn, and Kathryn S. McKinley. 2014. Fast Conservative Garbage Collection. In Proceedings of the 2014 ACM International Conference on Object-oriented Programming Systems Languages, and Applications (OOPSLA '14). 121--139. Google ScholarDigital Library
- Rifat Shahriyar, Stephen Michael Blackburn, Xi Yang, and Kathryn S. McKinley. 2013. Taking off the Gloves with Reference Counting Immix. In Proceedings of the 2013 ACM SIGPLAN International Conference on Object-oriented Programming Systems Languages, and Applications (OOPSLA '13). 93--110. Google ScholarDigital Library
- David Ungar. 1984. Generation Scavenging: A Non-disruptive High Performance Storage Reclamation Algorithm. In Proceedings of the First ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments (SDE 1). 157--167. Google ScholarDigital Library
- David Ungar, David Grove, and Hubertus Franke. 2017. Dynamic Atomicity: Optimizing Swift Memory Management. In Proceedings of the 13th ACM SIGPLAN International Symposium on Dynamic Languages (DLS '17). 15--26. Google ScholarDigital Library
- Nalini Vasudevan, Kedar S. Namjoshi, and Stephen A. Edwards. 2010. Simple and Fast Biased Locks. In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT '10). 65--74. Google ScholarDigital Library
Index Terms
- Biased reference counting: minimizing atomic operations in garbage collection
Recommendations
Low-latency, high-throughput garbage collection
PLDI 2022: Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and ImplementationTo achieve short pauses, state-of-the-art concurrent copying collectors such as C4, Shenandoah, and ZGC use substantially more CPU cycles and memory than simpler collectors. They suffer from design limitations: i) concurrent copying with inherently ...
Ulterior reference counting: fast garbage collection without a long wait
Special Issue: Proceedings of the OOPSLA '03 conferenceGeneral purpose garbage collectors have yet to combine short pause times with high throughput. For example, generational collectors can achieve high throughput. They have modest average pause times, but occasionally collect the whole heap and ...
Ulterior reference counting: fast garbage collection without a long wait
OOPSLA '03: Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programing, systems, languages, and applicationsGeneral purpose garbage collectors have yet to combine short pause times with high throughput. For example, generational collectors can achieve high throughput. They have modest average pause times, but occasionally collect the whole heap and ...
Comments