research-article

Efficient dynamic heap allocation of scratch-pad memory

Authors:
Ross McIlroy

University of Glasgow, Glasgow, United Kingdom

University of Glasgow, Glasgow, United Kingdom
View Profile

,
Peter Dickman

Google Inc., Zurich, Switzerland

Google Inc., Zurich, Switzerland
View Profile

,
Joe Sventek

University of Glasgow, Glasgow, United Kingdom

University of Glasgow, Glasgow, United Kingdom
View Profile

ISMM '08: Proceedings of the 7th international symposium on Memory managementJune 2008Pages 31–40https://doi.org/10.1145/1375634.1375640

Published:07 June 2008Publication History

ISMM '08: Proceedings of the 7th international symposium on Memory management

Pages 31–40

ABSTRACT

An increasing number of processor architectures support scratch-pad memory - software managed on-chip memory. Scratch-pad memory provides low latency data storage, like on-chip caches, but under explicit software control. The simple design and predictable nature of scratchpad memories has seen them incorporated into a number of embedded and real-time system processors. They are also employed by multi-core architectures to isolate processor core local data and act as low latency inter-core shared memory.

Managing scratch-pad memory by hand is time consuming, error prone and potentially wasteful; tools that automatically manage this memory are essential for its use by general purpose software. While there has been promising work in compile time allocation of scratch-pad memory, there will always be applications which require run-time allocation. Modern dynamic memory management techniques are too heavy-weight for scratch-pad management.

This paper presents the Scratch-Pad Memory Allocator, a light-weight memory management algorithm, specifically designed to manage small on-chip memories. This algorithm uses a variety of techniques to reduce its memory footprint while still remaining effective, including: representing memory both as fixed-sized blocks and variable-sized regions within these blocks; coding of memory state in bitmap structures; and exploiting the layout of adjacent regions to dispense with boundary tags for split and coalesce operations. We compare the performance of this allocator against Doug Lea's malloc implementation for the management of core-local and inter-core shared scratchpad memories under real world memory traces. This algorithm manages small memories efficiently and scales well under load when multiple competing cores access shared memory.

References

M. Adiletta, M. Rosenbluth, D. Bernstein, G. Wolrich, and H. Wilkinson. The Next Generation of Intel IXP Network Processors. Intel Tech. Journal, 6(3), 2002.Google Scholar
O. Avissar, R. Barua, and D. Stewart. An optimal memory allocation scheme for scratch-pad-based embedded systems. ACM Trans. on Embedded Comp. Sys., 1(1), 2002. Google ScholarDigital Library
R. Banakar, S. Steinke, B.S. Lee, M. Balakrishnan, and P. Marwedel. Scratchpad memory: design alternative cache on-chip memory in embedded systems. 10th Int?l Symp. on Hardware/Software codesign, 2002. Google ScholarDigital Library
E.D. Berger, K.S. McKinley, R.D. Blumofe, and P.R. Wilson. Hoard: a scalable memory allocator for multithreaded applications. ACM SIGPLAN Notices, 35(11):117--128, 2000. Google ScholarDigital Library
D Brash. Arm-v6 architecture. White Paper, 2002.Google Scholar
JMChang and EF Gehringer. A high performance memory allocator for object-oriented systems. Computers, IEEE Trans-actions on, 45(3):357--366, 1996. Google ScholarDigital Library
K. D. Cooper and T.J. Harvey. Compiler-controlled memory. In 8th Int?l conf. on Architectural support for programming languages and operating systems, 1998. Google ScholarDigital Library
A. Dominguez, S. Udayakumaran, and R. Barua. Heap data allocation to scratch-pad memory in embedded systems. Journal of Embedded Computing, 1(4), 2005. Google ScholarDigital Library
E.G. Hallnor and S.K. Reinhardt. A fully associative software-managed cache design. SIGARCH Comput. Archit. News, 28 (2), 2000. Google ScholarDigital Library
J.D. Hiser and J.W.Davidson. EMBARC: an efficientmemory bank assignment algorithm for retargetable compilers. ACM SIGPLAN Notices, 39(7), 2004. Google ScholarDigital Library
Poul-Henning Kamp. Malloc(3) revisited. In Proc. of the Annual Technical Conf. on USENIX, 1998. Google ScholarDigital Library
M. Kandemir, J. Ramanujam, J. Irwin, N. Vijaykrishnan, I. Kadayif, and A. Parikh. Dynamic management of scratchpad memory space. In 38th conf. on Design automation, 2001. Google ScholarDigital Library
K.C. Knowlton. A fast storage allocator. Communications of the ACM, 8(10):623--624, 1965. Google ScholarDigital Library
D.E. Knuth. The Art of Computer Programming. Vol. 1: Fundamental Algorithms. Addison-Wesley, 1973.Google ScholarDigital Library
D.E. Knuth, J.H. Morris Jr, and V.R. Pratt. Fast Pattern Matching in Strings. SIAM Journal on Computing, 1977.Google ScholarCross Ref
D Krishnaswamy, R Stevens, R Hasbun, J Revilla, and C Hagan. The Intel PXA800F wireless Internet-on-a-chip architecture and design. In IEEE Custom Integrated Circuits, 2003.Google ScholarCross Ref
D. Lea. A memory allocator, 1996. http://gee.cs.oswego.edu/dl/html/malloc.html.Google Scholar
P.R. Panda, N.D. Dutt, and A. Nicolau. On-chip vs. offchip memory: the data partitioning problem in embedded processor-based systems. ACM Trans. on Design Automation of Electronic Systems, 5(3), 2000. Google ScholarDigital Library
D. Pham, S. Asano, M. Bolliger, M.N Day, H.P Hofstee, C. Johns, J. Kahle, A. Kameyama, J. Keaty, Y. Masubuchi, et al. The design and implementation of a first-generation CELL processor. IEEE Solid-State Circuits Conference, 2005.Google ScholarCross Ref
J. Scott, L.H. Lee, J. Arends, and B. Moyer. Designing the Low-Power MCORE Architecture. Power Driven Microarchitecture Workshop, 1998.Google Scholar
Tammo Spalink, Scott Karlin, Larry Peterson, and Yitzchak Gottlieb. Building a robust software-based router using network processors. SIGOPS Oper. Syst. Rev., 35(5), 2001. Google ScholarDigital Library
P.R. Wilson, M.S. Johnstone, M. Neely, and D. Boles. Dynamic Storage Allocation: A Survey and Critical Review. Int?l Workshop on Memory Mgmt., 1995. Google ScholarDigital Library
W.A. Wulf and S.A. McKee. Hitting the memory wall: implications of the obvious. SIGARCH Comp. Arch. News, 1995. Google ScholarDigital Library

Index Terms

Efficient dynamic heap allocation of scratch-pad memory
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Heterogeneous (hybrid) systems
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Dynamic compilers
  2. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management
        Allocation / deallocation strategies

Recommendations

Compiler-assisted dynamic scratch-pad memory management with space overlapping for embedded systems

Scratch-pad memory (SPM), a small, fast, software-managed on-chip SRAM (Static Random Access Memory) is widely used in embedded systems. With the ever-widening performance gap between processors and main memory, it is very important to reduce the ...
Read More
Heap data allocation to scratch-pad memory in embedded systems
Cache exploitation in embedded systems

This paper presents the first-ever compile-time method for allocating a portion of the heap data to scratch-pad memory. A scratch-pad is a fast directly addressed compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by ...
Read More
Enabling Hybrid PCM Memory System with Inherent Memory Management
RACS '16: Proceedings of the International Conference on Research in Adaptive and Convergent Systems

Replacing the traditional volatile main memory, e.g., DRAM, with a non-volatile phase change memory (PCM) has become a possible solution to reduce the energy consumption of computing systems. To further reduce the bit cost of PCM, the development trend ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
ISMM '08: Proceedings of the 7th international symposium on Memory management
June 2008
170 pages
ISBN:9781605581347
DOI:10.1145/1375634
General Chair:
Richard Jones
University of Kent
,
Program Chair:
Steve Blackburn
Australian National University
Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 7 June 2008
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
concurrency
memory management
on-core memory
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate72of156submissions,46%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 50
  Total Citations
  View Citations
- 616
  Total Downloads
- Downloads (Last 12 months)7
- Downloads (Last 6 weeks)2
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Efficient dynamic heap allocation of scratch-pad memory

ISMM '08: Proceedings of the 7th international symposium on Memory management

ABSTRACT

References

Cited By

Index Terms

Recommendations

Compiler-assisted dynamic scratch-pad memory management with space overlapping for embedded systems

Heap data allocation to scratch-pad memory in embedded systems

Enabling Hybrid PCM Memory System with Inherent Memory Management