ABSTRACT
An increasing number of processor architectures support scratch-pad memory - software managed on-chip memory. Scratch-pad memory provides low latency data storage, like on-chip caches, but under explicit software control. The simple design and predictable nature of scratchpad memories has seen them incorporated into a number of embedded and real-time system processors. They are also employed by multi-core architectures to isolate processor core local data and act as low latency inter-core shared memory.
Managing scratch-pad memory by hand is time consuming, error prone and potentially wasteful; tools that automatically manage this memory are essential for its use by general purpose software. While there has been promising work in compile time allocation of scratch-pad memory, there will always be applications which require run-time allocation. Modern dynamic memory management techniques are too heavy-weight for scratch-pad management.
This paper presents the Scratch-Pad Memory Allocator, a light-weight memory management algorithm, specifically designed to manage small on-chip memories. This algorithm uses a variety of techniques to reduce its memory footprint while still remaining effective, including: representing memory both as fixed-sized blocks and variable-sized regions within these blocks; coding of memory state in bitmap structures; and exploiting the layout of adjacent regions to dispense with boundary tags for split and coalesce operations. We compare the performance of this allocator against Doug Lea's malloc implementation for the management of core-local and inter-core shared scratchpad memories under real world memory traces. This algorithm manages small memories efficiently and scales well under load when multiple competing cores access shared memory.
- M. Adiletta, M. Rosenbluth, D. Bernstein, G. Wolrich, and H. Wilkinson. The Next Generation of Intel IXP Network Processors. Intel Tech. Journal, 6(3), 2002.Google Scholar
- O. Avissar, R. Barua, and D. Stewart. An optimal memory allocation scheme for scratch-pad-based embedded systems. ACM Trans. on Embedded Comp. Sys., 1(1), 2002. Google ScholarDigital Library
- R. Banakar, S. Steinke, B.S. Lee, M. Balakrishnan, and P. Marwedel. Scratchpad memory: design alternative cache on-chip memory in embedded systems. 10th Int?l Symp. on Hardware/Software codesign, 2002. Google ScholarDigital Library
- E.D. Berger, K.S. McKinley, R.D. Blumofe, and P.R. Wilson. Hoard: a scalable memory allocator for multithreaded applications. ACM SIGPLAN Notices, 35(11):117--128, 2000. Google ScholarDigital Library
- D Brash. Arm-v6 architecture. White Paper, 2002.Google Scholar
- JMChang and EF Gehringer. A high performance memory allocator for object-oriented systems. Computers, IEEE Trans-actions on, 45(3):357--366, 1996. Google ScholarDigital Library
- K. D. Cooper and T.J. Harvey. Compiler-controlled memory. In 8th Int?l conf. on Architectural support for programming languages and operating systems, 1998. Google ScholarDigital Library
- A. Dominguez, S. Udayakumaran, and R. Barua. Heap data allocation to scratch-pad memory in embedded systems. Journal of Embedded Computing, 1(4), 2005. Google ScholarDigital Library
- E.G. Hallnor and S.K. Reinhardt. A fully associative software-managed cache design. SIGARCH Comput. Archit. News, 28 (2), 2000. Google ScholarDigital Library
- J.D. Hiser and J.W.Davidson. EMBARC: an efficientmemory bank assignment algorithm for retargetable compilers. ACM SIGPLAN Notices, 39(7), 2004. Google ScholarDigital Library
- Poul-Henning Kamp. Malloc(3) revisited. In Proc. of the Annual Technical Conf. on USENIX, 1998. Google ScholarDigital Library
- M. Kandemir, J. Ramanujam, J. Irwin, N. Vijaykrishnan, I. Kadayif, and A. Parikh. Dynamic management of scratchpad memory space. In 38th conf. on Design automation, 2001. Google ScholarDigital Library
- K.C. Knowlton. A fast storage allocator. Communications of the ACM, 8(10):623--624, 1965. Google ScholarDigital Library
- D.E. Knuth. The Art of Computer Programming. Vol. 1: Fundamental Algorithms. Addison-Wesley, 1973.Google ScholarDigital Library
- D.E. Knuth, J.H. Morris Jr, and V.R. Pratt. Fast Pattern Matching in Strings. SIAM Journal on Computing, 1977.Google ScholarCross Ref
- D Krishnaswamy, R Stevens, R Hasbun, J Revilla, and C Hagan. The Intel PXA800F wireless Internet-on-a-chip architecture and design. In IEEE Custom Integrated Circuits, 2003.Google ScholarCross Ref
- D. Lea. A memory allocator, 1996. http://gee.cs.oswego.edu/dl/html/malloc.html.Google Scholar
- P.R. Panda, N.D. Dutt, and A. Nicolau. On-chip vs. offchip memory: the data partitioning problem in embedded processor-based systems. ACM Trans. on Design Automation of Electronic Systems, 5(3), 2000. Google ScholarDigital Library
- D. Pham, S. Asano, M. Bolliger, M.N Day, H.P Hofstee, C. Johns, J. Kahle, A. Kameyama, J. Keaty, Y. Masubuchi, et al. The design and implementation of a first-generation CELL processor. IEEE Solid-State Circuits Conference, 2005.Google ScholarCross Ref
- J. Scott, L.H. Lee, J. Arends, and B. Moyer. Designing the Low-Power MCORE Architecture. Power Driven Microarchitecture Workshop, 1998.Google Scholar
- Tammo Spalink, Scott Karlin, Larry Peterson, and Yitzchak Gottlieb. Building a robust software-based router using network processors. SIGOPS Oper. Syst. Rev., 35(5), 2001. Google ScholarDigital Library
- P.R. Wilson, M.S. Johnstone, M. Neely, and D. Boles. Dynamic Storage Allocation: A Survey and Critical Review. Int?l Workshop on Memory Mgmt., 1995. Google ScholarDigital Library
- W.A. Wulf and S.A. McKee. Hitting the memory wall: implications of the obvious. SIGARCH Comp. Arch. News, 1995. Google ScholarDigital Library
Index Terms
- Efficient dynamic heap allocation of scratch-pad memory
Recommendations
Compiler-assisted dynamic scratch-pad memory management with space overlapping for embedded systems
Scratch-pad memory (SPM), a small, fast, software-managed on-chip SRAM (Static Random Access Memory) is widely used in embedded systems. With the ever-widening performance gap between processors and main memory, it is very important to reduce the ...
Heap data allocation to scratch-pad memory in embedded systems
Cache exploitation in embedded systemsThis paper presents the first-ever compile-time method for allocating a portion of the heap data to scratch-pad memory. A scratch-pad is a fast directly addressed compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by ...
Enabling Hybrid PCM Memory System with Inherent Memory Management
RACS '16: Proceedings of the International Conference on Research in Adaptive and Convergent SystemsReplacing the traditional volatile main memory, e.g., DRAM, with a non-volatile phase change memory (PCM) has become a possible solution to reduce the energy consumption of computing systems. To further reduce the bit cost of PCM, the development trend ...
Comments