Abstract
Software-controlled scratchpad memory is increasingly employed in embedded systems as it offers better timing predictability compared to caches. Previous scratchpad allocation algorithms typically consider single-process applications. But embedded applications are mostly multitasking with real-time constraints, where the scratchpad memory space has to be shared among interacting processes that may preempt each other. In this work, we develop a novel dynamic scratchpad allocation technique that takes these process interferences into account to improve the performance and predictability of the memory system. We model the application as a Message Sequence Chart (MSC) to best capture the interprocess interactions. Our goal is to optimize the Worst-Case Response Time (WCRT) of the application through runtime reloading of the scratchpad memory content at appropriate execution points. We propose an iterative allocation algorithm that consists of two critical steps: (1) analyze the MSC along with the existing allocation to determine potential interference patterns, and (2) exploit this interference information to tune the scratchpad reloading points and content so as to best improve the WCRT. We present various alternative scratchpad allocation heuristics and evaluate their effectiveness in reducing the WCRT. The scheme is also extended to work on Message Sequence Graph models. We evaluate our memory allocation scheme on two real-world embedded applications controlling an Unmanned Aerial Vehicle (UAV) and an in-orbit monitoring instrument, respectively.
- Alur, R. and Yannakakis, M. 1999. Model checking message sequence charts. In Proceedings of the 10th International Conference on Concurrency Theory (CONCUR), J. C. Baeten and S. Mauw, Eds. Springer. 114--129. Google ScholarDigital Library
- Angiolini, F., Menichelli, F., Ferrero, A., Benini, L., and Olivieri, M. 2004. A post-compiler approach to scratchpad mapping of code. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES). ACM, New York, 259--267. Google ScholarDigital Library
- Austin, T., Larson, E., and Ernst, D. 2002. SimpleScalar: An infrastructure for computer system modeling. IEEE Comput. 35, 2, 59--67. Google ScholarDigital Library
- Avissar, O., Barua, R., and Stewart, D. 2002. An optimal memory allocation scheme for scratch-pad-based embedded systems. ACM Trans. Embed. Comput. Syst. 1, 1, 6--26. Google ScholarDigital Library
- Banakar, R., Steinke, S., Lee, B.-S., Balakrishnan, M., and Marwedel, P. 2002. Scratchpad memory: Design alternative for cache on-chip memory in embedded systems. In Proceedings of the 10th International Symposium on Hardware/Software Codesign (CODES). ACM, New York, 73--78. Google ScholarDigital Library
- CPLEX. 2002. The ILOG CPLEX Optimizer v7.5. Commercial software, http://www.ilog.com.Google Scholar
- Deverge, J.-F. and Puaut, I. 2007. WCET-Directed dynamic scratchpad memory allocation of data. In Proceedings of the 19th Euromicro Conference on Real-Time Systems (ECRTS). IEEE Computer Society, 179--190. Google ScholarDigital Library
- Dominguez, A., Udayakumaran, S., and Barua, R. 2005. Heap data allocation to scratch-pad memory in embedded systems. J. Embed. Comput. 1, 4, 521--540. Google ScholarDigital Library
- Egger, B., Kim, C., Jang, C., Nam, Y., Lee, J., and Min, S. L. 2006. A dynamic code placement technique for scratchpad memory using postpass optimization. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES). ACM, New York, 223--233. Google ScholarDigital Library
- Egger, B., Lee, J., and Shin, H. 2008. Dynamic scratchpad memory management for code in portable systems with an MMU. ACM Trans. Embed. Comput. Syst. 7, 2, 1--38. Google ScholarDigital Library
- European Space Agency. 2008. DEBIE -- First standard space debris monitoring instrument. http://gate.etamax.de/edid/publicaccess/debie1.php.Google Scholar
- Falk, H. and Verma, M. 2004. Combined data partitioning and loop nest splitting for energy consumption minimization. In Proceedings of the 8th International Workshop on Software and Compilers for Embedded Systems (SCOPES), H. Schepers, Ed. Springer. 137--151.Google Scholar
- Harel, D. and Thiagarajan, P. S. 2003. Message sequence charts. In UML for Real: Design of Embedded Real-time Systems. 77--105. Google ScholarDigital Library
- Issenin, I., Brockmeyer, E., Durinck, B., and Dutt, N. 2006. Multiprocessor system-on-chip data reuse analysis for exploring customized memory hierarchies. In Proceedings of the 43rd Design Automation Conference (DAC). ACM, New York, 49--52. Google ScholarDigital Library
- ITU-T. 1996. Recommendation Z.120: Message Sequence Chart (MSC). ITU-T, Geneva.Google Scholar
- Janapsatya, A., Ignjatovic, A., and Parameswaran, S. 2006. A novel instruction scratchpad memory optimization method based on concomitance metric. In Proceedings of the Conference on Asia South Pacific Design Automation (ASP-DAC). IEEE Press, 612--617. Google ScholarDigital Library
- Kandemir, M. 2007. Data locality enhancement for CMPs. In Proceedings of the International Conference on Computer-Aided Design (ICCAD). IEEE Press, 155--159. Google ScholarDigital Library
- Kandemir, M., Kadayif, I., and Sezer, U. 2001. Exploiting scratch-pad memory using presburger formulas. In Proceedings of the 14th International Symposium on Systems Synthesis (ISSS). ACM, New York, 7--12. Google ScholarDigital Library
- Kandemir, M., Ozturk, O., and Karakoy, M. 2004a. Dynamic on-chip memory management for chip multiprocessors. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES). ACM, New York, 14--23. Google ScholarDigital Library
- Kandemir, M., Ramanujam, J., and Choudhary, A. 2002. Exploiting shared scratch pad memory space in embedded multiprocessor systems. In Proceedings of the 39th Design Automation Conference (DAC). ACM, New York, 219--224. Google ScholarDigital Library
- Kandemir, M., Ramanujam, J., Irwin, M. J., Vijaykrishnan, N., Kadayif, I., and Parikh, A. 2004b. A compiler based approach for dynamically managing scratch-pad memories in embedded systems. IEEE Trans. Comput.-Aided Des. Integr. Circ. Syst. 23, 2, 243--260. Google ScholarDigital Library
- Lee, C.-G., Hahn, J., Seo, Y.-M., Min, S. L., Ha, R., Hong, S., Park, C. Y., Lee, M., and Kim, C. S. 1998. Analysis of cache-related preemption delay in fixed-priority preemptive scheduling. IEEE Trans. Comput. 47, 6, 700--713. Google ScholarDigital Library
- Li, X., Liang, Y., Mitra, T., and Roychoudhury, A. 2007. Chronos: A timing analyzer for embedded software. Sci. Comput. Program. 69, 1-3, 56--67. http://www.comp.nus.edu.sg/~rpembed/chronos/. Google ScholarDigital Library
- Marwedel, P., Wehmeyer, L., Verma, M., Steinke, S., and Helmig, U. 2004. Fast, predictable and low energy memory references through architecture-aware compilation. In Proceedings of the Conference on Asia South Pacific Design Automation (ASP-DAC). IEEE Press, 4--11. Google ScholarDigital Library
- Mitra, T. and Roychoudhury, A. 2007. Worst case execution time and energy analysis. In The Compiler Design Handbook: Optimizations and Machine Code Generation 2nd Ed., Y. Srikant and P. Shankar, Eds. CRC Press, Boca Raton, FL, Chapter 1.Google Scholar
- Negi, H. S., Mitra, T., and Roychoudhury, A. 2003. Accurate estimation of cache-related preemption delay. In Proceedings of the 1st IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). ACM, New York, 201--206. Google ScholarDigital Library
- Nemer, F., Cassé, H., Sainrat, P., Bahsoun, J.-P., and Michiel, M. D. 2006. PapaBench: A free real-time benchmark. In Proceedings of the 6th International Workshop on Worst-Case Execution Time Analysis (WCET), F. Mueller Ed. Internationales Begegnungs- und Forschungszentrum fuer Informatik (IBFI), Schloss Dagstuhl, Germany. http://www.irit.fr/recherches/ARCHI/MARCH/rubrique.php3?id\_rubrique=97.Google Scholar
- Nguyen, N., Dominguez, A., and Barua, R. 2005. Memory allocation for embedded systems with a compile-time-unknown scratch-pad size. In Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES). ACM, New York, 115--125. Google ScholarDigital Library
- Ozturk, O., Kandemir, M., and Kolcu, I. 2006. Shared scratch-pad memory space management. In Proceedings of the 7th International Symposium on Quality Electronic Design (ISQED). IEEE Computer Society, 576--584. Google ScholarDigital Library
- Panda, P. R., Dutt, N. D., and Nicolau, A. 2000. On-Chip vs. off-chip memory: The data partitioning problem in embedded processor-based systems. ACM Trans. Des. Autom. Electron. Syst. 5, 3, 682--704. Google ScholarDigital Library
- Puaut, I. 2006. WCET-Centric software-controlled instruction caches for hard real-time systems. In Proceedings of the 18th Euromicro Conference on Real-Time Systems (ECRTS). IEEE Computer Society, 217--226. Google ScholarDigital Library
- Ravindran, R. A., Nagarkar, P. D., Dasika, G. S., Marsman, E. D., Senger, R. M., Mahlke, S. A., and Brown, R. B. 2005. Compiler managed dynamic instruction placement in a low-power code cache. In Proceedings of the International Symposium on Code Generation and Optimization (CGO). IEEE Computer Society, 179--190. Google ScholarDigital Library
- Staschulat, J. and Ernst, R. 2004. Multiple process execution in cache related preemption delay analysis. In Proceedings of the 4th ACM International Conference on Embedded Software (EMSOFT). ACM, New York, 278--286. Google ScholarDigital Library
- Steinke, S., Grunwald, N., Wehmeyer, L., Banakar, R., Balakrishnan, M., and Marwedel, P. 2002a. Reducing energy consumption by dynamic copying of instructions onto onchip memory. In Proceedings of the 15th International Symposium on System Synthesis (ISSS). ACM, New York, 213--218. Google ScholarDigital Library
- Steinke, S., Wehmeyer, L., Lee, B. S., and Marwedel, P. 2002b. Assigning program and data objects to scratchpad for energy reduction. In Proceedings of the Design, Automation and Test in Europe (DATE) Conference. IEEE Computer Society, 409. Google ScholarDigital Library
- Suhendra, V., Mitra, T., and Roychoudhury, A. 2006a. Efficient detection and exploitation of infeasible paths for software timing analysis. In Proceedings of the 43rd Design Automation Conference (DAC). ACM, New York, 358--363. Google ScholarDigital Library
- Suhendra, V., Mitra, T., Roychoudhury, A., and Chen, T. 2005. WCET centric data allocation to scratchpad memory. In Proceedings of the 26th IEEE International Real-Time Systems Symposium (RTSS). IEEE Computer Society, 223--232. Google ScholarDigital Library
- Suhendra, V., Raghavan, C., and Mitra, T. 2006b. Integrated scratchpad memory optimization and task scheduling for MPSoC architectures. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES). ACM, New York, 37--42. Google ScholarDigital Library
- Tomiyama, H. and Dutt, N. D. 2000. Program path analysis to bound cache-related preemption delay in preemptive real-time systems. In Proceedings of the 8th International Workshop on Hardware/Software Codesign (CODES). ACM, New York, 67--71. Google ScholarDigital Library
- Udayakumaran, S. and Barua, R. 2003. Compiler-Decided dynamic memory allocation for scratch-pad based embedded systems. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES). ACM, New York, 276--286. Google ScholarDigital Library
- Verma, M., Petzold, K., Wehmeyer, L., Falk, H., and Marwedel, P. 2005. Scratchpad sharing strategies for multiprocess embedded systems: A first approach. In Proceedings of the 3rd Workshop on Embedded Systems for Real-Time Multimedia (ESTIMedia). M. Miranda and S. Ha, Eds. IEEE Computer Society, 115--120.Google Scholar
- Verma, M., Wehmeyer, L., and Marwedel, P. 2004a. Cache-Aware scratchpad allocation algorithm. In Proceedings of the Design, Automation and Test in Europe (DATE) Conference. IEEE Computer Society, 21264. Google ScholarDigital Library
- Verma, M., Wehmeyer, L., and Marwedel, P. 2004b. Dynamic overlay of scratchpad memory for energy minimization. In Proceedings of the IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). ACM, New York, 104--109. Google ScholarDigital Library
- Wehmeyer, L., Helmig, U., and Marwedel, P. 2004. Compiler-Optimized usage of partitioned memories. In Proceedings of the 3rd Workshop on Memory Performance Issues (WMPI). ACM, New York, 114--120. Google ScholarDigital Library
- Welsh, D. J. A. and Powell, M. B. 1967. An upper bound for the chromatic number of a graph and its application to timetabling problems. The Comput. J. 10, 1, 85--87.Google ScholarCross Ref
- Yen, T.-Y. and Wolf, W. 1998. Performance estimation for real-time distributed embedded systems. IEEE Trans. Parallel Distrib. Syst. 9, 11, 1125--1136. Google ScholarDigital Library
Index Terms
- Scratchpad allocation for concurrent embedded software
Recommendations
A hardware/software framework for instruction and data scratchpad memory allocation
Previous researches show that a scratchpad memory device consumed less energy than a cache device with the same capacity. In this article, we locate the scratchpad memory (SPM) in the top level of the memory hierarchy to reduce the energy consumption. ...
SA-SPM: an efficient compiler for security aware scratchpad memory (invited paper)
LCTES 2019: Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded SystemsScratchpad memories (SPM) are often used to boost the performance of application-specific embedded systems. In embedded systems, main memories are vulnerable to external attacks such as bus snooping or memory extraction. Therefore it is desirable to ...
Virtual I/O Based on ScratchPad Memory for Embedded System
CIT '10: Proceedings of the 2010 10th IEEE International Conference on Computer and Information TechnologyScratchpad memory (SPM) is software-controlled on-chip SRAM memory and widely used in embedded processors to meet the strict requirements on performance, energy consumption and real-time response of embedded systems. This paper proposes an SPM based I/O ...
Comments