Abstract
Chip-multiprocessors are an emerging trend for embedded systems. In this article, we introduce a real-time Java multiprocessor called JopCMP. It is a symmetric shared-memory multiprocessor, and consists of up to eight Java Optimized Processor (JOP) cores, an arbitration control device, and a shared memory. All components are interconnected via a system-on-chip bus. The arbiter synchronizes the access of multiple CPUs to the shared main memory. In this article, three different arbitration policies are presented, evaluated, and compared with respect to their real-time and average-case performance: a fixed-priority, a fair-based, and a time-sliced arbiter.
Tasks running on different CPUs of a chip-multiprocessor (CMP) influence each other's execution times when accessing a shared memory. Therefore, the system needs an arbiter that is able to limit the worst-case execution time of a task running on a CPU, even though tasks executing simultaneously on other CPUs access the main memory. Our research shows that timing analysis is in fact possible for homogeneous multiprocessor systems with a shared memory. The timing analysis of tasks executing on the CMP with time-sliced memory arbitration leads to viable worst-case execution time bounds.
The time-sliced arbiter divides the memory access time into equal time slots, one time slot for each CPU. This memory arbitration scheme allows for a calculation of upper bounds of Java application worst-case execution times, depending on the number of CPUs, the time slot size, and the memory access time. Examples of worst-case execution time calculation are presented, and the analyzed results of a real-world application task are compared to measured execution time results. Finally, we evaluate the tradeoffs when using a time-predictable solution compared to using average-case optimized chip-multiprocessors, applying three different benchmarks. These experiments are carried out by executing the programs on the CMP prototype.
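To illustrate how such a bound can depend on the number of CPUs, the time slot size, and the memory access time, here is a minimal sketch of a simplified time-sliced model. It is not the analysis from the article. It assumes that each CPU owns one slot per round, that slots have equal length, that a memory access may only start if it completes inside the CPU's own slot, and that all times are counted in clock cycles. The class `TdmaBound` and its methods are hypothetical names chosen for this example.

```java
// Simplified time-sliced (TDMA) arbitration model; an illustrative assumption,
// not the arbiter or the timing analysis described in the article.
final class TdmaBound {

    // Worst-case latency (waiting plus access) of one memory access, found by
    // trying every arrival offset within one arbitration round. The CPU under
    // analysis is assumed to own the slot that starts at offset 0.
    static int worstCaseLatency(int cpus, int slot, int access) {
        if (cpus < 1 || access < 1 || slot < access) {
            throw new IllegalArgumentException("need cpus >= 1 and 1 <= access <= slot");
        }
        int period = cpus * slot;      // length of one full round in cycles
        int lastStart = slot - access; // last offset at which the access still fits
        int worst = 0;
        for (int arrival = 0; arrival < period; arrival++) {
            // Start at once if the access fits; otherwise wait for the next round.
            int wait = (arrival <= lastStart) ? 0 : period - arrival;
            worst = Math.max(worst, wait + access);
        }
        return worst;
    }

    // Closed form of the same model: the request arrives one cycle too late
    // to fit into its own slot and waits for the start of the next round.
    static int closedForm(int cpus, int slot, int access) {
        return (cpus - 1) * slot + 2 * access - 1;
    }

    public static void main(String[] args) {
        // Example values (assumed): 3 CPUs, 6-cycle slots, 2-cycle memory access.
        System.out.println(worstCaseLatency(3, 6, 2));
        System.out.println(closedForm(3, 6, 2));
    }
}
```

In this model the bound grows linearly with the number of CPUs and with the slot size, and it does not depend on what the other CPUs do. That independence is the property that makes a time-sliced scheme attractive for timing analysis.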
Index Terms
- A real-time Java chip-multiprocessor