Compilers for Low Power with Design Patterns on Embedded Multicore Systems

Journal of Signal Processing Systems

Abstract

Power dissipation can be minimized at the algorithmic, compiler, architectural, logic, and circuit levels. Recent research on multicore programming models suggests that parallel design patterns can serve as a basis for developing multicore applications. Because parallel design patterns exhibit regular structure, we view them as an opportunity to apply power optimizations in the software layer. In this paper, we investigate compilers for low power with parallel design patterns on embedded multicore systems. We evaluate four major parallel design patterns: Pipe and Filter, MapReduce with Iterator, Puppeteer, and the Bulk Synchronous Parallel (BSP) model. Our work attempts to devise compiler power optimization schemes that exploit the recurring patterns of embedded multicore programs. The proposed schemes are rate-based optimization for the Pipe and Filter pattern, early-exit power optimization for the MapReduce with Iterator pattern, a power-aware mapping algorithm for the Puppeteer pattern, and a multi-phase power gating scheme for the BSP pattern. In our experiments, real-world multicore applications are evaluated on a multicore power simulator, and the results show significant power reductions. This work therefore points to a direction for power optimization: identifying additional key design patterns for embedded multicore systems and exploring the power optimization opportunities they offer to compilers.

Acknowledgments

This work is supported in part by the Ministry of Economic Affairs under grant no. 101-EC-17-A-02-S1-202 and by the National Science Council under grant nos. 102-2219-E-007-001 and 102-2220-E-007-001 in Taiwan. The authors also thank Prof. Shang-Hong Lai and his students Yu-Chun Chen and Te-Feng Su of National Tsing Hua University, Taiwan, for providing the vehicle detection application as a test case in our experiments.

Author information

Corresponding author

Correspondence to Jenq Kuen Lee.

About this article

Cite this article

Lin, CY., Kuan, CB., Shih, WL. et al. Compilers for Low Power with Design Patterns on Embedded Multicore Systems. J Sign Process Syst 80, 277–293 (2015). https://doi.org/10.1007/s11265-014-0917-9

  • DOI: https://doi.org/10.1007/s11265-014-0917-9
