
2017 | OriginalPaper | Chapter

A Cost Model for Heterogeneous Many-Core Processor

Authors: Yanbing Li, Qi Wang, Yingying Li, Lin Han, Yuchen Gao, Qing Mu

Published in: Parallel Architecture, Algorithm and Programming

Publisher: Springer Singapore


Abstract

Heterogeneous many-core processors have become an important trend in high-performance computing, but their sophisticated architecture greatly complicates programming and compilation. The cost model is an important part of an optimizing compiler and is used to analyze the benefits of various program optimizations. This paper constructs a cost model for the SW26010 heterogeneous many-core processor and proposes a dynamic-static hybrid method for benefit analysis based on this cost model. Both have been implemented in an automatic parallelizing framework for the SW26010. The experimental results show that the cost model and the benefit analysis filter out a large number of non-beneficial parallel loops and that the performance of the automatically parallelized programs increases significantly.
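The abstract does not give the model's formulas, so the following is only a minimal sketch of what a cost-model-driven benefit analysis can look like, not the authors' method. It assumes a simple linear model (startup overhead plus per-core work plus data transfer) and reads "dynamic-static hybrid" as: use the trip count when it is known at compile time, otherwise fall back to a profiled value. All names (`Loop`, `Machine`, `benefit`, `select_beneficial`) and all numeric parameters are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Loop:
    name: str
    cycles_per_iter: float            # statically estimated cost of one iteration
    static_trip_count: Optional[int]  # None when unknown at compile time
    transfer_bytes: int = 0           # data moved to and from the accelerator cores


@dataclass
class Machine:
    # Illustrative parameters only; none of these values come from the paper.
    cores: int = 64
    startup_cycles: float = 10_000.0  # assumed cost of launching a parallel region
    cycles_per_byte: float = 0.5      # assumed data-transfer cost


def benefit(loop: Loop, machine: Machine,
            profiled_trips: Optional[int] = None) -> Optional[float]:
    """Estimated cycles saved by parallelizing the loop, or None if undecidable."""
    # Static path first; dynamic (profiled) trip count as the fallback.
    trips = loop.static_trip_count
    if trips is None:
        trips = profiled_trips
    if trips is None:
        return None
    serial = trips * loop.cycles_per_iter
    parallel = (machine.startup_cycles
                + math.ceil(trips / machine.cores) * loop.cycles_per_iter
                + loop.transfer_bytes * machine.cycles_per_byte)
    return serial - parallel


def select_beneficial(loops: List[Loop], machine: Machine,
                      profile: Dict[str, int]) -> List[str]:
    """Keep only loops whose estimated benefit is positive."""
    keep = []
    for loop in loops:
        gain = benefit(loop, machine, profile.get(loop.name))
        if gain is not None and gain > 0:
            keep.append(loop.name)
    return keep


machine = Machine()
loops = [
    Loop("small", 10.0, 100),                           # overhead dominates
    Loop("big", 10.0, 1_000_000, transfer_bytes=8_000),
    Loop("runtime_sized", 5.0, None),                   # needs profile data
    Loop("unprofiled", 5.0, None),                      # no data: left serial
]
profile = {"runtime_sized": 64_000}
selected = select_beneficial(loops, machine, profile)
print(selected)
```

In this sketch a loop with no usable trip count is conservatively left serial, which is one way a filter can avoid parallelizing loops whose overhead would exceed their gain.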


Metadata
Title
A Cost Model for Heterogeneous Many-Core Processor
Authors
Yanbing Li
Qi Wang
Yingying Li
Lin Han
Yuchen Gao
Qing Mu
Copyright Year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-6442-5_54