2018 | OriginalPaper | Chapter

Exploration of Supervised Machine Learning Techniques for Runtime Selection of CPU vs. GPU Execution in Java Programs

Authors: Gloria Y. K. Kim, Akihiro Hayashi, Vivek Sarkar

Published in: Accelerator Programming Using Directives

Publisher: Springer International Publishing

Abstract

While multi-core CPUs and many-core GPUs are both viable platforms for parallel computing, their programming models can impose large burdens on programmers because of their complex, low-level APIs. Since managed languages like Java are designed to run on multiple platforms, parallel language constructs and APIs such as the Java 8 Parallel Stream APIs can enable high-level parallel programming with the promise of performance portability for mainstream (“non-ninja”) programmers. To achieve this goal, the selection of the hardware device should be automated rather than specified by the programmer, as is done in current programming models. Because a variety of factors affect performance, predicting which device will run an individual kernel faster remains a difficult problem. While a prior approach uses machine learning to address this challenge, there is no comparable study of which supervised machine learning algorithms and which program features work well. In this paper, we explore (1) program features to be extracted by a compiler and (2) various machine learning techniques that improve prediction accuracy, thereby improving performance. The results show that an appropriate selection of program features and machine learning algorithm can further improve accuracy. In particular, support vector machines (SVMs), logistic regression, and J48 decision trees are found to be reliable techniques for building accurate prediction models from just two, three, or four program features, achieving accuracies of 99.66%, 98.63%, and 98.28%, respectively, under 5-fold cross-validation.
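To make the workflow in the abstract concrete, the following is a minimal sketch of training a binary CPU-vs-GPU classifier on a small number of program features and scoring it with 5-fold cross-validation. It is an illustration under stated assumptions, not the authors' implementation: the two features (`loop_range`, `arith_ops`), the labeling rule, and all function names are hypothetical, the data is synthetic, and the logistic regression is a plain stochastic-gradient version written from scratch rather than the tooling used in the paper.

```python
import math
import random


def make_synthetic_kernels(n, seed=0):
    """Generate toy (features, label) pairs.

    Purely synthetic data for illustration; NOT the paper's dataset.
    Both features are hypothetical stand-ins for compiler-extracted features.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        loop_range = rng.uniform(0.0, 1.0)  # assumed: normalized parallel loop size
        arith_ops = rng.uniform(0.0, 1.0)   # assumed: normalized arithmetic op count
        # Toy rule (an assumption): large, compute-heavy kernels prefer the GPU (1),
        # everything else prefers the CPU (0).
        label = 1 if loop_range + arith_ops > 1.0 else 0
        data.append(((loop_range, arith_ops), label))
    return data


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(samples, epochs=200, lr=0.5):
    """Fit weights and bias with per-sample gradient descent on the log loss."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(model, x):
    """Return 1 (run on GPU) or 0 (run on CPU) for one feature vector."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0.0 else 0


def cross_validate(samples, k=5):
    """k-fold cross-validation: train on k-1 folds, test on the held-out fold."""
    folds = [samples[i::k] for i in range(k)]
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train_logistic(train)
        correct = sum(predict(model, x) == y for x, y in test)
        accuracies.append(correct / len(test))
    return accuracies


if __name__ == "__main__":
    accs = cross_validate(make_synthetic_kernels(200), k=5)
    print("per-fold accuracy:", accs)
    print("mean accuracy:", sum(accs) / len(accs))
```

In a runtime setting, a model of this kind would be trained offline and only `predict` would be evaluated per kernel, which is why a small feature set is attractive. The accuracies this toy produces say nothing about the figures reported in the abstract.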

Metadata
Title: Exploration of Supervised Machine Learning Techniques for Runtime Selection of CPU vs. GPU Execution in Java Programs
Authors: Gloria Y. K. Kim, Akihiro Hayashi, Vivek Sarkar
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-319-74896-2_7