2019 | OriginalPaper | Book chapter

Towards Portable Online Prediction of Network Utilization Using MPI-Level Monitoring

Authors: Shu-Mei Tseng, Bogdan Nicolae, George Bosilca, Emmanuel Jeannot, Aparna Chandramowlishwaran, Franck Cappello

Published in: Euro-Par 2019: Parallel Processing

Publisher: Springer International Publishing

Abstract

Stealing network bandwidth helps a variety of HPC runtimes and services run additional operations in the background without negatively affecting the applications. A key ingredient in making this possible is an accurate prediction of future network utilization, which enables the runtime to plan the background operations in advance so as to avoid competing with the application for network bandwidth. In this paper, we propose a portable deep learning predictor that uses only the information available through MPI introspection to construct a recurrent sequence-to-sequence neural network capable of forecasting network utilization. We leverage the fact that most HPC applications exhibit periodic behavior to enable predictions far into the future (at least the length of a period). Our online approach has no initial training phase; instead, it continuously improves itself during application execution without incurring significant computational overhead. Experimental results show better accuracy and lower computational overhead compared with the state of the art on two representative applications.
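
The role of periodicity in long-horizon forecasting can be illustrated with a minimal sketch. This is not the recurrent sequence-to-sequence network proposed in the paper; it is a much simpler stand-in that estimates the period of a utilization trace by autocorrelation and then repeats the last observed period as the forecast. The function names `estimate_period` and `forecast_next_period`, and the synthetic trace, are assumptions made purely for illustration.

```python
from typing import List, Sequence


def estimate_period(samples: Sequence[float], min_lag: int = 2) -> int:
    """Return the lag (>= min_lag) with the highest mean autocorrelation.

    Illustrative stand-in only: the paper uses a learned seq2seq model,
    not autocorrelation.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    best_lag, best_score = min_lag, float("-inf")
    # Only consider lags that leave at least two full periods of data.
    for lag in range(min_lag, n // 2 + 1):
        score = sum(centered[i] * centered[i - lag] for i in range(lag, n))
        score /= n - lag  # average over the overlapping samples
        # Strict comparison keeps the shortest lag among equal scores.
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def forecast_next_period(samples: Sequence[float], min_lag: int = 2) -> List[float]:
    """Forecast one full period ahead by repeating the last observed period."""
    period = estimate_period(samples, min_lag)
    return list(samples[-period:])


if __name__ == "__main__":
    # Hypothetical utilization trace: a burst of traffic every 5 time steps.
    trace = [0, 0, 10, 10, 0] * 6
    print(estimate_period(trace))
    print(forecast_next_period(trace))
```

Calling `forecast_next_period` again as new samples arrive gives a crude online variant: the forecast horizon is always one estimated period, which matches the "at least the length of a period" horizon mentioned in the abstract, though without any of the learning the paper describes.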

Metadata
Title
Towards Portable Online Prediction of Network Utilization Using MPI-Level Monitoring
Authors
Shu-Mei Tseng
Bogdan Nicolae
George Bosilca
Emmanuel Jeannot
Aparna Chandramowlishwaran
Franck Cappello
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-29400-7_4