2015 | OriginalPaper | Chapter

Predicting Performance of Non-contiguous I/O with Machine Learning

Authors: Julian Kunkel, Michaela Zimmer, Eugen Betke

Published in: High Performance Computing

Publisher: Springer International Publishing

Abstract

Data sieving in ROMIO promises to optimize individual non-contiguous I/O. However, deciding whether to enable it and choosing its buffer size are non-trivial tasks, because the resulting performance is difficult to predict. Since data sieving does not take many performance factors into account, extracting the optimal performance for a given access pattern and system is often not possible. Additionally, in Lustre, settings such as the stripe size and the number of servers are tunable, yet identifying good rules for a given data centre is again challenging.
In this paper, we (1) discuss limitations of data sieving, (2) apply machine learning techniques to build a performance predictor, and (3) learn and extract best practices for the settings from the data. We use decision trees because these models can capture non-linear behavior, are easy to understand, and allow the learned rules to be extracted. Although this initial research relies on decision trees and sparse training data, the algorithm predicts many cases sufficiently well. Compared to a standard setting, the resulting decision trees improve performance significantly, and expert knowledge can be derived by extracting rules from the learned tree. Applying the scheme to a set of experimental data improved the average throughput by 25–50 % of the best parametrization’s gain. Additionally, we demonstrate the versatility of this approach by applying it to the porting system of DKRZ’s next generation supercomputer and discuss achievable performance gains.
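As an illustration of the idea summarized in the abstract, the following minimal sketch fits a small regression tree to throughput measurements, uses it to decide whether to enable data sieving for an access pattern, and prints the learned rules. It is not the authors' implementation: the feature names, the measurements, the "sieving off" default, and all function names are assumptions invented for this example, and the tree is a plain CART-style learner written in pure Python. The `gain_fraction` helper shows one plausible reading of "fraction of the best parametrization's gain": the improvement of the tree's choices over the default, divided by the improvement an oracle would achieve.

```python
from statistics import mean

# Hypothetical measurements (invented for illustration only):
# (access_size_KiB, hole_size_KiB, sieving_on) -> throughput in MiB/s.
FEATURES = ("access_size_KiB", "hole_size_KiB", "sieving_on")
SAMPLES = [
    ((16, 4, 0), 20.0), ((16, 4, 1), 80.0),
    ((16, 512, 0), 22.0), ((16, 512, 1), 6.0),
    ((1024, 4, 0), 200.0), ((1024, 4, 1), 210.0),
    ((1024, 512, 0), 205.0), ((1024, 512, 1), 90.0),
]

def sse(ys):
    """Sum of squared errors around the mean (the CART regression criterion)."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def build(samples, depth):
    """Greedily grow a regression tree; a leaf is a float, a node a 4-tuple."""
    ys = [y for _, y in samples]
    if depth == 0 or len(set(ys)) <= 1:
        return mean(ys)
    best = None
    for f in range(len(FEATURES)):
        values = sorted({x[f] for x, _ in samples})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between neighbouring values
            left = [s for s in samples if s[0][f] <= t]
            right = [s for s in samples if s[0][f] > t]
            cost = sse([y for _, y in left]) + sse([y for _, y in right])
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    if best is None:  # no feature varies, so no split is possible
        return mean(ys)
    _, f, t, left, right = best
    return (f, t, build(left, depth - 1), build(right, depth - 1))

def predict(tree, x):
    """Follow the splits until a leaf (predicted throughput) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def rules(tree, path=()):
    """Flatten the tree into (condition, predicted throughput) pairs."""
    if not isinstance(tree, tuple):
        return [(" and ".join(path) or "always", tree)]
    f, t, left, right = tree
    return (rules(left, path + (f"{FEATURES[f]} <= {t}",))
            + rules(right, path + (f"{FEATURES[f]} > {t}",)))

def choose_sieving(tree, access, hole):
    """Pick the sieving setting with the higher predicted throughput."""
    return max((0, 1), key=lambda s: predict(tree, (access, hole, s)))

MEASURED = dict(SAMPLES)
PATTERNS = sorted({(a, h) for (a, h, _), _ in SAMPLES})

def gain_fraction(tree):
    """(chosen - default) / (best - default), with 'sieving off' as the assumed default."""
    default = mean([MEASURED[(a, h, 0)] for a, h in PATTERNS])
    best = mean([max(MEASURED[(a, h, s)] for s in (0, 1)) for a, h in PATTERNS])
    chosen = mean([MEASURED[(a, h, choose_sieving(tree, a, h))] for a, h in PATTERNS])
    return (chosen - default) / (best - default)

tree = build(SAMPLES, depth=3)
for condition, throughput in rules(tree):
    print(f"if {condition}: predict {throughput:.1f} MiB/s")
print("fraction of best gain:", gain_fraction(tree))
```

On this toy data a depth-3 tree reproduces every measurement, so it always picks the better setting; limiting the depth (for example `build(SAMPLES, depth=1)`) yields coarser rules and typically recovers only part of the achievable gain.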

Footnotes
1. Experimental data is taken from Schmidtke’s thesis [15].
2. Note that for a tree of depth one, 80 choices are made for which no measurement is available; these values are excluded from the calculation of the average performance. For bigger trees, less than a handful of choices are not quantifiable. Therefore, we believe this comparison to be fair.

Literature
1. Thakur, R., Gropp, W., Lusk, E.: Data sieving and collective I/O in ROMIO. In: FRONTIERS 1999: Proceedings of the 7th Symposium on the Frontiers of Massively Parallel Computation, p. 182. IEEE Computer Society, Washington, DC (1999)
2. Ching, A., Choudhary, A., Coloma, K., Liao, W.K., Ross, R., Gropp, W.: Noncontiguous I/O accesses through MPI-IO. In: Proceedings of the 3rd International Symposium on Cluster Computing and the Grid, CCGRID, p. 104. IEEE Computer Society, Washington, DC (2003)
3. Singh, D.E., Isaila, F., Calderon, A., Garcia, F., Carretero, J.: Multiple-phase collective I/O technique for improving data access locality. In: Proceedings of the 15th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP, pp. 534–542. IEEE Computer Society, Washington, DC (2007)
4. Singh, D.E., Isaila, F., Pichel, J.C., Carretero, J.: A collective I/O implementation based on inspector-executor paradigm. J. Supercomputing 47(1), 53–75 (2009)
5. Zhang, X., Ou, J., Davis, K., Jiang, S.: Orthrus: a framework for implementing efficient collective I/O in multi-core clusters. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds.) ISC 2014. LNCS, vol. 8488, pp. 348–364. Springer, Heidelberg (2014)
6. Knüpfer, A., Brunst, H., Doleschal, J., Jurenz, M., Lieber, M., Mickler, H., Müller, M.S., Nagel, W.E.: The vampir performance analysis tool-set. In: Resch, M., Keller, R., Himmler, V., Krammer, B., Schulz, A. (eds.) Tools for High Performance Computing, Proceedings of the 2nd International Workshop on Parallel Tools, pp. 139–155. Springer, Heidelberg (2008)
8. Madhyastha, T., Reed, D.: Learning to classify parallel Input/Output access patterns. IEEE Trans. Parallel Distrib. Syst. 13(8), 802–813 (2002)
9. Barham, P., Donnelly, A., Isaacs, R., Mortier, R.: Using magpie for request extraction and workload modelling. In: Proceedings of the 6th Symposium on Operating Systems Design and Implementation, vol. 6, pp. 259–272 (2004)
10. Barham, P., Isaacs, R., Mortier, R., Narayanan, D.: Magpie: online modelling and performance-aware systems. In: Proceedings of the 9th Conference on Hot Topics in Operating Systems, vol. 9 (2003)
11. Isaacs, R., Barham, P., Bulpin, J., Mortier, R., Narayanan, D.: Request extraction in magpie: events, schemas and temporal joins. In: Proceedings of the 11th Workshop on ACM SIGOPS European Workshop, EW11. ACM, New York (2004)
12. Behzad, B., Huchette, J., Luu, H.V.T., Aydt, R., Byna, S., Yao, Y., Koziol, Q., Prabhat: A framework for auto-tuning HDF5 applications. In: Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2013, pp. 127–128. ACM, New York (2013)
13. Kunkel, J.M., Zimmer, M., Hübbe, N., Aguilera, A., Mickler, H., Wang, X., Chut, A., Bönisch, T., Lüttgau, J., Michel, R., Weging, J.: The SIOX architecture – coupling automatic monitoring and optimization of parallel I/O. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds.) ISC 2014. LNCS, vol. 8488, pp. 245–260. Springer, Heidelberg (2014)
14. Zimmer, M., Kunkel, J.M., Ludwig, T.: Towards self-optimization in HPC I/O. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds.) ISC 2013. LNCS, vol. 7905, pp. 422–434. Springer, Heidelberg (2013)
15. Schmidtke, D.: Analyse und Optimierung von nicht-zusammenhängende Ein-/Ausgabe in MPI, April 2014
16. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth & Brooks, Pacific Grove (1984)
17. Igel, C., Heidrich-Meisner, V., Glasmachers, T.: Shark. J. Mach. Learn. Res. 9, 993–996 (2008)
Metadata
Title: Predicting Performance of Non-contiguous I/O with Machine Learning
Authors: Julian Kunkel, Michaela Zimmer, Eugen Betke
Copyright Year: 2015
DOI: https://doi.org/10.1007/978-3-319-20119-1_19