Published in: Journal of Scheduling 5/2021

29.05.2021

Scheduling periodic I/O access with bi-colored chains: models and algorithms

Written by: Emmanuel Jeannot, Guillaume Pallez, Nicolas Vidal


Abstract

Observations show that some HPC applications periodically alternate between (i) operations (computations, local data accesses) executed on the compute nodes and (ii) I/O transfers of data, and that this behavior can be predicted beforehand. While the compute nodes are allocated separately to each application, the storage is shared, so I/O access can become a bottleneck and lead to contention. To tackle this issue, we design new static I/O scheduling algorithms that prescribe when each application can access the storage. To design a static algorithm, we focus on the periodic behavior of most applications: the schedule of the I/O volume of the different applications is repeated over time. This is critical since the number of application runs is often very high. In this article, we develop a formal background for I/O scheduling. First, we define a model, bi-colored chain scheduling; then, we review related results from the literature and explore the complexity of the variants of this problem. Finally, to match the HPC context, we perform experiments based on use cases matching highly parallel applications or distributed learning frameworks.


Footnotes
1
Here, the makespan of block \(\texttt {B}_{k}^{\mathcal {S}}\) (resp. \(\texttt {B}_{1}^{\mathcal {S}}\)) is naturally defined as the time between the beginning of the execution of \(B_{1,k}\) on platform \(\mathcal {B}\) and the beginning of the execution of \(B_{1,k+1}\) on platform \(\mathcal {B}\).
 
Metadata
Title
Scheduling periodic I/O access with bi-colored chains: models and algorithms
Written by
Emmanuel Jeannot
Guillaume Pallez
Nicolas Vidal
Publication date
29.05.2021
Publisher
Springer US
Published in
Journal of Scheduling / Issue 5/2021
Print ISSN: 1094-6136
Electronic ISSN: 1099-1425
DOI
https://doi.org/10.1007/s10951-021-00685-8
