2020 | OriginalPaper | Book chapter

Improving Resource Isolation of Critical Tasks in a Workload

Written by: Meghana Thiyyakat, Subramaniam Kalambur, Dinkar Sitaram

Published in: Job Scheduling Strategies for Parallel Processing

Publisher: Springer International Publishing

Abstract

Typical cluster schedulers co-locate critical tasks and background batch tasks to improve the utilization of resources in the cluster. However, this leads to resource contention and interference between the diverse co-located tasks. To guarantee resource allocation and predictability, critical tasks are executed within containers, which provide resource isolation through container resource allocation mechanisms. Linux-based containers achieve resource allocation and isolation using a kernel feature known as Control Groups (cgroups). Cgroups allow CPU time to be divided into shares that can be allocated to different groups of tasks. In our study, we run workloads on servers with different hardware configurations and measure the CPU time per second, or CPU bandwidth, that the critical tasks in the workloads can consume. Our workloads are generated from a cluster trace published by Google and contain a mixture of critical and background tasks. The results of the experiments show that under high CPU load, the CPU bandwidth consumed by the critical tasks is inadequate and unstable because of the poor resource isolation offered by cgroups. However, when these tasks are scheduled with careful use of the SCHED_DEADLINE policy, which is based on the Global Earliest Deadline First and Constant Bandwidth Server algorithms, they steadily consume their required CPU bandwidth irrespective of the load on the CPU. As a result, when critical tasks are scheduled using SCHED_DEADLINE, they experience delays that are 3× to 40× smaller than under cgroups.
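The two allocation models contrasted in the abstract can be illustrated numerically. The sketch below is only an illustration and not the authors' experimental setup: the function names and example values are hypothetical, and the 0.95 cap is assumed to match the Linux default real-time bandwidth limit (`sched_rt_runtime_us / sched_rt_period_us`). It shows that a cgroup share is a proportion of whatever CPU time is contended, whereas a SCHED_DEADLINE reservation is a fixed runtime per period that must pass an admission test.

```python
from typing import Dict, List, Tuple


def cgroup_cpu_fraction(shares: Dict[str, int], group: str) -> float:
    """Fraction of contended CPU time a group receives under proportional
    shares: its own weight divided by the sum of all competing weights."""
    total = sum(shares.values())
    return shares[group] / total


def deadline_bandwidth(runtime_ns: int, period_ns: int) -> float:
    """CPU bandwidth reserved by one deadline task: runtime per period."""
    return runtime_ns / period_ns


def admissible(tasks: List[Tuple[int, int]], n_cpus: int,
               rt_limit: float = 0.95) -> bool:
    """Admission check in the style of global EDF: the summed bandwidth of
    all (runtime_ns, period_ns) reservations must not exceed
    n_cpus * rt_limit. rt_limit = 0.95 is an assumed default."""
    total = sum(deadline_bandwidth(r, p) for r, p in tasks)
    return total <= n_cpus * rt_limit


# Hypothetical example: one critical group competing with a heavier batch group.
shares = {"critical": 1024, "batch": 3072}
critical_fraction = cgroup_cpu_fraction(shares, "critical")  # 0.25

# Hypothetical reservation: 2 ms of runtime every 10 ms.
reserved = deadline_bandwidth(2_000_000, 10_000_000)  # 0.2
```

Under shares, the critical group's fraction shrinks as more weight is added by competing groups; under a deadline reservation, the bandwidth stays at runtime/period as long as the task set is admitted.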

Metadata
Title
Improving Resource Isolation of Critical Tasks in a Workload
Written by
Meghana Thiyyakat
Subramaniam Kalambur
Dinkar Sitaram
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-63171-0_3