Published in: International Journal of Parallel Programming 5/2021

04-05-2021

M-DRL: Deep Reinforcement Learning Based Coflow Traffic Scheduler with MLFQ Threshold Adaption

Authors: Tianba Chen, Wei Li, YuKang Sun, Yunchun Li

Abstract

Coflow scheduling in data-parallel clusters can improve application-level communication performance. Existing coflow scheduling methods that operate without prior knowledge usually use a multi-level feedback queue (MLFQ) with fixed threshold parameters, which makes them insensitive to coflow traffic characteristics. Manually adjusting the threshold parameters for different application scenarios often takes a long optimization period and yields only coarse-grained tuning. We propose M-DRL, a deep reinforcement learning based coflow traffic scheduler that dynamically sets the MLFQ thresholds to adapt to coflow traffic characteristics and reduce the average coflow completion time. Trace-driven simulations on a public dataset show that coflow communication stages using M-DRL complete 2.08x (6.48x) and 1.36x (1.25x) faster in average (95th-percentile) coflow completion time than per-flow fairness and Aalo, respectively, and that M-DRL is comparable to SEBF, which relies on prior knowledge.
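
To make the role of the MLFQ thresholds concrete, the following is a minimal sketch of threshold-based queue assignment, not the authors' implementation. It assumes that a coflow's queue is determined by the number of bytes it has sent so far, and that a lower queue index means higher priority. The function name `mlfq_queue` and all threshold and byte values are hypothetical, chosen only for illustration; the learned policy that would produce the adapted thresholds is not shown.

```python
from bisect import bisect_right

MB = 1_000_000  # illustrative unit; not a value taken from the paper


def mlfq_queue(bytes_sent, thresholds):
    """Return the MLFQ queue index for a coflow (0 = highest priority).

    `thresholds` is an ascending list of byte counts. A coflow is demoted
    to the next queue each time its sent bytes reach a threshold.
    """
    return bisect_right(thresholds, bytes_sent)


# Hypothetical fixed thresholds versus thresholds a scheduler might pick
# after observing the traffic (both sets are made up for this example).
fixed_thresholds = [10 * MB, 100 * MB, 1000 * MB]
adapted_thresholds = [1 * MB, 20 * MB, 400 * MB]

# Bytes sent so far by three hypothetical coflows.
coflow_bytes = [5 * MB, 50 * MB, 500 * MB]

fixed_queues = [mlfq_queue(b, fixed_thresholds) for b in coflow_bytes]
adapted_queues = [mlfq_queue(b, adapted_thresholds) for b in coflow_bytes]

print(fixed_queues)    # [0, 1, 2]
print(adapted_queues)  # [1, 2, 3]
```

The same three coflows land in different queues under the two threshold sets, which is why the choice of thresholds affects how quickly small coflows finish relative to large ones.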

Metadata
Title
M-DRL: Deep Reinforcement Learning Based Coflow Traffic Scheduler with MLFQ Threshold Adaption
Authors
Tianba Chen
Wei Li
YuKang Sun
Yunchun Li
Publication date
04-05-2021
Publisher
Springer US
Published in
International Journal of Parallel Programming / Issue 5/2021
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-021-00711-4
