ABSTRACT
Job scheduling is a key building block of a cloud data center. Hand-crafted heuristics cannot automatically adapt to changes in the environment or optimize for specific workloads. We present DeepJS, a job scheduling algorithm based on deep reinforcement learning and formulated within the framework of the bin packing problem. DeepJS learns, directly from experience, a fitness calculation method that minimizes the makespan (equivalently, maximizes the throughput) of a set of jobs. Through a trace-driven simulation, we demonstrate the convergence and generalization of DeepJS and illustrate what DeepJS learns. The results show that DeepJS outperforms heuristic-based job scheduling algorithms.
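To make the bin packing framing concrete, the following is a minimal sketch of fitness-based placement: each incoming task is assigned to the feasible machine with the highest fitness score. It is not the DeepJS implementation. The `Machine` and `Task` types, the two resource dimensions, and the dot-product `fitness` function are all assumptions chosen for illustration; in DeepJS the fitness calculation is learned from experience rather than hand-written.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Machine:
    cpu: float  # free CPU capacity (hypothetical unit)
    mem: float  # free memory capacity (hypothetical unit)


@dataclass
class Task:
    cpu: float  # CPU demand
    mem: float  # memory demand


def fitness(machine: Machine, task: Task) -> float:
    """Hand-crafted stand-in for a learned fitness function.

    Assumption: a dot product between free resources and task demand.
    A learned scheduler would replace this function with its own model.
    """
    return machine.cpu * task.cpu + machine.mem * task.mem


def place(machines: List[Machine], task: Task) -> Optional[int]:
    """Assign the task to the feasible machine with the highest fitness.

    Returns the chosen machine index, or None if no machine can host the task.
    The chosen machine's free resources are reduced in place.
    """
    best_index: Optional[int] = None
    best_score = float("-inf")
    for i, m in enumerate(machines):
        # A machine is feasible only if it has enough of every resource.
        if m.cpu >= task.cpu and m.mem >= task.mem:
            score = fitness(m, task)
            if score > best_score:
                best_index, best_score = i, score
    if best_index is not None:
        machines[best_index].cpu -= task.cpu
        machines[best_index].mem -= task.mem
    return best_index
```

Keeping `fitness` as a separate function mirrors the idea in the abstract: the placement loop stays fixed, and only the scoring rule changes when a heuristic is swapped for a learned one.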
Index Terms
- DeepJS: Job Scheduling Based on Deep Reinforcement Learning in Cloud Data Center