Abstract
We consider policies for CPU load balancing in networks of workstations. We address the question of whether preemptive migration (migrating active processes) is necessary, or whether remote execution (migrating processes only at the time of birth) is sufficient for load balancing. We show that resolving this issue is strongly tied to understanding the process lifetime distribution. Our measurements indicate that the distribution of lifetimes for a UNIX process is Pareto (heavy-tailed), with a consistent functional form over a variety of workloads. We show how to apply this distribution to derive a preemptive migration policy that requires no hand-tuned parameters. We used a trace-driven simulation to show that our preemptive migration strategy is far more effective than remote execution, even when the memory transfer cost is high.
- ABRAHAMS, D. M. AND RIZZARDI, F. 1988. The Berkeley Interactive Statistical System. W. W. Norton.]]Google Scholar
- AGRAWAL, R. AND EZZET, A. 1987. Location independent remote execution in NEST. IEEE Trans. Softw. Eng. 13, 8 (Aug.), 905-912.]] Google Scholar
- AHMAD, I., GHAFOOR, A., AND MEHROTRA, K. 1991. Performance prediction of distributed load balancing on multicomputer systems. In Supercomputing. IEEE, New York, 830-839.]] Google Scholar
- ARTSY, Y. AND FINKEL, R. 1989. Designing a process migration facility: The Charlotte experience. IEEE Comput. 22, 1, 47-56.]] Google Scholar
- BARAK, A., SHAI, G., AND WHEELER, R.G. 1993. The MOSIX Distributed Operating System: Load Balancing for UNIX. Springer-Verlag, Berlin.]] Google Scholar
- BONOMI, F. AND KUMAR, A. 1990. Adaptive optimal load balancing in a nonhomogeneous multiserver system with a central job scheduler. IEEE Trans. Comput. 39, 10 (Oct.), 1232-1250.]] Google Scholar
- BRYANT, R. M. AND FINKEL, R.A. 1981. A stable distributed scheduling algorithm. In the 2nd International Conference on Distributed Computing Systems. 314-323.]]Google Scholar
- CABRERA, F. 1986. The influence of workload on load balancing strategies. In Proceedings of the Usenix Summer Conference (June). USENIX Assoc., Berkeley, Calif., 446-458.]]Google Scholar
- CASAS, J., CLARK, D. L., KONURU, R., OTTO, S. W., PROUTY, R. M., AND WALPOLE, J. 1995. Mpvm: A migration transparent version of pvm. Comput. Syst. 8, 2 (Spring), 171-216.]]Google Scholar
- CASAVANT, T. L. AND KUHL, J.G. 1987. Analysis of three dynamic distributed load-balancing strategies with varying global information requirements. In the 7th International Conference on Distributed Computing Systems (Sept.). IEEE, New York, 185-192.]]Google Scholar
- CHOWDHURY, S. 1990. The greedy load sharing algorithm. J. Parallel Distrib. Comput. 9, 93-99.]] Google Scholar
- DE PAOLI, D. AND GOSCINSKI, A. 1995. The rhodos migration facility. Tech. Rep. TR C95-36, School of Computing and Mathematics, Deakin Univ., Victoria, Australia. Available via http://www.cm.deakin.edu.au/rhodos/rh_tr95.html#C95-36. Submitted to the Journal of Systems and Software.]] Google Scholar
- DOUGLIS, F. AND OUSTERHOUT, g. 1991. Transparent process migration: Design alternatives and the Sprite implementation. Softw. Pract. Exper. 21, 8 (Aug.), 757-785.]] Google Scholar
- DOWNEY, A. B. AND HARCHOL-BALTER, M. 1995. A note on "The limited performance benefits of migrating active processes for load sharing." Tech. Rep. UCB/CSD-95-888, Univ. of California, Berkeley, Calif.]] Google Scholar
- EAGER, D. L., LAZOWSKA, E. D., AND ZHAORJAN, J. 1986. Adaptive load sharing in homogeneous distributed systems. IEEE Trans. Softw. Eng. 12, 5 (May), 662-675.]] Google Scholar
- EAGER, D. L., LAZOWSKA, E. D., AND ZAHORJAN, J. 1988. The limited performance benefits of migrating active processes for load sharing. In the ACM SIGMETRICS Conference on Measuring and Modeling of Computer Systems. ACM, New York, 662-675.]] Google Scholar
- EPEMA, D. 1995. An analysis of decay-usage scheduling in multiprocessors. In the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems. ACM, New York, 74-85.]] Google Scholar
- EVANS, D. J. AND BUTT, W. U. N. 1993. Dynamic load balancing using task-transfer probabilities. Parallel Comput. 19, 897-916.]] Google Scholar
- HAC, A. AND JIN, X. 1990. Dynamic load balancing in a distributed system using a senderinitiated algorithm. J. Syst. Softw. 11, 79-94.]] Google Scholar
- HENNESSY, J. L. AND PATTERSON, D.A. 1990. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, San Mateo, Calif.]] Google Scholar
- KRUEGER, P. AND LIVNY, M. 1988. A comparison of preemptive and non-preemptive load distributing. In the 8th International Conference on Distributed Computing Systems. IEEE, New York, 123-130.]]Google Scholar
- KUNZ, T. 1991. The influence of different workload descriptions on a heuristic load balancing scheme. IEEE Trans. Softw. Eng. 17, 7 (July), 725-730.]] Google Scholar
- LARSEN, R. g. AND MARX, M. L. 1986. An Introduction to Mathematical Statistics and Its Applications. 2nd ed. Prentice-Hall, Englewood Cliffs, N.J.]]Google Scholar
- LELAND, W. E. AND OTT, T. J. 1986. Load-balancing heuristics and process behavior. In Proceedings of Performance '86 and ACM SIGMETRICS. Vol. 14. ACM, New York, 54-69.]] Google Scholar
- LIN, H.-C. AND RAGHAVENDRA, C. 1993. A state-aggregation method for analyzing dynamic load-balancing policies. In the IEEE 13th International Conference on Distributed Computing Systems. IEEE, New York, 482-489.]]Google Scholar
- LITZKOW, M. AND LIVNY, M. 1990. Experience with the Condor distributed batch system. In the IEEE Workshop on Experimental Distributed Systems. IEEE, New York, 97-101.]]Google Scholar
- LITZKOW, M., LIVNY, M., AND MUTKA, M. 1988. Condor--A hunter of idle workstations. In the 8th International Conference on Distributed Computing Systems. IEEE, New York.]]Google Scholar
- LIVNY, M. AND MELMAN, M. 1982. Load balancing in homogeneous broadcast distributed systems. In the ACM Computer Network Performance Symposium (April). ACM, New York, 47-55.]] Google Scholar
- MILOJICIC, D.S. 1993. Load distribution: Implementation for the Mach microkernel. Ph.D. dissertation, Univ. of Kaiserslautern, Kaiserslautern, Germany.]]Google Scholar
- MIRCHANDANEY, R., TOWSLEY, D., AND STANKOVIC, J. A. 1990. Adaptive load sharing in heterogeneous distributed systems. J. Parallel Distrib. Comput. 9, 331-346.]] Google Scholar
- POWELL, M. AND MILLER, B. 1983. Process migrations in DEMOS/MP. In the 6th ACM Symposium on Operating Systems Principles (Nov.). ACM, New York, 110-119.]] Google Scholar
- PULIDAS, S., TOWSLEY, D., AND STANKOVIC, J.A. 1988. Imbedding gradient estimators in load balancing algorithms. In the 8th International Conference on Distributed Computing Systerns (June). IEEE, New York, 482-490.]]Google Scholar
- ROMMEL, C.G. 1991. The probability of load balancing success in a homogeneous network. IEEE Trans. Softw. Eng. 17, 922-933.]] Google Scholar
- ROSIN, R.F. 1965. Determining a computing center environment. Commun. ACM 8, 7.]] Google Scholar
- SILBERSCHATZ, A., PETERSON, J., AND GALVIN, P. 1994. Operating System Concepts. 4th ed. Addison-Wesley, Reading, Mass.]] Google Scholar
- SVENSSON, A. 1990. History, an intelligent load sharing filter. In the IEEE l Oth International Conference on Distributed Computing Systems. IEEE, New York, 546-553.]]Google Scholar
- TANENBAUM, A., VAN RENESSE, R., VAN STAVEREN, H., AND SHARP, G. 1990. Experiences with the Amoeba distributed operating system. Commun. ACM 33, IEEE, New York, 336-346.]] Google Scholar
- THEIMER, M. M., LANTZ, K. A., AND CHERITON, D.R. 1985. Preemptable remote execution facilities for the V-System. In the lOth ACM Symposium on Operating Systems Principles. ACM, New York, 2-12.]] Google Scholar
- THIEL, a. 1991. Locus operating system, a transparent system. Comput. Commun. 14, 6, 336 -346.]] Google Scholar
- VAHDAT, A. M., GHORMLEY, D. P., AND ANDERSON, T.E. 1994. Efficient, portable, and robust extension of operating system functionality. Tech. Rep. UCB/CSD-94-842, Univ. of California, Berkeley, Calif.]] Google Scholar
- WALDSPURGER, C. A. AND WEIHL, W. E. 1994. Lottery scheduling: Flexible proportionalshare resource management. In Proceedings of the 1st Symposium on Operating System Design and Implementation. 1-11.]] Google Scholar
- WANG, J., ZHOU, S., K. AHMED, AND LONG, W. 1993. LSBATCH: A distributed load sharing batch system. Tech. Rep. CSRI-286, Computer Systems Research Inst., Univ. of Toronto, Toronto, Canada. April.]]Google Scholar
- WANG, Y.-T. AND MORRIS, R.g. 1985. Load sharing in distributed systems. IEEE Trans. Comput. C-94, 3 (Mar.), 204-217.]]Google Scholar
- ZAYAS, E.R. 1987. Attacking the process migration bottleneck. In the 11th ACM Symposium on Operating Systems Principles. ACM, New York, 13-24.]] Google Scholar
- ZHANG, Y., HAKOZAKI, K., KAMEDA, H., AND SHIMIZU, K. 1995. A performance comparison of adaptive and static load balancing in heterogeneous distributed systems. In Proceedings of the 28th Annual Simulation Symposium. 332-340.]] Google Scholar
- ZHOU, S. 1987. Performance studies for dynamic load balancing in distributed systems. Ph.D. dissertation, Univ. of California, Berkeley, Calif.]] Google Scholar
- ZHOU, S. AND FERRARI, D. 1987. A measurement study of load balancing performance. In the IEEE 7th International Conference on Distributed Computing Systems. IEEE, New York, 490-497.]]Google Scholar
- ZHOU, S., WANG, J., ZHENG, X., AND DELISLE, P. 1993. Utopia: A load-sharing facility for large heterogeneous distributed computing systems. Softw. Pract. Exper. 23, 2 (Dec.), 1305-1336.]] Google Scholar
Index Terms
- Exploiting process lifetime distributions for dynamic load balancing
Recommendations
Exploiting process lifetime distributions for dynamic load balancing
We measure the distribution of lifetimes for UNIX processes and propose a functional form that fits this distribution well. We use this functional form to derive a policy for preemptive migration, and then use a trace-driven simulator to compare our ...
Exploiting process lifetime distributions for dynamic load balancing
SIGMETRICS '96: Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systemsWe measure the distribution of lifetimes for UNIX processes and propose a functional form that fits this distribution well. We use this functional form to derive a policy for preemptive migration, and then use a trace-driven simulator to compare our ...
Comments