ABSTRACT
This paper examines the plausibility of using a network of workstations (NOW) for a mixture of parallel and sequential jobs. Through simulations, our study examines issues that arise when combining these two workloads on a single platform. Starting from a dedicated NOW just for parallel programs, we incrementally relax uniprogramming restrictions until we have a multi-programmed, multi-user NOW for both interactive sequential users and parallel programs. We show that a number of issues associated with the distributed NOW environment (e.g., daemon activity, coscheduling skew) can have a small but noticeable effect on parallel program performance. We also find that efficient migration to idle workstations is necessary to maintain acceptable parallel application performance. Furthermore, we present a methodology for deriving an optimal delay time for recruiting idle machines for use by parallel programs; this recruitment threshold was just 3 minutes for the research cluster we measured. Finally, we quantify the effects of the additional parallel load upon interactive users by keeping track of the potential number of user delays in our simulations. When we limit the maximum number of delays per user, we can still maintain acceptable parallel program performance. In summary, we find that for our workloads a 2:1 rule applies: a NOW cluster of approximately 60 machines can sustain a 32-node parallel workload in addition to the sequential load placed upon it by interactive users.
- Anderson et al 1993.Anderson, T., Owicki, S., Saxe, J., and Thacker, C. High Speed Switch Scheduling for Local Area Networks. In A CM Transactions on Computer Systems, pp. 319-352, 1993.]] Google ScholarDigital Library
- Ashok & Zahorjan 1992.Ashok, I. and Zahorjan, J. Scheduling a Mixed Interactive and Batch Workload on a Parallel, Shared-Memory Supercomputer. In Proceedings of Supercompu~ing '92, pp. 616-625, November 1992.]] Google ScholarDigital Library
- Biagioni et al 1993.Biagioni, E., Cooper, E., and Sansom, R. Designing a Practical ATM LAN. IEEE Network, 7(2):32-39, March 1993.]]Google ScholarDigital Library
- Blelloch et al 1991.Blelloch, G., Leiserson, C., and Maggs, B. A Comparison of Sorting Algorithms for the Connection Machine CM-2. In Symposium on Parallel Algorithms and Architectures, July 1991.]] Google ScholarDigital Library
- Blumrich et al 1994.Blumrich, M., Li, K., Alpert, R., Dubnicki, C., Felten, E., and Sandberg, J. Virtual Memory Mapped Network Interface for the SHRIMP Multicomputer. In Proceedings of ~he 21st International Symposium on Computer Architecture, April 1994.]] Google ScholarDigital Library
- Boden et al 1995.Boden, N. J., Cohen, D., Felderman, R. E., Kulawik, A. E., Seitz, C. L., Seizovic, J. N., and Su, W.-K. Myrinet--A Gigabet-per-Second Local-Area Network. IEEE Micro, 15(1):29-36, February 1995.]] Google ScholarDigital Library
- Carriero & Gelernter 1989.Carriero, N. and Gelernter, D. Linda in Context. Communications of the A CM, April 1989.]] Google ScholarDigital Library
- Chandra et al 1994.Chandra, R., Devine, S., Verghese, B., Gupta, A., and Rosenblum, M. Scheduling and Page Migration for Multiprocessor Computer Servers. In Proceedings of the 6th International Conference on Architectural Support }or Programming Languages and Operating Systems, pp. 12-24, October 1994.]] Google ScholarDigital Library
- Chiang et al 1994.Chiang, S.-H., Mansharamani, R. K., and Vernon, M. K. Use of Application Characteristics and Limited Preemption for Run-To-Completion Parallel Processor Scheduling Policies. In Proceedings of the I994 A CM SIGMETRICS Conference, pp. 33- 44, February 1994.]] Google ScholarDigital Library
- Cox et al 1994.Cox, A. L., Dwarkadas, S., Keleher, P., Lu, H., Rajarnony, R., and Zwaenepoel, W. Software Versus Hardware Shared-Memory implementation: A Case Study. In Proceedings o/the ~lst International Symposium on Computer Architecture, pp. 106-117, April 1994.]] Google ScholarDigital Library
- Crovella et al 1991.Crovella, M., Das, P., Dubnicld, C., LeBlanc, T., and Markatos, E. Multiprogrammlng on multiprocessors. Technical Report 385, University of Rochester, Computer Science Department, February 1991. Revised May.]] Google ScholarDigital Library
- Culler et al 1993a.Culler, D., Dusseau, A., Goldsteln, S., Krislmamurthy, A., Lumetta, S., yon Eicken, T., and Yelick, K. Parallel Programming in Split-C. In Proceedings of Supercomputing '93, 1993.]] Google ScholarDigital Library
- Culler et al 1993b.Culler, D. E., Karp, R. M., Patterson, D. A., Sahay, A., Schauser, K. E., Santos, E., Subramonian, R., and yon Eicken, T. LogP: Towards a Realistic Model of Parallel Computation. In Fourth A CM SIGPLAN Symposium on Principles and Practice of Parallel Programming, May 1993.]] Google ScholarDigital Library
- Culler et al 1994.Culler, D., Dusseau, A., Martin, R., and Schauser, K. Portability and Performance .for Parallel Processing, chapter 4: Fast Parallel Sorting under LogP: from Theory to Practice, pp. 71-98. John Wiley & Sons Ltd., 1994.]]Google Scholar
- Dahlin et al 1994.Dahlin, M., Wang, R., Anderson, T., and Patterson, D. Cooperative Caching: Using Remote Client Memory to Improve File System Performance. In Proceedings o} the First Conference on Operating Systems: Design and Implementation, November 1994.]] Google ScholarDigital Library
- Delisle 1994.Delisle, P. Personal Communication, October 1994. $]]Google Scholar
- Douglas et al 1993.Douglas, C. C., Mattson, T. G., and Schultz, M. H. Parallel Programming Systems for Workstation Clusters. Technical Report 975, Yale University, Computer Science Department, August 1993.]]Google Scholar
- Douglis & Ousterhout 1991.Douglis, F. and Ousterhout, J. Transparent Process Migration: Design Alternatives and the Sprite Implementation. Software - Practice and Experience, 21(8):757-85, August 1991.]] Google ScholarDigital Library
- Eager et al 1986.Eager, D. L., Lazowska, E. D., and Zahorjan, J. Adaptive Load Shating in Homogeneous Distributed Systems. IEEE Transactions on Software Engineering, 12(5):662-675, May 1986.]] Google ScholarDigital Library
- Feitelson & Rudolph 1992.Feitelson, D. G. and Rudolph, L. Gang Scheduling Performance Benefits for Fine- Grained Synchronization. Journal of Parallel and Distributed Computing, 16(4):306-18, December 1992.]]Google ScholarCross Ref
- Felten & Zahorjan 1991.Felten, E. W. and Zahorjan, J. issues in the Implementation of a Remote Paging System. Technical Report 91-03-09, University of Washington, Department of Computer Science, March 1991.]]Google Scholar
- Gelernter 1985.Gelernter, D. Parallel Programming in Linda. In Proceeding of the International Conference on Parallel Processing, pp. 255-263, August 1985.]]Google Scholar
- Gupta et al 1991.Gupta, A., Tucker, A., and Urushibara, S. The Impact of Operating System Scheduling Policies and Synchronization Methods on the Performance of Parallel Applications. In Proceedings of the A CM SIGMETRICS Conference, pp. 120-32, May 1991.]] Google ScholarDigital Library
- Krishnamurthy et al 1994.Krishnamurthy, A., Lumetta, S., Culler, D., and Katherine, K. Y. Connected Components on Distributed Memory Machines. In The Third DIMACS International Algorithm Implementation Challenge, 1994.]]Google Scholar
- Kronenberg et al 1986.Kronenberg, N. P., Levy, H. M., and Strecker, W. D. VAXclusters. A Closely-Coupled Distributed System. A CM Transactions on Computer Systems, 4(2), 1986.]] Google ScholarDigital Library
- Lamport 1990.Lamport, L. Concurrent Reading and Writing of Clocks. A CM Transactions on Computer Systems, 8(4):305-310, April 1990.]] Google ScholarDigital Library
- Lazowska 1994.Lazowska, E. Yet Another Workstation Network. Talk presented at the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, October 1994.]]Google Scholar
- Leighton 1985.Leighton, T. Tight Bounds on the Complexity of Parallel Sorting. IEEE Transactions on Computers, April 1985.]] Google ScholarDigital Library
- Leiserson 1992.Leiserson, C. E. et al. The Network Architecture. of the Connection Machine CM-5. T.n Symposium on Parallel Algorithms and Architectures, April 1992.]] Google ScholarDigital Library
- Leutenegger & Sun 1993.Leutenegger, S. T. and Sun, X.- H. Distributed Computing Feasibility in a Non- Dedicated Homogenous Distributed System. In Supercomputing 93, 1993.]] Google ScholarDigital Library
- Leutenegger & Vernon 1990.Leutenegger, S. T. and Vernon, M.K. The performance of multiprogrammed multiprocessor scheduling policies. In Proceedings of the A CM SIGMETRICS Conference, pp. 226-36, May 1990.]] Google ScholarDigital Library
- Litzkow & Solomon 1992.Litzkow, M. and Solomon, M. Supporting Checkpointing and Process Migration Outside the UNIX Kernel. In Winter 1992 USENIX Conference, pp. 283-290, January 1992.]]Google Scholar
- Martin 1994.Martin, R. P. HPAM: An Active Message Layer for a Network of Workstations. In Proceedings of the 2nd Hot interconnects Conference, July 1994.]]Google Scholar
- McCann & Zahorjan 1994.McCann, C. and Zahorjan, J. Processor Allocation Policies for Message-Passing Parallel Computers. In Proceedings of the 1994 A CM SIG- METRICS Conference, pp. 19-32, February 1994.]] Google ScholarDigital Library
- Mutka & Livny 1991.Mutka, M. M. and Livny, M. The Available Capacity of a Privately Owned Workstation Environment. Performance Evaluation, 12(4):269-84, July 1991.]] Google ScholarDigital Library
- Naik et al 1993.Naik, V. K., Setia, S. K., and Squillante, M. S. Performance Analysis of Job Scheduling Policies in Parallel Supercomputing Environments. In Proceedings of Supercomputing '93, pp. 824-833, November 1993.]] Google ScholarDigital Library
- Nichols 1987.Nichols, D. Using idle Workstations in a Shared Computing Environment. In Proceedings of the Eleventh A CM Symposium on Operating Systems Principles, pp. 5-12, November 1987.]] Google ScholarDigital Library
- Ousterhout 1982.Ousterhout, J. K. Scheduling Techniques for Concurrent Systems. In Third International Conference on Distributed Computing Systems, pp. 22-30, May 1982.]]Google Scholar
- Peris et al 1994.Peris, V. G., Squillante, M. S., and Naik, V. K. Analysis of the Impact of Memory in Distributed Parallel Processing Systems. In Proceedings of the 199,~ A CM SIGMETRICS Conference, pp. 5-18, February 1994.]] Google ScholarDigital Library
- Reinhardt et al 1994.Reinhardt, S. K., Larus, J. R., and Wood, D.A. Tempest and Typhoon: User-Level Shared Memory. In Proceedings of the 21st International Symposium on Computer Architecture, pp. 325-336, April 1994.]] Google ScholarDigital Library
- Sevcik 1989.Sevcik, K. C. Characterizations of Parallelism in Applications and their Use in Scheduling. In Proceedings of the I989 A CM SIGMETRICS and PERFOR- MANCE Conference on Measurement and Modeling of Computer Systems, pp. 171-180, May 1989.]] Google ScholarDigital Library
- Sunderam 1990.Sunderam, V. PVM: A Framework for Parallel Distributed Computing. Concurrency: Practice and Experience, 2(4):315-339, December 1990.]] Google ScholarDigital Library
- Theimer & Lantz 1989.Theimer, M. M. and Lantz, K. A. Finding Idle Machines in a Workstation-Based Distributed System. IEEE Transactiona on Software Engineering, 15(11):1444-57, November 1989.]] Google ScholarDigital Library
- Theimer et al 1985.Theimer, M., Landtz, K., and Cheriton, D. Preemptable Remote Execution Facilities for the V System. In Proceedings of the I Oth A CM Symposium on Operating Systems Principles, pp. 2-12, December 1985.]] Google ScholarDigital Library
- Tucker & Gupta 1989.Tucker, A. and Gupta, A. Process control and scheduling issues for multiprogrammed shared-memory multiprocessors. Operating Systems Review, 23(5):159-66, 1989.]] Google ScholarDigital Library
- von Eicken et al 1992.yon Eicken, T., Culler, D. E., Goldstein, S. C., and Schauser, K. E. Active Messages: a Mechanism for Integrated Communication and Computation. In Proceedings of the 19th International Symposium on Computer Architecture, Gold Coast, Australia, May 1992.]] Google ScholarDigital Library
- von Eicken et al 1994.von Eicken, T., Avula, V., Basu, A., and Buch, V. Low-Latency Communication over ATM Netowrks using Active Messages. In Proceedings of Hot Interconnects II, Stanford, CA, August 1994.]]Google Scholar
- Zhou et al 1992.Zhou, S., Wang, J., Zheng, X., and Delisle, P. Utopia: A Load Sharing Facility for Large, Heterogeneous Distributed Computing Systems. Technical Report CSRI-257, University of Toronto, 1992.]]Google Scholar
Index Terms
- The interaction of parallel and sequential workloads on a network of workstations
Recommendations
The interaction of parallel and sequential workloads on a network of workstations
This paper examines the plausibility of using a network of workstations (NOW) for a mixture of parallel and sequential jobs. Through simulations, our study examines issues that arise when combining these two workloads on a single platform. Starting from ...
Comments