article

Implicit coscheduling: coordinated scheduling with implicit information in distributed systems

Author:
Andrea Carol Arpaci-Dusseau

University of Wisconsin--Madison

University of Wisconsin--Madison
View Profile

Authors Info & Claims

ACM Transactions on Computer Systems Volume 19 Issue 3pp 283–331https://doi.org/10.1145/380749.380764

Published:01 August 2001Publication History

ACM Transactions on Computer Systems

Abstract

In modern distributed systems, coordinated time-sharing is required for communicating processes to leverage the performance of switch-based networks and low-overhead protocols. Coordinated time-sharing has traditionally been achieved with gang scheduling or explicit coscheduling, implementations of which often suffer from many deficiencies: multiple points of failure, high context-switch overheads, and poor interaction with client-server, interactive, and I/O -intensive workloads. Implicit coscheduling dynamically coordinates communicating processes across distributed machines without these structural deficiencies. In implicit coscheduling, no communication is required across operating systems schedulers; instead, cooperating processes achieve coordination by reacting to implicit information carried by communication existing within the parallel application. The implementation of this approach is simple and allows participating nodes to act autonomously. We introduce two key mechanisms in implicit coscheduling. The first is conditional two-phase waiting, a generalization of traditional two-phase waiting in which spin-time may be increased depending upon events occuring while the process waits. The second is an extension to stride scheduling that provides preemption and is fair to processes that block. To demonstrate that implicit coscheduling performs well, we show results from an extensive set of simulation and implementation experiments. To exercise the conditional two-phase waiting algorithm, we examine three workloads: bulk-synchronous and continuous-communication synthetic applications and application kernels written in the Split-C language. To exercise the local scheduler, we examine competing jobs with different communication characteristics. We demonstrate that our implementation scales well with the number of jobs and workstations and is robust to process placement. Our experiments show that implicit coscheduling is effective and fair for a wide range of workloads; most perform within 30% of an idealized model of gang scheduling.

References

ALEXANDROV, A., IONESCU, M., SCHAUSER,K.E.,AND SCHEIMAN, C. 1995. LogGP: Incorporating Long Messages into the LogP model-One step closer towards a realistic model for parallel computation. In 7th Annual Symposium on Parallel Algorithms and Architectures (SPAA'95) (July 1995).]] Google ScholarDigital Library
ANDERSON, T. E., CULLER, D. E., PATTERSON,D.A.,AND THE NOW TEAM. 1995a. A Case for NOW (Networks of Workstations). IEEE Micro 15, 1 (February), 54-64.]] Google ScholarDigital Library
ANDERSON, T. E., DAHLIN,M.D.,NEEFE, J. M., PATTERSON, D. A., ROSELLI,D.S.,AND WANG, R. Y. 1995b. Serverless Network File Systems. In Proceedings of the 15th ACM Symposium on Operating Systems Principles (December 1995), pp. 109-126.]] Google ScholarDigital Library
ARPACI, R. H., DUSSEAU,A.C.,VAHDAT, A. M., LIU,L.T.,ANDERSON,T.E.,AND PATTERSON, D. A. 1995. The Interaction of Parallel and Sequential Workloads on a Network of Workstations. In Proceedings of ACM SIGMETRICS'95/PERFORMANCE'95 Joint International Conference on Measurement and Modeling of Computer Systems (May 1995), pp. 267-278.]] Google ScholarDigital Library
ARPACI-DUSSEAU,A.AND CULLER, D. 1997. Extending Proportional-Share Scheduling to a Network of Workstations. In International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA) (Las Vegas, Nevada, June 1997).]]Google Scholar
ARPACI-DUSSEAU, A. C. 1998. Implicit Coscheduling: Coordinated Scheduling with Implicit Information in Distributed Systems. Ph. D. thesis, University of California at Berkeley.]] Google ScholarDigital Library
ARPACI-DUSSEAU,A.C.,ARPACI-DUSSEAU, R. H., CULLER, D. E., HELLERSTEIN,J.M.,AND PATTERSON,D.P. 1997. High-Performance Sorting on Networks of Workstations. In Proceedings of the 1997 ACM SIGMOD Conference (1997), pp. 243-254.]] Google ScholarDigital Library
BIAGIONI, E., COOPER, E., AND SANSOM, R. 1993. Designing a Practical ATMLAN. IEEE Network 7,2 (March), 32-39.]]Google ScholarDigital Library
BODEN,N.J.,COHEN, D., FELDERMAN, R. E., KULAWIK, A. E., SEITZ, C. L., SEIZOVIC,J.N.,AND SU,W.-K. 1995. Myrinet--A Gigabit-per-Second Local-Area Network. IEEE Micro 15, 1 (February), 29- 38.]] Google ScholarDigital Library
BREWER,E.A.AND KUSZMAUL, B. C. 1993. How to Get Good Performance from the CM5 Data Network. In Proceedings of the 1994 International Parallel Processing Symposium (Cancun, Mexico, April 1993), pp. 858-867.]] Google ScholarDigital Library
BUCHANAN,M.AND CHIEN, A. 1997. Coordinated Thread Scheduling for Workstation Clusters Under Windows NT. In Proceedings of USENIX Windows NT Workshop (Aug. 1997).]] Google ScholarDigital Library
BURGER,D.C.,HYDER,R.S.,MILLER,B.P.,AND WOOD, D. A. 1994. Paging Tradeoffs in Distributed- Shared-Memory Multiprocessors. In Supercomputing'94 (Nov. 1994).]]Google Scholar
CROVELLA, M., DAS, P., DUBNICKI, C., LEBLANC,T.,AND MARKATOS, E. 1991. Multiprogramming on Multiprocessors. Technical Report 385 (February), University of Rochester, Computer Science Department.]] Google ScholarDigital Library
CULLER, D., ARPACI-DUSSEAU, A., ARPACI-DUSSEAU, R., CHUN, B., LUMETTA, S., MAINWARING, A., MARTIN, R., YOSHIKAWA,C.,AND WONG, F. 1997. Parallel Computing on the Berkeley NOW. In Ninth Joint Symposium on Parallel Processing (Kobe, Japan, May 1997).]]Google Scholar
CULLER, D. E., DUSSEAU, A., GOLDSTEIN,S.C.,KRISHNAMURTHY, A., LUMETTA,S.,VON EICKEN,T.,AND YELICK, K. 1993a. Parallel Programming in Split-C. In Proceedings of Supercomputing '93 (1993), pp. 262-273.]] Google ScholarDigital Library
CULLER, D. E., KARP, R. M., PATTERSON, D. A., SAHAY, A., SCHAUSER, K. E., SANTOS, E., SUBRAMONIAN, R., AND VON EICKEN, T. 1993b. LogP: Towards a Realistic Model of Parallel Computation. In Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (May 1993), pp. 262-273.]] Google ScholarDigital Library
DENNING,D.E.AND DENNING, P. J. 1979. Data Security. Computing Surveys 11, 2 (Sept.), 227-249.]] Google ScholarDigital Library
DEWITT,D.AND GRAY, J. 1992. Parallel Database Systems: The Future of High-Performance Database Systems. Communications of the ACM 35, 6 (June), 85-98.]] Google ScholarDigital Library
DOWDY, L. 1988. On the Partitioning of Multiprocessor Systems. Technical Report 88-06 (July), Department of Computer Science, Vanderbilt University.]]Google Scholar
DUSSEAU,A.C.,ARPACI,R.H.,AND CULLER, D. E. 1996a. Effective Distributed Scheduling of Parallel Workloads. In Proceedings of 1996 ACM Sigmetrics International Conference on Measurement and Modeling of Computer Systems (1996).]] Google ScholarDigital Library
DUSSEAU,A.C.,CULLER, D. E., SCHAUSER,K.E.,AND MARTIN, R. P. 1996b. Fast Parallel Sorting Under LogP: Experience with the CM-5. IEEE Transactions on Parallel and Distributed Systems 7,8 (August), 791-805.]] Google ScholarDigital Library
EYKHOLT, J. R., KLEIMAN, S. R., BARTON, S., VOLL, J., FAULKNER, R., SHIVALINGIAH, A., SMITH, M., STEIN, D., WEEKS, M., AND WILLIAMS, D. 1992. Beyond Multiprocessing: Multithreading the SunOS Kernel. In Proceedings of the Summer 1992 USENIX Technical Conference and Exhibition (San Antontio, TX, USA, June 1992), pp. 11-18.]]Google Scholar
FEITELSON, D. G. 1995. A Survey of Scheduling in Multiprogrammed Parallel Systems. Research Report RC 19790 (87657) (February), IBM T. J. Watson Research Center, Yorktown Heights, NY. Second Revision, August 1997.]]Google Scholar
FEITELSON,D.G.AND JETTE, M. 1997. Improved Utilization and Responsiveness with Gang Scheduling. In Proceedings of the IPPS '97 Workshop on Job Scheduling Strategies for Parallel Processing (1997).]] Google ScholarDigital Library
FEITELSON,D.G.AND RUDOLPH, L. 1990. Distributed Hierarchical Control for Parallel Processing. IEEE Computer 23, 5 (May), 65-77.]] Google ScholarDigital Library
FEITELSON,D.G.AND RUDOLPH, L. 1992. Gang Scheduling Performance Benefits for Fine- Grained Synchronization. Journal of Parallel and Distributed Computing 16, 4 (December), 306-18.]]Google ScholarCross Ref
FEITELSON,D.G.AND RUDOLPH, L. 1995. Coscheduling Based on Run-Time Identification of Activity Working Sets. International Journal of Parallel Programming 23, 2 (April), 136-160.]] Google ScholarDigital Library
GHORMLEY,D.P.,PETROU, D., RODRIGUES, S. H., VAHDAT,A.M.,AND ANDERSON, T. E. 1989. GLU- nix: A Global Layer Unix for a Network of Workstations. In Software Practice and Experience (1989).]] Google ScholarDigital Library
GOODHEART,B.AND COX, J. 1994. The Magic Garden Explained: The Internals of UNIX System V Release 4. Prentice Hall.]] Google ScholarDigital Library
HELLERSTEIN, J. L. 1993. Achieving Service Rate Objectives with Decay Usage Scheduling. IEEE Transactions on Software Engineering 19, 8 (August), 813-825.]] Google ScholarDigital Library
HILL,M.D.,LARUS, J. R., REINHARDT,S.K.,AND WOOD, D. A. 1993. Cooperative Shared Memory: oftware and Hardware for Scalable Multiprocessors. ACM Transactions on Computer Systems 11, 4 (November), 300-18.]] Google ScholarDigital Library
HORI, A., TEZUKA, H., AND ISHIKAWA, Y. 1997. Global State Detection using Network Preemption. In Proceedings of the IPPS '97 Workshop on Job Scheduling Strategies for Parallel Processing (1997).]] Google ScholarDigital Library
JACOBSON, V. 1988. Congestion avoidance and control. In SIGCOMM '88 Symposium: Communications Architectures and Protocols (Aug. 1988), pp. 314-29.]] Google ScholarDigital Library
KARLIN, A. R., MANASSE, M., MCGEOCH, L., AND OWICKI, S. 1994. Competitive Randomized Algorithms For Nonuniform Problems. Algorithmica 11, 6 (June), 542-71.]]Google ScholarDigital Library
KAY,J.AND LAUDER, P. 1988. A Fair Share Scheduler. Communications of the ACM31, 1 (January), 44-55.]] Google ScholarDigital Library
LEE, W., FRANK, M., LEE, V., MACKENZIE, K., AND RUDOLPH, L. 1997. Implications of I/O for Gang Scheduled Workloads. In Proceedings of the IPPS '97 Workshop on Job Scheduling Strategies for Parallel Processing (1997).]] Google ScholarDigital Library
LEUTENEGGER,S.T.AND VERNON, M. K. 1990. The Performance of Multiprogrammed Multiprocessor Scheduling Policies. In Proceedings of the 1990 ACM SIGMETRICS Conference (May 1990), pp. 226-36.]] Google ScholarDigital Library
MAINWARING, A. M. 1995. Active Message Application Programming Interface and Communication Subsystem Organization. Master's thesis, University of California, Berkeley.]]Google Scholar
MARTIN,R.P.,VAHDAT, A. M., CULLER,D.E.,AND ANDERSON, T. E. 1997. Effects of Communication Latency, Overhead, and Bandwidth in a Cluster Architecture. In Proceedings of the 24th Annual International Symposium on Computer Architecture (Denver, CO, June 1997).]] Google ScholarDigital Library
NELSON,R.AND TOWSLEY, D. 1993. A Performance Evaluation of Several Priority Policies for Parallel Processing Systems. Journal of the ACM 40, 3 (July), 714-740.]] Google ScholarDigital Library
NIEH, J., HANKO,J.G.,NORTHCUTT,J.D.,AND WALL, G. A. 1993. SVR4 Unix Scheduler Unacceptable for Multimedia Applications. In Proceedings of the Workshop on Network and Operating System Support for Digital Audio and Video (Nov. 1993).]] Google ScholarDigital Library
OUSTERHOUT, J. K. 1982. Scheduling Techniques for Concurrent Systems. In Third International Conference on Distributed Computing Systems (May 1982), pp. 22-30.]]Google Scholar
PAKIN, S., LAURIA, M., AND CHIEN, A. 1995. High Performance Messaging on Workstations: Illinois Fast Messages (FM) for Myrinet. In Proceedings of the 1995 Supercomputing Conference (San Diego, California, 1995).]] Google ScholarDigital Library
ROSTI, E., SERAZZI, G., SMIRNI, E., AND SQUILLANTE, M. S. 1998. The Impact of I/O on Program Behavior and Parallel Scheduling. In SIGMETRICS '98/PERFORMANCE'98 (June 1998).]] Google ScholarDigital Library
SOBALVARRO,P.G.AND WEIHL, W. E. 1995. Demand-based Coscheduling of Parallel Jobs on Multi-programmed Multiprocessors. In Proceedings of the IPPS '95 Workshop on Job Scheduling Strategies for Parallel Processing (April 1995), pp. 63-75.]] Google ScholarDigital Library
SOBALVARRO,P.G.,PAKIN, S., WEIHL,W.E.,AND CHIEN, A. A. 1998. Dynamic Coscheduling on Workstation Clusters. In Proceedings of the IPPS '98 Workshop on Job Scheduling Strategies for Parallel Processing (1998).]] Google ScholarDigital Library
SUNDERAM, V. 1990. PVM: A Framework for Parallel Distributed Computing. Concurrency: Practice and Experience 2, 4 (December), 315-339.]] Google ScholarDigital Library
THE MPI FORUM. 1993. MPI: A Message Passing Interface. In Proceedings of Supercomputing '93 (Nov. 1993), pp. 878-883.]] Google ScholarDigital Library
VON EICKEN, T., BASU, A., BUCH,V.,AND VOGELS, W. 1995. U-Net: A User-Level Network Interface for Parallel and Distributed Computing. In Proceedings of the Fifteenth SOSP (Copper Mountain, CO, December 1995), pp. 40-53.]] Google ScholarDigital Library
VON EICKEN, T., CULLER, D. E., GOLDSTEIN,S.C.,AND SCHAUSER, K. E. 1992. Active Messages: a Mechanism for Integrated Communication and Computation. In Proceedings of the 19th International Symposium on Computer Architecture (Gold Coast, Australia, May 1992).]] Google ScholarDigital Library
WALDSPURGER, C. A. 1995. Lottery and Stride Scheduling: Flexible Proportional-Share Resource Management. Ph. D. thesis, Massachusetts Institute of Technology. Also appears as Technical Report MIT/LCS/TR-667.]] Google ScholarDigital Library
WALDSPURGER,C.A.AND WEIHL, W. E. 1996. An Object-Oriented Framework for Modular Resource Management. In 5th Workshop on Object-Orientation in Operating Systems (IWOOOS '96) (October 1996).]] Google ScholarDigital Library
WANG, F., PAPAEFTHYMIOU, M., AND SQUILLANTE, M. 1997. Performance Evaluation of Gang Scheduling for Parallel and Distributed Multiprogramming. In Proceedings of the IPPS '97 Workshop on Job Scheduling Strategies for Parallel Processing (1997).]] Google ScholarDigital Library
WARREN,M.S.,BECKER,D.J.,GODA,M.P.,SALMON,J.K.,AND STERLING, T. 1997. Parallel Supercomputing with Commodity Components. In International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA) (Las Vegas, Nevada, June 1997), pp. 1372- 1381.]]Google Scholar
WONG,F.C.,ARPACI-DUSSEAU,A.C.,AND CULLER, D. E. 1999. Building MPI for Multi-Programming Systems using Implicit Information. In Proceedings of The 6th European PVM/MPI User's Group Meeting (Barcelona, Spain, Sept. 1999).]] Google ScholarDigital Library

Index Terms

Implicit coscheduling: coordinated scheduling with implicit information in distributed systems
1. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Process management
        Scheduling

Recommendations

A runtime resolution scheme for priority boost conflict in implicit coscheduling

High-performance parallel and scientific applications are composed of multiple processes running on distinct CPUs that communicate frequently. Due to the synchronization needs of such applications, performance is greatly hampered if their processes are ...
Read More
Coscheduling in Clusters: Is It a Viable Alternative?
SC '04: Proceedings of the 2004 ACM/IEEE conference on Supercomputing

In this paper, we conduct an in-depth evaluation of a broad spectrum of scheduling alternatives for clusters. These include the widely used batch scheduling, local scheduling, gang scheduling, all prior communication-driven coscheduling algorithms (...
Read More
Adaptive Parallel Job Scheduling with Flexible Coscheduling

Many scientific and high-performance computing applications consist of multiple processes running on different processors that communicate frequently. Because of their synchronization needs, these applications can suffer severe performance penalties if ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Computer Systems Volume 19, Issue 3
Aug. 2001
130 pages
ISSN:0734-2071
EISSN:1557-7333
DOI:10.1145/380749
Editor:
Larry L. Peterson
Princeton Univ., Princeton, NJ
Issue’s Table of Contents
Copyright © 2001 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 August 2001
Published in tocs Volume 19, Issue 3

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
clusters
coscheduling
gang scheduling
networks of workstations
proportional-share scheduling
two-phase waiting
Qualifiers
- article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 56
  Total Citations
  View Citations
- 1,396
  Total Downloads
- Downloads (Last 12 months)9
- Downloads (Last 6 weeks)0
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Implicit coscheduling: coordinated scheduling with implicit information in distributed systems

ACM Transactions on Computer Systems

Abstract

References

Cited By

Index Terms

Recommendations

A runtime resolution scheme for priority boost conflict in implicit coscheduling

Coscheduling in Clusters: Is It a Viable Alternative?

Adaptive Parallel Job Scheduling with Flexible Coscheduling

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Implicit coscheduling: coordinated scheduling with implicit information in distributed systems

ACM Transactions on Computer Systems

Abstract

References

Cited By

Index Terms

Recommendations

A runtime resolution scheme for priority boost conflict in implicit coscheduling

Coscheduling in Clusters: Is It a Viable Alternative?

Adaptive Parallel Job Scheduling with Flexible Coscheduling

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media