Skip to main content
Log in

Online Prediction of the Running Time of Tasks

  • Published:
Cluster Computing Aims and scope Submit manuscript

Abstract

We describe and evaluate the Running Time Advisor (RTA), a system that can predict the running time of a compute-bound task on a typical shared, unreserved commodity host. The prediction is computed from linear time series predictions of host load and takes the form of a confidence interval that neatly expresses the error associated with the measurement and prediction processes – error that must be captured to make statistically valid decisions based on the predictions. Adaptive applications make such decisions in pursuit of consistent high performance, choosing, for example, the host where a task is most likely to meet its deadline. We begin by describing the system and summarizing the results of our previously published work on host load prediction. We then describe our algorithm for computing predictions of running time from host load predictions. We next evaluate the system using over 100,000 randomized testcases run on 39 different hosts, finding that is indeed capable of computing correct and useful confidence intervals. Finally, we report on our experience with using the RTA in application-oriented real-time scheduling in distributed systems.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. M. Aeschlimann, P. Dinda, L. Kallivokas, J. Lopez, B. Lowekamp and D. O'Hallaron, Preliminary report on the design of a framework for distributed visualization, in: Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'99 (Las Vegas, NV, (1999) pp. 1833–1839.

  2. F. Berman and R. Wolski, Scheduling from the perspective of the application, in: Proc. of the 5th IEEE Symposium on High Performance Distributed Computing (HPDC96) (1996) pp. 100–111.

  3. G.E.P. Box, G.M. Jenkins and G. Reinsel, Time Series Analysis: Forecasting and Control, 3rd Ed. (Prentice Hall, 1994).

  4. P. Dinda, B. Lowekamp, L. Kallivokas and D. O'Hallaron, The case for prediction-based best-effort real-time systems, in: Proc. of the 7th International Workshop on Parallel and Distributed Real-Time Systems (WPDRTS 1999), Lecture Notes in Computer Science, Vol. 1586 (Springer, San Juan, PR, 1999) pp. 309–318. Extended version available as CMU Technical Report CMU-CS-TR-98-174.

    Google Scholar 

  5. P.A. Dinda, The statistical properties of host load, Scientific Programming 7(3,4) (1999). A version of this paper is also available as CMU Technical Report CMU-CS-TR-98-175.

  6. P.A. Dinda, Resource signal prediction and its application to realtime scheduling advisors, Ph.D. thesis, School of Computer Science, Carnegie Mellon University (2000); Carnegie Mellon University Computer Science Department Technical Report CMU-CS-00-131.

  7. P.A. Dinda and D.R. O'Hallaron, An extensible toolkit for resource prediction in distributed systems, Technical Report CMU-CS-99-138, School of Computer Science, Carnegie Mellon University (1999).

  8. P.A. Dinda and D.R. O'Hallaron, Host load prediction using linear models, Cluster Computing 3(4) (2000).

  9. P.A. Dinda and D.R. O'Hallaron, Realistic CPU workloads through host load trace playback, in: Proc. of 5th Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers (LCR2000), Lecture Notes in Computer Science, Vol. 1915 (Rochester, New York), 2000) pp. 246–259.

    Google Scholar 

  10. A.C. Dusseau, R.H. Arpaci and D.E. Culler, Effective distributed scheduling of parallel workloads, in: Proc. of the 1996 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems (1996) pp. 25-36.

  11. D.L. Eager, E.D. Lazowska and J. Zahorjan, The limited performance benefits of migrating active processes for load sharing, in: SIGMETRICS '88 (1988) pp. 63-72.

  12. I. Foster and C. Kesselman (eds.), The Grid: Blueprint for a New Computing Infrastructure (Morgan Kaufmann, 1999).

  13. M. Harchol-Balter and A.B. Downey, Exploiting process lifetime distributions for dynamic load balancing, in: Proc. of ACM SIGMETRICS '96 (1996) pp. 13-24.

  14. R. Jain, The Art of Computer Systems Performance Analysis (Wiley, 1991).

  15. J.F. Kurose and R. Chipalkatti, Load sharing in soft real-time distributed computer systems, IEEE Transactions on Computers C36(8) (1987) 993–1000.

  16. W.E. Leland and T.J. Ott, Load-balancing heuristics and process behavior, in: Proceedings of ACM SIGMETRICS, Vol. 14 (1986) pp. 54-69.

  17. C.L. Liu and J.W. Layland, Scheduling algorithms for multiprogramming in a hard real-time environment, Journal of the ACM 20(1) (1973) 46–61.

    Google Scholar 

  18. J. Lopez and D. O'Hallaron, Runtime support for adaptive heavyweight services, in: Proc. of 5th Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers (LCR2000), Lecture Notes in Computer Science, Vol. 1915 (Rochester, NY, 2000) pp. 221–234.

    Google Scholar 

  19. B. Lowekamp, N. Miller, D. Sutherland, T. Gross, P. Steenkiste and J. Subhlok, A resource monitoring system for network-aware applications, in: Proc. of the 7th IEEE International Symposium on High Performance Distributed Computing (HPDC) (1998) pp. 189-196.

  20. M.W. Mutka and M. Livny, The available capacity of a privately owned workstation environment, Performance Evaluation 12(4) (1991) 269–284.

    Google Scholar 

  21. Object Management Group, The Common Object Request Broker: Architecture and Specification (version 2.3.1); Technical report, Object Management Group (1999).

  22. K. Ramamrithham, J.A. Stankovic and W. Zhao, Distributed scheduling of tasks with deadlines and resource requirements, IEEE Transactions on Computer Systems 38(8) (1989) 1110–1123.

    Google Scholar 

  23. M. Rinard, D. Scales and M. Lam, Jade: A high-level machineindependent language for parallel programming, IEEE Computer 26(6) (1993) 28–38.

    Google Scholar 

  24. J.M. Schopf and F. Berman, Stochastic scheduling, in: Proc. of Supercomputing'99 (1999). Also available as Northwestern University Computer Science Department Technical Report CS-99-03.

  25. B. Siegell and P. Steenkiste, Automatic generation of parallel programs with dynamic load balancing, in: Proc. of the 3rd International Symposium on High-Performance Distributed Computing (1994) pp. 166-175.

  26. J. Stankovic, K. Ramamritham, D. Niehaus, M. Humphrey and G.Wallace, The spring system: Integrated support for complex real-time systems, Real-Time Systems Journal 16(2/3) (1999).

  27. R. Wolski, Forecasting network performance to support dynamic scheduling using the network weather service, in: Proc. of the 6th High-Performance Distributed Computing Conference (HPDC97) (1997) pp. 316-325. Extended version available as UCSD Technical Report TR-CS96-494.

  28. R. Wolski, N. Spring and J. Hayes, Predicting the CPU availability of time-shared unix systems, in: Proceedings of the 8th IEEE Symposium on High Performance Distributed Computing (HPDC99) (1999) pp. 105-112. UCSD Technical Report Number CS98-602.

  29. J.A. Zinky, D.E. Bakken and R.E. Schantz, Architectural support for quality of service for CORBA objects, Theory and Practice of Object Systems 3(1) (1997) 55–73.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Dinda, P.A. Online Prediction of the Running Time of Tasks. Cluster Computing 5, 225–236 (2002). https://doi.org/10.1023/A:1015634802585

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/A:1015634802585

Navigation