Top

The Journal of Supercomputing

Published in:

01-06-2014

Scheduling parallel jobs on multicore clusters using CPU oversubscription

Authors: Gladys Utrera, Julita Corbalan, Jesús Labarta

Published in: The Journal of Supercomputing | Issue 3/2014

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

Abstract

Job scheduling strategies in multiprocessing systems aim to minimize waiting times of jobs while satisfying user requirements in terms of number of execution units. However, the lack of flexibility in the requests leaves the scheduler a reduced margin of action for scheduling decisions. Many of such decisions consist on just moving ahead some specific jobs in the wait queue. In this work, we propose a job scheduling strategy that improves the overall performance and maximizes resource utilization by allowing jobs to adapt to variations in the load through CPU oversubscription and backfilling. The experimental evaluations include both real executions on multicore clusters and simulations of workload traces from real production systems. The results show that our strategy provides significant improvements over previous proposals like Gang Scheduling with Backfilling, especially in medium to high workloads with strong variations.

previous article Adaptive global power optimization for Web servers

next article A task migration mechanism for distributed many-core operating systems

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

We assume that two processes are executed twice slower on a single CPU than on two separated CPUs. Different studies indicate that this time could be by far less than twice [5, 6], since computation and communication can be overlapped.

Top500 supercomputers sites (2013) http://www.top500.org/. Accessed 7 Mar 2014

Feitelson DG, Rudolph L, Schwiegelshohn U, Sevcik KC, Wong P (1997) Theory and practice in parallel job scheduling. Lect Notes Comput Sci 1291:1–34

Feitelson DG, Rudolph L (1996) Toward convergence in job schedulers for parallel supercomputers. In: Job scheduling strategies for parallel processing. Springer-Verlag, New York, pp 1–26

Utrera G, Corbalán J, Labarta J (2004) Implementing malleability on MPI jobs. In: Proceedings of the 13th international conference on parallel architectures and compilation techniques, PACT ’04. IEEE Computer Society, Washington, pp 215–224

Subotic V, Labarta J, Valero M (2010) Simulation environment for studying overlap of communication and computation. Performance analysis of systems and software (ISPASS). In: 2010 IEEE international symposium on White Plains, pp 115–116

Utrera G, Tabik S, Corbalán J, Labarta J (2011) A job scheduling approach to reduce waiting times. In: Technical report, Technical University of Catalonia (UPC-DAC-RR-2012-1, October 2011) http://www.ac.upc.edu/app/research-reports/html/2012/1/tec_repor_092011. Accessed 18 Feb 2012

Feitelson DG, Rudolph L (1992) Gang scheduling performance benefits for fine-grain synchronization. J Parallel Distrib Comput 16(4):306–318CrossRefMATH

Zhang Y, Franke H, Moreira J, Sivasubramaniam A (2001) An integrated approach to parallel scheduling using gang-scheduling, backfilling, and migration. Job scheduling strategies for parallel processing. In: Feitelson D, Rudolph L (eds) Lecture notes in computer science, vol 2221. Springer, Berlin, Heidelberg, pp 133–158

Parallel workload archive (2013) http://www.cs.huji.ac.il/labs/parallel/workload/. Accessed 7 Mar 2014

10.

Buisson J, Sonmez O, Mohamed H, Lammers W, Epema D (2007) Scheduling malleable applications in multicluster systems. In: Proceedings of the IEEE international conference on cluster computing 2007, pp 372–381

11.

Cera MC, Georgiou Y, Richard O, Maillard N, Navaux POA (2010) Supporting malleability in parallel architectures with dynamic cpusets mapping and dynamic MPI. In: Proceedings of the 11th international conference on distributed computing and networking, ICDCN’10. Springer-Verlag, Berlin, Heidelberg, pp 242–257

12.

El Maghraoui K, Desell TJ, Szymanski BK, Varela CA (2007) Dynamic malleability in iterative MPI applications. In: Proceedings of the 7th IEEE international symposium on cluster computing and the grid, CCGRID ’07. IEEE Computer Society, Washington, pp 591–598

13.

Iancu C, Hofmeyr S, Zheng Y, Blagojevic F (2010) Oversubscription on multicore processors. In: 24th international parallel and distributed processing symposium (IPDPS), pp 1–11

14.

Padhye J, Dowdy LW (1996) Dynamic versus adaptive processor allocation policies for message passing parallel computers: an empirical comparison. In: Proceedings of the workshop on job scheduling strategies for parallel processing. Springer-Verlag, London, pp 224–243

15.

Cirne W, Berman F (2002) Using moldability to improve the performance of supercomputer jobs. J Parallel Distrib Comput 62:1571–1601CrossRefMATH

16.

Downey AB (1997) A model for speedup of parallel programs. In: Technical report, University of California, Berkerley

17.

Sodan AC, Jin W (2010) Backfilling with fairness and slack for parallel job scheduling. J Phys Conf Ser 256(1):012–023

18.

Sudarsan R, Ribbens CJ (2009) Scheduling resizable parallel applications. In: International parallel and distributed processing symposium, pp 1–10

19.

McCann C, Zahorjan J (1994) Processor allocation policies for message-passing parallel computers. In: Proceedings of the 1994 ACM SIGMETRICS conference on measurement and modeling of computer systems, SIGMETRICS ’94. ACM, New York, pp 19–32

20.

NAS Parallel Benchmarks (2013) http://www.nas.nasa.gov/Resources/Software/npb.html. Accessed 7 Mar 2014

21.

Wiseman Y, Feitelson DG (2003) Paired gang scheduling. IEEE Trans Parallel Distrib Syst 14(6):581–592CrossRef

22.

Arpaci-Dusseau AC (2001) Implicit coscheduling: coordinated scheduling with implicit information in distributed systems. ACM Trans Comput Syst 19:283–331CrossRef

23.

Zhang Y, Sivasubramaniam A, Moreira J, Franke H (2000) A simulation-based study of scheduling mechanisms for a dynamic cluster environment. In: Proceedings of the 14th international conference on supercomputing, ICS’00. ACM, New York, pp 100–109

24.

Utrera G, Corbalán J, Labarta J (2004) Scheduling of MPI applications: self-co-scheduling. In: Proceedings of the Euro-Par 2004 conference, 31th August–3rd September 2004, Italy. Lecture notes in computer science, vol 3149, pp 238–245. Springer, New York

25.

Utrera G, Tabik S, Corbalán J, Labarta J (2012) A job scheduling approach for multi-core clusters based on virtual malleability. In: Euro-Par, pp 191–203

26.

Lifka DA (1995) The ANL/IBM SP scheduling system. In: Job scheduling strategies for parallel processing. Springer Berlin, Heidelberg, pp 295–303 (1995)

27.

Mu’alem AW, Feitelson DG (2001) Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling. IEEE Trans Parallel Distrib Syst 12(6):529–543CrossRef

28.

MPI library (2013) http://www.mcs.anl.gov/research/projects/mpi/. Accessed 7 Mar 2014

29.

MacDougall MH (1987) Simulating computer systems: techniques and tools. MIT Press, Cambridge

30.

Subhlok J, Venkataramaiah S, Singh A (2002) Characterizing NAS benchmark performance on shared heterogeneous networks. In: Proceedings of the 16th international parallel and distributed processing symposium, IPDPS ’02. IEEE Computer Society, Washington, pp 91

31.

Intel MPI Benchmarks (2011) http://software.intel.com/en-us/articles/intel-mpi-benchmarks. Accessed 7 Mar 2014

Title: Scheduling parallel jobs on multicore clusters using CPU oversubscription
Authors: Gladys Utrera
Julita Corbalan
Jesús Labarta
Publication date: 01-06-2014
Publisher: Springer US
Published in: The Journal of Supercomputing / Issue 3/2014
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-014-1142-9

Springer Professional

Abstract

Please log in to get access to your license.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Springer Professional "Wirtschaft+Technik"

Other articles of this Issue 3/2014

Buffering in proxy mobile IPv6: implementation and analysis

Cache vulnerability mitigation using an adaptive cache coherence protocol

A Matrix–Matrix Multiplication methodology for single/multi-core architectures using SIMD

Adaptive global power optimization for Web servers

Cloud-based architecture for web applications with load forecasting mechanism: a use case on the e-learning services of a distant university

A self-adaptive resources selection model through a small-world based heuristic

Premium Partner