
Published in: Cluster Computing

1 December 2016

A memory-driven scheduling scheme and optimization for concurrent execution in GPU

Authors: Bao-yu Xu, Wu Zhang, Xian-he Sun, Yang Wang

Published in: Cluster Computing | Issue 4/2016


Abstract

Modern GPU devices support concurrent execution of GPU tasks. However, limited device memory is an obvious bottleneck when many GPU tasks execute, and task priority and system performance are often ignored. To address these issues, this paper proposes a real-time GPU scheduling scheme. The scheme adopts a reservation algorithm based on device memory (RBDM) to provide more opportunities for high-priority tasks. High priority first wake (HPFW) and small memory HPFW (SM-HPFW) are employed in scheduling waiting tasks to improve priority response time and system performance. A CPU-based monitor is developed to check GPU task execution. Experiments show that RBDM works effectively. Compared with FIFO, HPFW decreases overall priority response time significantly. SM-HPFW reduces overall task completion time by 20% when the device memory requirements of the GPU tasks are evenly distributed.
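The abstract names the policies but does not define them, so the following is only an illustrative sketch of how such wake orderings might differ, not the authors' implementation. It assumes that HPFW wakes waiting tasks by descending priority, that SM-HPFW additionally prefers the smaller memory request among tasks of equal priority, and that the reservation keeps a fixed amount of device memory free for high-priority tasks. The names `Task`, `wake_order`, `admit`, `reserved_mb`, and `high_priority` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    priority: int  # assumed convention: larger value = more urgent
    mem_mb: int    # device memory the task requests
    arrival: int   # position in the waiting queue


def wake_order(waiting, policy):
    """Return the waiting tasks in the order the policy would try to wake them."""
    if policy == "FIFO":
        key = lambda t: t.arrival
    elif policy == "HPFW":      # assumed: highest priority first, then arrival
        key = lambda t: (-t.priority, t.arrival)
    elif policy == "SM-HPFW":   # assumed: ties in priority go to the smaller request
        key = lambda t: (-t.priority, t.mem_mb, t.arrival)
    else:
        raise ValueError(f"unknown policy: {policy}")
    return sorted(waiting, key=key)


def admit(waiting, free_mb, policy, reserved_mb=0, high_priority=1):
    """Wake tasks in policy order while their request fits in free device memory.

    Assumption: tasks below `high_priority` may not dip into `reserved_mb`,
    and a task that does not fit is skipped rather than blocking the queue.
    """
    admitted = []
    for t in wake_order(waiting, policy):
        usable = free_mb if t.priority >= high_priority else free_mb - reserved_mb
        if t.mem_mb <= usable:
            admitted.append(t.name)
            free_mb -= t.mem_mb
    return admitted


tasks = [
    Task("A", priority=0, mem_mb=600, arrival=0),
    Task("B", priority=1, mem_mb=500, arrival=1),
    Task("C", priority=1, mem_mb=300, arrival=2),
    Task("D", priority=0, mem_mb=200, arrival=3),
]

print(admit(tasks, 1000, "FIFO"))                   # ['A', 'C']
print(admit(tasks, 1000, "HPFW"))                   # ['B', 'C', 'D']
print(admit(tasks, 1000, "SM-HPFW"))                # ['C', 'B', 'D']
print(admit(tasks, 1000, "FIFO", reserved_mb=500))  # ['B', 'C']
```

In this toy queue, FIFO lets the low-priority task A take memory that the high-priority task B needs, while the priority-aware orderings and the reservation both let B run. On a real device the free-memory figure would come from the CUDA runtime rather than a constant.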



Title: A memory-driven scheduling scheme and optimization for concurrent execution in GPU
Authors: Bao-yu Xu, Wu Zhang, Xian-he Sun, Yang Wang
Publication date: 1 December 2016
Publisher: Springer US
Published in: Cluster Computing, Issue 4/2016
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI: https://doi.org/10.1007/s10586-016-0656-8
