nach oben

Cluster Computing

Erschienen in:

24.03.2017

A BSP model graph processing system on many cores

verfasst von: Siyan Lai, Guangda Lai, Fangzhou Lu, Guojun Shen, Jing Jin, Xiaola Lin

Erschienen in: Cluster Computing | Ausgabe 2/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Large-scale graph processing plays an increasingly important role for many data-related applications. Recently GPU has been adopted to accelerate various graph processing algorithms. However, since the architecture of GPU is very different from traditional computing model, the learning threshold for developing GPU-based applications is high. In this paper, we propose a GPU-based parallel graph processing system named GPregel to tackle this challenge. GPregel is a BSP model in graph processing such as Pregel from Google. It harnesses a lightweight compiler to hide the underlying complexity of the parallel processing details and simplifies programming, so that it greatly reduces the difficulty in utilizing the GPU to solve graph computing problems. Moreover, GPregel develops several optimizations for enhancing the performance, including (1) a special storage model for BSP model running on GPU, which overcomes the execution divergence and irregular memory access by coarse-grained designs; (2) a warp-level optimal strategy Parallelized-Messages-Sending and a thread-level optimal strategy Threads-Merge-Executing to accelerate the computations of high degree vertexes and low degree vertexes respectively; (3) messages copy mechanism optimization that utilizes a shared array and a rolling array to speed up the messages copy. Experiments demonstrate that GPregel can achieve high performance with little work for developers.

Vorheriger Artikel An improved CSMA/CA algorithm based on WSNs of the drug control system

Nächster Artikel An immune-inspired political boycotts action prediction paradigm

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Kapre, N., Mehta, N., Rizzo, D., Eslick, I., Rubin, R., Uribe, T.E., DeHon, A.: GraphStep: A system architecture for sparse-graph algorithms. In: Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM’06), pp. 143–151 (2006)

Bader, D.A., Madduri, K.: GTgraph: A Synthetic Graph Generator Suite, Atlanta (2006)

Malewicz, G., Austern, M.H., Bik, A.J., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pp. 135–146 (2010)

Zhong, J., He, B.: Medusa: simplified graph processing on GPUs. IEEE Trans. Parallel Distrib. Syst. 25(6), 1543–1552 (2014)MathSciNetCrossRef

Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990)CrossRef

Harish, P., Narayanan, P.J.: Accelerating Large Graph Algorithms on the GPU Using CUDA. Lecture Notes in Computer Science, pp. 197–208. Springer, Berlin (2007)

He, G., Feng, H., Li, C., Chen, H.: Parallel SimRank computation on large graphs with iterative aggregation. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 543–552, ACM (2010)

Katz, G.J., Kider, Jr, J.T.: All-pairs shortest-paths for large graphs on the GPU. In: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Symposium on Graphics Hardware, pp. 47–55 (2008)

Vineet, V., Narayanan, P.J.: CUDA cuts: fast graph cuts on the GPU. In: Proceedings of the IEEE Computer Society Computer Vision and Pattern Recognition Workshops, pp. 1–8 (2008)

10.

Protocol Buffers: Google’s data interchange format. https://code.google.com/p/GPregel/

11.

Apache thrift: http://thrift.apache.org/

12.

Nvidia.: CUDA C Programming Guide version 8.0. (2016)

13.

Bell, N., Hoberock, J.: Thrust: a productivity-oriented library for CUDA 26. In: Kirk, D., Hwu, W. (eds.) Programming Massively Parallel Processors, 2nd edn, pp. 339–358. Elsevier, Amsterdam (2013)CrossRef

14.

Satish, N., Harris, M., Garland, M.: Designing efficient sorting algorithms for manycore GPUs. In: Proceedings of the IEEE International Symposium on Parallel & Distributed Processing. IPDPS 2009, pp. 1–10 (2009)

15.

Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the web. Stanford InfoLab (1999)

16.

Mtibaa, A., May, M., Diot, C., Ammar, M.: PeopleRank: social opportunistic forwarding. IEEE Int. Conf. Comput. Commun. 54(1), 1–5 (2010)MATH

17.

Jones, S.: Introduction to dynamic parallelism. In: GPU Technology Conference Presentation S, vol. 338, p. 2012 (2012)

18.

Dale, J.: CUDA function overheads. http://visionexperts.blogspot.com/2009/07/cuda-function-overheads.html

19.

Chakrabarti, D., Zhan, Y., Faloutsos, C.: R-MAT: a recursive model for graph mining. In: Proceedings of the Fourth SIAM International Conference on Data Mining SDM’ 04 (2004)

20.

Amaral, L.A.N., Scala, A., Barthélémy, M., Stanley, H.E.: Classes of small-world networks. Proc. Natl. Acad. Sci. USA 97(21), 11149–11152 (2000)CrossRef

21.

Erdos, P., Renyi, A.: On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5, 17–60 (1960)MathSciNetMATH

22.

Stanford Large Network Dataset Collection: http://snap.stanford.edu/data/index.html

23.

Harris, M., Sengupta, S., Owens, J.D.: Parallel prefix sum (scan) with CUDA. In: GPU Gems 3 (2007)

24.

Gregor, D., Lumsdaine, A.: The parallel BGL: a generic library for distributed graph computations. In: Parallel Object-Oriented Scientific Computing (POOSC) (2005)

25.

Berry, J., Mackey, G.: MultiThreaded graph library (MTGL). https://software.sandia.gov/trac/mtgl

26.

Apache Incubator Giraph: http://incubator.apache.org/giraph/

27.

GoldenOrb: http://www.raveldata.com/goldenorb/

28.

Salihoglu, S., Widom J.: GPS: a graph processing system *. Stanford InfoLab (2013)

29.

Bu, Y., Borkar, V., Jia, J., Carey, M.J., Condie, T.: Pregelix: big (ger) graph analytics on a dataflow engine. Proc. VLDB Endow. 8(2), 161–172 (2014)CrossRef

30.

Phoebus: https://github.com/xslogic/phoebus

31.

Hong, S., Kyun, S., Tayo, K., Olukotun, O.K.: Accelerating CUDA graph algorithms at maximum warp. In: PPoPP, vol. 46, no. 8, pp. 267–276 (2011)

32.

Luo, L., Wong, M., Hwu, W.: An effective GPU implementation of breadth-first search. In: Proceedings of the 47th ACM/IEEE Design Automation Conference (DAC), pp. 52–55 (2010)

33.

Merrill, D., Garland, M., Grimshaw, A.: High-performance and scalable GPU graph traversal. In: Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’12), vol. 47, no. 8, pp. 117–128 (2011)

34.

Liu, H., Huang, H.H.: Enterprise: breadth-first graph traversal on gpus. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, p. 68 (2015)

35.

Liu, H., Huang, H.H., Hu, Y.: iBFS: concurrent breadth-first search on GPUs. In: Proceedings of the 2016 International Conference on Management of Data, ACM, pp. 403–416 (2016)

36.

Wang, J., Rubin, N., Sidelnik, A., Yalamanchili, S.: Laperm: Locality aware scheduler for dynamic parallelism on gpus. In: Proceedings of the 43rd International Symposium on Computer Architecture, pp. 583–595 (2016)

37.

Tang, X., Pattnaik, A., Jiang, H., Kayiran, O., Jog, A., Sreepathi Pai, M.I., Das, C.R.: Controlled Kernel Launch for dynamic parallelism in GPUs. In: Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, ACM (2017)

38.

El Hajj, I., Gómez-Luna, J., Li, C., Chang, L.W., Milojicic, D., Hwu, W.M.: KLAP: kernel launch aggregation and promotion for optimizing dynamic parallelism. In: Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 1–12 (2016)

39.

Wang, Y., Pan, Y., Davidson, A., Wu, Y., Yang, C., Wang, L., Owens, J. D.: Gunrock: GPU graph analytics. arXiv preprint arXiv:1701.01170 (2017)

Titel: A BSP model graph processing system on many cores
verfasst von: Siyan Lai
Guangda Lai
Fangzhou Lu
Guojun Shen
Jing Jin
Xiaola Lin
Publikationsdatum: 24.03.2017
Verlag: Springer US
Erschienen in: Cluster Computing / Ausgabe 2/2017
Print ISSN: 1386-7857
Elektronische ISSN: 1573-7543
DOI: https://doi.org/10.1007/s10586-017-0829-0

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Weitere Artikel der Ausgabe 2/2017

Ant colony optimization with different crossover schemes for global optimization

Bonded-cluster simulation of rock-cutting using PFC2D

A practical cross-datacenter fault-tolerance algorithm in the cloud storage system

High-performance data mining with intelligent SSD

An improved CSMA/CA algorithm based on WSNs of the drug control system

Trend analysis of variations in carbon stock using stock big data