Skip to main content
Erschienen in: International Journal of Parallel Programming 4/2016

01.08.2016

Achieving Optimal Inter-Node Communication in Graph Partitioning Using Random Selection and Breadth-First Search

verfasst von: Srimanth Gadde, William Acosta, Jordan Ringenberg, Robert Green, Vijay Devabhaktuni

Erschienen in: International Journal of Parallel Programming | Ausgabe 4/2016

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Processing large graph datasets represents an increasingly important area in computing research and applications. The size of many graph datasets has increased well beyond the processing capacity of a single computing node, thereby necessitating distributed approaches. As these datasets are processed over a distributed system of nodes, this leads to an inter-node communication cost problem that negatively affects system performance. Previously proposed algorithms implemented breadth-first search (BFS) for graph searching and focused on the execution, parallel performance and not the communication. In this paper a new methodology is proposed that combines BFS with random selection in order to partition large graph datasets and effectively minimize inter-node communication. The new method is discussed and applied to the single-source shortest path and PageRank algorithms using three graphs that are representative of real-world scenarios. Experimental results show that graph inter-node communication for canonical graphs representative of real-world data is improved up to 42 % in case of Powerlaw graph, up to 27 % in case of Random near K-regular graph (with low degree), and up to 7 % in case of Random near K-regular graph (with high degree).

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Kang, U., Tsourakakis, C.E., Faloutsos, C.: PEGASUS: a peta-scale graph mining system implementation and observations. Data mining, 2009. In: Ninth IEEE International Conference on ICDM’09. IEEE (2009) Kang, U., Tsourakakis, C.E., Faloutsos, C.: PEGASUS: a peta-scale graph mining system implementation and observations. Data mining, 2009. In: Ninth IEEE International Conference on ICDM’09. IEEE (2009)
2.
Zurück zum Zitat Kang, U., et al.: HADI: Fast diameter estimation and mining in massive graphs with Hadoop. Carnegie Mellon University, School of Computer Science, Machine Learning Department (2008) Kang, U., et al.: HADI: Fast diameter estimation and mining in massive graphs with Hadoop. Carnegie Mellon University, School of Computer Science, Machine Learning Department (2008)
3.
Zurück zum Zitat Malewicz, G., et al.: Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 International Conference on Management of data. ACM (2010) Malewicz, G., et al.: Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 International Conference on Management of data. ACM (2010)
4.
Zurück zum Zitat Shao, B., Wang, H., Li, Y.: The Trinity graph engine. Technical Report 161291, Microsoft Research (2012) Shao, B., Wang, H., Li, Y.: The Trinity graph engine. Technical Report 161291, Microsoft Research (2012)
5.
Zurück zum Zitat Bu, Y., Howe, B., Balazinska, M., Ernst, M.D.: Haloop: efficient iterative data processing on large clusters. PVLDB 3(1), 285–296 (2010) Bu, Y., Howe, B., Balazinska, M., Ernst, M.D.: Haloop: efficient iterative data processing on large clusters. PVLDB 3(1), 285–296 (2010)
6.
Zurück zum Zitat Yanfeng, Z., et al.: Priter: a distributed framework for prioritized iterative computations. In: Proceedings of the 2nd ACM Symposium on Cloud Computing. ACM (2011) Yanfeng, Z., et al.: Priter: a distributed framework for prioritized iterative computations. In: Proceedings of the 2nd ACM Symposium on Cloud Computing. ACM (2011)
7.
Zurück zum Zitat Russell, P., Jinyang, L.: Piccolo: building fast, distributed programs with partitioned tables. In: Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, USENIX Association (2010) Russell, P., Jinyang, L.: Piccolo: building fast, distributed programs with partitioned tables. In: Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, USENIX Association (2010)
8.
Zurück zum Zitat Matei, Z., et al.: Spark: cluster computing with working sets. In: Proceedings of the 2nd USENIX Conference on hot Topics in Cloud Computing, USENIX Association (2010) Matei, Z., et al.: Spark: cluster computing with working sets. In: Proceedings of the 2nd USENIX Conference on hot Topics in Cloud Computing, USENIX Association (2010)
10.
Zurück zum Zitat Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the Web. Technical Report, Stanford InfoLab (1999) Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the Web. Technical Report, Stanford InfoLab (1999)
12.
Zurück zum Zitat Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990)CrossRef Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990)CrossRef
13.
Zurück zum Zitat Sandeep, K.: A distributed algorithm for k-way graph partitioning. In: Proceedings of 25th Conference on EUROMICRO 2. IEEE (1999) Sandeep, K.: A distributed algorithm for k-way graph partitioning. In: Proceedings of 25th Conference on EUROMICRO 2. IEEE (1999)
15.
Zurück zum Zitat Rishan, C., et al.: Improving large graph processing on partitioned graphs in the cloud. In: Proceedings of the Third ACM Symposium on Cloud Computing. ACM (2012) Rishan, C., et al.: Improving large graph processing on partitioned graphs in the cloud. In: Proceedings of the Third ACM Symposium on Cloud Computing. ACM (2012)
16.
Zurück zum Zitat Isabelle, S., Gabriel, K.: Streaming graph partitioning for large distributed graphs. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2012) Isabelle, S., Gabriel, K.: Streaming graph partitioning for large distributed graphs. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2012)
17.
Zurück zum Zitat Reid, A., Fan, C., Lincoln, L.: Drawing Power Law Graphs. Graph Drawing. Springer, Berlin/Heidelberg (2005)MATH Reid, A., Fan, C., Lincoln, L.: Drawing Power Law Graphs. Graph Drawing. Springer, Berlin/Heidelberg (2005)MATH
18.
Zurück zum Zitat Andries, B.E., Willem, H.H.: Distance-Regular Graphs. In: Brouwer, A.E., Haemers, W.H. (eds.) Spectra of Graphs, pp. 177–185. Springer, New York (2012) Andries, B.E., Willem, H.H.: Distance-Regular Graphs. In: Brouwer, A.E., Haemers, W.H. (eds.) Spectra of Graphs, pp. 177–185. Springer, New York (2012)
19.
Zurück zum Zitat Richard, K.E., Peter, S.: Large-scale parallel breadth-first search. In: Proceedings of the National Conference on Artificial Intelligence. 20.3. AAAI Press, MIT Press: Menlo Park, Cambridge, London (1999) (2005) Richard, K.E., Peter, S.: Large-scale parallel breadth-first search. In: Proceedings of the National Conference on Artificial Intelligence. 20.3. AAAI Press, MIT Press: Menlo Park, Cambridge, London (1999) (2005)
20.
Zurück zum Zitat Brendan, M.D., Nicholas, C.W.: Uniform generation of random regular graphs of moderate degree. J. Algorithm 11(1), 52–67 (1990)MathSciNetCrossRefMATH Brendan, M.D., Nicholas, C.W.: Uniform generation of random regular graphs of moderate degree. J. Algorithm 11(1), 52–67 (1990)MathSciNetCrossRefMATH
23.
Zurück zum Zitat François, P., Jean, R.: Scotch: A Software Package for Static Mapping by Dual Recursive Partitioning of Process and Architecture Graphs. High-Performance Computing and Networking. Springer, Berlin/Heidelberg (1996) François, P., Jean, R.: Scotch: A Software Package for Static Mapping by Dual Recursive Partitioning of Process and Architecture Graphs. High-Performance Computing and Networking. Springer, Berlin/Heidelberg (1996)
24.
Zurück zum Zitat Jeffrey, D., Sanjay, G.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)CrossRef Jeffrey, D., Sanjay, G.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)CrossRef
25.
Zurück zum Zitat Hung-chih, Y., et al.: Map-reduce-merge: simplified relational data processing on large clusters. In: Proceedings of the 2007 ACM SIGMOD International Conference on Management of data. ACM (2007) Hung-chih, Y., et al.: Map-reduce-merge: simplified relational data processing on large clusters. In: Proceedings of the 2007 ACM SIGMOD International Conference on Management of data. ACM (2007)
30.
Zurück zum Zitat Salihoglu, S., Widom, J.: GPS: a graph processing system. In: Szalay, A., Budavari, T., Balazinska, M., Meliou, A., Sacan, A. (eds.) Proceedings of the 25th International Conference on Scientific and Statistical Database Management (SSDBM). ACM, New York, NY (2013). doi:10.1145/2484838.2484843 Salihoglu, S., Widom, J.: GPS: a graph processing system. In: Szalay, A., Budavari, T., Balazinska, M., Meliou, A., Sacan, A. (eds.) Proceedings of the 25th International Conference on Scientific and Statistical Database Management (SSDBM). ACM, New York, NY (2013). doi:10.​1145/​2484838.​2484843
31.
Zurück zum Zitat Zhenyu, G., et al.: g2: A graph processing system for diagnosing distributed sytems. In: Proceedings of the 2011USENIX Annual Technical Conference, USENIXATC (2011) Zhenyu, G., et al.: g2: A graph processing system for diagnosing distributed sytems. In: Proceedings of the 2011USENIX Annual Technical Conference, USENIXATC (2011)
32.
Zurück zum Zitat Kyungho, J., et al.: Large graph processing based on remote memory system. In: 12th IEEE International Conference on High Performance Computing and Communications (HPCC). IEEE (2010) Kyungho, J., et al.: Large graph processing based on remote memory system. In: 12th IEEE International Conference on High Performance Computing and Communications (HPCC). IEEE (2010)
33.
Zurück zum Zitat Jian, L., et al.: Cost-conscious scheduling for large graph processing in the cloud. In: IEEE 13th International Conference on High Performance Computing and Communications (HPCC). IEEE (2011) Jian, L., et al.: Cost-conscious scheduling for large graph processing in the cloud. In: IEEE 13th International Conference on High Performance Computing and Communications (HPCC). IEEE (2011)
34.
Zurück zum Zitat Ulrich, M.: External memory BFS on undirected graphs with bounded degree. In: Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics (2001) Ulrich, M.: External memory BFS on undirected graphs with bounded degree. In: Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics (2001)
35.
Zurück zum Zitat Kurt, M., Ulrich, M.: External-Memory Breadth First Search with Sublinear I/O. Algorithms-ESA 2002. Springer, Berlin, Heidelberg (2002)MATH Kurt, M., Ulrich, M.: External-Memory Breadth First Search with Sublinear I/O. Algorithms-ESA 2002. Springer, Berlin, Heidelberg (2002)MATH
36.
Zurück zum Zitat Yinglong, X., Viktor K.P.: Topologically adaptive parallel-first search on multicore processors. In: Proceedings of the 21st IASTED International Conference, vol. 668(77) (2009) Yinglong, X., Viktor K.P.: Topologically adaptive parallel-first search on multicore processors. In: Proceedings of the 21st IASTED International Conference, vol. 668(77) (2009)
37.
Zurück zum Zitat Deepak, A.: Design, implementation and experimental study of external memory BFS algorithms. Master thesis. Max-Planck-Institute für Informatik, Saarbrücken, Germany (2005) Deepak, A.: Design, implementation and experimental study of external memory BFS algorithms. Master thesis. Max-Planck-Institute für Informatik, Saarbrücken, Germany (2005)
38.
Zurück zum Zitat Deepak, A., et al.: Improved external memory BFS implementation. In: Proceedings of the Workshop on Algorithm Engineering and Experiments (2007) Deepak, A., et al.: Improved external memory BFS implementation. In: Proceedings of the Workshop on Algorithm Engineering and Experiments (2007)
39.
Zurück zum Zitat Zhou, R., Hansen, E.A: Breadth-first heuristic search. In Proceedings of the 14th International Conference on Automated Planning and Scheduling ICAPS, pp. 92–100 (2004) Zhou, R., Hansen, E.A: Breadth-first heuristic search. In Proceedings of the 14th International Conference on Automated Planning and Scheduling ICAPS, pp. 92–100 (2004)
40.
Zurück zum Zitat Zhou, R., Hansen, E.A: A breadth-first approach to memory-efficient graph search. In: Proceedings of the National Conference on Artificial Intelligence, vol. 21(2) (2006) Zhou, R., Hansen, E.A: A breadth-first approach to memory-efficient graph search. In: Proceedings of the National Conference on Artificial Intelligence, vol. 21(2) (2006)
41.
Zurück zum Zitat Todd, G., Saad, Y.: Heuristic algorithms for automation graph partitioning. University of Minnesota Supercomputer Institute, Minneapolis, MN 55415, pp. 94–29 (1994) Todd, G., Saad, Y.: Heuristic algorithms for automation graph partitioning. University of Minnesota Supercomputer Institute, Minneapolis, MN 55415, pp. 94–29 (1994)
42.
Zurück zum Zitat Sahar, I., Wael, E.: Computing breadth first search in large graph using hmetis partitioning. Eur. J. Sci. Res. 29(2), 215–221 (2009) Sahar, I., Wael, E.: Computing breadth first search in large graph using hmetis partitioning. Eur. J. Sci. Res. 29(2), 215–221 (2009)
43.
Zurück zum Zitat Shang, H., Kitsuregawa, M.: Efficient breadth-first search on large graphs with skewed degree distributions. In: Proceedings of the 16th International Conference on Extending Database Technology (EDBT ’13), pp. 311–322. ACM, New York, NY (2013). doi:10.1145/2452376.2452413 Shang, H., Kitsuregawa, M.: Efficient breadth-first search on large graphs with skewed degree distributions. In: Proceedings of the 16th International Conference on Extending Database Technology (EDBT ’13), pp. 311–322. ACM, New York, NY (2013). doi:10.​1145/​2452376.​2452413
44.
Zurück zum Zitat Scott, B., et al.: Direction-optimizing breadth first search. In: International Conference for High Performance Computing, Networking, Storage and Analysis (SC). IEEE (2012) Scott, B., et al.: Direction-optimizing breadth first search. In: International Conference for High Performance Computing, Networking, Storage and Analysis (SC). IEEE (2012)
45.
Zurück zum Zitat David, A.B., et al.: Designing multithreaded algorithms for breadth-first search and st-connectivity on the Cray MTA-2. Parallel processing, 2006, ICPP 2006. In: International Conference on IEEE (2006) David, A.B., et al.: Designing multithreaded algorithms for breadth-first search and st-connectivity on the Cray MTA-2. Parallel processing, 2006, ICPP 2006. In: International Conference on IEEE (2006)
46.
Zurück zum Zitat Aydin, B., Kamesh, M.: Parallel breadth-first search on distributed memory systems. In: Proceedings of 2011 International Conference for High performance Computing. Networking, Storage and Analysis. ACM (2011) Aydin, B., Kamesh, M.: Parallel breadth-first search on distributed memory systems. In: Proceedings of 2011 International Conference for High performance Computing. Networking, Storage and Analysis. ACM (2011)
47.
Zurück zum Zitat Shengqi, Y., et al.: Towards effective partition management for large graphs. In: Proceedings of the 2012 International Conference on Management of Data. ACM (2012) Shengqi, Y., et al.: Towards effective partition management for large graphs. In: Proceedings of the 2012 International Conference on Management of Data. ACM (2012)
48.
Zurück zum Zitat Josep, M.P., et al.: The little engine (s) that could: scaling online social networks. ACM SIGCOMM Computer Communication Review, vol. 40(4) ACM (2010) Josep, M.P., et al.: The little engine (s) that could: scaling online social networks. ACM SIGCOMM Computer Communication Review, vol. 40(4) ACM (2010)
49.
Zurück zum Zitat Edmond, C., Keith, H., Andy, Y.: Distributed Breadth-First Search with 2-D Partitioning. LLNL, Technical Report. UCRL-CONF-210829 (2005) Edmond, C., Keith, H., Andy, Y.: Distributed Breadth-First Search with 2-D Partitioning. LLNL, Technical Report. UCRL-CONF-210829 (2005)
50.
Zurück zum Zitat Victor, M., et al.: Graph Partitioning for Efficient BFS in Shared-Nothing Parallel Systems. Web-Age Information Management. Springer, Berlin Heidelberg (2010) Victor, M., et al.: Graph Partitioning for Efficient BFS in Shared-Nothing Parallel Systems. Web-Age Information Management. Springer, Berlin Heidelberg (2010)
51.
Zurück zum Zitat Buntinas, D., Mercier, G., Gropp, W.: Design and evaluation of nemesis, a scalable, low-latency, message-passing communication subsystem. In: Proceedings of the International Symposium on Cluster Computing and the Grid (2006) Buntinas, D., Mercier, G., Gropp, W.: Design and evaluation of nemesis, a scalable, low-latency, message-passing communication subsystem. In: Proceedings of the International Symposium on Cluster Computing and the Grid (2006)
52.
Zurück zum Zitat Chai, L., Gao, Q., Panda, D.K.: Understanding the Impact of Multi-Core Architecture in Cluster Computing: A Case Study with Intel Dual-Core System, Cluster Computing and the Grid, 2007. Seventh IEEE International Symposium on CCGRID 2007. IEEE (2007) Chai, L., Gao, Q., Panda, D.K.: Understanding the Impact of Multi-Core Architecture in Cluster Computing: A Case Study with Intel Dual-Core System, Cluster Computing and the Grid, 2007. Seventh IEEE International Symposium on CCGRID 2007. IEEE (2007)
53.
Zurück zum Zitat El-Ghazawi, T., et al.: The promise of high-performance reconfigurable computing. IEEE Comput. 41(2), 69–76 (2008)CrossRef El-Ghazawi, T., et al.: The promise of high-performance reconfigurable computing. IEEE Comput. 41(2), 69–76 (2008)CrossRef
55.
Zurück zum Zitat Rosner, B., Grove, D.: Use of the Mann–Whitney \(U\) test for clustered data. Stat. Med. 18(11), 1387–1400 (1999)CrossRef Rosner, B., Grove, D.: Use of the Mann–Whitney \(U\) test for clustered data. Stat. Med. 18(11), 1387–1400 (1999)CrossRef
56.
Zurück zum Zitat Živorad, M.: Application of Mann–Whitney \(U\) test in research of professional training of primary school teachers. Metodički Obzori 6(11), 73–79 (2011) Živorad, M.: Application of Mann–Whitney \(U\) test in research of professional training of primary school teachers. Metodički Obzori 6(11), 73–79 (2011)
Metadaten
Titel
Achieving Optimal Inter-Node Communication in Graph Partitioning Using Random Selection and Breadth-First Search
verfasst von
Srimanth Gadde
William Acosta
Jordan Ringenberg
Robert Green
Vijay Devabhaktuni
Publikationsdatum
01.08.2016
Verlag
Springer US
Erschienen in
International Journal of Parallel Programming / Ausgabe 4/2016
Print ISSN: 0885-7458
Elektronische ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-015-0374-5

Weitere Artikel der Ausgabe 4/2016

International Journal of Parallel Programming 4/2016 Zur Ausgabe

OriginalPaper

Fast LH

Premium Partner