Skip to main content
Erschienen in: International Journal of Parallel Programming 2/2017

28.03.2016

Towards Systematic Parallelization of Graph Transformations Over Pregel

verfasst von: Le-Duc Tung, Zhenjiang Hu

Erschienen in: International Journal of Parallel Programming | Ausgabe 2/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Graphs can be used to model many kinds of data, from traditional datasets to social networks or semi-structured datasets. To process large graphs, many systems have been proposed. The Pregel programming model is popular, thanks to its scalability. Although Pregel is simple to understand and use, it is of low-level in programming and requires developers to write programs that are hard to maintain and need to be carefully optimized. On the other hand, structural recursion is powerful to systematically construct efficient parallel programs on lists, arrays and trees, but it has not yet been applied to graphs. In this paper, we propose an efficient method for parallel evaluation of structural recursion on graphs, which is suitable for Pregel. We design and implement a high-level parallel programming framework where a domain-specific language (DSL) is provided to ease the programing task. Specifications written in the DSL are automatically compiled into Pregel programs that are scalable for large graphs. Experimental results show that our framework outperforms the original evaluation of structural recursion, and achieves good scalability and speedup for real datasets.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Afrati, F.N., Ullman, J.D.: Transitive closure and recursive datalog implemented on clusters. In: Proceedings of the 15th International Conference on Extending Database Technology, EDBT ’12 (2012) Afrati, F.N., Ullman, J.D.: Transitive closure and recursive datalog implemented on clusters. In: Proceedings of the 15th International Conference on Extending Database Technology, EDBT ’12 (2012)
2.
Zurück zum Zitat Buneman, P.: Semistructured data. In: Proceedings of the 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS ’97, pp. 117–121. ACM, New York, NY, USA (1997) Buneman, P.: Semistructured data. In: Proceedings of the 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS ’97, pp. 117–121. ACM, New York, NY, USA (1997)
3.
Zurück zum Zitat Buneman, P., Fernandez, M., Suciu, D.: UnQL: A Query Language and Algebra for Semistructured Data Based on Structural Recursion. VLDB J. 9(1), 76–110 (2000) Buneman, P., Fernandez, M., Suciu, D.: UnQL: A Query Language and Algebra for Semistructured Data Based on Structural Recursion. VLDB J. 9(1), 76–110 (2000)
4.
Zurück zum Zitat Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)CrossRef Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)CrossRef
5.
Zurück zum Zitat Emoto, K., Fischer, S., Hu, Z.: Generate, test, and aggregate: a calculation-based framework for systematic parallel programming with mapreduce. In: Proceedings of the 21st European Conference on Programming Languages and Systems, ESOP’12, pp. 254–273. Springer, Berlin (2012) Emoto, K., Fischer, S., Hu, Z.: Generate, test, and aggregate: a calculation-based framework for systematic parallel programming with mapreduce. In: Proceedings of the 21st European Conference on Programming Languages and Systems, ESOP’12, pp. 254–273. Springer, Berlin (2012)
6.
Zurück zum Zitat Gonzalez, J.E., Low, Y., Gu, H., Bickson, D., Guestrin, C.: PowerGraph: distributed graph-parallel computation on natural graphs. In: Proceedings of OSDI’12, pp. 17–30 (2012) Gonzalez, J.E., Low, Y., Gu, H., Bickson, D., Guestrin, C.: PowerGraph: distributed graph-parallel computation on natural graphs. In: Proceedings of OSDI’12, pp. 17–30 (2012)
7.
Zurück zum Zitat Hidaka, S., Hu, Z., Kato, H., Nakano, K.: Towards a compositional approach to model transformation for software development. In: Proceedings of the 2009 ACM Symposium on Applied Computing, SAC ’09, pp. 468–475. ACM, New York, NY, USA (2009) Hidaka, S., Hu, Z., Kato, H., Nakano, K.: Towards a compositional approach to model transformation for software development. In: Proceedings of the 2009 ACM Symposium on Applied Computing, SAC ’09, pp. 468–475. ACM, New York, NY, USA (2009)
8.
Zurück zum Zitat Hong, S., Salihoglu, S., Widom, J., Olukotun, K.: Simplifying scalable graph processing with a domain-specific language. In: Proceedings of CGO’14, pp. 208–218 (2014) Hong, S., Salihoglu, S., Widom, J., Olukotun, K.: Simplifying scalable graph processing with a domain-specific language. In: Proceedings of CGO’14, pp. 208–218 (2014)
9.
Zurück zum Zitat Krause, C., Tichy, M., Giese, H.: Implementing graph transformations in the bulk synchronous parallel model. In: Gnesi, S., Rensink, A. (eds.) Fundamental Approaches to SoftwareEngineering, Lecture Notes in Computer Science, vol. 8411, pp. 325–339. Springer, Berlin (2014) Krause, C., Tichy, M., Giese, H.: Implementing graph transformations in the bulk synchronous parallel model. In: Gnesi, S., Rensink, A. (eds.) Fundamental Approaches to SoftwareEngineering, Lecture Notes in Computer Science, vol. 8411, pp. 325–339. Springer, Berlin (2014)
10.
Zurück zum Zitat Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., Hellerstein, J.M.: Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proc. VLDB Endow. 5(8), 716–727 (2012)CrossRef Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., Hellerstein, J.M.: Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proc. VLDB Endow. 5(8), 716–727 (2012)CrossRef
11.
Zurück zum Zitat Malewicz, G., Austern, M.H., Bik, A.J., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, SIGMOD ’10 (2010) Malewicz, G., Austern, M.H., Bik, A.J., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, SIGMOD ’10 (2010)
12.
Zurück zum Zitat Matsuzaki, K., Iwasaki, H., Emoto, K., Hu, Z.: A library of constructive skeletons for sequential style of parallel programming. In: Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale ’06. ACM, New York, NY, USA (2006) Matsuzaki, K., Iwasaki, H., Emoto, K., Hu, Z.: A library of constructive skeletons for sequential style of parallel programming. In: Proceedings of the 1st International Conference on Scalable Information Systems, InfoScale ’06. ACM, New York, NY, USA (2006)
13.
Zurück zum Zitat Nolé, M., Sartiani, C.: Processing regular path queries on giraph. In: EDBT/ICDT Workshops (2014) Nolé, M., Sartiani, C.: Processing regular path queries on giraph. In: EDBT/ICDT Workshops (2014)
14.
Zurück zum Zitat Salihoglu, S., Widom, J.: HelP: High-level primitives for large-scale graph processing. In: Proceedings of Workshop on GRAph Data Management Experiences and Systems, GRADES’14, pp. 3:1–3:6 (2014) Salihoglu, S., Widom, J.: HelP: High-level primitives for large-scale graph processing. In: Proceedings of Workshop on GRAph Data Management Experiences and Systems, GRADES’14, pp. 3:1–3:6 (2014)
15.
Zurück zum Zitat Suciu, D.: Distributed query evaluation on semistructured data. ACM Trans. Database Syst. 27(1), 1–62 (2002) Suciu, D.: Distributed query evaluation on semistructured data. ACM Trans. Database Syst. 27(1), 1–62 (2002)
16.
Zurück zum Zitat Tung, L.D., Nguyen-Van, Q., Hu, Z.: Efficient query evaluation on distributed graphs with hadoop environment. In: Proceedings of the 4th Symposium on Information and Communication Technology, SoICT ’13. ACM, New York, NY, USA (2013) Tung, L.D., Nguyen-Van, Q., Hu, Z.: Efficient query evaluation on distributed graphs with hadoop environment. In: Proceedings of the 4th Symposium on Information and Communication Technology, SoICT ’13. ACM, New York, NY, USA (2013)
17.
Zurück zum Zitat Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990)CrossRef Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990)CrossRef
18.
Zurück zum Zitat Xin, R.S., Gonzalez, J.E., Franklin, M.J., Stoica, I.: GraphX: a resilient distributed graph system on spark. In: First International Workshop on Graph Data Management Experiences and Systems, GRADES ’13, pp. 2:1–2:6 (2013) Xin, R.S., Gonzalez, J.E., Franklin, M.J., Stoica, I.: GraphX: a resilient distributed graph system on spark. In: First International Workshop on Graph Data Management Experiences and Systems, GRADES ’13, pp. 2:1–2:6 (2013)
Metadaten
Titel
Towards Systematic Parallelization of Graph Transformations Over Pregel
verfasst von
Le-Duc Tung
Zhenjiang Hu
Publikationsdatum
28.03.2016
Verlag
Springer US
Erschienen in
International Journal of Parallel Programming / Ausgabe 2/2017
Print ISSN: 0885-7458
Elektronische ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-016-0418-5

Weitere Artikel der Ausgabe 2/2017

International Journal of Parallel Programming 2/2017 Zur Ausgabe

Premium Partner