skip to main content
10.1145/3035918.3035942acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Public Access
Best Paper

Parallelizing Sequential Graph Computations

Authors Info & Claims
Published:09 May 2017Publication History

ABSTRACT

This paper presents GRAPE, a parallel system for graph computations. GRAPE differs from prior systems in its ability to parallelize existing sequential graph algorithms as a whole. Underlying GRAPE are a simple programming model and a principled approach, based on partial evaluation and incremental computation. We show that sequential graph algorithms can be "plugged into" GRAPE with minor changes, and get parallelized. As long as the sequential algorithms are correct, their GRAPE parallelization guarantees to terminate with correct answers under a monotonic condition. Moreover, we show that algorithms in MapReduce, BSP and PRAM can be optimally simulated on GRAPE. In addition to the ease of programming, we experimentally verify that GRAPE achieves comparable performance to the state-of-the-art graph systems, using real-life and synthetic graphs.

References

  1. Aliyun.sl https://intl.aliyun.com.Google ScholarGoogle Scholar
  2. DBpedia.sl http://wiki.dbpedia.org/Datasets.Google ScholarGoogle Scholar
  3. Giraph.sl http://giraph.apache.org/.Google ScholarGoogle Scholar
  4. GRAPE.sl http://grapedb.io/.Google ScholarGoogle Scholar
  5. Movielens.sl http://grouplens.org/datasets/movielens/.Google ScholarGoogle Scholar
  6. MPICH.sl https://www.mpich.org/.Google ScholarGoogle Scholar
  7. Snap.sl http://snap.stanford.edu/data/index.html.Google ScholarGoogle Scholar
  8. Traffic.sl http://www.dis.uniroma1.it/challenge9/download.shtml.Google ScholarGoogle Scholar
  9. U. A. Acar. Self-Adjusting Computation. PhD thesis, Carnegie Mellon University, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. J. Bang-Jensen and G. Z. Gutin. Digraphs: Theory, Algorithms and Applications. Springer, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. P. A. Bernstein and N. Goodman. Concurrency control in distributed database systems. ACM Comput. Surv., 13(2):185--221. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. E. G. Boman, K. D. Devine, and S. Rajamanickam. Scalable matrix computations on large scale-free graphs using 2D graph partitioning. In SC, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. F. Bourse, M. Lelarge, and M. Vojnovic. Balanced graph edge partition. In SIGKDD, pages 1456--1465, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. P. Buneman, G. Cong, W. Fan, and A. Kementsietsidis. Using partial evaluation in distributed query evaluation. In VLDB, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SICOMP, 32(5):1338--1355, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (sub) graph isomorphism algorithm for matching large graphs. TPAMI, 26(10):1367--1372, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1), 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. W. Fan, C. Hu, and C. Tian. Incremental graph computations: Doable and undoable. In SIGMOD, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. W. Fan, J. Li, S. Ma, N. Tang, Y. Wu, and Y. Wu. Graph pattern matching: From intractability to polynomial time. In PVLDB, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. W. Fan, J. Li, X. Wang, and Y. Wu. Query preserving graph compression. In SIGMOD, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. W. Fan, X. Wang, and Y. Wu. Incremental graph pattern matching. TODS, 38(3), 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. W. Fan, X. Wang, and Y. Wu. Distributed graph simulation: Impossibility and possibility. PVLDB, 7(12), 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. M. L. Fredman and R. E. Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. JACM, 34(3):596--615, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. Powergraph: Distributed graph-parallel computation on natural graphs. In USENIX, 2012.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. J. E. Gonzalez, R. S. Xin, A. Dave, D. Crankshaw, M. J. Franklin, and I. Stoica. GraphX: Graph processing in a distributed dataflow framework. In OSDI, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. T. J. Harris. A survey of PRAM simulation techniques. ACM Comput. Surv., 26(2):187--206, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. M. R. Henzinger, T. Henzinger, and P. Kopke. Computing simulations on finite and infinite graphs. In FOCS, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. N. D. Jones. An introduction to partial evaluation. ACM Computing Surveys, 28(3), 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. H. J. Karloff, S. Suri, and S. Vassilvitskii. A model of computation for MapReduce. In SODA, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. G. Karypis and V. Kumar. METIS--unstructured graph partitioning and sparse matrix ordering system, version 2.0. Technical report, 1995.Google ScholarGoogle Scholar
  31. A. Khan, Y. Wu, C. C. Aggarwal, and X. Yan. Nema: Fast graph search with label similarity. PVLDB, 6(3), 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. M. Kim and K. S. Candan. SBV-Cut: Vertex-cut based graph partitioning using structural balance vertices. Data & Knowledge Engineering, 72:285--303, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Y. Koren, R. Bell, C. Volinsky, et al. Matrix factorization techniques for recommender systems. Computer, 42(8):30--37, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. Distributed GraphLab: A framework for machine learning in the cloud. PVLDB, 5(8), 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In SIGMOD, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. T. Mytkowicz, M. Musuvathi, and W. Schulte. Data-parallel finite-state machines. In ASPLOS, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. K. Pingali, D. Nguyen, M. Kulkarni, M. Burtscher, M. A. Hassaan, R. Kaleem, T.-H. Lee, A. Lenharth, R. Manevich, M. Méndez-Lojo, et al. The tao of parallelism in algorithms. In ACM Sigplan Notices, volume 46, pages 12--25, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. C. Radoi, S. J. Fink, R. M. Rabbah, and M. Sridharan. Translating imperative code to MapReduce. In OOPSLA, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. G. Ramalingam and T. Reps. An incremental algorithm for a generalization of the shortest-path problem. J. Algorithms, 21(2):267--305, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. G. Ramalingam and T. Reps. On the computational complexity of dynamic graph problems. TCS, 158(1--2), 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. V. Raychev, M. Musuvathi, and T. Mytkowicz. Parallelizing user-defined aggregations using symbolic execution. In SOSP, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. S. Salihoglu and J. Widom. GPS: a graph processing system. In SSDBM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. I. Stanton and G. Kliot. Streaming graph partitioning for large distributed graphs. In KDD, pages 1222--1230, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Y. Tian, A. Balmin, S. A. Corsten, and J. M. Shirish Tatikonda. From "think like a vertex" to "think like a graph". PVLDB, 7(7):193--204, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. P. Trinder. A Functional Database. PhD thesis, University of Oxford, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. L. G. Valiant. A bridging model for parallel computation. Commun. ACM, 33(8):103--111, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. L. G. Valiant. General purpose parallel architectures. In Handbook of Theoretical Computer Science, Vol A. 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. J. Vinagre, A. M. Jorge, and J. Gama. Fast incremental matrix factorization for recommendation with positive-only feedback. In International Conference on User Modeling, Adaptation, and Personalization, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  49. G. Wang, W. Xie, A. J. Demers, and J. Gehrke. Asynchronous large-scale graph processing made easy. In CIDR, 2013.Google ScholarGoogle Scholar
  50. D. Yan, J. Cheng, Y. Lu, and W. Ng. Blogel: A block-centric framework for distributed computation on real-world graphs. PVLDB, 7(14):1981--1992, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. D. Yan, J. Cheng, K. Xing, Y. Lu, W. Ng, and Y. Bu. Pregel algorithms for graph connectivity problems with performance guarantees. PVLDB, 7(14):1821--1832, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Y. Zhou, L. Liu, K. Lee, C. Pu, and Q. Zhang. Fast iterative graph computation with resource aware graph parallel abstractions. In HPDC, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Parallelizing Sequential Graph Computations

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
      May 2017
      1810 pages
      ISBN:9781450341974
      DOI:10.1145/3035918

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 May 2017

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate785of4,003submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader