DOI: 10.1145/2688500.2688508

Research article

SYNC or ASYNC: time to fuse for distributed graph-parallel computation

Published: 24 January 2015

ABSTRACT

Large-scale graph-structured computation is usually iterative and convergence-oriented: input data is processed repeatedly until a convergence condition is reached. These features have led to two different computation modes for graph-structured programs, namely synchronous (Sync) and asynchronous (Async) modes. Unfortunately, there has been no in-depth study of their execution properties, so programmers must choose a mode manually, which either requires a deep understanding of the underlying graph engine or risks suboptimal performance. This paper presents the first comprehensive characterization of the performance of the two modes on a set of typical graph-parallel applications. Our study shows that the performance of the two modes varies significantly across graph algorithms, partitioning methods, execution stages, input graphs, and cluster scales, and that no single mode consistently outperforms the other. To this end, this paper proposes Hsync, a hybrid graph computation mode that adaptively switches a graph-parallel program between the two modes for optimal performance. Hsync constantly collects execution statistics on the fly and leverages a set of heuristics to predict future performance and determine when a mode switch would be profitable. We combine online sampling and offline profiling with a set of heuristics to accurately predict future performance under both modes. A prototype called PowerSwitch has been built on top of PowerGraph, a state-of-the-art distributed graph-parallel system, to support adaptive execution of graph algorithms. On a 48-node EC2-like cluster, PowerSwitch consistently outperforms the better of the two modes, with speedups ranging from 9% to 73% thanks to timely switches between them.
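The adaptive switching idea in the abstract (collect execution statistics on the fly, predict each mode's future performance, switch when the predicted gain outweighs the switching cost) can be sketched as follows. This is a minimal illustrative model, not PowerSwitch's actual implementation: the names `Mode`, `SwitchController`, the moving-average predictor, and the `switch_cost` parameter are all assumptions made for the example.

```python
from collections import deque
from enum import Enum


class Mode(Enum):
    SYNC = "sync"
    ASYNC = "async"


class SwitchController:
    """Toy controller deciding when a Sync/Async mode switch is predicted
    to be profitable (illustrative only; not PowerSwitch's real API)."""

    def __init__(self, switch_cost: float, window: int = 5):
        # switch_cost: estimated one-off cost of a mode switch, in the
        # same units as throughput * horizon (e.g. edges processed).
        self.switch_cost = switch_cost
        # Keep a sliding window of recent throughput samples per mode,
        # gathered by online sampling or seeded by offline profiling.
        self.history = {m: deque(maxlen=window) for m in Mode}

    def record(self, mode: Mode, throughput: float) -> None:
        """Record an observed throughput sample (e.g. edges/sec) for a mode."""
        self.history[mode].append(throughput)

    def predict(self, mode: Mode) -> float:
        """Predict near-future throughput as the mean of recent samples."""
        samples = self.history[mode]
        return sum(samples) / len(samples) if samples else 0.0

    def should_switch(self, current: Mode, horizon: float) -> bool:
        """Switch if the other mode's predicted extra work over the next
        `horizon` seconds outweighs the one-off switching cost."""
        other = Mode.ASYNC if current is Mode.SYNC else Mode.SYNC
        gain = (self.predict(other) - self.predict(current)) * horizon
        return gain > self.switch_cost
```

In this simplified model, a run where Async samples average 50 edges/sec against Sync's 10 would trigger a switch out of Sync once the predicted gain over the horizon exceeds the switching cost; the real system's heuristics additionally account for factors such as execution stage and partitioning, per the abstract.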


Published in

PPoPP 2015: Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
January 2015, 290 pages
ISBN: 9781450332057
DOI: 10.1145/2688500

Also published in ACM SIGPLAN Notices, Volume 50, Issue 8 (PPoPP '15)
August 2015, 290 pages
ISSN: 0362-1340, EISSN: 1558-1160
DOI: 10.1145/2858788
Editor: Andy Gill
        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 230 of 1,014 submissions, 23%
