skip to main content
research-article

PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs

Authors Info & Claims
Published:22 January 2019Publication History
Skip Abstract Section

Abstract

Natural graphs with skewed distributions raise unique challenges to distributed graph computation and partitioning. Existing graph-parallel systems usually use a “one-size-fits-all” design that uniformly processes all vertices, which either suffer from notable load imbalance and high contention for high-degree vertices (e.g., Pregel and GraphLab) or incur high communication cost and memory consumption even for low-degree vertices (e.g., PowerGraph and GraphX). In this article, we argue that skewed distributions in natural graphs also necessitate differentiated processing on high-degree and low-degree vertices. We then introduce PowerLyra, a new distributed graph processing system that embraces the best of both worlds of existing graph-parallel systems. Specifically, PowerLyra uses centralized computation for low-degree vertices to avoid frequent communications and distributes the computation for high-degree vertices to balance workloads. PowerLyra further provides an efficient hybrid graph partitioning algorithm (i.e., hybrid-cut) that combines edge-cut (for low-degree vertices) and vertex-cut (for high-degree vertices) with heuristics. To improve cache locality of inter-node graph accesses, PowerLyra further provides a locality-conscious data layout optimization. PowerLyra is implemented based on the latest GraphLab and can seamlessly support various graph algorithms running in both synchronous and asynchronous execution modes. A detailed evaluation on three clusters using various graph-analytics and MLDM (Machine Learning and Data Mining) applications shows that PowerLyra outperforms PowerGraph by up to 5.53X (from 1.24X) and 3.26X (from 1.49X) for real-world and synthetic graphs, respectively, and is much faster than other systems like GraphX and Giraph, yet with much less memory consumption. A porting of hybrid-cut to GraphX further confirms the efficiency and generality of PowerLyra.

References

  1. Amine Abou-Rjeili and George Karypis. 2006. Multilevel algorithms for partitioning power-law graphs. In Proceedings of the 20th International Conference on Parallel and Distributed Processing (IPDPS’06). IEEE Computer Society, Washington, DC, USA, 124--124. http://dl.acm.org/citation.cfm?id=1898953.1899055. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Lada A. Adamic and Bernardo A. Huberman. 2002. Zipf’s law and the internet. Glottometrics 3, 1 (2002), 143--150.Google ScholarGoogle Scholar
  3. Maciej Besta and Torsten Hoefler. 2015. Accelerating irregular computations with hardware transactional memory and active messages. In Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing (HPDC’15). ACM, New York, 161--172. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Paolo Boldi, Bruno Codenotti, Massimo Santini, and Sebastiano Vigna. 2004. UbiCrawler: A scalable fully distributed web crawler. Software Practice and Experience 34, 8 (July 2004), 711--726. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Sergey Brin and Lawrence Page. 1998. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems 30, 1--7 (April 1998), 107--117. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Nathan Bronson, Zach Amsden, George Cabrera, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, et al. 2013. Tao: Facebook’s distributed data store for the social graph. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC’13). 49--60. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Aydin Buluc and John R Gilbert. 2011. The combinatorial BLAS: Design, implementation, and applications. Int. J. High Performance Computer Applications 25, 4 (Nov. 2011), 496--509. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Umit Catalyurek and Cevdet Aykanat. 1996. Decomposing irregularly sparse matrices for parallel matrix-vector multiplication. In Proceedings of the 3rd International Workshop on Parallel Algorithms for Irregularly Structured Problems (IRREGULAR’96). Springer-Verlag, London, UK, 75--86. http://dl.acm.org/citation.cfm?id=646010.676990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Ümit V. Çatalyürek, Cevdet Aykanat, and Bora Uçar. 2010. On two-dimensional sparse matrix partitioning: Models, methods, and a recipe. SIAM Journal of Science and Computing 32, 2 (Feb. 2010), 656--683.Google ScholarGoogle ScholarCross RefCross Ref
  10. Haibo Chen, Heng Zhang, Mingkai Dong, Zhaoguo Wang, Yubin Xia, Haibing Guan, and Binyu Zang. 2017. Efficient and available in-memory KV-store with hybrid erasure coding and replication. ACM Transactions on Storage 13, 3, Article 25 (Sept. 2017), 30 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Rong Chen, Xin Ding, Peng Wang, Haibo Chen, Binyu Zang, and Haibing Guan. 2014. Computation and communication efficient graph processing with distributed immutable view. In Proceedings of the 23rd International Symposium on High-performance Parallel and Distributed Computing (HPDC’14). ACM, New York, 215--226. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Rong Chen, Jiaxin Shi, Yanzhe Chen, and Haibo Chen. 2015. PowerLyra: Differentiated graph computation and partitioning on skewed graphs. In Proceedings of the 10th European Conference on Computer Systems (EuroSys’15). ACM, New York, Article 1, 15 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Rong Chen, Jiaxin Shi, Binyu Zang, and Haibing Guan. 2014. Bipartite-oriented distributed graph partitioning for big learning. In Proceedings of 5th Asia-Pacific Workshop on Systems (APSys’14). ACM, New York, Article 14, 7 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Rong Chen, Jia-Xin Shi, Hai-Bo Chen, and Bin-Yu Zang. 2015. Bipartite-oriented distributed graph partitioning for big learning. Journal of Computer Science and Technology 30, 1 (2015), 20--29.Google ScholarGoogle ScholarCross RefCross Ref
  15. Rishan Chen, Mao Yang, Xuetian Weng, Byron Choi, Bingsheng He, and Xiaoming Li. 2012. Improving large graph processing on partitioned graphs in the cloud. In Proceedings of the 3rd ACM Symposium on Cloud Computing (SoCC’12). ACM, New York, Article 3, 13 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Rong Chen, Youyang Yao, Peng Wang, Kaiyuan Zhang, Zhaoguo Wang, Haibing Guan, Binyu Zang, and Haibo Chen. 2018. Replication-based fault-tolerance for large-scale graph processing. IEEE Transactions on Parallel and Distributed Systems 29, 7 (2018), 1621--1635.Google ScholarGoogle ScholarCross RefCross Ref
  17. Raymond Cheng, Ji Hong, Aapo Kyrola, Youshan Miao, Xuetian Weng, Ming Wu, Fan Yang, Lidong Zhou, Feng Zhao, and Enhong Chen. 2012. Kineograph: Taking the pulse of a fast-changing and connected world. In Proceedings of the 7th ACM European Conference on Computer Systems (EuroSys’12). ACM, New York, 85--98. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Michael Mitzenmacher, Alessandro Panconesi, and Prabhakar Raghavan. 2009. On compressing social networks. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’09). ACM, New York,219--228. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, and Sambavi Muthukrishnan. 2015. One trillion edges: Graph processing at facebook-scale. Proceedings of the VLDB Endowment 8, 12 (Aug. 2015), 1804--1815. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. DIMACS. 2006. The 9th DIMACS Implementation Challenge - Shortest Paths. http://www.dis.uniroma1.it/challenge9/.Google ScholarGoogle Scholar
  21. Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. 1999. On power-law relationships of the internet topology. In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM’99). ACM, New York, 251--262. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Ionel Gog, Malte Schwarzkopf, Natacha Crooks, Matthew P. Grosvenor, Allen Clement, and Steven Hand. 2015. Musketeer: All for one, one for all in data processing systems. In Proceedings of the 10th European Conference on Computer Systems (EuroSys’15). ACM, New York, Article 2, 16 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Joseph Gonzalez, Yucheng Low, Arthur Gretton, and Carlos Guestrin. 2011. Parallel gibbs sampling: From colored fields to thin junction trees. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS’11). PMLR, 15:324--332. http://proceedings.mlr.press/v15/gonzalez11a.html.Google ScholarGoogle Scholar
  24. Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. 2012. PowerGraph: Distributed graph-parallel computation on natural graphs. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI’12). USENIX Association, Berkeley, CA,17--30. http://dl.acm.org/citation.cfm?id=2387880.2387883. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Joseph E. Gonzalez, Yucheng Low, Carlos Guestrin, and David O’Hallaron. 2009. Distributed parallel inference on large factor graphs. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI’09). AUAI Press, Arlington, VA, 203--212. http://dl.acm.org/citation.cfm?id=1795114.1795139. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Joseph E. Gonzalez, Reynold S. Xin, Ankur Dave, Daniel Crankshaw, Michael J. Franklin, and Ion Stoica. 2014. GraphX: Graph processing in a distributed dataflow framework. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI’ 14). USENIX Association, Berkeley, CA, 599--613. http://dl.acm.org/citation.cfm?id=2685048.2685096. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Wentao Han, Youshan Miao, Kaiwei Li, Ming Wu, Fan Yang, Lidong Zhou, Vijayan Prabhakaran, Wenguang Chen, and Enhong Chen. 2014. Chronos: A graph engine for temporal graph analysis. In Proceedings of the Ninth European Conference on Computer Systems (EuroSys’14). ACM, New York, Article 1, 14 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Henry Haselgrove. 2010. Wikipedia page-to-page link database. http://haselgrove.id.au/wikipedia.htm.Google ScholarGoogle Scholar
  29. Imranul Hoque and Indranil Gupta. 2013. LFGraph: Simple and fast distributed graph analytics. In Proceedings of the First ACM SIGOPS Conference on Timely Results in Operating Systems (TRIOS’13). ACM, New York, Article 9, 17 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Andy Huang and Wei Wu. 2014. Mining Ecommerce Graph Data with Apache Spark at Alibaba Taobao. Retrieved April 1, 2018, from https://databricks.com/blog/2014/08/14/mining-graph-data-with-spark-at-alibaba-taobao.html.Google ScholarGoogle Scholar
  31. Nilesh Jain, Guangdeng Liao, and Theodore L. Willke. 2013. GraphBuilder: Scalable graph ETL framework. In Proceedings of the 1st International Workshop on Graph Data Management Experiences and Systems (GRADES’13). ACM, New York, Article 4, 6 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Saehan Jo, Jaemin Yoo, and U Kang. 2018. Fast and scalable distributed loopy belief propagation on real-world graphs. In Proceedings of the 11th ACM International Conference on Web Search and Data Mining (WSDM’18). ACM, New York, 297--305. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Xiaoen Ju, Hani Jamjoom, and Kang G. Shin. 2017. Hieroglyph: Locally-sufficient graph processing via compute-sync-merge. Proceedings of the ACM Measurement Analysis Computer Systems 1, 1, Article 9 (June 2017), 25 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Tim Kaler, William Hasenplaugh, Tao B. Schardl, and Charles E. Leiserson. 2014. Executing dynamic data-graph computations deterministically using chromatic scheduling. In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA’14). ACM, New York, 154--165. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. U. Kang, Charalampos E. Tsourakakis, Ana Paula Appel, Christos Faloutsos, and Jure Leskovec. 2011. HADI: Mining radii of large graphs. ACM Transactions on Knowledge Discovery Data 5, 2, Article 8 (Feb. 2011), 24 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. George Karypis and Vipin Kumar. 1999. Parallel multilevel series k-way partitioning scheme for irregular graphs. SIAM Review 41, 2 (1999), 278--300. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Zuhair Khayyat, Karim Awara, Amani Alonazi, Hani Jamjoom, Dan Williams, and Panos Kalnis. 2013. Mizan: A system for dynamic load balancing in large-scale graph processing. In Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys’13). ACM, New York, 169--182. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. 2010. What is twitter, a social network or a news media?. In Proceedings of the 19th International Conference on World Wide Web (WWW’10). ACM, New York, 591--600. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Aapo Kyrola, Guy Blelloch, and Carlos Guestrin. 2012. GraphChi: Large-scale graph computation on just a PC. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI’12). USENIX Association, Berkeley, CA, 31--46. http://dl.acm.org/citation.cfm?id=2387880.2387884. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Michael LeBeane, Shuang Song, Reena Panda, Jee Ho Ryoo, and Lizy K. John. 2015. Data partitioning strategies for graph workloads on heterogeneous clusters. In Proceedings of International Conference for High Performance Computing, Networking, Storage and Analysis (SC’15). IEEE, 1--12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. 2007. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery Data 1, 1, Article 2 (March 2007). Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, and Michael W. Mahoney. 2009. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics 6, 1 (2009), 29--123.Google ScholarGoogle ScholarCross RefCross Ref
  43. Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, and Joseph M. Hellerstein. 2012. Distributed graphlab: A framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment 5, 8 (April 2012), 716--727. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Andrew Lumsdaine, Douglas Gregor, Bruce Hendrickson, and Jonathan Berry. 2007. Challenges in parallel graph processing. Parallel Processing Letters 17, 01 (2007), 5--20.Google ScholarGoogle ScholarCross RefCross Ref
  45. Lingxiao Ma, Zhi Yang, Han Chen, Jilong Xue, and Yafei Dai. 2017. Garaph: Efficient GPU-accelerated graph processing on a single machine with balanced replication. In Proceedings of 2017 USENIX Annual Technical Conference (USENIX ATC’17). USENIX Association, 195--207. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Steffen Maass, Changwoo Min, Sanidhya Kashyap, Woonhak Kang, Mohan Kumar, and Taesoo Kim. 2017. Mosaic: Processing a trillion-edge graph on a single machine. In Proceedings of the 12th European Conference on Computer Systems (EuroSys’17). ACM, New York, 527--543. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2010. Pregel: A system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD’10). ACM, New York, 135--146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Jasmina Malicevic, Baptiste Lepers, and Willy Zwaenepoel. 2017. Everything you always wanted to know about multicore graph processing but were afraid to ask. In Proceedings of 2017 USENIX Annual Technical Conference (USENIX ATC’17). USENIX Association, 631--643. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Daniel Margo and Margo Seltzer. 2015. A scalable distributed graph partitioner. Proceedings of the VLDB Endowment 8, 12 (Aug. 2015), 1478--1489. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Kameshwar Munagala and Abhiram Ranade. 1999. I/O-complexity of graph algorithms. In Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’99). Society for Industrial and Applied Mathematics, Philadelphia, PA, 687--694. http://dl.acm.org/citation.cfm?id=314500.314891. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Derek G. Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, and Martín Abadi. 2013. Naiad: A timely dataflow system. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP’13). ACM, New York, 439--455. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Mark E. J. Newman. 2005. Power laws, pareto distributions and Zipf’s law. Contemporary Physics 46, 5 (2005), 323--351.Google ScholarGoogle ScholarCross RefCross Ref
  53. Donald Nguyen, Andrew Lenharth, and Keshav Pingali. 2013. A lightweight infrastructure for graph analytics. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP’13). ACM, New York, 456--471. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Biswanath Panda, Joshua S. Herbach, Sugato Basu, and Roberto J. Bayardo. 2009. PLANET: Massively parallel learning of tree ensembles with MapReduce. Proceedins ofthe VLDB Endowment 2, 2 (Aug. 2009), 1426--1437. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Zhengping Qian, Yong He, Chunzhi Su, Zhuojie Wu, Hongyu Zhu, Taizhi Zhang, Lidong Zhou, Yuan Yu, and Zheng Zhang. 2013. TimeStream: Reliable stream computation in the cloud. In Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys’13). ACM, New York,1--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. Amitabha Roy, Laurent Bindschaedler, Jasmina Malicevic, and Willy Zwaenepoel. 2015. Chaos: Scale-out graph processing from secondary storage. In Proceedings of the 25th Symposium on Operating Systems Principles (SOSP’15). ACM, New York, 410--424. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. Amitabha Roy, Ivo Mihailovic, and Willy Zwaenepoel. 2013. X-Stream: Edge-centric graph processing using streaming partitions. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP’13). ACM, New York, 472--488. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. Alessandra Sala, Lili Cao, Christo Wilson, Robert Zablit, Haitao Zheng, and Ben Y. Zhao. 2010. Measurement-calibrated graph models for social network experiments. In Proceedings of the 19th International Conference on World Wide Web (WWW’10). ACM, New York, 861--870. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. Alessandra Sala, Xiaohan Zhao, Christo Wilson, Haitao Zheng, and Ben Y. Zhao. 2011. Sharing graphs using differentially private graph models. In Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference (IMC’11). ACM, New York, 81--98. Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. Semih Salihoglu and Jennifer Widom. 2013. GPS: A graph processing system. In Proceedings of the 25th International Conference on Scientific and Statistical Database Management (SSDBM’13). ACM, New York, Article 22, 12 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. Kirk Schloegel, George Karypis, and Vipin Kumar. 2000. Parallel multilevel algorithms for multi-constraint graph partitioning (distinguished paper). In Proceedings of the 6th International Euro-Par Conference on Parallel Processing (Euro-Par’00). Springer-Verlag, London, UK, 296--310. http://dl.acm.org/citation.cfm?id=646665.698944. Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. Jiwon Seo, Jongsoo Park, Jaeho Shin, and Monica S. Lam. 2013. Distributed socialite: A datalog-based language for large-scale graph analysis. Proceedings of the VLDB Endowment 6, 14 (Sept. 2013), 1906--1917. Google ScholarGoogle ScholarDigital LibraryDigital Library
  63. Bin Shao, Haixun Wang, and Yatao Li. 2013. Trinity: A distributed graph engine on a memory cloud. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD’13). ACM, New York, 505--516. Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. Jiaxin Shi, Youyang Yao, Rong Chen, Haibo Chen, and Feifei Li. 2016. Fast and concurrent RDF queries with RDMA-based distributed graph exploration. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI’16). USENIX Association, Berkeley, CA, 317--332. http://dl.acm.org/citation.cfm?id=3026877.3026902. Google ScholarGoogle ScholarDigital LibraryDigital Library
  65. Julian Shun and Guy E. Blelloch. 2013. Ligra: A lightweight graph processing framework for shared memory. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’13). ACM, New York,135--146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. Julian Shun, Laxman Dhulipala, and Guy E Blelloch. 2015. Smaller and faster: Parallel processing of compressed graphs with Ligra+. In 2015 Data Compression Conference. IEEE, 403--412. Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. Alexander Smola and Shravan Narayanamurthy. 2010. An architecture for parallel topic models. Proceedings of the VLDB Endowment 3, 1-2 (Sept. 2010), 703--710. Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. Isabelle Stanton and Gabriel Kliot. 2012. Streaming graph partitioning for large distributed graphs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’12). ACM, New York, 1222--1230. Google ScholarGoogle ScholarDigital LibraryDigital Library
  69. Gábor Takács, István Pilászy, Bottyán Németh, and Domonkos Tikk. 2009. Scalable collaborative filtering approaches for large recommender systems. Journal of Machine Learning Research 10 (June 2009), 623--656. http://dl.acm.org/citation.cfm?id=1577069.1577091. Google ScholarGoogle ScholarDigital LibraryDigital Library
  70. Tencent. 2018. Design and Practice of the Anomaly Detection Framework for Billions of User in WeChat (in Chinese). https://cloud.tencent.com/developer/article/1028442.Google ScholarGoogle Scholar
  71. Yuanyuan Tian, Andrey Balmin, Severin Andreas Corsten, Shirish Tatikonda, and John McPherson. 2013. From “Think Like a Vertex” to “Think Like a Graph.”Proceedings of the VLDB Endowment 7, 3 (Nov. 2013), 193--204. http://dl.acm.org/citation.cfm?id=2732232.2732238. Google ScholarGoogle ScholarDigital LibraryDigital Library
  72. Charalampos Tsourakakis, Christos Gkantsidis, Bozidar Radunovic, and Milan Vojnovic. 2014. FENNEL: Streaming graph partitioning for massive scale graphs. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining (WSDM’14). ACM, New York, 333--342. Google ScholarGoogle ScholarDigital LibraryDigital Library
  73. Alexander Ulanov, Manish Marwah, Mijung Kim, Roshan Dathathri, Carlos Zubieta, and Jun Li. 2017. Sandpiper: Scaling probabilistic inferencing to large scale graphical models. In Proceedings of 2017 IEEE International Conference on Big Data. IEEE, 383--388.Google ScholarGoogle ScholarCross RefCross Ref
  74. Leslie G. Valiant. 1990. A bridging model for parallel computation. Communications of the ACM 33, 8 (Aug. 1990), 103--111. Google ScholarGoogle ScholarDigital LibraryDigital Library
  75. Shiv Verma, Luke M. Leslie, Yosub Shin, and Indranil Gupta. 2017. An experimental comparison of partitioning strategies in distributed graph processing. Proceedings of the VLDB Endowment 10, 5 (Jan. 2017), 493--504. Google ScholarGoogle ScholarDigital LibraryDigital Library
  76. Huaimin Wang, Peichang Shi, and Yiming Zhang. 2017. Jointcloud: A cross-cloud cooperation architecture for integrated internet service customization. In Proceedings of the 37th International Conference on Distributed Computing Systems (ICDCS’17). IEEE, 1846--1855.Google ScholarGoogle ScholarCross RefCross Ref
  77. Hao Wang, Jing Zhang, Da Zhang, Sarunya Pumma, and Wu-chun Feng. 2017. PaPar: A parallel data partitioning framework for big data applications. In Proceedings of 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS’17). IEEE, 605--614.Google ScholarGoogle ScholarCross RefCross Ref
  78. Lei Wang, Liangji Zhuang, Junhang Chen, Huimin Cui, Fang Lv, Ying Liu, and Xiaobing Feng. 2018. Lazygraph: Lazy data coherency for replicas in distributed graph-parallel computation. In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’18). ACM, New York, 276--289. Google ScholarGoogle ScholarDigital LibraryDigital Library
  79. Peng Wang, Kaiyuan Zhang, Rong Chen, Haibo Chen, and Haibing Guan. 2014. Replication-based fault-tolerance for large-scale graph processing. In Proceedings of the 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’14). IEEE Computer Society, Washington, DC,562--573. Google ScholarGoogle ScholarDigital LibraryDigital Library
  80. Siyuan Wang, Chang Lou, Rong Chen, and Haibo Chen. 2018. Fast and concurrent RDF queries using RDMA-assisted GPU graph exploration. In Proceedings of 2018 USENIX Annual Technical Conference (USENIX ATC’18). 651--664. Google ScholarGoogle ScholarDigital LibraryDigital Library
  81. Yangzihao Wang, Andrew Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, and John D. Owens. 2016. Gunrock: A high-performance graph processing library on the GPU. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’16). ACM, New York, Article 11, 12 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  82. Christo Wilson, Bryce Boe, Alessandra Sala, Krishna P. N. Puttaswamy, and Ben Y. Zhao. 2009. User interactions in social networks and their implications. In Proceedings of the 4th ACM European Conference on Computer Systems (EuroSys’09). ACM, New York, 205--218. Google ScholarGoogle ScholarDigital LibraryDigital Library
  83. Ming Wu, Fan Yang, Jilong Xue, Wencong Xiao, Youshan Miao, Lan Wei, Haoxiang Lin, Yafei Dai, and Lidong Zhou. 2015. GraM: Scaling graph computation to the trillions. In Proceedings of the 6th ACM Symposium on Cloud Computing (SoCC’15). ACM, New York, 408--421. Google ScholarGoogle ScholarDigital LibraryDigital Library
  84. Wencong Xiao, Jilong Xue, Youshan Miao, Zhen Li, Cheng Chen, Ming Wu, Wei Li, and Lidong Zhou. 2017. Tux2: Distributed graph computation for machine learning. In Proceedings of the 14th USENIX Conference on Networked Systems Design and Implementation (NSDI’17). 669--682. Google ScholarGoogle ScholarDigital LibraryDigital Library
  85. Chenning Xie, Rong Chen, Haibing Guan, Binyu Zang, and Haibo Chen. 2015. SYNC or ASYNC: Time to fuse for distributed graph-parallel computation. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’15). ACM, New York, 194--204. Google ScholarGoogle ScholarDigital LibraryDigital Library
  86. Cong Xie, Ling Yan, Wu-Jun Li, and Zhihua Zhang. 2014. Distributed power-law graph computing: Theoretical and empirical analysis. In Proceedings of the 28th Annual Conference on Neural Information Processing Systems (NIPS’14). 1673--1681. Google ScholarGoogle ScholarDigital LibraryDigital Library
  87. Da Yan, James Cheng, Yi Lu, and Wilfred Ng. 2014. Blogel: A block-centric framework for distributed computation on real-world graphs. Proceedings of the VLDB Endowment 7, 14 (2014), 1981--1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  88. Jerry Ye, Jyh-Herng Chow, Jiang Chen, and Zhaohui Zheng. 2009. Stochastic gradient boosted distributed decision trees. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM’09). ACM, New York, 2061--2064. Google ScholarGoogle ScholarDigital LibraryDigital Library
  89. Andy Yoo, Edmond Chow, Keith Henderson, William McLendon, Bruce Hendrickson, and Umit Catalyurek. 2005. A scalable distributed parallel breadth-first search algorithm on BlueGene/L. In Proceedings of the 2005 ACM/IEEE Conference on Supercomputing (SC’05). IEEE Computer Society, Washington, DC, 25. Google ScholarGoogle ScholarDigital LibraryDigital Library
  90. Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, and Ion Stoica. 2012. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation (NSDI’12). USENIX Association, Berkeley, CA, 2--2. http://dl.acm.org/citation.cfm?id=2228298.2228301. Google ScholarGoogle ScholarDigital LibraryDigital Library
  91. Heng Zhang, Mingkai Dong, and Haibo Chen. 2016. Efficient and available in-memory kv-store with hybrid erasure coding and replication. In Proceedings of 14th USENIX Conference on File and Storage Technologies (FAST’16). USENIX Association, Santa Clara, CA, 167--180. https://www.usenix.org/conference/fast16/technical-sessions/presentation/zhang-heng. Google ScholarGoogle ScholarDigital LibraryDigital Library
  92. Kaiyuan Zhang, Rong Chen, and Haibo Chen. 2015. NUMA-aware graph-structured analytics. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’15). ACM, New York, 183--193. Google ScholarGoogle ScholarDigital LibraryDigital Library
  93. Mingxing Zhang, Yongwei Wu, Kang Chen, Teng Ma, and Weimin Zheng. 2016. Measuring and optimizing distributed array programs. Proceedings of the VLDB Endowment 9, 12 (Aug. 2016), 912--923. Google ScholarGoogle ScholarDigital LibraryDigital Library
  94. Mingxing Zhang, Yongwei Wu, Kang Chen, Xuehai Qian, Xue Li, and Weimin Zheng. 2016. Exploring the hidden dimension in graph processing. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI’16). USENIX Association, Berkeley, CA, 285--300. Google ScholarGoogle ScholarDigital LibraryDigital Library
  95. Mingxing Zhang, Youwei Zhuo, Chao Wang, Mingyu Gao, Yongwei Wu, Kang Chen, Christos Kozyrakis, and Xuehai Qian. 2018. GraphP: Reducing communication for PIM-based graph processing with efficient data partition. In Proceedings of 2018 IEEE International Symposium onHigh Performance Computer Architecture (HPCA’18). IEEE, 544--557.Google ScholarGoogle ScholarCross RefCross Ref
  96. Yunhao Zhang, Rong Chen, and Haibo Chen. 2017. Sub-millisecond stateful stream querying over fast-evolving linked data. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP’17). ACM, New York, 614--630. Google ScholarGoogle ScholarDigital LibraryDigital Library
  97. Yu Zhang, Xiaofei Liao, Hai Jin, Lin Gu, Guang Tan, and Bing Bing Zhou. 2017. HotGraph: Efficient asynchronous processing for real-world graphs. IEEE Transactions on Computing 66, 5 (May 2017), 799--809. Google ScholarGoogle ScholarDigital LibraryDigital Library
  98. Xiaohan Zhao, Adelbert Chang, Atish Das Sarma, Haitao Zheng, and Ben Y. Zhao. 2013. On the embeddability of random walk distances. Proceedings of the VLDB Endowment 6, 14 (Sept. 2013), 1690--1701. Google ScholarGoogle ScholarDigital LibraryDigital Library
  99. Yunhong Zhou, Dennis Wilkinson, Robert Schreiber, and Rong Pan. 2008. Large-scale parallel collaborative filtering for the netflix prize. In Proceedings of the 4th International Conference on Algorithmic Aspects in Information and Management (AAIM’08). Springer-Verlag, Berlin, Heidelberg, 337--348. Google ScholarGoogle ScholarDigital LibraryDigital Library
  100. Xiaowei Zhu, Wenguang Chen, Weimin Zheng, and Xiaosong Ma. 2016. Gemini: A computation-centric distributed graph processing system. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI’16). USENIX Association, Berkeley, CA, 301--316. Google ScholarGoogle ScholarDigital LibraryDigital Library
  101. Xiaowei Zhu, Wentao Han, and Wenguang Chen. 2015. GridGraph: Large scale graph processing on a single machine using 2-level hierarchical partitioning. In Proceedings of the 2015 USENIX Conference on Annual Technical Conference (USENIX ATC’15). USENIX Association. https://www.usenix.org/conference/atc15/technical-session/presentation/zhu. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Transactions on Parallel Computing
        ACM Transactions on Parallel Computing  Volume 5, Issue 3
        September 2018
        89 pages
        ISSN:2329-4949
        EISSN:2329-4957
        DOI:10.1145/3305217
        • Editor:
        • David Bader
        Issue’s Table of Contents

        Copyright © 2019 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 22 January 2019
        • Accepted: 1 March 2018
        • Revised: 1 October 2016
        • Received: 1 August 2015
        Published in topc Volume 5, Issue 3

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article
        • Research
        • Refereed

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      HTML Format

      View this article in HTML Format .

      View HTML Format