ABSTRACT
Many practical computing problems concern large graph. Standard problems include web graph analysis and social networks analysis like Facebook, Twitter. The scale of these graph poses challenge to their efficient processing. To efficiently process large-scale graph, we create X-Pregel, a graph processing system based on Google's Computing Pregel model [1], by using the state-of-the-art PGAS programming language X10. We do not purely implement Google Pregel by using X10 language, but we also introduce two new features that do not exists in the original model to optimize the performance: (1) an optimization to reduce the number of messages which is exchanged among workers, (2) a dynamic re-partitioning scheme that effectively reassign vertices to different workers during the computation. Our performance evaluation demonstrates that our optimization method of sending messages achieves up to 200% speed up on Pagerank by reducing the network I/O to 10 times in comparison with the default method of sending messages when processing SCALE20 Kronecker graph [2](vertices = 1,048,576, edges = 33,554,432). It also demonstrates that our system processes large graph faster than prior implementation of Pregel such as GPS [3](stands for graph processing system) and Giraph [4].
- Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 international conference on Management of data, SIGMOD '10, pages 135--146. ACM, 2010. Google ScholarDigital Library
- Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutos, and Zoubin Ghahramani. Kronecker graphs: An approach to modeling networks. pages 985--1042, 2010. Google ScholarDigital Library
- Semih Salihoglu and Jennifer Widom. Gps: A graph processing system. Technical report, Stanford University.Google Scholar
- The Apache Software Foundation. Apache incubator giraph. http://incubator.apache.org/giraph/.Google Scholar
- Sanjay Ghemawat Jeffrey Dean. Mapreduce: simplified data processing on large clusters. In Proceedings of the 6th conference on Symposium on Operation Systems Design & Implementation - Volume 6, OSDI'04, pages 10--10. Google ScholarDigital Library
- Yahoo. Apache hadoop. http://hadoop.apache.org.Google Scholar
- Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual web search engine. In COMPUTER NETWORKS AND ISDN SYSTEMS, pages 107--117. Elsevier Science Publishers B. V., 1998. Google ScholarDigital Library
- C. Grothoff, V. Saraswat, C. Donawa, A. Kielstra, K. Ebcioglu, C. von Praun, and V. Sarkar. X10: an object-oriented approach to non-uniform cluster computing. pages 519--538, 2005.Google Scholar
- V. Saraswat, B. Bloom, I. Peshansky, O. Tardieu, and D. Grove. The x10 reference manual. 2010.Google Scholar
- U. Kang, Charalampos E. Tsourakakis, and Christos Faloutsos. Pegasus: A peta-scale graph mining system - implementation and observations, 2009.Google Scholar
- Leslie G. Valiant. A bridging model for parallel computation, 1990.Google ScholarDigital Library
- The Apache Software Foundation. Apache zookeeper. http://zookeeper.apache.org/.Google Scholar
- The Apache Software Foundation. Apache mina. http://mina.apache.org/.Google Scholar
- Graph 500 Steering Committee. Graph500. http://www.graph500.org/.Google Scholar
Index Terms
- Towards highly scalable pregel-based graph processing platform with x10
Recommendations
Pregel: a system for large-scale graph processing
SIGMOD '10: Proceedings of the 2010 ACM SIGMOD International Conference on Management of dataMany practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs - in some cases billions of vertices, trillions of edges - poses challenges to their efficient ...
Highly connected monochromatic subgraphs
We conjecture that for n>4(k-1) every 2-coloring of the edges of the complete graph K"n contains a k-connected monochromatic subgraph with at least n-2(k-1) vertices. This conjecture, if true, is best possible. Here we prove it for k=2, and show how to ...
Towards Systematic Parallelization of Graph Transformations Over Pregel
Graphs can be used to model many kinds of data, from traditional datasets to social networks or semi-structured datasets. To process large graphs, many systems have been proposed. The Pregel programming model is popular, thanks to its scalability. ...
Comments