skip to main content
research-article

The more the merrier: efficient multi-source graph traversal

Published:01 December 2014Publication History
Skip Abstract Section

Abstract

Graph analytics on social networks, Web data, and communication networks has been widely used in a plethora of applications. Many graph analytics algorithms are based on breadth-first search (BFS) graph traversal, which is not only time-consuming for large datasets but also involves much redundant computation when executed multiple times from different start vertices. In this paper, we propose Multi-Source BFS (MS-BFS), an algorithm that is designed to run multiple concurrent BFSs over the same graph on a single CPU core while scaling up as the number of cores increases. MS-BFS leverages the properties of small-world networks, which apply to many real-world graphs, and enables efficient graph traversal that: (i) shares common computation across concurrent BFSs; (ii) greatly reduces the number of random memory accesses; and (iii) does not incur synchronization costs. We demonstrate how a real graph analytics application---all-vertices closeness centrality---can be efficiently solved with MS-BFS. Furthermore, we present an extensive experimental evaluation with both synthetic and real datasets, including Twitter and Wikipedia, showing that MS-BFS provides almost linear scalability with respect to the number of cores and excellent scalability for increasing graph sizes, outperforming state-of-the-art BFS algorithms by more than one order of magnitude when running a large number of BFSs.

References

  1. Graph 500 Benchmark, 2014. http://www.graph500.org/.Google ScholarGoogle Scholar
  2. V. Agarwal, F. Petrini, D. Pasetto, and D. A. Bader. Scalable Graph Exploration on Multicore Processors. In SC '10, pages 1--11, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. L. A. N. Amaral, A. Scala, M. Barthélémy, and H. E. Stanley. Classes of Small-World Networks. PNAS, 97(21): 11149--11152, 2000.Google ScholarGoogle Scholar
  4. L. Backstrom, P. Boldi, M. Rosa, J. Ugander, and S. Vigna. Four Degrees of Separation. In WebSci '12, pages 33--42, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. D. A. Bader and K. Madduri. Designing Multithreaded Algorithms for Breadth-First Search and St-connectivity on the Cray MTA-2. In ICPP '06, pages 523--530, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. D. A. Bader and K. Madduri. SNAP, Small-world Network Analysis and Partitioning: An Open-source Parallel Graph Framework for the Exploration of Large-scale Networks. In IPDPS, pages 1--12, 2008.Google ScholarGoogle Scholar
  7. S. Beamer, K. Asanović, and D. Patterson. Direction-Optimizing Breadth-First Search. In SC '12, pages 12:1--12:10, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. P. Boncz. LDBC: Benchmarks for Graph and RDF Data Management. In IDEAS '13, pages 1--2, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. U. Brandes. A Faster Algorithm for Betweenness Centrality. Journal of Mathematical Sociology, 25: 163--177, 2001.Google ScholarGoogle ScholarCross RefCross Ref
  10. A. Buluç and K. Madduri. Parallel Breadth-First Search on Distributed Memory Systems. In SC '11, pages 65:1--65:12, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. F. Checconi, F. Petrini, J. Willcock, A. Lumsdaine, A. R. Choudhury, and Y. Sabharwal. Breaking the Speed and Scalability Barriers for Graph Exploration on Distributed-Memory Machines. In SC '12, pages 13:1--13:12, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. Cheng, Z. Shang, H. Cheng, H. Wang, and J. X. Yu. K-Reach: Who is in Your Small World. PVLDB, 5(11): 1292--1303, July 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. B. V. Cherkassky, A. V. Goldberg, and T. Radzik. Shortest Paths Algorithms: Theory and Experimental Evaluation. Mathematical Programming, 73(2): 129--174, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. J. Chhugani, N. Satish, C. Kim, J. Sewall, and P. Dubey. Fast and Efficient Graph Traversal Algorithm for CPUs: Maximizing Single-Node Efficiency. In IPDPS '12, pages 378--389, May 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Faunus -- Graph Analytics Engine, 2014. http://thinkaurelius.github.io/faunus/.Google ScholarGoogle Scholar
  16. D. Gregor and A. Lumsdaine. The Parallel BGL: A Generic Library for Distributed Graph Computations. In POOSC, 2005.Google ScholarGoogle Scholar
  17. P. Gupta, A. Goel, J. Lin, A. Sharma, D. Wang, and R. Zadeh. WTF: The Who to Follow Service at Twitter. In WWW '13, pages 505--514, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. S. Hong, T. Oguntebi, and K. Olukotun. Efficient Parallel Graph Exploration on Multi-Core CPU and GPU. In PACT '11, pages 78--88, Oct 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. A. Kemper and T. Neumann. HyPer: A Hybrid OLTP&OLAP Main Memory Database System Based on Virtual Memory Snapshots. In ICDE'11, pages 195--206, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. A. Landherr, B. Friedl, and J. Heidemann. A Critical Review of Centrality Measures in Social Networks. Business & Information Systems Engineering, 2(6): 371--385, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  21. LDBC Social Network Data Generator, 2014. https://github.com/ldbc/ldbc_snb_datagen.Google ScholarGoogle Scholar
  22. J. Lee, C. Jung, D. Lim, and Y. Solihin. Prefetching with Helper Threads for Loosely Coupled Multiprocessor Systems. TPDS, 20(9): 1309--1324, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein. Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud. PVLDB, 5(8): 716--727, April 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. A. Lumsdaine, D. Gregor, B. Hendrickson, and J. Berry. Challenges in Parallel Graph Processing. Parallel Processing Letters, 17(1): 5--20, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  25. G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A System for Large-scale Graph Processing. In SIGMOD '10, pages 135--146, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Neo4j, 2014. http://neo4j.com/.Google ScholarGoogle Scholar
  27. P. Olsen, A. Labouseur, and J.-H. Hwang. Efficient Top-K Closeness Centrality Search. In ICDE '14, pages 196--207, March 2014.Google ScholarGoogle Scholar
  28. L. Qin, J. X. Yu, L. Chang, H. Cheng, C. Zhang, and X. Lin. Scalable Big Graph Processing in MapReduce. In SIGMOD '14, pages 827--838, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. A. Rowstron, D. Narayanan, A. Donnelly, G. O'Shea, and A. Douglas. Nobody Ever Got Fired for Using Hadoop on a Cluster. In HotCDP '12, pages 2:1--2:5, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. N. Satish, C. Kim, J. Chhugani, and P. Dubey. Large-Scale Energy-efficient Graph Traversal: A Path to Efficient Data-Intensive Supercomputing. In SC '12, pages 14:1--14:11, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. B. Shao, H. Wang, and Y. Li. Trinity: A Distributed Graph Engine on a Memory Cloud. In SIGMOD '13, pages 505--516, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. D. Simmen, K. Schnaitter, J. Davis, Y. He, S. Lohariwala, A. Mysore, V. Shenoi, M. Tan, and X. Yu. Large-Scale Graph Analytics in Aster 6: Bringing Context to Big Data Discovery. PVLDB, 7(13): 1405--1416, August 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Titan -- Distributed Graph Database, 2014. http://thinkaurelius.github.io/titan/.Google ScholarGoogle Scholar
  34. S. Wasserman and K. Faust. Social Network Analysis: Methods and Applications, volume 8. Cambridge University Press, 1994.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. The more the merrier: efficient multi-source graph traversal
          Index terms have been assigned to the content through auto-classification.

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          • Published in

            cover image Proceedings of the VLDB Endowment
            Proceedings of the VLDB Endowment  Volume 8, Issue 4
            December 2014
            132 pages

            Publisher

            VLDB Endowment

            Publication History

            • Published: 1 December 2014
            Published in pvldb Volume 8, Issue 4

            Qualifiers

            • research-article

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader