DOI: 10.1145/2568088.2576761
short-paper

Benchmarking graph-processing platforms: a vision

Published: 22 March 2014

ABSTRACT

Processing graphs, especially at large scale, is an increasingly useful activity in a variety of business, engineering, and scientific domains. Already, there are tens of graph-processing platforms, such as Hadoop, Giraph, and GraphLab, each with a different design and functionality. For graph processing to continue to evolve, users must find it easy to select a graph-processing platform, and developers and system integrators must find it easy to quantify the performance and other non-functional aspects of interest. However, the state of performance analysis of graph-processing platforms is still immature: few studies exist, those that do have little in common, and there is relatively little understanding of the impact of dataset and algorithm diversity on performance. Our vision is to develop, with the help of the performance-savvy community, a comprehensive benchmarking suite for graph-processing platforms. In this work, we take a step in this direction by proposing a set of seven challenges, summarizing our previous work on performance evaluation of distributed graph-processing platforms, and introducing our ongoing work within the SPEC Research Group's Cloud Working Group.

Published in

ICPE '14: Proceedings of the 5th ACM/SPEC international conference on Performance engineering
March 2014
310 pages
ISBN: 9781450327336
DOI: 10.1145/2568088

Copyright © 2014 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ICPE '14 Paper Acceptance Rate: 21 of 78 submissions, 27%
Overall Acceptance Rate: 252 of 851 submissions, 30%
