research-article

LDBC graphalytics: a benchmark for large-scale graph analysis on parallel and distributed platforms

Published: 01 September 2016

Abstract

In this paper we introduce LDBC Graphalytics, a new industrial-grade benchmark for graph analysis platforms. It consists of six deterministic algorithms, standard datasets, synthetic dataset generators, and reference output, which together enable the objective comparison of graph analysis platforms. Its test harness produces deep metrics that quantify multiple kinds of system scalability, such as horizontal/vertical and weak/strong, and of robustness, such as failures and performance variability. The benchmark comes with open-source software for generating data and monitoring performance. We describe and analyze six implementations of the benchmark (three from the community, three from industry), providing insights into the strengths and weaknesses of the platforms. Key to our contribution, vendors perform the tuning and benchmarking of their own platforms.

Published in

Proceedings of the VLDB Endowment, Volume 9, Issue 13
September 2016
378 pages
ISSN: 2150-8097

Publisher: VLDB Endowment
