research-article

LDBC Graphalytics: a benchmark for large-scale graph analysis on parallel and distributed platforms

Authors:
Alexandru Iosup, Delft University of Technology
Tim Hegeman, Delft University of Technology
Wing Lung Ngai, Delft University of Technology
Stijn Heldens, Delft University of Technology
Arnau Prat-Pérez, UPC Barcelona
Thomas Manhardt, Oracle Labs
Hassan Chafi, Oracle Labs
Mihai Capotă, Intel Labs
Narayanan Sundaram, Intel Labs
Michael Anderson, Intel Labs
Ilie Gabriel Tănase, IBM Research
Yinglong Xia, Huawei Research America
Lifeng Nai, Georgia Tech
Peter Boncz, CWI Amsterdam

Proceedings of the VLDB Endowment, Volume 9, Issue 13, pp. 1317–1328
https://doi.org/10.14778/3007263.3007270

Published: 01 September 2016

Abstract

In this paper we introduce LDBC Graphalytics, a new industrial-grade benchmark for graph analysis platforms. It consists of six deterministic algorithms, standard datasets, synthetic dataset generators, and reference output, which together enable the objective comparison of graph analysis platforms. Its test harness produces deep metrics that quantify multiple kinds of system scalability, such as horizontal/vertical and weak/strong, and of robustness, such as failures and performance variability. The benchmark comes with open-source software for generating data and monitoring performance. We describe and analyze six implementations of the benchmark (three from the community, three from industry), providing insights into the strengths and weaknesses of the platforms. Key to our contribution, vendors perform the tuning and benchmarking of their platforms.

Index Terms

1. General and reference
  1. Cross-computing tools and techniques
    1. Performance

Published in
Proceedings of the VLDB Endowment, Volume 9, Issue 13
September 2016
378 pages
ISSN: 2150-8097
Editor: Surajit Chaudhuri, Microsoft Research
Publisher: VLDB Endowment
