short-paper

Benchmarking graph-processing platforms: a vision

Authors:
Yong Guo (TU Delft, Delft, Netherlands)
Ana Lucia Varbanescu (University of Amsterdam, Amsterdam, Netherlands)
Alexandru Iosup (TU Delft, Delft, Netherlands)
Claudio Martella (VU University Amsterdam, Amsterdam, Netherlands)
Theodore L. Willke (Systems Architecture Lab, Intel Corporation, Portland, USA)

ICPE '14: Proceedings of the 5th ACM/SPEC international conference on Performance engineering
March 2014, Pages 289–292
https://doi.org/10.1145/2568088.2576761

Published: 22 March 2014

ABSTRACT

Processing graphs, especially at large scale, is increasingly useful in a variety of business, engineering, and scientific domains. There are already tens of graph-processing platforms, such as Hadoop, Giraph, and GraphLab, each with a different design and functionality. For graph processing to continue to evolve, users must find it easy to select a graph-processing platform, and developers and system integrators must find it easy to quantify performance and other non-functional aspects of interest. However, the performance analysis of graph-processing platforms is still immature: few studies exist, those that do have little in common, and there is relatively little understanding of how dataset and algorithm diversity affects performance. Our vision is to develop, with the help of the performance-savvy community, a comprehensive benchmarking suite for graph-processing platforms. In this work, we take a step in this direction by proposing a set of seven challenges, summarizing our previous work on performance evaluation of distributed graph-processing platforms, and introducing our ongoing work within the SPEC Research Group's Cloud Working Group.

Index Terms

1. General and reference
  1. Cross-computing tools and techniques
    1. Measurement
    2. Metrics

Published in
ICPE '14: Proceedings of the 5th ACM/SPEC international conference on Performance engineering
March 2014
310 pages
ISBN: 9781450327336
DOI: 10.1145/2568088
General Chairs:
Klaus-Dieter Lange (Hewlett-Packard Company, USA)
John Murphy (University College Dublin, Ireland)
Program Chairs:
Walter Binder (University of Lugano, Switzerland)
José Merseguer (Universidad de Zaragoza, Spain)
Copyright © 2014 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
benchmarking
experimentation
graph processing
performance
Acceptance Rates
ICPE '14 Paper Acceptance Rate: 21 of 78 submissions, 27%
Overall Acceptance Rate: 252 of 851 submissions, 30%