2018 | Book

Performance Evaluation and Benchmarking for the Analytics Era

9th TPC Technology Conference, TPCTC 2017, Munich, Germany, August 28, 2017, Revised Selected Papers


About this book

This book constitutes the thoroughly refereed post-conference proceedings of the 9th TPC Technology Conference on Performance Evaluation and Benchmarking, TPCTC 2017, held in conjunction with the 43rd International Conference on Very Large Data Bases (VLDB 2017) in August/September 2017.
The 12 papers presented were carefully reviewed and selected from numerous submissions. The TPC remains committed to developing new benchmark standards to keep pace with rapid changes in technology.

Table of Contents

Frontmatter
Industry Standards for the Analytics Era: TPC Roadmap
Abstract
The Transaction Processing Performance Council (TPC) is a non-profit organization focused on developing data-centric benchmark standards and disseminating objective, verifiable performance data to the industry. This paper provides a high-level summary of TPC benchmark standards, the TPC's technology conference initiative, and new development activities in progress.
Raghunath Nambiar, Meikel Poess
PEEL: A Framework for Benchmarking Distributed Systems and Algorithms
Abstract
During the last decade, a multitude of novel systems for scalable and distributed data processing has been proposed in both academia and industry. While there are published results of experimental evaluations for nearly all systems, it remains a challenge to objectively compare different systems’ performance. It is thus imperative to enable and establish benchmarks for these systems. However, even if workloads and data sets or data generators are fixed, orchestrating and executing benchmarks can be a major obstacle. Worse, many systems come with hardware-dependent parameters that have to be tuned and spawn a diverse set of configuration files. This impedes the portability and reproducibility of benchmarks. To address these problems and to foster reproducible and portable experiments and benchmarks of distributed data processing systems, we present PEEL, a framework to define, execute, analyze, and share experiments. PEEL enables the transparent specification of benchmarking workloads and system configuration parameters. It orchestrates the systems involved and automatically runs experiments and collects all associated logs. PEEL currently supports Apache HDFS, Hadoop, Flink, and Spark and can easily be extended to include further systems.
Christoph Boden, Alexander Alexandrov, Andreas Kunft, Tilmann Rabl, Volker Markl
Senska – Towards an Enterprise Streaming Benchmark
Abstract
In the light of growing data volumes and continuing digitization in fields such as Industry 4.0 or the Internet of Things, data stream processing has gained popularity and importance. Enterprises in particular can benefit from this development by augmenting their vital, core business data with up-to-date streaming information. Enriching this transactional data with detailed information from high-frequency data streams allows answering new analytical questions as well as improving current analyses, e.g., regarding predictive maintenance. Comparing such data stream processing architectures for use in an enterprise context, i.e., when combining streaming and business data, is currently a challenging task as there is no suitable benchmark.
In this paper, we give an overview of performance benchmarks in the area of data stream processing. We highlight shortcomings of existing benchmarks and present the need for a new benchmark with a focus on an enterprise context. Furthermore, we introduce the ideas behind Senska, a new enterprise streaming benchmark intended to fill this gap, as well as its architecture.
Guenter Hesse, Benjamin Reissaus, Christoph Matthies, Martin Lorenz, Milena Kraus, Matthias Uflacker
Towards a Scalability and Energy Efficiency Benchmark for VNF
Abstract
Network Function Virtualization (NFV) is the transfer of network functions from dedicated devices to high-volume commodity servers. It opens opportunities for flexibility and energy savings. Concrete insights into the flexibility of specific NFV environments require measurement methodologies and benchmarks. However, current benchmarks do not measure the ability of a virtual network function (VNF) to scale either horizontally or vertically. We therefore envision a new benchmark that measures a VNF’s ability to scale while evaluating its energy efficiency at the same time. Such a benchmark would enable the selection of a suitable VNF for changing demands, deployed on an existing or new resource landscape, while minimizing energy costs.
Norbert Schmitt, Jóakim von Kistowski, Samuel Kounev
Characterizing BigBench Queries, Hive, and Spark in Multi-cloud Environments
Abstract
BigBench is the new standard (TPCx-BB) for benchmarking and testing Big Data systems. The TPCx-BB specification describes several business use cases—queries—which require a broad combination of data extraction techniques including SQL, Map/Reduce (M/R), user code (UDF), and Machine Learning to fulfill them. However, there is currently no widespread knowledge of the resource requirements and expected performance of each query, as there is for more established benchmarks. Moreover, over the last year, the Spark framework and APIs have been evolving rapidly, with major improvements in performance and the stable release of v2. Our intent is to compare the current state of Spark to Hive’s base implementation, which can use the legacy M/R engine and Mahout or the current Tez and MLlib frameworks. At the same time, cloud providers now offer convenient on-demand managed big data clusters (PaaS) with a pay-as-you-go model. In PaaS, analytical engines such as Hive and Spark come ready to use, with a general-purpose configuration and upgrade management. This study characterizes both the BigBench queries and the out-of-the-box performance of Spark and Hive versions in the cloud, while also comparing popular PaaS offerings in terms of reliability, data scalability (1 GB to 10 TB), versions, and settings across Azure HDInsight, Amazon Web Services EMR, and Google Cloud Dataproc. The query characterization highlights the similarities and differences between the Hive and Spark frameworks, and identifies which queries are the most resource-consuming in terms of CPU, memory, and I/O. Scalability results show a need for configuration tuning in most cloud providers as data scale grows, especially with Spark’s memory usage. These results can help practitioners quickly test systems by picking a subset of the queries that stresses each of the categories. At the same time, the results show how Hive and Spark compare and what performance can be expected of each in PaaS.
Nicolas Poggi, Alejandro Montero, David Carrera
Performance Characterization of Big Data Systems with TPC Express Benchmark HS
Abstract
TPC Express Benchmark HS (TPCx-HS) is the industry’s first standard for benchmarking big data systems. A large big data deployment has many moving parts: compute, storage, memory, and network (collectively called the infrastructure), as well as the platform and the application. In this paper, we characterize in detail how each of these components affects performance.
Manan Trivedi
Experiences and Lessons in Practice Using TPCx-BB Benchmarks
Abstract
TPCx-BigBench (TPCx-BB) is a TPC Express benchmark designed to measure the performance of big data analytics systems. It contains 30 use cases that simulate big data processing, big data storage, big data analytics, and reporting. We have used this benchmark to evaluate the performance of software and hardware components for big data systems. It has very good coverage of different data types and provides enough scalability to address data-size and node-scaling problems. We have gained many meaningful insights through this benchmark for designing analytics systems. At the same time, we also found that we cannot rely on TPCx-BB alone to evaluate and design an end-to-end big data system. There are gaps between an analytics system and a real end-to-end system. The data flow of a real end-to-end system includes data ingestion, which moves data from where it originates into a system, such as Hadoop, where it can be stored and analyzed. Ingesting data at a reasonable speed can be challenging for businesses that need to maintain a competitive advantage, yet TPCx-BB cannot help with the performance evaluation of software and hardware for data ingestion. Big data is commonly characterized by three dimensions: Volume, Variety, and Velocity. Velocity refers to high-speed data processing: real-time or near real-time. As big data technology becomes widely used, real-time and near-real-time processing are becoming more popular, and real-time processing imposes strict limits on bandwidth and latency. TPCx-BB cannot help with the performance evaluation of software and hardware for real-time processing either. This paper discusses these experiences and lessons learned in practice using TPCx-BB. We then provide some advice on extending TPCx-BB to cover data ingestion and real-time processing, and share some ideas on how to implement this extended coverage.
Kebing Wang, Bianny Bian, Paul Cao, Mike Riess
JCC-H: Adding Join Crossing Correlations with Skew to TPC-H
Abstract
We introduce JCC-H, a drop-in replacement for the data and query generators of TPC-H that introduces join-crossing correlations (JCC) and skew into the dataset and query workload. These correlations are carefully designed such that the filter predicates on table columns in the existing TPC-H queries now affect the value, frequency, and join-fan-out distributions experienced by operators in the query plan. The query generator of JCC-H can produce parameter bindings for the 22 query templates in two different equivalence classes: query templates that receive “normal” parameters do not experience skew and behave very similarly to default TPC-H queries, whereas query templates expanded with the “skewed” parameters experience strong join-crossing correlations and skew in filter, aggregation, and join operations. In this paper we discuss the goals of JCC-H and its detailed design, and we show initial experiments on both a single-server and an MPP database system that confirm our design goals were largely met. In all, JCC-H provides a convenient way for any system that already tests with TPC-H to examine how it handles skew and correlations, and we hope the community can use it to make progress on issues such as skew mitigation and the detection and exploitation of join-crossing correlations in query optimizers and data storage.
Peter Boncz, Angelos-Christos Anadiotis, Steffen Kläbe
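The effect of a join-crossing correlation described in the abstract above can be made concrete with a tiny, self-contained sketch. The following Python snippet is an illustration of the concept only, not JCC-H's actual data generator; the 1%/99% split, the order counts, and the market-segment values are chosen arbitrarily. It shows how a filter predicate on one column, when correlated with a skewed join key, changes the average join fan-out an operator downstream would observe.

    import random
    from collections import Counter

    # Illustration only (not the JCC-H generator): 1% of customers are "heavy"
    # (100x more orders) and all of them fall into one market segment.
    random.seed(42)

    orders = []  # (customer key, customer market segment) per order
    for cust in range(1000):
        n_orders = 200 if cust < 10 else 2
        segment = "BUILDING" if cust < 10 else random.choice(["MACHINERY", "AUTOMOBILE"])
        orders += [(cust, segment)] * n_orders

    # Fan-out of a customer-to-orders join, with and without the filter.
    fanout_all = Counter(cust for cust, _ in orders)
    fanout_filtered = Counter(cust for cust, seg in orders if seg == "BUILDING")

    print("avg join fan-out, no filter:         ",
          sum(fanout_all.values()) / len(fanout_all))        # ~4 orders/customer
    print("avg join fan-out, segment = BUILDING:",
          sum(fanout_filtered.values()) / len(fanout_filtered))  # 200 orders/customer

Because the filter is correlated with the skewed keys, the same join sees an average fan-out of roughly 4 without the predicate and 200 with it, which is the kind of behavior a query optimizer tested only on uniform TPC-H data never encounters.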
TPCx-HS v2: Transforming with Technology Changes
Abstract
The TPCx-HS Hadoop benchmark has helped drive competition in the Big Data marketplace and has proven to be a successful industry standard benchmark for Hadoop systems. However, the Big Data landscape has changed rapidly since its initial release in 2014. Key technologies have matured, while new ones have risen to prominence in an effort to keep pace with the exponential expansion of datasets. For example, Hadoop has undergone a much-needed upgrade to the way scheduling, resource management, and execution occur, while Apache Spark has risen to be the de facto standard for in-memory cluster computing for ETL, Machine Learning, and Data Science workloads. Moreover, enterprises are increasingly considering cloud infrastructure for Big Data processing. What has not changed since TPCx-HS was first released is the need for a straightforward, industry-standard way in which these current technologies and architectures can be evaluated. In this paper, we introduce TPCx-HS v2, which is designed to address these changes in the Big Data technology landscape and to stress both the hardware and software stacks, including the execution engine (MapReduce or Spark) and Hadoop Filesystem API-compatible layers, for both on-premises and cloud deployments.
Tariq Magdon-Ismail, Chinmayi Narasimhadevara, Dave Jaffe, Raghunath Nambiar
Performance Assurance Model for Applications on SPARK Platform
Abstract
The wide availability of open-source big data processing frameworks, such as Spark, has increased the migration of existing applications and the deployment of new applications to these cost-effective platforms. One of the challenges is assuring the performance of an application as data size grows in the production system. We address this problem for the Spark platform using a performance prediction model in the development environment. We propose a grey-box approach to estimate an application's execution time on a Spark cluster for larger data sizes using measurements on low-volume data in a small cluster. The proposed model may also be used iteratively to estimate the cluster size required for the desired application performance in the production environment. We discuss both machine-learning and analytical techniques to build the model. The model is also flexible to different Spark cluster configurations; this flexibility enables its use with optimization techniques to obtain tuned values of Spark parameters for optimal performance of the deployed application on a Spark cluster. Our key innovations in building the Spark performance prediction model are support for different configurations of the Spark platform and a simulator that estimates Spark stage execution time, accounting for task execution variability due to HDFS, data skew, and cluster node heterogeneity. We show that our proposed approaches are able to predict within a 20% error bound for Wordcount, Terasort, K-means, and a few TPC-H SQL workloads.
Rekha Singhal, Praveen Singh
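The general extrapolation idea behind such grey-box prediction, estimating production-scale runtime from a handful of low-volume measurements, can be sketched in a few lines. The snippet below is a deliberately naive illustration, not the authors' model: it assumes runtime grows roughly linearly with input size on a fixed cluster, and the measurement values are made up.

    import numpy as np

    # Hypothetical small-scale measurements: (input size in GB, runtime in seconds)
    sizes_gb = np.array([1.0, 2.0, 4.0, 8.0])
    runtimes_s = np.array([35.0, 62.0, 118.0, 230.0])

    # Fit runtime ~ a * size + b by least squares on the small-data runs.
    a, b = np.polyfit(sizes_gb, runtimes_s, deg=1)

    # Extrapolate to a production-scale input, e.g. 500 GB.
    predicted = a * 500.0 + b
    print(f"Predicted runtime at 500 GB: {predicted:.0f} s")

A real model of this kind additionally has to account for the effects the abstract lists (shuffle behavior, stage boundaries, data skew, HDFS variability, and heterogeneous nodes), which is precisely why the authors combine analytical modeling with a stage-level simulator rather than a single curve fit.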
Benchmarking and Performance Analysis for Distributed Cache Systems: A Comparative Case Study
Abstract
Caching critical pieces of information in memory or on a local hard drive is important for application performance. Such critical pieces of information include, for example, results returned from I/O-intensive queries or computationally intensive calculations. Moreover, storing large amounts of data in a single machine's memory is expensive and sometimes infeasible. Distributed cache systems offer faster access by exploiting the memory of more than one machine while appearing as one large logical cache. Analyzing and benchmarking these systems is therefore necessary to study how factors such as the number of clients and data sizes affect performance. The majority of current benchmarks treat the number of clients as “multiple threads, but all over one client connection”; this does not reflect real-world scenarios, where each thread has its own connection. This paper considers several benchmarking mechanisms and selects one for performance analysis. It also studies the performance of two popular open-source distributed cache systems, Hazelcast and Infinispan. Using the selected benchmarking mechanism, the results show that the performance of distributed cache systems is significantly affected by the number of concurrent clients accessing the distributed cache as well as by the size of the data managed by the cache. Furthermore, the conducted performance analysis shows that Infinispan outperforms Hazelcast in simple data retrieval scenarios as well as in most SQL-like query scenarios, whereas Hazelcast outperforms Infinispan in SQL-like queries for small data sizes.
Haytham Salhi, Feras Odeh, Rabee Nasser, Adel Taweel
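The "one connection per client" methodology argued for in the abstract above can be sketched as follows. This is a minimal Python sketch of the harness structure only: the connect helper and the client object it returns are hypothetical placeholders standing in for a real distributed-cache client library (such as Hazelcast or Infinispan); the point is simply that every worker thread opens and uses its own connection rather than multiplexing all threads over one.

    import threading
    import time

    def connect(host):
        # Placeholder for a real cache client; returns an object with get(key).
        class FakeClient:
            def get(self, key):
                return key  # stand-in for a network round trip to the cache
        return FakeClient()

    def worker(host, n_ops, latencies):
        client = connect(host)  # dedicated connection for this simulated client
        for i in range(n_ops):
            start = time.perf_counter()
            client.get(f"key-{i}")
            latencies.append(time.perf_counter() - start)

    def run(num_clients=8, n_ops=1000, host="cache.example.local"):
        latencies, threads = [], []
        for _ in range(num_clients):
            t = threading.Thread(target=worker, args=(host, n_ops, latencies))
            threads.append(t)
            t.start()
        for t in threads:
            t.join()
        print(f"{num_clients} clients: mean latency "
              f"{sum(latencies) / len(latencies) * 1e6:.1f} us")

    if __name__ == "__main__":
        run()

Sweeping num_clients and the size of the cached values is what exposes the sensitivity to concurrent clients and data size that the paper reports.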
A Comparison of ARM Against x86 for Distributed Machine Learning Workloads
Abstract
The rise of Machine Learning (ML) in the last decade has created an unprecedented surge in demand for new and more powerful hardware. Various hardware approaches exist to meet these demands, motivating the need for hardware performance benchmarks to compare these diverse hardware systems. In this paper, we present a comprehensive analysis and comparison of available benchmark suites in the field of ML and related fields. The analysis of these benchmarks is used to discuss the potential of ARM processors in the context of ML deployments. Our paper concludes with a brief hardware performance comparison of modern, server-grade ARM and x86 processors using a benchmark suite selected from our survey.
Sebastian Kmiec, Jonathon Wong, Hans-Arno Jacobsen, Da Qi Ren
Backmatter
Metadata
Title
Performance Evaluation and Benchmarking for the Analytics Era
Editors
Raghunath Nambiar
Meikel Poess
Copyright Year
2018
Electronic ISBN
978-3-319-72401-0
Print ISBN
978-3-319-72400-3
DOI
https://doi.org/10.1007/978-3-319-72401-0