
About this book

This book constitutes the refereed post-conference proceedings of the 11th TPC Technology Conference on Performance Evaluation and Benchmarking, TPCTC 2019, held in conjunction with the 45th International Conference on Very Large Databases (VLDB 2019) in August 2019.
The 11 papers presented were carefully reviewed and focus on topics such as blockchain; big data and analytics; complex event processing; database optimizations; data integration; disaster tolerance and recovery; artificial intelligence; emerging storage technologies (NVMe, 3D XPoint memory, etc.); hybrid workloads; energy and space efficiency; in-memory databases; internet of things; virtualization; enhancements to TPC workloads; lessons learned in practice using TPC workloads; and collection and interpretation of performance data in public cloud environments.



Benchmarking Elastic Cloud Big Data Services Under SLA Constraints

We introduce an extension for TPC benchmarks addressing the requirements of big data processing in cloud environments. We call it the Elasticity Test and evaluate it under TPCx-BB (BigBench). First, the Elasticity Test incorporates an approach to generate real-world query submission patterns with distinct data scale factors, based on major industrial cluster logs. Second, a new metric based on Service Level Agreements (SLAs) is introduced that takes the quality-of-service requirements of each query into consideration.
Experiments with Apache Hive and Spark on the cloud platforms of three major vendors validate our approach by comparing it to the current TPCx-BB metric. Results show that systems that fail to meet SLAs under concurrency, due to queuing or degraded performance, are penalized by the new metric. Elastic systems, on the other hand, meet a higher percentage of SLAs and are thus rewarded by the new metric. Such systems can scale compute workers up and down according to the demands of a varying workload and can thus reduce monetary cost.
Nicolas Poggi, Víctor Cuevas-Vicenttín, Josep Lluis Berral, Thomas Fenech, Gonzalo Gómez, Davide Brini, Alejandro Montero, David Carrera, Umar Farooq Minhas, Jose A. Blakeley, Donald Kossmann, Raghu Ramakrishnan, Clemens Szyperski
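
As a rough illustration of the SLA idea described in the abstract above, the following Python sketch computes the share of queries that complete within their per-query SLA. It is not the metric defined in the paper: the function name `sla_compliance`, the tuple layout, and the sample numbers are hypothetical and chosen only for illustration.

```python
def sla_compliance(runs):
    """Return the fraction of queries whose completion time met their SLA.

    runs: iterable of (query_id, completion_seconds, sla_seconds) tuples.
    The completion time is assumed to include any queuing delay, so a
    system that queues queries under concurrency misses more SLAs.
    """
    runs = list(runs)
    if not runs:
        raise ValueError("no query runs given")
    met = sum(1 for _, completion, sla in runs if completion <= sla)
    return met / len(runs)


# Hypothetical sample: two of the four queries finish within their SLA.
sample_runs = [
    ("q1", 42.0, 60.0),    # met
    ("q2", 130.0, 120.0),  # missed
    ("q3", 15.5, 30.0),    # met
    ("q4", 61.0, 60.0),    # missed
]
compliance = sla_compliance(sample_runs)
```

A system that adds workers when submissions peak would, under this simplified view, keep completion times below the SLA thresholds for more queries and therefore score higher.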

Efficient Multiway Hash Join on Reconfigurable Hardware

We propose algorithms for performing multiway joins using a new type of coarse-grained reconfigurable hardware accelerator, “Plasticine”, that, compared with other accelerators, emphasizes high compute capability and high on-chip communication bandwidth. Joining three or more relations in a single step, i.e. a multiway join, is efficient when the join of any two relations yields too large an intermediate relation. We show a speedup of at least 130x for executing a sequence of binary hash joins on Plasticine compared with a CPU. We further show that in some realistic cases, a Plasticine-like accelerator can make 3-way joins more efficient than a cascade of binary hash joins on the same hardware, by a factor of up to 45x.
Rekha Singhal, Yaqi Zhang, Jeffrey D. Ullman, Raghu Prabhakar, Kunle Olukotun
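
The difference between a cascade of binary hash joins and a single-step 3-way join, as contrasted in the abstract above, can be sketched in plain Python. This is a generic CPU-side illustration, not the Plasticine algorithm from the paper. The relation shapes R(a, b), S(b, c), T(c, d) and all function names are assumptions made for the example.

```python
from collections import defaultdict


def build_index(rows, key_pos):
    """Hash the rows of a relation on the column at position key_pos."""
    index = defaultdict(list)
    for row in rows:
        index[row[key_pos]].append(row)
    return index


def cascaded_binary_join(r, s, t):
    """Join R(a,b) with S(b,c) first, materialize the result, then join with T(c,d)."""
    s_by_b = build_index(s, 0)
    # The intermediate relation is fully materialized before the second join.
    intermediate = [(a, b, c) for (a, b) in r for (_, c) in s_by_b.get(b, [])]
    t_by_c = build_index(t, 0)
    return [(a, b, c, d) for (a, b, c) in intermediate for (_, d) in t_by_c.get(c, [])]


def three_way_join(r, s, t):
    """Join R, S and T in one pass over S, without an intermediate relation."""
    r_by_b = build_index(r, 1)
    t_by_c = build_index(t, 0)
    out = []
    for (b, c) in s:
        # Probe both hash tables for each S tuple and emit the combinations.
        for (a, _) in r_by_b.get(b, []):
            for (_, d) in t_by_c.get(c, []):
                out.append((a, b, c, d))
    return out


# Hypothetical sample relations.
R = [(1, "x"), (2, "x"), (3, "y")]
S = [("x", 10), ("y", 20), ("z", 30)]
T = [(10, "p"), (10, "q"), (40, "r")]

cascade_result = sorted(cascaded_binary_join(R, S, T))
multiway_result = sorted(three_way_join(R, S, T))
```

Both functions return the same tuples. The single-step version avoids materializing the R-S intermediate relation, which is the case the abstract describes as favorable for multiway joins.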

Challenges in Distributed MLPerf

MLPerf has emerged as a frontrunner in benchmarking AI performance, with the support of the main players in the industry. At the same time, official scores reveal challenges in measuring distributed AI performance: a 2/3 throughput loss at large scale and a larger number of epochs needed to reach the required accuracy. Furthermore, no distributed scores have been submitted for TensorFlow, the most popular AI framework. Our work investigates these issues and suggests ways of overcoming the challenges facing benchmarking at scale. Focusing on TensorFlow, we show how efficient distributed scores can be obtained with appropriate software and hardware choices. Results for various Lenovo servers and Nvidia GPUs (V100 and T4) are also presented. Finally, we examine the utility of MLPerf for evaluating scale-up hardware and propose augmenting the main MLPerf score with an additional score that takes computational efficiency into account. Several options for the score are explored and analyzed in detail.
Miro Hodak, Ajay Dholakia

ADABench - Towards an Industry Standard Benchmark for Advanced Analytics

The digital revolution, rapidly decreasing storage cost, and remarkable results achieved by state-of-the-art machine learning (ML) methods are driving widespread adoption of ML approaches. While notable recent efforts to benchmark ML methods for canonical tasks exist, none of them addresses the challenges arising with the increasing pervasiveness of end-to-end ML deployments. The challenges involved in successfully applying ML methods in diverse enterprise settings extend far beyond efficient model training.
In this paper, we present our work in benchmarking advanced data analytics systems and lay the foundation towards an industry standard machine learning benchmark. Unlike previous approaches, we aim to cover the complete end-to-end ML pipeline for diverse, industry-relevant application domains rather than evaluating only training performance. To this end, we present reference implementations of complete ML pipelines including corresponding metrics and run rules, and evaluate them at different scales in terms of hardware, software, and problem size.
Tilmann Rabl, Christoph Brücke, Philipp Härtling, Stella Stars, Rodrigo Escobar Palacios, Hamesh Patel, Satyam Srivastava, Christoph Boden, Jens Meiners, Sebastian Schelter

TPCx-BB (Big Bench) in a Single-Node Environment

Big data tends to concentrate on data volume and variety, which requires large cluster capabilities to process diverse and heterogeneous data. Currently, NoSQL/Hadoop-based cluster frameworks are known to excel at handling this form of data by scaling across nodes and distributing query processing. But for certain data sizes, relational databases can also support these workloads. In this paper, we support this claim using a popular relational database engine, Microsoft SQL Server 2019 (pre-release candidate), and a big data benchmark, BigBench. Our work is the industry's first case study that runs BigBench in a single-node environment powered by the Intel® Xeon™ processor 8164 product family and enterprise-class Intel® SSDs. We make the following two contributions: (1) we present response times of all 30 BigBench queries when run sequentially to showcase the advanced analytics and machine learning capabilities integrated within SQL Server 2019, and (2) we present results from data scalability experiments over two scale factors (1 TB, 3 TB) to understand the impact of increased data size on query runtimes. We further characterize a subset of queries to understand their resource consumption requirements (CPU/IO/memory) on a single-node system. We conclude by correlating our initial engineering study with similar research studies on cluster-based configurations, providing a further hint at the potential of relational databases to run reasonably scaled big data workloads.
Dippy Aggarwal, Shreyas Shekhar, Chris Elford, Umachandar Jayachandran, Sadashivan Krishnamurthy, Jamie Reding, Brendan Niebruegge

CBench-Dynamo: A Consistency Benchmark for NoSQL Database Systems

Nowadays, software architects face new challenges because the Internet has grown to a point where popular websites are accessed by hundreds of millions of people on a daily basis. One powerful machine is no longer economically viable or resilient enough to handle such traffic, and architectures have since migrated to horizontal scaling. However, traditional databases, usually associated with a relational design, were not ready for horizontal scaling. Therefore, NoSQL databases have been proposed to fill the gap left by their predecessors. This new paradigm is intended to better serve today's massively scaled Internet usage, where consistency is no longer a top priority and a highly available service is preferable. Cassandra is a NoSQL database based on the Amazon Dynamo design. Dynamo-based databases are designed to run in a cluster while offering high availability and eventual consistency to clients when subject to network partition events. The main goal of this work is therefore to propose CBench-Dynamo, the first consistency benchmark for NoSQL databases. Our proposed benchmark correlates properties such as performance, consistency, and availability in different consistency configurations while subjecting the System Under Test to network partition events.
Miguel Diogo, Bruno Cabral, Jorge Bernardino

Benchmarking Pocket-Scale Databases

Embedded database libraries provide developers with a common and convenient data persistence layer. They are a key component of major mobile operating systems, and are used extensively on interactive devices like smartphones. Database performance affects the response times and resource consumption of millions of smartphone apps and billions of smartphone users. Given their wide use and impact, it is critical that we understand how embedded databases operate in realistic mobile settings, and how they interact with mobile environments. We argue that traditional database benchmarking methods produce misleading results when applied to mobile devices, due to evaluating performance only at saturation. To rectify this, we present PocketData, a new benchmark for mobile device database evaluation that uses typical workloads to produce representative performance results. We explain the performance measurement methodology behind PocketData, and address specific challenges. We analyze the results obtained, and show how different classes of workload interact with database performance. Notably, our study of mobile databases at non-saturated levels uncovers significant latency and energy variation in database workloads resulting from CPU frequency scaling policies called governors—variation that we show is hidden by typical benchmark measurement techniques.
Carl Nuessle, Oliver Kennedy, Lukasz Ziarek

End-to-End Benchmarking of Deep Learning Platforms

With their capability to recognise complex patterns in data, deep learning models are rapidly becoming the most prominent set of tools for a broad range of data science tasks from image classification to natural language processing. This trend is supplemented by the availability of deep learning software platforms and modern hardware environments. We propose a declarative benchmarking framework to evaluate the performance of different software and hardware systems. We further use our framework to analyse the performance of three different software frameworks on different hardware setups for a representative set of deep learning workloads and corresponding neural network architectures. Our framework is publicly available at https://github.com/vdeuschle/rysia.
Vincent Deuschle, Alexander Alexandrov, Tim Januschowski, Volker Markl

Role of the TPC in the Cloud Age

In recent years, the TPC Technology Conference on Performance Evaluation and Benchmarking (TPCTC) series has had significant influence in defining industry standards. The 11th TPC Technology Conference on Performance Evaluation and Benchmarking (TPCTC 2019) organized an industry panel on the “Role of the TPC in the Cloud Age”. This paper summarizes the panel discussions.
Alain Crolotte, Feifei Li, Meikel Poess, Peter Boncz, Raghunath Nambiar

Benchmarking Database Cloud Services

Many database cloud services are now available. The well-known providers are AWS (Amazon Web Services), Google Cloud, and Azure (Microsoft). Oracle and IBM offer cloud services for their in-house database products. In the past, the TPC organization has focused on performance measurement of database systems. Often, however, the database system is predefined, and the question is which infrastructure is most efficient and offers the best price/performance ratio, whether on-premises or as a cloud service. Comparable and traceable information about the performance of database cloud services is hard to find on the Internet. It is therefore challenging to make corresponding price/performance comparisons [1]. The company Peakmarks was founded in 2011 to provide a robust and comprehensive benchmarking framework to identify representative performance indicators of database services. Peakmarks does not sell any hardware but runs benchmarks on behalf of users and manufacturers, and thus guarantees absolute independence. Users can license Peakmarks benchmark software to perform their own performance tests. This presentation gives a rough overview of the Peakmarks benchmark software, its architecture, and its workloads. Examples show how understandable key performance metrics for database cloud services can be determined quickly and practically.
Manfred Drozd

Use Machine Learning Methods to Predict Lifetimes of Storage Devices

Erase count is a key performance indicator of hard drives, and it indicates the lifetime of a device. Analysis of erase counts helps us understand the performance of a device and prevent its failure. In this paper, a machine-learning-based framework is proposed to predict erase-count curves. Specifically, probabilities and erase-count curves of different hard drives are first calculated from training data. The probabilities are used to decide the disk type in testing data. The erase-count curves from training data serve as references for testing data. Long short-term memory (LSTM) is used to model the erase-count difference between a reference device and a testing device, and to predict the lifetime of the testing device. Preliminary results on synthetic data show that our method can follow references and precisely predict erase counts.
Yingxuan Zhu, Bowen Jiang, Yong Wang, Tim Tingqiu Yuan, Jian Li

