research-article

Predictable performance for unpredictable workloads

Authors:
P. Unterbrunner (ETH Zurich, Switzerland),
G. Giannikis (ETH Zurich, Switzerland),
G. Alonso (ETH Zurich, Switzerland),
D. Fauser (Amadeus IT Group, SA, France),
D. Kossmann (ETH Zurich, Switzerland)

Proceedings of the VLDB Endowment, Volume 2, Issue 1, pp. 706–717. https://doi.org/10.14778/1687627.1687707

Published: 01 August 2009

Abstract

This paper introduces Crescando: a scalable, distributed relational table implementation designed to perform large numbers of queries and updates with guaranteed access latency and data freshness. To this end, Crescando leverages a number of modern query processing techniques and hardware trends. Specifically, Crescando is based on parallel, collaborative scans in main memory and so-called "query-data" joins known from data-stream processing. While the proposed approach is not always optimal for a given workload, it provides latency and freshness guarantees for all workloads. Thus, Crescando is particularly attractive if the workload is unknown, changing, or involves many different queries. This paper describes the design, algorithms, and implementation of a Crescando storage node, and assesses its performance on modern multi-core hardware.
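
The abstract names "query-data" joins without showing one, so the following is a minimal, single-threaded sketch of the general idea, not the paper's algorithm: the pending queries are indexed, and each record probes that query index during a single pass over the data. The function name `shared_scan`, the dict-based record layout, and the restriction to equality predicates are all assumptions made for illustration.

```python
from collections import defaultdict


def shared_scan(records, queries):
    """Answer every query in `queries` with one pass over `records`.

    records: iterable of dicts (hypothetical row layout).
    queries: dict mapping query id -> (attribute, value) equality predicate.
    Returns a dict mapping query id -> list of matching records.
    """
    # Index the queries rather than the data: (attribute, value) -> query ids.
    query_index = defaultdict(list)
    for qid, (attr, value) in queries.items():
        query_index[(attr, value)].append(qid)

    # Only attributes mentioned by at least one query need to be probed.
    probed_attrs = {attr for attr, _ in query_index}

    results = {qid: [] for qid in queries}
    for record in records:  # the single scan shared by all queries
        for attr in probed_attrs:
            if attr not in record:
                continue
            # Join step: the record looks up the queries it satisfies.
            for qid in query_index.get((attr, record[attr]), ()):
                results[qid].append(record)
    return results
```

In this sketch the number of passes over the data does not depend on the number of queries, and each record is probed once per queried attribute rather than once per query. That suggests why a scan-based design can bound latency for arbitrary query mixes. The sketch leaves out the parallelism, updates, and freshness guarantees the abstract mentions.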

Index Terms

Predictable performance for unpredictable workloads
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
2. Information systems
  1. Data management systems
    1. Database management system engines
      1. Parallel and distributed DBMSs
  2. Information storage systems
    1. Record storage systems

Published in
Proceedings of the VLDB Endowment, Volume 2, Issue 1
August 2009
1293 pages
ISSN: 2150-8097
Editors: Serge Abiteboul, Tova Milo, Jignesh Patel, Philippe Rigaux
Publisher: VLDB Endowment