skip to main content
research-article

Predictable performance for unpredictable workloads

Published:01 August 2009Publication History
Skip Abstract Section

Abstract

This paper introduces Crescando: a scalable, distributed relational table implementation designed to perform large numbers of queries and updates with guaranteed access latency and data freshness. To this end, Crescando leverages a number of modern query processing techniques and hardware trends. Specifically, Crescando is based on parallel, collaborative scans in main memory and so-called "query-data" joins known from data-stream processing. While the proposed approach is not always optimal for a given workload, it provides latency and freshness guarantees for all workloads. Thus, Crescando is particularly attractive if the workload is unknown, changing, or involves many different queries. This paper describes the design, algorithms, and implementation of a Crescando storage node, and assesses its performance on modern multi-core hardware.

References

  1. A. Ailamaki et. al. Weaving relations for cache performance. In Proc. VLDB '01, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. P. M. G. Apers et. al. Prisma/db: A parallel, main memory relational dbms. IEEE TKDE, 4(6), 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. H. Berenson et. al. A critique of ansi sql isolation levels. In Proc. SIGMOD '95, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. P. A. Boncz et. al. Database architecture optimized for the new bottleneck: Memory access. In Proc. VLDB '99, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. P. A. Boncz et. al. Monetdb/x100: Hyper-pipelining query execution. In Proc. CIDR '05, 2005.Google ScholarGoogle Scholar
  6. S. Chandrasekaran and M. J. Franklin. Streaming queries over streaming data. In Proc. VLDB '02, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. D. J. Dewitt et. al. The gamma database machine project. IEEE TKDE, 2(1), 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. D. J. DeWitt et. al. An evaluation of non-equijoin algorithms. In Proc. VLDB '91, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. F. Fabret et. al. Filtering algorithms and implementation for very fast publish/subscribe systems. In Proc. SIGMOD '01, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. P. Flajolet and G. N. Martin. Probabilistic counting algorithms for data base applications. J. Comput. Syst. Sci., 31(2):182--209, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. A. Guttman. R-trees: a dynamic index structure for spatial searching. In Proc. SIGMOD '84, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. S. Harizopoulos et. al. Qpipe: A simultaneously pipelined relational query engine. In Proc. SIGMOD '05, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. A. Kleen. A numa api for linux. Novell Technical Whitepaper, 2005. http://www.novell.com/-resourcecenter/ext_item.jsp?itemId=14444.Google ScholarGoogle Scholar
  14. C. Lang et. al. Increasing buffer-locality for multiple relational table scans through grouping and throttling. Proc. ICDE '07, 2007.Google ScholarGoogle Scholar
  15. D. B. Lomet. Key range locking strategies for improved concurrency. In Proc. VLDB '93, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. C. Mohan et. al. Aries: A transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging. ACM TODS, 17:94--162, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. L. Qiao et. al. Main-memory scan sharing for multi-core cpus. Proc. VLDB '08, 1(1), 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. V. Raman et. al. Constant-time query processing. In Proc. ICDE '08, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. M. Ronström and L. Thalmann. Mysql cluster architecture overview: High availability features of mysql cluster. MySQL Technical Whitepaper, 2004. http://www.techworld.com/whitepapers/index.cfm?-whitepaperid=5663.Google ScholarGoogle Scholar
  20. K. A. Ross. Conjunctive selection conditions in main memory. In Proc. PODS '02, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. K. A. Ross. Selection conditions in main memory. ACM TODS, 29(1), 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. T. K. Sellis. Multiple-query optimization. ACM TODS, 13(1):23--52, 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. K.-Y. Whang et. al. A linear-time probabilistic counting algorithm for database applications. ACM TODS, 15(2):208--229, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. W. A. Wulf and S. A. McKee. Hitting the memory wall: implications of the obvious. ACM SIGARCH Comput. Archit. News, 23(1):20--24, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. M. Zukowski et. al. Cooperative scans: Dynamic bandwidth sharing in a dbms. In Proc. VLDB '07, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Predictable performance for unpredictable workloads

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader