ABSTRACT
This paper describes query processing in the DBO database system. Like other database systems designed for ad hoc analytic processing, DBO computes the exact answer to queries over a large relational database in a scalable fashion. Unlike any other system designed for analytic processing, DBO continuously maintains a guess as to the final answer to an aggregate query throughout execution, along with statistically meaningful bounds on the guess's accuracy. As DBO gathers more information, the guess becomes more accurate, until it is exact when the query completes. Users can therefore stop execution as soon as they are satisfied with the accuracy, which encourages exploratory data analysis.
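The idea of a running guess whose bounds tighten until the answer is exact can be illustrated with a minimal sketch. This is not DBO's algorithm, which the abstract does not detail. It is a textbook estimator under stated assumptions: rows are scanned in uniformly random order, the query is a simple SUM over one column, and the bounds come from a normal approximation with a finite population correction. The names `online_sum_estimates`, `report_every`, and `z` are hypothetical.

```python
import math
import random


def online_sum_estimates(values, report_every, z=1.96, seed=0):
    """Yield (rows_seen, estimate, half_width) while scanning in random order.

    Illustrative only: a scaled-sample-mean estimator of SUM(values),
    not the estimator used by DBO.
    """
    n_total = len(values)
    order = list(values)
    random.Random(seed).shuffle(order)  # assumption: a random scan order

    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)

    for v in order:
        count += 1
        delta = v - mean
        mean += delta / count
        m2 += delta * (v - mean)

        if count % report_every == 0 or count == n_total:
            estimate = n_total * mean  # scale the sample mean up to a total
            fpc = 1.0 - count / n_total  # finite population correction
            if fpc == 0.0:
                half_width = 0.0  # every row has been seen: answer is exact
            elif count < 2:
                half_width = float("inf")  # no variance estimate yet
            else:
                sample_var = m2 / (count - 1)
                half_width = z * n_total * math.sqrt(sample_var / count * fpc)
            yield count, estimate, half_width


if __name__ == "__main__":
    data = list(range(1, 1001))
    for seen, est, hw in online_sum_estimates(data, report_every=250):
        print(f"rows={seen:5d}  estimate={est:12.1f}  +/- {hw:10.1f}")
```

The finite population correction is what drives the interval width to zero once every row has been read, mirroring the abstract's claim that the guess is exact at completion. A user could stop iterating as soon as `half_width` falls below a tolerance they choose.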