Abstract
Bloom filters are a great technique to test whether a key is not in a set of keys. This paper presents a novel data structure called ARF. In a nutshell, ARFs are for range queries what Bloom filters are for point queries. That is, an ARF can determine whether a set of keys does not contain any keys that are part of a specific range. This paper describes the principles and methods for efficient implementation of ARFs and presents the results of comprehensive experiments that assess the precision, space, and latency of ARFs. Furthermore, this paper shows how ARFs can be applied to a commercial database system that partitions data into hot and cold regions to optimize queries that involve only hot data.
- A. V. Aho, P. J. Denning, and J. D. Ullman. Principles of optimal page replacement. J. ACM, 18(1):80-93, 1971. Google Scholar
- A.-K. Alexiou. Adaptive range filters for query optimization. Master Thesis 84, Systems Group, ETH Zurich, 2013.Google Scholar
- B. Baig, H. Chan, S. McCauley, and A. Wong. Range queries using Bloom filters. http://www.scribd.com/doc/92463819/Ranged-Queries-Using-Bloom-Filters-Final.Google Scholar
- D. Barbará, W. DuMouchel, C. Faloutsos, P. J. Haas, J. M. Hellerstein, Y. E. Ioannidis, H. V. Jagadish, T. Johnson, R. T. Ng, V. Poosala, K. A. Ross, and K. C. Sevcik. The new jersey data reduction report. IEEE Data Eng. Bull., 20(4):3-45, 1997.Google Scholar
- B. H. Bloom. Space/time trade-offs in hash coding with allowable errors. Commun. ACM, 13(7):422-426, 1970. Google Scholar
- F. Bonomi, M. Mitzenmacher, R. Panigrahy, S. Singh, and G. Varghese. An improved construction for counting Bloom filters. In Y. Azar and T. Erlebach, editors, ESA, volume 4168 of Lecture Notes in Computer Science, pages 684-695. Springer, 2006. Google Scholar
- J. Bruck, J. Gao, and A. Jiang. Weighted Bloom filters. 2006.Google Scholar
- C.-M. Chen and N. Roussopoulos. Adaptive selectivity estimation using query feedback. In R. T. Snodgrass and M. Winslett, editors, SIGMOD Conference, pages 161-172. ACM Press, 1994. Google Scholar
- S. Dar, M. J. Franklin, B. T. Jónsson, D. Srivastava, and M. Tan. Semantic data caching and replacement. In T. M. Vijayaraman, A. P. Buchmann, C. Mohan, and N. L. Sarda, editors, VLDB, pages 330-341. Morgan Kaufmann, 1996. Google Scholar
- C. Diaconu, C. Freedman, E. Ismert, P.-A. Larson, P. Mittal, R. Stonecipher, N. Verma, and M. Zwilling. Hekaton: SQL Server's memory-optimized OLTP engine. In SIGMOD Conference, 2013. Google Scholar
- J.-P. Dittrich, P. M. Fischer, and D. Kossmann. Agile: Adaptive indexing for context-aware information filters. In F. Özcan, editor, SIGMOD Conference, pages 215-226. ACM, 2005. Google Scholar
- F. Fabret, H.-A. Jacobsen, F. Llirbat, J. Pereira, K. A. Ross, and D. Shasha. Filtering algorithms and implementation for very fast publish/subscribe. In S. Mehrotra and T. K. Sellis, editors, SIGMOD Conference, pages 115-126. ACM, 2001. Google Scholar
- L. Fan, P. Cao, J. M. Almeida, and A. Z. Broder. Summary cache: A scalable wide-area web cache sharing protocol. In SIGCOMM, pages 254-265, 1998. Google Scholar
- Y. E. Ioannidis. The history of histograms (abridged). In VLDB, pages 19-30, 2003. Google Scholar
- A. M. Keller and J. Basu. A predicate-based caching scheme for client-server database architectures. In PDIS, pages 229-238. IEEE Computer Society, 1994. Google Scholar
- D. E. Knuth. The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley, 1973.Google Scholar
- J. Levandoski, P.-A. Larson, and R. Stoica. Identifying hot and cold data in main-memory databases. In IEEE Conf. on Data Engineering, 2013. Google Scholar
- M. Zhong, P. Lu, K. Shen, and J. I. Seiferas. Optimizing data popularity conscious Bloom filters. In R. A. Bazzi and B. Patt-Shamir, editors, PODC, pages 355-364. ACM, 2008. Google Scholar
Index Terms
- Adaptive range filters for cold data: avoiding trips to Siberia
Recommendations
Grafite: Taming Adversarial Queries with Optimal Range Filters
PACMMODRange filters allow checking whether a query range intersects a given set of keys with a chance of returning a false positive answer, thus generalising the functionality of Bloom filters from point to range queries. Existing practical range filters have ...
Range queries on uncertain data
Given a set P of n uncertain points on the real line, each represented by its one-dimensional probability density function, we consider the problem of building data structures on P to answer range queries of the following three types for any query ...
Adaptive median filters: new algorithms and results
Based on two types of image models corrupted by impulse noise, we propose two new algorithms for adaptive median filters. They have variable window size for removal of impulses while preserving sharpness. The first one, called the ranked-order based ...
Comments