skip to main content
article

Adaptive range filters for cold data: avoiding trips to Siberia

Published:01 September 2013Publication History
Skip Abstract Section

Abstract

Bloom filters are a great technique to test whether a key is not in a set of keys. This paper presents a novel data structure called ARF. In a nutshell, ARFs are for range queries what Bloom filters are for point queries. That is, an ARF can determine whether a set of keys does not contain any keys that are part of a specific range. This paper describes the principles and methods for efficient implementation of ARFs and presents the results of comprehensive experiments that assess the precision, space, and latency of ARFs. Furthermore, this paper shows how ARFs can be applied to a commercial database system that partitions data into hot and cold regions to optimize queries that involve only hot data.

References

  1. A. V. Aho, P. J. Denning, and J. D. Ullman. Principles of optimal page replacement. J. ACM, 18(1):80-93, 1971. Google ScholarGoogle Scholar
  2. A.-K. Alexiou. Adaptive range filters for query optimization. Master Thesis 84, Systems Group, ETH Zurich, 2013.Google ScholarGoogle Scholar
  3. B. Baig, H. Chan, S. McCauley, and A. Wong. Range queries using Bloom filters. http://www.scribd.com/doc/92463819/Ranged-Queries-Using-Bloom-Filters-Final.Google ScholarGoogle Scholar
  4. D. Barbará, W. DuMouchel, C. Faloutsos, P. J. Haas, J. M. Hellerstein, Y. E. Ioannidis, H. V. Jagadish, T. Johnson, R. T. Ng, V. Poosala, K. A. Ross, and K. C. Sevcik. The new jersey data reduction report. IEEE Data Eng. Bull., 20(4):3-45, 1997.Google ScholarGoogle Scholar
  5. B. H. Bloom. Space/time trade-offs in hash coding with allowable errors. Commun. ACM, 13(7):422-426, 1970. Google ScholarGoogle Scholar
  6. F. Bonomi, M. Mitzenmacher, R. Panigrahy, S. Singh, and G. Varghese. An improved construction for counting Bloom filters. In Y. Azar and T. Erlebach, editors, ESA, volume 4168 of Lecture Notes in Computer Science, pages 684-695. Springer, 2006. Google ScholarGoogle Scholar
  7. J. Bruck, J. Gao, and A. Jiang. Weighted Bloom filters. 2006.Google ScholarGoogle Scholar
  8. C.-M. Chen and N. Roussopoulos. Adaptive selectivity estimation using query feedback. In R. T. Snodgrass and M. Winslett, editors, SIGMOD Conference, pages 161-172. ACM Press, 1994. Google ScholarGoogle Scholar
  9. S. Dar, M. J. Franklin, B. T. Jónsson, D. Srivastava, and M. Tan. Semantic data caching and replacement. In T. M. Vijayaraman, A. P. Buchmann, C. Mohan, and N. L. Sarda, editors, VLDB, pages 330-341. Morgan Kaufmann, 1996. Google ScholarGoogle Scholar
  10. C. Diaconu, C. Freedman, E. Ismert, P.-A. Larson, P. Mittal, R. Stonecipher, N. Verma, and M. Zwilling. Hekaton: SQL Server's memory-optimized OLTP engine. In SIGMOD Conference, 2013. Google ScholarGoogle Scholar
  11. J.-P. Dittrich, P. M. Fischer, and D. Kossmann. Agile: Adaptive indexing for context-aware information filters. In F. Özcan, editor, SIGMOD Conference, pages 215-226. ACM, 2005. Google ScholarGoogle Scholar
  12. F. Fabret, H.-A. Jacobsen, F. Llirbat, J. Pereira, K. A. Ross, and D. Shasha. Filtering algorithms and implementation for very fast publish/subscribe. In S. Mehrotra and T. K. Sellis, editors, SIGMOD Conference, pages 115-126. ACM, 2001. Google ScholarGoogle Scholar
  13. L. Fan, P. Cao, J. M. Almeida, and A. Z. Broder. Summary cache: A scalable wide-area web cache sharing protocol. In SIGCOMM, pages 254-265, 1998. Google ScholarGoogle Scholar
  14. Y. E. Ioannidis. The history of histograms (abridged). In VLDB, pages 19-30, 2003. Google ScholarGoogle Scholar
  15. A. M. Keller and J. Basu. A predicate-based caching scheme for client-server database architectures. In PDIS, pages 229-238. IEEE Computer Society, 1994. Google ScholarGoogle Scholar
  16. D. E. Knuth. The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley, 1973.Google ScholarGoogle Scholar
  17. J. Levandoski, P.-A. Larson, and R. Stoica. Identifying hot and cold data in main-memory databases. In IEEE Conf. on Data Engineering, 2013. Google ScholarGoogle Scholar
  18. M. Zhong, P. Lu, K. Shen, and J. I. Seiferas. Optimizing data popularity conscious Bloom filters. In R. A. Bazzi and B. Patt-Shamir, editors, PODC, pages 355-364. ACM, 2008. Google ScholarGoogle Scholar

Index Terms

  1. Adaptive range filters for cold data: avoiding trips to Siberia
      Index terms have been assigned to the content through auto-classification.

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image Proceedings of the VLDB Endowment
        Proceedings of the VLDB Endowment  Volume 6, Issue 14
        September 2013
        384 pages

        Publisher

        VLDB Endowment

        Publication History

        • Published: 1 September 2013
        Published in pvldb Volume 6, Issue 14

        Qualifiers

        • article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader