skip to main content
10.1145/3318464.3389757acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Lethe: A Tunable Delete-Aware LSM Engine

Published:31 May 2020Publication History

ABSTRACT

Data-intensive applications fueled the evolution of log structured merge (LSM) based key-value engines that employ the out-of-place paradigm to support high ingestion rates with low read/write interference. These benefits, however, come at the cost of treating deletes as a second-class citizen. A delete inserts a tombstone that invalidates older instances of the deleted key. State-of-the-art LSM engines do not provide guarantees as to how fast a tombstone will propagate to persist the deletion. Further, LSM engines only support deletion on the sort key. To delete on another attribute (e.g., timestamp), the entire tree is read and re-written. We highlight that fast persistent deletion without affecting read performance is key to support: (i) streaming systems operating on a window of data, (ii) privacy with latency guarantees on the right-to-be-forgotten, and (iii) en masse cloud deployment of data systems that makes storage a precious resource. To address these challenges, in this paper, we build a new key-value storage engine, Lethe, that uses a very small amount of additional metadata, a set of new delete-aware compaction policies, and a new physical data layout that weaves the sort and the delete key order. We show that Lethe supports any user-defined threshold for the delete persistence latency offering higher read throughput (1.17-1.4x) and lower space amplification (2.1-9.8x), with a modest increase in write amplification (between 4% and 25%). In addition, Lethe supports efficient range deletes on a secondary delete key by dropping entire data pages without sacrificing read performance nor employing a costly full tree merge.

Skip Supplemental Material Section

Supplemental Material

3318464.3389757.mp4

mp4

140.7 MB

References

  1. California Consumer Privacy Act of 2018. Assembly Bill No. 375, Chapter 55, 2018.Google ScholarGoogle Scholar
  2. T. Akidau, R. Bradshaw, C. Chambers, S. Chernyak, R. Ferná ndez-Moctezuma, R. Lax, S. McVeety, D. Mills, F. Perry, E. Schmidt, and S. Whittle. The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing. Proceedings of the VLDB Endowment, 8(12):1792--1803, 2015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. I. Alagiannis, S. Idreos, and A. Ailamaki. H2O: A Hands-free Adaptive Store. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 1103--1114, 2014.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Apache. Accumulo. https://accumulo.apache.org/.Google ScholarGoogle Scholar
  5. Apache. Cassandra. http://cassandra.apache.org.Google ScholarGoogle Scholar
  6. Apache. HBase. http://hbase.apache.org/.Google ScholarGoogle Scholar
  7. R. Appuswamy, M. Karpathiotakis, D. Porobic, and A. Ailamaki. The Case For Heterogeneous HTAP. In Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR), 2017.Google ScholarGoogle Scholar
  8. C. Aravindan and P. Baumgartner. A Rational and Efficient Algorithm for View Deletion in Databases. In Proceedings of the International Symposium on Logic Programming (ILPS), pages 165--179, 1997.Google ScholarGoogle Scholar
  9. J. Arulraj, A. Pavlo, and P. Menon. Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 583--598, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. M. Athanassoulis and A. Ailamaki. BF-Tree: Approximate Tree Indexing. Proceedings of the VLDB Endowment, 7(14):1881--1892, 2014.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. Athanassoulis, S. Chen, A. Ailamaki, P. B. Gibbons, and R. Stoica. MaSM: Efficient Online Updates in Data Warehouses. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 865--876, 2011.Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. M. Athanassoulis, S. Chen, A. Ailamaki, P. B. Gibbons, and R. Stoica. Online Updates on Data Warehouses via Judicious Use of Solid-State Storage. ACM Transactions on Database Systems (TODS), 40(1), 2015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. M. Athanassoulis, M. S. Kester, L. M. Maas, R. Stoica, S. Idreos, A. Ailamaki, and M. Callaghan. Designing Access Methods: The RUM Conjecture. In Proceedings of the International Conference on Extending Database Technology (EDBT), pages 461--466, 2016.Google ScholarGoogle Scholar
  14. B. Bhattacharjee, T. Malkemus, S. Lau, S. Mckeough, J.-A. Kirton, R. V. Boeschoten, and J. Kennedy. Efficient Bulk Deletes for Multi Dimensionally Clustered Tables in DB2. In Proceedings of the International Conference on Very Large Data Bases (VLDB), pages 1197--1206, 2007.Google ScholarGoogle Scholar
  15. M. Callaghan. Deletes are fast and slow in an LSM. http://smalldatum.blogspot.com/2020/01/deletes-are-fast-and-slow-in-lsm.html, 2020.Google ScholarGoogle Scholar
  16. Z. Cao, S. Dong, S. Vemuri, and D. H. C. Du. Characterizing, Modeling, and Benchmarking RocksDB Key-Value Workloads at Facebook. In 18th USENIX Conference on File and Storage Technologies (FAST 20), pages 209--223, 2020.Google ScholarGoogle Scholar
  17. F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A Distributed Storage System for Structured Data. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 205--218, 2006.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Cisco. Cisco Global Cloud Index: Forecast and Methodology, 2016--2021. White Paper, 2018.Google ScholarGoogle Scholar
  19. J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. C. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Google's Globally-Distributed Database. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 251--264, 2012.Google ScholarGoogle Scholar
  20. B. Dageville, T. Cruanes, M. Zukowski, V. Antonov, A. Avanes, J. Bock, J. Claybaugh, D. Engovatov, M. Hentschel, J. Huang, A. W. Lee, A. Motivala, A. Q. Munir, S. Pelley, P. Povinec, G. Rahn, S. Triantafyllis, and P. Unterbrunner. The Snowflake Elastic Data Warehouse. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 215--226, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. N. Dayan, M. Athanassoulis, and S. Idreos. Monkey: Optimal Navigable Key-Value Store. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 79--94, 2017.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. N. Dayan, M. Athanassoulis, and S. Idreos. Optimal Bloom Filters and Adaptive Merging for LSM-Trees. ACM Transactions on Database Systems (TODS), 43(4):16:1--16:48, 2018.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. N. Dayan and S. Idreos. Dostoevsky: Better Space-Time Trade-Offs for LSM-Tree Based Key-Value Stores via Adaptive Removal of Superfluous Merging. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 505--520, 2018.Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. N. Dayan and S. Idreos. The Log-Structured Merge-Bush & the Wacky Continuum. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 449--466, 2019.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon's Highly Available Key-value Store. ACM SIGOPS Operating Systems Review, 41(6):205--220, 2007.Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. A. Deshpande and A. Machanavajjhala. ACM SIGMOD Blog: Privacy Challenges in the Post-GDPR World: A Data Management Perspective. http://wp.sigmod.org/?p=2554, 2018.Google ScholarGoogle Scholar
  27. S. Dong, M. Callaghan, L. Galanis, D. Borthakur, T. Savor, and M. Strum. Optimizing Space Amplification in RocksDB. In Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR), 2017.Google ScholarGoogle Scholar
  28. Facebook. MyRocks. http://myrocks.io/.Google ScholarGoogle Scholar
  29. Facebook. RocksDB. https://github.com/facebook/rocksdb.Google ScholarGoogle Scholar
  30. Gartner. Gartner Says 8.4 Billion Connected "Things" Will Be in Use in 2017, Up 31 Percent From 2016. https://tinyurl.com/Gartner2020, 2017.Google ScholarGoogle Scholar
  31. A. G"a rtner, A. Kemper, D. Kossmann, and B. Zeller. Efficient Bulk Deletes in Relational Databases. In Proceedings of the IEEE International Conference on Data Engineering (ICDE), pages 183--192, 2001.Google ScholarGoogle Scholar
  32. R. Geambasu, T. Kohno, A. A. Levy, and H. M. Levy. Vanish: Increasing Data Privacy with Self-Destructing Data. In Proceedings of the USENIX Security Symposium, pages 299--316, 2009.Google ScholarGoogle Scholar
  33. R. Geambasu, A. A. Levy, T. Kohno, A. Krishnamurthy, and H. M. Levy. Comet: An active distributed key-value store. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 323--336, 2010.Google ScholarGoogle Scholar
  34. M. Goddard. The EU General Data Protection Regulation (GDPR): European Regulation that has a Global Impact. International Journal of Market Research, 59(6):703--705, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  35. G. Golan-Gueta, E. Bortnikov, E. Hillel, and I. Keidar. Scaling Concurrent Log-Structured Data Stores. In Proceedings of the ACM European Conference on Computer Systems (EuroSys), pages 32:1--32:14, 2015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Google. LevelDB. https://github.com/google/leveldb/.Google ScholarGoogle Scholar
  37. T. Heinis and A. Ailamaki. Reconsolidating Data Structures. In Proceedings of the International Conference on Extending Database Technology (EDBT), pages 665--670, 2015.Google ScholarGoogle Scholar
  38. S. Hé man, M. Zukowski, and N. J. Nes. Positional Update Handling in Column Stores. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 543--554, 2010.Google ScholarGoogle Scholar
  39. G. Huang, X. Cheng, J. Wang, Y. Wang, D. He, T. Zhang, F. Li, S. Wang, W. Cao, and Q. Li. X-Engine: An Optimized Storage Engine for Large-scale E-commerce Transaction Processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 651--665, 2019.Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Y. Huang, T. Z. J. Fu, D.-M. Chiu, J. C. S. Lui, and C. Huang. Challenges, design and analysis of a large-scale p2p-vod system. In Proceedings of the ACM SIGCOMM 2008 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, Seattle, WA, USA, August 17--22, 2008, pages 375--388, 2008.Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. F. Hueske. State TTL for Apache Flink: How to Limit the Lifetime of State. Ververica, 2018.Google ScholarGoogle Scholar
  42. M. L. Kersten. Big Data Space Fungus. In Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR), Gong show talk, 2015.Google ScholarGoogle Scholar
  43. S. Kulkarni, N. Bhagat, M. Fu, V. Kedigehalli, C. Kellogg, S. Mittal, J. M. Patel, K. Ramasamy, and S. Taneja. Twitter Heron: Stream Processing at Scale. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 239--250, 2015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. A. Lamb, M. Fuller, and R. Varadarajan. The Vertica Analytic Database: C-Store 7 Years Later. Proceedings of the VLDB Endowment, 5(12):1790--1801, 2012.Google ScholarGoogle Scholar
  45. J.-T. Lee and G. G. Belford. An Efficient Object-based Algorithm for Spatial Searching, Insertion and Deletion. In Proceedings of the IEEE International Conference on Data Engineering (ICDE), pages 40--47, 1992.Google ScholarGoogle Scholar
  46. Y. Li, B. He, J. Yang, Q. Luo, K. Yi, and R. J. Yang. Tree Indexing on Solid State Drives. Proceedings of the VLDB Endowment, 3(1--2):1195--1206, 2010.Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. T. Lilja, R. Saikkonen, S. Sippu, and E. Soisalon-Soininen. Online Bulk Deletion. In Proceedings of the IEEE International Conference on Data Engineering (ICDE), pages 956--965, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  48. LinkedIn. Voldemort. http://www.project-voldemort.com.Google ScholarGoogle Scholar
  49. H. Lu. Peer-to-Peer Support for Massively Multiplayer Games. In Proceedings IEEE INFOCOM 2004, The 23rd Annual Joint Conference of the IEEE Computer and Communications Societies, Hong Kong, China, March 7--11, 2004, 2004.Google ScholarGoogle Scholar
  50. C. Luo and M. J. Carey. LSM-based Storage Techniques: A Survey. The VLDB Journal, 2019.Google ScholarGoogle Scholar
  51. C. Mohan. Hybrid Transaction and Analytics Processing (HTAP): State of the Art. In Proceedings of the International Workshop on Business Intelligence for the Real-Time Enterprise (BIRTE), 2016.Google ScholarGoogle Scholar
  52. MongoDB. Online reference. http://www.mongodb.com/.Google ScholarGoogle Scholar
  53. P. E. O'Neil, E. Cheng, D. Gawlick, and E. J. O'Neil. The log-structured merge-tree (LSM-tree). Acta Informatica, 33(4):351--385, 1996.Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. F. Ö zcan, Y. Tian, and P. Tö zü n. Hybrid Transactional/Analytical Processing: A Survey. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 1771--1775, 2017.Google ScholarGoogle Scholar
  55. A. Pantelopoulos and N. G. Bourbakis. Prognosis: a wearable health-monitoring system for people at risk: methodology and modeling. IEEE Trans. Information Technology in Biomedicine, 14(3):613--621, 2010.Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. S. Papadopoulos, K. Datta, S. Madden, and T. Mattson. The TileDB Array Data Storage Manager. Proceedings of the VLDB Endowment, 10(4):349--360, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. P. Raju, R. Kadekodi, V. Chidambaram, and I. Abraham. PebblesDB: Building Key-Value Stores using Fragmented Log-Structured Merge Trees. In Proceedings of the ACM Symposium on Operating Systems Principles (SOSP), pages 497--514, 2017.Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. RocksDB. DeleteRange: A New Native RocksDB Operation. https://rocksdb.org/blog/2018/11/21/delete-range.html, 2018.Google ScholarGoogle Scholar
  59. T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes Twitter users: real-time event detection by social sensors. In Proceedings of the 19th International Conference on World Wide Web, WWW 2010, Raleigh, North Carolina, USA, April 26--30, 2010, pages 851--860, 2010.Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. S. Sarkar, J.-P. Banâ tre, L. Rilling, and C. Morin. Towards Enforcement of the EU GDPR: Enabling Data Erasure. In Proceedings of the IEEE International Conference of Internet of Things (iThings), pages 1--8, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  61. R. Sears and R. Ramakrishnan. bLSM: A General Purpose Log Structured Merge Tree. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 217--228, 2012.Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. SQLite4. Online reference. https://sqlite.org/src4/.Google ScholarGoogle Scholar
  63. M. Stonebraker, D. J. Abadi, A. Batkin, X. Chen, M. Cherniack, M. Ferreira, E. Lau, A. Lin, S. R. Madden, E. J. O'Neil, P. E. O'Neil, A. Rasin, N. Tran, and S. Zdonik. C-Store: A Column-oriented DBMS. In Proceedings of the International Conference on Very Large Data Bases (VLDB), pages 553--564, 2005.Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. R. Thonangi and J. Yang. On Log-Structured Merge for Solid-State Drives. In Proceedings of the IEEE International Conference on Data Engineering (ICDE), pages 683--694, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  65. Q.-C. To, J. Soto, and V. Markl. A Survey of State Management in Big Data Processing Systems. CoRR, abs/1702.0, 2017.Google ScholarGoogle Scholar
  66. P. Van Sandt, Y. Chronis, and J. M. Patel. Efficiently Searching In-Memory Sorted Arrays: Revenge of the Interpolation Search? In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 36--53, 2019.Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. Z. Whittaker and N. Lomas. Even years later, Twitter doesn't delete your direct messages. https://techcrunch.com/2019/02/15/twitter-direct-messages/, 2019.Google ScholarGoogle Scholar
  68. WiredTiger. Source Code. https://github.com/wiredtiger/wiredtiger.Google ScholarGoogle Scholar
  69. H. Zhang, H. Lim, V. Leis, D. G. Andersen, M. Kaminsky, K. Keeton, and A. Pavlo. SuRF: Practical Range Query Filtering with Fast Succinct Tries. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 323--336, 2018.Google ScholarGoogle ScholarDigital LibraryDigital Library
  70. M. Zukowski and P. A. Boncz. Vectorwise: Beyond Column Stores. IEEE Data Engineering Bulletin, 35(1):21--27, 2012.Google ScholarGoogle Scholar

Index Terms

  1. Lethe: A Tunable Delete-Aware LSM Engine

              Recommendations

              Comments

              Login options

              Check if you have access through your login credentials or your institution to get full access on this article.

              Sign in

              PDF Format

              View or Download as a PDF file.

              PDF

              eReader

              View online with eReader.

              eReader