research-article

Optimizing databases by learning hidden parameters of solid state drives

Published: 09 December 2019

Abstract

Solid State Drives (SSDs) are complex devices with varying internal implementations, resulting in subtle differences in behavior between devices. In this paper, we demonstrate how a database engine can be optimized for a particular device by learning its hidden parameters. This can not only improve an application's performance, but also potentially increase the lifetime of the SSD. Our approach for optimizing a database for a given SSD consists of three steps: learning the hidden parameters of the device, proposing rules to analyze the I/O behavior of the database, and optimizing the database by eliminating violations of these rules.

We obtain two characteristics of an SSD, the request size profile and the location profile, from which we learn multiple internal parameters. Based on these parameters, we propose rules to analyze the I/O behavior of a database engine. Using these rules, we uncover sub-optimal I/O patterns in SQLite3 and MariaDB when running on our experimental SSDs. Finally, we present three techniques to optimize these database engines: (1) use-hot-locations on SSD-S, which improves the SELECT operation throughput of SQLite3 and MariaDB by 29% and 27% respectively, and improves the performance of YCSB on MariaDB by 1%-22% depending on the workload mix; (2) write-aligned-stripes on SSD-T, which reduces the wear-out caused by the SQLite3 write-ahead log (WAL) file by 3.1%; and (3) contain-write-in-flash-page on SSD-T, which reduces the wear-out caused by the MariaDB binary log file by 6.7%.
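To make the idea of rule-based I/O analysis concrete, the following is a minimal sketch of how a write trace could be checked against two rules suggested by the technique names above. It is not the paper's implementation: the abstract does not define the rules, so the interpretations here (a small write should not straddle a flash page; a write should start and end on stripe boundaries) are assumptions, as are the `Write` type, the function names, and the page and stripe sizes used in the example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Write:
    """One write request from a hypothetical I/O trace."""
    offset: int  # byte offset on the device
    size: int    # length in bytes


def crosses_unit(w: Write, unit: int) -> bool:
    """Return True if the write touches more than one unit-sized region."""
    if w.size <= 0:
        return False
    first = w.offset // unit
    last = (w.offset + w.size - 1) // unit
    return first != last


def page_violations(trace: Iterable[Write], page_size: int) -> List[Write]:
    """Assumed rule: a write that fits in one flash page should stay in one page."""
    return [w for w in trace
            if 0 < w.size <= page_size and crosses_unit(w, page_size)]


def stripe_violations(trace: Iterable[Write], stripe_size: int) -> List[Write]:
    """Assumed rule: a write should begin and end on stripe boundaries."""
    return [w for w in trace
            if w.offset % stripe_size != 0 or w.size % stripe_size != 0]


# Hypothetical device parameters; real values would come from the learned profiles.
PAGE_SIZE = 16 * 1024
STRIPE_SIZE = 64 * 1024

trace = [
    Write(0, 4096),          # inside one page, but not a whole stripe
    Write(14336, 4096),      # small write that straddles a page boundary
    Write(65536, 65536),     # exactly one aligned stripe
    Write(4096, 65536),      # stripe-sized but misaligned
]

bad_pages = page_violations(trace, PAGE_SIZE)
bad_stripes = stripe_violations(trace, STRIPE_SIZE)
```

In practice the trace would be collected from the database engine's system calls, and the unit sizes would be the hidden parameters learned for the specific device rather than constants.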


  • Published in

    Proceedings of the VLDB Endowment, Volume 13, Issue 4
    December 2019
    167 pages
    ISSN: 2150-8097

    Publisher

    VLDB Endowment
