skip to main content
10.1145/3465998.3466003acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Public Access

A Parametric I/O Model for Modern Storage Devices

Published:20 June 2021Publication History

ABSTRACT

Storage devices have evolved to offer increasingly faster read/write access, through flash-based and other solid-state storage technologies. When compared to classical rotating hard disk drives (HDDs), modern solid-state drives (SSDs) have two key differences: (i) the absence of mechanical parts, and (ii) an inherent difference between the process of reading and writing. The former removes a key performance bottleneck, enabling internal device parallelism, whereas the latter manifests as a read/write performance asymmetry. In other words, SSDs can serve multiple concurrent I/Os, and their writes are generally slower than reads; none of which is true for HDDs. Yet, the performance of storage-resident applications is typically modeled by the number of disk accesses performed, inherently assuming symmetric read and write performance and the ability to perform only one I/O at a time, failing to accurately capture the performance of modern storage devices.

To address this mismatch, we propose a simple yet expressive storage model, termed Parametric I/O Model (PIO) that captures contemporary devices by parameterizing read/write asymmetry (α) and access concurrency (k). PIO enables device-specific decisions at algorithm design time, rather than as an optimization during deployment and testing, thus ensuring optimal algorithm design by taking into account the properties of each device. We present a benchmarking of several storage devices that shows that α and k vary significantly across devices. Further, we show that using carefully quantified values of α and k for each storage device, we can fully exploit the performance it offers, and we lay the groundwork for asymmetry/concurrency-aware storage-intensive algorithms. We also highlight that the degree of the performance benefit due to concurrent reads or writes depends on the asymmetry of the underlying device. Finally, we summarize our findings as a set of guidelines for designing storage-intensive algorithms and discuss specific examples for better algorithm and system designs as well as runtime tuning.

References

  1. Alok Aggarwal and Jeffrey Scott Vitter. 1988. The Input/Output Complexity of Sorting and Related Problems. Commun. ACM 31, 9 (1988), 1116--1127.Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Nitin Agrawal, Vijayan Prabhakaran, Ted Wobber, John D. Davis, Mark Manasse, and Rina Panigrahy. 2008. Design Tradeoffs for SSD Performance. In Proceedings of the USENIX Annual Technical Conference (ATC). 57--70.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Manos Athanassoulis and Anastasia Ailamaki. 2014. BF-Tree: Approximate Tree Indexing. Proceedings of the VLDB Endowment 7, 14 (2014), 1881--1892.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Manos Athanassoulis, Bishwaranjan Bhattacharjee, Mustafa Canim, and Kenneth A. Ross. 2012. Path Processing using Solid State Storage. In Proceedings of the International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (ADMS). 23--32.Google ScholarGoogle Scholar
  5. R Bishnoi, M Ebrahimi, F Oboril, and M B Tahoori. 2016. Improving Write Performance for STT-MRAM. IEEE Transactions on Magnetics 52, 8 (2016), 1--11.Google ScholarGoogle ScholarCross RefCross Ref
  6. Guy E Blelloch, Jeremy T Fineman, Phillip B Gibbons, Yan Gu, and Julian Shun. 2015. Sorting with Asymmetric Read and Write Costs. In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 1--12.Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Guy E Blelloch, Jeremy T Fineman, Phillip B Gibbons, Yan Gu, and Julian Shun. 2016. Efficient Algorithms with Asymmetric Read and Write Costs. In Proceedings of the Annual European Symposium on Algorithms (ESA). 14:1--14:18.Google ScholarGoogle Scholar
  8. E Chen, D Apalkov, Z Diao, A Driskill-Smith, D Druist, D Lottis, V Nikitin, X Tang, S Watts, S Wang, S A Wolf, A W Ghosh, J W Lu, S J Poon, M Stan, W H Butler, S Gupta, C K A Mewes, T Mewes, and P B Visscher. 2010. Advances and Future Prospects of Spin-Transfer Torque Random Access Memory. IEEE Transactions on Magnetics 46, 6 (2010), 1873--1878.Google ScholarGoogle ScholarCross RefCross Ref
  9. Feng Chen, Binbing Hou, and Rubao Lee. 2016. Internal Parallelism of Flash Memory-Based Solid-State Drives. ACM Transactions on Storage (TOS) 12, 3 (2016), 13:1--13:39.Google ScholarGoogle Scholar
  10. Feng Chen, Rubao Lee, and Xiaodong Zhang. 2011. Essential roles of exploiting internal parallelism of flash memory based solid state drives in high-speed data processing. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA). 266--277.Google ScholarGoogle ScholarCross RefCross Ref
  11. Shimin Chen, Phillip B. Gibbons, and Suman Nath. 2011. Rethinking Database Algorithms for Phase Change Memory. In Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR).Google ScholarGoogle Scholar
  12. Shimin Chen and Qin Jin. 2015. Persistent B+-Trees in Non-Volatile Main Memory. Proceedings of the VLDB Endowment 8, 7 (2015), 786--797.Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Youngdon Choi, Ickhyun Song, Mu Hui Park, Hoeju Chung, Sanghoan Chang, Beakhyoung Cho, Jinyoung Kim, Younghoon Oh, Duckmin Kwon, Jung Sunwoo, Junho Shin, Yoohwan Rho, Changsoo Lee, Min Gu Kang, Jaeyun Lee, Yongjin Kwon, Soehee Kim, Jaehwan Kim, Yong Jun Lee, Qi Wang, Sooho Cha, Sujin Ahn, Hideki Horii, Jaewook Lee, Kisung Kim, Hansung Joo, Kwangjin Lee, Yeong Taek Lee, Jeihwan Yoo, and Gitae Jeong. 2012. A 20nm 1.8V 8Gb PRAM with 40MB/s program bandwidth. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC). 46--48.Google ScholarGoogle ScholarCross RefCross Ref
  14. Michael Cornwell. 2012. Anatomy of a Solid-State Drive. Communications of the ACM(CACM) 55, 12 (2012), 59--63.Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Niv Dayan, Manos Athanassoulis, and Stratos Idreos. 2017. Monkey: Optimal Navigable Key-Value Store. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 79--94.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. I B M DB2. 2021. Configuring storage for performance. https://www.ibm.com/docs/en/db2-for-zos/12?topic=performance-configuring-storage (2021).Google ScholarGoogle Scholar
  17. Diego Didona, Nikolas Ioannou, Radu Stoica, and Kornilios Kourtis. 2020. Toward a Better Understanding and Evaluation of Tree Structures on Flash SSDs. Proceedings of the VLDB Endowment 14, 3 (2020), 364--377.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Fio. 2021. Flexible I/O tester. https://fio.readthedocs.io/en/latest/ (2021).Google ScholarGoogle Scholar
  19. Eran Gal and Sivan Toledo. 2005. Algorithms and data structures for flash memories. Comput. Surveys 37, 2 (2005), 138--163.Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Yan Gu, Yihan Sun, and Guy E Blelloch. 2018. Algorithmic building blocks for asymmetric memories. In Leibniz International Proceedings in Informatics, LIPIcs, Vol. 112. 44:1--44:15.Google ScholarGoogle Scholar
  21. Laura M Haas, Michael J Carey, Miron Livny, and Amit Shukla. 1997. Seeking the Truth About ad hoc Join Costs. The VLDB Journal 6, 3 (1997), 241--256.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Frank T Hady, Annie P Foong, Bryan Veal, and Dan Williams. 2017. Platform Storage Performance With 3D XPoint Technology. Proc. IEEE 105, 9 (2017), 1822--1833.Google ScholarGoogle ScholarCross RefCross Ref
  23. MySQL InnoDB. 2021. Configuring Buffer Pool Flushing. https://dev.mysql.com/doc/refman/8.0/en/innodb-buffer-pool-flushing.html (2021).Google ScholarGoogle Scholar
  24. Intel. 2016. Intel® Optane™ Technology. https://www.intel.com/content/www/us/en/architecture-and-technology/intel-optane-technology.html (2016).Google ScholarGoogle Scholar
  25. Jon L. Jacobi. 2019. NVMe SSDs: Everything you need to know about this insanely fast storage. https://www.pcworld.com/article/2899351/everything-you-need-to-know-about-nvme.html (2019).Google ScholarGoogle Scholar
  26. Aarati Kakaraparthy, Jignesh M Patel, Kwanghyun Park, and Brian Kroth. 2019. Optimizing Databases by Learning Hidden Parameters of Solid State Drives. Proceedings of the VLDB Endowment 13, 4 (2019), 519--532.Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Jeong-Uk Kang, Heeseung Jo, Jinsoo Kim, and Joonwon Lee. 2006. A superblock-based flash translation layer for NAND flash memory. In Proceedings of the 6th ACM & IEEE International conference on Embedded software, EMSOFT 2006, October 22-25, 2006, Seoul, Korea. 161--170.Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Sudarsun Kannan, Andrea C Arpaci-Dusseau, Remzi H Arpaci-Dusseau, Yuangang Wang, Jun Xu, and Gopinath Palani. 2018. Designing a True Direct-Access File System with DevFS. In Proceedings of the USENIX Conference on File and Storage Technologies (FAST). 241--256.Google ScholarGoogle Scholar
  29. Sang-Won Lee, Dong-Joo Park, Tae-Sun Chung, Dong-Ho Lee, Sangwon Park, and Ha-Joo Song. 2007. A log buffer-based flash translation layer using fully-associative sector translation. ACM Transactions on Embedded Computing Systems (TECS) 6, 3 (2007).Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Yinan Li, Bingsheng He, Jun Yang, Qiong Luo, Ke Yi, and Robin Jun Yang. 2010. Tree Indexing on Solid State Drives. Proceedings of the VLDB Endowment 3, 1-2 (2010), 1195--1206.Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Jihang Liu, Shimin Chen, and Lujun Wang. 2020. LB+-Trees: Optimizing Persistent Index Performance on 3DXPoint Memory. Proceedings of the VLDB Endowment 13, 7 (2020), 1078--1090.Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Chen Luo and Michael J. Carey. 2020. LSM-based Storage Techniques: A Survey. The VLDB Journal 29, 1 (2020), 393--418.Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Bo Mao, Suzhen Wu, and Lide Duan. 2018. Improving the SSD Performance by Exploiting Request Characteristics and Internal Parallelism. IEEE Trans. on CAD of Integrated Circuits and Systems 37, 2 (2018), 472--484.Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Jagan Singh Meena, Simon Min Sze, Umesh Chand, and Tseung-Yuen Tseng. 2014. Overview of emerging nonvolatile memory technologies. Nanoscale Research Letters 9, 1 (2014), 526.Google ScholarGoogle ScholarCross RefCross Ref
  35. Suman Nath and Phillip B. Gibbons. 2008. Online Maintenance of Very Large Random Samples on Flash Storage. Proceedings of the VLDB Endowment 1, 1 (2008), 970--983.Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Suman Nath and Aman Kansal. 2007. FlashDB: dynamic self-tuning database for NAND flash. Proceedings of the International Symposium on Information Processing in Sensor Networks (IPSN) (2007).Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Patrick E. O'Neil, Edward Cheng, Dieter Gawlick, and Elizabeth J. O'Neil. 1996. The log-structured merge-tree (LSM-tree). Acta Informatica 33, 4 (1996), 351--385.Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Tarikul Islam Papon and Manos Athanassoulis. 2021. The Need for a New I/O Model. In Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR).Google ScholarGoogle Scholar
  39. Hyoungmin Park and Kyuseok Shim. 2009. FAST:Flash-aware external sorting for mobile database systems. Journal of Systems and Software 82, 8 (2009), 1298--1312.Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Stan Park and Kai Shen. 2009. A performance evaluation of scientific I/O workloads on Flash-based SSDs. In Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER). 1--5.Google ScholarGoogle ScholarCross RefCross Ref
  41. Stan Park and Kai Shen. 2012. FIOS: a fair, efficient flash I/O scheduler. In Proceedings of the USENIX Conference on File and Storage Technologies (FAST). 13.Google ScholarGoogle Scholar
  42. Georgios Psaropoulos, Ismail Oukid, Thomas Legler, Norman May, and Anastasia Ailamaki. 2019. Bridging the Latency Gap between NVM and DRAM for Latency-bound Operations. In Proceedings of the International Workshop on Data Management on New Hardware (DAMON). 13:1--13:8.Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Simone Raoux, Geoffrey W Burr, Matthew J Breitwisch, Charles T Rettner, Yi-Chou Chen, Robert M Shelby, Martin Salinga, Daniel Krebs, Shih-Hung Chen, Hsiang-Lan Lung, and Chung Hon Lam. 2008. Phase-change random access memory: A scalable technology. IBM Journal of Research and Development 52, 4-5 (2008), 465--480.Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Hongchan Roh, Sanghyun Park, Sungho Kim, Mincheol Shin, and Sang-Won Lee. 2011. B+-Tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives. Proceedings of the VLDB Endowment 5, 4 (2011), 286--297.Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Guido De Sandre, Luca Bettini, Alessandro Pirola, Lionel Marmonier, Marco Pasotti, Massimo Borghi, Paolo Mattavelli, Paola Zuliani, Luca Scotti, Gianfranco Mastracchio, Ferdinando Bedeschi, Roberto Gastaldi, and Roberto Bez. 2011. A 4 Mb LV MOS-Selected Embedded Phase Change Memory in 90 nm Standard CMOS Technology. J. Solid-State Circuits 46, 1 (2011), 52--63.Google ScholarGoogle ScholarCross RefCross Ref
  46. Subhadeep Sarkar, Tarikul Islam Papon, Dimitris Staratzis, and Manos Athanassoulis. 2020. Lethe: A Tunable Delete-Aware LSM Engine. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 893--908.Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Jinho Seol, Hyotaek Shim, Jaegeuk Kim, and Seungryoul Maeng. 2009. A buffer replacement algorithm exploiting multi-chip parallelism in solid state disks. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES). 137--146.Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Arman Shehabi, Sarah Smith, Dale Sartor, Richard Brown, Magnus Herrlin, Jonathan Koomey, Eric Masanet, Nathaniel Horner, Inês Azevedo, and William Lintner. 2016. United States Data Center Energy Usage Report. Ernest Orlando Lawrence Berkeley National Laboratory LBNL-10057 (2016).Google ScholarGoogle Scholar
  49. Kai Shen and Stan Park. 2013. FlashFQ: A Fair Queueing I/O Scheduler for Flash-Based SSDs. In Proceedings of the USENIX Annual Technical Conference (ATC). 67--78.Google ScholarGoogle Scholar
  50. SPDK. 2016. Storage Performance Development Kit (SPDK). https://spdk.io (2016).Google ScholarGoogle Scholar
  51. Dmitri B. Strukov, Gregory S. Snider, Duncan R. Stewart, and R. Stanley Williams. 2008. The missing memristor found. Nature 453, 7191 (2008), 80--83.Google ScholarGoogle Scholar
  52. Stratis D. Viglas. 2012. Adapting the B +-tree for asymmetric I/O. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 7503 LNCS. 399--412.Google ScholarGoogle Scholar
  53. Kan Wu, Andrea C Arpaci-Dusseau, and Remzi H Arpaci-Dusseau. 2019. Towards an Unwritten Contract of Intel Optane SSD. In Proceedings of the USENIX Conference on Hot Topics in Storage and File Systems (HotStorage).Google ScholarGoogle Scholar
  54. Jian Xu and Steven Swanson. 2016. NOVA: A Log-structured File System for Hybrid Volatile/Non-volatile Main Memories. In Proceedings of the USENIX Conference on File and Storage Technologies (FAST). 323--338.Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Qiumin Xu, Huzefa Siyamwala, Mrinmoy Ghosh, Tameesh Suri, Manu Awasthi, Zvika Guz, Anahita Shayesteh, and Vijay Balakrishnan. 2015. Performance analysis of NVMe SSDs and their implication on real world databases. In Proceedings of the ACM International Systems and Storage Conference (SYSTOR). 6:1--6:11.Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. Jianhui Yue and Yifeng Zhu. 2013. Accelerating write by exploiting PCM asymmetries. In 19th IEEE International Symposium on High Performance Computer Architecture, HPCA 2013, Shenzhen, China, February 23-27, 2013. 282--293.Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in
  • Published in

    cover image ACM Conferences
    DAMON '21: Proceedings of the 17th International Workshop on Data Management on New Hardware
    June 2021
    104 pages
    ISBN:9781450385565
    DOI:10.1145/3465998

    Copyright © 2021 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 20 June 2021

    Permissions

    Request permissions about this article.

    Request Permissions

    Check for updates

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    DAMON '21 Paper Acceptance Rate15of17submissions,88%Overall Acceptance Rate80of102submissions,78%

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader