ABSTRACT
Storage devices have evolved to offer increasingly faster read/write access, through flash-based and other solid-state storage technologies. When compared to classical rotating hard disk drives (HDDs), modern solid-state drives (SSDs) have two key differences: (i) the absence of mechanical parts, and (ii) an inherent difference between the process of reading and writing. The former removes a key performance bottleneck, enabling internal device parallelism, whereas the latter manifests as a read/write performance asymmetry. In other words, SSDs can serve multiple concurrent I/Os, and their writes are generally slower than reads; none of which is true for HDDs. Yet, the performance of storage-resident applications is typically modeled by the number of disk accesses performed, inherently assuming symmetric read and write performance and the ability to perform only one I/O at a time, failing to accurately capture the performance of modern storage devices.
To address this mismatch, we propose a simple yet expressive storage model, termed Parametric I/O Model (PIO) that captures contemporary devices by parameterizing read/write asymmetry (α) and access concurrency (k). PIO enables device-specific decisions at algorithm design time, rather than as an optimization during deployment and testing, thus ensuring optimal algorithm design by taking into account the properties of each device. We present a benchmarking of several storage devices that shows that α and k vary significantly across devices. Further, we show that using carefully quantified values of α and k for each storage device, we can fully exploit the performance it offers, and we lay the groundwork for asymmetry/concurrency-aware storage-intensive algorithms. We also highlight that the degree of the performance benefit due to concurrent reads or writes depends on the asymmetry of the underlying device. Finally, we summarize our findings as a set of guidelines for designing storage-intensive algorithms and discuss specific examples for better algorithm and system designs as well as runtime tuning.
- Alok Aggarwal and Jeffrey Scott Vitter. 1988. The Input/Output Complexity of Sorting and Related Problems. Commun. ACM 31, 9 (1988), 1116--1127.Google ScholarDigital Library
- Nitin Agrawal, Vijayan Prabhakaran, Ted Wobber, John D. Davis, Mark Manasse, and Rina Panigrahy. 2008. Design Tradeoffs for SSD Performance. In Proceedings of the USENIX Annual Technical Conference (ATC). 57--70.Google ScholarDigital Library
- Manos Athanassoulis and Anastasia Ailamaki. 2014. BF-Tree: Approximate Tree Indexing. Proceedings of the VLDB Endowment 7, 14 (2014), 1881--1892.Google ScholarDigital Library
- Manos Athanassoulis, Bishwaranjan Bhattacharjee, Mustafa Canim, and Kenneth A. Ross. 2012. Path Processing using Solid State Storage. In Proceedings of the International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (ADMS). 23--32.Google Scholar
- R Bishnoi, M Ebrahimi, F Oboril, and M B Tahoori. 2016. Improving Write Performance for STT-MRAM. IEEE Transactions on Magnetics 52, 8 (2016), 1--11.Google ScholarCross Ref
- Guy E Blelloch, Jeremy T Fineman, Phillip B Gibbons, Yan Gu, and Julian Shun. 2015. Sorting with Asymmetric Read and Write Costs. In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 1--12.Google ScholarDigital Library
- Guy E Blelloch, Jeremy T Fineman, Phillip B Gibbons, Yan Gu, and Julian Shun. 2016. Efficient Algorithms with Asymmetric Read and Write Costs. In Proceedings of the Annual European Symposium on Algorithms (ESA). 14:1--14:18.Google Scholar
- E Chen, D Apalkov, Z Diao, A Driskill-Smith, D Druist, D Lottis, V Nikitin, X Tang, S Watts, S Wang, S A Wolf, A W Ghosh, J W Lu, S J Poon, M Stan, W H Butler, S Gupta, C K A Mewes, T Mewes, and P B Visscher. 2010. Advances and Future Prospects of Spin-Transfer Torque Random Access Memory. IEEE Transactions on Magnetics 46, 6 (2010), 1873--1878.Google ScholarCross Ref
- Feng Chen, Binbing Hou, and Rubao Lee. 2016. Internal Parallelism of Flash Memory-Based Solid-State Drives. ACM Transactions on Storage (TOS) 12, 3 (2016), 13:1--13:39.Google Scholar
- Feng Chen, Rubao Lee, and Xiaodong Zhang. 2011. Essential roles of exploiting internal parallelism of flash memory based solid state drives in high-speed data processing. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA). 266--277.Google ScholarCross Ref
- Shimin Chen, Phillip B. Gibbons, and Suman Nath. 2011. Rethinking Database Algorithms for Phase Change Memory. In Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR).Google Scholar
- Shimin Chen and Qin Jin. 2015. Persistent B+-Trees in Non-Volatile Main Memory. Proceedings of the VLDB Endowment 8, 7 (2015), 786--797.Google ScholarDigital Library
- Youngdon Choi, Ickhyun Song, Mu Hui Park, Hoeju Chung, Sanghoan Chang, Beakhyoung Cho, Jinyoung Kim, Younghoon Oh, Duckmin Kwon, Jung Sunwoo, Junho Shin, Yoohwan Rho, Changsoo Lee, Min Gu Kang, Jaeyun Lee, Yongjin Kwon, Soehee Kim, Jaehwan Kim, Yong Jun Lee, Qi Wang, Sooho Cha, Sujin Ahn, Hideki Horii, Jaewook Lee, Kisung Kim, Hansung Joo, Kwangjin Lee, Yeong Taek Lee, Jeihwan Yoo, and Gitae Jeong. 2012. A 20nm 1.8V 8Gb PRAM with 40MB/s program bandwidth. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC). 46--48.Google ScholarCross Ref
- Michael Cornwell. 2012. Anatomy of a Solid-State Drive. Communications of the ACM(CACM) 55, 12 (2012), 59--63.Google ScholarDigital Library
- Niv Dayan, Manos Athanassoulis, and Stratos Idreos. 2017. Monkey: Optimal Navigable Key-Value Store. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 79--94.Google ScholarDigital Library
- I B M DB2. 2021. Configuring storage for performance. https://www.ibm.com/docs/en/db2-for-zos/12?topic=performance-configuring-storage (2021).Google Scholar
- Diego Didona, Nikolas Ioannou, Radu Stoica, and Kornilios Kourtis. 2020. Toward a Better Understanding and Evaluation of Tree Structures on Flash SSDs. Proceedings of the VLDB Endowment 14, 3 (2020), 364--377.Google ScholarDigital Library
- Fio. 2021. Flexible I/O tester. https://fio.readthedocs.io/en/latest/ (2021).Google Scholar
- Eran Gal and Sivan Toledo. 2005. Algorithms and data structures for flash memories. Comput. Surveys 37, 2 (2005), 138--163.Google ScholarDigital Library
- Yan Gu, Yihan Sun, and Guy E Blelloch. 2018. Algorithmic building blocks for asymmetric memories. In Leibniz International Proceedings in Informatics, LIPIcs, Vol. 112. 44:1--44:15.Google Scholar
- Laura M Haas, Michael J Carey, Miron Livny, and Amit Shukla. 1997. Seeking the Truth About ad hoc Join Costs. The VLDB Journal 6, 3 (1997), 241--256.Google ScholarDigital Library
- Frank T Hady, Annie P Foong, Bryan Veal, and Dan Williams. 2017. Platform Storage Performance With 3D XPoint Technology. Proc. IEEE 105, 9 (2017), 1822--1833.Google ScholarCross Ref
- MySQL InnoDB. 2021. Configuring Buffer Pool Flushing. https://dev.mysql.com/doc/refman/8.0/en/innodb-buffer-pool-flushing.html (2021).Google Scholar
- Intel. 2016. Intel® Optane™ Technology. https://www.intel.com/content/www/us/en/architecture-and-technology/intel-optane-technology.html (2016).Google Scholar
- Jon L. Jacobi. 2019. NVMe SSDs: Everything you need to know about this insanely fast storage. https://www.pcworld.com/article/2899351/everything-you-need-to-know-about-nvme.html (2019).Google Scholar
- Aarati Kakaraparthy, Jignesh M Patel, Kwanghyun Park, and Brian Kroth. 2019. Optimizing Databases by Learning Hidden Parameters of Solid State Drives. Proceedings of the VLDB Endowment 13, 4 (2019), 519--532.Google ScholarDigital Library
- Jeong-Uk Kang, Heeseung Jo, Jinsoo Kim, and Joonwon Lee. 2006. A superblock-based flash translation layer for NAND flash memory. In Proceedings of the 6th ACM & IEEE International conference on Embedded software, EMSOFT 2006, October 22-25, 2006, Seoul, Korea. 161--170.Google ScholarDigital Library
- Sudarsun Kannan, Andrea C Arpaci-Dusseau, Remzi H Arpaci-Dusseau, Yuangang Wang, Jun Xu, and Gopinath Palani. 2018. Designing a True Direct-Access File System with DevFS. In Proceedings of the USENIX Conference on File and Storage Technologies (FAST). 241--256.Google Scholar
- Sang-Won Lee, Dong-Joo Park, Tae-Sun Chung, Dong-Ho Lee, Sangwon Park, and Ha-Joo Song. 2007. A log buffer-based flash translation layer using fully-associative sector translation. ACM Transactions on Embedded Computing Systems (TECS) 6, 3 (2007).Google ScholarDigital Library
- Yinan Li, Bingsheng He, Jun Yang, Qiong Luo, Ke Yi, and Robin Jun Yang. 2010. Tree Indexing on Solid State Drives. Proceedings of the VLDB Endowment 3, 1-2 (2010), 1195--1206.Google ScholarDigital Library
- Jihang Liu, Shimin Chen, and Lujun Wang. 2020. LB+-Trees: Optimizing Persistent Index Performance on 3DXPoint Memory. Proceedings of the VLDB Endowment 13, 7 (2020), 1078--1090.Google ScholarDigital Library
- Chen Luo and Michael J. Carey. 2020. LSM-based Storage Techniques: A Survey. The VLDB Journal 29, 1 (2020), 393--418.Google ScholarDigital Library
- Bo Mao, Suzhen Wu, and Lide Duan. 2018. Improving the SSD Performance by Exploiting Request Characteristics and Internal Parallelism. IEEE Trans. on CAD of Integrated Circuits and Systems 37, 2 (2018), 472--484.Google ScholarDigital Library
- Jagan Singh Meena, Simon Min Sze, Umesh Chand, and Tseung-Yuen Tseng. 2014. Overview of emerging nonvolatile memory technologies. Nanoscale Research Letters 9, 1 (2014), 526.Google ScholarCross Ref
- Suman Nath and Phillip B. Gibbons. 2008. Online Maintenance of Very Large Random Samples on Flash Storage. Proceedings of the VLDB Endowment 1, 1 (2008), 970--983.Google ScholarDigital Library
- Suman Nath and Aman Kansal. 2007. FlashDB: dynamic self-tuning database for NAND flash. Proceedings of the International Symposium on Information Processing in Sensor Networks (IPSN) (2007).Google ScholarDigital Library
- Patrick E. O'Neil, Edward Cheng, Dieter Gawlick, and Elizabeth J. O'Neil. 1996. The log-structured merge-tree (LSM-tree). Acta Informatica 33, 4 (1996), 351--385.Google ScholarDigital Library
- Tarikul Islam Papon and Manos Athanassoulis. 2021. The Need for a New I/O Model. In Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR).Google Scholar
- Hyoungmin Park and Kyuseok Shim. 2009. FAST:Flash-aware external sorting for mobile database systems. Journal of Systems and Software 82, 8 (2009), 1298--1312.Google ScholarDigital Library
- Stan Park and Kai Shen. 2009. A performance evaluation of scientific I/O workloads on Flash-based SSDs. In Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER). 1--5.Google ScholarCross Ref
- Stan Park and Kai Shen. 2012. FIOS: a fair, efficient flash I/O scheduler. In Proceedings of the USENIX Conference on File and Storage Technologies (FAST). 13.Google Scholar
- Georgios Psaropoulos, Ismail Oukid, Thomas Legler, Norman May, and Anastasia Ailamaki. 2019. Bridging the Latency Gap between NVM and DRAM for Latency-bound Operations. In Proceedings of the International Workshop on Data Management on New Hardware (DAMON). 13:1--13:8.Google ScholarDigital Library
- Simone Raoux, Geoffrey W Burr, Matthew J Breitwisch, Charles T Rettner, Yi-Chou Chen, Robert M Shelby, Martin Salinga, Daniel Krebs, Shih-Hung Chen, Hsiang-Lan Lung, and Chung Hon Lam. 2008. Phase-change random access memory: A scalable technology. IBM Journal of Research and Development 52, 4-5 (2008), 465--480.Google ScholarDigital Library
- Hongchan Roh, Sanghyun Park, Sungho Kim, Mincheol Shin, and Sang-Won Lee. 2011. B+-Tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives. Proceedings of the VLDB Endowment 5, 4 (2011), 286--297.Google ScholarDigital Library
- Guido De Sandre, Luca Bettini, Alessandro Pirola, Lionel Marmonier, Marco Pasotti, Massimo Borghi, Paolo Mattavelli, Paola Zuliani, Luca Scotti, Gianfranco Mastracchio, Ferdinando Bedeschi, Roberto Gastaldi, and Roberto Bez. 2011. A 4 Mb LV MOS-Selected Embedded Phase Change Memory in 90 nm Standard CMOS Technology. J. Solid-State Circuits 46, 1 (2011), 52--63.Google ScholarCross Ref
- Subhadeep Sarkar, Tarikul Islam Papon, Dimitris Staratzis, and Manos Athanassoulis. 2020. Lethe: A Tunable Delete-Aware LSM Engine. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 893--908.Google ScholarDigital Library
- Jinho Seol, Hyotaek Shim, Jaegeuk Kim, and Seungryoul Maeng. 2009. A buffer replacement algorithm exploiting multi-chip parallelism in solid state disks. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES). 137--146.Google ScholarDigital Library
- Arman Shehabi, Sarah Smith, Dale Sartor, Richard Brown, Magnus Herrlin, Jonathan Koomey, Eric Masanet, Nathaniel Horner, Inês Azevedo, and William Lintner. 2016. United States Data Center Energy Usage Report. Ernest Orlando Lawrence Berkeley National Laboratory LBNL-10057 (2016).Google Scholar
- Kai Shen and Stan Park. 2013. FlashFQ: A Fair Queueing I/O Scheduler for Flash-Based SSDs. In Proceedings of the USENIX Annual Technical Conference (ATC). 67--78.Google Scholar
- SPDK. 2016. Storage Performance Development Kit (SPDK). https://spdk.io (2016).Google Scholar
- Dmitri B. Strukov, Gregory S. Snider, Duncan R. Stewart, and R. Stanley Williams. 2008. The missing memristor found. Nature 453, 7191 (2008), 80--83.Google Scholar
- Stratis D. Viglas. 2012. Adapting the B +-tree for asymmetric I/O. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 7503 LNCS. 399--412.Google Scholar
- Kan Wu, Andrea C Arpaci-Dusseau, and Remzi H Arpaci-Dusseau. 2019. Towards an Unwritten Contract of Intel Optane SSD. In Proceedings of the USENIX Conference on Hot Topics in Storage and File Systems (HotStorage).Google Scholar
- Jian Xu and Steven Swanson. 2016. NOVA: A Log-structured File System for Hybrid Volatile/Non-volatile Main Memories. In Proceedings of the USENIX Conference on File and Storage Technologies (FAST). 323--338.Google ScholarDigital Library
- Qiumin Xu, Huzefa Siyamwala, Mrinmoy Ghosh, Tameesh Suri, Manu Awasthi, Zvika Guz, Anahita Shayesteh, and Vijay Balakrishnan. 2015. Performance analysis of NVMe SSDs and their implication on real world databases. In Proceedings of the ACM International Systems and Storage Conference (SYSTOR). 6:1--6:11.Google ScholarDigital Library
- Jianhui Yue and Yifeng Zhu. 2013. Accelerating write by exploiting PCM asymmetries. In 19th IEEE International Symposium on High Performance Computer Architecture, HPCA 2013, Shenzhen, China, February 23-27, 2013. 282--293.Google ScholarDigital Library
Recommendations
Prism: Optimizing Key-Value Store for Modern Heterogeneous Storage Devices
ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2As data generation has been on an upward trend, storing vast volumes of data cost-effectively as well as efficiently accessing them is paramount. At the same time, today's storage landscape continues to diversify, from high-bandwidth storage devices ...
An analytic model of hierarchical mass storage systems with network-attached storage devices
Network attached storage devices improve I/O performance by separating control and data paths and eliminating host intervention during data transfer. Devices are attached to a high speed network for data transfer and to a slower network for control ...
A low-latency storage stack for fast storage devices
Modern storage systems are facing an important challenge of making the best use of fast storage devices. Even though the underlying storage devices are being enhanced, the traditional storage stack falls short of utilizing the enhanced characteristics, ...
Comments