Abstract
A unique merit of a solid-state drive (SSD) is its internal parallelism. In this article, we present a comprehensive set of studies on understanding and exploiting the internal parallelism of SSDs. Through extensive experiments and thorough analysis, we show that exploiting internal parallelism can substantially improve input/output (I/O) performance, but it may also lead to some surprising side effects and dynamics. For example, we find that with parallel I/Os, SSD performance is no longer highly sensitive to access patterns (random or sequential) but rather to other factors, such as data access interference and physical data layout. Many of our prior understandings about SSDs also need to be reconsidered. For example, we find that with parallel I/Os, writes can outperform reads and write performance is largely independent of access patterns, which is contrary to the long-held common understanding that random writes on SSDs are slow. We have also observed strong interference between concurrent reads and writes, as well as an impact of physical data layout on parallel I/O performance. Based on these findings, we present a set of case studies in database management systems, a typical data-intensive application. Our case studies show that exploiting internal parallelism is not only the key to enhancing application performance but, more importantly, also fundamentally changes the equation for optimizing applications. This calls for a careful reconsideration of various aspects of application and system design. Furthermore, we present a set of experimental studies on new-generation SSDs and on the interaction between internal and external parallelism in SSD-based Redundant Array of Independent Disks (RAID) storage. Based on these critical findings, we finally make a set of recommendations to system architects and application designers for effectively exploiting internal parallelism.
Index Terms
- Internal Parallelism of Flash Memory-Based Solid-State Drives