Published in: Cluster Computing, Issue 4/2020

11 December 2019

HasFS: optimizing file system consistency mechanism on NVM-based hybrid storage architecture

Authors: Yubo Liu, Hongbo Li, Yutong Lu, Zhiguang Chen, Nong Xiao, Ming Zhao


Abstract

To protect data across a system crash, file systems built for the traditional DRAM–DISK architecture (e.g., EXT4) must synchronize dirty metadata and data from memory to disk. A crash during this synchronization can leave the file system inconsistent, so these file systems rely on consistency mechanisms when dirty metadata and data are written to persistent storage devices (e.g., HDD and SSD). Journaling is one such mechanism and is widely used. We observe that the overhead of periodic disk synchronization and journaling is high, and that emerging non-volatile memories (NVMs) can potentially be used to reduce it. In this paper, we present HasFS (hybrid architecture for storage file system), a file system designed for the DRAM–NVM–DISK architecture. HasFS extends main memory with NVM and treats the NVM as a persistent page cache, which eliminates the overhead of periodically synchronizing dirty data to disk. On top of this hybrid memory architecture, we design an efficient consistency mechanism that provides a strong consistency guarantee, covering both metadata and data, with low overhead. The evaluation demonstrates that HasFS outperforms mainstream DRAM–DISK file systems for many workloads. For instance, in a random write workload HasFS improves performance by 1.6X to 46.6X over the other tested file systems. In particular, HasFS outperforms EXT4 without journaling in some cases, even though HasFS provides metadata and data consistency guarantees (similar to EXT4 in data journaling mode).
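
The trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the HasFS implementation. It assumes a simplified data-journaling file system that writes every dirty block to disk twice at sync time (once to the journal, once in place), and a simplified hybrid design in which a write is durable as soon as it reaches an NVM page cache. The class names `JournaledDiskFS` and `NvmCacheFS` and the function `demo` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class JournaledDiskFS:
    """Toy DRAM-DISK model with data journaling (assumption: 2 disk writes per block)."""
    dram_cache: Dict[int, bytes] = field(default_factory=dict)  # volatile
    disk: Dict[int, bytes] = field(default_factory=dict)        # persistent
    disk_block_writes: int = 0

    def write(self, block: int, data: bytes) -> None:
        self.dram_cache[block] = data          # not durable until sync()

    def sync(self) -> None:
        for block, data in self.dram_cache.items():
            self.disk_block_writes += 1        # copy into the journal
            self.disk[block] = data
            self.disk_block_writes += 1        # copy to the home location
        self.dram_cache.clear()

    def crash(self) -> None:
        self.dram_cache.clear()                # DRAM contents are lost

    def read(self, block: int) -> Optional[bytes]:
        if block in self.dram_cache:
            return self.dram_cache[block]
        return self.disk.get(block)


@dataclass
class NvmCacheFS:
    """Toy DRAM-NVM-DISK model: the NVM page cache itself is persistent."""
    nvm_cache: Dict[int, bytes] = field(default_factory=dict)   # persistent
    disk: Dict[int, bytes] = field(default_factory=dict)        # persistent
    disk_block_writes: int = 0

    def write(self, block: int, data: bytes) -> None:
        self.nvm_cache[block] = data           # durable in this model, no disk I/O

    def evict(self, block: int) -> None:
        # Disk is written only when a block leaves the NVM cache.
        self.disk[block] = self.nvm_cache.pop(block)
        self.disk_block_writes += 1

    def crash(self) -> None:
        pass                                   # NVM contents survive the crash

    def read(self, block: int) -> Optional[bytes]:
        if block in self.nvm_cache:
            return self.nvm_cache[block]
        return self.disk.get(block)


def demo() -> Tuple[int, int, Optional[bytes], Optional[bytes]]:
    journaled, hybrid = JournaledDiskFS(), NvmCacheFS()
    for block in range(100):
        journaled.write(block, b"x")
        hybrid.write(block, b"x")
    journaled.sync()                           # periodic sync: 200 disk writes
    journaled.write(7, b"new")                 # update after the last sync
    hybrid.write(7, b"new")
    journaled.crash()
    hybrid.crash()
    return (journaled.disk_block_writes, hybrid.disk_block_writes,
            journaled.read(7), hybrid.read(7))
```

In this model the journaled file system pays two disk writes per dirty block and still loses the update made after the last sync, while the NVM-cached variant issues no disk writes and keeps the update. Real systems differ in many ways that the model ignores, including ordering, atomicity of multi-block updates, and NVM write latency.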


Metadata
Title
HasFS: optimizing file system consistency mechanism on NVM-based hybrid storage architecture
Authors
Yubo Liu
Hongbo Li
Yutong Lu
Zhiguang Chen
Nong Xiao
Ming Zhao
Publication date
11 December 2019
Publisher
Springer US
Published in
Cluster Computing, Issue 4/2020
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-019-03023-y
