Published in: The Journal of Supercomputing, Issue 11/2019

22 August 2019

Effective metadata management in exascale file system

Authors: Myung-Hoon Cha, Sang-Min Lee, Hong-Yeon Kim, Young-Kyun Kim

Abstract

This paper presents an effective method of managing metadata in exascale file systems. In order to store exponentially growing numbers of files, numerous methods for distributing and managing metadata have been suggested and developed. However, these methods have not provided an appropriate solution for managing a very large amount of metadata because they do not overcome two significant challenges in exascale file systems: (1) nonlinear performance scalability and (2) performance degradation over time. We propose an effective metadata management model and high-performance metadata management system that not only overcome these limitations but also provide a foundation for managing exascale metadata in a distributed file system. The resulting implementation of our metadata management system is the core of EEFS, an exascale distributed file system by the Electronics and Telecommunications Research Institute. The evaluation results show that the critical challenges of existing metadata management technologies are overcome and particularly that the performance is not degraded even when the amount of accumulated metadata increases with time.

Metadata
Title
Effective metadata management in exascale file system
Authors
Myung-Hoon Cha
Sang-Min Lee
Hong-Yeon Kim
Young-Kyun Kim
Publication date
22 August 2019
Publisher
Springer US
Published in
The Journal of Supercomputing, Issue 11/2019
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-019-02974-8
