2018 | OriginalPaper | Chapter

Exploring Scientific Application Performance Using Large Scale Object Storage

Authors: Steven Wei-der Chien, Stefano Markidis, Rami Karim, Erwin Laure, Sai Narasimhamurthy

Published in: High Performance Computing

Publisher: Springer International Publishing

Abstract

Parallel reading and writing to supercomputer I/O systems is one of the major performance and scalability bottlenecks in large scientific applications. The use of parallel file systems, together with the POSIX consistency requirements that all traditional HPC parallel I/O interfaces adhere to, limits the scalability of scientific applications. Object storage is widely used in cloud computing and is increasingly proposed for HPC workloads to improve the scalability and performance of I/O in scientific applications. While object storage is a promising technology, it is still unclear how scientific applications will use it and what the main performance benefits will be. This work addresses these questions by emulating an object store used by a traditional scientific application and evaluating the potential performance benefits. We show that scientific applications can benefit from the use of object storage at large scale.

Metadata

Title: Exploring Scientific Application Performance Using Large Scale Object Storage
Authors: Steven Wei-der Chien, Stefano Markidis, Rami Karim, Erwin Laure, Sai Narasimhamurthy
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-030-02465-9_8
