Skip to main content
Erschienen in: The Journal of Supercomputing 5/2017

01.11.2016

Reducing I/O variability using dynamic I/O path characterization in petascale storage systems

verfasst von: Seung Woo Son, Saba Sehrish, Wei-keng Liao, Ron Oldfield, Alok Choudhary

Erschienen in: The Journal of Supercomputing | Ausgabe 5/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

In petascale systems with a million CPU cores, scalable and consistent I/O performance is becoming increasingly difficult to sustain mainly because of I/O variability. The I/O variability is caused by concurrently running processes/jobs competing for I/O or a RAID rebuild when a disk drive fails. We present a mechanism that stripes across a selected subset of I/O nodes with the lightest workload at runtime to achieve the highest I/O bandwidth available in the system. In this paper, we propose a probing mechanism to enable application-level dynamic file striping to mitigate I/O variability. We implement the proposed mechanism in the high-level I/O library that enables memory-to-file data layout transformation and allows transparent file partitioning using subfiling. Subfiling is a technique that partitions data into a set of files of smaller size and manages file access to them, making data to be treated as a single, normal file to users. We demonstrate that our bandwidth probing mechanism can successfully identify temporally slower I/O nodes without noticeable runtime overhead. Experimental results on NERSC’s systems also show that our approach isolates I/O variability effectively on shared systems and improves overall collective I/O performance with less variation.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literatur
2.
Zurück zum Zitat Bent J, Faibish S, Ahrens J, Grider G, Patchett J, Tzelnic P, Woodring J (2012) Jitter-free co-processing on a prototype exascale storage stack. In: IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), pp 1–5 Bent J, Faibish S, Ahrens J, Grider G, Patchett J, Tzelnic P, Woodring J (2012) Jitter-free co-processing on a prototype exascale storage stack. In: IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), pp 1–5
3.
Zurück zum Zitat Bent J, Gibson G, Grider G, McClelland B, Nowoczynski P, Nunez J, Polte M, Wingate M (2009) PLFS: A checkpoint filesystem for parallel applications. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis Bent J, Gibson G, Grider G, McClelland B, Nowoczynski P, Nunez J, Polte M, Wingate M (2009) PLFS: A checkpoint filesystem for parallel applications. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
4.
Zurück zum Zitat Byna S, Uselton A, Praphat Knaaky D, He YH (2013) Trillion particles, 120,000 cores, and 350 TBs: lessons learned from a hero I/O run on hopper. In: Cray user group meeting Byna S, Uselton A, Praphat Knaaky D, He YH (2013) Trillion particles, 120,000 cores, and 350 TBs: lessons learned from a hero I/O run on hopper. In: Cray user group meeting
5.
Zurück zum Zitat Carns P, Latham R, Ross R, Iskra K, Lang S, Riley K (2009) 24/7 characterization of petascale I/O workloads. In: Proceedings of the First Workshop on Interfaces and Abstractions for Scientific Data Storage Carns P, Latham R, Ross R, Iskra K, Lang S, Riley K (2009) 24/7 characterization of petascale I/O workloads. In: Proceedings of the First Workshop on Interfaces and Abstractions for Scientific Data Storage
6.
Zurück zum Zitat Dai D, Chen Y, Kimpe D, Ross R (2014) Two-choice randomized dynamic I/O scheduler for object storage systems. International Conference for High Performance Computing, Networking, Storage and Analysis, pp 635–646 Dai D, Chen Y, Kimpe D, Ross R (2014) Two-choice randomized dynamic I/O scheduler for object storage systems. International Conference for High Performance Computing, Networking, Storage and Analysis, pp 635–646
7.
Zurück zum Zitat Dickens PM, Logan J (2009) Y-lib: a user level library to increase the performance of MPI-IO in a Lustre file system environment. In: Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing, pp 31–38 Dickens PM, Logan J (2009) Y-lib: a user level library to increase the performance of MPI-IO in a Lustre file system environment. In: Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing, pp 31–38
8.
Zurück zum Zitat Dorier M, Antoniu G, Ross R, Kimpe D, Ibrahim S (2014) CALCioM: mitigating I/O interference in HPC systems through cross-application coordination. In: Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp 155–164 Dorier M, Antoniu G, Ross R, Kimpe D, Ibrahim S (2014) CALCioM: mitigating I/O interference in HPC systems through cross-application coordination. In: Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp 155–164
10.
Zurück zum Zitat Fang A, Chien AA (2015) How much ssd is useful for resilience in supercomputers. In: Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale, pp 47–54 Fang A, Chien AA (2015) How much ssd is useful for resilience in supercomputers. In: Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale, pp 47–54
11.
Zurück zum Zitat Fryxell B, Olson K, Ricker P, Timmes FX, Zingale M, Lamb DQ, MacNeice P, Rosner R, Truran JW, Tufo H (2000) FLASH: an adaptive mesh hydrodynamics code for modeling astrophysical thermonuclear flashes. Astrophys J Suppl Ser 131(1):273CrossRef Fryxell B, Olson K, Ricker P, Timmes FX, Zingale M, Lamb DQ, MacNeice P, Rosner R, Truran JW, Tufo H (2000) FLASH: an adaptive mesh hydrodynamics code for modeling astrophysical thermonuclear flashes. Astrophys J Suppl Ser 131(1):273CrossRef
12.
Zurück zum Zitat Fu J, Liu N, Sahni O, Jansen KE, Shephard MS, Carothers CD (2010) Scalable parallel I/O alternatives for massively parallel partitioned solver systems. In: Proceedings of Workshop on Large-Scale Parallel Processing Fu J, Liu N, Sahni O, Jansen KE, Shephard MS, Carothers CD (2010) Scalable parallel I/O alternatives for massively parallel partitioned solver systems. In: Proceedings of Workshop on Large-Scale Parallel Processing
13.
Zurück zum Zitat Fu J, Min M, Latham R, Carothers CD (2011) Parallel I/O performance for application-level checkpointing on the blue gene/P system. In: Proceedings on Workshop on Interfaces and Architectures for Scientific Data Storage, pp 465–473 Fu J, Min M, Latham R, Carothers CD (2011) Parallel I/O performance for application-level checkpointing on the blue gene/P system. In: Proceedings on Workshop on Interfaces and Architectures for Scientific Data Storage, pp 465–473
14.
Zurück zum Zitat Gao K, Liao Wk, Nisar A, Choudhary A, Ross R, Latham R (2009) Using Subfiling to improve programming flexibility and performance of parallel shared-file I/O. In: Proceedings of the International Conference on Parallel Processing, pp 470–477 Gao K, Liao Wk, Nisar A, Choudhary A, Ross R, Latham R (2009) Using Subfiling to improve programming flexibility and performance of parallel shared-file I/O. In: Proceedings of the International Conference on Parallel Processing, pp 470–477
16.
Zurück zum Zitat Kendall W, Huang J, Peterka T, Latham R, Ross R (2011) Visualization viewpoint: towards a general I/O layer for parallel visualization applications. IEEE Comput Graph Appl 31(6):6–10CrossRef Kendall W, Huang J, Peterka T, Latham R, Ross R (2011) Visualization viewpoint: towards a general I/O layer for parallel visualization applications. IEEE Comput Graph Appl 31(6):6–10CrossRef
17.
Zurück zum Zitat Kim Y, Atchley S, Vallée GR, Shipman GM (2015) LADS: optimizing data transfers using layout-aware data scheduling. In: Proceedings of the 13th USENIX Conference on File and Storage Technologies, FAST’15, pp 67–80 Kim Y, Atchley S, Vallée GR, Shipman GM (2015) LADS: optimizing data transfers using layout-aware data scheduling. In: Proceedings of the 13th USENIX Conference on File and Storage Technologies, FAST’15, pp 67–80
18.
Zurück zum Zitat Kotz D (1997) Disk-directed I/O for MIMD multiprocessors. ACM Trans Comput Syst 15(1):41–74CrossRef Kotz D (1997) Disk-directed I/O for MIMD multiprocessors. ACM Trans Comput Syst 15(1):41–74CrossRef
19.
Zurück zum Zitat Kumar S, Vishwanath V, Carns P, Levine JA, Latham R, Scorzelli G, Kolla H, Grout R, Chen J, Ross R, Papka ME, Pascucci V (2012) Efficient data restructuring and aggregation for IO acceleration in PIDX. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis Kumar S, Vishwanath V, Carns P, Levine JA, Latham R, Scorzelli G, Kolla H, Grout R, Chen J, Ross R, Papka ME, Pascucci V (2012) Efficient data restructuring and aggregation for IO acceleration in PIDX. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
20.
Zurück zum Zitat Kumar S, Vishwanath V, Carns P, Summa B, Scorzelli G, Pascucci V, Ross R, Chen J, Kolla H, Grout R (2011) PIDX: Efficient parallel I/O for multi-resolution multi-dimensional scientific datasets. In: Proceedings of the 2011 IEEE International Conference on Cluster Computing, pp 103–111 Kumar S, Vishwanath V, Carns P, Summa B, Scorzelli G, Pascucci V, Ross R, Chen J, Kolla H, Grout R (2011) PIDX: Efficient parallel I/O for multi-resolution multi-dimensional scientific datasets. In: Proceedings of the 2011 IEEE International Conference on Cluster Computing, pp 103–111
21.
Zurück zum Zitat Lang S, Carns P, Latham R, Ross R, Harms K, Allcock W (2009) I/O performance challenges at leadership scale. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pp 40:1–40:12 Lang S, Carns P, Latham R, Ross R, Harms K, Allcock W (2009) I/O performance challenges at leadership scale. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pp 40:1–40:12
22.
Zurück zum Zitat Latham R, Daley C, Keng Liao W, Gao K, Ross R, Dubey A, Choudhary A (2012) A case study for scientific I/O: improving the FLASH astrophysics code. Comput Sci Discov 5(1):015, 001 Latham R, Daley C, Keng Liao W, Gao K, Ross R, Dubey A, Choudhary A (2012) A case study for scientific I/O: improving the FLASH astrophysics code. Comput Sci Discov 5(1):015, 001
23.
Zurück zum Zitat Li J, Liao Wk, Choudhary A, Ross R, Thakur R, Gropp W, Latham R, Siegel A, Gallagher B, Zingale M (2003) Parallel netCDF: a high-performance scientific I/O interface. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis Li J, Liao Wk, Choudhary A, Ross R, Thakur R, Gropp W, Latham R, Siegel A, Gallagher B, Zingale M (2003) Parallel netCDF: a high-performance scientific I/O interface. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
24.
Zurück zum Zitat Li Y, Lu X, Miller EL, Long DDE (2015) ASCAR: automating contention management for high-performance storage systems. In: IEEE 31st Symposium on Mass Storage Systems and Technologies, MSST, pp 1–16 Li Y, Lu X, Miller EL, Long DDE (2015) ASCAR: automating contention management for high-performance storage systems. In: IEEE 31st Symposium on Mass Storage Systems and Technologies, MSST, pp 1–16
25.
Zurück zum Zitat Liao Wk, Choudhary A (2008) Dynamically adapting file domain partitioning methods for collective I/O based on underlying parallel file system locking protocols. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis Liao Wk, Choudhary A (2008) Dynamically adapting file domain partitioning methods for collective I/O based on underlying parallel file system locking protocols. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
26.
Zurück zum Zitat Liao WK, Coloma K, Choudhary A, Ward L, Russell E, Pundit N (2006) Scalable design and implementations for mpi parallel overlapping I/O. IEEE Trans Parallel Distrib Syst 17(11):1264–1276CrossRef Liao WK, Coloma K, Choudhary A, Ward L, Russell E, Pundit N (2006) Scalable design and implementations for mpi parallel overlapping I/O. IEEE Trans Parallel Distrib Syst 17(11):1264–1276CrossRef
27.
Zurück zum Zitat Liao WK, Coloma K, Choudhary A, Ward L, Russell E, Tideman S (2005) Collective caching: application-aware client-side file caching. In: Proceedings of 14th IEEE International Symposium on High Performance Distributed Computing, pp 81–90 Liao WK, Coloma K, Choudhary A, Ward L, Russell E, Tideman S (2005) Collective caching: application-aware client-side file caching. In: Proceedings of 14th IEEE International Symposium on High Performance Distributed Computing, pp 81–90
28.
Zurück zum Zitat Liu N, Cope J, Carns PH, Carothers CD, Ross RB, Grider G, Crume A, Maltzahn C (2012) On the role of burst buffers in leadership-class storage systems. In: Proceedings of the IEEE Conference on Mass Storage Systems, pp 1–11 Liu N, Cope J, Carns PH, Carothers CD, Ross RB, Grider G, Crume A, Maltzahn C (2012) On the role of burst buffers in leadership-class storage systems. In: Proceedings of the IEEE Conference on Mass Storage Systems, pp 1–11
29.
Zurück zum Zitat Lofstead J, Ross R (2013) Insights for exascale IO APIs from building a petascale IO API. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp 87:1–87:12 Lofstead J, Ross R (2013) Insights for exascale IO APIs from building a petascale IO API. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp 87:1–87:12
30.
Zurück zum Zitat Lofstead J, Zheng F, Liu Q, Klasky S, Oldfield R, Kordenbrock T, Schwan K, Wolf M (2010) Managing variability in the IO performance of petascale storage systems. In: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’10, pp 1–12 Lofstead J, Zheng F, Liu Q, Klasky S, Oldfield R, Kordenbrock T, Schwan K, Wolf M (2010) Managing variability in the IO performance of petascale storage systems. In: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’10, pp 1–12
31.
Zurück zum Zitat Lofstead JF, Klasky S, Schwan K, Podhorszki N, Jin C (2008) Flexible IO and Integration for scientific codes through the adaptable IO system (ADIOS). In: Proceedings of the 6th International Workshop on Challenges of Large Applications in Distributed Environments, pp 15–24 Lofstead JF, Klasky S, Schwan K, Podhorszki N, Jin C (2008) Flexible IO and Integration for scientific codes through the adaptable IO system (ADIOS). In: Proceedings of the 6th International Workshop on Challenges of Large Applications in Distributed Environments, pp 15–24
33.
Zurück zum Zitat Ma X, Winslett M, Lee J, Yu S (2003) Improving MPI-IO output performance with active buffering plus threads. In: Proceedings of the 17th International Symposium on Parallel and Distributed Processing Ma X, Winslett M, Lee J, Yu S (2003) Improving MPI-IO output performance with active buffering plus threads. In: Proceedings of the 17th International Symposium on Parallel and Distributed Processing
36.
Zurück zum Zitat Park S, Shen K (2012) FIOS: a fair, efficient flash I/O scheduler. In: Proceedings of the 10th USENIX Conference on File and Storage Technologies, pp 13–13 Park S, Shen K (2012) FIOS: a fair, efficient flash I/O scheduler. In: Proceedings of the 10th USENIX Conference on File and Storage Technologies, pp 13–13
37.
Zurück zum Zitat Randall D, Khairoutdinov M, Arakawa A, Grabowski W (2003) Breaking the cloud parameterization deadlock. Bull Am Meteor Soc 84:1547–1564CrossRef Randall D, Khairoutdinov M, Arakawa A, Grabowski W (2003) Breaking the cloud parameterization deadlock. Bull Am Meteor Soc 84:1547–1564CrossRef
38.
Zurück zum Zitat del Rosario JM, Bordawekar R, Choudhary A (1993) Improved parallel I/O via a two-phase run-time access strategy. In: Proceedings of Workshop on Input/Output in Parallel Computer Systems, pp 56–70 del Rosario JM, Bordawekar R, Choudhary A (1993) Improved parallel I/O via a two-phase run-time access strategy. In: Proceedings of Workshop on Input/Output in Parallel Computer Systems, pp 56–70
39.
Zurück zum Zitat Sankaran R, Hawkes ER, Chen JH, Lu T, Law CK (2006) Direct numerical simulations of turbulent lean premixed combustion. J Phys Conf Ser 46(1):38CrossRef Sankaran R, Hawkes ER, Chen JH, Lu T, Law CK (2006) Direct numerical simulations of turbulent lean premixed combustion. J Phys Conf Ser 46(1):38CrossRef
40.
Zurück zum Zitat Sato K, Mohror K, Moody A, Gamblin T, d. Supinski BR, Maruyama N, Matsuoka S (2014) A user-level infiniband-based file system and checkpoint strategy for burst buffers. In: 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp 21–30 Sato K, Mohror K, Moody A, Gamblin T, d. Supinski BR, Maruyama N, Matsuoka S (2014) A user-level infiniband-based file system and checkpoint strategy for burst buffers. In: 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp 21–30
41.
Zurück zum Zitat Schuchardt K, Palmer B, Daily J, Elsethagen T, Koontz A (2007) IO strategies and data services for petascale data sets from a global cloud resolving model. J Phys Conf Ser 78:012089 Schuchardt K, Palmer B, Daily J, Elsethagen T, Koontz A (2007) IO strategies and data services for petascale data sets from a global cloud resolving model. J Phys Conf Ser 78:012089
42.
Zurück zum Zitat Seamons KE, Chen Y, Jones P, Jozwiak J, Winslett M (1995) Server-directed collective I/O in Panda. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis Seamons KE, Chen Y, Jones P, Jozwiak J, Winslett M (1995) Server-directed collective I/O in Panda. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
43.
Zurück zum Zitat Shende SS, Malony AD (2006) The TAU parallel performance system. Int J High Perform Comput Appl 20(2):287–311CrossRef Shende SS, Malony AD (2006) The TAU parallel performance system. Int J High Perform Comput Appl 20(2):287–311CrossRef
44.
Zurück zum Zitat Son SW, Sehrish S, k. Liao W, Oldfield R, Choudhary A (2013) Dynamic file striping and data layout transformation on parallel system with fluctuating I/O workload. In: 2013 IEEE International Conference on Cluster Computing (CLUSTER), pp 1–8 Son SW, Sehrish S, k. Liao W, Oldfield R, Choudhary A (2013) Dynamic file striping and data layout transformation on parallel system with fluctuating I/O workload. In: 2013 IEEE International Conference on Cluster Computing (CLUSTER), pp 1–8
45.
Zurück zum Zitat Song H, Yin Y, Sun XH, Thakur R, Lang S (2011) Server-side I/O coordination for parallel file systems. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (2011) Song H, Yin Y, Sun XH, Thakur R, Lang S (2011) Server-side I/O coordination for parallel file systems. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (2011)
46.
Zurück zum Zitat Tavakoli N, Dai D, Chen Y (2016) Log-assisted straggler-aware I/O scheduler for high-end computing. In: 2016 45th International Conference on Parallel Processing Workshops (ICPPW), pp 181–189. doi:10.1109/ICPPW.2016.38 Tavakoli N, Dai D, Chen Y (2016) Log-assisted straggler-aware I/O scheduler for high-end computing. In: 2016 45th International Conference on Parallel Processing Workshops (ICPPW), pp 181–189. doi:10.​1109/​ICPPW.​2016.​38
47.
Zurück zum Zitat Thakur R, Choudhary A (1996) An extended two-phase method for accessing sections of out-of-core arrays. Sci Progr 5(4):301–317 Thakur R, Choudhary A (1996) An extended two-phase method for accessing sections of out-of-core arrays. Sci Progr 5(4):301–317
48.
Zurück zum Zitat Thakur R, Gropp W, Lusk E (1999) Data sieving and collective I/O in ROMIO. In: Proceedings of the 7th Symposium on the Frontiers of Massively Parallel Computation Thakur R, Gropp W, Lusk E (1999) Data sieving and collective I/O in ROMIO. In: Proceedings of the 7th Symposium on the Frontiers of Massively Parallel Computation
49.
Zurück zum Zitat Thapaliya S, Bangalore P, Lofstead J, Mohror K, Moody A (2014) IO-Cop: managing concurrent accesses to shared parallel file system. In: 43rd International Conference on Parallel Processing Workshops, pp 52–60 Thapaliya S, Bangalore P, Lofstead J, Mohror K, Moody A (2014) IO-Cop: managing concurrent accesses to shared parallel file system. In: 43rd International Conference on Parallel Processing Workshops, pp 52–60
50.
Zurück zum Zitat Thapaliya S, Bangalore P, Lofstead J, Mohror K, Moody A (2016) Managing I/O interference in a shared burst buffer system. In: 2016 45th International Conference on Parallel Processing (ICPP), pp. 416–425 Thapaliya S, Bangalore P, Lofstead J, Mohror K, Moody A (2016) Managing I/O interference in a shared burst buffer system. In: 2016 45th International Conference on Parallel Processing (ICPP), pp. 416–425
52.
Zurück zum Zitat Wachs M, Abd-El-Malek M, Thereska E, Ganger GR (2007) Argon: performance insulation for shared storage servers. In: Proceedings of the 5th USENIX Conference on File and Storage Technologies Wachs M, Abd-El-Malek M, Thereska E, Ganger GR (2007) Argon: performance insulation for shared storage servers. In: Proceedings of the 5th USENIX Conference on File and Storage Technologies
53.
Zurück zum Zitat Wang T, Oral S, Pritchard M, Wang B, Yu W (2015) TRIO: burst buffer based I/O orchestration. In: 2015 IEEE International Conference on Cluster Computing, pp 194–203 Wang T, Oral S, Pritchard M, Wang B, Yu W (2015) TRIO: burst buffer based I/O orchestration. In: 2015 IEEE International Conference on Cluster Computing, pp 194–203
54.
Zurück zum Zitat Wang T, Oral S, Wang Y, Settlemyer B, Atchley S, Yu W (2014) BurstMem: a high-performance burst buffer system for scientific applications. In: 2014 IEEE International Conference on Big Data (Big Data), pp 71–79 Wang T, Oral S, Wang Y, Settlemyer B, Atchley S, Yu W (2014) BurstMem: a high-performance burst buffer system for scientific applications. In: 2014 IEEE International Conference on Big Data (Big Data), pp 71–79
55.
Zurück zum Zitat Xie B, Chase J, Dillow D, Drokin O, Klasky S, Oral S, Podhorszki N (2012) Characterizing output bottlenecks in a supercomputer. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp 8:1–8:11 Xie B, Chase J, Dillow D, Drokin O, Klasky S, Oral S, Podhorszki N (2012) Characterizing output bottlenecks in a supercomputer. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp 8:1–8:11
56.
Zurück zum Zitat Yildiz O, Dorier M, Ibrahim S, Ross R, Antoniu G (2016) On the root causes of cross-application I/O interference in HPC storage systems. In: 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp 750–759 Yildiz O, Dorier M, Ibrahim S, Ross R, Antoniu G (2016) On the root causes of cross-application I/O interference in HPC storage systems. In: 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp 750–759
57.
Zurück zum Zitat Ying L (2008) Lustre ADIO collective write driver—white paper. Tech. rep, Sun and ORNL Ying L (2008) Lustre ADIO collective write driver—white paper. Tech. rep, Sun and ORNL
58.
Zurück zum Zitat Yu W, Vetter J (2008) ParColl: partitioned collective I/O on the cray XT. In: Proceedings of the 37th International Conference on Parallel Processing, pp 562–569 Yu W, Vetter J (2008) ParColl: partitioned collective I/O on the cray XT. In: Proceedings of the 37th International Conference on Parallel Processing, pp 562–569
59.
Zurück zum Zitat Yu W, Vetter J, Canon RS, Jiang S (2007) Exploiting lustre file joining for effective collective IO. In: Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid, pp 267–274 Yu W, Vetter J, Canon RS, Jiang S (2007) Exploiting lustre file joining for effective collective IO. In: Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid, pp 267–274
60.
Zurück zum Zitat Zhang X, Davis K, Jiang S (2011) QoS support for end users of I/O-intensive applications using shared storage systems. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, pp 18:1–18:12 Zhang X, Davis K, Jiang S (2011) QoS support for end users of I/O-intensive applications using shared storage systems. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, pp 18:1–18:12
61.
Zurück zum Zitat Zhou Z, Yang X, Zhao D, Rich P, Tang W, Wang J, Lan Z (2016) I/O-aware bandwidth allocation for petascale computing systems. Parallel Comput 58:107–116MathSciNetCrossRef Zhou Z, Yang X, Zhao D, Rich P, Tang W, Wang J, Lan Z (2016) I/O-aware bandwidth allocation for petascale computing systems. Parallel Comput 58:107–116MathSciNetCrossRef
Metadaten
Titel
Reducing I/O variability using dynamic I/O path characterization in petascale storage systems
verfasst von
Seung Woo Son
Saba Sehrish
Wei-keng Liao
Ron Oldfield
Alok Choudhary
Publikationsdatum
01.11.2016
Verlag
Springer US
Erschienen in
The Journal of Supercomputing / Ausgabe 5/2017
Print ISSN: 0920-8542
Elektronische ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-016-1904-7

Weitere Artikel der Ausgabe 5/2017

The Journal of Supercomputing 5/2017 Zur Ausgabe

Premium Partner