Published in: Cluster Computing 3/2015

1 September 2015

The dispatch time aligning I/O scheduling for parallel file systems

Authors: Yonggang Liu, Jing Qin, Renato Figueiredo


Abstract

In Parallel File Systems (PFSs), a file I/O request may be divided into multiple sub-requests that are distributed across the storage system. The latency of the original request is determined by the finish time of its last sub-request. Because of application multiplexing and the various file data layouts employed in PFSs, data servers may carry very different workloads, so the performance penalty caused by differing sub-request finish times can be significant. The Dispatch Time Aligning (DTA) I/O scheduling algorithm improves system throughput by prioritizing lagging sub-requests of PFS I/O requests: it associates sub-requests that belong to the same I/O request, then detects and prioritizes those that have fallen behind. The dual-queue scheduling scheme in DTA provides I/O request latency control while improving system throughput. Simulation results show that DTA can provide up to 83% higher total system throughput than the Earliest Deadline First algorithm while offering similar latency guarantees.
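
The two ideas in the abstract, that a request finishes only when its last sub-request finishes and that a scheduler can favor sub-requests whose siblings are already under way, can be illustrated with a small Python sketch. This is not the DTA algorithm as published: the abstract does not specify its data structures or rules, so the names `request_latency`, `pick_next`, `first_dispatch` and `slack`, and the rule for switching between the two queues, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class SubRequest:
    request_id: int   # parent I/O request this sub-request belongs to
    deadline: float   # latency deadline inherited from the parent (assumed)


def request_latency(arrival, sub_finish_times):
    """A striped request completes when its last sub-request completes."""
    return max(sub_finish_times) - arrival


def pick_next(pending, first_dispatch, now, slack):
    """Toy dual-queue choice on a single data server (hypothetical rule).

    pending        -- sub-requests queued on this server
    first_dispatch -- request_id -> time a sibling was first dispatched
                      on some other server (assumed to be known)
    slack          -- how close to its deadline a sub-request must be
                      before the deadline queue takes precedence
    """
    if not pending:
        return None
    # Deadline queue: earliest deadline first.
    urgent = min(pending, key=lambda s: s.deadline)
    if urgent.deadline - now <= slack:
        choice = urgent
    else:
        # Lag queue: prefer the sub-request whose siblings started earliest.
        lagged = [s for s in pending if s.request_id in first_dispatch]
        if lagged:
            choice = min(lagged, key=lambda s: first_dispatch[s.request_id])
        else:
            choice = urgent
    pending.remove(choice)
    return choice
```

In this sketch, a server with spare time before any deadline serves the sub-request that has lagged longest behind its siblings, and falls back to plain earliest-deadline ordering once a deadline is within `slack`.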


Footnotes
1
Writing processes are paused for explicit write-back when the dirty page ratio reaches vm.dirty_ratio (0.2 in the tests). This happens only on servers S0, S4, S8 and S12 for the heavy I/O workload.
 
2
This information is available in some PFSs, such as PVFS2. For other systems, the client module can be patched to support this feature.
 
3
Choosing data servers at random for a PFS file may be uncommon in real storage systems. The scheme is adopted here to create a highly imbalanced scenario.
 
Metadata
Title
The dispatch time aligning I/O scheduling for parallel file systems
Authors
Yonggang Liu
Jing Qin
Renato Figueiredo
Publication date
1 September 2015
Publisher
Springer US
Published in
Cluster Computing / Issue 3/2015
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-015-0457-5
