2019 | Original Paper | Book Chapter

Throughput Analytics of Data Transfer Infrastructures

Authors: Nageswara S. V. Rao, Qiang Liu, Zhengchun Liu, Rajkumar Kettimuthu, Ian Foster

Published in: Testbeds and Research Infrastructures for the Development of Networks and Communities

Publisher: Springer International Publishing

Abstract

To support increasingly distributed scientific and big-data applications, powerful data transfer infrastructures are being built with dedicated networks and software frameworks customized to distributed file systems and data transfer nodes. The data transfer performance of such infrastructures depends critically on the combined choices of file, disk, and host systems as well as network protocols and file transfer software, all of which may vary across sites. The randomness of throughput measurements makes it challenging to assess the impact of these choices on the performance of the infrastructure or its parts. We propose regression-based throughput profiles, obtained by aggregating measurements from the sites of the infrastructure, with round-trip time (RTT) as the independent variable. The peak values and convex-concave shape of a profile together determine the overall throughput performance of memory and file transfers, and its variations show the performance differences among the sites. We then present projection and difference operators, and coefficients of throughput profiles, to characterize the performance of the infrastructure and its parts, including sites and file transfer tools. In particular, the utilization-concavity coefficient provides a value in the range [0, 1] that reflects overall transfer effectiveness. We present results of measurements collected using (i) testbed experiments over dedicated 0–366 ms 10 Gbps connections with combinations of TCP versions, file systems, host systems and transfer tools, and (ii) Globus GridFTP transfers over production infrastructure with varying site configurations.
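As a rough illustration of the profile idea described in the abstract, the following Python sketch averages repeated throughput measurements at each RTT to form a simple regression estimate, then computes two summary numbers from it. Everything here is an assumption made for illustration: the function names, the sample values, the normalized-area "utilization", and the chord test for concavity are hypothetical. They are not the chapter's definition of the utilization-concavity coefficient, which the abstract does not give.

```python
from collections import defaultdict


def throughput_profile(measurements):
    """Average the throughput samples at each RTT.

    measurements: iterable of (rtt_ms, throughput_gbps) pairs.
    Returns a list of (rtt_ms, mean_throughput_gbps) sorted by RTT.
    """
    by_rtt = defaultdict(list)
    for rtt_ms, gbps in measurements:
        by_rtt[rtt_ms].append(gbps)
    return [(rtt, sum(vals) / len(vals)) for rtt, vals in sorted(by_rtt.items())]


def utilization(profile, capacity_gbps):
    """Trapezoidal area under the profile, divided by capacity * RTT span.

    Hypothetical summary: stays in [0, 1] as long as no profile value
    exceeds the connection capacity.
    """
    area = 0.0
    for (r0, t0), (r1, t1) in zip(profile, profile[1:]):
        area += 0.5 * (t0 + t1) * (r1 - r0)
    span = profile[-1][0] - profile[0][0]
    return area / (capacity_gbps * span)


def concave_fraction(profile):
    """Fraction of interior points lying on or above the chord that
    joins their two neighbours (a simple local concavity test)."""
    interior = len(profile) - 2
    if interior <= 0:
        return 0.0
    concave = 0
    for (r0, t0), (r1, t1), (r2, t2) in zip(profile, profile[1:], profile[2:]):
        chord = t0 + (t2 - t0) * (r1 - r0) / (r2 - r0)
        if t1 >= chord:
            concave += 1
    return concave / interior


# Made-up sample values for a 10 Gbps connection (not measured data).
samples = [(0, 9.0), (0, 9.5), (100, 8.0), (100, 8.5), (200, 5.0), (300, 3.0)]
profile = throughput_profile(samples)
print(profile)
print(utilization(profile, capacity_gbps=10.0))
print(concave_fraction(profile))
```

In this sketch a profile that stays near capacity and remains concave over a wide RTT range scores close to 1 on both measures, which matches the abstract's qualitative point that peak values and the convex-concave shape together indicate transfer effectiveness.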

Metadata
Title
Throughput Analytics of Data Transfer Infrastructures
Authors
Nageswara S. V. Rao
Qiang Liu
Zhengchun Liu
Rajkumar Kettimuthu
Ian Foster
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-12971-2_2