
2018 | OriginalPaper | Chapter

Megafly: A Topology for Exascale Systems

Authors: Mario Flajslik, Eric Borch, Mike A. Parker

Published in: High Performance Computing

Publisher: Springer International Publishing


Abstract

In this paper we explore network topologies suitable for future exascale systems that need to support over fifty thousand endpoints. With the increased necessity to use optics at higher link speeds, some of the more traditional topologies, such as Tori and Fat-Trees, become prohibitively expensive at such a large scale. We identify two cost-efficient hierarchical topologies, one a canonical Dragonfly, and one a variant of the Dragonfly topology that we call Megafly. Megafly is an indirect hierarchical topology with high path diversity, flexible tapering options and an abundance of possible system design points. We describe and analyze the Megafly topology to understand its key features and advantages compared to the Dragonfly. Additionally, we define a Megafly tapering scheme that enables a good balance of system performance versus cost. Our evaluation shows that the Megafly topology achieves equal or better throughput than the Dragonfly on a variety of traffic patterns, while requiring only half of the virtual channels for deadlock-free routing. Megafly also provides better fairness, which is shown in the evaluation of synchronizing traffic patterns, such as neighbor exchanges. We also showcase the design flexibility and cost vs. performance trade-offs of Megafly in a mini case study that illustrates the challenges of building a high-performance fabric topology.


Metadata

Title: Megafly: A Topology for Exascale Systems
Authors: Mario Flajslik, Eric Borch, Mike A. Parker
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-319-92040-5_15