2014 | OriginalPaper | Book chapter

3. Fast Network-on-Chip Design

Authors: Ayan Mandal, Sunil P. Khatri, Rabi N. Mahapatra

Published in: Source-Synchronous Networks-On-Chip

Publisher: Springer New York

Abstract

In the previous chapter, we showed how resonant clocking can serve as a high-speed, low-power, stable on-chip clock generation and distribution scheme. In this chapter, we use such a clock to design a high-speed, source-synchronous, ring-based NoC architecture. In Sect. 3.1, we introduce our NoC design, which comprises extremely fast, intersecting source-synchronous data rings. These rings traverse the chip multiprocessor (CMP) in both the horizontal and vertical directions, providing complete connectivity among all the processing elements (PEs) in the CMP. In our approach, the interconnection network operates in its own clock domain, which runs significantly faster than the PE clocks. This helps us achieve inter-processor communication with minimal latency. We perform architectural simulations of the ring-based NoC in Sect. 3.2. We propose a deadlock-free routing protocol for the source-synchronous ring-based NoC that uses link ordering and virtual-channel-based buffered flow control. Architectural results obtained on synthetic and real traffic demonstrate that the source-synchronous ring-based NoC has significantly lower latency and a higher maximum sustained injection rate than a state-of-the-art mesh-based NoC. Next, in Sect. 3.3, we propose a modified source-synchronous design in which the PEs extract a low-jitter clock directly from the high-speed ring clock by division, and hence are synchronous with the NoC. This is feasible because of the extremely good jitter characteristics of the standing wave oscillator (SWO) based clock generation and distribution scheme of Sect. 2.2. Using this modified design, we propose a class of source-synchronous NoCs organized in an H-tree topology, which consume less logic and wiring area than a state-of-the-art mesh. Architectural simulations on synthetic and real traffic show that our H-tree-based NoC designs provide significantly lower latency and sustain a higher injection rate than a state-of-the-art mesh.
Using the modified source-synchronous design proposed in Sect. 3.3, we also evaluate two more floorplan-friendly NoC topologies in Sect. 3.4. These two topologies consume significantly less logic and wiring area than a state-of-the-art mesh. Architectural simulations on synthetic and real traffic show that they provide significantly lower latency while achieving the same or a better maximum sustained injection rate than a state-of-the-art mesh.
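
The abstract names link ordering and virtual channels as the basis of the deadlock-free routing protocol but does not spell the protocol out. The sketch below is not the chapter's protocol. It illustrates the general principle with the classic dateline technique on a single unidirectional ring: if every route acquires channels in strictly increasing rank, no cyclic wait can form. The ring size, the channel numbering and all function names (`ring_route`, `rank`, `all_routes_ordered`) are assumptions made for illustration.

```python
from typing import List, Tuple

# A channel is a (link index, virtual channel) pair. Hypothetical model:
# link i carries traffic from node i to node (i + 1) % n.
Channel = Tuple[int, int]

def ring_route(src: int, dst: int, n: int) -> List[Channel]:
    """Channels used when travelling from src to dst on an n-node ring.

    Assumed dateline rule: a packet starts on VC 0 and switches to VC 1
    once it has crossed the wrap-around link n - 1.
    """
    route: List[Channel] = []
    node, vc = src, 0
    while node != dst:
        route.append((node, vc))
        if node == n - 1:  # the wrap-around link was just traversed
            vc = 1
        node = (node + 1) % n
    return route

def rank(channel: Channel, n: int) -> int:
    """Total order over channels: every VC 1 channel ranks above every VC 0 channel."""
    link, vc = channel
    return vc * n + link

def strictly_increasing(route: List[Channel], n: int) -> bool:
    """True if the route acquires channels in strictly increasing rank."""
    ranks = [rank(c, n) for c in route]
    return all(a < b for a, b in zip(ranks, ranks[1:]))

def all_routes_ordered(n: int) -> bool:
    """Check the ordering property for every source/destination pair."""
    return all(
        strictly_increasing(ring_route(s, d, n), n)
        for s in range(n)
        for d in range(n)
        if s != d
    )

if __name__ == "__main__":
    print(ring_route(2, 1, 4))
    print(all_routes_ordered(4))
```

With a single virtual channel, the route from node 2 to node 1 on a 4-node ring would use links 2, 3, 0, which is not increasing. Ranking the second virtual channel above the first restores a strict order. A network of intersecting rings would need an order spanning all rings, which this sketch does not attempt.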

Metadata
Title
Fast Network-on-Chip Design
Authors
Ayan Mandal
Sunil P. Khatri
Rabi N. Mahapatra
Copyright year
2014
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-9405-8_3