ABSTRACT
This paper describes an unconventional application of wireless networking to an emerging technology. It makes the case for a two-tier hybrid wireless/wired architecture that interconnects hundreds to thousands of cores in chip multiprocessors (CMPs), where current interconnect technologies face severe scaling limitations: excessive latency, long wiring, and complex layout. We propose a recursive wireless interconnect structure, called the WCube, that features a single transmit antenna and multiple receive antennas at each micro wireless router and offers scalable performance in terms of latency and connectivity. We show that it is feasible to build miniature on-chip antennas, together with simple transmitters and receivers, that operate in the 100-500 GHz sub-terahertz frequency bands. We also devise new two-tier wormhole-based routing algorithms that are deadlock-free and ensure a minimum-latency route on a 1000-core on-chip interconnect network. Our simulations show that our protocol suite can reduce the observed latency by 20% to 45% while consuming power comparable to or less than that of current 2-D wired mesh designs.