ABSTRACT
It is well known that memory latency, energy, capacity, bandwidth, and scalability will be critical bottlenecks in future large-scale systems. This paper addresses these problems, focusing on the interface between the compute cores and memory, which comprises the physical interconnect and the memory access protocol. For the physical interconnect, we study the prudent use of emerging silicon-photonic technology to reduce energy consumption and improve capacity scaling. We conclude that photonics are effective primarily for improving socket-edge bandwidth by breaking the pin barrier, and for use on heavily utilized links. For the access protocol, we propose a novel packet-based interface that relinquishes most of the tight control that the memory controller holds in current systems and allows the memory modules to be more autonomous, improving flexibility and interoperability. The key enabler is a 3D-stacked interface die that allows both of these optimizations without modifying commodity memory dies. The interface die handles all conversion between optics and electronics, as well as all low-level memory device control functionality. Communication beyond the interface die is fully electrical, with TSVs between dies and low-swing wires on-die. We show that such an approach results in substantially lower energy consumption, reduced latency, better scalability to large capacities, and better support for heterogeneity and interoperability.
Combining memory and a controller with photonics through 3D-stacking to enable scalable and energy-efficient systems (ISCA '11)