Abstract
As technology scales, interconnects have become a major performance bottleneck and a major source of power consumption in microprocessors. Rising interconnect costs make it necessary to consider alternative ways of building modern microprocessors. One promising option is 3D architectures, in which a stack of multiple device layers, with direct vertical interconnects tunneling through them, is put together on the same chip. Now that fabrication of 3D integrated circuits has become viable, developing CAD tools and architectural techniques is imperative for exploring the design space of 3D microarchitectures. In this article, we give a brief introduction to 3D integration technology, discuss the EDA tools that can enable the adoption of 3D ICs, and present the implementation of various microprocessor components using 3D technology. An industrial case study is presented as an initial attempt to design 3D microarchitectures.
Index Terms
- Design space exploration for 3D architectures