A programmable memory controller for the DDRx interfacing standards

Authors: Mahdi Nazm Bojnordi (University of Rochester), Engin Ipek (University of Rochester)

ACM Transactions on Computer Systems, Volume 31, Issue 4, Article No. 11, pp. 1–31. https://doi.org/10.1145/2534845

Published: 20 December 2013

Abstract

Modern memory controllers employ sophisticated address mapping, command scheduling, and power management optimizations to alleviate the adverse effects of DRAM timing and resource constraints on system performance. A promising way of improving the versatility and efficiency of these controllers is to make them programmable—a proven technique that has seen wide use in other control tasks, ranging from DMA scheduling to NAND Flash and directory control. Unfortunately, the stringent latency and throughput requirements of modern DDRx devices have rendered such programmability largely impractical, confining DDRx controllers to fixed-function hardware.

This article presents the instruction set architecture (ISA) and hardware implementation of PARDIS, a programmable memory controller that can meet the performance requirements of a high-speed DDRx interface. The proposed controller is evaluated by mapping previously proposed DRAM scheduling, address mapping, refresh scheduling, and power management algorithms onto PARDIS. Simulation results show that the average performance of PARDIS comes within 8% of fixed-function hardware for each of these techniques; moreover, by enabling application-specific optimizations, PARDIS improves system performance by 6 to 17% and reduces DRAM energy by 9 to 22% over four existing memory controllers.

Index Terms

1. Applied computing
  1. Computers in other domains
    1. Personal computers and PC applications
      1. Microcomputers
2. Computer systems organization
  1. Architectures
  2. Embedded and cyber-physical systems
    1. Embedded systems

Published in
ACM Transactions on Computer Systems Volume 31, Issue 4
December 2013
90 pages
ISSN: 0734-2071
EISSN: 1557-7333
DOI: 10.1145/2542150
Editor: Todd C. Mowry
Copyright © 2013 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 20 December 2013
- Revised: 1 June 2013
- Accepted: 1 June 2013
- Received: 1 December 2012
Author Tags
Programmable
memory controller
Qualifiers
- research-article
- Research
- Refereed