Data and memory optimization techniques for embedded systems

Authors:
P. R. Panda

Synopsys, Inc., 700 E. Middlefield Rd., Mountain View, CA

F. Catthoor

Inter-University Microelectronics Centre and Katholieke Universiteit Leuven, Kapeldreef 75, Leuven, Belgium

N. D. Dutt

Center for Embedded Computer Systems, University of California at Irvine, Irvine, CA

K. Danckaert

Inter-University Microelectronics Centre, Kapeldreef 75, Leuven, Belgium

E. Brockmeyer

Inter-University Microelectronics Centre, Kapeldreef 75, Leuven, Belgium

C. Kulkarni

Inter-University Microelectronics Centre, Kapeldreef 75, Leuven, Belgium

A. Vandecappelle

Inter-University Microelectronics Centre, Kapeldreef 75, Leuven, Belgium

P. G. Kjeldsberg

Norwegian University of Science and Technology, Trondheim, Norway

ACM Transactions on Design Automation of Electronic Systems, Volume 6, Issue 2, pp. 149–206. https://doi.org/10.1145/375977.375978

Published: 01 April 2001

Abstract

We present a survey of the state-of-the-art techniques used in performing data and memory-related optimizations in embedded systems. The optimizations are targeted directly or indirectly at the memory subsystem, and affect one or more of three important cost metrics: area, performance, and power dissipation of the resulting implementation.

We first examine architecture-independent optimizations in the form of code transformations. We next cover a broad spectrum of optimization techniques that address memory architectures at varying levels of granularity, ranging from register files to on-chip memory, data caches, and dynamic memory (DRAM). We end with issues related to memory addressing.

References

AGARWAL, A., KRANTZ, D., AND NATARANJAN, V. 1995. Automatic partitioning of parallel loops and data arrays for distributed shared-memory multiprocessors. IEEE Trans. Parallel Distrib. Syst. 6, 9 (Sept.), 943-962.]] Google Scholar
AHMAD,I.AND CHEN, C. Y. R. 1991. Post-processor for data path synthesis using multiport memories. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '91, Santa Clara, CA, Nov. 11-14). IEEE Computer Society Press, Los Alamitos, CA, 276-279.]]Google Scholar
AHO, A., SETHI, R., AND ULLMAN, J. 1986. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, MA.]] Google Scholar
AMARASINGHE, S., ANDERSON, J., LAM, M., AND TSENG, C.-W. 1995. An overview of the suif compiler for scalable parallel machines. In Proceedings of the SIAM Conference on Parallel Processing for Scientific Computing (San Francisco, CA, Feb.). SIAM, Philadelphia, PA.]]Google Scholar
BAJWA,R.S.,HIRAKI, M., KOJIMA, H., GORNY,D.J.,NITTA, K., SHRIDHAR, A., SEKI, K., AND SASAKI, K. 1997. Instruction buffering to reduce power in processors for signal processing. IEEE Trans. Very Large Scale Integr. Syst. 5, 4, 417-424.]] Google Scholar
BAKSHI,S.AND GAJSKI, D. D. 1995. A memory selection algorithm for high-performance pipelines. In Proceedings of the European Conference EURO-DAC '95 with EURO-VHDL '95 on Design Automation (Brighton, UK, Sept. 18-22), G. Musgrave, Ed. IEEE Computer Society Press, Los Alamitos, CA, 124-129.]] Google Scholar
BALAKRISHNAN, M., BANERJI,D.K.,MAJUMDAR,A.K.,LINDERS,J.G.,AND MAJITHIA,J. C. 1990. Allocation of multiport memories in data path synthesis. IEEE Trans. Comput.-Aided Des. 7, 4 (Apr.), 536-540.]]Google Scholar
BALASA, F., CATTHOOR, F., AND DE MAN, H. 1994. Dataflow-driven memory allocation for multi-dimensional signal processing systems. In Proceedings of the 1994 IEEE/ACM International Conference on Computer-Aided Design (ICCAD '94, San Jose, CA, Nov. 6-10), J. A. G. Jess and R. Rudell, Eds. IEEE Computer Society Press, Los Alamitos, CA, 31-34.]] Google Scholar
BALASA, F., CATTHOOR, F., AND DE MAN, H. 1995. Background memory area estimation for multidimensional signal processing systems. IEEE Trans. Very Large Scale Integr. Syst. 3, 2 (June), 157-172.]] Google Scholar
BANERJEE, P., CHANDY, J., GUPTA, M., HODGES, E., HOLM, J., LAIN, A., PALERMO, D., RA- MASWAMY, S., AND SU, E. 1995. The paradigm compiler for distributed-memory multicomputers. IEEE Computer 28, 10 (Oct.), 37-47.]] Google Scholar
BANERJEE, U. 1998. Dependence Analysis for Supercomputing. Kluwer Academic Publishers, Hingham, MA.]] Google Scholar
BANERJEE, U., EIGENMANN, R., NICOLAU, A., AND PADUA, D. A. 1993. Automatic program parallelization. Proc. IEEE 81, 2 (Feb.), 211-243.]]Google Scholar
BELLAS, N., HAJJ,I.N.,POLYCHRONOPOULOS,C.D.,AND STAMOULIS, G. 2000. Architectural and compiler techniques for energy reduction in high-performance microprocessors. IEEE Trans. Very Large Scale Integr. Syst. 8, 3 (June), 317-326.]] Google Scholar
BENINI,L.AND DE MICHELI, G. 2000. System-level power optimization techniques and tools. ACM Trans. Des. Autom. Electron. Syst. 5, 2 (Apr.), 115-192.]] Google Scholar
BENINI, L., DE MICHELI, G., MACII, E., PONCINO, M., AND QUER, S. 1998a. Power optimization of core-based systems by address bus encoding. IEEE Trans. Very Large Scale Integr. Syst. 6, 4, 554-562.]] Google Scholar
BENINI, L., DE MICHELI, G., MACII, E., SCIUTO, D., AND SILVANO, C. 1998b. Address bus encoding techniques for system-level power optimization. In Proceedings of the Conference on Design, Automation and Test in Europe 98. 861-866.]] Google Scholar
BENINI, L., MACII, A., AND PONCINO, M. 2000. A recursive algorithm for low-power memory partitioning. In Proceedings of the IEEE International Symposium on Low Power Design (Rapallo, Italy, Aug.). IEEE Computer Society Press, Los Alamitos, CA, 78-83.]] Google Scholar
BROCKMEYER, E., VANDECAPPELLE, A., AND CATTHOOR, F. 2000a. Systematic cycle budget versus system power trade-off: a new perspective on system exploration of real-time datadominated applications. In Proceedings of the IEEE International Symposium on Low Power Design (Rapallo, Italy, Aug.). IEEE Computer Society Press, Los Alamitos, CA, 137-142.]] Google Scholar
BROCKMEYER, E., WUYTACK, S., VANDECAPPELLE, A., AND CATTHOOR, F. 2000b. Low power storage cycle budget tool support for hierarchical graphs. In Proceedings of the 13th ACM/IEEE International Symposium on System-Level Synthesis (Madrid, Sept). ACM Press, New York, NY, 20-22.]] Google Scholar
CATTHOOR, F., DANCKAERT, K., KULKARNI, C., AND OMNES, T. 2000. Data transfer and storage architecture issues and exploration in multimedia processors. In Programmable Digital Signal Processors: Architecture, Programming, and Applications, Y. H. Yu, Ed. Marcel Dekker, Inc., New York, NY.]]Google Scholar
CATTHOOR, F., JANSSEN, M., NACHTERGAELE, L., AND MAN, H. D. 1996. System-level dataflow transformations for power reduction in image and video processing. In Proceedings of the International Conference on Electronic Circuits and Systems on Electronic Circuits and Systems (Oct.). 1025-1028.]]Google Scholar
CATTHOOR, F., WUYTACK, S., DE GREEF, E., BALASA, F., NACHTERGAELE, L., AND VANDECAPPELLE, A. 1998. Custom Memory Management Methodology: Exploration of Memory Organization for Embedded Multimedia System Design. Kluwer Academic, Dordrecht, Netherlands.]] Google Scholar
CATTHOOR, F., FRANSSEN, F., WUYTACK, S., NACHTERGAELE, L., AND DE MAN, H. 1994. Global communication and memory optimizing transformations for low power systems. In Proceed-ings of the International Workshop on Low Power Design. 203-208.]]Google Scholar
CHAITIN, G., AUSLANDER, M., CHANDRA, A., COCKE, J., HOPKINS, M., AND MARKSTEIN, P. 1981. Register allocation via coloring. Comput. Lang. 6, 1, 47-57.]]Google Scholar
CHANG, H.-K AND LIN, Y.-L. 2000. Array allocation taking into account SDRAM characteristics. In Proceedings of the Asia and South Pacific Conference on Design Automation (Yokohama, Jan.). 497-502.]] Google Scholar
CHEN, T.-S. AND SHEU, J.-P. 1994. Communication-free data allocation techniques for parallelizing compilers on multicomputers. IEEE Trans. Parallel Distrib. Syst. 5, 9 (Sept.), 924-938.]] Google Scholar
CIERNIAK,M.AND LI, W. 1995. Unifying data and control transformations for distributed shared-memory machines. SIGPLAN Not. 30, 6 (June), 205-217.]] Google Scholar
CRUZ, J.-L., GONZALEZ, A., VALERO, M., AND TOPHAM, N. 2000. Multiple-banked register file architectures. In Proceedings of the 27th International Symposium on Computer Architecture (ISCA-27, Vancouver, B.C., June). ACM, New York, NY, 315-325.]] Google Scholar
CUPPU, V., JACOB,B.L.,DAVIS, B., AND MUDGE, T. N. 1999. A performance comparison of contemporary dram architectures. In Proceedings of the International Symposium on Computer Architecture (Atlanta, GA, May). 222-233.]] Google Scholar
DA SILVA,J.L.,CATTHOOR, F., VERKEST, D., AND DE MAN, H. 1998. Power exploration for dynamic data types through virtual memory management refinement. In Proceedings of the 1998 International Symposium on Low Power Electronics and Design (ISLPED '98, Monterey, CA, Aug. 10-12), A. Chandrakasan and S. Kiaei, Chairs. ACM Press, New York, NY, 311-316.]] Google Scholar
DANCKAERT, K., CATTHOOR, F., AND MAN, H. D. 1996. System-level memory management for weakly parallel image processing. In Proceedings of the Conference on EuroPar'96 Parallel Processing (Lyon, France, Aug.). Springer-Verlag, New York, NY, 217-225.]] Google Scholar
DANCKAERT, K., CATTHOOR, F., AND MAN, H. D. 1999. Platform independent data transfer and storage exploration illustrated on a parallel cavity detection algorithm. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA99). 1669-1675.]]Google Scholar
DANCKAERT, K., CATTHOOR, F., AND MAN, H. D. 2000. A preprocessing step for global loop transformations for data transfer and storage optimization. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (San Jose CA, Nov.).]] Google Scholar
DARTE, A., RISSET, T., AND ROBERT, Y. 1993. Loop nest scheduling and transformations. In Environments and Tools for Parallel Scientific Computing, J. J. Dongarra and B. Tou-rancheau, Eds. Elsevier Advances in parallel computing series. Elsevier Sci. Pub. B. V., Amsterdam, The Netherlands, 309-332.]] Google Scholar
DARTE,A.AND ROBERT, Y. 1995. Affine-by-statement scheduling of uniform and affine loop nests over parametric domains. J. Parallel Distrib. Comput. 29, 1 (Aug. 15), 43-59.]] Google Scholar
DIGUET,J.PH., WUYTACK, S., CATTHOOR, F., AND DE MAN, H. 1997. Formalized methodology for data reuse exploration in hierarchical memory mappings. In Proceedings of the 1997 International Symposium on Low Power Electronics and Design (ISLPED '97, Monterey, CA, Aug. 18-20), B. Barton, M. Pedram, A. Chandrakasan, and S. Kiaei, Chairs. ACM Press, New York, NY, 30-35.]] Google Scholar
DING,C.AND KENNEDY, K. 2000. The memory bandwidth bottleneck and its amelioration by a compiler. In Proceedings of the International Symposium on Parallel and Distributed Processing (Cancun, Mexico, May). 181-189.]] Google Scholar
DE GREEF,E.AND CATTHOOR, F. 1996. Reducing storage size for static control programs mapped onto parallel architectures. In Proceedings of the Dagstuhl Seminar on Loop Parallelisation (Schloss Dagstuhl, Germany, Apr.).]]Google Scholar
FEAUTRIER, P. 1991. Dataflow analysis of array and scalar references. Int. J. Parallel Program. 20, 1, 23-53.]]Google Scholar
FEAUTRIER, P. 1995. Compiling for massively parallel architectures: A perspective. Microprocess. Microprogram. 41, 5-6 (Oct.), 425-439.]] Google Scholar
FRABOULET, A., HUARD, G., AND MIGNOTTE, A. 1999. Loop alignment for memory access optimisation. In Proceedings of the 12th ACM/IEEE International Symposium on System-Level Synthesis (San Jose CA, Dec.). ACM Press, New York, NY, 70-71.]] Google Scholar
FRANSSEN, F., BALASA, F., VAN SWAAIJ, M., CATTHOOR, F., AND MAN, H. D. 1993. Modeling multi-dimensional data and control flow. IEEE Trans. Very Large Scale Integr. Syst. 1,3 (Sept.), 319-327.]]Google Scholar
FRANSSEN, F., NACHTERGAELE, L., SAMSOM, H., CATTHOOR, F., AND MAN, H. D. 1994. Control flow optimization for fast system simulation and storage minimization. In Proceedings of the International Conference on Design and Test (Paris, Feb.). 20-24.]]Google Scholar
GAJSKI, D., DUTT, N., LIN, S., AND WU, A. 1992. High Level Synthesis: Introduction to Chip and System Design. Kluwer Academic Publishers, Hingham, MA.]] Google Scholar
GAREY,M.R.AND JOHNSON, D. S. 1979. Computers and Intractibility - A Guide to the Theory of NP-Completeness. W. H. Freeman and Co., New York, NY.]] Google Scholar
GHEZ, C., MIRANDA, M., VANDECAPPELLE, A., CATTHOOR, F., AND VERKEST, D. 2000. Systematic high-level address code transformations for piece-wise linear indexing: illustration on a medical imaging algorithm. In Proceedings of the IEEE Workshop on Signal Processing Systems (Lafayette, LA, Oct.). IEEE Press, Piscataway, NJ, 623-632.]]Google Scholar
GONZALEZ, A., ALIAGAS, C., AND VALERO, M. 1995. A data cache with multiple caching strategies tuned to different types of locality. In Proceedings of the 9th ACM International Conference on Supercomputing (ICS '95, Barcelona, Spain, July 3-7), M. Valero, Chair. ACM Press, New York, NY, 338-347.]] Google Scholar
GOOSSENS, G., VANDEWLLE, J., AND DE MAN, H. 1989. Loop optimization in register-transfer scheduling for DSP-systems. In Proceedings of the 26th ACM/IEEE Conference on Design Automation (DAC '89, Las Vegas, NV, June 25-29), D. E. Thomas, Ed. ACM Press, New York, NY, 826-831.]] Google Scholar
GRANT,D.AND DENYER, P. B. 1991. Address generation for array access based on modulus m counters. In Proceedings of the European Conference on Design Automation (EDAC, Feb.). 118-123.]] Google Scholar
GRANT, D., DENYER,P.B.,AND FINLAY, I. 1989. Synthesis of address generators. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '89, Santa Clara, CA, Nov.). ACM Press, New York, NY, 116-119.]]Google Scholar
GRANT,D.M.,MEERBERGEN,J.V.,AND LIPPENS, P. E. R. 1994. Optimization of address generator hardware. In Proceedings of the 1994 Conference on European Design and Test (Paris, France, Feb.). 325-329.]]Google Scholar
GREEF,E.D.,CATTHOOR, F., AND MAN, H. D. 1995. Memory organization for video algorithms on programmable signal processors. In Proceedings of the IEEE International Conference on Computer Design (ICCD '95, Austin TX, Oct.). IEEE Computer Society Press, Los Alamitos, CA, 552-557.]] Google Scholar
GREEF,E.D.,CATTHOOR, F., AND MAN, H. D. 1997. Array placement for storage size reduction in embedded multimedia systems. In Proceedings of the International Conference on Applic.-Spec./Array Processors (Zurich, July). 66-75.]] Google Scholar
GRUN, P., BALASA, F., AND DUTT, N. 1998. Memory size estimation for multimedia applications. In Proceedings of the Sixth International Workshop on Hardware/Software Codesign (CODES/CASHE '98, Seattle, WA, Mar. 15-18), G. Borriello, A. A. Jerraya, and L. Lavagno, Chairs. IEEE Computer Society Press, Los Alamitos, CA, 145-149.]] Google Scholar
GRUN, P., DUTT, N., AND NICOLAU, A. 2000a. Memory aware compilation through accurate timing extraction. In Proceedings of the Conference on Design Automation (Los Angeles, CA, June). ACM Press, New York, NY, 316-321.]] Google Scholar
GRUN, P., DUTT, N., AND NICOLAU, A. 2000b. MIST: An algorithm for memory miss traffic management. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (San Jose, CA, Nov.). ACM Press, New York, NY, 431-437.]] Google Scholar
GRUN, P., DUTT, N., AND NICOLAU, A. 2001. Access pattern based local memory customization for low power embedded systems. In Proceedings of the Conference on Design, Automation, and Test in Europe (Munich, Mar.).]] Google Scholar
GUPTA, M., SCHONBERG, E., AND SRINIVASAN, H. 1996. A unified framework for optimizing communication in data-parallel programs. IEEE Trans. Parallel Distrib. Syst. 7,7, 689-704.]] Google Scholar
GUPTA, S., MIRANDA, M., CATTHOOR, F., AND GUPTA, R. 2000. Analysis of high-level address code transformations for programmable processors. In Proceedings of the 3rd ACM/IEEE Conference on Design and Test in Europe (Mar.). ACM Press, New York, NY, 9-13.]] Google Scholar
HALAMBI, A., GRUN, P., GANESH, V., KHARE, A., DUTT, N., AND NICOLAU, A. 1999a. Expression: A language for architecture exploration through compiler/simulator retargetability. In Proceedings of the Conference on DATE (Munich, Mar.).]] Google Scholar
HALAMBI, A., GRUN, P., TOMIYAMA, H., DUTT, N., AND NICOLAU, A. 1999b. Automatic software toolkit generation for embedded systems-on-chip. In Proceedings of the Conference on ICVC.]]Google Scholar
HALL,M.W.,HARVEY,T.J.,KENNEDY, K., MCINTOSH, N., MCKINLEY,K.S.,OLDHAM,J.D., PALECZNY,M.H.,AND ROTH, G. 1993. Experiences using the ParaScope Editor: an interactive parallel programming tool. SIGPLAN Not. 28, 7 (July), 33-43.]] Google Scholar
HALL, M., ANDERSON, J., AMARASINGHE, S., MURPHY, B., LIAO, S., BUGNION, E., AND LAM,M. 1996. Maximizing multiprocessor performance with the SUIF compiler. IEEE Computer 29, 12 (Dec.), 84-89.]] Google Scholar
HENNESSY,J.L.AND PATTERSON, D. A. 1996. Computer Architecture: A Quantitative Approach. 2nd ed. Morgan Kaufmann Publishers Inc., San Francisco, CA.]] Google Scholar
HUANG, C.-Y., CHEN, Y.-S., LIN, Y.-L., AND HSU, Y.-C. 1990. Data path allocation based on bipartite weighted matching. In Proceedings of the 27th ACM/IEEE Conference on Design Automation (DAC '90, Orlando, FL, June 24-28), R. C. Smith, Chair. ACM Press, New York, NY, 499-504.]] Google Scholar
ISO/IEC MOVING PICTURE EXPERTS GROUP. 2001. The MPEG Home Page (http://www.cselt.it/ mpeg/)11.]]Google Scholar
ITOH, K., SASAKI, K., AND NAKAGOME, Y. 1995. Trends in low-power RAM circuit technologies. Proc. IEEE 83, 4 (Apr.), 524-543.]]Google Scholar
JHA,P.K.AND DUTT, N. 1997. Library mapping for memories. In Proceedings of the Conference on European Design and Test (Mar.). 288-292.]] Google Scholar
JOUPPI, N. 1990. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. In Proceedings of the 17th International Symposium on Computer Architecture (ISCA '90, Seattle, WA, May). IEEE Press, Piscat-away, NJ, 364-373.]] Google Scholar
KANDEMIR, M., VIJAYKRISHNAN, N., IRWIN,M.J.,AND YE, W. 2000. Influence of compiler optimisations on system power. In Proceedings of the Conference on Design Automation (Los Angeles, CA, June). ACM Press, New York, NY, 304-307.]] Google Scholar
KARCHMER,D.AND ROSE, J. 1994. Definition and solution of the memory packing problem for field-programmable systems. In Proceedings of the 1994 IEEE/ACM International Conference on Computer-Aided Design (ICCAD '94, San Jose, CA, Nov. 6-10), J. A. G. Jess and R. Rudell, Eds. IEEE Computer Society Press, Los Alamitos, CA, 20-26.]] Google Scholar
KELLY,W.AND PUGH, W. 1992. Generating schedules and code within a unified reordering transformation framework. UMIACS-TR-92-126. University of Maryland at College Park, College Park, MD.]] Google Scholar
KHARE, A., PANDA,P.R.,DUTT,N.D.,AND NICOLAU, A. 1999. High-level synthesis with SDRAMs and RAMBUS DRAMs. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E82-A, 11 (Nov.), 2347-2355.]]Google Scholar
KIM,T.AND LIU, C. L. 1993. Utilization of multiport memories in data path synthesis. In Proceedings of the 30th ACM/IEEE International Conference on Design Automation (DAC '93, Dallas, TX, June 14-18), A. E. Dunlop, Ed. ACM Press, New York, NY, 298-302.]] Google Scholar
KIROVSKI, D., LEE, C., POTKONJAK, M., AND MANGIONE-SMITH, W. 1999. Application-driven synthesis of memory-intensive systems-on-chip. IEEE Trans. Comput.-Aided Des. 18,9 (Sept.), 1316-1326.]]Google Scholar
KJELDSBERG,P.G.,CATTHOOR, F., AND AAS, E. J. 2000a. Automated data dependency size estimation with a partially fixed execution ordering. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (San Jose, CA, Nov.). ACM Press, New York, NY, 44-50.]] Google Scholar
KJELDSBERG,P.G.,CATTHOOR,, F., AND AAS, E. J. 2000b. Storage requirement estimation for data-intensive applications with partially fixed execution ordering. In Proceedings of the ACM/IEEE Workshop on Hardware/Software Co-Design (San Diego CA, May). ACM Press, New York, NY, 56-60.]] Google Scholar
KOHAVI, Z. 1978. Switching and Finite Automata Theory. McGraw-Hill, Inc., New York, NY.]] Google Scholar
KOLSON,D.J.,NICOLAU, A., AND DUTT, N. 1994. Minimization of memory traffic in high-level synthesis. In Proceedings of the 31st Annual Conference on Design Automation (DAC '94, San Diego, CA, June 6-10), M. Lorenzetti, Chair. ACM Press, New York, NY, 149-154.]] Google Scholar
KRAMER,H.AND MULLER, J. 1992. Assignment of global memory elements for multi-process vhdl specifications. In Proceedings of the International Conference on Computer Aided Design. 496-501.]] Google Scholar
KULKARNI, C., CATTHOOR, F., AND MAN, H. D. 1999. Cache transformations for low power caching in embedded multimedia processors. In Proceedings of the International Symposium on Parallel Processing (Orlando, FL, Apr.). 292-297.]] Google Scholar
KULKARNI, C., CATTHOOR, F., AND MAN, H. D. 2000. Advanced data layout organization for multi-media applications. In Proceedings of the Workshop on Parallel and Distributed Computing in Image Processing, Video Processing, and Multimedia (PDIVM 2000, Cancun, Mexico, May).]] Google Scholar
KULKARNI,D.AND STUMM, M. 1995. Linear loop transformations in optimizing compilers for parallel machines. Aust. Comput. J. 27, 2 (May), 41-50.]]Google Scholar
KURDAHI,F.J.AND PARKER, A. C. 1987. REAL: A program for REgister ALlocation. In Proceedings of the 24th ACM/IEEE Conference on Design Automation (DAC '87, Miami Beach, FL, June 28-July 1), A. O'Neill and D. Thomas, Eds. ACM Press, New York, NY, 210-215.]] Google Scholar
LEE, H.-D. AND HWANG, S.-Y. 1995. A scheduling algorithm for multiport memory minimization in datapath synthesis. In Proceedings of the Conference on Asia Pacific Design Automation (CD-ROM) (ASP-DAC '95, Makuhari, Japan, Aug. 29-Sept. 4), I. Shirakawa, Chair. ACM Press, New York, NY, 93-100.]] Google Scholar
LEFEBVRE,V.AND FEAUTRIER, P. 1997. Optimizing storage size for static control programs in automatic parallelizers. In Proceedings of the Conference on EuroPar. Springer-Verlag, New York, NY, 356-363.]] Google Scholar
LEUPERS,R.AND MARWEDEL, P. 1996. Algorithms for address assignment in DSP code generation. In Proceedings of the 1996 IEEE/ACM International Conference on Computer-Aided Design (ICCAD '96, San Jose, CA, Nov. 10-14), R. A. Rutenbar and R. H. J. M. Otten, Chairs. IEEE Computer Society Press, Los Alamitos, CA, 109-112.]] Google Scholar
LI,W.AND PINGALI, K. 1994. A singular loop transformation framework based on non-singular matrices. Int. J. Parallel Program. 22, 2 (Apr.), 183-205.]] Google Scholar
LI,Y.AND HENKEL, J.-R. 1998. A framework for estimation and minimizing energy dissipation of embedded HW/SW systems. In Proceedings of the 35th Annual Conference on Design Automation (DAC '98, San Francisco, CA, June 15-19), B. R. Chawla, R. E. Bryant, and J. M. Rabaey, Chairs. ACM Press, New York, NY, 188-193.]] Google Scholar
LI,Y.AND WOLF, W. 1998. Hardware/software co-synthesis with memory hierarchies. In Proceedings of the 1998 IEEE/ACM International Conference on Computer-Aided Design (ICCAD '98, San Jose, CA, Nov. 8-12), H. Yasuura, Chair. ACM Press, New York, NY, 430-436.]] Google Scholar
LIEM, C., PAULIN, P., AND JERRAYA, A. 1996. Address calculation for retargetable compilation and exploration of instruction-set architectures. In Proceedings of the 33rd Annual Conference on Design Automation (DAC '96, Las Vegas, NV, June 3-7), T. P. Pennino and E. J. Yoffa, Chairs. ACM Press, New York, NY, 597-600.]] Google Scholar
LOVEMAN, D. B. 1977. Program improvement by source-to-source transformation. J. ACM 24, 1 (Jan.), 121-145.]] Google Scholar
LY, T., KNAPP, D., MILLER, R., AND MACMILLEN, D. 1995. Scheduling using behavioral templates. In Proceedings of the 32nd ACM/IEEE Conference on Design Automation (DAC '95, San Francisco, CA, June 12-16), B. T. Preas, Ed. ACM Press, New York, NY, 101-106.]] Google Scholar
MANJIAKIAN,N.AND ABDELRAHMAN, T. 1995. Fusion of loops for parallelism and locality. Tech. Rep. CSRI-315. Dept. of Computer Science, University of Toronto, Toronto, Ont., Canada.]]Google Scholar
MASSELOS, K., CATTHOOR, F., GOUTIS,C.E.,AND MAN, H. D. 1999a. A performance oriented use methodology of power optimizing code transformations for multimedia applications realized on programmable multimedia processors. In Proceedings of the IEEE Workshop on Signal Processing Systems (Taipeh, Taiwan). IEEE Computer Society Press, Los Alamitos, CA, 261-270.]]Google Scholar
MASSELOS, K., DANCKAERT, K., CATTHOOR, F., GOUTIS,C.E.,AND DEMAN, H. 1999b. A methodology for power efficient partitioning of data-dominated algorithm specifications within performance constraints. In Proceedings of the IEEE International Symposium on Low Power Design (San Diego CA, Aug.). IEEE Computer Society Press, Los Alamitos, CA, 270-272.]] Google Scholar
MCFARLING, S. 1989. Program optimization for instruction caches. In Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-III, Boston, MA, Apr. 3-6), J. Emer, Chair. ACM Press, New York, NY, 183-191.]] Google Scholar
MCKINLEY, K. S. 1998. A compiler optimization algorithm for shared-memory multiprocessors. IEEE Trans. Parallel Distrib. Syst. 9, 8, 769-787.]] Google Scholar
MCKINLEY,K.S.,CARR, S., AND TSENG, C.-W. 1996. Improving data locality with loop transformations. ACM Trans. Program. Lang. Syst. 18, 4 (July), 424-453.]] Google Scholar
MENG, T., GORDON, B., TSENG, E., AND HUNG, A. 1995. Portable video-on-demand in wireless communication. Proc. IEEE 83, 4 (Apr.), 659-690.]]Google Scholar
MIRANDA, M., CATTHOOR, F., AND MAN, H. D. 1994. Address equation optimization and hardware sharing for real-time signal processing applications. In Proceedings of the IEEE Workshop on VLSI Signal Processing VII (La Jolla, CA, Oct. 26-28). IEEE Press, Piscat-away, NJ, 208-217.]]Google Scholar
MIRANDA,M.A.,CATTHOOR,F.V.M.,JANSSEN, M., AND DE MAN, H. J. 1998. High-level address optimization and synthesis techniques for data-transfer-intensive applications. IEEE Trans. Very Large Scale Integr. Syst. 6, 4, 677-686.]] Google Scholar
MISHRA, P., GRUN, P., DUTT, N., AND NICOLAU, A. 2001. Processor-memory co-exploration driven by a memory-aware architecture description language. In Proceedings of the Conference on VLSIDesign (Bangalore).]] Google Scholar
MOWRY,T.C.,LAM,M.S.,AND GUPTA, A. 1992. Design and evaluation of a compiler algorithm for prefetching. SIGPLAN Not. 27, 9 (Sept.), 62-73.]] Google Scholar
MUSOLL, E., LANG, T., AND CORTADELLA, J. 1998. Working-zone encoding for reducing the energy in microprocessor address buses. IEEE Trans. Very Large Scale Integr. Syst. 6,4, 568-572.]] Google Scholar
NEERACHER,M.AND RUHL, R. 1993. Automatic parallelization of linpack routines on distributed memory parallel processors. In Proceedings of the IEEE International Symposium on Parallel Processing (Newport Beach CA, Apr.). IEEE Computer Society Press, Los Alamitos, CA.]]Google Scholar
NICOLAU,A.AND NOVACK, S. 1993. Trailblazing: A hierarchical approach to percolation scheduling. In Proceedings of the International Conference on Parallel Processing: Software (Boca Raton, FL, Aug.). CRC Press, Inc., Boca Raton, FL, 120-124.]] Google Scholar
PADUA,D.A.AND WOLFE, M. J. 1986. Advanced compiler optimizations for supercomputers. Commun. ACM 29, 12 (Dec.), 1184-1201.]] Google Scholar
PANDA, P. R. 1999. Memory bank customization and assignment in behavioral synthesis. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (San Jose, CA, Nov.). IEEE Computer Society Press, Los Alamitos, CA, 477-481.]] Google Scholar
PANDA,P.AND DUTT, N. 1999. Low-power memory mapping through reducing address bus activity. IEEE Trans. Very Large Scale Integr. Syst. 7, 3 (Sept.), 309-320.]] Google Scholar
PANDA,P.R.,DUTT,N.D.,AND NICOLAU, A. 1997. Memory data organization for improved cache performance in embedded processor applications. ACM Trans. Des. Autom. Electron. Syst. 2, 4, 384-409.]] Google Scholar
PANDA,P.R.,DUTT,N.D.,AND NICOLAU, A. 1998. Incorporating DRAM access modes into high-level synthesis. IEEE Trans. Comput.-Aided Des. 17, 2 (Feb.), 96-109.]]Google Scholar
PANDA,P.R.,DUTT,N.D.,AND NICOLAU, A. 1999a. Local memory exploration and optimization in embedded systems. IEEE Trans. Comput.-Aided Des. 18, 1 (Jan.), 3-13.]]Google Scholar
PANDA,P.R.,DUTT,N.D.,AND NICOLAU, A. 1999b. Memory Issues in Embedded Systems-On-Chip: Optimizations and Exploration. Kluwer Academic Publishers, Hingham, MA.]] Google Scholar
PANDA,P.R.,DUTT,N.D.,AND NICOLAU, A. 2000. On-chip vs. off-chip memory: The data partitioning problem in embedded processor-based systems. ACM Trans. Des. Autom. Electron. Syst. 5, 3 (July), 682-704.]] Google Scholar
PARHI, K. 1989. Rate-optimal fully-static multiprocessor scheduling of data-flow signal processing programs. In Proceedings of the IEEE International Symposium on Circuits and Systems (Portland, OR, May). IEEE Press, Piscataway, NJ, 1923-1928.
PASSOS, N. AND SHA, E. 1994. Full parallelism of uniform nested loops by multi-dimensional retiming. In Proceedings of the 1994 International Conference on Parallel Processing (Aug.). CRC Press, Inc., Boca Raton, FL, 130-133.
PASSOS, N., SHA, E., AND CHAO, L.-F. 1995. Multi-dimensional interleaving for time-and-memory design optimization. In Proceedings of the IEEE International Conference on Computer Design (Austin, TX, Oct.). IEEE Computer Society Press, Los Alamitos, CA, 440-445.
PAUWELS, M., CATTHOOR, F., LANNEER, D., AND MAN, H. D. 1989. Type-handling in bit-true silicon compilation for DSP. In Proceedings of the European Conference on Circuit Theory and Design (Brighton, U.K., Sept.). 166-170.
POLYCHRONOPOULOS, C. D. 1988. Compiler optimizations for enhancing parallelism and their impact in architecture design. IEEE Trans. Comput. 37, 8 (Aug.), 991-1004.
PUGH, W. AND WONNACOTT, D. 1993. An evaluation of exact methods for analysis of value-based array data dependences. In Proceedings of the 6th Workshop on Programming Languages and Compilers for Parallel Computing (Portland, OR). 546-566.
QUILLERE, F. AND RAJOPADHYE, S. 1998. Optimizing memory usage in the polyhedral model. In Proceedings of the Conference on Massively Parallel Computer Systems (Apr.).
RAMACHANDRAN, L., GAJSKI, D., AND CHAIYAKUL, V. 1993. An algorithm for array variable clustering. In Proceedings of the IEEE European Conference on Design Automation (EURO-DAC '93). IEEE Computer Society Press, Los Alamitos, CA.
SAGHIR, M. A. R., CHOW, P., AND LEE, C. G. 1996. Exploiting dual data-memory banks in digital signal processors. ACM SIGOPS Oper. Syst. Rev. 30, 5, 234-243.
SCHMIT, H. AND THOMAS, D. E. 1997. Synthesis of application-specific memory designs. IEEE Trans. Very Large Scale Integr. Syst. 5, 1, 101-111.
SCHMIT, H. AND THOMAS, D. E. 1995. Address generation for memories containing multiple arrays. In Proceedings of the 1995 IEEE/ACM International Conference on Computer-Aided Design (ICCAD-95, San Jose, CA, Nov. 5-9), R. Rudell, Ed. IEEE Computer Society Press, Los Alamitos, CA, 510-514.
SEMERIA, L., SATO, K., AND DE MICHELI, G. 2000. Resolution of dynamic memory allocation and pointers for the behavioral synthesis from C. In Proceedings of the European Conference on Design Automation and Test (DATE 2000, Paris, Mar.). 312-319.
SHACKLEFORD, B., YASUDA, M., OKUSHI, E., KOIZUMI, H., TOMIYAMA, H., AND YASUURA, H. 1997. Memory-CPU size optimization for embedded system designs. In Proceedings of the 34th Conference on Design Automation (DAC '97, Anaheim, CA, June).
SHANG, W., HODZIC, E., AND CHEN, Z. 1996. On uniformization of affine dependence algorithms. IEEE Trans. Comput. 45, 7 (July), 827-839.
SHANG, W., O'KEEFE, M. T., AND FORTES, J. A. B. 1992. Generalized cycle shrinking. In Proceedings of the International Workshop on Algorithms and Parallel VLSI Architectures II (Gers, France, June 3-6), P. Quinton and Y. Robert, Eds. Elsevier Sci. Pub. B. V., Amsterdam, The Netherlands, 131-144.
SHIUE, W. AND CHAKRABARTI, C. 1999. Memory exploration for low power, embedded systems. In Proceedings of the 36th ACM/IEEE Conference on Design Automation (New Orleans, LA, June). ACM Press, New York, NY, 140-145.
SHIUE, W.-T., TADAS, S., AND CHAKRABARTI, C. 2000. Low power multi-module, multiport memory design for embedded systems. In Proceedings of the IEEE Workshop on Signal Processing Systems (Lafayette, LA, Oct.). IEEE Press, Piscataway, NJ, 529-538.
SLOCK, P., WUYTACK, S., CATTHOOR, F., AND DE JONG, G. 1997. Fast and extensive system-level memory exploration for ATM applications. In Proceedings of the Tenth International Symposium on System Synthesis (ISSS '97, Antwerp, Belgium, Sept. 17-19), F. Vahid and F. Catthoor, Chairs. IEEE Computer Society Press, Los Alamitos, CA, 74-81.
STAN, M. R. AND BURLESON, W. P. 1995. Bus-invert coding for low-power I/O. IEEE Trans. Very Large Scale Integr. Syst. 3, 1 (Mar.), 49-58.
STOK, L. AND JESS, J. A. G. 1992. Foreground memory management in data path synthesis. Int. J. Circuits Theor. Appl. 20, 3, 235-255.
SU, C.-L. AND DESPAIN, A. M. 1995. Cache design trade-offs for power and performance optimization: a case study. In Proceedings of the 1995 International Symposium on Low Power Design (ISLPD-95, Dana Point, CA, Apr. 23-26), M. Pedram, R. Brodersen, and K. Keutzer, Eds. ACM Press, New York, NY, 63-68.
SUDARSANAM, A. AND MALIK, S. 2000. Simultaneous reference allocation in code generation for dual data memory bank ASIPs. ACM Trans. Des. Autom. Electron. Syst. 5, 2 (Apr.), 242-264.
SYNOPSYS INC. 1997. Behavioral Compiler User Guide. Synopsys Inc, Mountain View, CA.
THIELE, L. 1989. On the design of piecewise regular processor arrays. In Proceedings of the IEEE International Symposium on Circuits and Systems (Portland, OR, May). IEEE Press, Piscataway, NJ, 2239-2242.
TOMIYAMA, H., HALAMBI, A., GRUN, P., DUTT, N., AND NICOLAU, A. 1999. Architecture description languages for systems-on-chip design. In Proceedings of the 6th Asia Pacific Conference on Chip Design Languages (Fukuoka, Japan, Oct.). 109-116.
TOMIYAMA, H., ISHIHARA, T., INOUE, A., AND YASUURA, H. 1998. Instruction scheduling for power reduction in processor-based system design. In Proceedings of the Conference on Design, Automation and Test in Europe 98. 855-860.
TOMIYAMA, H. AND YASUURA, H. 1996. Size-constrained code placement for cache miss rate reduction. In Proceedings of the ACM/IEEE International Symposium on System Synthesis (La Jolla, CA, Nov.). ACM Press, New York, NY, 96-101.
TOMIYAMA, H. AND YASUURA, H. 1997. Code placement techniques for cache miss rate reduction. ACM Trans. Des. Autom. Electron. Syst. 2, 4, 410-429.
TSENG, C. AND SIEWIOREK, D. P. 1986. Automated synthesis of data paths in digital systems. IEEE Trans. Comput.-Aided Des. 5, 3 (July), 379-395.
VANDECAPPELLE, A., MIRANDA, M., CATTHOOR, E. B. F., AND VERKEST, D. 1999. Global multimedia system design exploration using accurate memory organization feedback. In Proceedings of the 36th ACM/IEEE Conference on Design Automation (New Orleans, LA, June). ACM Press, New York, NY, 327-332.
VERBAUWHEDE, I., CATTHOOR, F., VANDEWALLE, J., AND MAN, H. D. 1989. Background memory management for the synthesis of algebraic algorithms on multi-processor DSP chips. In Proceedings of the IFIP 1989 International Conference on VLSI (IFIP VLSI '89, Munich, Aug.). IFIP, 209-218.
VERBAUWHEDE, I. M., SCHEERS, C. J., AND RABAEY, J. M. 1994. Memory estimation for high level synthesis. In Proceedings of the 31st Annual Conference on Design Automation (DAC '94, San Diego, CA, June 6-10), M. Lorenzetti, Chair. ACM Press, New York, NY, 143-148.
VERHAEGH, W., LIPPENS, P., AARTS, E., KORST, J., VAN MEERBERGEN, J., AND VAN DER WERF, A. 1995. Improved force-directed scheduling in high-throughput digital signal processing. IEEE Trans. Comput.-Aided Des. 14, 8 (Aug.), 945-960.
VERHAEGH, W., LIPPENS, P., AARTS, E., MEERBERGEN, J., AND VAN DER WERF, A. 1996. Multi-dimensional periodic scheduling: model and complexity. In Proceedings of the Conference on EuroPar'96 Parallel Processing (Lyon, France, Aug.). Springer-Verlag, New York, NY, 226-235.
WILSON, P. R., JOHNSTONE, M., NEELY, M., AND BOLES, D. 1995. Dynamic storage allocation: A survey and critical review. In Proceedings of the International Workshop on Memory Management (Kinross, Scotland, Sept.).
WOLF, M. E. AND LAM, M. S. 1991. A loop transformation theory and an algorithm to maximize parallelism. IEEE Trans. Parallel Distrib. Syst. 2, 4 (Oct.), 452-471.
WOLFE, M. 1991. The tiny loop restructuring tool. In Proceedings of the 1991 International Conference on Parallel Processing (Aug.).
WOLFE, M. 1996. High-Performance Compilers for Parallel Computing. Addison-Wesley, Reading, MA.
WUYTACK, S., CATTHOOR, F., JONG, G. D., AND MAN, H. D. 1999a. Minimizing the required memory bandwidth in VLSI system realizations. IEEE Trans. Very Large Scale Integr. Syst. 7, 4 (Dec.), 433-441.
WUYTACK, S., DA SILVA, J. L., CATTHOOR, F., JONG, G. D., AND YKMAN-COUVREUR, C. 1999b. Memory management for embedded network applications. IEEE Trans. Comput.-Aided Des. 18, 5 (May), 533-544.
WUYTACK, S., DIGUET, J.-P., CATTHOOR, F. V. M., AND DE MAN, H. J. 1998. Formalized methodology for data reuse exploration for low-power hierarchical memory mappings. IEEE Trans. Very Large Scale Integr. Syst. 6, 4, 529-537.
YKMAN-COUVREUR, C., LAMBRECHT, J., VERKEST, D., CATTHOOR, F., AND MAN, H. D. 1999. Exploration and synthesis of dynamic data sets in telecom network applications. In Proceedings of the 12th ACM/IEEE International Symposium on System-Level Synthesis (San Jose, CA, Dec.). ACM Press, New York, NY, 125-130.
ZHAO, Y. AND MALIK, S. 1999. Exact memory size estimation for array computation without loop unrolling. In Proceedings of the 36th ACM/IEEE Conference on Design Automation (New Orleans, LA, June). ACM Press, New York, NY, 811-816.

Index Terms

Data and memory optimization techniques for embedded systems
1. Hardware
  1. Electronic design automation
    1. High-level and register-transfer level synthesis
      1. Datapath optimization
    2. Logic synthesis
      1. Circuit optimization
  2. Integrated circuits
    1. Semiconductor memory
2. Software and its engineering
  1. Software notations and tools
    1. Compilers

Published in

ACM Transactions on Design Automation of Electronic Systems Volume 6, Issue 2
April 2001
127 pages
ISSN:1084-4309
EISSN:1557-7309
DOI:10.1145/375977

Copyright © 2001 ACM
Publisher
Association for Computing Machinery
New York, NY, United States

Author Tags
DRAM
SRAM
address generation
allocation
architecture exploration
code transformation
data cache
data optimization
high-level synthesis
memory architecture customization
memory power dissipation
register file
size estimation
survey