ABSTRACT
Although VLIW architectures offer the advantages of simplicity of design and high issue rates, a major impediment to their use is that they are not compatible with the existing software base. We describe new, simple hardware features for a VLIW machine we call DAISY (Dynamically Architected Instruction Set from Yorktown). DAISY is specifically intended to emulate existing architectures, so that all existing software for an old architecture (including operating system kernel code) runs without changes on the VLIW. Each time a new fragment of code is executed for the first time, a Virtual Machine Monitor (software residing in read-only memory) translates the code to VLIW primitives, parallelizes it, and saves it in a portion of main memory not visible to the old architecture. Subsequent executions of the same fragment do not require a translation, unless the saved translation has been cast out. We discuss the architectural requirements for such a VLIW to deal with issues including self-modifying code, precise exceptions, and aggressive reordering of memory references in the presence of strong multiprocessor (MP) consistency and memory-mapped I/O. We have implemented the dynamic parallelization algorithms for the PowerPC architecture. The initial results show high degrees of instruction-level parallelism with reasonable translation overhead and memory usage.
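The translate-on-first-execution scheme described in the abstract can be illustrated with a small sketch. This is a toy model, not the DAISY implementation: the class name `TranslationCache`, the `toy_translate` function, the fixed fragment capacity, and the least-recently-used cast-out policy are all assumptions made for illustration, since the abstract does not say how fragments are chosen for cast-out.

```python
from collections import OrderedDict


class TranslationCache:
    """Toy model of translate-on-first-execution with cast-out (hypothetical names)."""

    def __init__(self, translate, capacity):
        self.translate = translate      # callable: entry address -> translated fragment
        self.capacity = capacity        # max fragments kept (stands in for the hidden memory area)
        self.fragments = OrderedDict()  # entry address -> translated fragment, oldest use first
        self.translations = 0           # number of times the translator actually ran

    def lookup(self, entry_address):
        """Return the translated fragment, translating only on a miss."""
        if entry_address in self.fragments:
            # Already translated: reuse it and mark it as recently used.
            self.fragments.move_to_end(entry_address)
            return self.fragments[entry_address]
        # First execution (or execution after cast-out): translate and save.
        self.translations += 1
        fragment = self.translate(entry_address)
        self.fragments[entry_address] = fragment
        if len(self.fragments) > self.capacity:
            # Assumed policy: cast out the least recently used fragment.
            self.fragments.popitem(last=False)
        return fragment


def toy_translate(entry_address):
    # Placeholder for translation and parallelization into VLIW primitives.
    return f"vliw_code_for_{entry_address:#x}"


cache = TranslationCache(toy_translate, capacity=2)
for address in (0x100, 0x100, 0x200, 0x300, 0x100):
    cache.lookup(address)
print(cache.translations)  # prints 4
```

In this run the second execution of `0x100` reuses the saved translation, while the final execution of `0x100` pays for a retranslation because that fragment was cast out when `0x300` was saved.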
DAISY: dynamic compilation for 100% architectural compatibility
Proceedings of the 24th Annual International Symposium on Computer Architecture (ISCA '97)