SIMP (Single Instruction stream/Multiple instruction Pipelining): a novel high-speed single-processor architecture

Authors:
K. Murakami

Department of Information Systems, Interdisciplinary Graduate School of Engineering Sciences, Kyushu University, Fukuoka, 816 JAPAN

,
N. Irie

Department of Information Systems, Interdisciplinary Graduate School of Engineering Sciences, Kyushu University, Fukuoka, 816 JAPAN

,
S. Tomita

Department of Information Systems, Interdisciplinary Graduate School of Engineering Sciences, Kyushu University, Fukuoka, 816 JAPAN

ACM SIGARCH Computer Architecture News, Volume 17, Issue 3, June 1989, pp. 78–85, https://doi.org/10.1145/74926.74935

Published: 01 April 1989

Abstract

SIMP is a novel multiple instruction-pipeline parallel architecture. It aims to enhance the performance of SISD processors drastically by exploiting both temporal and spatial parallelism, while preserving program compatibility. The degree of performance enhancement achieved by SIMP depends on (i) how multiple instructions are supplied continuously, and (ii) how data and control dependencies are resolved effectively. We have devised outstanding techniques for instruction fetch and dependency resolution. The instruction fetch mechanism employs unique schemes of (i) prefetching multiple instructions with the help of branch prediction, (ii) squashing instructions selectively, and (iii) providing multiple conditional modes as a result. The dependency resolution mechanism permits out-of-order execution of a sequential instruction stream. Our out-of-order execution model is based on Tomasulo's algorithm, which has been used in single instruction-pipeline processors. However, it is greatly extended and adapted to multiple instruction pipelining by (i) detecting and identifying multiple dependencies simultaneously, (ii) alleviating the effects of control dependencies with both eager execution and advance execution, and (iii) ensuring a precise machine state against branches and interrupts. By taking advantage of these techniques, SIMP is one of the most promising architectures for the coming generation of high-speed single processors.
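
To make "detecting and identifying multiple dependencies simultaneously" concrete, the sketch below scans a group of instructions fetched together and reports every register dependency between an earlier and a later instruction. It is only an illustration under stated assumptions, not the SIMP mechanism itself: the `Instr` record, the register names, and the `find_dependencies` helper are hypothetical, and the scan is a naive pairwise software loop, whereas hardware would perform these comparisons in parallel.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical register instruction: dest <- op(srcs)."""
    dest: str
    srcs: Tuple[str, ...]


def find_dependencies(group: List[Instr]) -> List[Tuple[int, int, str]]:
    """Return (earlier, later, kind) for each register dependency in the group.

    kind is "RAW" (later reads what earlier writes), "WAR" (later writes
    what earlier reads), or "WAW" (both write the same register).
    """
    deps = []
    for j, later in enumerate(group):
        for i in range(j):
            earlier = group[i]
            if earlier.dest in later.srcs:
                deps.append((i, j, "RAW"))
            if later.dest in earlier.srcs:
                deps.append((i, j, "WAR"))
            if later.dest == earlier.dest:
                deps.append((i, j, "WAW"))
    return deps


# Assumed example: four instructions fetched in the same cycle.
group = [
    Instr("r1", ("r2", "r3")),  # 0: r1 <- r2 op r3
    Instr("r4", ("r1", "r5")),  # 1: r4 <- r1 op r5
    Instr("r2", ("r6", "r7")),  # 2: r2 <- r6 op r7
    Instr("r1", ("r4", "r2")),  # 3: r1 <- r4 op r2
]
deps = find_dependencies(group)
for earlier, later, kind in deps:
    print(f"{kind}: instruction {later} depends on instruction {earlier}")
```

In a Tomasulo-style scheme, only the RAW entries represent true data flow; the WAR and WAW entries are name dependencies that renaming through tags can remove, which is why the sketch labels each kind separately.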

References

Acosta86 R.D.Acosta, J.Kjelstrup, and H.C.Torng, "An Instruction Issuing Approach to Enhancing Performance in Multiple Functional Unit Processors," IEEE Trans. Comput., vol.C-36, no.9, pp.815-828, Sept. 1986.
Colwell87 R.P.Colwell, R.P.Nix, J.J.O'Donnell, D.B.Papworth, and P.K.Rodman, "A VLIW Architecture for a Trace Scheduling Compiler," Proc. 2nd Int. Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS II), pp.180-192, Oct. 1987.
Fisher81 J.A.Fisher, "Trace Scheduling: A Technique for Global Microcode Compaction," IEEE Trans. Comput., vol.C-30, no.7, pp.478-490, July 1981.
Fisher83 J.A.Fisher, "Very Long Instruction Word Architectures and the ELI-512," Proc. 10th Ann. Int. Symp. Computer Architecture, pp.140-150, June 1983.
Hagiwara80 H.Hagiwara, S.Tomita, S.Oyanagi, and K.Shibayama, "A Dynamically Microprogrammable Computer with Low-Level Parallelism," IEEE Trans. Comput., vol.C-29, no.7, pp.577-695, July 1980.
Hwu87 W.W.Hwu and Y.N.Patt, "Checkpoint Repair for Out-of-order Execution Machines," Proc. 14th Ann. Int. Symp. Computer Architecture, pp.18-26, June 1987; also IEEE Trans. Comput., vol.C-36, no.12, pp.1496-1514, Dec. 1987.
Irie88 N.Irie, M.Kuga, K.Murakami, and S.Tomita, "Speedup Mechanisms and Performance Estimate for the SIMP Processor Prototype (in Japanese)," IPSJ WGARC report 73-11, Nov. 1988.
Kuga89 M.Kuga, K.Murakami, and S.Tomita, "Low-level Parallel Processing Algorithms for the SIMP Processor Prototype (in Japanese)," Proc. IPSJ Joint Symp. Parallel Processing'89, pp.163-170, Feb. 1989.
Lam88 M.Lam, "Software Pipelining: An Effective Scheduling Technique for VLIW Machines," Proc. SIGPLAN'88 Conf. Programming Language Design and Implementation, pp.318-328, June 1988.
Lee84 J.K.F.Lee and A.J.Smith, "Branch Prediction Strategies and Branch Target Buffer Design," IEEE Computer, vol.17, no.1, pp.6-22, Jan. 1984.
Murakami88 K.Murakami, A.Fukuda, T.Sueyoshi, and S.Tomita, "SIMP: Single Instruction stream/Multiple instruction Pipelining (in Japanese)," IPSJ WGARC report 69-4, Jan. 1988.
Patt85 Y.N.Patt, W-M.Hwu, and M.Shebanow, "HPS, A New Microarchitecture: Rationale and Introduction," Proc. 18th Ann. Workshop on Microprogramming, pp.103-108, Dec. 1985.
Pleszkun88 A.R.Pleszkun and G.S.Sohi, "The Performance Potential of Multiple Functional Unit Processors," Proc. 15th Ann. Int. Symp. Computer Architecture, pp.37-44, May 1988.
Rau89 B.R.Rau, D.W.L.Yen, W.Yen, and R.A.Towle, "The Cydra 5 Departmental Supercomputer," IEEE Computer, vol.22, no.1, Jan. 1989.
Smith85 J.E.Smith and A.R.Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors," Proc. 12th Ann. Int. Symp. Computer Architecture, pp.36-44, June 1985; also IEEE Trans. Comput., vol.C-37, no.5, pp.562-573, May 1988.
Sohi87 G.S.Sohi and S.Vajapeyam, "Instruction Issue Logic for High-Performance, Interruptable Pipelined Processors," Proc. 14th Ann. Int. Symp. Computer Architecture, pp.27-34, June 1987.
Tjaden70 G.S.Tjaden and M.J.Flynn, "Detection and Parallel Execution of Independent Instructions," IEEE Trans. Comput., vol.C-19, no.10, pp.889-895, Oct. 1970.
Tomasulo67 R.M.Tomasulo, "An Efficient Algorithm for Exploiting Multiple Arithmetic Units," IBM J. Res. Develop., vol.11, pp.25-33, Jan. 1967.
Tomita83 S.Tomita, K.Shibayama, T.Kitamura, T.Nakata, and H.Hagiwara, "A User-Microprogrammable, Local Host Computer with Low-Level Parallelism," Proc. 10th Ann. Int. Symp. Computer Architecture, pp.151-157, June 1983.
Tomita86 S.Tomita, K.Shibayama, T.Nakata, S.Yuasa, and H.Hagiwara, "A Computer with Low-Level Parallelism QA-2 - Its Applications to 3-D Graphics and Prolog/Lisp Machines -," Proc. 13th Ann. Int. Symp. Computer Architecture, pp.280-289, June 1986.
Weiss84 S.Weiss and J.E.Smith, "Instruction Issue Logic for Pipelined Supercomputers," Proc. 11th Ann. Int. Symp. Computer Architecture, pp.110-118, June 1984; also IEEE Trans. Comput., vol.C-33, no.11, pp.1013-1022, Nov. 1984.

Index Terms

1. Computer systems organization
  1. Architectures
    1. Parallel architectures
    2. Serial architectures
      1. Pipeline computing

Published in
ACM SIGARCH Computer Architecture News Volume 17, Issue 3
Special Issue: Proceedings of the 16th annual international symposium on Computer Architecture
June 1989
400 pages
ISSN:0163-5964
DOI:10.1145/74926
Editor: Jean-Claude Syre
ISCA '89: Proceedings of the 16th annual international symposium on Computer architecture
April 1989
426 pages
ISBN:0897913191
DOI:10.1145/74925
Chairman: Jean-Claude Syre
Copyright © 1989 Authors
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- article