
Supporting systolic and memory communication in iWarp

Published: 01 May 1990

Abstract

iWarp is a parallel architecture developed jointly by Carnegie Mellon University and Intel Corporation. The iWarp communication system supports two widely used interprocessor communication styles: memory communication and systolic communication. This paper describes the rationale, architecture, and implementation for the iWarp communication system.

The sending or receiving processor of a message can perform either memory or systolic communication. In memory communication, the entire message is buffered in the local memory of the processor before it is transmitted or after it is received, so communication begins or terminates at local memory. In conventional message-passing methods, both the sending and the receiving processors use memory communication. In systolic communication, individual data items are transferred as they are produced, or used as they are received, by the program running on the processor. Memory communication is flexible and well suited to general computing, whereas systolic communication is efficient and well suited to speed-critical applications.
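The difference between the two styles on the receiving side can be illustrated with a small sketch. It is not iWarp code: the `producer`, `memory_receive`, and `systolic_receive` names are hypothetical, and a Python generator stands in for the link between two processors. The shared `log` only records the order in which items are sent and used.

```python
from typing import Callable, Iterator, List


def producer(n: int, log: List[str]) -> Iterator[int]:
    """Hypothetical sender: emits n data items one at a time."""
    for i in range(n):
        log.append(f"send {i}")
        yield i


def memory_receive(link: Iterator[int], log: List[str]) -> List[int]:
    """Memory style: buffer the entire message before the program touches it."""
    buffer = list(link)              # whole message lands in local memory first
    for item in buffer:
        log.append(f"use {item}")    # the program reads it only afterwards
    return buffer


def systolic_receive(link: Iterator[int], log: List[str],
                     use: Callable[[int], int]) -> List[int]:
    """Systolic style: use each item as it arrives, with no message buffer."""
    results = []
    for item in link:
        log.append(f"use {item}")    # consumed immediately on arrival
        results.append(use(item))
    return results


if __name__ == "__main__":
    log_m: List[str] = []
    memory_receive(producer(3, log_m), log_m)
    print(log_m)   # all sends appear before any use

    log_s: List[str] = []
    systolic_receive(producer(3, log_s), log_s, lambda x: x * 2)
    print(log_s)   # sends and uses alternate
```

In the memory case every send is logged before the first use; in the systolic case each item is used as soon as it is sent, which is the property the abstract associates with speed-critical applications.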

A major achievement of the iWarp effort is the derivation of a common design that satisfies the requirements of both the systolic and the memory communication styles. This is made possible by two important innovations in communication: (1) program access to communication and (2) logical channels. The former allows programs to access data as they are transmitted and to redirect portions of messages to different destinations efficiently. The latter increases the connectivity between the processors and guarantees communication bandwidth for classes of messages. These innovations have provided a focus for the iWarp architecture. The result is a communication system that provides a total bandwidth of 320 MBytes/sec and is integrated on a single VLSI component with a 20 MFLOPS plus 20 MIPS long-instruction-word computation engine.
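The abstract does not say how logical channels share a physical link, so the following is only a sketch under a stated assumption: several logical channels are multiplexed onto one link by simple round-robin, which gives every non-empty channel one slot per round. The `multiplex` function and the channel names are hypothetical and are not taken from the iWarp design.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def multiplex(channels: Dict[str, List[int]]) -> List[Tuple[str, int]]:
    """Interleave words from several logical channels onto one physical link.

    Assumption for illustration: round-robin scheduling, one word per
    non-empty channel per round. The real scheduling policy may differ.
    """
    queues: Dict[str, Deque[int]] = {
        name: deque(words) for name, words in channels.items()
    }
    link: List[Tuple[str, int]] = []
    while any(queues.values()):          # stop once every queue is empty
        for name, queue in queues.items():
            if queue:
                link.append((name, queue.popleft()))
    return link


if __name__ == "__main__":
    # A long message on channel "a" does not starve the short one on "b".
    print(multiplex({"a": [1, 2, 3], "b": [10]}))
```

Under this assumed policy, a short message on one logical channel is not forced to wait behind a long message on another, which is one way a bandwidth guarantee per class of messages could be realized.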



    • Published in

      ACM SIGARCH Computer Architecture News, Volume 18, Issue 2SI
      Special Issue: Proceedings of the 17th annual international symposium on Computer Architecture
      June 1990
      356 pages
      ISSN:0163-5964
      DOI:10.1145/325096

      Copyright © 1990 Authors

      Publisher

      Association for Computing Machinery

      New York, NY, United States


