Abstract
A class of parallel processors potentially involving thousands of individual processing elements is described. The architecture is based on the perfect shuffle connection and has two favorable characteristics: (1) Each processor communicates with a fixed number of other processors. (2) Important communication functions can be accomplished in time proportional to the logarithm of the number of processors. A number of basic algorithms for these “ultracomputers” are presented, and physical design considerations are discussed in a preliminary fashion.
- 1 AHO, A., HOPCROFT, J., AND ULLMAN, J.D. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, Mass., 1976.]] Google Scholar
- 2 BATCHER, K.E. Sorting networks and their applications. Proc. 1968 Spring JCC, AFIPS Press, Arlington, Va., pp. 307-314.]]Google Scholar
- 3 BAUDET, G., AND STEVENSOn, D. Optimal sorting algorithms for parallel computers. Tech. Rep., Computer Science Dep., Carnegie-Mellon Univ., Pittsburgh, Pa. See also IEEE Trans. Comput. C-27 (1978), 84-86.]]Google Scholar
- 4 BE~ES, V.E. Mathematical theory of connecting networks and telephone traffic. Academic Press, New York, 1965.]]Google Scholar
- 5 CHANORA, A.K. Maximal parallelism in matrix multiplication. Rep. RC-6193, IBM Thomas J. Watson Research Center, Yorktown Heights, N.Y., 1976.]]Google Scholar
- 6 CLOS, C. A study of nonblocking switching networks. Bell Syst. Tech. J. 32 (1953), 406-424.]]Google Scholar
- 7 CSANKY, L. Fast parallel matrix inversion algorithms. SIAM J. Cornput. 5 (1976), 618-623. See also Proc. 16th Ann. IEEE Syrup. on Foundations of Computer Science, Berkeley, Calif., 1975, pp. 11-12.]]Google Scholar
- 8 GOTrLIEB, A. PLUS--A PL/I ultracomputer simulator, I. Ultracomputer Note, Computer Science Dep., New York Univ., New York, N.Y., 1980.]]Google Scholar
- 9 GOTTLIEB, A. PLUS--A PL/I ultracomputer simulator, II. Ultracomputer Note, Computer Science Dep., New York Univ., New York, N.Y., 1980.]]Google Scholar
- 10 GOTTLIEB, A., AND KRUSKAL, C.P. MULT--A multitasking ultracomputer language with timing, I. Ultracomputer Note, Computer Science Dep., New York Univ., New York, N.Y., 1980.]]Google Scholar
- 11 GOTTLIEB, A., AND KRUSKAL, C.P. MULT--A multitasking ultracomputer language with timing, xl 28II. Ultracomputer Note, Computer Science Dep., New York Univ., New York, N.Y., 1980. II. Ultracomputer Note, Computer Science Dep., New York Univ., New York, N.Y., 1980.]]Google Scholar
- 12 GOTTLIEB, A., AND KRUSKAL, C.P. Supersaturated ultracomputer algorithms. Ultracomputer Note, Computer Science Dep., New York Univ., New York, N.Y., 1980.]]Google Scholar
- 13 HmSHBERC, D.S. Parallel algorithms for the transitive closure and the connected component problems. Proc. 8th Ann. ACM Symp. on Theory of Computing, Hershey, Pa., May 1976, pp. 55- 57.]] Google Scholar
- 14 HmSCHBERG, D.S. Fast parallel sorting algorithms. Commun. ACM 21, 8 (Aug. 1978), 657-661. See also Tech. Rep., Dep. of Electrical Engineering, Rice Univ., Houston, Texas, 1977.]] Google Scholar
- 15 KNU?H, D.E. The Art of Computer Programming, Vol. 3: Sorting and Searching. Addison- Wesley, Reading, Mass., 1972, pp. 232-235.]]Google Scholar
- 16 KUNG, H.T. New algorithms and lower bounds for the parallel evaluation of certain rational expressions and recurrences. J. ACM 23, 2 (April 1979), 252-261. See also Proc. 6th Ann. ACM Symp. on Theory of Computing, Seattle, Wash., 1974, pp. 323-333.]] Google Scholar
- 17 LEvitt, K.N., AND KarTz, W.H. Cellular arrays for the solution of graph problems. Commun. ACM 15, 9 (Sept. 1972), 789-801.]] Google Scholar
- 18 NASSIMI, D., AND SAHNI, S. Parallel permutation and sorting networks and a new generalizedconnection network. Preprint, Univ. of Minnesota, Minneapolis, Minn., 1979.]] Google Scholar
- 19 NASSIMI, D., AND SAHNI, S. Bitonic sort on a mesh-connected parallel computer. IEEE Trans. Comput. C-28 (1979), 2-7.]]Google Scholar
- 20 OPFERMAN, D.C., AND TSAo-Wu, N.T. On a class of rearrangeable switching networks. Bell Syst. Tech. J. 50 (1971), 1579-1618.]]Google Scholar
- 21 ORcu~r, S.E. Parallel solution methods for triangular linear systems of equations. Tech. Rep. 77, Digital Systems Laboratory, Stanford Univ., Stanford, Calif., 1974.]] Google Scholar
- 22 PEASE, M.C. Matrix inversion using parallel processing. J. ACM 14, 4 (Oct. 1967), 757-764.]] Google Scholar
- 23 PEASS, M.C. An adaptation of the fast Fourier transform for parallel processing. J. ACM 15, 2 (April 1968), 252-264.]] Google Scholar
- 24 PRF. PARATA, F.P. Parallelism in sorting. Proc. 1977 Int. Conf. on Parallel Processing, Detroit, Mich., 1977, pp. 202-206.]]Google Scholar
- 25 PREPARATA, F.P., AND SARWATE, D.V. An improved parallel processor bound in fast matrix inversion. Inf. Proc. Letters 7 (1978), 178-150.]]Google Scholar
- 26 SAMr. H, A., AND BRENT, R. Solving triangular systems on a parallel computer. SIAM J. Numer. Anal. (1977), 1101-1113.]]Google Scholar
- 27 STONr., H.S. Parallel processing with the perfect shuffle. IEEE Trans. Comput. C-20 (1971), 153-161.]]Google Scholar
- 28 VALIANT, L.G. Parallelism in comparison problems. SIAM J. Cornput. 4 (1975), 348-355.]]Google Scholar
- 29 WAKSMAN, A. A permutation network. J. ACM 15, 1 (Jan. 1968), 159-163.]] Google Scholar
- Several useful surveys of algorithms for parallel numerical and nonnumerical computation have appeared. See Heller (1978), Miranker (1971), Kuck (1975), and Stone (1975b).]]Google Scholar
- ABRAMSON, N., AND KUO, F.F., EDS. Computer'Communication Networks. Prentice-Hall, Englewood Cliffs, N.J., 1973.]]Google Scholar
- ADAMS, D.A. A model for parallel computations. In Parallel Processor Systems, Technologies, and Applications, L.C. Hobbs, D.J. Theis, J. Trimble, H. Titus, and I. Highberg, Eds., Spartan Publishers, Washington, D.C., 1970, pp. 311-333.]]Google Scholar
- AHO, A., AND ULLMAN, J.D. Dynamic memories with rapid random and sequential access. IEEE Trans. Comput. C-23 (1974), 272-276.]]Google Scholar
- AKERS, S.B. A rectangular logic array. IEEE Trans. Comput. C-21 (1972), 848-857.]]Google Scholar
- ANI)r. RSON, G.A. 'Interconnecting a distributed processor system for avionics. Proc. IEEE Symp. on Computer Architecture, Gainesville, Fla., 1973, pp. 1-20.]] Google Scholar
- ANI)~.RSON, G.A., AND JENSEN, E.D. Computer interconnection structures: Taxonomy, characteristics, and examples. Comput. Surv. 7, 4 (Dec. 1975), 197-213.]] Google Scholar
- ARJOMANDI, E. A study of parallelism in graph theory. Ph.D. Thesis, Dep. of Computer Science, Univ. of Toronto, Toronto, Ontario, Canada, 1977.]] Google Scholar
- ARJOMANDI, E., AND CORNF. IL, D.G. Parallel computations in graph theory. Proc. 16th Ann. IEEE Syrup. on Foundations of Computer Science, Berkeley, Calif., 1975, pp. 13-18.]]Google Scholar
- ARNOLD, R.G., AND PAGE, E.W. A hierarchical, restructurable multimicroprocessor architecture. 3rd Ann. Syrup. on Computer Architecture, Clearwater, Fla., 1976, pp. 40-45.]] Google Scholar
- AVRIEL, M., AND WILDE, D.J. Optimal search for a maximum with sequences of simultaneous function evaluations. Manage. Sci. 12 (1966), 722-731.]]Google Scholar
- BAER, J.L., AND RUSSELL, E.C. Preparation and evaluation of computer programs for parallel processing systems. In Parallel Processor Systems, Technologies, and Applications, L.C. Hobbs, D.J. Theis, J. Trimble, H. Titus, and I. Highberg, Eds., Spartan Publishers, Washington, D.C., 1970, pp. 375-415.]]Google Scholar
- BANDYOPADHYAY, $., BASU, S., AND CHOUDHARY, A.K. A cellular permuter array. IEEE Trans. Comput. C-21 (1972), 1116-1119.]]Google Scholar
- BARNES, G.H., BROWN, R., KATO, M., KUCK, D., SLOTNiCK, D., AND STOKES, R. The ILLIAC IV computer. IEEE Trans. Comput. C-17 (1968), 746-757.]]Google Scholar
- BASSALYGO, L.A., GRUSHKO, I.I., AND NEYMAN, V.I. Asymptotic estimation of the number of switching points in nonblocking circuits. Telecommun. Radio Eng. 24 (1970), 34-39.]]Google Scholar
- BASSALYG0, L.A., AND PINSKER, M.S. On the complexity of optimal nonblocking switching networks without rearrangement. Probl. Inf. Transm. 9 (1973), 84-87 (translation).]]Google Scholar
- BATCHER, K.E. The flip network in STARAN. Proc. 1976 Int. Conf. on Parallel Processing, Detroit, Mich., 1976, pp. 65-71.]]Google Scholar
- BATCHER, K.E. STARAN series E. Proc. 1977 Int. Conf. on Parallel Processing, Wayne State Univ., Detroit, Mich., 1977, pp. 140-143.]]Google Scholar
- BENES, V.E. Algebraic and topological properties of connecting networks. Bell Syst. Tech. J. 41 (1962), 1249-1273.]]Google Scholar
- BENES, V.E. Permutation groups, complexes, and rearrangeable connecting networks. Bell Syst. Tech. J. 43 (1964), 1619-1640.]]Google Scholar
- BENES, V.E. Optimal rearrangeable multistage connecting networks. Bell Syst. Tech. J. 42 (1964), 1641-1656.]]Google Scholar
- BENES, V.E. Applications of group theory to connecting networks. Bell Syst. Tech. J. 54 (1975), 4O7-42O.]]Google Scholar
- BERNSTEIN, A. Analysis of programs for parallel processing. IEEE Trans. Electron. Comput. EC-15 (1966), 757-763.]]Google Scholar
- BOULTIS, R.L., AND FAISS, R.O. STARAN E performance and LACIE algorithms. Proc. 1977 Int. Conf. on Parallel Processing, Wayne State Univ., Detroit, Mich., 1977, pp. 144-152.]]Google Scholar
- BREDT, T.H. A survey of models for parallel computing. Tech. Rep. 8, Stanford Univ. Electronics Lab., Palo Alto, Calif., 1970.]] Google Scholar
- BREDT, T.H., AND MCCLUSKEY, E.J. Analysis and synthesis of control mechanisms for parallel processes. In Parallel Processor Systems, Technologies, and Applications, L.C. Hobbs, D.J. Theis, J. Trimble, H. Titus, and I. Highberg, Eds., Spartan Publishers, Washington, D.C., 1970, pp. 287-296.]]Google Scholar
- BRENT, R. On the addition of binary numbers. IEEE Trans. Comput. C-19 (1970), 758-759.]]Google Scholar
- BRENT, R. The parallel evaluation of arithmetic expressions in logarithmic time. In Complexity of Sequential and Parallel Numerical Algorithms, J.F. Traub, Ed., Academic Press, New York, pp. 83- 102.]]Google Scholar
- BRENT, R. The parallel evaluation of general arithmetic expressions. J. ACM 21, 2 (April 1974), 201-206.]] Google Scholar
- BRENT, R., KUCK, D., AND MARUYAMA, K. The parallel evaluation of arithmetic expressions without divisions. IEEE Trans. Comput. C-23 (1973), 532-534.]]Google Scholar
- BRINSFIELD, W.A., AND MILLER, R.E. On the composition of parallel program schemata. Conf. Rec. IEEE 12th Ann. Syrup. on Switching and Automata Theory, East Lansing, Mich., 1971, pp. 20-23.]]Google Scholar
- BRINSFIELD, W.A., AND MILLER, R.E. Insertion of parallel program schemata. Proc. 7th Ann. Princeton Conf. Information Sciences and Systems, Princeton, N.J., 1973.]]Google Scholar
- BUDNIK, P., AND KUCK, D.J. The organization and use of parallel memories. IEEE Trans. Comput. C-20 (1971), 1566-1569.]]Google Scholar
- BUZBEE, B.L. A fast Poisson solver amenable to parallel computation, iEEE Trans. Comput. C-22 (1973), 793-796.]]Google Scholar
- CALLAHAN, D. Parallel solution of sparse simultaneous linear equations. Tech. Rep., Dep. of Electrical Engineering, Univ. of Michigan, Ann Arbor, Mich., 1973.]]Google Scholar
- CANTOR, D.G. On nonblocking switching networks. Networks I (1971), 367-377.]]Google Scholar
- CARROLL, A.B., AND WETHERALD, R.?. Applications of parallel processing to numerical weather prediction. J. ACM 14, 3 (July 1967), 591-614.]] Google Scholar
- CASTI, J.L., RICHARDSON, M.H., AND LARSON, R.E. Dynamic programming in parallel computers. J. Optim. Theory Appl. 12 (1973), 423-438.]]Google Scholar
- CHANDRA~ A.K. Independent permutations, as related to a problem of Moser and a theorem of Polya. J. Comb. Theory Series A 16 (1974), 111-120.]]Google Scholar
- CHAZAN, D., AND MIRANKER, W.L. Chaotic relaxation. Linear Alg. Appl. 2 (1969), 199-222.]]Google Scholar
- CHAZAN, D., AND MIRANKER, W.L. A nongradient and parallel algorithm for unconstrained minimization. SlAM J. Control 8 (1970), 207-217.]]Google Scholar
- CHEN, I.N. A cellular data manipulating array. Proc. 1975 Sagamore Conf. on Parallel Processing, Sagamore Lake, N.Y., 1975, p. 114.]]Google Scholar
- CHEN, S.C. Speedup of iterative programs in multiprocessor systems. Ph.D. Thesis, Computer Science Dep., Univ. of Illinois at Urbana, Urbana, Ill., 1975 {Rep. 694 (NSF-OCA-GJ-36936-000004)}.]] Google Scholar
- CHEN, S.C. Time and parallel processor bounds for linear recurrence systems with constant coefficients. Proc. 1976 Int. Conf. on Parallel Processing, Detroit, Mich., 1976, pp. 196-205.]]Google Scholar
- CHEN, S.C., AND KUCK, D.J. Time and parallel processor bounds for linear recurrence systems. IEEE Trans. Comput. C-24 (1975), 701-717.]]Google Scholar
- CHEN, S.C., AND SAMEH, A. On parallel triangular system solvers. Proc. 1975 Sagamore Conf. on Parallel Processing, Sagamore Lake, N.Y., 1975, pp. 237-238.]]Google Scholar
- CHEN, T.C. Parallelism, pipelining and computer efficiency. Comput. Design 10 (1971), 69-74.]]Google Scholar
- CHEN, T.C. Unconventional superspeed computer systems. AFIPS Proc. 1971 Spring JCC, Vol. 38, AFIPS Press, Arlington, Va., pp. 365-371.]]Google Scholar
- CHrN, Y.K., AND FENG, T. A parallel algorithm for the maximum flow problem (summary). Proc. 1973 Sagamore Conf. on Parallel Processing, Sagamore Lake, N.Y., 1973, p. 60.]]Google Scholar
- COLE, S.N. Real-time computation by n-dimensional arrays of finite-state machines. IEEE Trans. Comput. C-18 (1969), 349-365.]]Google Scholar
- COPPAOr., S., AND SCHWARTZ, J.T. A fast switch. Amer. Math. Monthly 83 (1976), 711-717.]]Google Scholar
- CRANE, B.A., AND GiTHF. NS, J.A. Bulk processing in distributed logic memory. IEEE Trans. Electron. Comput. EC-14 (1965), 186-196.]]Google Scholar
- CSANKY, L. On the parallel complexity of some computational problems. Ph.D. Dissertation, Dep. of Computer Science, Univ. of California, Berkeley, Calif., 1974.]]Google Scholar
- CYRE, W.R., AND LIPOVSKI, G.J. On generating multipliers for a cellular fast Fourier transform processor. IEEE Trans. Comput. C-21 (1972), 83-87.]]Google Scholar
- DAVIDSON, I.A., AND FIELD, J.A. Design criteria for a switch for a multiprocessor computing system. Proc. 1975 Sagamore Conf. on Parallel Processing, Sagamore Lake, N.Y., 1975, pp 110-113.]]Google Scholar
- DENNIS, J.B., AND MISUNAS, D.P. A computer architecture for highly parallel signal processing. Proc. ACM 1974 National Conf., San Diego, Calif., pp. 402-409.]]Google Scholar
- DENNIS, J.B., AND MISUNAS, D.P. A preliminary architecture for a basic data-flow processor. Proc. 2nd Ann. IEEE Symp. on Computer Architecture, Houston, Texas, 1975, pp. 126-132.]] Google Scholar
- DONNELLY, J.D. Periodic chaotic relaxation. Linear Alg. Appl. 4 (1971), 117-128.]]Google Scholar
- DORN, W.S., Hsu, N.C., AND RIVLIN, T~J. Some mathematical aspects of parallel computation. Tech. Rep. RC-647, IBM Thomas J. Watson Research Center, Yorktown Heights, New York, 1962.]]Google Scholar
- DowNs, R.S. Real-time algorithms and data management on Illiac IV. IEEE Trans. Comput. C-22 (1973), 773-777.]]Google Scholar
- DRYSDALE, R.L. Sorting networks which generalize Batcher's odd-even merge. Honors Paper, Knox College, Galesburg, Ill., 1973.]]Google Scholar
- ECKSTEIN, D. Parallel graph processing using depth-first search and breadth-first search. Ph.D. Thesis, Univ. of Iowa, 1977.]] Google Scholar
- ENSLOW, P.H., ED. Multiprocessors and parallel processors. John Wiley & Sons, New York, 1974.]]Google Scholar
- ERMANN, R.N., A~D GROSKY, W.I. Some computation and systems-theoretic properties of regular processor networks. Proc. 1976 Int. Conf. on Parallel Processing, Detroit, Mich., 1976, pp. 230-234.]]Google Scholar
- ESTRIN, G., BUSSELL, B., TURN, R., AND BIBB, J. Parallel processing in a restructurable computer system. IEEE Trans. Electron. Comput. EC-12 (1963), 747-755.]]Google Scholar
- EVEN, S. Parallelism in tape sorting. Commun. ACM 17, 4 (April 1974), 202-204.]] Google Scholar
- EVENSEN, A.J., AND TROY, J.L. Introduction to the architecture of a 288-element PEPE. Proc. 1973 Sagamore Conf. on Parallel Processing, Sagamore Lake, N.Y., 1973, pp. 162-169.]]Google Scholar
- FALKOFF, A.D. Algorithms for parallel search memories. J. ACM 9, 4 (Oct. 1962), 488-511.]] Google Scholar
- FENG, T.Y. Data manipulating functions in parallel processors and their implementations, iEEE Trans. Comput. C-23 (1964), 309-318.]]Google Scholar
- FENC, T.Y. A configurable multi-microprocessor organization. In Micro-Architecture of Computer Systems, Proc. Euromicro, Nice Workshop, 1975, R. Hartenstein, Ed., N(irth-Holland, Amsterdam, pp. 207-219.]]Google Scholar
- FISHER, D. Program analysis for multi-processing. Tech. Rep. TR-67-2, Burroughs Corp., 1967. FLOYD, R.W., AND KNUTH, D.E. The Bose-Nelson sorting problem. CS Rep. 70-177, Stanford Univ., Stanford, Calif., 1976.]]Google Scholar
- FLYNN, M.J. Very-high-speed computing systems. Proc. IEEE 54 (1966), 1901-1909.]]Google Scholar
- FLYNN, M.J., PoovIN, A., AND SHIMIZU, K. A multiple instruction stream processor with shared resources. In Parallel Processor Systems, Technologies, and Applications, L.C. Hobbs, D.J. Theis, J. Trimble, H. Titus, and I. Highberg, Eds., Spartan Publishers, Washington, D.C., 1970, pp. 215-286.]]Google Scholar
- GALE, D., AND KARP, R.K. A phenomenon in the theory of sorting. IEEE Conf. Rec. of llth Ann. Symp. on Switching and Automata Theory, Santa Monica, Calif., 1970, pp. 51-59.]]Google Scholar
- GAVRIL, F. Merging with parallel processors. Commun. ACM 18, 10 (Oct. 1975), 588-591.]] Google Scholar
- GENTLEMAN, W.M. Some complexity results for matrix computations on parallel processors. Tech. Rep., Dep. of Computer Science, Univ. of Waterloo, Waterloo, Ontario, Canada, 1976.]]Google Scholar
- GEORGE, A., POOLE, W.G., AND VOIOHT, R.G. Analysis of dissection algorithms for vector computers. Tech. Rep., Institute for Computer Applications in Science and Engineering, Hampton, Va., 1976.]]Google Scholar
- GILL, S. Parallel programming. Comput. J. I (1958), 2-10.]]Google Scholar
- GILMORE, P.A. Structuring of parallel algorithms, j. ACM 15, 2 (April 1968), 176-192.]] Google Scholar
- GILMORE, P.A. Parallel relaxation. Tech. Rep., Goodyear Aerospace Corp., Akron, Ohio, 1971.]]Google Scholar
- GOKE, R. Connecting networks for partitioning polymorphic systems. Ph.D. Dissertation, Dep. of Electrical Engineering, Univ. of Florida, Gainesville, Fla., 1976.]]Google Scholar
- GOKE, R., AND LIPOVSKI, G.J. Banyan networks for partitioning on multiprocessor systems. Proc. 1st Ann. Symp. on Computer Architecture, Gainesville, Fla., 1973, pp. 21-30.]] Google Scholar
- GOLOMB, S.W. Permutations by cutting and shuffling. SIAM Rev. 3 (1961), 293-297.]]Google Scholar
- GONZALES, M.J., AND RAMAMOORTHY, C.V. Recognition and representation of parallel processable streams in computer programs. In Parallel Processor Systems, Technologies and Applications, L.C.]]Google Scholar
- Hobbs, D.J. Theis, J. Trimble, H. Titus, and I. Highberg, Eds., Spartan Publishers, Washington, D.C., 1970, pp. 335-373.]]Google Scholar
- GONZALES, M.J., AND RAMAMOORTHY, C.V. Program suitability for parallel processing. IEEE Trans. Comput. C-20 (1971), 647-654.]]Google Scholar
- GREEN, M.W. Some improvements in nonadaptive sorting algorithms. Proc. 6th Ann. Princeton Conf.' on Information Sciences and Systems, Princeton, N.J., 1972, pp. 387-391.]]Google Scholar
- HARADH, K. Sequential permutation networks. IEEE Trans. Comput. C-21 (1972), 472-479.]]Google Scholar
- HELLER, D. A determinant theorem with applications to parallel algorithms. Tech. Rep., Dep. of Computer Science, Carnegie-Mellon Univ., Pittsburgh, Pa., 1973.]]Google Scholar
- HELLER, D. A survey of parallel algorithms in numerical linear algebra. SIAM Rev. 20 (1978), 740- 777. See also Tech. Rep., Dep. of Computer Science, Carnegie-Mellon Univ., Pittsburgh, Pa., 1976.]]Google Scholar
- HOLLAND, J. A universal computer capable of executing an arbitrary number of subprograms simultaneously. Proc. IEEE Eastern Joint Computer Conference, Boston, Mass., pp. 108-113.]]Google Scholar
- HYAFIL, L., AND KUNC, H.T. Parallel algorithms for solving triangular linear systems. Tech. Rep., Dep. of Computer Science, Carnegie-Mellon Univ., Pittsburgh, Pa., 1974.]]Google Scholar
- HYAFIL, L., AND KUNG, H.T. The complexity of parallel evaluation of linear recurrences. Proc. 7th Ann. ACM Symp. on Theory of Computing, Albuquerque, N.M., 1975, pp. 12-22.]] Google Scholar
- HYAFIL, L., AND KUNG, H.T. Bounds on the speed-ups of parallel evaluation of recurrences. Proc. 2nd USA-Japan Computer Conf., Tokyo, 1975, pp. 178-182.]]Google Scholar
- HYAFIL, L., AND KUNG, H.T. The complexity of parallel evaluation of linear recurrences. J. ACM 24, 3 (July 1977), 513-521.]] Google Scholar
- JOEL, A.E. On permutation switching networks. Bell Syst. Tech. J. 47 (1968), 813-822.]]Google Scholar
- KARP, R.M., AND MIRANKER, W. Parallel minimax search for a maximum. J. Comb. Theory 4 (1968), 19-35.]]Google Scholar
- KAge, R.M., AND MILLER, R.E. Parallel program schemata. J. Comput. Sys. Sci. 3 (1969), 147-195.]]Google Scholar
- KARP, R.M., MILLER, R.E., AND WINOORAD, S. The organization of computations for uniform recurrence equations. J. ACM 14, 3 (July 1967), 563-590.]] Google Scholar
- KAUTZ, W.H. Cellular logic-in-memory arrays. IEEE Trans. Comput. C-18 (1969), 719-727.]]Google Scholar
- KAUTZ, W.H. The design of optimum interconnection networks for multiprocessors. In Structure et Conception des Ordinateurs (Architecture and Design of Digital Computers), Guy Boulaye, Ed., Dunod Publishers, Paris, 1971, pp. 249-278.]]Google Scholar
- KAUTZ, W.H., LEvn~r, K.N., AND WAKSMAN, A. Cellular interconnection arrays. IEEE Trans. Comput. C-17 (1968), pp. 443-451.]]Google Scholar
- KAUTZ, W.H., PEASE, M.C., AND GREEN, M.W. Cellular logic-in-memory arrays. Final Rep. Pt. 1, Project 5509, NONR-4833 (00), Stanford Research Institute, Paid Alto, Calif., 1970.]]Google Scholar
- KELLER, R.M. On maximally parallel schemata. Conf. Rec. llth Ann. IEEE Syrup. on Switching and Automata Theory, Santa Monica, Calif., 1970, pp. 32-50.]]Google Scholar
- KELLER, R.M. On the decomposition of asynchronous systems. Conf. Rec. 13th Ann. IEEE Symp. on Switching and Automata Theory, Baltimore, Md., 1972, pp. 78-89.]]Google Scholar
- KNIGHT, J.C., POOLE, W.G., JR., AND VOICT, R.G. System balance analysis for vector computers. Proc. ACM Ann. Conf., Minneapolis, Minn., 1975, pp. 163-168.]] Google Scholar
- KOGGE, P.M. Parallel algorithms for the solution of recurrence problems. Tech. Rep., Digital Systems Laboratory, Stanford Univ., Stanford, Calif., 1972.]]Google Scholar
- KOCGE, P.M. The numerical stability of parallel algorithms for solving recurrence problems. Tech. Rep., Digital Systems Laboratory, Stanford Univ., Stanford, Calif., 1972.]]Google Scholar
- KOCGE, P.M. Parallel solution of recurrence problems. IBM J. Res. Devel. 18 (1974), 183-184.]]Google Scholar
- KOCGE, P.M., AND STONE, H.S. A parallel algorithm for the efficient solution of a general class of recurrence equations. IEEE Trans. Comput. C-22 (1973), 786-792.]]Google Scholar
- KOTOV, V.E. Towards automatic construction of parallel programs. Proc. Symp. on Theoretical Programming, Novosibirsk, USSR, 1972, pp. 309-332.]] Google Scholar
- KoTov, V.E., AND NARINYANI, A.S. On transformation of sequential programs into asynchronous parallel programs. Proc. 1968 IFIP Congress, Edinburgh, pp. 351-357.]]Google Scholar
- KUCK, D.J. Multioperation machine computational complexity. Proc. of Symp. on Complexity of Sequential and Parallel Numerical Algorithms, J.F. Traub, Ed., Academic Press, New York, pp. 17- 48.]]Google Scholar
- KUCK, D.J. Parallel processing architecture, a s?vey. Proc. 1975 Sagamore Conf. on Parallel Processing, Sagamore Lake, N.Y., 1975, pp. 15-39.]]Google Scholar
- KUCK, D.J., I~URIE, D.H., AND MURAOKA, Y. Interconnection networks for processors and memories in large systems. COMPCON 72, San Francisco, Calif., 1972, pp. 131-134.]]Google Scholar
- KUCK, D.J., AND MARUYAMA, K.M. The parallel evaluation of arithmetic expressions of special forms. Rep. RC4726, IBM Thomas J. Watson Research Center, Yorktown Heights, N.Y., 1973.]]Google Scholar
- KUCK, D.J., AND MARUYAMA, K.M. Time bounds on the parallel evaluation of arithmetic expressions. SIAM J. Comput. 4 (1974), 147-162.]]Google Scholar
- KUCK, D.J., AND SAMEH, A.H. Parallel computation of eigenvalues of real matrices. Proc. 1971 IFIP Congress, Vol. II, North-Holland, Amsterdam-London, pp. 1266-1272.]]Google Scholar
- KUKREJA, N., AND CHEN, I.N. Combinatorial and sequential cellular structures, iEEE Trans. Comput. C-22 (1973), 813-823.]]Google Scholar
- KUNC, H.T. Synchronous and asynchronous parallel algorithms for multiprocessors. In Algorithms and Complexity, J.F. Traub, Ed., 1967, pp. 153-200.]]Google Scholar
- LADNER, R.E., AND FISHER, M.J. Parallel prefix computations. Proc. 1977 Int. Conf. on Parallel Processing, Detroit, Mich., 1977, pp. 218-223.]]Google Scholar
- LAMBIOTTE, J.J., JR., AND VOIGT, R.G. The solution of tridiagonal linear systems on the CDC STAR-100 computer. ACM Trans. Math. Softw. 1, 4 (Dec. 1975), 308-329.]] Google Scholar
- LAMPORT, L. The parallel execution of DO loops. Commun. ACM, 17, 2 (Feb. 1974), 83-93.]] Google Scholar
- LANG, T. Interconnections between memory modules using the shuffle-exchange network. IEEE Trans. Comput. C-25 (1976), 496-503. See also Tech. Rep. 76, Digital Systems Laboratory, Stanford Univ., Stanford, Calif., 1973.]]Google Scholar
- LANG, T., AND STONE, H.S. A shuffle-exchange network with simplified control. IEEE Trans. Comput. C-25 (1976), 55-65.]]Google Scholar
- LANGE, R.G. High level language for associative and parallel computation with STARAN. Proc. 1976 Int. Conf. on Parallel Processing, Detroit, Mich., 1976, pp. 170-176.]]Google Scholar
- LARSON, R.E., RICHARDSON, M.H., AND BREE, D.W. Dynamic programming in parallel computers. 4th Ann. Hawaii Conf. on Systems Science, pp. 256-269.]]Google Scholar
- LARSON, R.E., RICHARDSON, M.H., AND CASTI, J.L. Dynamic programming with parallel computers for use in Air Force applications. Final Rep., AFOSR Project F44620-70-C0084, SCI Project U956, System Control, Inc., Palo Alto, Calif., 1971.]]Google Scholar
- LARSON, R.E., AND TSE, E. Modal trajectory estimation and parallel computers. 2nd Syrup. on Nonlinear Estimation Theory, San Diego, Calif., 1971, pp. 188-198.]]Google Scholar
- LARSON, R.E., AND TSE, E. Parallel computation of the modal trajectory estimate. Joint Automata Control Conf., Stanford, Calif. See also 5th Ann. Hawaii Conf. on System Sciences, Honolulu, 1972, pp. 495-499.]]Google Scholar
- LARSON, R.E., AND TSE, E. Parallel processing algorithms for the optimal control of nonlinear dynamic systems. IEEE Trans. Comput. C-22 (1973), 777-785.]]Google Scholar
- LAWRIE, D.H. Memory-processor communication networks. Tech. Rep. and Thesis, UIU CSCS-R- 75-557, Computer Science Dep., University of Ill., Urbana, Ill., 1973.]] Google Scholar
- LAWRiE, D.H. Access and alignment of data in an array processor. IEEE Trans. Comput. C-24 (1975), 1145-1155.]]Google Scholar
- LAWRIE, D.H., LAYMAN, T., BAER, D., AND RANDAL, J.M. GLYPNIR--A programming language for Illiac iV. Commun. ACM 18, 3 (March 1975), 157-164.]] Google Scholar
- LEE, C.C., AND FENC;, T. Sorting algorithms for parallel processing (summary). Proc. 1975 Sagamore Conf. on Parallel Processing, Sagamore Lake, N.Y., 1975, p. 239.]]Google Scholar
- LEHMAN, M. A survey of problems and prehminary results concerning parallel processing and parallel processors. Proc. IEEE 54 (1966), 1889-1901.]]Google Scholar
- LEONDES, C., AND RUBINOFF, M. DINA, a digital analyzer for Laplace, Poisson, diffusion and wave equations. AIEE Trans. ( Commun. & Electron.) 71 (1952), 303-309.]]Google Scholar
- LEVITT, K.N., GREEN, M.W., AND GOLDBERG, J. A study of the data communication problems in a self-repairable multiprocessor. Proc. 1968 Spring JCC, AFIPS Press, Arlington, Va., pp. 515-527.]]Google Scholar
- LIPOVSKI, G.J. The architecture of a large associative processor. Proc. 1970 Spring JCC, Vol. 37, AFIPS Press, Arlington, Va., pp. 385-395.]]Google Scholar
- LIPOVSKI, G.J. On a varistructured array of microprocessors. IEEE Trans. Comput. C-26 (1977), 125-138.]]Google Scholar
- LIPOVSKI, G.J., AND TRIPATHI, i. A reconfigurable varistructure array processor. Proc. 1977 Int. Conf. on Parallel Processing, Detroit, Mich., 1977, pp. 165-173.]]Google Scholar
- LIU, C.L. Construction of sorting plans. In Theory of Machines and Computation, Z. Kohavi and A. Paz, Eds., Academic Press, New York, 1971, pp. 87-98.]]Google Scholar
- LIU, J.W.H. The solution of mesh equations on a parallel computer. Tech. Rep., Dep. of Computer Science, Waterloo Univ., Waterloo, Ontario, Canada, 1974.]]Google Scholar
- MADSEN, N.K., RODRIGUE, G.N., AND KARUSH, J.I. Matrix multiplication by diagonals on a vector/ parallel processor. Inf. Proc. Letters 5 (1976), 41-45.]]Google Scholar
- MARCUS, M.J. New approaches to the analysis of connecting and sorting networks. Tech. Rep. 486, Research Laboratory of Electronics, M.I.T., Cambridge, Mass., 1972.]]Google Scholar
- MARUYAMA, K. On the parallel evaluation of polynomials. IEEE Trans. Comput. C.22 (1973), 2-5.]]Google Scholar
- MARUYAMA, K. The parallel evaluation of matrix expressions. Tech. Rep., IBM Thomas J. Watson Research Center, Yorktown Heights, N.Y., 1973.]]Google Scholar
- MILLER, J.C.P., AND BROWN, D.J.S. An algorithm for evaluation of remote terms in a linear recurrence relation. Comput. J. 9 (1967), 188-190.]]Google Scholar
- MILLER, R.E. A comparison of some theoretical models of parallel computation. IEEE Trans. Comput. C-22 (1973), 710-717.]]Google Scholar
- MIRANKER, W. Parallel methods for evaluating the root of a function. IBM J. Res. Devel. 13 (1967), 297-301.]]Google Scholar
- MIRANKER, W. A survey of parallelism in numerical analysis. SIAM Rev. 13 (1971), 524-547.]]Google Scholar
- MIRANKER, W., AND LINIGER, W.M. Parallel methods for the numerical integration of ordinary differential equations. Math. Comp. 21 (1967), 303-320.]]Google Scholar
- MISUNAS, D.P. A computer architecture for data-flow computation. Master's Thesis, Dep. Electrical Engineering and Computer Science, M.I.T., Cambridge, Mass., 1975.]]Google Scholar
- MULLER, D.E., AND PREPARATA, F.P. Bounds to complexities of networks for sorting and for switching. J. ACM 22, 2 (April 1975), 195-201.]] Google Scholar
- MUNRO, I., AND PATTERSON, M. Optimal algorithms for parallel polynomial evaluation. J. Comput. Sys. Sci. 7 (1973), 189-198. See also Conf. Rec. 12th Ann. Syrup. on Switching and Automata Theory, East Lansing, Mich., 1971.]]Google Scholar
- MURAOKA, Y. Parallelism exposure and exploitation in programs. Ph.D. Thesis, Computer Science Dep., Univ. of Illinois, Urbana, Ill., 1971 (Tech. Rep. 424).]] Google Scholar
- MURAOKA, Y., AND KUCK, D.J. On the time required for a sequence of matrix products. Commun. ACM 16, 1 (Jan. 1973), 22-26.]] Google Scholar
- MURTHA, J.C. Highly parallel information processing systems. Adv. Comput. 7 (1966), 2-116.]]Google Scholar
- NARINYANI, A.S. Looking for a theory of parallel computation models. Proc. Symp. Theoretical Programming, Novosibirsk, USSR, 1972, pp. 247-284.]] Google Scholar
- NEIMAN, V.I. Structure et commandes optimales des reseaux de connections sans blockage. Ann. Telecommun. 24 (1969), 232-238.]]Google Scholar
- NIEVERGELT, J. Parallel methods for integrating ordinary differential equations. Commun. ACM 7, 12 (Dec. 1964), 731-733.]] Google Scholar
- OKADA, Y., TAJIMA, M., ANO MORI, R. A novel multiprocessor array. 2nd Symp. on Microarchitecture (Euromicro), North-Holland, Amsterdam, 1976.]]Google Scholar
- ORcu~r, S.E. Computer organization and algorithms for very high-speed computation. Ph.D. Thesis, Stanford Univ., Stanford, Calif., 1974.]]Google Scholar
- PEASE, M.C. The implicit binary n-cube microprocessor array. IEEE Trans. Comput. Co26 (1975), 153-161.]]Google Scholar
- PIPP~NGER, N. The complexity theory of switching networks. Tech. Rep., Research Laboratory of Electronics, M.I.T., Cambridge, Mass., 1973.]]Google Scholar
- PIPPENaER, N. On the complexity of strictly nonblocking concentration networks. IEEE Trans. Commun. COM-22 (1974), 1890-1893.]]Google Scholar
- PIPPENGER, N. On crossbar switching networks. IEEE Trans. Commun. COM-23 (1975), 646-659.]]Google Scholar
- PmTLE, M. Intercommunication of processors and memory. Proc. Fall JCC, Vol. 31, Thompson Publishing, Washington, D.C., pp. 621-633.]]Google Scholar
- PREPARATA, F.P. New parallel-sorting schemes. IEEE Trans. Comput. C-27 (1978), 669-673.]]Google Scholar
- Rr.~TER, R. Scheduling parallel computations. J. ACM 15, 4 (Oct. 1968), 590-599.]] Google Scholar
- RoDmoUr. Z, J.E. A graph model for parallel computation. Ph.D. Dissertation, M.I.T., Cambridge, Mass., 1967. See also Project MAC Rep. ESL-R-398, MAC-TR-64, Sept. 1969.]] Google Scholar
- RODRiGUEZ, J.E. Parallel processes, schemata, and transformations. IBM Res. Rep. RC 2912, IBM Thomas J. Watson Research Center, Yorktown Heights, N.Y., 1970.]]Google Scholar
- ROSEN, K.F. An algorithm for random walks on highly parallel machines. Tech. Rep. TR-15, Cooley Electronics Laboratory, Univ. of Michigan, Ann Arbor, Mich., 1964.]]Google Scholar
- ROSENFELD, J.L. A case study in programming for parallel processors. Commun. ACM 12, 12 (Dec. 1969), 645-655.]] Google Scholar
- ROSENFELO, J.L., AND DRiSCOLL, C.G. Solution of the Dirichlet problem on a simulated parallel processing system. Proc. IFIP Congress 1968, North-Holland, Amsterdam, 1969, pp. 499-507.]]Google Scholar
- SAMEH, A.H. On Jacobi and Jacobi-like algorithms for a parallel computer. Math. Comp. 25 (1971), 579-590.]]Google Scholar
- SAMEH, A.H. Linear system solvers for parallel computers. Tech. Rep., Dep. of Computer Science, Univ. of Illinois, Urbana, Ill., 1975.]]Google Scholar
- SAMEH, A., AND KUCK, D. Linear system solvers for parallel computers. Rep. 701 (NSF-OCS-GJ- 36936-000009), Computer Science Dep., Univ. of Illinois, Urbana, Ill., 1975.]]Google Scholar
- SAMEH, A., AND KUCK, D. A parallel QR-algorithm for symmetric tridiagonal matrices. IEEE Trans. Comput. C-26 (1977), 147-153.]]Google Scholar
- SAMEH, A., ANO KUCK, D. On stable parallel linear system solvers. J. ACM 25, 1 (Jan. 1978), 81-91.]] Google Scholar
- SAVAGr., C. Parallel algorithms for graph-theoretic problems. Ph.D. Dissertation, Univ. of illinois, Urbana, Ill., 1978.]]Google Scholar
- SCHWARTZ, J. Large parallel computers. J. ACM 13, 1 (Jan. 1966), 25-32.]] Google Scholar
- S~^NNON, C.E. Memory requirements in a telephone exchange. Bell Syst. Tech. J. 29 (1950), 347- 349.]]Google Scholar
- SHAPmo, H.D. Storage schemes in parallel memories. Proc. I975 Sagamore Conf. On Parallel Processing, Sagamore Lake, N.Y., 1975, pp. 159-164.]]Google Scholar
- SHArmo, H.D. Theoretical limitations on the use of parallel memories. Ph.D. Thesis, Computer Science Dep., Univ. of Illinois, Urbana, ill., 1976.]]Google Scholar
- SHr~OLER, G.S. Parallel numerical methods for the solution of equations. Comrnun. ACM 10, 5 (May 1967), 286-291.]] Google Scholar
- SHEVLER, G.S., AND LEHMAN, M. Evaluation of redundancy in a parallel algorithm. IBM Syst. J. 6 (1967), 142-149.]]Google Scholar
- S~F. GEL, H.J. Single instruction stream--multiple data stream machine interconnection network design. Proc. 1976 Int. Conf. on Parallel Processing, Detroit, Mich., 1976, pp. 272-280.]]Google Scholar
- SISCEL, H.J. Analysis techniques for SIMD machine interconnection networks and the effects of processor address masks. IEEE Trans. Cornput. C-26 (1977), 153-161. See also Proc. 1975 Sagamore Conf. on Parallel Computing, Sagamore Lake, N.Y., 1975, pp. 106-109.]]Google Scholar
- SIEGEL, H.J. The universality of various types of SIMD machine interconnection networks. Proc. IEEE 4th Ann. Symp. on Computer Architecture, Silver Springs, Md., 1977, pp. 70-79.]] Google Scholar
- SISGEL, H.J., AND SMITH, S.D. Study of multistage SIMD interconnection networks. Proc. IEEE 15th Ann. Symp. on Computer Architecture, Stanford, Calif., pp. 223-229.]] Google Scholar
- SMITH, B.J. An analysis of sorting networks. Final Tech. Rep., ONR Contract N00014-70-A-0362- OO06, 1972.]] Google Scholar
- STONE, H.S. The organization of high-speed memory for parallel block transfer of data. IEEE Trans. Comput. C-19 (1970), 47-53.]]Google Scholar
- STONE, H.S. A logic-in-memory computer. IEEE Trans. Comput. C-18 (1970), 719-727.]]Google Scholar
- STONE, H.S. Dynamic memories with enhanced data access. IEEE Trans. Comput. C-21 (1972), 359-366.]]Google Scholar
- STONE, H.S. Problems of parallel computation. In Complexity of Sequential and Parallel Numerical Algorithms, J.F. Traub, Ed., Academic Press, New York, 1973, pp. 1-17.]]Google Scholar
- STONE, H.S. An efficient parallel algorithm for the solution of a tridiagonal linear system of equations. J. ACM 20, 1 (Jan. 1973), 27-38. ..]] Google Scholar
- STONE, H.S. Discrete Mathematical Structures and Their Applications. Science Research Associates, Chicago, II1., 1973.]]Google Scholar
- STONE, H.S. Parallel tridiagonal equation solvers. ACM Trans. Math. Softw. 1, 4 (Dec. 1975a), 289- 307.]] Google Scholar
- STONE, H.S. Parallel computers. In introduction to Computer Architecture, Science Research Associates, Palo Alto, Calif., 1975b, pp. 318-374.]]Google Scholar
- SULLIVAN, H., AND BASHKOW, T.R. A large scale, homogeneous, fully distributed parallel machine. Proc. 4th Annual Syrup. on Computer Architecture, Silver Springs, Md., 1977, pp. 105-117.]] Google Scholar
- SULLIVAN, H., BASHKOW, T.R., AND KLAPPHOLZ, D. High level language constructions in a selforganizing parallel processor. Proc. 1977 Int. Conf. on Parallel Processing, Detroit, Mich., 1977, pp. 163-174.]]Google Scholar
- SULLIVAN, H., BASHKOW, T.R., AND KLAPPHOLZ, D. A large scale, homogeneous fully distributed parallel machine, II. Proc. 4th Ann. Syrup. on Computer Architecture, Silver Springs, Md., 1977, pp. 118-127.]] Google Scholar
- SULLIVAN, H., BASHKOW, T.R., KLAPPHOLZ, D., AND COHN, L. The node kernel: resource management in a self organizing parallel processor. Proc. 1977 Int. Conf. on Parallel Processing, Detroit, Mich., 1977, pp. 157-162.]]Google Scholar
- SWANSON, R.C. Interconnection for parallel memories to unscramble p-ordered vectors. IEEE Trans. Comput. C-23 (1974), 1105-1115.]]Google Scholar
- THOMPSON, C.D. Generalized connection networks for parallel processor interconnection. IEEE Trans. Comput. C-27 (1978), 1119-1125.]]Google Scholar
- THOMPSON, C.D., AND KUNG, K.T. Sorting on a mesh-connected parallel computer. Proc. 8th Ann. ACM Syrup. on Theory of Computing, Hershey, Pa., 1976, pp. 59-64.]] Google Scholar
- THURBER, K.J. Programmable indexing networks. Proc. 1970 Spring JCC, AFIPS Press, Arlington, Va., pp. 51-58.]]Google Scholar
- THURSER, K.J. Permutation switching networks. Proc. 1971 Computer Designer's Conf., Anaheim, Calif., 1971, pp. 7-24.]]Google Scholar
- THURBER, K.J. Interconnection networks--A survey and assessment. AFIPS Conf. Proc., Vol. 43, AFIPS Press, Arlington, Va., pp. 909-919.]]Google Scholar
- TRAUB, J.F. Iterative solution of tridiagonal system's on parallel and vector computers. In Complexity of Sequential and Parallel Numerical Algorithms, Academic Press, New York, 1973, pp. 49-82.]]Google Scholar
- TROUT, H.R.G. Parallel techniques. Tech. Rep., Dep. of Computer Science, Univ. of Illinois, Urbana, ill., 1972.]]Google Scholar
- TSAO-Wu, N.T., AND OPrERMAN, D.C. On permutation algorithms for rearrangeable switching networks. Conf. Rec. 1969 IEEE Int. Conf. on Communications, Boulder, Colo., 1969, pp. 10-29-10-34.]]Google Scholar
- TsE, E. Parallel computation of the conditional mean state estimate for nonlinear systems. 2nd Symp. on Nonlinear Estimation Theory, San Diego, Calif., 1971, pp. 385-395.]]Google Scholar
- TsE, E., AND LARSON, R.E. Optimum quantization and parallel algorithms for nonlinear state estimation. 3rd Syrup. on Nonlinear Estimation Theory, San Diego, Calif., 1972.]]Google Scholar
- TUTTLS, P.G. Implementation of selected eigenvalue algorithms on a vector computer. Master's Thesis, Univ. of Virginia, Charlottesville, Va., 1975.]]Google Scholar
- VALIANT, L.G. The intrinsic complexity of parallelism in comparison problems. Tech. Rep., Dep. of Computer Science, Carnegie-Mellon Univ., Pittsburgh, Pa., 1974.]]Google Scholar
- VAN VOORHIS, D.C. A lower bound for sorting networks that use the divide-sort-merge strategy. Tech. Rep. 17, Digital Systems Laboratory, Stanford Univ., Stanford, Calif., 1971.]] Google Scholar
- VAN VOORHIS, D.C. A generalization of the divide-sort-merge strategy for sorting networks. Tech. Rep. 16, Digital Systems Laboratory, Stanford Univ., Stanford, Calif., 1971.]] Google Scholar
- VAN VOORHXS, D.C. Large {g, d} sorting networks. Tech. Rep. 18, Digital Systems Laboratory, Stanford Univ., Stanford, Calif., 1971.]] Google Scholar
- VAN VOORaIS, D.C. Toward a lower bound for sorting networks. In Complexity of Computer Computations, Plenum Publishing, New York, 1972, pp. 119-129.]]Google Scholar
- VAN Voogms, D.C. An economical construction for sorting networks. Working Paper 16/A45 No. 1, IBM Systems Development Division, Los Gatos, Calif. See also Proc. 1974 NCC, AFIPS Press, Arlington, Va., pp. 921-926.]]Google Scholar
- WAKSMAN, A. On permutation networks. Proc. Hawaii Int. Conf. on System Sciences, Univ. of Hawaii, Honolulu, 1968, pp. 581-582.]]Google Scholar
- WAaD, R.C. The QR algorithm and Hyman's method on vector computers. Math. Comput. 30 (1976), 132-142.]]Google Scholar
- WEN, K.Y. Interprocessor connection--Capabilities, exploitation and effectiveness. Ph.D. Thesis, 1976.]] Google Scholar
- WINOCRAD, S. On the time required to perform addition. J. ACM 12, 2 (April 1965), 277-285.]] Google Scholar
- WINOGRAD, S. On the time required to perform multiplication. J. ACM 14, 4 (Oct. 1967), 793-802.]] Google Scholar
- WlTTm, L.D. Efficient message routing in mega-microcomputer networks. Proc. iEEE 3rd Ann. Symp. on Computer Architecture, Clearwater, Fla., 1976, pp. 136-140.]] Google Scholar
- WoNt, C.K., AND YUE, P.C. Anticipatory control on a permutable memory. IEEE Trans. Comput. C-22 (1973), 481-488.]]Google Scholar
- WULF, W., AND B~.LL, C.G. C.mmpmA multi-mini-processor. Proc. AFIPS 1972 Fall JCC, Vol. 41, AFIPS Press, Arlington, Va., pp. 765-778.]]Google Scholar
Index Terms
- Ultracomputers
Recommendations
Ultracomputers: a teraflop before its time
Multiprocessor performance measurement and evaluationThe performance and scalability of SHMEM and MPI-2 one-sided routines on a SGI Origin 2000 and a Cray T3E-600: Performances
This paper compares the performance and scalability of SHMEM and MPI-2 one-sided routines on different communication patterns for a SGI Origin 2000 and a Cray T3E-600. The communication tests were chosen to represent commonly used communication patterns ...
OpenMP for Networks of SMPs
In this paper, we present the first system that implements OpenMP on a network of shared-memory multiprocessors. This system enables the programmer to rely on a single, standard, shared-memory API for parallelization within a multiprocessor and between ...
Comments