Abstract
Since the advent of high-performance distributed-memory parallel computing, the need for intelligible code has become ever greater. The development and maintenance of libraries for these architectures is simply too complex to be amenable to conventional approaches to implementation. Attempts to employ traditional methodology have led, in our opinion, to the production of an abundance of anfractuous code that is difficult to maintain and almost impossible to upgrade.

Having struggled with these issues for more than a decade, we have concluded that a solution is to apply a technique from theoretical computer science, formal derivation, to the development of high-performance linear algebra libraries. We believe the resulting approach yields aesthetically pleasing, coherent code that greatly facilitates intelligent modularity and high performance while enhancing confidence in its correctness. Since the technique is language-independent, it lends itself equally well to a wide spectrum of programming languages (and paradigms), ranging from C and Fortran to C++ and Java.

In this paper, we illustrate our observations by looking at the Formal Linear Algebra Methods Environment (FLAME), a framework that facilitates the derivation and implementation of linear algebra algorithms on sequential architectures. This environment demonstrates that lessons learned in the distributed-memory world can guide us toward better approaches even in the sequential world. We present performance experiments on the Intel(R) Pentium(R) III processor that demonstrate that high performance can be attained by coding at a high level of abstraction.
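To give a sense of what "coding at a high level of abstraction" means here, the following is a minimal sketch (not taken from the paper, and using NumPy rather than the FLAME API) of a right-looking blocked Cholesky factorization written in the partitioned style that FLAME-derived algorithms take: at each iteration the matrix is repartitioned to expose a small diagonal block, and the loop body is a direct transcription of the block updates. The function name and block size are illustrative.

```python
import numpy as np

def chol_blocked(A, nb=2):
    """Right-looking blocked Cholesky factorization, written in a
    FLAME-like partitioned style. At each step we expose the leading
    nb x nb diagonal block A11, the panel A21 below it, and the
    trailing submatrix A22, then apply three block updates."""
    A = A.copy()
    n = A.shape[0]
    k = 0
    while k < n:
        b = min(nb, n - k)
        A21 = A[k+b:, k:k+b]
        A22 = A[k+b:, k+b:]
        # A11 := Chol(A11)  (unblocked/small factorization)
        A[k:k+b, k:k+b] = np.linalg.cholesky(A[k:k+b, k:k+b])
        L11 = A[k:k+b, k:k+b]
        # A21 := A21 * inv(L11)^T  (triangular solve with multiple RHS)
        A[k+b:, k:k+b] = np.linalg.solve(L11, A21.T).T
        L21 = A[k+b:, k:k+b]
        # A22 := A22 - L21 * L21^T  (symmetric rank-b update)
        A[k+b:, k+b:] = A22 - L21 @ L21.T
        k += b
    return np.tril(A)
```

The point of the style is that each line of the loop body corresponds one-to-one with an update in the formally derived algorithm, so the code can be checked against the derivation by inspection; the bulk of the floating-point work lands in the rank-b update, which is a level-3 BLAS operation.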