Miranda: A non-strict functional language with polymorphic types

D. A. Turner

Data flow graph optimization in if1

Abstract

Optimization techniques are as important when compiling data flow languages as when compiling conventional languages. This paper describes work that has been done on optimizers for SISAL programs that have been translated into IF1 data flow graphs. It shows that conventional optimization algorithms can be easily and efficiently implemented for data flow graphs, and that the payoff for even simple optimizations can be significant.

S. K. Skedzielewski, M. L. Welcome

Strictness analysis — a practical approach

Abstract

Significant improvements in performance arise if we can arrange for parallel execution of programs. The absence of side effects in functional languages allows concurrent evaluation of the program, but in lazy implementations this risks wasting work by evaluating expressions which are subsequently discarded. We discuss the use of strictness analysis to determine at compile time which parts of program evaluation can safely be carried out concurrently. We give a practical explanation of this technique, concentrating particularly on the problem of finding fixed points.

Chris Clack, Simon L Peyton Jones

The categorical abstract machine

Abstract

The Cartesian closed categories have been shown by several authors to provide the right framework of the model theory of λ-calculus. The second author developed this as a syntactic equivalence between two calculi, giving rise to a new kind of combinatory logic: the categorical combinatory logic, where computations can be done through simple rewrite rules, and, as usual with combinators, avoiding problems with variable name clashes. This paper goes further (though not requiring a previous knowledge of categorical combinatory logic) and describes a very simple machine where categorical terms can be considered as code acting on a graph of values. The only saving mechanism is a stack containing pointers on code or on the graph. Abstractions are handled using closures. The machine is called categorical abstract machine or CAM. The CAM is easier to grasp and prove than the SECD machine. The natural evaluation strategy in the CAM is call-by-value, but lazy evaluation can be easily incorporated. The paper discusses the implementation of a real functional programming language, ML, through the CAM. A basic acquaintance with λ-calculus is required.

G. Cousineau, P-L. Curien, M. Mauny

High order programming in extended FP

Patrick Bellot

Secd-m: a virtual machine for applicative programming

Abstract

We present a virtual machine to support applicative multiprogramming — in description of concurrent, asynchronous systems such as operating systems in a functional style. The machine extends Landin's secd machine to support multiple concurrent expression evaluation, non-determinism in the form of the fair merge, and a full range of input and output devices. This allows systems programs to be written in a functional style. The secd-m machine has been implemented and a number of functional concurrent programs demonstrated.

S. Abramsky, R. Sykes

Cobweb — A combinator reduction architecture

Abstract

The work reported in this paper represents the convergence of ideas stemming from two areas of research. In the hardware field there has been significant interest in developing the techniques of Wafer Scale Integration to provide large assemblies of tightly-coupled simple processors that can act cooperatively in the execution of a task. In the software field there has been a growing awareness that declarative languages lead to higher programmer productivity and offer more potential for parallel evaluation than the traditional, imperative languages.

This paper describes the first machine in a family called COBWEB. The common denominator of the family is that all of the machines are targetted to supporting functional languages and all execute some form of combinator code. The first machine employs normal order reduction to evaluate pure functional programs. The next machine will use a parallel reduction strategy and later members will support progressively more sophisticated facilities. All of the machine are intended to exploit the potential of Wafer Scale Integration.

CL Hankin, PE Osmon, MJ Shute

How to replace failure by a list of successes a method for exception handling, backtracking, and pattern matching in lazy functional languages

Abstract

Should special features for exception handling, backtracking, or pattern matching be included in a programming language? This paper presents a method whereby some programs that use these features can be re-written in a functional language with lazy evaluation, without the use of any special features. This method may be of use for practicing functional programmers; in addition, it provides further evidence of the power of lazy evaluation. The method itself is straightforward: each term that may raise an exception or backtrack is replaced by a term that returns a list of values. In the special case of pattern matching without backtracking, the method can be applied even if lazy evaluation is not present. The method should be suitable for applications such as theorem proving using tacticals, as in ML/LCF.

Philip Wadler

Lazy memo-functions

John Hughes

An architecture for fast data movement in the FFP machine

Abstract

We propose an extension to the architecture of the FFP machine which permits storage management in O(log n) time and arbitrary data rearrangement in O(log² n) time, assuming messages are transmitted in unit time independent of wire length. This is achieved using a perfect shuffle sorter network with the FFP machine. The cost is only a constant amount of hardware per atomic symbol. We also propose mechanisms for early termination, that is, easy rearrangements may be performed faster than the worst case time. We propose a hierarchical structure to the FFP machine which is compatible with a modification of the perfect shuffle sorter network. This permits local operations to be faster than global operations, and permits the FFP machine to operate as a collection of independent processors when desired. A method for giving the programmer some control of the use of this hierarchy is proposed. The FFP machine selects certain subexpressions of the input string to be processed during each computation step. These subexpressions are effectively isolated from the rest of the input string during the computation. We show how this isolation may be accomplished in the perfect shuffle sorter network without giving unique names to each such subexpression. This architecture permits fast operations, such as the multiplication of two m by m matrices in O(log² m) time.

David A. Plaisted

An architecture that efficiently updates associative aggregates in applicative programming languages

Abstract

Applicative (also called functional) programming systems prohibit side effects, including assignments to variables. This restriction has several advantages, including referential transparency and potential parallel program execution. A major disadvantage, however, is that aggregate data structures become very expensive to maintain: when the programmer updates a single element in an aggregate, many applicative language implementations must completely recopy the aggregate. This paper solves the aggregate update problem with a two-level architecture: a microprogram maintains “associative aggregate” data structures, and a hardware memory design (the Associative Aggregate Machine) implements powerful insertion, deletion and searching operations required by the microprogram. The Associative Aggregate Machine contains a linear sequence of cells comprising storage and combinational logic. Each cell is connected to its predecessor and successor, so the sequence of cells forms a shift register that supports insertion and deletion. In addition, a binary tree of combinational logic nodes performs fast associative searching through the sequence of cells. The Associative Aggregate Machine architecture is extremely regular and is well suited for VLSI implementation.

John T. O'Donnell

Lambda lifting: Transforming programs to recursive equations

Abstract

Lambda lifting is a technique for transforming a functional program with local function definitions, possibly with free variables in the function definitions, into a program consisting only of global function (combinator) definitions which will be used as rewrite rules. Different ways of doing lambda lifting are presented, as well as reasons for rejecting or selecting the method used in our Lazy ML compiler. A functional program implementing the chosen algorithm is given.

Thomas Johnsson

Optimizing almost-tail-recursive prolog programs

Abstract

There is a wide class of problems for which the natural Prolog specification, as a top-down, recursive computation, is significantly less efficient than an iterative bottom-up version. However, the transformation from the top-down specification to the bottom-up implementation is not always obvious, principally due to problems with nondeterminism and the order in which variables get bound — problems which do not arise in comparable situations in functional languages. This paper illustrates how these problems can be handled in certain cases, and the transformation mechanized, using algebraic properties of operators such as associativity and distributivity. The resulting programs are tail-recursive, and hence significantly more efficient in space usage, with no deterioration in execution speed.

Saumya K. Debray

Designing regular array architectures using higher order functions

Abstract

Functional programmers often use higher order functions such as map, reduce and filter in writing programs. By giving such higher order functions geometric as well as behavioural interpretation, we use similar techniques to design regular array architectures. We use some higher order functions from programming and two which were introduced specifically for describing and reasoning about regular arrays. Our higher order functions obey a number of algebraic laws. This allows us to use program transformation in the design process. We transform a correct (and understandable) initial design into one which is more suitable for VLSI implementation. The algebraic laws guarantee the correctness of the final circuit.

We have demonstrated our techniques by developing a novel bit-level systolic binary multiplier.

Mary Sheeran

: An environment for the multi-level specification, analysis, and synthesis of hardware algorithms

Abstract

This paper describes a method based on applicative languages for the specification, evaluation and synthesis of hardware algorithms. The goal of the research effort is to provide designers with an environment in which they can rapidly explore alternative designs for their algorithms throughout the synthesis process. It is possible to specify the algorithm at arbitrary levels of abstraction and have the system rapidly evaluate certain parameters (e.g. speed, area, etc.) so that designers can make informed decisions during the synthesis process. Layouts which are suitable as floor plans are extracted from high-level algorithms.

Dorab Patel, Martine Schlag, Miloš Ercegovac

A distributed garbage collection algorithm

John Hughes

Cyclic reference counting for combinator machines

Abstract

This new algorithm deals correctly and automatically with the kind of cyclic (i.e. self-referencing) structures which arise in a combinator graph reduction machine. By extending the standard reference count algorithm, cycles can be handled safely at little extra cost. Cyclic reference counting uses one extra bit per pointer and per object and one extra reference count per object. Extra computation amounts to inspection of the objects in a cyclic structure from time to time before the structure becomes free. In the absence of cycles, no computation overhead is incurred. When executing some typical recursive programs, the number of objects inspected is approximately equal to the number of new objects used during execution.

D. R. Brownbridge

Design for a multiprocessing heap with on-board reference counting

David S. Wise

A functional language and modular architecture for scientific computing

Abstract

An experimental functionally programmed multiprocessor for high performance computing applications is the subject of an ongoing research project at FPS, with the objective of bringing the benefits of functional programming to the scientific and engineering computing community.

The system architecture is a modular, heterogeneous multiprocessor which supports the parallel evaluation of "Flo," a stream oriented functional language. Streams (infinite sequences) give expressive power to the language and require normal order evaluation techniques. Sub-expressions of a Flo program graph are distributed among several specialized processors on which evaluation proceeds concurrently.

This paper gives an overview of the system and discusses its language, architecture, and operation.

Mark F. Young

Practical polymorphism

Abstract

Polymorphic type systems as proposed by Milner and implemented in the programming language ML offer rich types, unobtrusive compile-time type-checking and complete type-safety in functional languages. However, straightforward addition of such a type system to languages with interactive environments such as FQL, SASL or Scheme can inhibit seriously the top-down, incremental programming style characteristically employed with them. The problems include error-recovery during type-checking, type-checking with incomplete information and incremental type-checking during program development. We describe these problems and present an integrated solution as prototyped in FQL.

Rishiyur S. Nikhil

Program verification in a logical theory of constructions

Abstract

The logical theory of constructions is a simple theory which combines functional programs and intuitionistic predicate calculus. Here we propose that it is a practical alternative to other constructive programming logics, such as Martin-Löf's type theory. Its main advantage is that it admits reasoning directly about general recursion, while maintaining that all typed programs terminate. We illustrate the use of this theory by verifying the general recursive subtractive division program.

Peter Dybjer

Transforming recursive programs for execution on parallel machines

V. J. Bush, J. R. Gurd

Compiling pattern matching

Lennart Augustsson

Serial combinators: "optimal" grains of parallelism

Abstract

A method is described for translating a high-level functional program into combinators suitable for execution on multiprocessors with no shared memory. It is argued that the granularity of the standard set of fixed combinators is too fine, whereas the granularity of user-defined functions is too coarse. The notion of a serial combinator is introduced that in some sense has optimal granularity, and that takes into account pragmatic issues such as the complexity of expressions and communication costs between processors. The technique improves on the standard notion of super-combinators by making them larger to retain locality and improve efficiency, and by ensuring that they have no concurrent substructure that could result in lost parallelism. Simulation results demonstrate improved performance on both sequential and parallel computing models.

Paul Hudak, Benjamin Goldberg

The G-machine: A fast, graph-reduction evaluator

Abstract

The G-machine is an abstract architecture for evaluating functional-language programs by programmed graph reduction. Unlike combinator reduction, in which control is derived dynamically from the expression graph itself, control in programmed graph reduction is specified by a sequence of instructions derived by compiling an applicative expression.

The G-machine architecture was defined by Thomas Johnsson and Lennart Augustsson (Gothenburg) as the evaluation model for a compiler for a dialect of ML with lazy evaluation rules. This paper describes a sequential evaluator based upon that abstract architecture. It discusses performance issues affecting reduction architectures, then describes the organization of a hardware design to address these issues. The interplay between compilation strategies and the computational engine is exploited in this design.

Principal features of the design are (i) hardware support for graph traversal, (ii) a vertically microcoded, pipelined internal architecture, (iii) an instruction fetch and translation unit with very low latency, and (iv) a new memory architecture, one specifically suited to graph reduction and which can be extended to very large memories.

Richard B. Kieburtz

Springer Professional

Functional Programming Languages and Computer Architecture

Nancy, France, September 16–19, 1985

Inhaltsverzeichnis

Frontmatter

Miranda: A non-strict functional language with polymorphic types

Data flow graph optimization in if1

Strictness analysis — a practical approach

The categorical abstract machine

High order programming in extended FP

Secd-m: a virtual machine for applicative programming

Cobweb — A combinator reduction architecture

How to replace failure by a list of successes a method for exception handling, backtracking, and pattern matching in lazy functional languages

Lazy memo-functions

An architecture for fast data movement in the FFP machine

An architecture that efficiently updates associative aggregates in applicative programming languages

Lambda lifting: Transforming programs to recursive equations

Optimizing almost-tail-recursive prolog programs

Designing regular array architectures using higher order functions

: An environment for the multi-level specification, analysis, and synthesis of hardware algorithms

A distributed garbage collection algorithm

Cyclic reference counting for combinator machines

Design for a multiprocessing heap with on-board reference counting

A functional language and modular architecture for scientific computing

Practical polymorphism

Program verification in a logical theory of constructions

Transforming recursive programs for execution on parallel machines

Compiling pattern matching

Serial combinators: "optimal" grains of parallelism

The G-machine: A fast, graph-reduction evaluator