main-content

## Über dieses Buch

Fourier transforms of large multidimensional data sets arise in many fields --ranging from seismology to medical imaging. The rapidly increasing power of computer chips, the increased availability of vector and array processors, and the increasing size of the data sets to be analyzed make it both possible and necessary to analyze the data more than one dimension at a time. The increased freedom provided by multidimensional processing, however, also places intesive demands on the communication aspects of the computation, making it difficult to write code that takes all the algorithmic possiblities into account and matches these to the target architecture. This book develops algorithms for multi-dimensional Fourier transforms that yield highly efficient code on a variety of vector and parallel computers. By emphasizing the unified basis for the many approaches to one-dimensional and multidimensional Fourier transforms, this book not only clarifies the fundamental similarities, but also shows how to exploit the differences in optimizing implementations. This book will be of interest not only to applied mathematicians and computer scientists, but also to seismologists, high-energy physicists, crystallographers, and electrical engineers working on signal and image processing. Topics covered include: tensor products and the fast Fourier transform; finite Abelian groups and their Fourier transforms; Cooley- Tukey and Good-Thomas algorithms; lines and planes; reduced transform algorithms; field algorithms; implementation on Risc and parallel

## Inhaltsverzeichnis

### 1. Tensor Product

Abstract
Tensor product notation can be used to mathematically model, in terms of matrix factorizations, computations from many diverse fields, including digital signal processing, linear systems, and numerical analysis. Typically, large data sets are processed by algorithms characterized by intricate index calculations. The problem of analyzing and writing code for such algorithms that is tailored to a specific architecture or processor is both time-consuming and error-prone. The formalism of the tensor product notation provides powerful tools for keeping track of these index calculations and for establishing simple rules, in the form of tensor product identities, that can be used to modify an algorithm for optimal performance as data size and target architecture vary. By mapping certain basic tensor product operations onto code or hardware, a large array of algorithms can be implemented by simple algebraic manipulations rather than more timeconsuming programming manipulations.
Richard Tolimieri, Myoung An, Chao Lu

### 2. Multidimensional Tensor Product and FFT

Abstract
Multidimensional signal processing is a child of the computer revolution. Several fields of application such as weather, crystallography, sonar/radar, medical imaging, and seismology require multidimensional data arrays for complete and faithful modeling. The processing of such multidimensional data, one dimension at a time, introduces an error-prone and time-consuming reconstruction stage. Redundancies introduced by time-frequency, time-scale, and space-frequency analysis of speech and vision data naturally lead to multidimensional computations.
Richard Tolimieri, Myoung An, Chao Lu

### 3. Finite Abelian Groups

Abstract
The number of elements in a set X will be called the order of X and be denoted by o(X).
Richard Tolimieri, Myoung An, Chao Lu

### 4. Fourier Transform of Finite Abelian Groups

Abstract
The standard definition of the multidimensional Fourier transform (MDFT) assumes a fixed coordinate system representation of the indexing set. In this chapter, we will define and explore the MDFT in a more abstract setting, one that removes the dependence on coordinates and solely references the additive abelian group structure of the indexing set. This approach highlights the fundamental role played by the duality between periodization and decimation in MDFT algorithm design. This duality lies at the heart of all standard and recently discovered divide and conquer MDFT algorithms. Emphasizing the unity underlying these algorithms permits a deeper understanding of their differences and how these differences can be exploited in implementation. This is especially true in the design of massively parallel algorithms. Algorithm design is reduced to relatively few basic principles without having to account for the details of specific coordinates.
Richard Tolimieri, Myoung An, Chao Lu

### 5. Cooley—Tukey and Good—Thomas

Abstract
The abstract definition of the Fourier transform and the statement of Fourier transform duality as expressed by the periodization-decimation results of chapter 4 provide a unifying principle underlying most 1-dimensional and multidimensional FFTs. The C-T FFTs of chapters 1 and 2 are expressions of this duality. In [1], a vector-radix FFT was derived, extending this duality relative to groups of affine motions on the indexing set. This class of FFT is highly parallelizable.
Richard Tolimieri, Myoung An, Chao Lu

### 6. Lines

Abstract
The geometry of finite abelian groups has a long history of significant contributions to many fields in pure and applied mathematics. The arithmetic of this geometry is responsible for some of the deepest and most fruitful formulas in number theory and theta function theory [2]. It forms the basis for many results in algebraic block coding theory [1]. Most often the finite abelian groups come with additional structure, making them into modules and rings relative to which duality can be defined. Duality interleaves the additive group structure with multiplicative structure.
Richard Tolimieri, Myoung An, Chao Lu

### 7. Duality of Lines and Planes

Abstract
A mapping α : AA of a finite abelian group A is called a homomorphism of A if it satisfies
$$\alpha \left( {a + b} \right) = \alpha \left( a \right) + \alpha \left( b \right),a,b, \in A.$$
Richard Tolimieri, Myoung An, Chao Lu

### 8. Reduced Transform Algorithms

Abstract
Highly parallel computing machines can vary over a wide range of design philosophies, but all depend on some form of space-time concurrency for their potential high-speed computing capacity. Typically, such computers feature a collection of homogeneous processing elements (nodes) together with an interconnection network and can be characterized by the granularity, or power, of the node processors, the degree of parallelism as measured by the number of independent processing elements and the complexity of node coupling, which describes the degree of interaction between nodes.
Richard Tolimieri, Myoung An, Chao Lu

### 9. Field Algorithm

Abstract
In 1968, C. Rader [7]observed that for a prime number p, the p-point 1-dimensional FT could be computed by a (p— 1) x (p— 1) skew-circulant matrix action. S. Winograd and others greatly extended the range of Rader’s method to include the p R-point 1-dimensional FT and multidimensional generalizations [3].
Richard Tolimieri, Myoung An, Chao Lu

### 10. Implementation on RISC Architectures

Abstract
A wide variety of DFT and convolution algorithms have been designed to optimize computations with respect to the number of arithmetic operations, especially multiplications. Blahut (1985) [1]offers an excellent survey of many algorithms designed using this methodology. Today, with the rapid advance in VLSI technology and the availability of high-speed and inexpensive floating-point processors, the time required to carry out a fixed-point addressing operation or a floating-point addition can effectively be the same as that for the floating-point multiplication. Some advanced architectures have these functional units working in parallel, with multiple operations realized in one or a few cycles at the same time. Traditional algorithm design of trading multiplications for additions, therefore, is not only ineffective but can result in a significant decrease in performance.
Richard Tolimieri, Myoung An, Chao Lu

### 11. Implementation on Parallel Architectures

Abstract
In this chapter, we will consider some issues surrounding parallel implementation of several MDFT algorithms on a broadcast mode multiprocessor machine. Such machines typically feature a collection of homogeneous processing elements (nodes) together with an interconnection network of a regular topology for interprocessor communication. The node processors are externally connected by a single I/O channel to a host through which all data loads and unloads are carried out (see Figure 11.1).
Richard Tolimieri, Myoung An, Chao Lu

### Backmatter

Weitere Informationen