
2013 | Book

Handbook of Signal Processing Systems

Editors: Shuvra S. Bhattacharyya, Ed F. Deprettere, Rainer Leupers, Jarmo Takala

Publisher: Springer New York


About this book

Handbook of Signal Processing Systems is organized in three parts. The first part motivates representative applications that drive and apply state-of-the-art methods for the design and implementation of signal processing systems; the second part discusses architectures for implementing these applications; the third part focuses on compilers and simulation tools, and describes models of computation and their associated design tools and methodologies.

This handbook is an essential tool for professionals in many fields and researchers of all levels.

Table of Contents

Frontmatter

Applications

Frontmatter
Signal Processing for Stereoscopic and Multi-View 3D Displays

Displays which aim at visualizing 3D scenes with realistic depth are known as “3D displays”. Due to technical limitations and design decisions, such displays might create visible distortions, which are interpreted by the human visual system as artifacts. This chapter overviews a number of signal processing techniques for decreasing the visibility of artifacts on 3D displays. It begins by identifying the properties of a scene which the brain utilizes for perceiving depth. Next, the operating principles of the most popular types of 3D displays are explained. A signal processing channel is proposed as a general model reflecting these principles. The model is applied in analyzing how visual quality is influenced by display distortions. The analysis identifies a set of optical properties which are directly related to perceived quality. A methodology for measuring these properties and creating a quality profile of a 3D display is discussed. A comparative study is presented, introducing measurement results on the visual quality and sweet-spot positions of a number of 3D displays of different types. Based on knowledge of 3D artifact visibility and an understanding of the distortions introduced by 3D displays, a number of signal processing techniques for artifact mitigation are overviewed. These include a methodology for passband optimization which addresses typical 3D display artifacts (e.g. Moiré, fixed-pattern noise, and ghosting), a framework for the design of tunable anti-aliasing filters, and a set of real-time algorithms for viewpoint-based optimization.

Atanas Boev, Robert Bregovic, Atanas Gotchev
Video Compression

In this chapter, we show the demands of video compression and introduce video coding systems built on state-of-the-art signal processing techniques. In the first section, we review the evolution of video coding standards, which were developed to overcome the limited storage capacity and communication bandwidth available to video applications. In the second section, the basic components of a video coding system are introduced; the redundant information in a video sequence is identified and removed to achieve data compression. In the third section, we introduce several emerging video applications (including High Definition TeleVision (HDTV), streaming, surveillance, and multiview video) and the corresponding video coding systems. People will not stop pursuing more vivid video services, so video coding systems with better coding performance and visual quality will continue to be developed.
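The redundancy removal described above can be illustrated with a toy sketch (not any standardized codec): when two consecutive frames differ in only a few pixels, compressing the inter-frame residual is far cheaper than compressing the frame itself. The frame contents and sizes here are synthetic, chosen only to make the effect visible.

```python
import zlib
import random

random.seed(0)
# A "frame" of pseudo-random pixel values, standing in for textured image content.
frame1 = bytes(random.randrange(256) for _ in range(4096))
# The next frame differs in only a few pixels: typical temporal redundancy.
f2 = bytearray(frame1)
for i in range(0, 4096, 512):
    f2[i] = (f2[i] + 5) % 256
frame2 = bytes(f2)

# "Intra" coding: compress the new frame directly.
intra_size = len(zlib.compress(frame2))
# "Inter" coding: compress only the residual against the previous frame.
residual = bytes((b - a) % 256 for a, b in zip(frame1, frame2))
inter_size = len(zlib.compress(residual))

print(intra_size, inter_size)  # the residual compresses far better
```

Real codecs go much further (motion compensation, transform coding, entropy coding), but the principle of exploiting temporal redundancy is the same.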

Yu-Han Chen, Liang-Gee Chen
Inertial Sensors and Their Applications

Due to the universal presence of motion, vibration, and shock, inertial motion sensors can be applied in various contexts. The development of microelectromechanical systems (MEMS) technology has opened up many new consumer and automotive applications for accelerometers and gyroscopes. The large variety of applications imposes different requirements on inertial sensors in terms of accuracy, size, power consumption, and cost, which makes it difficult to choose the sensors best suited for a particular application. Signal processing methods depend on the application and should reflect the physical principles behind it. This chapter describes the operating principles of accelerometers and gyroscopes and various applications involving inertial sensors. It also gives examples of signal processing algorithms for pedestrian navigation and motion classification.

Jussi Collin, Pavel Davidson, Martti Kirkko-Jaakkola, Helena Leppäkoski
Finding It Now: Construction and Configuration of Networked Classifiers in Real-Time Stream Mining Systems

As data becomes more prolific and complex, the ability to process it and extract valuable information has become a critical requirement. However, performing such signal processing tasks requires solving multiple challenges. Indeed, information must frequently be extracted (a) from many distinct data streams, (b) using limited resources, and (c) in real time to be of value. The aim of this chapter is to describe and optimize signal processing systems aimed at extracting, in real time, valuable information out of large-scale decentralized datasets. A first section explains the motivations and stakes which have made stream mining a new and emerging field of research, and describes key characteristics and challenges of stream mining applications. We then formalize an analytical framework which is used to describe and optimize distributed stream mining knowledge extraction from large-scale streams. In stream mining applications, classifiers are organized into a connected topology mapped onto a distributed infrastructure. We study linear chains of classifiers, determine how the ordering of the classifiers in the chain impacts classification accuracy and delay, and determine how to choose the most suitable order. Finally, we present a decentralized decision framework upon which distributed algorithms for joint topology construction and local classifier configuration can be built. Stream mining is an active field of research at the crossing of various disciplines, including multimedia signal processing, distributed systems, and machine learning. As such, we indicate several areas for future research and development.

Raphaël Ducasse, Mihaela van der Schaar
High-Energy Physics

High-energy physics (HEP) applications represent a cutting-edge field for signal processing systems. HEP applications require sophisticated hardware-based systems to process the massive amounts of data that they generate. Scientists use these systems to identify and isolate the fundamental particles produced during collisions in particle accelerators. This chapter examines the fundamental characteristics of HEP applications and the technical and developmental challenges that shape the design of signal processing systems for HEP. These challenges include huge data rates, low latencies, evolving specifications, and long design times. We cover techniques for HEP system design, including scalable designs, testing and verification, dataflow-based modeling, and design partitioning. Throughout, we provide concrete examples from the design of the Level-1 Trigger System for the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC). We also discuss some of the new physics algorithms to be included in the LHC's upcoming high-luminosity upgrade.

Anthony Gregerson, Michael J. Schulte, Katherine Compton
Signal Processing for Wireless Transceivers

The data rates as well as quality of service (QoS) requirements for a rich user experience in wireless communication services are continuously growing. While consuming a major portion of the energy needed by wireless devices, the wireless transceivers have a key role in guaranteeing the needed data rates with high bandwidth efficiency. The cost of wireless devices also depends heavily on the transmitter and receiver technologies. In this chapter, we concentrate on the problem of transmitting information sequences efficiently through a wireless channel and performing reception such that it can be implemented with state-of-the-art signal processing tools. The operations of wireless devices can be divided into RF and baseband (BB) processing. Our emphasis is on the BB part, including the coding, modulation, and waveform generation functions, which mostly use tools and techniques from digital signal processing. But we also look at the overall transceiver from the RF system point of view, covering issues like frequency translation and channelization filtering, as well as emerging techniques for mitigating the inevitable imperfections of the analog RF circuitry through advanced digital signal processing techniques.
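One of the baseband functions mentioned above, modulation, can be sketched in a few lines. The following is a minimal Gray-mapped QPSK symbol mapper, with the constellation normalized to unit energy; the mapping table is one common convention, not the only one.

```python
import math

# Gray-mapped QPSK: each pair of bits selects one complex symbol,
# and adjacent constellation points differ in exactly one bit.
QPSK = {(0, 0): complex(1, 1),  (0, 1): complex(-1, 1),
        (1, 1): complex(-1, -1), (1, 0): complex(1, -1)}

def modulate(bits):
    s = 1 / math.sqrt(2)  # scale so every symbol has unit energy
    return [QPSK[(bits[i], bits[i + 1])] * s for i in range(0, len(bits), 2)]

syms = modulate([0, 0, 1, 1, 0, 1])
print(syms)  # three unit-magnitude complex symbols
```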

Markku Juntti, Markku Renfors, Mikko Valkama
Signal Processing for Cryptography and Security Applications

Embedded devices need both an efficient and a secure implementation of cryptographic primitives. In this chapter we show how common signal processing techniques are used to achieve both objectives. Regarding efficiency, we first give an example of accelerating hash function primitives using the retiming transformation, a well-known technique for improving signal processing applications. Second, we outline the use of some special features of DSP processors and of techniques developed earlier for efficient implementations of public-key algorithms. Regarding secure implementations, we outline the concept of side-channel attacks and show how a multitude of data-preprocessing techniques are used in such scenarios. Finally, we discuss fuzzy secrets and point out the use of DSP techniques in an important role in cryptography: key derivation.

Miroslav Knežević, Lejla Batina, Elke De Mulder, Junfeng Fan, Benedikt Gierlichs, Yong Ki Lee, Roel Maes, Ingrid Verbauwhede
Digital Signal Processing in Home Entertainment

In the last decade or so, audio and video media switched from analog to digital, and so did consumer electronics. In this chapter we explore how digital signal processing has affected the creation, distribution, and consumption of digital media in the home. Using “photos”, “music”, and “video” as the three core media of home entertainment, we explore how advances in digital signal processing, such as audio and video compression schemes, have affected the various steps in the digital photo, music, and video pipelines. The emphasis in this chapter is on demonstrating how applications in the digital home drive and apply state-of-the-art methods for the design and implementation of signal processing systems, rather than on describing in detail any of the underlying algorithms or architectures, which we expect to be covered in more detail in other chapters. We also explore how media can be shared in the home, and provide a short review of the principles of the DLNA stack. We conclude with a discussion of digital rights management (DRM) and a short overview of the Microsoft Windows DRM.

Konstantinos Konstantinides
Signal Processing for Control

Signal processing and control are closely related. In fact, many controllers can be viewed as a special kind of signal processor that converts an exogenous input signal and a feedback signal into a control signal. Because the controller exists inside a feedback loop, it is subject to constraints and limitations that do not apply to other signal processors. A well-known example is that a stable controller in series with a stable plant can, because of the feedback, result in an unstable closed-loop system. Further constraints arise because the control signal drives a physical actuator that has limited range. The complexity of the signal processing in a control system is often quite low, as is illustrated by the Proportional + Integral + Derivative (PID) controller. Model predictive control is described as an exemplar of controllers with very demanding signal processing. ABS brakes are used to illustrate the possibilities for improved controller capability created by digital signal processing. Finally, suggestions for further reading are included.
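The low complexity of PID control mentioned above is easy to demonstrate: a discrete PID loop is a handful of arithmetic operations per sample. The sketch below drives a first-order plant toward a setpoint; the plant model and gains are illustrative choices, not values from the chapter.

```python
# Illustrative gains and time step (hypothetical, chosen for a stable loop).
kp, ki, kd = 2.0, 1.0, 0.1
dt, setpoint = 0.05, 1.0

x = 0.0                       # plant state: first-order lag, x' = -x + u
integral = 0.0
prev_e = setpoint - x
for _ in range(400):          # simulate 20 seconds
    e = setpoint - x
    integral += e * dt        # I term accumulates the error
    deriv = (e - prev_e) / dt # D term: finite-difference of the error
    u = kp * e + ki * integral + kd * deriv
    prev_e = e
    x += dt * (-x + u)        # forward-Euler plant update

print(x)  # settles near the setpoint of 1.0
```

The integral term is what removes the steady-state offset that a proportional-only controller would leave.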

William S. Levine
MPEG Reconfigurable Video Coding

Traditional efforts in standardizing video coding used to involve a lengthy process that resulted in large monolithic standards and reference codes. This approach has become increasingly ill-suited to the dynamics and the fast-changing needs of the video coding community. Most importantly, there used to be no principled approach to leveraging the significant commonalities between the different codecs, neither at the level of the specification nor at the level of the implementation. The result is a long interval between the time a new idea is validated and the time it is implemented in consumer products as part of a worldwide standard. The analysis of this problem was the starting point of a new standard initiative within the ISO/IEC MPEG committee, called Reconfigurable Video Coding (RVC). The main idea is to develop a video coding standard that overcomes many shortcomings of the current standardization and specification process by updating and progressively incrementing a modular library of components. As the name implies, flexibility and reconfigurability are attractive new features of the RVC standard. The RVC framework is based on the usage of a new actor/dataflow oriented language called Cal for the specification of the standard library and the instantiation of the RVC decoder model. Cal dataflow models expose the intrinsic concurrency of the algorithms by employing the notions of actor programming and dataflow. This chapter gives an overview of the concepts and technologies building the standard RVC framework and the non-standard tools supporting the RVC model, from the instantiation and simulation of the Cal model to software and/or hardware code synthesis.

Marco Mattavelli, Mickaël Raulet, Jörn W. Janneck
Signal Processing for High-Speed Links

The steady growth in demand for bandwidth has resulted in data rates in the tens of Gb/s in back-plane and optical channels. Such high-speed links suffer from impairments such as dispersion, noise, and nonlinearities. Due to the difficulty of implementing multi-Gb/s transmitters and receivers in silicon, high-speed links were conventionally implemented primarily with analog circuits employing minimal signal processing. However, the relentless scaling of feature sizes exemplified by Moore’s Law has enabled the application of sophisticated signal processing techniques to both back-plane and optical links employing mixed analog and digital architectures and circuits. As a result, over the last decade, signal processing has emerged as a key technology in the advancement of low-cost, high data-rate, back-plane and optical communication systems. In this chapter, we provide an overview of some of the driving factors that limit the performance of high-speed links, and highlight some of the potential opportunities for the signal processing and circuits community to make substantial contributions to the modeling, design, and implementation of these systems.

Naresh Shanbhag, Andrew Singer, Hyeon-Min Bae
Medical Image Processing

Medical image processing, a specialization of classical image processing, focuses on the reconstruction, processing, and visualization of medical images. The field has gained particular prominence as medical imaging devices have emerged as a vast and fast-growing source of image data. This chapter introduces the reader to the common medical image acquisition techniques and, using two case studies, to the imaging pipeline observed in many medical applications. Each of the stages of the pipeline (reconstruction, preprocessing, segmentation, registration, and visualization) is presented in greater detail. Medical images continue to trend toward higher resolution and higher dimensions. Together with a persistent need for speed for clinical efficiency, this trend has created new computational challenges for practical implementations of many medical image processing algorithms. The chapter concludes with a discussion of computational needs and a brief survey of current solutions.

Raj Shekhar, Vivek Walimbe, William Plishker
Low-Power Wireless Sensor Network Platforms

A wireless sensor network (WSN) is a technology comprising up to thousands of autonomous, self-organizing nodes that combine environmental sensing, data processing, and wireless multihop ad hoc networking. The features of WSNs enable monitoring, object tracking, and control functionality. Potential applications include environmental and condition monitoring, home automation, security and alarm systems, industrial monitoring and control, military reconnaissance and targeting, and interactive games. This chapter describes low-power WSNs as a platform for signal processing by presenting the WSN services that can be used as building blocks for applications. It explains the implications of resource constraints and the expected performance in terms of throughput, reliability, and latency.

Jukka Suhonen, Mikko Kohvakka, Ville Kaseva, Timo D. Hämäläinen, Marko Hännikäinen
Signal Processing Tools for Radio Astronomy

Radio astronomy is known for its very large telescope dishes, but is currently making a transition towards the use of large numbers of small elements. For example, the Low Frequency Array, commissioned in 2010, uses about 50 stations, each consisting of at least 96 low-band antennas and 768 high-band antennas. For the Square Kilometre Array, planned for 2024, the numbers will be even larger. These instruments pose interesting array signal processing challenges. To present some aspects, we start by describing how the measured correlation data is traditionally converted into an image, and translate this into an array signal processing framework. This paves the way for a number of alternative image reconstruction techniques, such as a Weighted Least Squares approach. Self-calibration of the instrument is required to handle instrumental effects such as the unknown, possibly direction-dependent, response of the receiving elements, as well as unknown propagation conditions through the Earth’s troposphere and ionosphere. Array signal processing techniques seem well suited to handle these challenges. The fact that the noise power at each antenna element may differ motivates the use of Factor Analysis as a more appropriate alternative to the eigenvalue decomposition that is commonly used in array processing. Factor Analysis also proves to be very useful for interference mitigation. Interestingly, image reconstruction, calibration, and interference mitigation are often intertwined in radio astronomy, turning this into an area with very challenging signal processing problems.

Alle-Jan van der Veen, Stefan J. Wijnholds
Distributed Smart Cameras and Distributed Computer Vision

Distributed smart cameras are multiple-camera systems that perform computer vision tasks using distributed algorithms. Distributed algorithms scale better to large networks of cameras than do centralized algorithms. However, new approaches are required to many computer vision tasks in order to create efficient distributed algorithms. This chapter motivates the need for distributed computer vision, surveys background material in traditional computer vision, and describes several distributed computer vision algorithms for calibration, tracking, and gesture recognition.

Marilyn Wolf, Jason Schlessman

Architectures

Frontmatter
Architectures for Stereo Vision

Stereo vision is an elementary problem for many computer vision tasks. It has been widely studied under the two aspects of increasing the quality of the results and accelerating the computational processes. This chapter provides theoretic background on stereo vision systems and discusses architectures and implementations for real-time applications. In particular, the computationally most intensive part, stereo matching, is discussed using the example of one of the leading algorithms, semi-global matching (SGM). For this algorithm, two implementations are presented in detail on two of the most relevant platforms for real-time image processing today: Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs). Thus, the major differences in designing parallelization techniques for extremely different image processing platforms are illustrated.
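To see what stereo matching computes, consider a much simpler local baseline than SGM: per-pixel winner-takes-all matching with a sum-of-absolute-differences (SAD) cost over a small window. The synthetic one-row "images" below are an assumption for illustration; SGM adds smoothness penalties aggregated along paths on top of exactly this kind of matching cost.

```python
import random

random.seed(1)
TRUE_D, W, MAX_D = 3, 2, 8   # true disparity, window half-size, search range

# Synthetic 1-D scanlines: the right view is the left view shifted by TRUE_D.
left = [random.randrange(256) for _ in range(64)]
right = [left[i + TRUE_D] for i in range(64 - TRUE_D)]

def sad(x, d):
    """SAD matching cost at left pixel x for candidate disparity d."""
    return sum(abs(left[x + k] - right[x - d + k]) for k in range(-W, W + 1))

# Winner-takes-all: pick the disparity with the lowest cost at each pixel.
disp = [min(range(MAX_D + 1), key=lambda d: sad(x, d))
        for x in range(MAX_D + W, len(right) - W)]
print(disp[:8])  # the true shift is recovered at every tested pixel
```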

Christian Banz, Holger Blume, Peter Pirsch
Multicore Systems on Chip

This chapter discusses multicore architectures for DSP applications. We briefly explain the main challenges involved in future processor designs, justifying the need for thread-level parallelism exploration, since instruction-level parallelism is becoming increasingly difficult and unfeasible to exploit given a limited power budget. We discuss, based on an analytical model, the performance and energy tradeoffs of using multiprocessor architectures over high-end single-processor designs. The analytical model is then applied to a traditional DSP application, illustrating the need for both instruction- and thread-level parallelism exploration in this application domain. Some successful MPSoC designs are presented and discussed, indicating the different trends in embedded and general-purpose processor market designs. Finally, we provide a thorough analysis of open hardware and software problems, such as interconnection mechanisms and programming models.

Luigi Carro, Mateus Beck Rutzig
Coarse-Grained Reconfigurable Array Architectures

Coarse-Grained Reconfigurable Array (CGRA) architectures accelerate the same inner loops that benefit from the high ILP support in VLIW architectures. Unlike VLIWs, CGRAs are designed to execute only the loops, which they can hence do more efficiently. This chapter discusses the basic principles of CGRAs and the wide range of design options available to a CGRA designer, covering a large number of existing CGRA designs. The impact of different options on flexibility, performance, and power-efficiency is discussed, as well as the need for compiler support. The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support and for the manual fine-tuning of source code.

Bjorn De Sutter, Praveen Raghavan, Andy Lambrechts
Arithmetic

In this chapter, fundamentals of arithmetic operations and number representations used in DSP systems are discussed. Different relevant number systems are outlined, with a focus on fixed-point representations. Structures for accelerating the carry propagation of addition are discussed, as well as multi-operand addition. For multiplication, different schemes for generating and accumulating partial products are presented. In addition, optimization for constant-coefficient multiplication is discussed. Division and square-rooting are also briefly outlined. Furthermore, floating-point arithmetic and the IEEE 754 floating-point arithmetic standard are presented. Finally, some methods for computing elementary functions, e.g., trigonometric functions, are presented.
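The fixed-point representations emphasized above can be sketched concretely. The following shows a Q15 format (1 sign bit, 15 fractional bits, common in 16-bit DSPs): the full-precision product of two Q15 numbers has 30 fractional bits, so it must be rounded and shifted back to Q15.

```python
Q = 15  # Q15 fixed-point: value = integer / 2**15

def to_fix(x):
    """Convert a float to a Q15 integer."""
    return int(round(x * (1 << Q)))

def to_float(x):
    """Convert a Q15 integer back to a float."""
    return x / (1 << Q)

def qmul(a, b):
    # The raw product has 2*Q fractional bits; add half an LSB to round,
    # then shift back down to Q fractional bits.
    return (a * b + (1 << (Q - 1))) >> Q

a, b = to_fix(0.5), to_fix(-0.25)
p = qmul(a, b)
print(to_float(p))  # -0.125
```

The rounding constant is what a hardware multiply-accumulate path typically injects before truncating; omitting it biases results toward negative infinity.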

Oscar Gustafsson, Lars Wanhammar
Architectures for Particle Filtering

There are many applications in which particle filters outperform traditional signal processing algorithms. Some of these applications include tracking, joint detection and estimation in wireless communication, and computer vision. However, particle filters are not used in practice for these applications mainly because they cannot satisfy real-time requirements. This chapter discusses several important issues in designing an efficient resampling architecture for high-throughput parallel particle filtering. The resampling algorithm is developed to compensate for possible errors caused by finite-precision quantization in the resampling step. Communication between the processing elements after resampling is identified as an implementation bottleneck, and therefore concurrent buffering is incorporated to speed up the communication of particles among processing elements. The mechanism utilizes a particle-tagging scheme during quantization to compensate for possible loss of replicated particles due to the finite-precision effect. Particle tagging divides replicated particles into two groups for systematic redistribution of particles, eliminating particle localization in parallel processing. The mechanism utilizes an efficient interconnect topology to guarantee complete redistribution of particles even in the case of potential weight imbalance among processing elements. The architecture supports high throughput and ensures that the overall parallel particle filtering execution time scales with the number of processing elements employed.
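The resampling step at the heart of this discussion can be sketched in software before considering hardware architectures. Below is a minimal systematic resampler (one common choice among several resampling schemes): a single uniform offset generates N evenly spaced pointers into the cumulative weight distribution, so high-weight particles are replicated and low-weight ones dropped.

```python
import random

def systematic_resample(weights):
    """Return indices of the particles selected by systematic resampling."""
    n = len(weights)
    # N evenly spaced positions in [0, 1), sharing one random offset.
    positions = [(random.random() + i) / n for i in range(n)]
    cumsum, c = [], 0.0
    for w in weights:
        c += w
        cumsum.append(c)
    indices, j = [], 0
    for p in positions:          # walk both sequences once: O(n) total
        while cumsum[j] < p:
            j += 1
        indices.append(j)
    return indices

random.seed(42)
w = [0.1, 0.6, 0.1, 0.2]         # normalized particle weights
idx = systematic_resample(w)
print(idx)  # the high-weight particle (index 1) is replicated most often
```

Systematic resampling guarantees each particle is copied either floor(n*w) or ceil(n*w) times, which is part of why it is attractive for fixed-precision hardware.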

Sangjin Hong, Seong-Jun Oh
Application Specific Instruction Set DSP Processors

In this chapter, application-specific instruction set processors (ASIPs) for DSP applications are introduced and discussed for readers who want general information about ASIP technology. The introduction covers the ASIP design flow, source code profiling, architecture exploration, assembly instruction set design, design of the assembly-language programming toolchain, firmware design, benchmarking, and microarchitecture design. Special challenges of designing multicore ASIPs are discussed. Two examples are introduced: a design for instruction-set-level acceleration of radio baseband processing, and a design for instruction-set-level acceleration of image and video signal processing.

Dake Liu, Jian Wang
FPGA-Based DSP

Field Programmable Gate Arrays (FPGAs) offer an excellent platform for embedded DSP systems when real-time processing beyond what multiprocessor platforms can achieve is required, and volumes are too small to justify the cost of developing a custom chip. This niche role is due to the ability of FPGAs to host custom computing architectures tailored to the application. Modern FPGAs host large quantities of heterogeneous logic, computational, and memory components which can only be effectively exploited by heterogeneous processing architectures composed of microprocessors with custom co-processors, parallel software processors, and dedicated hardware units. The complexity of these architectures, coupled with the need to regenerate the implementation for each new application, makes FPGA system design a highly complex and unique design problem. The key to success in this process is the ability of the designer to best exploit the FPGA resources in a custom architecture, and the ability of design tools to quickly and efficiently generate these architectures. This chapter describes the state of the art in FPGA device resources, computing architectures, and the design tools which support the DSP system design process.

John McAllister
Application-Specific Accelerators for Communications

For computation-intensive digital signal processing algorithms, complexity is exceeding the processing capabilities of general-purpose digital signal processors (DSPs). In some of these applications, DSP hardware accelerators have been widely used to off-load a variety of algorithms from the main DSP host, including FFTs, FIR/IIR filters, multiple-input multiple-output (MIMO) detectors, and error-correction code (Viterbi, Turbo, LDPC) decoders. Given power and cost considerations, simply implementing these computationally complex parallel algorithms on a high-speed general-purpose DSP processor is not very efficient. However, not all DSP algorithms are appropriate for off-loading to a hardware accelerator. First, such algorithms should have data-parallel computations and repeated operations that are amenable to hardware implementation. Second, they should have a deterministic dataflow graph that maps to parallel datapaths. The accelerators we consider are mostly coarse-grained, to better handle streaming data transfer and achieve both high performance and low power. In this chapter, we focus on some of the basic and advanced digital signal processing algorithms for communications and cover major examples of DSP accelerators for communications.
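The FIR filter named above is a canonical example of an accelerator-friendly kernel: a regular, data-parallel multiply-accumulate loop with a fixed dataflow graph. A direct-form reference in plain Python (what the accelerator's fixed datapath would unroll in hardware) looks like this:

```python
def fir(x, h):
    """Direct-form FIR filter: y[n] = sum over k of h[k] * x[n - k]."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, hk in enumerate(h):   # one multiply-accumulate per tap
            if n - k >= 0:           # skip samples before the input starts
                acc += hk * x[n - k]
        y.append(acc)
    return y

# A 2-tap moving average smooths a step input.
print(fir([0, 0, 1, 1, 1], [0.5, 0.5]))  # [0.0, 0.0, 0.5, 1.0, 1.0]
```

The inner loop's independent multiplies and single accumulation chain are exactly what maps onto the parallel datapaths the chapter describes.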

Yang Sun, Kiarash Amiri, Michael Brogioli, Joseph R. Cavallaro
General-Purpose DSP Processors

Recently, the border between DSP processors and general-purpose processors has been diminishing, as general-purpose processors have gained DSP features to support various multimedia applications. This chapter provides a view of general-purpose DSP processors by considering the characteristics of DSP algorithms and identifying the features of a processor architecture that are important for efficient DSP algorithm implementations. Fixed-point and floating-point data paths are discussed. Memory architectures are considered from a parallel-access point of view, and address computations are briefly discussed.

Jarmo Takala
Mixed Signal Techniques

Mixed signal circuits include both analog and digital functions. Mixed signal techniques that are commonly encountered in signal processing systems are discussed in this chapter. First, the principles and general properties of sampling and analog to digital conversion are presented. The structure and operating principle of several widely used analog to digital converter architectures are then described, including both converters operating at the Nyquist rate and oversampled converters based on sigma–delta modulators. Next, different types of digital to analog converters are discussed. The basic features and building blocks of switched-capacitor circuits are then shown. Finally, mixed-signal techniques for frequency synthesis and clock synchronization are explained.
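The analog-to-digital conversion principles introduced above can be checked numerically: an ideal B-bit uniform quantizer should yield a signal-to-noise ratio near 6.02*B + 1.76 dB for a near-full-scale sine input. The sketch below models only the ideal quantizer (no real converter non-idealities), with an arbitrary test frequency.

```python
import math

def quantize(x, bits, vref=1.0):
    """Ideal mid-rise uniform ADC + DAC for x in [-vref, vref)."""
    levels = 1 << bits
    step = 2 * vref / levels
    code = max(0, min(levels - 1, int((x + vref) / step)))  # ADC code
    return (code + 0.5) * step - vref                       # reconstructed value

N, BITS = 4096, 8
# Near-full-scale sine at an arbitrary non-harmonic test frequency.
sig = [0.99 * math.sin(2 * math.pi * 0.1234 * n) for n in range(N)]
err = [quantize(s, BITS) - s for s in sig]
snr = 10 * math.log10(sum(s * s for s in sig) / sum(e * e for e in err))
print(round(snr, 1))  # close to 6.02 * 8 + 1.76 = 49.9 dB
```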

Olli Vainio
DSP Systems Using Three-Dimensional Integration Technology

As three-dimensional (3D) integration technology matures and starts to enter mainstream markets, it has attracted exploding interest from the integrated circuit and system research community. This chapter discusses and demonstrates the exciting opportunities and potential for digital signal processing (DSP) circuit and system designers to exploit 3D integration technology. In particular, this chapter advocates a 3D logic-DRAM integration design paradigm and discusses the use of 3D logic-memory integration in both programmable digital signal processors and application-specific digital signal processing circuits. To further demonstrate the potential, this chapter presents case studies on applying 3D logic-DRAM integration to clustered VLIW (very long instruction word) digital signal processors and application-specific video encoders. Since DSP systems using 3D integration technology are still in their research infancy, by presenting some first discussions and results, this chapter aims to motivate greater future efforts from the DSP system research community to explore this new and rewarding research area.

Tong Zhang, Yangyang Pan, Yiran Li

Design Methods and Tools

Frontmatter
Methods and Tools for Mapping Process Networks onto Multi-Processor Systems-On-Chip

Applications based on the Kahn process network (KPN) model of computation are determinate and modular, and rely on FIFO channels for inter-process communication. Besides allowing KPN applications to execute efficiently on multi-processor systems-on-chip (MPSoCs), these properties also enable automation of the design process. This chapter focuses on the second aspect and gives an overview of methods for automating the design process of KPN applications implemented on MPSoCs. Whereas previous chapters mainly introduced techniques that apply to restricted classes of process networks, this overview deals with general Kahn process networks.

Iuliana Bacivarov, Wolfgang Haid, Kai Huang, Lothar Thiele
Dynamic Dataflow Graphs

Much of the work to date on dataflow models for signal processing system design has focused on decidable dataflow models that are best suited for one-dimensional signal processing. This chapter reviews more general dataflow modeling techniques that are targeted to applications that include multidimensional signal processing and dynamic dataflow behavior. As dataflow techniques are applied to signal processing systems that are more complex, and demand increasing degrees of agility and flexibility, these classes of more general dataflow models are of correspondingly increasing interest. We first provide a motivation for dynamic dataflow models of computation, and review a number of specific methods that have emerged in this class of models. Our coverage of dynamic dataflow models in this chapter includes Boolean dataflow, CAL, parameterized dataflow, enable-invoke dataflow, dynamic polyhedral process networks, scenario aware dataflow, and a stream-based function actor model.
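The actor and firing-rule notions common to the dataflow models listed above can be made concrete with a toy simulator. The `Actor` class below is a hypothetical minimal API, not any of the standardized formalisms: an actor fires when each input FIFO holds at least its consumption rate of tokens, consumes them, and produces tokens on its outputs.

```python
from collections import deque

class Actor:
    """Minimal dataflow actor: fixed consumption rates, FIFO channels."""
    def __init__(self, name, ins, outs, cons, fn):
        self.name, self.ins, self.outs = name, ins, outs
        self.cons, self.fn = cons, fn

    def can_fire(self):
        # Firing rule: every input queue holds enough tokens.
        return all(len(q) >= c for q, c in zip(self.ins, self.cons))

    def fire(self):
        # Consume tokens, apply the actor function, produce output tokens.
        args = [[q.popleft() for _ in range(c)]
                for q, c in zip(self.ins, self.cons)]
        for q, toks in zip(self.outs, self.fn(*args)):
            q.extend(toks)

src_q, out_q = deque([1, 2, 3, 4]), deque()
# A 2-to-1 "downsampler": consumes 2 tokens per firing, produces their sum.
down = Actor("down", [src_q], [out_q], [2], lambda xs: [[sum(xs)]])
while down.can_fire():
    down.fire()
print(list(out_q))  # [3, 7]
```

In the static (synchronous) dataflow case the rates are fixed as here, which is what makes schedules decidable; the dynamic models surveyed in the chapter let rates depend on data.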

Shuvra S. Bhattacharyya, Ed F. Deprettere, Bart D. Theelen
DSP Instruction Set Simulation

An instruction set simulator is an important tool for system architects and software developers. However, when implementing a simulator, many design choices arise that affect the speed and the accuracy of the simulation, and these choices are especially relevant to DSP simulation. This chapter explains the different strategies for implementing a simulator.

Florian Brandner, Nigel Horspool, Andreas Krall
Integrated Modeling Using Finite State Machines and Dataflow Graphs

In this chapter, different application modeling approaches based on the integration of finite state machines with dataflow models are reviewed. Restricted Models of Computation (MoCs) may be exploited in design methodologies to generate optimized hardware/software implementations from a given application model. A particular focus is put on the analyzability of these models with respect to schedulability and on the generation of efficient schedule implementations. To this end, clustering methods for model refinement and schedule optimization are of particular interest.

Joachim Falk, Christian Haubelt, Christian Zebelein, Jürgen Teich
C Compilers and Code Optimization for DSPs

Compilers play a central role in the software development tool chain for any processor and enable high-level programming. Hence, they increase programmer productivity and code portability while reducing time-to-market. The responsibilities of a C compiler go far beyond the translation of the source code into an executable binary and comprise additional code optimization for high performance and low memory footprint. However, traditional optimizations are typically oriented towards RISC architectures that differ significantly from most digital signal processors. In this chapter we provide an overview of the challenges faced by compilers for DSPs and outline some of the code optimization techniques specifically developed to address the architectural idiosyncrasies of the most prevalent digital signal processors on the market.

Björn Franke
Kahn Process Networks and a Reactive Extension

Kahn and MacQueen introduced a generic class of determinate asynchronous data-flow applications, called Kahn Process Networks (KPNs), with an elegant mathematical model and semantics in terms of Scott-continuous functions on data streams, together with an implementation model of independent asynchronous sequential programs communicating through FIFO buffers with blocking read and non-blocking write operations. The two are related by the Kahn Principle, which states that a realization according to the implementation model behaves as predicted by the mathematical function. Additional steps are required to arrive at an actual implementation of a KPN, to take care of scheduling of independent processes on a single processor and to manage communication buffers. Because of the expressiveness of the KPN model, buffer sizes and schedules cannot in general be determined at design time and require dynamic run-time system support. Constraints are discussed that need to be placed on such system support so as to maintain the Kahn Principle. We then discuss a possible extension of the KPN model to include the possibility of sporadic, reactive behavior, which is not possible in the standard model. The extended model is called Reactive Process Networks. We introduce its semantics, look at analyzability, and consider more constrained data-flow models combined with reactive behavior.
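The blocking-read/non-blocking-write FIFO semantics described above can be sketched in a few lines. This is a minimal illustration, not taken from the chapter; the process names are hypothetical, and Python's `queue.Queue` without a `maxsize` models an unbounded FIFO whose `put` never blocks.

```python
# A minimal Kahn process network sketch: two producer processes and an adder,
# communicating over unbounded FIFOs with blocking reads and non-blocking
# writes (queue.Queue without maxsize never blocks on put).
import threading
import queue

def counter(out_fifo, start, n):
    # Producer process: emits start, start+1, ... as a stream.
    for i in range(start, start + n):
        out_fifo.put(i)          # non-blocking write

def adder(in_a, in_b, out_fifo, n):
    # Determinate process: blocking reads fix the merge order,
    # so the output stream is independent of thread scheduling.
    for _ in range(n):
        a = in_a.get()           # blocking read
        b = in_b.get()           # blocking read
        out_fifo.put(a + b)

a, b, c = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=counter, args=(a, 0, 5)),
    threading.Thread(target=counter, args=(b, 10, 5)),
    threading.Thread(target=adder, args=(a, b, c, 5)),
]
for t in threads: t.start()
for t in threads: t.join()
result = [c.get() for _ in range(5)]
print(result)  # [10, 12, 14, 16, 18]
```

Whatever order the operating system schedules the three threads, the blocking reads guarantee the same output stream — this is the determinacy the Kahn Principle refers to.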

Marc Geilen, Twan Basten
Decidable Dataflow Models for Signal Processing: Synchronous Dataflow and Its Extensions

Digital signal processing algorithms can be naturally represented by a dataflow graph where nodes represent function blocks and arcs represent the data dependencies between nodes. Among various dataflow models, decidable dataflow models have restricted semantics so that we can determine the execution order of nodes at compile-time and decide whether the program has the possibility of buffer overflow or deadlock. In this chapter, we explain the synchronous dataflow (SDF) model, the pioneering and representative decidable dataflow model, and its decidability, focusing on how the static scheduling decision can be made. In addition, the cyclo-static dataflow model and a few other extended models are briefly introduced to show how they overcome the limitations of the SDF model.
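The compile-time scheduling decision rests on solving the SDF balance equations for a repetition vector. The following sketch (not from the chapter; the three-actor chain is a hypothetical example, and the consistency check for cyclic graphs is omitted) shows the core computation:

```python
# Sketch: deciding an SDF schedule by solving the balance equations.
# For each edge (src, dst, prod, cons) we require
#     rep[src] * prod == rep[dst] * cons.
from fractions import Fraction
from math import lcm

def repetition_vector(actors, edges):
    rep = {actors[0]: Fraction(1)}
    # Propagate rates over edges (assumes a connected, consistent graph).
    changed = True
    while changed:
        changed = False
        for src, dst, prod, cons in edges:
            if src in rep and dst not in rep:
                rep[dst] = rep[src] * prod / cons
                changed = True
            elif dst in rep and src not in rep:
                rep[src] = rep[dst] * cons / prod
                changed = True
    # Scale to the smallest positive integer solution.
    scale = lcm(*(r.denominator for r in rep.values()))
    return {a: int(r * scale) for a, r in rep.items()}

# Hypothetical chain: A produces 2 per firing, B consumes 3; B produces 1, C consumes 2.
edges = [("A", "B", 2, 3), ("B", "C", 1, 2)]
print(repetition_vector(["A", "B", "C"], edges))  # {'A': 3, 'B': 2, 'C': 1}
```

Firing A three times, B twice, and C once returns every buffer to its initial state, which is exactly what makes a finite periodic schedule decidable at compile time.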

Soonhoi Ha, Hyunok Oh
Systolic Arrays

This chapter reviews the basic ideas of systolic arrays, their design methodologies, and the historical development of various hardware implementations. Two modern applications, namely motion estimation for video coding and wireless communication baseband processing, are also discussed.

Yu Hen Hu, Sun-Yuan Kung
Multidimensional Dataflow Graphs

In many signal processing applications, the tokens in a stream have a dimension higher than one. For example, the tokens in a video stream represent images, so a video application is actually three- or four-dimensional: two dimensions describe the pixel coordinates, one dimension indexes the different color components, and time corresponds to the last dimension. Static multidimensional (MD) streaming applications can be modeled using one-dimensional dataflow graphs [7], but these are at best cyclo-static dataflow graphs, often with many phases in the actors' vector-valued token production and consumption patterns. These models incur a high control overhead. Furthermore, such a notation hides many important algorithm properties such as inherent data parallelism, fine-grained data dependencies, and thus required memory sizes. Finally, the model is very implementation-specific in that some degrees of freedom, such as the processing order, are already nailed down and cannot be changed easily without completely recreating the model.

Joachim Keinert, Ed F. Deprettere
Compiling for VLIW DSPs

This chapter describes fundamental compiler techniques for VLIW DSP processors. We begin with a review of VLIW DSP architecture concepts, as far as relevant for the compiler writer. As a case study, we consider the TI TMS320C62x™ clustered VLIW DSP processor family. We survey the main tasks of VLIW DSP code generation; discuss instruction selection, cluster assignment, instruction scheduling, and register allocation in greater detail; and present selected techniques for these, both heuristic and optimal ones. Some emphasis is put on phase ordering problems and on phase-coupled and integrated code generation techniques.

Christoph W. Kessler
Software Compilation Techniques for MPSoCs

The increasing demands on future embedded systems, such as high performance and energy efficiency, have led to the emergence of heterogeneous Multiprocessor System-on-Chip (MPSoC) architectures. To fully harness the power of these architectures, new tools are needed to manage the growing complexity of the software and to achieve high productivity. An MPSoC compiler is the tool-chain that tackles the problems of expressing parallelism in application modeling/programming, mapping/scheduling, and generating software for efficient execution on a given (pre-)verified MPSoC platform. This chapter discusses the various aspects of MPSoC compilers for heterogeneous MPSoC architectures, using a comparison with the well-established uni-processor C compiler technology. After a brief introduction to MPSoCs and MPSoC compilers, the important ingredients of the compilation process, such as programming models, granularity and partitioning, platform description, mapping/scheduling, and code generation, are explained in detail. As the topic is relatively young, a number of case studies from academia and industry are selected to illustrate the concepts at the end of this chapter.

Rainer Leupers, Weihua Sheng, Jeronimo Castrillon
Embedded C for Digital Signal Processing

The majority of microprocessors in the world do not sit inside a desktop or laptop computer as general-purpose processors, but serve a dedicated purpose inside some kind of apparatus, such as a mobile telephone, modem, washing machine, cruise missile, hard disk, or DVD player. Such processors are called embedded processors. They are designed with their application in mind and therefore carry special features. With the high volume and strict real-time requirements of mobile communication, the digital signal processor (DSP) emerged. These embedded processors featured special hardware and instructions to support efficient processing of communication signals. Traditionally these special features were programmed in assembly language, but with the growing volume of devices and software, a desire arose to access them from a standardized programming language. A work group of the International Organization for Standardization (ISO) recognized this desire and produced an extension of the C standard to support these features. This chapter explains this extension and illustrates how to use it to program a DSP efficiently.

Bryan E. Olivier
Signal Flow Graphs and Data Flow Graphs

This chapter first introduces two types of graphical representations of digital signal processing algorithms: the signal flow graph (SFG) and the data flow graph (DFG). Since SFGs and DFGs are generally used for analyzing structural properties and exploring architectural alternatives through high-level transformations, such transformations, including retiming, pipelining, unfolding, and folding, are then addressed. Finally, their real-world applications to both hardware and software design are presented.
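Of the transformations mentioned, unfolding has a particularly compact formulation: unfolding a DFG by a factor J replaces each edge U→V carrying w delays with J edges U_i → V_((i+w) mod J) carrying ⌊(i+w)/J⌋ delays. A small sketch (the two-node loop is a hypothetical example, not from the chapter):

```python
# Sketch of the unfolding transformation for a data flow graph: an edge
# U -> V with w delays unfolds, by factor J, into J edges
# U_i -> V_((i+w) mod J), each carrying (i+w) // J delays.
def unfold(edges, J):
    # edges: list of (src, dst, delays); returns the unfolded edge list.
    unfolded = []
    for u, v, w in edges:
        for i in range(J):
            unfolded.append((f"{u}{i}", f"{v}{(i + w) % J}", (i + w) // J))
    return unfolded

# Hypothetical two-node loop A ->(2 delays)-> B ->(0 delays)-> A, J = 2:
for e in unfold([("A", "B", 2), ("B", "A", 0)], 2):
    print(e)
# ('A0', 'B0', 1), ('A1', 'B1', 1), ('B0', 'A0', 0), ('B1', 'A1', 0)
```

Note that the total delay count on each original edge is preserved across its J unfolded copies, which is why unfolding leaves the iteration bound of the graph unchanged while exposing inter-iteration parallelism.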

Keshab K. Parhi, Yanni Chen
Optimization of Number Representations

In this chapter, automatic scaling and word-length optimization procedures for efficient implementation of signal processing systems are explained. For this purpose, a fixed-point data format that contains both integer and fractional parts is introduced, and used for systematic and incremental conversion of floating-point algorithms into fixed-point or integer versions. A simulation-based range estimation method is explained, and applied to automatic scaling of C-language-based digital signal processing programs. A fixed-point optimization method is also discussed, and optimization examples including a recursive filter and an adaptive filter are shown.
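The simulation-based approach can be sketched as follows: run the floating-point algorithm on representative input, record each signal's observed range, and derive the number of integer bits needed to avoid overflow. This is an illustrative sketch under assumed names and a 16-bit word length, not the chapter's actual procedure:

```python
# Sketch of simulation-based range estimation for automatic scaling.
import math
import random

def estimate_iwl(samples):
    # Smallest integer word length (IWL) such that max|x| < 2**IWL.
    peak = max(abs(x) for x in samples)
    return max(0, math.ceil(math.log2(peak))) if peak > 0 else 0

def quantize(x, iwl, wordlength=16):
    # Fixed-point format: 1 sign bit, iwl integer bits, rest fractional.
    fwl = wordlength - 1 - iwl
    return round(x * 2**fwl) / 2**fwl

random.seed(0)
signal = [random.uniform(-3.2, 3.2) for _ in range(1000)]
iwl = estimate_iwl(signal)                 # 2 integer bits cover |x| < 4
fixed = [quantize(x, iwl) for x in signal]
err = max(abs(a - b) for a, b in zip(signal, fixed))
print(iwl, err)  # rounding error bounded by 2**-(fwl+1) = 2**-14
```

Word-length optimization then trades such quantization error against hardware cost, shrinking `wordlength` per signal as long as a system-level quality metric stays acceptable.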

Wonyong Sung
Polyhedral Process Networks

Reference implementations of signal processing applications are often written in a sequential language that does not reveal the available parallelism in the application. However, if an application satisfies some constraints, then a parallel specification can be derived automatically. In particular, if the application can be represented in the polyhedral model, then a polyhedral process network can be constructed from the application. After introducing the required polyhedral tools, this chapter details the construction of the processes and the communication channels in such a network. Special attention is given to various properties of the communication channels, including their buffer sizes.

Sven Verdoolaege
Mapping Decidable Signal Processing Graphs into FPGA Implementations

Field programmable gate arrays (FPGAs) are examples of complex programmable system-on-chip (PSoC) platforms and comprise dedicated DSP hardware resources and distributed memory. They are ideal platforms for implementing computationally complex DSP systems for image processing, radar, sonar, and other signal processing applications. The chapter describes how decidable signal processing graphs are mapped onto such platforms and shows how parallelism and pipelining can be controlled to achieve the required speed using minimal hardware resources, and how these techniques are used to build efficient FPGA implementations. The process is demonstrated for a number of DSP circuits, including a finite impulse response (FIR) filter, a lattice filter, and a more complex adaptive signal processing design, namely a least mean squares (LMS) filter.
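For reference, the LMS filter mentioned above has a very small algorithmic core: a tap delay line, an output computed as a weighted sum, and a weight update driven by the output error. The following behavioral sketch (Python, purely illustrative; the 2-tap target system is a hypothetical example, not the chapter's design) shows the computation that an FPGA implementation pipelines and parallelizes:

```python
# Behavioral sketch of a least mean squares (LMS) adaptive filter.
def lms(x, d, ntaps, mu):
    w = [0.0] * ntaps
    for n in range(len(x)):
        # Tap delay line: x[n], x[n-1], ...
        taps = [x[n - k] if n - k >= 0 else 0.0 for k in range(ntaps)]
        y = sum(wk * xk for wk, xk in zip(w, taps))   # filter output
        e = d[n] - y                                  # error signal
        # Weight update: w <- w + mu * e * x
        w = [wk + mu * e * xk for wk, xk in zip(w, taps)]
    return w

# Identify a hypothetical 2-tap system h = [0.5, -0.25] from its output.
import random
random.seed(1)
x = [random.uniform(-1, 1) for _ in range(2000)]
d = [0.5 * x[n] - 0.25 * (x[n - 1] if n else 0.0) for n in range(len(x))]
w = lms(x, d, ntaps=2, mu=0.1)
print([round(wk, 3) for wk in w])  # converges to ~[0.5, -0.25]
```

The recursive dependence of the weight update on the error computed in the same iteration is precisely what makes pipelining an LMS filter on an FPGA non-trivial, motivating the transformation techniques the chapter applies.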

Roger Woods
Metadata
Title
Handbook of Signal Processing Systems
Editors
Shuvra S. Bhattacharyya
Ed F. Deprettere
Rainer Leupers
Jarmo Takala
Copyright Year
2013
Publisher
Springer New York
Electronic ISBN
978-1-4614-6859-2
Print ISBN
978-1-4614-6858-5
DOI
https://doi.org/10.1007/978-1-4614-6859-2