Introduction to Hardware/Software Codesign

Frontmatter

1. Introduction to Hardware/Software Codesign

Hardware/Software Codesign (HSCD) is an integral part of modern Electronic System Level (ESL) design flows. This chapter will review important aspects of hardware/software codesign flows, summarize the historical evolution of codesign techniques, and subsequently summarize each of its major branches of research and achievements that later will be presented in detail by different parts of this Handbook of Hardware/Software Codesign.

Soonhoi Ha, Jürgen Teich, Christian Haubelt, Michael Glaß, Tulika Mitra, Rainer Dömer, Petru Eles, Aviral Shrivastava, Andreas Gerstlauer, Shuvra S. Bhattacharyya

Models and Languages for Codesign

Frontmatter

2. Quartz: A Synchronous Language for Model-Based Design of Reactive Embedded Systems

Since the synchronous model of computation is shared between synchronous languages and synchronous hardware circuits, synchronous languages lend themselves well to hardware/software codesign in the sense that both hardware and software can be generated from the same synchronous program. In this chapter, we informally describe the syntax and semantics of the imperative synchronous language Quartz and explain how these programs are first analyzed and then compiled to hardware and software. To this end, the programs are translated to synchronous guarded actions whose causality has to be ensured as a major consistency analysis of the compiler. We then explain the synthesis of hardware circuits and sequential programs from synchronous guarded actions and briefly look at extensions of the Quartz language in the conclusions.

Klaus Schneider, Jens Brandt

3. SysteMoC: A Data-Flow Programming Language for Codesign

Computations in hardware/software systems are inherently performed concurrently. Hence, modeling hardware/software systems requires notions of concurrency. Data-flow models have been and are still successfully applied in the modeling of hardware/software systems. In this chapter, we motivate and introduce the usage of data-flow models. Moreover, we discuss the expressiveness and analyzability of different data-flow Models of Computation (MoCs). Subsequently, we present SysteMoC, an approach supporting many data-flow MoCs based on the system description language SystemC. Besides specifying data-flow models, SysteMoC also permits the automatic classification of each part of an application modeled in SysteMoC into the least expressive but most analyzable MoC. This classification is the key to further optimization in later design stages of hardware/software systems such as exploration of design alternatives as well as automatic code generation and hardware synthesis. Such optimization and refinement steps are employed as part of the SystemCoDesigner design flow that uses SysteMoC as its input language.

Joachim Falk, Christian Haubelt, Jürgen Teich, Christian Zebelein

4. ForSyDe: System Design Using a Functional Language and Models of Computation

The ForSyDe methodology aims to push system design to a higher level of abstraction by combining the functional programming paradigm with the theory of Models of Computation (MoCs). A key concept of ForSyDe is the use of higher-order functions as process constructors to create processes. This leads to well-defined and well-structured ForSyDe models and gives a solid base for formal analysis. The book chapter introduces the basic concepts of the ForSyDe modeling framework and presents libraries for several MoCs and MoC interfaces for the modeling of heterogeneous systems, including support for the modeling of run-time reconfigurable processes. The formal nature of ForSyDe enables transformational design refinement using both semantic-preserving and nonsemantic-preserving design transformations. The chapter also introduces a general synthesis concept based on process constructors, which is exemplified by means of a hardware synthesis tool for synchronous ForSyDe models. Most examples in the chapter are modeled with the Haskell version of ForSyDe. However, to illustrate that ForSyDe is language-independent, the chapter also contains a short overview of SystemC-ForSyDe.

Ingo Sander, Axel Jantsch, Seyed-Hosein Attarzadeh-Niaki

5. Modeling Hardware/Software Embedded Systems with UML/MARTE: A Single-Source Design Approach

Model-based design has proven to be a powerful approach for embedded software systems. The Unified Modeling Language (UML) provides a standard, graphically based formalism for capturing system models. The standard Modeling and Analysis of Real-Time Embedded Systems (MARTE) profile provides the syntactical and semantical extensions required for the modeling and HW/SW codesign of real-time and embedded systems. However, the UML/MARTE standard alone is not sufficient. In addition, a modeling methodology is required that states how to build a model capable of supporting the analysis and HW/SW codesign activities of complex embedded systems. This chapter presents a UML/MARTE modeling methodology capable of addressing such analysis and design activities. A distinguishing aspect of the modeling methodology is that it supports a single-source design approach.

Fernando Herrera, Julio Medina, Eugenio Villar

Design Space Exploration

Frontmatter

6. Optimization Strategies in Design Space Exploration

This chapter presents guidelines to choose an appropriate exploration algorithm, based on the properties of the design space under consideration. The chapter describes and compares a selection of well-established multi-objective exploration algorithms for high-level design that appeared in recent scientific literature. These include heuristic, evolutionary, and statistical methods. The algorithms are divided into four sub-classes and compared by means of several metrics: their setup effort, convergence rate, scalability, and performance of the optimization. The common goal of these algorithms is the optimization of a multi-processor platform running a set of diverse software benchmark applications. Results show how the metrics can be related to the properties of a target design space (size, number of variables, and variable ranges) with a focus on accuracy, precision, and performance.

Jacopo Panerati, Donatella Sciuto, Giovanni Beltrame
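
The multi-objective comparison described in the abstract of Chapter 6 rests on Pareto dominance between design points. The following is a minimal sketch of that idea, not one of the algorithms compared in the chapter. It assumes that all objectives are minimized; the function names and the (latency, power) values are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]  # one value per objective, all to be minimized (assumption)


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates (quadratic-time filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (latency, power) pairs for four candidate platform configurations.
designs = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (9.0, 8.0)]
front = pareto_front(designs)
print(front)
```

Here (9.0, 8.0) is dropped because (8.0, 7.0) is better in both objectives; the remaining three points are mutually non-dominated trade-offs.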

7. Hybrid Optimization Techniques for System-Level Design Space Exploration

Embedded system design requires solving synthesis steps that consist of resource allocation, task binding, data routing, and scheduling. These synthesis steps typically occur several times throughout the entire design cycle and necessitate similar concepts even at different levels of abstraction. In order to cope with the large design space, fully automatic Design Space Exploration (DSE) techniques might be applied. In practice, the high complexity of these synthesis steps requires efficient approaches that also perform well in the presence of stringent design constraints. Those constraints may render vast areas of the search space infeasible, leaving only a small fraction of feasible implementations that are sparsely distributed. This is a serious problem for the metaheuristics that are popular for DSE of electronic hardware/software systems, since they are faced with large areas of infeasible implementations where no gradual improvement is possible. In this chapter, we present an approach that combines metaheuristic optimization with search algorithms to solve the problem of Hardware/Software Codesign (HSCD) including allocation, binding, and scheduling. This hybrid optimization uses powerful search algorithms to determine feasible implementations. This avoids an exploration of infeasible areas and, thus, enables a gradual improvement as required for efficient metaheuristic optimization. Two methods are presented that can be applied to problems with linear as well as non-linear constraints, the latter being particularly intended to address aspects such as timeliness or reliability which cannot be approximated by linear constraints in a sound fashion. The chapter concludes with several examples of the successful use of the introduced techniques in different application domains.

Michael Glaß, Jürgen Teich, Martin Lukasiewycz, Felix Reimann

8. Architecture and Cross-Layer Design Space Exploration

The task of architectural Design Space Exploration (DSE) is extremely complex, with multiple architectural parameters to be tuned and optimized, resulting in a huge design space that needs to be explored efficiently. Furthermore, each architectural parameter and/or design point is critically affected by decisions made at lower levels of abstraction (e.g., layout, choice of transistors, etc.). Ideally, designers would like to perform DSE incorporating information and decisions made across multiple layers of design abstraction so that the ensuing design space is both feasible and has good fidelity. Simulation-based methods alone cannot deal with this incredibly large and complex design space. To address these issues, this chapter presents an approach for cross-layer architectural DSE that efficiently prunes the large design space and furthermore uses predictive models to avoid expensive simulations. The chapter uses a single-chip heterogeneous single-ISA multiprocessor as an exemplar to demonstrate how the large search space can be covered and evaluated efficiently. The cross-layer approach copes with the complexity by using cross-layer prediction models to avoid overly costly full-system simulations, coupled with systematic pruning of the design space to enable good coverage in an efficient manner.

Santanu Sarma, Nikil Dutt

9. Scenario-Based Design Space Exploration

Modern embedded systems are becoming increasingly multifunctional, and, as a consequence, they more and more have to deal with dynamic application workloads. This dynamism manifests itself in the presence of multiple applications that can simultaneously execute and contend for resources in a single embedded system as well as the dynamic behavior within applications themselves. Such dynamic behavior in application workloads must be taken into account during the early system-level Design Space Exploration (DSE) of Multiprocessor System-on-Chip (MPSoC)-based embedded systems. Scenario-based DSE utilizes the concept of application scenarios to search for optimal mappings of a multi-application workload onto an MPSoC. To this end, scenario-based DSE uses a multi-objective genetic algorithm (GA) to identify the mapping with the best average quality for all the application scenarios in the workload. In order to keep the exploration of the scenario-based DSE efficient, fitness prediction is used to obtain the quality of a mapping. This fitness prediction implies that instead of using the entire set of all possible application scenarios, a small but representative subset of application scenarios is used to determine the fitness of mapping solutions. Since the representativeness of such a subset is dependent on the application mappings being explored, these representative subsets of application scenarios are dynamically obtained by means of coexploration of the scenario subset space. In this chapter, we provide an overview of scenario-based DSE and, in particular, present multiple techniques for fitness prediction using representative subsets of application scenarios: a stochastic, deterministic, and hybrid combination.

Andy Pimentel, Peter van Stralen

10. Design Space Exploration and Run-Time Adaptation for Multicore Resource Management Under Performance and Power Constraints

This chapter focuses on resource management techniques for performance or energy optimization in multi-/many-core systems. First, it gives a comprehensive overview of resource management from a broad perspective. Second, it discusses the possible optimization goals and constraints of resource management techniques: computational performance, power consumption, energy consumption, and temperature. Finally, it details the state-of-the-art techniques on resource management for performance optimization under power and thermal constraints, as well as for energy optimization under performance constraints.

Santiago Pagani, Muhammad Shafique, Jörg Henkel

Processor, Memory, and Communication Architecture Design

Frontmatter

11. Reconfigurable Architectures

Reconfigurable architecture is a computer architecture combining some of the flexibility of software with the high performance of hardware. It has configurable fabric that performs a specific data-dominated task, such as image processing or pattern matching, as quickly as a dedicated piece of hardware. Once the task has been executed, the hardware can be adjusted to do some other task. This allows the reconfigurable architecture to provide the flexibility of software with the speed of hardware. This chapter discusses two major streams of reconfigurable architecture: Field-Programmable Gate Array (FPGA) and Coarse Grained Reconfigurable Architecture (CGRA). It gives a brief explanation of the merits and usage of reconfigurable architecture and explains basic FPGA and CGRA architectures. It also explains techniques for mapping applications onto FPGAs and CGRAs.

Mansureh Shahraki Moghaddam, Jae-Min Cho, Kiyoung Choi

12. Application-Specific Processors

General-Purpose Processors (GPPs) and Application-Specific Integrated Circuits (ASICs) are the two extreme choices for computational engines. GPPs offer complete flexibility but are inefficient both in terms of performance and energy. In contrast, ASICs are highly energy-efficient and provide the best performance, at the cost of zero flexibility. Application-specific processors or custom processors bridge the gap between these two alternatives by bringing in improved power-performance efficiency within the familiar software programming environment. An application-specific processor architecture augments the base instruction-set architecture with customized instructions that encapsulate the frequently occurring computational patterns within an application. These custom instructions are implemented in hardware, enabling performance acceleration and energy benefits. The challenge lies in inventing automated tools that can design an application-specific processor by identifying and implementing custom instructions from the application software specified in high-level programming languages. In this chapter, we present the benefits of application-specific processors, their architecture, automated design flow, and the renewed interest in this class of architectures from an energy-efficiency perspective.

Tulika Mitra

13. Memory Architectures

In this chapter we discuss the topic of memory organization in embedded systems and Systems-on-Chips (SoCs). We start with the simplest hardware-based systems needing registers for storage and proceed to hardware/software codesigned systems with several standard structures such as Static Random-Access Memory (SRAM) and Dynamic Random-Access Memory (DRAM). In the process, we touch upon concepts such as caches and Scratchpad Memories (SPMs). In general, the emphasis is on concepts that are more generally found in SoCs and less on general-purpose computing systems, although this distinction is not very clearly defined with respect to the memory subsystem. We touch upon implementations of these ideas in modern research and commercial scenarios. In this chapter, we also point out issues arising in the context of the memory architectures that become exported as problems to be addressed by the compiler and system designer.

Preeti Ranjan Panda

14. Emerging and Nonvolatile Memory

In recent years, Non-Volatile Memory (NVM) technologies have emerged as candidates for future computer memory. Nonvolatility, the ability to retain information even after being powered off, essentially differentiates them from traditional Complementary Metal-Oxide-Semiconductor (CMOS)-based memory technologies. In addition to nonvolatility, NVMs are also favored because of their low leakage power, high density, and read speed comparable to that of volatile memories. However, there are challenges to efficiently utilizing NVMs due to the high write cost and potential endurance issues. In this chapter, we first introduce representative NVM technologies including their physical construction for data storage, as well as their characteristics, and then summarize recent work that exploits NVMs' characteristics to optimize their behavior.

Chun Jason Xue

15. Network-on-Chip Design

Continuous transistor scaling has enabled computer architects to integrate increasing numbers of cores on a chip. As the number of cores on a chip and application complexity have increased, the on-chip communication bandwidth requirement has increased as well. The packet-switched Network-on-Chip (NoC) is envisioned as a scalable and cost-effective communication fabric for multi-core architectures with tens and hundreds of cores. In this chapter we focus on on-chip communication architecture design and introduce the reader to some essential concepts of NoC architecture. This is followed by a discussion of the power-saving techniques commonly used for NoCs and the drawbacks and limitations of these techniques. We then concentrate on performance optimization through intelligent mapping of applications on multi-core architectures. We conclude the chapter with a discussion of various application-specific on-chip interconnect design methods.

Haseeb Bokhari, Sri Parameswaran

16. NoC-Based Multiprocessor Architecture for Mixed-Time-Criticality Applications

In this chapter we define what a mixed-time-criticality system is and what its requirements are. After defining the concepts that such systems should follow, we describe CompSOC, which is one example of a mixed-time-criticality platform. We describe, in detail, how multiple resources, such as processors, memories, and interconnect, are combined into a larger hardware platform, and especially how they are shared between applications using different arbitration schemes. Following this, the software architecture that transforms the single hardware platform into multiple virtual execution platforms, one per application, is described.

Kees Goossens, Martijn Koedam, Andrew Nelson, Shubhendu Sinha, Sven Goossens, Yonghui Li, Gabriela Breaban, Reinier van Kampenhout, Rasool Tavakoli, Juan Valencia, Hadi Ahmadi Balef, Benny Akesson, Sander Stuijk, Marc Geilen, Dip Goswami, Majid Nabi

Hardware/Software Cosimulation and Prototyping

Frontmatter

17. Parallel Simulation

The SystemC standard is widely used in industry and academia to model and simulate electronic system-level designs. However, despite the availability of multi-core processor hosts, the reference SystemC simulator is still based on sequential Discrete Event Simulation (DES), which executes only a single thread at any time. In recent years, parallel SystemC simulators have been proposed which run multiple threads in parallel based on Parallel Discrete Event Simulation (PDES) semantics. While this can improve the simulator run time by an order of magnitude, synchronous PDES requires careful dependency analysis of the model and still limits the parallel execution to threads that run at the same simulation time. In this chapter, we review the classic DES and PDES algorithms and then present a state-of-the-art approach called Out-of-Order Parallel Discrete Event Simulation (OOO PDES), which breaks the traditional time cycle barrier and executes threads in parallel and out of order (ahead of time) while maintaining the standard SystemC modeling semantics. Specifically, we present our Recoding Infrastructure for SystemC (RISC), which consists of a dedicated SystemC compiler and an advanced parallel simulator. RISC provides an open-source reference implementation of OOO PDES and achieves the fastest simulation speed for traditional SystemC models without any loss of accuracy.

Rainer Dömer, Guantao Liu, Tim Schmidt
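
The sequential DES scheme mentioned in the abstract of Chapter 17, in which only one activity executes at any time, can be illustrated with a small event-queue loop. This is a generic sketch under simplifying assumptions and not the SystemC reference scheduler: delta cycles and the evaluate/update phases are not modeled, and the `Simulator` class, the `producer` and `consumer` callbacks, and all delays are hypothetical.

```python
import heapq
from typing import Callable, List, Tuple


class Simulator:
    """Sequential discrete event simulator: one event runs at a time, in time order."""

    def __init__(self) -> None:
        self.now = 0  # current simulation time
        self._queue: List[Tuple[int, int, Callable[[], None]]] = []
        self._seq = 0  # tie-breaker that keeps insertion order for equal timestamps
        self.log: List[Tuple[int, str]] = []

    def schedule(self, delay: int, action: Callable[[], None]) -> None:
        """Queue an action to run `delay` time units after the current time."""
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self) -> None:
        """Pop the earliest event, advance time to it, and execute it, until empty."""
        while self._queue:
            time, _, action = heapq.heappop(self._queue)
            self.now = time
            action()


sim = Simulator()


def producer() -> None:
    sim.log.append((sim.now, "produce"))
    if sim.now < 20:
        sim.schedule(10, producer)  # re-arm itself every 10 time units


def consumer() -> None:
    sim.log.append((sim.now, "consume"))


sim.schedule(0, producer)
sim.schedule(15, consumer)
sim.run()
print(sim.log)
```

Because the loop always advances to the single earliest pending event, events at different simulation times can never overlap on the host. That global ordering is the barrier that parallel schemes have to relax safely.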

18. Multiprocessor System-on-Chip Prototyping Using Dynamic Binary Translation

Dynamic binary translation is a processor emulation technology that allows a binary program for an instruction-set architecture A to be executed very efficiently on a processor with instruction-set architecture B. This chapter starts by giving a rapid overview of the dynamic binary translation process and its peculiarities. Then, it focuses on the support for SIMD instructions and the translation for VLIW architectures, which bring new challenges for this technology. Next, it shows how the translation process can be enhanced by the insertion of instructions to monitor nonfunctional metrics, with the aim of giving, for instance, timing or power consumption estimations. Finally, it details how dynamic binary translation can be integrated within virtual prototyping platforms, looking in particular at the synchronization issues.

Frédéric Pétrot, Luc Michel, Clément Deschamps

19. Host-Compiled Simulation

Virtual Prototypes (VPs), also known as virtual platforms, have now been widely adopted by industry as platforms for early software development, HW/SW coverification, performance analysis, and architecture exploration. Yet, rising design complexity, the need to test an increasing amount of software functionality, and the verification of timing properties pose a growing challenge in the application of VPs. New approaches overcome the accuracy-speed bottleneck of today's virtual prototyping methods. These next-generation VPs are centered around ultra-fast host-compiled software models. Accuracy is obtained by advanced methods, which reconstruct the execution times of the software and model the timing behavior of the operating system, target processor, and memory system. It is shown that simulation speed can be increased further by abstract TLM-based communication models. This support of ultra-fast and accurate HW/SW cosimulation will be a key enabler for successfully developing tomorrow's Multi-Processor System-on-Chip (MPSoC) platforms.

Daniel Mueller-Gritschneder, Andreas Gerstlauer

20. Precise Software Timing Simulation Considering Execution Contexts

Context-sensitive software timing simulation enables a precise approximation of software timing at a high simulation speed. The number of cycles required to execute a sequence of instructions depends on the state of the microarchitecture prior to the execution of that sequence, which in turn heavily depends on the preceding instructions. This is exploited in context-sensitive timing simulation by selecting one of multiple pre-calculated cycle counts for an instruction sequence based on the control flow leading to a particular execution of the sequence. In this chapter, we give an overview of this concept and present our context-sensitive simulation framework. Experimental results demonstrate that our framework enables an accurate and fast timing simulation for software executing on current commercial embedded processors with complex high-performance microarchitectures without any slow, explicit modeling of components such as caches during simulation.

Oliver Bringmann, Sebastian Ottlik, Alexander Viehl
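
The selection step described in the abstract of Chapter 20, choosing one of several pre-calculated cycle counts for an instruction sequence based on the control flow leading to it, can be sketched as a table lookup. This is an illustrative simplification and not the authors' framework: the context is assumed to be just the immediately preceding basic block, and the block names and cycle counts are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Pre-calculated cycle counts keyed by (predecessor block, block).
# A predecessor of None marks the context-insensitive fallback for that block.
# All names and numbers are hypothetical.
CYCLES: Dict[Tuple[Optional[str], str], int] = {
    (None, "entry"): 5,
    (None, "loop"): 12,
    ("entry", "loop"): 12,  # first iteration: cold microarchitectural state
    ("loop", "loop"): 7,    # later iterations: warm state after the previous pass
    (None, "exit"): 4,
}


def simulate(trace: List[str]) -> int:
    """Sum cycle counts along a basic-block trace, using the context-specific
    count when one exists and the fallback count otherwise."""
    total = 0
    prev: Optional[str] = None
    for block in trace:
        total += CYCLES.get((prev, block), CYCLES[(None, block)])
        prev = block
    return total


total = simulate(["entry", "loop", "loop", "loop", "exit"])
print(total)
```

A context-insensitive model would charge the same count for every pass through `loop`. The lookup above distinguishes the first pass from later ones without modeling caches or pipelines during simulation.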

Performance Estimation, Analysis, and Verification

Frontmatter

21. Timing Models for Fast Embedded Software Performance Analysis

In this chapter, we give an overview of timing models, which provide an abstract representation of the timing behavior of a given piece of software. These models can be driven by a functional simulation based on the simulated control flow. As the timing model itself can reach a level of accuracy that is comparable to a classic timing simulation of the represented software, these approaches enable a fast yet accurate software performance analysis. In this chapter, we focus on the generation and structure of various models but also provide a brief introduction to their integration with a functional simulation. The presented approaches target software executing on current and future systems-on-chip with a wide range of embedded processors, including Graphics Processing Units (GPUs).

Oliver Bringmann, Christoph Gerum, Sebastian Ottlik

22. Semiformal Assertion-Based Verification of Hardware/Software Systems in a Model-Driven Design Framework

Since the mid-1990s, Model-Driven Design (MDD) methodologies (Selic, IEEE Softw 20(5):19–25, 2003) have aimed at raising the level of abstraction through an extensive use of generic models in all the phases of the development of embedded systems. MDD describes the system under development in terms of abstract characterization, attempting to be generic not only in the choice of implementation platforms but even in the choice of execution and interaction semantics. Thus, MDD has emerged as the most suitable solution to develop complex systems and has been supported by academic (Ferrari et al., From conception to implementation: a model based design approach. In: Proceedings of IFAC symposium on advances in automotive control, 2004) and industrial tools (3S Software CoDeSys, 2012. http://www.3s-software.com; Atego ARTiSAN, 2012. http://www.atego.com/products/artisan-studio; Gentleware Poseidon for UML embedded edition, 2012. http://www.gentleware.com/uml-software-embedded-edition.html; IAR Systems IAR visualSTATE, 2012. http://www.iar.com/Products/IAR-visualSTATE/; IBM Rational Rhapsody, 2012. http://www.ibm.com/software/awdtools/rhapsody; Sparx Systems Enterprise Architect, 2012. http://www.sparxsystems.com.au; Aerospace Valley TOPCASED project, 2012. http://www.topcased.org). The gain offered by the adoption of an MDD approach is the capability of generating the source code implementing the target design in a systematic way, i.e., it avoids the need to write it manually. However, even if MDD simplifies the design implementation, it does not prevent the designers from wrongly defining the design behavior. Therefore, MDD gives full benefits only if it also integrates functional verification. In this context, Assertion-Based Verification (ABV) has emerged as one of the most powerful solutions for capturing a designer's intent and checking its compliance with the design implementation.
In ABV, specifications are expressed by means of formal properties. These overcome the ambiguity of natural languages and are verified by means of either static (e.g., model checking) or, more frequently, dynamic (e.g., simulation) techniques. Therefore ABV provides a proof of correctness for the outcome of the MDD flow. Consequently, the MDD and ABV approaches have been combined to create efficient and effective design and verification frameworks that accompany designers and verification engineers throughout the system-level design flow of complex embedded systems, both for the Hardware (HW) and the Software (SW) parts (STM Products radCHECK, 2012. http://www.verificationsuite.com; Seger, Integrating design and verification – from simple idea to practical system. In: Proceedings of ACM/IEEE MEMOCODE, pp 161–162, 2006). It is, indeed, worth noting that to achieve a high degree of confidence, such frameworks need to be supported by functional qualification methodologies, which evaluate the quality of both the properties (Di Guglielmo et al. The role of mutation analysis for property qualification. In: 7th IEEE/ACM international conference on formal methods and models for co-design, MEMOCODE'09, pp 28–35, 2009. DOI 10.1109/MEMCOD.2009.5185375) and the testbenches which are adopted during the overall flow (Bombieri et al. Functional qualification of TLM verification. In: Design, automation test in Europe conference exhibition, DATE'09, pp 190–195, 2009. DOI 10.1109/DATE.2009.5090656). In this context, the goal of the chapter is to provide, first, a general introduction to MDD and ABV concepts and related formalisms and then a more detailed view of the main challenges concerning the realization of an effective semiformal ABV environment through functional qualification.

Graziano Pravadelli, Davide Quaglia, Sara Vinco, Franco Fummi

23. CPA: Compositional Performance Analysis

In this chapter we review the foundations of Compositional Performance Analysis (CPA) and explain many extensions which support its application in design practice. CPA is widely used in automotive system design where it successfully complements or even replaces simulation-based approaches.

Robin Hofmann, Leonie Ahrendts, Rolf Ernst

24. Networked Real-Time Embedded Systems

This chapter gives an overview of various real-time communication protocols, from the Controller Area Network (CAN), which was standardized over twenty years ago but is still popular, to the FlexRay protocol, which provides strong predictability and fault tolerance, to the more recent Ethernet-based networks. The design of these protocols, including their messaging mechanisms, was driven by diversified requirements on bandwidth, real-time predictability, reliability, cost, etc. The chapter provides three examples of real-time communication protocols: CAN as an example of event-triggered communication, FlexRay as a heterogeneous protocol supporting both time-triggered and event-triggered communications, and different incarnations of Ethernet that provide desired temporal guarantees.

Haibo Zeng, Prachi Joshi, Daniel Thiele, Jonas Diemer, Philip Axer, Rolf Ernst, Petru Eles

Hardware/Software Compilation and Synthesis

Frontmatter

25. Hardware-Aware Compilation

Hardware-aware compilers are in high demand for embedded systems with stringent multidimensional design constraints on cost, power, performance, etc. By making use of microarchitectural information about a processor and exploiting its highly customized microarchitectural features, a hardware-aware compiler can generate more efficient code than a generic compiler while meeting the design constraints. In this chapter, we introduce two applications of hardware-aware compilers: a hardware-aware compiler can be used as a production compiler and as a tool to efficiently explore the design space of embedded processors. We demonstrate the first application with a compiler that generates efficient code for embedded processors that do not have any branch predictor to reduce branch penalties. To demonstrate the second application, we show how a hardware-aware compiler can be used to explore the design space of the bypass designs in the processor. In both cases, the hardware-aware compiler can generate better code than a hardware-ignorant compiler.

Aviral Shrivastava, Jian Cai

26. Memory-Aware Optimization of Embedded Software for Multiple Objectives

Information processing in Cyber-Physical Systems (CPSs) has to respect a variety of constraints and objectives such as response and execution time, energy consumption, Quality of Service (QoS), size, and cost. Due to the large impact of the size of memories on their energy consumption and access times, an exploitation of memory characteristics offers a large potential for optimizations. In this chapter, we will describe optimization approaches proposed by our research groups. We will start with optimizations for single objectives, such as energy consumption and execution time. As a consequence of considering hard real-time systems, special attention is on the minimization of the Worst-Case Execution Time (WCET) within compilers. Three WCET reduction techniques are analyzed: exploitation of scratchpads, instruction cache locking, and cache partitioning for multitask systems. The last section presents an approach for considering trade-offs between multiple objectives in the design of a cyber-physical sensor system for the detection of bio-viruses.

Peter Marwedel, Heiko Falk, Olaf Neugebauer

27. Microarchitecture-Level SoC Design

In this chapter we consider the issues related to integrating microarchitectural IP blocks into complex SoCs while satisfying performance, power, thermal, and reliability constraints. We first review different abstraction levels for SoC design that promote IP reuse, and which enable fast simulation for early functional validation of the SoC platform. Since SoCs must satisfy a multitude of interrelated constraints, we then present high-level power, thermal, and reliability models for predicting these constraints. These constraints are not unrelated and their interactions must be considered, modeled and evaluated. Once constraints are modeled, we must explore the design space trading off performance, power and reliability. Several case studies are presented illustrating how the design space can be explored across layers, and what modifications could be applied at design time and/or runtime to deal with reliability issues that may arise.

Young-Hwan Park, Amin Khajeh, Jun Yong Shin, Fadi Kurdahi, Ahmed Eltawil, Nikil Dutt

Codesign Tools and Environment

Frontmatter

28. MAPS: A Software Development Environment for Embedded Multicore Applications

The use of heterogeneous Multi-Processor Systems-on-Chip (MPSoCs) is a widely accepted solution to address the increasing demands for high performance and energy efficiency in modern embedded devices. To enable the full potential of these platforms, new tools are needed to tackle the programming complexity of MPSoCs, while allowing for high productivity. This chapter discusses the MPSoC Application Programming Studio (MAPS), a framework that provides facilities for expressing parallelism and tool flows for parallelization, mapping/scheduling, and code generation for heterogeneous MPSoCs. Two case studies of the use of MAPS in commercial environments are presented. This chapter closes by discussing early experiences of transferring the MAPS technology into Silexica GmbH, a start-up company that provides multi-core programming tools.

Rainer Leupers, Miguel Angel Aguilar, Juan Fernando Eusse, Jeronimo Castrillon, Weihua Sheng

29. HOPES: Programming Platform Approach for Embedded Systems Design

Hope Of Parallel Embedded Software (HOPES) is a design environment for embedded systems supporting all design steps from behavior specification to code synthesis, including static performance estimation, design space exploration, and HW/SW cosimulation. Distinguished from other design environments, it introduces a novel concept of “programming platform” called Common Intermediate Code (CIC), which can be understood as a generic execution model of heterogeneous multi-processor architectures. In the CIC model, each application is specified by a multi-mode Synchronous Data Flow (SDF) graph, called MTM-SDF. Each mode of operation is specified by an SDF graph, and mode transition is expressed by a Finite-State Machine (FSM) model. It enables a designer to estimate the performance and resource demand by constructing static schedules of the application with a varying number of allocated processing elements at each mode. At the top level, a process network model is used to express concurrent execution of multiple applications. A special process, called control task, is introduced to specify the system-level dynamism through an FSM model inside. With a given CIC model and a set of candidate target architectures, HOPES performs design space exploration to choose the best HW/SW platform, assuming that a hybrid mapping policy is used to map the applications to the processing elements. HOPES synthesizes the target code automatically from the CIC model with the mapping information. The overall design flow is verified by the design of two real-life examples.

Soonhoi Ha, Hanwoong Jung
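
As an illustration of why SDF graphs admit static schedules, the following minimal Python sketch solves the standard SDF balance equations to obtain how often each actor fires per schedule iteration. It is not taken from HOPES or the CIC model; the function name `repetition_vector`, the edge tuple layout `(src, dst, produced, consumed)`, and the example graph are all assumptions made for this sketch.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm requires Python 3.9+


def repetition_vector(actors, edges):
    """Solve the SDF balance equations q[src] * produced == q[dst] * consumed.

    actors: list of actor names (hypothetical example data).
    edges:  list of (src, dst, produced, consumed) token rates per firing.
    Returns the smallest positive integer firing counts, or raises
    ValueError if the rates are inconsistent or the graph is disconnected.
    """
    # Store the firing-rate ratio along each edge in both directions.
    neighbours = {a: [] for a in actors}
    for src, dst, produced, consumed in edges:
        neighbours[src].append((dst, Fraction(produced, consumed)))
        neighbours[dst].append((src, Fraction(consumed, produced)))

    # Propagate relative firing rates from the first actor.
    rate = {actors[0]: Fraction(1)}
    stack = [actors[0]]
    while stack:
        a = stack.pop()
        for b, ratio in neighbours[a]:
            expected = rate[a] * ratio
            if b not in rate:
                rate[b] = expected
                stack.append(b)
            elif rate[b] != expected:
                raise ValueError("inconsistent rates: no static schedule exists")

    if len(rate) != len(actors):
        raise ValueError("graph is not connected")

    # Scale the rational rates to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(r * scale) for a, r in rate.items()}


# Hypothetical three-actor chain: A produces 2 tokens, B consumes 3, and so on.
example = repetition_vector(
    ["A", "B", "C"],
    [("A", "B", 2, 3), ("B", "C", 1, 2)],
)
print(example)
```

For this example graph, one schedule iteration fires A three times, B twice, and C once, so every edge ends the iteration with the same number of tokens it started with.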

30. DAEDALUS: System-Level Design Methodology for Streaming Multiprocessor Embedded Systems on Chips

The complexity of modern embedded systems, which are increasingly based on heterogeneous multiprocessor system-on-chip (MPSoC) architectures, has led to the emergence of system-level design. To cope with this design complexity, system-level design aims at raising the abstraction level of the design process from the register-transfer level (RTL) to the so-called electronic system level (ESL). However, this opens a large gap between deployed ESL models and RTL implementations of the MPSoC under design, known as the implementation gap. Therefore, in this chapter, we present the Daedalus methodology, whose main objective is to bridge this implementation gap for the design of streaming embedded MPSoCs. Daedalus does so by providing an integrated and highly automated environment for application parallelization, system-level design space exploration, and system-level hardware/software synthesis and code generation.

Todor Stefanov, Andy Pimentel, Hristo Nikolov

31. SCE: System-on-Chip Environment

The constantly growing complexity of embedded systems is a challenge that drives the development of novel design automation techniques. System-level design can address these complexity challenges by raising the level of abstraction to jointly consider hardware and software as well as by integrating the design processes for heterogeneous system components. In this chapter, we present a comprehensive system-level design framework, the System-on-Chip Environment (SCE), which is based on the influential SpecC language and methodology. SCE implements a top-down digital system design flow based on a specify-explore-refine paradigm with support for heterogeneous target platforms consisting of custom hardware components, embedded software processors, and complex communication bus architectures. Starting from an abstract specification of the desired system, models at various levels of abstraction are automatically generated through successive stepwise refinement, ultimately resulting in a final pin- and cycle-accurate system implementation. The seamless integration of automatic model generation, estimation, and validation tools enables rapid Design Space Exploration (DSE) and efficient implementation of Multi-Processor Systems-on-Chips (MPSoCs). This article provides an overview and highlights key aspects of the SCE framework from modeling and refinement to hardware and software synthesis. Using a cellphone-based example, our experimental results demonstrate the effectiveness of the SCE framework in terms of system-level exploration as well as hardware and software synthesis.

Gunar Schirner, Andreas Gerstlauer, Rainer Dömer

32. Metamodeling and Code Generation in the Hardware/Software Interface Domain

In the HW/SW interface domain, specification of memory architecture and software-accessible hardware registers are both relevant for the implementation of hardware and the firmware running on it. Automated code generation of both HW and SW artifacts from a shared data source is a well-established method to ensure consistency. Metamodeling is a key technology to ease such code generation and to formalize the data structures target code is generated from. While this can be utilized for a wide range of automation and generation tasks, it is particularly useful for bridging the HW/SW design gap. Metamodeling is the basis for the construction of large model-driven automation solutions that go far beyond simple code generation solutions. Based on the formalization metamodels provide, models can be incrementally transformed and combined to create more refined models for particular design tasks. IP-XACT and UML/SysML can be utilized within the scope of metamodeling. The utilization of these standards and the development of custom metamodels – targeted to specific design tasks – have proven to be highly successful and promise large potential for further productivity increase.

Wolfgang Ecker, Johannes Schreiner

33. Hardware/Software Codesign Across Many Cadence Technologies

Cadence offers many technologies and methodologies for hardware/software codesign of advanced electronic and software systems. This chapter outlines many of these technologies and provides a brief overview of their key use models and methodologies. These include advanced verification, prototyping – both virtual and real, emulation, high-level synthesis, design of an Application-Specific Instruction-set Processor (ASIP), and software-driven verification approaches.

Grant Martin, Frank Schirrmeister, Yosinori Watanabe

34. Synopsys Virtual Prototyping for Software Development and Early Architecture Analysis

This chapter summarizes more than 20 years of experience by the virtual prototyping group of Synopsys in the commercial deployment of Hardware/Software Codesign (HSCD). The goal of HSCD has always been to reduce time to market, increase design productivity, and improve the quality of results. From all the different facets of HSCD, virtual prototyping – complemented by links to emulation and FPGA prototyping – has so far proven to achieve the best return on investment with respect to these goals. This chapter first gives an overview of the main virtual prototyping use cases in the context of an end-to-end prototyping flow, which also includes physical prototyping and hybrid prototyping. The second part introduces the SystemC Transaction-Level Model (TLM) standard and the Unified Power Format (UPF) as the main modeling languages for the creation of Virtual Prototypes (VPs) and system-level power models. The main body of this chapter focuses on the commercially deployed virtual prototyping use cases for architecture exploration and system-level power analysis.

Tim Kogel

Applications and Case Studies

Frontmatter

35. Joint Computing and Electric Systems Optimization for Green Datacenters

This chapter presents an optimization framework to manage green datacenters using multilevel energy reduction techniques in a joint approach. A green datacenter exploits renewable energy sources and active Uninterruptible Power Supply (UPS) units to reduce the energy intake from the grid while improving its Quality of Service (QoS). At server level, the state-of-the-art correlation-aware Virtual Machine (VM) consolidation technique maximizes the servers' energy efficiency. At system level, heterogeneous Energy Storage Systems (ESS) replace standard UPSs, while a dedicated optimization strategy aims to maximize the lifetime of the battery banks and to reduce the energy bill, considering the load of the servers. Results demonstrate, for different numbers of VMs in the system, up to 11.6% energy savings and 10.4% improvement of QoS compared to existing correlation-aware VM allocation schemes for datacenters, and up to 96% electricity bill savings.

Ali Pahlevan, Maurizio Rossi, Pablo G. Del Valle, Davide Brunelli, David Atienza

36. The DSPCAD Framework for Modeling and Synthesis of Signal Processing Systems

With domain-specific models of computation and widely-used hardware acceleration techniques, Hardware/Software Codesign (HSCD) has the potential of being as agile as traditional software design, while approaching the performance of custom hardware. However, due to increasing use of system heterogeneity, multi-core processors, and hardware accelerators, along with traditional software development challenges, codesign processes for complex systems are often slow and error prone. The purpose of this chapter is to discuss a Computer-Aided Design (CAD) framework, called the DSPCAD Framework, that addresses some of these key development issues for the broad domain of Digital Signal Processing (DSP) systems. The emphasis in the DSPCAD Framework on supporting cross-platform, domain-specific approaches enables designers to rapidly arrive at initial implementations for early feedback, and then systematically refine them towards functionally correct and efficient solutions. The DSPCAD Framework is centered on three complementary tools – the Data-flow Interchange Format (DIF), LIghtweight Data-flow Environment (LIDE) and DSPCAD Integrative Command Line Environment (DICE), which support flexible design experimentation and orthogonalization across three major dimensions in model-based DSP system design – abstract data-flow models, actor implementation languages, and integration with platform-specific design tools. We demonstrate the utility of the DSPCAD Framework through a case study involving the mapping of synchronous data-flow graphs onto hybrid CPU-GPU platforms.

Shuoxin Lin, Yanzhou Liu, Kyunghun Lee, Lin Li, William Plishker, Shuvra S. Bhattacharyya

37. Control/Architecture Codesign for Cyber-Physical Systems

Control/architecture codesign has recently emerged as a popular research focus in the context of cyber-physical systems. Many of the cyber-physical systems pertaining to industrial applications are embedded control systems. With the increasing size and complexity of such systems, resource awareness in the system design is becoming an important issue. Control/architecture codesign methods integrate the design of controllers and the design of embedded platforms to exploit the characteristics on both sides. This reduces the design conservativeness of the separate design paradigm while guaranteeing the correctness of the system and thus helps to achieve more efficient designs. In this chapter of the handbook, we provide an overview of control/architecture codesign in terms of resource awareness and show three illustrative examples of state-of-the-art approaches, targeting communication-aware, memory-aware, and computation-aware design, respectively.

Wanli Chang, Licong Zhang, Debayan Roy, Samarjit Chakraborty

38. Wireless Sensor Networks

Versatile and effective, Wireless Sensor Networks (WSNs) witness a continuous expansion of their application domains. Yet, their use is still hindered by issues such as reliability, lifetime, overall cost, design effort, and multidisciplinary engineering knowledge, which often prove to be daunting for application domain experts. Several WSN design models, tools, and techniques were proposed to address these contrasting objectives, but no single comprehensive approach has emerged. With these criteria in mind, we review several of the most representative ones, then we focus on two of the most effective hardware/software codesign flows. Both offer high-level design entry interfaces based on StateCharts. One allows manual module composition into a full application and automates its mapping onto a user-defined architecture for fast high-level design space exploration. The other flow automates module composition starting from the application specification and by reusing library modules. It can generate the hardware specification and the software to program and configure the WSN nodes. For both flows, we show their typical use in the development of some representative applications to evaluate their effectiveness.

Mihai Teodor Lazarescu, Luciano Lavagno

39. Codesign Case Study on Transport-Triggered Architectures

Application-specific processors are used to obtain the efficiency of fixed-function application-specific integrated circuits and the flexibility of software implementations on programmable processors. The efficiency is achieved by tailoring the processor architecture according to the requirements of the application, while the flexibility is provided by the programmability. In this chapter, we introduce a hardware/software codesign environment for developing application-specific processors, which uses processor templates based on the transport-triggering paradigm, hence the name transport-triggered architecture (TTA). Fast Fourier transform (FFT) is used as an example application to illustrate the customization. Specific features of FFTs are discussed, and we show how those can be exploited in FFT implementations. We have customized a TTA processor for FFT, and its energy efficiency is compared against several other FFT implementations to prove the potential of the concept.

Jarmo Takala, Pekka Jääskeläinen, Teemu Pitkänen

40. Embedded Computer Vision

Embedded computer vision is a challenging application domain, requiring high computation rates, high memory bandwidth, and support for a wide range of algorithms. This chapter reviews basic concepts in computer vision, design methodologies for embedded computer vision, platform architectures, and application-specific architectures.

Marilyn Wolf


