Elsevier

Microprocessors and Microsystems

Volume 37, Issues 6–7, August–October 2013, Pages 590-600
Microprocessors and Microsystems

Energy-efficient scheduling on multi-FPGA reconfigurable systems

https://doi.org/10.1016/j.micpro.2013.05.001Get rights and content

Abstract

With the growing demand in high performance computing, reconfigurable computing systems built with Field Programmable Gate Array (FPGA) have become increasingly popular for its reconfigurability and adaptability to applications. Although such systems promise high processing performance, their energy efficiency has become a critical issue. This paper studies the crucial problem of energy-efficient scheduling for reconfigurable systems with multiple FPGAs. Several factors make the energy efficient scheduling particularly challenging, including spatial allocation constraint, reconfiguration overhead, limited reconfiguration ports, and deadline satisfaction. These unique characteristics make energy efficient scheduling in multi-FPGA reconfigurable systems particularly challenging and none of existing solutions can be directly applied. This paper takes on this challenge and proposes an energy-efficient scheduling algorithm called AEE based on ant colony optimization for multi-FPGA reconfigurable systems. A task placement scheme is devised which serves as the heuristic function that derives the minimum global makespan, which is important to the ant colony algorithm based proposed in the paper. The scheme takes into account reconfiguration overhead and places tasks for reducing the overall overhead. Then, based on AEE, an enhanced algorithm (eAEE) is devised to deal with the tasks with precedence and interdependencies. To evaluate the effectiveness of the two proposed algorithms, comprehensive trace-driven simulations have been conducted and compared with other state-of-art algorithms. Experimental results demonstrate that AEE can successfully complete tasks without violating deadline constraints and the energy dissipation is largely reduced, no more than 10.65% higher than the optimum when the problem scale is relatively small. Also, eAEE consumes energy 58.17% less than an improved simulated annealing algorithm (iSA) with a large problem scale.

Introduction

Recent years have witnessed the rapid development of Field Programmable Gate Array (FPGA) technology, which has been widely used in various application domains. Compared with Application Specific Integrated Circuit (ASIC), FPGA outperforms in terms of performance, cost, flexibility and reconfigurability. Today, some off-the-shelf commercial FPGA systems support partial runtime reconfiguration (PRTR). This allows part of logic blocks to be reconfigured on the runtime, without affecting the rest of logic blocks. FPGAs have been used to construct high performance systems that are dynamically reconfigurable. Compared with systems built with multiple general processors, such systems are advantageous. The processing speed is higher since a task is processed with a hardware implementation. Example of such systems include BEE2 [6] and TM-4 [11]. This paper considers multi-FPGA reconfigurable systems (as shown in Fig. 1), which have been adopted in a number of application domains [2], [6], [11].

Although multi-FPGA systems are capable of providing high performance computing, their energy dissipation is a serious issue. It is highly important to achieve energy efficiency. Firstly, less heat would be generated and the effort for releasing heat could be greatly saved. Secondly, the less energy dissipates in the system, the smaller the energy bill should be paid. Lastly, less heat results in a higher reliability of the whole system [1]. The increasing temperature leaves FPGA fabrics more vulnerable.

Fortunately, many contemporary reconfigurable platforms incorporate tuning mechanisms for energy efficient operations. For instance, Xilinx Vertiex-5 is designed based on CMOS circuitry [3] that allows adjustable power consumption rates.

It has become evident, however, that there is an intrinsic tradeoff between processing performance and energy consumption for multi-FPGA systems. To achieve lower power consumption rate, the system can decrease the operating frequency, which leads to longer processing time. In practice, it is highly desirable to complete tasks within given deadlines while minimizing the overall power consumption of the whole system. This paper studies the crucial problem of energy-efficient scheduling on multi-FPGA systems without missing deadline constraint.

To achieve energy efficiency, several great challenges have to be addressed. First, a task may use a number of logic blocks and it is required that these blocks be located together. We define this as integral allocation of contiguous logic blocks on a FPGA to a computing task. Second, the scheduling system needs to take into account reconfiguration overhead. It takes a non-eligible amount of time to reconfigure the logic blocks in order to execute a new type of task. Moreover, a FPGA has a limited number of loading ports. This suggests that a reconfigurable system faces the constraint that at one time a limited number of tasks can accepted simultaneously for which the logic blocks are being reconfigured for the new type of tasks. This kind of constraint can be considered simultaneous reconfiguration constraint. It should be noted that in a multi-FPGA system, multiple independent tasks are likely to be reconfigured simultaneously. Finally, the scheduling algorithm needs to minimize the overall energy consumption while meeting the deadline constraints of application tasks.

There have been a few studies on energy conversation in reconfigurable systems. Many of them are dedicated to designing hardware in the circuit level [7], [21] and the system level [20], [23]. In [21], the authors develop a dual VDD/Vt voltage on an LUT-Based FPGA to tune the overall energy consumption. DVS (Dynamic Voltage Scaling) [7] is applied to adjust the voltage level in FPGAs embedded with a logic delay measurement circuit. The later methods apply scheduling techniques for reduction of energy consumption in dynamically reconfigurable systems. In [20], a post-placement leakage aware scheduler is invoked to reduce waste of leakage power, which can be measured from the delay between task reconfiguration and execution. One optimal algorithm is proposed for the case in which a task partition over contexts is given, and approximation algorithms are developed for the case in which no task partition is given [23]. In [4], [22], the methods either neglect the reconfiguration overhead or focus on single-FPGA systems. When there are multiple FPGA units, we should carefully allocate tasks to the FPGA units, maximizing simultaneous reconfiguration and minimizing reconfiguration overhead. Most existing work on multi-FPGA systems [2], [11], [24] has largely concentrated on computing performance only. In summary, few existing methods can be applied to energy-efficient scheduling on multi-FPGA systems.

Energy-efficient scheduling on multi-processor or multi-core systems [5], [14], [17], [19], [25], [27] has been extensively studied. But, energy efficient scheduling on multi-FPGA systems is different and more difficult. First, in a reconfigurable system, allocating tasks to FPGAs must conform to the spatial allocation constraint. A subset of contiguous logic blocks has to be allocated to one task. There is no such constraint for systems of general purpose processors. Second, a task to be executed on a reconfigurable system must first be mapped to a given set of logic blocks before it is run. Finally, scheduling on dynamically reconfigurable systems should explicitly take reconfiguration overhead into account [13]. Thus, existing scheduling algorithms for multi-processor systems can hardly be directly applied in multi-FPGA systems.

In this paper, we show that the optimal scheduling problem for energy minimization with deadline constraints is NP Complete (independent tasks). We propose an energy efficient scheduling algorithm called AEE based on Ant Colony Optimization (ACO). Also, we propose an enhanced AEE algorithm (eAEE) to deal with the tasks with precedence and interdependencies. These algorithms can achieve high energy efficiency and in the meanwhile meet the deadline requirement of computing tasks. The distinctive features of the algorithms are low complexity. Comprehensive trace-driven simulation experiments have been conducted to evaluate the algorithms and results show that the algorithms can process all tasks without violating deadline constraint and can considerably reduce energy consumption in the meantime.

The main technical contributions of the paper are listed as follows:

  • We investigate the unique characteristics of multi-FPGA reconfigurable systems, and then accordingly formulate the energy-efficient scheduling problem as an optimization problem with deadline constraints.

  • We analyze the complexity of the optimal scheduling problem and rigidly prove that it is a NP-Complete problem.

  • We propose AEE and eAEE, two energy efficient scheduling algorithms based on Ant-Colony Optimization, are for tasks independent and dependent, respectively. In spite of low complexity, these algorithms dramatically reduce the overall energy consumption while meeting the deadline constraint.

  • Comprehensive trace-driven simulations have been conducted and results show that the AEE algorithm dissipates power no more than 10.65% higher than the optimal solution, and eAEE consumes 58.17% less energy than that of iSA.

The remainder of paper is organized as follows. We introduce related work in Section 2. In Section 3, we present the system model and formally define the scheduling problem. In Section 4, we give the details of the proposed two scheduling algorithms. The performance evaluation methodology and evaluation results are presented in Section 5. We conclude the paper in Section 6.

Section snippets

Related work

There is much related work for studying performance versus power tradeoff on dynamically reconfigurable systems. Previous work can be divided into two classes. One class makes change to architecture or circuitry, and the other class does not.

We first look at the first class. Li et al. [21] develop a novel architecture at the circuit level. Through programmability of SRAM on a LUT-based FPGA, they develop a pre-defined dual VDD/Vt voltages on the circuit level. A power-aware algorithm is then

System model and problem description

In this section, the system model is described and the problem is formally defined.

Algorithm design

In this section we first present the overview of the algorithm design for solving the independent tasks and delve into the design details. Then, we propose an enhanced algorithm for tasks with interdependencies and detail the eAEE algorithm.

Performance evaluation

In this section we first show the simulation setup and performance metrics, then discuss performance results of the proposed algorithm in comparison with other alternative algorithms.

Conclusion

Multi-FPGA reconfigurable systems have increasingly been adopted for high performance computing. This paper considers the energy efficient scheduling problem with hard deadline constraint. Multi-FPGA reconfigurable systems differ dramatically from traditional computing systems with multiple general processors. A task must be mapped to a set of contiguous logic blocks that must be configured for the execution of this task. Reconfiguration overhead has to be considered. In addition, a FPGA unit

Acknowledgements

This research is supported by NSFC (No. 61170238, 60903190, 61027009, 60933011, 61202375, 61170237), Shanghai Pu Jiang Talents Program (10PJ1405800), Shanghai Chen Guang Program (10CG11), MIIT of China (2009ZX03006-001-01), Doctoral Fund of Ministry of Education of China (20100073120021), National 863 Program (2009AA012201 and 2011AA010500), HP IRP (CW267311), SJTU SMC Project (201120), STCSM (08dz1501600, 12ZR1414900), Singapore NRF (CREATE E2S2), and Program for Changjiang Scholars and

Chao Jing: is a PhD candidate in the Department of Computer Science and Engineering at the Shanghai Jiao Tong University. His research interests include High Performance Computing (HPC), resource management on reconfigurable systems.

References (27)

  • “40-nm FPGA Power Management and Advantages – Altera”, Technical Report,...
  • HPCA (High Performance Computing Alliance)....
  • “Virtex-5 Family Overview”, Technical Report, Xilinx,...
  • J. Angermeier, J. Teich, Heuristics for scheduling reconfigurable devices with consideration of reconfiguration...
  • H. Aydin, Q. Yang, Energy-aware partitioning for multiprocessor real-time systems, in: Proc. IEEE International...
  • C. Chang et al.

    BEE2: a high-end reconfigurable computing system

    IEEE Design & Test of Computers

    (2005)
  • C. Chow, L. Tsui, P.H.W. Leong, W. Luk, S.J.E. Wilton, Dynamic voltage scaling for commercial FPGAs, in: Proc. IEEE...
  • K. Compton et al.

    Reconfigurable computing: a survey of systems and software

    ACM Computing Survey

    (2002)
  • M. Dorigo, Optimization, Learning and Natural Algorithms, Ph.D. Dessertation Politecnico di Milano, Italy,...
  • A.A. ElFarag, H.M. El-Boghdadi, S.I. Shaheen, Miss ratio improvement for real-time applications using...
  • J. Fender, J. Rose, D. Galloway, The Transmogrifier-4: An FPGA-Based Hardware Development System with Multi-Gigabyte...
  • M.R. Garey et al.

    Computers and Intractability: A Guide to the Theory of NP-Completeness

    (1979)
  • S. Ghiasi et al.

    An optimal algorithm for minimizing run-time reconfiguration delay

    ACM Transactions on Embedded Computing Systems (TECS)

    (2004)
  • Cited by (20)

    • Multi-resource scheduling for FPGA systems

      2021, Microprocessors and Microsystems
      Citation Excerpt :

      The authors in [25] demonstrated the superiority of Ant-Colony Optimization with respect to Tabu Search, Genetic Algorithms and Simulated Annealing for the classical FPGA RCSPs. A previous edition of the Elsevier Microprocessors and Microsystems journal [26] presents two algorithms based on ant colony to schedule both dependent and independent tasks onto multiple FPGAs. The problems the authors aim to solve has a twofold objective function: to minimize the energy consumption and to respect a deadline constraint, while accounting for partial reconfigurations taking place in parallel on multiple FPGAs.

    • FAIR: Fully-Adaptive Framework for Improving Resource Provisioning in Collaborative CPU-FPGA Cloud Environments

      2021, Proceedings - Symposium on Computer Architecture and High Performance Computing
    • Thermal Management for FPGA Nodes in HPC Systems

      2020, ACM Transactions on Design Automation of Electronic Systems
    View all citing articles on Scopus

    Chao Jing: is a PhD candidate in the Department of Computer Science and Engineering at the Shanghai Jiao Tong University. His research interests include High Performance Computing (HPC), resource management on reconfigurable systems.

    Yanmin Zhu: obtained his PhD in computer science from Hong Kong University of Science and Technology in 2007, and BEng. in computer science from Xi’an Jiao Tong University in 2002. He is a Associate Professor in the Department of Computer Science and Engineering, Shanghai Jiao Tong University, China. His research interests include ad-hoc sensor networks, mobile computing, grid computing and resource management in distributed systems.

    Minglu Li: received his PhD in Computer Software from Shanghai Jiao Tong University in 1996. He is a full Professor at the Department of Computer Science and Engineering of Shanghai Jiao Tong University. Now, he is the vice dean of the School of Electronic Information and Electrical Engineering. Currently, his research interests include Grid Computing, Services Computing and Cloud Computing. He has published over 100 papers in important academic journals and international conferences.

    View full text