Energy-efficient scheduling on multi-FPGA reconfigurable systems
Introduction
Recent years have witnessed the rapid development of Field Programmable Gate Array (FPGA) technology, which has been widely used in various application domains. Compared with Application Specific Integrated Circuit (ASIC), FPGA outperforms in terms of performance, cost, flexibility and reconfigurability. Today, some off-the-shelf commercial FPGA systems support partial runtime reconfiguration (PRTR). This allows part of logic blocks to be reconfigured on the runtime, without affecting the rest of logic blocks. FPGAs have been used to construct high performance systems that are dynamically reconfigurable. Compared with systems built with multiple general processors, such systems are advantageous. The processing speed is higher since a task is processed with a hardware implementation. Example of such systems include BEE2 [6] and TM-4 [11]. This paper considers multi-FPGA reconfigurable systems (as shown in Fig. 1), which have been adopted in a number of application domains [2], [6], [11].
Although multi-FPGA systems are capable of providing high performance computing, their energy dissipation is a serious issue. It is highly important to achieve energy efficiency. Firstly, less heat would be generated and the effort for releasing heat could be greatly saved. Secondly, the less energy dissipates in the system, the smaller the energy bill should be paid. Lastly, less heat results in a higher reliability of the whole system [1]. The increasing temperature leaves FPGA fabrics more vulnerable.
Fortunately, many contemporary reconfigurable platforms incorporate tuning mechanisms for energy efficient operations. For instance, Xilinx Vertiex-5 is designed based on CMOS circuitry [3] that allows adjustable power consumption rates.
It has become evident, however, that there is an intrinsic tradeoff between processing performance and energy consumption for multi-FPGA systems. To achieve lower power consumption rate, the system can decrease the operating frequency, which leads to longer processing time. In practice, it is highly desirable to complete tasks within given deadlines while minimizing the overall power consumption of the whole system. This paper studies the crucial problem of energy-efficient scheduling on multi-FPGA systems without missing deadline constraint.
To achieve energy efficiency, several great challenges have to be addressed. First, a task may use a number of logic blocks and it is required that these blocks be located together. We define this as integral allocation of contiguous logic blocks on a FPGA to a computing task. Second, the scheduling system needs to take into account reconfiguration overhead. It takes a non-eligible amount of time to reconfigure the logic blocks in order to execute a new type of task. Moreover, a FPGA has a limited number of loading ports. This suggests that a reconfigurable system faces the constraint that at one time a limited number of tasks can accepted simultaneously for which the logic blocks are being reconfigured for the new type of tasks. This kind of constraint can be considered simultaneous reconfiguration constraint. It should be noted that in a multi-FPGA system, multiple independent tasks are likely to be reconfigured simultaneously. Finally, the scheduling algorithm needs to minimize the overall energy consumption while meeting the deadline constraints of application tasks.
There have been a few studies on energy conversation in reconfigurable systems. Many of them are dedicated to designing hardware in the circuit level [7], [21] and the system level [20], [23]. In [21], the authors develop a dual VDD/Vt voltage on an LUT-Based FPGA to tune the overall energy consumption. DVS (Dynamic Voltage Scaling) [7] is applied to adjust the voltage level in FPGAs embedded with a logic delay measurement circuit. The later methods apply scheduling techniques for reduction of energy consumption in dynamically reconfigurable systems. In [20], a post-placement leakage aware scheduler is invoked to reduce waste of leakage power, which can be measured from the delay between task reconfiguration and execution. One optimal algorithm is proposed for the case in which a task partition over contexts is given, and approximation algorithms are developed for the case in which no task partition is given [23]. In [4], [22], the methods either neglect the reconfiguration overhead or focus on single-FPGA systems. When there are multiple FPGA units, we should carefully allocate tasks to the FPGA units, maximizing simultaneous reconfiguration and minimizing reconfiguration overhead. Most existing work on multi-FPGA systems [2], [11], [24] has largely concentrated on computing performance only. In summary, few existing methods can be applied to energy-efficient scheduling on multi-FPGA systems.
Energy-efficient scheduling on multi-processor or multi-core systems [5], [14], [17], [19], [25], [27] has been extensively studied. But, energy efficient scheduling on multi-FPGA systems is different and more difficult. First, in a reconfigurable system, allocating tasks to FPGAs must conform to the spatial allocation constraint. A subset of contiguous logic blocks has to be allocated to one task. There is no such constraint for systems of general purpose processors. Second, a task to be executed on a reconfigurable system must first be mapped to a given set of logic blocks before it is run. Finally, scheduling on dynamically reconfigurable systems should explicitly take reconfiguration overhead into account [13]. Thus, existing scheduling algorithms for multi-processor systems can hardly be directly applied in multi-FPGA systems.
In this paper, we show that the optimal scheduling problem for energy minimization with deadline constraints is NP Complete (independent tasks). We propose an energy efficient scheduling algorithm called AEE based on Ant Colony Optimization (ACO). Also, we propose an enhanced AEE algorithm (eAEE) to deal with the tasks with precedence and interdependencies. These algorithms can achieve high energy efficiency and in the meanwhile meet the deadline requirement of computing tasks. The distinctive features of the algorithms are low complexity. Comprehensive trace-driven simulation experiments have been conducted to evaluate the algorithms and results show that the algorithms can process all tasks without violating deadline constraint and can considerably reduce energy consumption in the meantime.
The main technical contributions of the paper are listed as follows:
- •
We investigate the unique characteristics of multi-FPGA reconfigurable systems, and then accordingly formulate the energy-efficient scheduling problem as an optimization problem with deadline constraints.
- •
We analyze the complexity of the optimal scheduling problem and rigidly prove that it is a NP-Complete problem.
- •
We propose AEE and eAEE, two energy efficient scheduling algorithms based on Ant-Colony Optimization, are for tasks independent and dependent, respectively. In spite of low complexity, these algorithms dramatically reduce the overall energy consumption while meeting the deadline constraint.
- •
Comprehensive trace-driven simulations have been conducted and results show that the AEE algorithm dissipates power no more than 10.65% higher than the optimal solution, and eAEE consumes 58.17% less energy than that of iSA.
The remainder of paper is organized as follows. We introduce related work in Section 2. In Section 3, we present the system model and formally define the scheduling problem. In Section 4, we give the details of the proposed two scheduling algorithms. The performance evaluation methodology and evaluation results are presented in Section 5. We conclude the paper in Section 6.
Section snippets
Related work
There is much related work for studying performance versus power tradeoff on dynamically reconfigurable systems. Previous work can be divided into two classes. One class makes change to architecture or circuitry, and the other class does not.
We first look at the first class. Li et al. [21] develop a novel architecture at the circuit level. Through programmability of SRAM on a LUT-based FPGA, they develop a pre-defined dual VDD/Vt voltages on the circuit level. A power-aware algorithm is then
System model and problem description
In this section, the system model is described and the problem is formally defined.
Algorithm design
In this section we first present the overview of the algorithm design for solving the independent tasks and delve into the design details. Then, we propose an enhanced algorithm for tasks with interdependencies and detail the eAEE algorithm.
Performance evaluation
In this section we first show the simulation setup and performance metrics, then discuss performance results of the proposed algorithm in comparison with other alternative algorithms.
Conclusion
Multi-FPGA reconfigurable systems have increasingly been adopted for high performance computing. This paper considers the energy efficient scheduling problem with hard deadline constraint. Multi-FPGA reconfigurable systems differ dramatically from traditional computing systems with multiple general processors. A task must be mapped to a set of contiguous logic blocks that must be configured for the execution of this task. Reconfiguration overhead has to be considered. In addition, a FPGA unit
Acknowledgements
This research is supported by NSFC (No. 61170238, 60903190, 61027009, 60933011, 61202375, 61170237), Shanghai Pu Jiang Talents Program (10PJ1405800), Shanghai Chen Guang Program (10CG11), MIIT of China (2009ZX03006-001-01), Doctoral Fund of Ministry of Education of China (20100073120021), National 863 Program (2009AA012201 and 2011AA010500), HP IRP (CW267311), SJTU SMC Project (201120), STCSM (08dz1501600, 12ZR1414900), Singapore NRF (CREATE E2S2), and Program for Changjiang Scholars and
Chao Jing: is a PhD candidate in the Department of Computer Science and Engineering at the Shanghai Jiao Tong University. His research interests include High Performance Computing (HPC), resource management on reconfigurable systems.
References (27)
- “40-nm FPGA Power Management and Advantages – Altera”, Technical Report,...
- HPCA (High Performance Computing Alliance)....
- “Virtex-5 Family Overview”, Technical Report, Xilinx,...
- J. Angermeier, J. Teich, Heuristics for scheduling reconfigurable devices with consideration of reconfiguration...
- H. Aydin, Q. Yang, Energy-aware partitioning for multiprocessor real-time systems, in: Proc. IEEE International...
- et al.
BEE2: a high-end reconfigurable computing system
IEEE Design & Test of Computers
(2005) - C. Chow, L. Tsui, P.H.W. Leong, W. Luk, S.J.E. Wilton, Dynamic voltage scaling for commercial FPGAs, in: Proc. IEEE...
- et al.
Reconfigurable computing: a survey of systems and software
ACM Computing Survey
(2002) - M. Dorigo, Optimization, Learning and Natural Algorithms, Ph.D. Dessertation Politecnico di Milano, Italy,...
- A.A. ElFarag, H.M. El-Boghdadi, S.I. Shaheen, Miss ratio improvement for real-time applications using...
Computers and Intractability: A Guide to the Theory of NP-Completeness
An optimal algorithm for minimizing run-time reconfiguration delay
ACM Transactions on Embedded Computing Systems (TECS)
Cited by (20)
Multi-resource scheduling for FPGA systems
2021, Microprocessors and MicrosystemsCitation Excerpt :The authors in [25] demonstrated the superiority of Ant-Colony Optimization with respect to Tabu Search, Genetic Algorithms and Simulated Annealing for the classical FPGA RCSPs. A previous edition of the Elsevier Microprocessors and Microsystems journal [26] presents two algorithms based on ant colony to schedule both dependent and independent tasks onto multiple FPGAs. The problems the authors aim to solve has a twofold objective function: to minimize the energy consumption and to respect a deadline constraint, while accounting for partial reconfigurations taking place in parallel on multiple FPGAs.
FAIR: Fully-Adaptive Framework for Improving Resource Provisioning in Collaborative CPU-FPGA Cloud Environments
2021, Proceedings - Symposium on Computer Architecture and High Performance ComputingDynamic scheduling of task graphs in multi-FPGA systems using critical path
2021, Journal of SupercomputingThermal Management for FPGA Nodes in HPC Systems
2020, ACM Transactions on Design Automation of Electronic SystemsEnergy-performance management in battery powered reconfigurable processors for standalone IoT systems
2020, International Journal of Information Technology (Singapore)
Chao Jing: is a PhD candidate in the Department of Computer Science and Engineering at the Shanghai Jiao Tong University. His research interests include High Performance Computing (HPC), resource management on reconfigurable systems.
Yanmin Zhu: obtained his PhD in computer science from Hong Kong University of Science and Technology in 2007, and BEng. in computer science from Xi’an Jiao Tong University in 2002. He is a Associate Professor in the Department of Computer Science and Engineering, Shanghai Jiao Tong University, China. His research interests include ad-hoc sensor networks, mobile computing, grid computing and resource management in distributed systems.
Minglu Li: received his PhD in Computer Software from Shanghai Jiao Tong University in 1996. He is a full Professor at the Department of Computer Science and Engineering of Shanghai Jiao Tong University. Now, he is the vice dean of the School of Electronic Information and Electrical Engineering. Currently, his research interests include Grid Computing, Services Computing and Cloud Computing. He has published over 100 papers in important academic journals and international conferences.